A Dip Inside The Aperture Library

The nightmare scenario is this: Steve Jobs is dead and now Steve Ballmer runs Apple. A mandatory system update to install a DRM-protected Clippy on Mac OS X has wrecked things so badly that Aperture will not launch. You have 500TB of images in Aperture libraries that you can no longer view, sort, search, or export.

Now what do you do?

I recently had a look inside the Aperture library to see what I could see. The results are described here. For this investigation I put together a test library that looks like this in Aperture:
The Aperture library lives by default in the Pictures folder inside your home folder and looks like the kind of washing machine that Jonathan Ive would design:
If you double-click on it, you'll launch Aperture, so that is no good. The Aperture library is a type of OS X folder called a package. That's really just a folder with some special information that makes the Finder display it and manipulate it as a single icon. So control-click on the library and select Show Package Contents. My example looks like this:
The built-in Smart Album folder contains these small files:
and it is obvious how these connect to the built-in Smart Albums that the library shows:
The folders Blog, Commercial, Government, and Retail all correspond to the Blue Folders I created, so clearly the Aperture library employs a folder structure that follows the one displayed.

info.plist is a standard OS X property list file. It contains structured data about the library package that relates properties to things:
That is XML. The green header references the definition of how the information is presented and what the dict and key elements mean. The rest gives two properties, marked as keys, CFBundleGetInfoString and CFBundleShortVersionString, with values that are strings: "Aperture Library 1.1.1" and "1.1.1" respectively. Those two values are packaged into a dictionary (dict). If you Get Info on the Aperture library you see the first one displayed as the version.
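Because the file is an ordinary XML property list, any plist parser can read it without Aperture. Here is a minimal sketch using Python's standard plistlib module. The sample text is my own reconstruction of the two keys described above, not a copy of the real file, which may hold more.

```python
import plistlib

# Reconstructed sample containing only the two keys described in the text.
SAMPLE_INFO_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>CFBundleGetInfoString</key>
    <string>Aperture Library 1.1.1</string>
    <key>CFBundleShortVersionString</key>
    <string>1.1.1</string>
</dict>
</plist>
"""


def read_library_version(plist_bytes):
    """Return (info string, short version) from info.plist content."""
    info = plistlib.loads(plist_bytes)  # the dict element becomes a Python dict
    return info.get("CFBundleGetInfoString"), info.get("CFBundleShortVersionString")


info_string, short_version = read_library_version(SAMPLE_INFO_PLIST)
print(info_string, "/", short_version)
```

To try it on a real library, read the bytes of info.plist from inside the package and pass them to read_library_version.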

Let's look inside Aperture.aplib:
More plists, an empty folder with archive information (Vaults?) and a Library.apdb file. That's an SQLite database and if you open it with a text editor you get nothing useful. But checking it with SQLite Database Browser shows the database structure and files:
That database provides fast access to all the images, albums, keywords, projects, filters, and everything else that makes Aperture go. It's actually expendable. If you delete it from the folder and launch Aperture, it will get regenerated from the data stored in the rest of the library. So if you cannot use the database, all you have lost is speed.
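If you have no database browser to hand, the same structure can be listed with a few lines of script. This is only a sketch: it builds a small in-memory stand-in database, and the table names in it are made up for illustration, not taken from Library.apdb.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in an SQLite database, sorted."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in database with HYPOTHETICAL table names. For a real library, open a
# copy read-only instead, for example:
#   sqlite3.connect("file:/path/to/Library.apdb?mode=ro", uri=True)
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE ExampleVersion (uuid TEXT, name TEXT)")
demo.execute("CREATE TABLE ExampleAlbum (uuid TEXT, name TEXT)")

tables = list_tables(demo)
print(tables)
```

Working on a copy, opened read-only, keeps the experiment from disturbing anything Aperture expects to find.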

The database is there as a back-end to Core Data. Core Data is part of OS X that helps programmers work with persistent data relationships. It manages all the details of performing queries, fetching data, handling deletions, storing changes, and things like that.

Let's check the Blog folder that is at the top level of the library:
Here we see Aperture project files. These are packages too, so to go further, control-click and Show Package Contents:
Now this is getting more interesting. Compare this with the project structure as displayed:
The folder structure matches the Brown Folder structure and the apalbum files correspond to the albums. There are some extra files too. Album.implicitalbum looks like it is a hidden album that displays everything in the Project. The three files AP.minis, AP.Thumbnails, and AP.Tinies are binary files that contain three different sizes of thumbnails. Again, these are there for speed only and can be regenerated from the originals at any time.

Info.apfolder is another plist file. With a text editor it looks like this:
There is actually a better way of viewing this. The Mac OS X developer tools include a plist editor. That gives this display:
Lots of information about this folder. The UUID numbers are universally unique identifiers. They ensure that everything is uniquely numbered, making corruption identifiable and recovery possible.

So far we have not found any images, and that was the goal of the whole exercise. There are two importgroup folders present. One contains nothing and the other has 59MB of data, so let's look in there:
Those are the names of the files that were imported. Let's look inside BEE:
And there is the original JPEG that was imported. There is a file that describes the Bee.jpg file:
Here we also meet the sidecar files. There are two apversion files that describe the adjustments to the original (master) and to one version. As versions are created, more version files are added. In this way the original is never touched. Let's look inside the apversion file:
More UUIDs to help tie everything together. Looking at what the UUIDs reference, it is not hard to imagine that the whole library could be reconstituted from just a sea of files. Here we see ratings, time zones, image size, stack index, and image cache information. Let's look inside the image adjustments:
And there are all the gory details of the adjustments that define this version. We can also look at the searchable properties:
And in there we see all the metadata. So as long as you can parse XML, the metadata can be read.

How about the Albums? Back here in the Photos project we see a Bombers album:
The Bombers.apalbum looks like this:
The version UUIDs are the list of image versions in the album. The filter information is the thumbnail filter that is currently active on the album. Opening up the subqueries gives:
Smart albums are the same, except they contain more information and don't have a list of UUIDs, since they need to generate the list on the fly:
Clearly it would take some work to recover all of the information about particular images, but I am sure that someone will create a tool or script to extract it. Extensive use of UUIDs would allow a database to be generated and albums and projects to be recreated automatically. Keywords and all the other metadata could also be extracted and associated with the images.
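To give a feel for how such a script might start, here is a rough sketch that indexes version files by UUID and then resolves an album's list against that index. The two key names are placeholders I have invented for the example. The real key names have to be read out of the apversion and apalbum files themselves.

```python
import os
import plistlib
import tempfile

# ASSUMPTION: placeholder key names, not confirmed Aperture keys.
UUID_KEY = "uuid"
ALBUM_MEMBERS_KEY = "versionUuids"


def load_plists(root, extension):
    """Yield (path, parsed plist) for every file under root with the extension."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(extension):
                path = os.path.join(folder, name)
                with open(path, "rb") as handle:
                    yield path, plistlib.load(handle)


def rebuild_albums(root):
    """Map each album file name to the version file paths it references."""
    versions = {
        data[UUID_KEY]: path
        for path, data in load_plists(root, ".apversion")
        if UUID_KEY in data
    }
    albums = {}
    for path, data in load_plists(root, ".apalbum"):
        members = data.get(ALBUM_MEMBERS_KEY, [])
        # UUIDs with no matching version file are skipped silently here.
        albums[os.path.basename(path)] = [versions[u] for u in members if u in versions]
    return albums


# Demo on a throwaway folder holding three tiny made-up plist files.
with tempfile.TemporaryDirectory() as root:
    samples = [
        ("Bee.apversion", {UUID_KEY: "AAA"}),
        ("Wasp.apversion", {UUID_KEY: "BBB"}),
        ("Bombers.apalbum", {ALBUM_MEMBERS_KEY: ["BBB", "missing"]}),
    ]
    for name, data in samples:
        with open(os.path.join(root, name), "wb") as handle:
            plistlib.dump(data, handle)
    albums = rebuild_albums(root)
    album_members = [os.path.basename(p) for p in albums["Bombers.apalbum"]]

print(album_members)
```

A real recovery tool would also want to report the UUIDs it could not resolve rather than dropping them, since those are exactly the signs of a damaged library.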

How about the images themselves? That's easy: opening up the library and dragging all the contents onto iView MediaPro creates a catalog that contains all the master images. From there you can sort, search, and copy. Any other image-finding program could do the same thing easily.
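If no cataloging program is available, a short script can do the finding. This sketch walks a folder tree and collects files by extension. The extension list is my own guess at common formats, and the demo folder names are invented, so adjust both to suit your library.

```python
import os
import tempfile

# ASSUMPTION: a guessed list of common image extensions; extend as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".tif", ".tiff", ".png", ".nef", ".cr2", ".dng"}


def find_images(library_root):
    """Return sorted paths of all files under library_root that look like images."""
    found = []
    for folder, _dirs, files in os.walk(library_root):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                found.append(os.path.join(folder, name))
    return sorted(found)


# Demo on a throwaway folder tree with invented names.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "Blog", "Example.approject", "BEE")
    os.makedirs(nested)
    for name in ("Bee.jpg", "Bee.apversion"):
        with open(os.path.join(nested, name), "wb") as handle:
            handle.write(b"placeholder")
    image_names = [os.path.basename(p) for p in find_images(root)]

print(image_names)
```

Packages are plain folders as far as os.walk is concerned, so pointing find_images at the library itself reaches the masters inside every project.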

Some of the limitations of Aperture stem from the limitations of Core Data. It's single-user and designed to run with a single store (database). It is not a far stretch to imagine that these limitations will disappear as OS X matures and Aperture gains some very attractive workgroup features: multiple users, multiple libraries, and off-line storage management.

So the Aperture library is not as dark and dangerous as it might at first seem. It's really just a filing system like the one that manages your hard drive already. Could you find a file on your disk without the filing system? Not without a lot of trouble, yet nobody is really worried about that.