<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: apple-photos</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/apple-photos.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2020-05-26T15:53:08+00:00</updated><author><name>Simon Willison</name></author><entry><title>Serving photos locally with datasette-media</title><link href="https://simonwillison.net/2020/May/26/serving-photos-locally-datasette-media/#atom-tag" rel="alternate"/><published>2020-05-26T15:53:08+00:00</published><updated>2020-05-26T15:53:08+00:00</updated><id>https://simonwillison.net/2020/May/26/serving-photos-locally-datasette-media/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/dogsheep/dogsheep-photos/tree/0.4.1#serving-photos-locally-with-datasette-media"&gt;Serving photos locally with datasette-media&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
datasette-media is a new Datasette plugin which can serve static files from disk in response to a configured SQL query that maps incoming URL parameters to a path to a file. I built it so I could run dogsheep-photos locally on my laptop and serve up thumbnails of images that match particular queries. I’ve added documentation to the dogsheep-photos README explaining how to use datasette-media, datasette-json-html and datasette-template-sql to create custom interfaces onto Apple Photos data on your machine.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/plugins"&gt;plugins&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dogsheep"&gt;dogsheep&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-photos"&gt;apple-photos&lt;/a&gt;&lt;/p&gt;



</summary><category term="plugins"/><category term="projects"/><category term="datasette"/><category term="dogsheep"/><category term="apple-photos"/></entry><entry><title>Using SQL to find my best photo of a pelican according to Apple Photos</title><link href="https://simonwillison.net/2020/May/21/dogsheep-photos/#atom-tag" rel="alternate"/><published>2020-05-21T19:16:38+00:00</published><updated>2020-05-21T19:16:38+00:00</updated><id>https://simonwillison.net/2020/May/21/dogsheep-photos/#atom-tag</id><summary type="html">
    &lt;p&gt;According to the Apple Photos internal SQLite database, this is the most aesthetically pleasing photograph I have ever taken of a pelican:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://photos.simonwillison.net/i/cbfe463f1a67e37a1d36c5db44f0159ef6f86a0d64a987b129b63b52e555f1af.jpeg?w=800" alt="A pelican" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;Here's the SQL query that found me my best ten pelican photos:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;select
  sha256,
  ext,
  uuid,
  date,
  ZOVERALLAESTHETICSCORE
from
  photos_with_apple_metadata
where
  uuid in (
    select
      uuid
    from
      labels
    where
      normalized_string = 'pelican'
  )
order by
  ZOVERALLAESTHETICSCORE desc
limit
  10&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can &lt;a href="https://dogsheep-photos.dogsheep.net/public?sql=select%0D%0A++json_object%28%0D%0A++++%27img_src%27%2C%0D%0A++++%27https%3A%2F%2Fphotos.simonwillison.net%2Fi%2F%27+%7C%7C+sha256+%7C%7C+%27.%27+%7C%7C+ext+%7C%7C+%27%3Fw%3D600%27%0D%0A++%29+as+photo%2C%0D%0A++sha256%2C%0D%0A++ext%2C%0D%0A++uuid%2C%0D%0A++date%2C%0D%0A++ZOVERALLAESTHETICSCORE%0D%0Afrom%0D%0A++photos_with_apple_metadata%0D%0Awhere%0D%0A++uuid+in+%28%0D%0A++++select%0D%0A++++++uuid%0D%0A++++from%0D%0A++++++labels%0D%0A++++where%0D%0A++++++normalized_string+%3D+%3Alabel%0D%0A++%29%0D%0Aorder+by%0D%0A++ZOVERALLAESTHETICSCORE+desc%0D%0Alimit%0D%0A++10&amp;amp;label=pelican"&gt;try it out here&lt;/a&gt; (with some extra &lt;a href="https://github.com/simonw/datasette-json-html/blob/master/README.md#images"&gt;datasette-json-html&lt;/a&gt; magic to display the actual photos). Or try &lt;a href="https://dogsheep-photos.dogsheep.net/public?sql=select%0D%0A++json_object%28%0D%0A++++%27img_src%27%2C%0D%0A++++%27https%3A%2F%2Fphotos.simonwillison.net%2Fi%2F%27+%7C%7C+sha256+%7C%7C+%27.%27+%7C%7C+ext+%7C%7C+%27%3Fw%3D600%27%0D%0A++%29+as+photo%2C%0D%0A++sha256%2C%0D%0A++ext%2C%0D%0A++uuid%2C%0D%0A++date%2C%0D%0A++ZOVERALLAESTHETICSCORE%0D%0Afrom%0D%0A++photos_with_apple_metadata%0D%0Awhere%0D%0A++uuid+in+%28%0D%0A++++select%0D%0A++++++uuid%0D%0A++++from%0D%0A++++++labels%0D%0A++++where%0D%0A++++++normalized_string+%3D+%3Alabel%0D%0A++%29%0D%0Aorder+by%0D%0A++ZOVERALLAESTHETICSCORE+desc%0D%0Alimit%0D%0A++10&amp;amp;label=lemur"&gt;lemur&lt;/a&gt; or &lt;a href="https://dogsheep-photos.dogsheep.net/public?sql=select%0D%0A++json_object%28%0D%0A++++%27img_src%27%2C%0D%0A++++%27https%3A%2F%2Fphotos.simonwillison.net%2Fi%2F%27+%7C%7C+sha256+%7C%7C+%27.%27+%7C%7C+ext+%7C%7C+%27%3Fw%3D600%27%0D%0A++%29+as+photo%2C%0D%0A++sha256%2C%0D%0A++ext%2C%0D%0A++uuid%2C%0D%0A++date%2C%0D%0A++ZOVERALLAESTHETICSCORE%0D%0Afrom%0D%0A++photos_with_apple_metadata%0D%0Awhere%0D%0A++uuid+in+%28%0D%0A++++select%0D%0A++++++uuid%0D%0A++++from%0D%0A++++++labels%0D%0A++++where%0D%0A++++++normalized_string+%3D+%3Alabel%0D%0A++%29%0D%0Aorder+by%0D%0A++ZOVERALLAESTHETICSCORE+desc%0D%0Alimit%0D%0A++10&amp;amp;label=seal"&gt;seal&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I actually think this is my best pelican photo, but Apple Photos rated it fifth:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://photos.simonwillison.net/i/a444857c4ac71ceae6af5192c8acc5ac35934ed589259136df0ed11295dbb085.jpeg?w=800" alt="A pelican" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;h3&gt;How this works&lt;/h3&gt;

&lt;p&gt;Apple Photos keeps photo metadata in a SQLite database. It runs machine learning models to identify the contents of every photo, and separate machine learning models to calculate quality scores for those photographs. All of this data lives in SQLite files on my laptop. The trick is knowing where to look.&lt;/p&gt;

&lt;p&gt;I'm not running queries directly against the Apple Photos SQLite file - it's a little hard to work with, and the label metadata is stored in a separate database file. Instead, this query runs against a combined database created by my new &lt;a href="https://github.com/dogsheep/dogsheep-photos"&gt;dogsheep-photos&lt;/a&gt; tool.&lt;/p&gt;

&lt;h3&gt;An aside: Why I love Apple Photos&lt;/h3&gt;

&lt;p&gt;The Apple Photos app - on both macOS and iOS - is in my opinion Apple's most underappreciated piece of software. In my experience most people who use it are missing some of the most valuable features. A few highlights:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;It can show you ALL of your photos on a map. On iOS go to the "Albums" tab, scroll half way down and then click on "Places" (no wonder people miss this feature!) - on macOS Photos it's the "Library -&amp;gt; Places" sidebar item.  It still baffles me that Google Photos doesn't do this (I have &lt;a href="https://twitter.com/simonw/status/1227060020694503425"&gt;conspiracy theories&lt;/a&gt; about it). This is my most common way for finding a photo I've taken - I remember where it was, then zoom in on that area of the map.&lt;/li&gt;&lt;li&gt;It runs machine learning models &lt;em&gt;on your phone&lt;/em&gt; (or laptop) to identify the subject of your photos, and makes them searchable. Try searching for "dog" and you'll see all of the photos you've taken of dogs! I love that this runs on-device: it's much less creepy than uploading your photos to the cloud in order to do this.&lt;/li&gt;&lt;li&gt;It has a really great faceted search implementation - particularly in the phone app. Try searching for "dog", then add "selfie" and the name of a city to see all of the selfies you've taken with dogs in that place!&lt;/li&gt;&lt;li&gt;It has facial recognition, again running on device, which you can use to teach it who your friends are (autocompleting against your contacts). A little bit of effort spent training this and you can see photos you've taken of specific friends in specific places and with specific animals!&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;As with most Apple software, Photos uses SQLite under the hood. The underlying database is undocumented and clearly not intended as a public API, but it exists. And I've wanted to gain access to what's in it for years.&lt;/p&gt;

&lt;h3&gt;Querying the Apple Photos SQLite database&lt;/h3&gt;

&lt;p&gt;If you run Apple Photos on a Mac (which will synchronize with your phone via iCloud) then most of your photo metadata can be found in a database file that lives here:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;~/Pictures/Photos\ Library.photoslibrary/database/Photos.sqlite&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Mine is 752MB, for aroud 40,000 photos. There's a lot of detailed metadata in there!&lt;/p&gt;

&lt;p&gt;Querying the database isn't straight-forward. Firstly it's almost always locked by some other process - the workaround for that is to create a copy of the file. Secondly, it uses some custom undocumented Apple SQLite extensions. I've not figured out a way to load these, and without them a lot of my queries ended up throwing errors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/RhetTbull/osxphotos"&gt;osxphotos&lt;/a&gt; to the rescue! I ran a GitHub code search for one of the tables in that database (searching for &lt;a href="https://github.com/search?l=Python&amp;amp;q=RKPerson&amp;amp;type=Code"&gt;RKPerson in Python code&lt;/a&gt;) and was delighted to stumble across the &lt;code&gt;osxphotos&lt;/code&gt; project by Rhet Turnbull. It's a well designed and extremely actively maintained Python tool for accessing the Apple Photos database, including code to handle several iterations of the underlying database structure.&lt;/p&gt;

&lt;p&gt;Thanks to &lt;code&gt;osxphotos&lt;/code&gt; the first iteration of my own code for accessing the Apple Photos metadata was &lt;a href="https://github.com/dogsheep/dogsheep-photos/commit/b3c20e08b1a99c8898f13cc0266e1c5c012cf23c"&gt;less than 100 lines of code&lt;/a&gt;. This gave me locations, people, albums and places (human names of geographical areas) almost for free!&lt;/p&gt;

&lt;h3&gt;Quality scores&lt;/h3&gt;

&lt;p&gt;Apple Photos has a fascinating database table called &lt;code&gt;ZCOMPUTEDASSETATTRIBUTES&lt;/code&gt;, with a bewildering collection of columns. Each one is a floating point number calculated presumably by some kind of machine learning model. Here's a full list, each one linking to my public photos sorted by that score:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZBEHAVIORALSCORE"&gt;ZBEHAVIORALSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZFAILURESCORE"&gt;ZFAILURESCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZHARMONIOUSCOLORSCORE"&gt;ZHARMONIOUSCOLORSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZIMMERSIVENESSSCORE"&gt;ZIMMERSIVENESSSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZINTERACTIONSCORE"&gt;ZINTERACTIONSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZINTERESTINGSUBJECTSCORE"&gt;ZINTERESTINGSUBJECTSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZINTRUSIVEOBJECTPRESENCESCORE"&gt;ZINTRUSIVEOBJECTPRESENCESCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZLIVELYCOLORSCORE"&gt;ZLIVELYCOLORSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZLOWLIGHT"&gt;ZLOWLIGHT&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZNOISESCORE"&gt;ZNOISESCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTCAMERATILTSCORE"&gt;ZPLEASANTCAMERATILTSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTCOMPOSITIONSCORE"&gt;ZPLEASANTCOMPOSITIONSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTLIGHTINGSCORE"&gt;ZPLEASANTLIGHTINGSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTPATTERNSCORE"&gt;ZPLEASANTPATTERNSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTPERSPECTIVESCORE"&gt;ZPLEASANTPERSPECTIVESCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTPOSTPROCESSINGSCORE"&gt;ZPLEASANTPOSTPROCESSINGSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTREFLECTIONSSCORE"&gt;ZPLEASANTREFLECTIONSSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPLEASANTSYMMETRYSCORE"&gt;ZPLEASANTSYMMETRYSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZSHARPLYFOCUSEDSUBJECTSCORE"&gt;ZSHARPLYFOCUSEDSUBJECTSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZTASTEFULLYBLURREDSCORE"&gt;ZTASTEFULLYBLURREDSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZWELLCHOSENSUBJECTSCORE"&gt;ZWELLCHOSENSUBJECTSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZWELLFRAMEDSUBJECTSCORE"&gt;ZWELLFRAMEDSUBJECTSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZWELLTIMEDSHOTSCORE"&gt;ZWELLTIMEDSHOTSCORE&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;I'm not enormously impressed with the results I get from these. They're clearly not intended for end-user visibility, and sorting them might not even be something that makes sense.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;ZGENERICASSET&lt;/code&gt; table provides four more scores, which seem to provide much more useful results:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZOVERALLAESTHETICSCORE"&gt;ZOVERALLAESTHETICSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZCURATIONSCORE"&gt;ZCURATIONSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZPROMOTIONSCORE"&gt;ZPROMOTIONSCORE&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_sort_desc=ZHIGHLIGHTVISIBILITYSCORE"&gt;ZHIGHLIGHTVISIBILITYSCORE&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;My guess is that these overall scores are derived from the &lt;code&gt;ZCOMPUTEDASSETATTRIBUTES&lt;/code&gt; ones. I've seen the best results from &lt;code&gt;ZOVERALLAESTHETICSCORE&lt;/code&gt;, so that's the one I used in my "show me my best photo of a pelican" query.&lt;/p&gt;

&lt;h3&gt;A note about the demo&lt;/h3&gt;

&lt;p&gt;The demo I'm running at &lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata"&gt;dogsheep-photos.dogsheep.net&lt;/a&gt; currently only contains 496 photos. My private instance of this has over 40,000, but I decided to just publish a subset of that in the demo so I wouldn't have to carefully filter out private screenshots and photos with sensitive locations and suchlike. Details of how the demo work (using the &lt;code&gt;dogsheep-photos create-subset&lt;/code&gt; command to create a subset database containing just photos in my Public album) can be found &lt;a href="https://github.com/dogsheep/dogsheep-photos/issues/25"&gt;in this issue&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Automatic labeling of photo contents&lt;/h3&gt;

&lt;p&gt;Even more impressive than the quality scores are the machine learning labels.&lt;/p&gt;

&lt;p&gt;Automatically labeling the content of a photo is surprisingly easy these days, thanks to &lt;a href="https://en.wikipedia.org/wiki/Convolutional_neural_network"&gt;convolutional neural networks&lt;/a&gt;. I wrote a bit about these in &lt;a href="https://simonwillison.net/2018/Oct/29/transfer-learning/"&gt;Automatically playing science communication games with transfer learning and fastai&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Apple download a machine learning model to your device and do the label classification there. After quite a bit of hunting (I ended up using Activity Monitor's Inspect -&amp;gt; Open Files and Ports option against the &lt;code&gt;photoanalysisd&lt;/code&gt; process) I finally figured out where the results go: the &lt;code&gt;~/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite&lt;/code&gt; database file.&lt;/p&gt;

&lt;p&gt;(Inspecting &lt;code&gt;photoanalysisd&lt;/code&gt; also lead me to the &lt;code&gt;/System/Library/Frameworks/Vision.framework/Versions/A/Resources/&lt;/code&gt; folder, which solved another mystery: where do Apple keep the models? There are &lt;a href="https://gist.github.com/simonw/6ce25981931e3c99f51f2ff0c8bcb0b1"&gt;some fascinating files&lt;/a&gt; in there.)&lt;/p&gt;

&lt;p&gt;It took &lt;a href="https://github.com/dogsheep/dogsheep-photos/issues/16"&gt;some work&lt;/a&gt; to figure out how to match those labels with their corresponding photos, mainly because the &lt;code&gt;psi.sqlite&lt;/code&gt; database stores photo UUIDs as a pair of signed integers whereas the &lt;code&gt;Photos.sqlite&lt;/code&gt; database stores a UUID string.&lt;/p&gt;

&lt;p&gt;I'm now pulling the labels out into a separate &lt;code&gt;labels&lt;/code&gt; table. You can &lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category"&gt;browse that in the demo&lt;/a&gt; to see how it is structured. Labels belong to numeric categories - here are some of my guesses as to what those mean:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category&amp;amp;category=2024"&gt;Category 2024&lt;/a&gt; appears to be actual content labels - Seal, Water Body, Pelican etc.&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category&amp;amp;category=2027"&gt;Category 2027&lt;/a&gt; is more contextual: Entertainment, Trip, Travel, Museum, Beach Activity etc.&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category&amp;amp;category=1014"&gt;Category 1014&lt;/a&gt; is simply the month the photo was taken. &lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category&amp;amp;category=1015"&gt;1015&lt;/a&gt; is the year, and &lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category&amp;amp;category=2030"&gt;2030&lt;/a&gt; is the season.&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category&amp;amp;category=2056"&gt;Category 2056&lt;/a&gt; is the original filename.&lt;/li&gt;&lt;li&gt;&lt;a href="https://dogsheep-photos.dogsheep.net/public/labels?_facet=category&amp;amp;category=12"&gt;Category 12&lt;/a&gt; is the country the photo was taken in.&lt;/li&gt;&lt;/ul&gt;

Here's &lt;a href="https://dogsheep-photos.dogsheep.net/public?sql=select%0D%0A++photo%2C%0D%0A++%28%0D%0A++++select%0D%0A++++++json_group_array%28%0D%0A++++++++normalized_string%0D%0A++++++%29%0D%0A++++from%0D%0A++++++labels%0D%0A++++where%0D%0A++++++labels.uuid+%3D+photos_with_apple_metadata.uuid%0D%0A++%29+as+labels%2C%0D%0A++date%2C%0D%0A++albums%2C%0D%0A++persons%2C%0D%0A++ZOVERALLAESTHETICSCORE%0D%0Afrom%0D%0A++photos_with_apple_metadata"&gt;a query&lt;/a&gt; that shows the labels (from every category) next to each photo.

&lt;h3&gt;Geography&lt;/h3&gt;

&lt;p&gt;Photos taken on an iPhone have embedded latitudes and longitudes... which means I can &lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_on_a_map"&gt;display them on a map&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/photos-on-a-map.png" alt="My photos on a map" style="max-width: 100%" /&gt;&lt;/p&gt;

&lt;p&gt;Apple also perform reverse-geocoding on those photos, resolving them to cities, regions and countries. This is great for faceted browse: here are my photos &lt;a href="https://dogsheep-photos.dogsheep.net/public/photos_with_apple_metadata?_facet=place_state_province&amp;amp;_facet=place_country&amp;amp;_facet=place_city"&gt;faceted by country, city and state/province&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Hosting and serving the images&lt;/h3&gt;

&lt;p&gt;My least favourite thing about Apple Photos is how hard it is to get images from it onto the internet. If you enable iCloud sharing your images are accessible through &lt;a href="https://www.icloud.com/"&gt;icloud.com&lt;/a&gt; - but they aren't given publicly accessible URLs, so you can't embed them in blog entries or do other webby things with them.&lt;/p&gt;

&lt;p&gt;I also really want to "own" my images. I want them in a place that I control.&lt;/p&gt;

&lt;p&gt;Amazon S3 is ideal for image storage. It's incredibly inexpensive and essentially infinite.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;dogsheep-photos upload&lt;/code&gt; command takes ANY directory as input, scans through that directory for image files and then uploads them to the configured S3 bucket.&lt;/p&gt;

&lt;p&gt;I designed this to work independently of Apple Photos, mainly to preserve my ability to switch to alternative image solutions in the future.&lt;/p&gt;

&lt;p&gt;I'm using the &lt;a href="https://en.wikipedia.org/wiki/Content-addressable_storage"&gt;content addressable storage&lt;/a&gt; pattern to store the images. Their filename is the sha256 hash of the file contents. The idea is that since sensible photo management software leaves the original files unmodified I should be able to de-duplicate my photo files no matter where they are from and store everything in the one bucket.&lt;/p&gt;

&lt;p&gt;Original image files come with privacy concerns: they embed accurate latitude and longitude data in the EXIF data, so they can be used to reconstruct your exact location history and even figure out your address. This is why systems like Google Photos &lt;a href="https://issuetracker.google.com/issues/80379228"&gt;make it difficult&lt;/a&gt; to export images with location data intact.&lt;/p&gt;

&lt;p&gt;I've addressed this by making the content in my S3 bucket private. Access to the images takes place through &lt;a href="https://github.com/simonw/s3-image-proxy"&gt;s3-image-proxy&lt;/a&gt; - a proxy server I wrote and deployed on &lt;a href="https://vercel.com/"&gt;Vercel&lt;/a&gt; (previously Zeit Now). The proxy strips EXIF data and can optionally resize images based on querystring parameters. It also serves them with far-future cache expire headers, which means they sit in Vercel's CDN cache rather than being resized every time they are accessed.&lt;/p&gt;

&lt;p&gt;iPhones default to saving photos in HEIC format, which fails to display using with the &lt;code&gt;&amp;lt;img src=""&amp;gt;&lt;/code&gt; tag in the browsers I tested. The proxy uses &lt;a href="https://pypi.org/project/pyheif/"&gt;pyheif&lt;/a&gt; to convert those into JPEGs.&lt;/p&gt;

&lt;p&gt;Here's an example HEIC image, resized by the proxy and converted to JPEG:
&lt;a href="https://photos.simonwillison.net/i/59854a70f125154cdf8dad89a4c730e6afde06466d4a6de24689439539c2d863.heic?w=600"&gt;https://photos.simonwillison.net/i/59854a70f125154cdf8dad89a4c730e6afde06466d4a6de24689439539c2d863.heic?w=600&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;Next steps&lt;/h3&gt;

&lt;p&gt;This project is a little daunting in that there are so many possibilities for where to take it next!&lt;/p&gt;

&lt;p&gt;In the short term:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/dogsheep/dogsheep-photos/issues/3"&gt;Import EXIF data&lt;/a&gt; from the images into a table. The Apple Photos tables give me some of this already (particularly GPS data) but I want things like ISO, aperture, what lens I used.&lt;/li&gt;&lt;li&gt;Load the labels into SQLite full-text search.&lt;/li&gt;&lt;li&gt;I'd like other people to be able to play with this easily. Getting it all up and running right now is a fair amount of work - I think I can improve this with usability improvements and better documentation.&lt;/li&gt;&lt;li&gt;The system only handles static images at the moment. I'd like to &lt;a href="https://github.com/dogsheep/dogsheep-photos/issues/13"&gt;get my movies&lt;/a&gt; and more importantly my live photos in there as well.&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;And in the longer term:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Only iPhone photos have location data at the moment - I'd like to derive approximate latitude/longitude points for my DSLR images by matching against images from my phone based on date.&lt;/li&gt;&lt;li&gt;Running my photos through other computer vision systems like Google's Cloud Vision APIs &lt;a href="https://github.com/dogsheep/dogsheep-photos/issues/14"&gt;could be really interesting&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;For better spotting of duplicate images I'm interested in exploring &lt;a href="https://github.com/dogsheep/dogsheep-photos/issues/7"&gt;image content hashing&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;The UI for all of this right now is just regular Datasette. Building a custom UI (running against the Datasette JSON API) could be a lot of fun.&lt;/li&gt;&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/photography"&gt;photography&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/photos"&gt;photos&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sql"&gt;sql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dogsheep"&gt;dogsheep&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-photos"&gt;apple-photos&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="photography"/><category term="photos"/><category term="projects"/><category term="sql"/><category term="sqlite"/><category term="datasette"/><category term="dogsheep"/><category term="weeknotes"/><category term="apple-photos"/></entry><entry><title>Weeknotes: Datasette 0.41, photos breakthroughs</title><link href="https://simonwillison.net/2020/May/7/weeknotes/#atom-tag" rel="alternate"/><published>2020-05-07T05:02:51+00:00</published><updated>2020-05-07T05:02:51+00:00</updated><id>https://simonwillison.net/2020/May/7/weeknotes/#atom-tag</id><summary type="html">
    &lt;p&gt;Shorter weeknotes this week, because my main project for the week warrants a detailed write-up on its own (coming soon... update 21st May &lt;a href="https://simonwillison.net/2020/May/21/dogsheep-photos/"&gt;here it is&lt;/a&gt;).&lt;/p&gt;

&lt;h3&gt;Datasette 0.41&lt;/h3&gt;

&lt;p&gt;I released &lt;a href="https://datasette.readthedocs.io/en/stable/changelog.html#v0-41"&gt;Datasette 0.41&lt;/a&gt; this morning. Highlights include:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Custom pages and configuration directory mode, both &lt;a href="https://simonwillison.net/2020/Apr/29/weeknotes/#new-datasette-features"&gt;described here last week&lt;/a&gt;&lt;/li&gt;&lt;li&gt;A new NOT LIKE table filter expression, &lt;code&gt;?colname__notlike=X&lt;/code&gt; - &lt;a href="https://github.com/simonw/datasette/issues/750"&gt;#750&lt;/a&gt;&lt;/li&gt;&lt;li&gt;A CSS pattern portfolio! I've been meaning to ship one of these for over two years now - you can see it at &lt;a href="https://latest.datasette.io/-/patterns"&gt;latest.datasette.io/-/patterns&lt;/a&gt; - &lt;a href="https://github.com/simonw/datasette/issues/151"&gt;#151&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Various minor fixes and documentation improvements&lt;/li&gt;&lt;li&gt;PRAGMA function support - &lt;a href="https://github.com/simonw/datasette/issues/761"&gt;#761&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;A late addition to the release - thought of and implemented just this morning, as the result of &lt;a href="https://github.com/simonw/til/blob/master/sqlite/list-all-columns-in-a-database.md"&gt;this TIL&lt;/a&gt;, is the ability to use various &lt;a href="https://www.sqlite.org/pragma.html#pragfunc"&gt;SQLite PRAGMA functions&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Datasette blocks user-provided SQL queries containing the term &lt;code&gt;PRAGMA&lt;/code&gt; because it can be used to dynamically update various database configuration settings, and Datasette likes to keep queries read-only.&lt;/p&gt;

&lt;p&gt;But... SQLite provides access to a number of useful introspection capabilities through a collection of PRAGMA functions. Things like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;select * from pragma_foreign_key_list('complex_foreign_keys')&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These are harmless - the SQLite documentation specifically states that "Table-valued functions exist only for PRAGMAs that return results and that have no side-effects".&lt;/p&gt;

&lt;p&gt;SO I enabled them using a negative lookahead assertion in the regular expression that &lt;a href="https://github.com/simonw/datasette/blob/0784f2ef9d3ff6dd9df05f54cb51de29a6d11764/datasette/utils/__init__.py#L174-L193"&gt;prevents PRAGMA from being used&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;allowed_pragmas = (
    "database_list",
    "foreign_key_list",
    "function_list",
    "index_info",
    "index_list",
    "index_xinfo",
    "page_count",
    "max_page_count",
    "page_size",
    "schema_version",
    "table_info",
    "table_xinfo",
)
disallawed_sql_res = [
    (
        re.compile("pragma(?!_({}))".format("|".join(allowed_pragmas))),
        "Statement may not contain PRAGMA",
    )
]&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The regular expression here is dynamically composed from the list of allowed PRAGMA functions - it looks something like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;re.compile("pragma(?!_(database_list|foreign_key_list|...))")&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In Python regular expressions &lt;code&gt;pragme(?!_foo)&lt;/code&gt; means "pragma but only if not followed by &lt;code&gt;_foo&lt;/code&gt;".&lt;/p&gt;

&lt;h4&gt;Plotting foreign key relationships with Observable and D3&lt;/h4&gt;

&lt;p&gt;With PRAGMA functions support shipped in Datasette 0.41 it's now possible to make use of them in queries. Here's an example query which finds all foreign key relationships in a database:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;select
    sqlite_master.name as table_from,
    fk_info.[from] as column_from,
    fk_info.[table] as table_to,
    fk_info.[to] as column_to
from
    sqlite_master
join
    pragma_foreign_key_list(sqlite_master.name) as fk_info
order by
    sqlite_master.name&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here's that query &lt;a href="https://github-to-sqlite.dogsheep.net/github?sql=select%0D%0A++sqlite_master.name+as+table_from%2C%0D%0A++fk_info.%5Bfrom%5D+as+column_from%2C%0D%0A++fk_info.%5Btable%5D+as+table_to%2C%0D%0A++fk_info.%5Bto%5D+as+column_to%0D%0Afrom%0D%0A++sqlite_master%0D%0A++join+pragma_foreign_key_list%28sqlite_master.name%29+as+fk_info%0D%0Aorder+by%0D%0A++sqlite_master.name"&gt;run against my github-to-sqlite demo database&lt;/a&gt;. Exported as JSON, the results look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;[
    {
        "table_from": "commits",
        "column_from": "committer",
        "table_to": "users",
        "column_to": "id"
    },
    {
        "table_from": "commits",
        "column_from": "author",
        "table_to": "users",
        "column_to": "id"
    }
    ...
]&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I used this to put together &lt;a href="https://observablehq.com/@simonw/datasette-table-diagram"&gt;an Observable Notebook&lt;/a&gt; which takes this data (from any public Datasette instance if you provide it with a URL) and renders those relationships as a force-directed graph using D3.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2020/observable-foreign-keys.png" style="max-width: 100%" alt="Force-directed D3 graph of foreign keys" /&gt;&lt;/p&gt;

&lt;h3&gt;Apple Photos breakthrough&lt;/h3&gt;

&lt;p&gt;I'll write about this more in a separate post, but I had a &lt;em&gt;huge&lt;/em&gt; breakthrough with my &lt;a href="https://github.com/dogsheep/photos-to-sqlite"&gt;photos-to-sqlite&lt;/a&gt; project this week.&lt;/p&gt;

&lt;p&gt;The goal of the project is to give my SQL access to the photos I've taken. I use Apple Photos for this, and my holy grail has always been the metadata that Apple generate around my photos using machine learning.&lt;/p&gt;

&lt;p&gt;Rhet Turnbull's fantastic &lt;a href="https://github.com/RhetTbull/osxphotos"&gt;osxphotos&lt;/a&gt; project gave me my first breakthrough, allowing me to easily start importing albums, people (from on-device facial recognition) and places.&lt;/p&gt;

&lt;p&gt;Apple run sophisticated label classification against my photos &lt;em&gt;on device&lt;/em&gt; - so if I search for "dog" they show me all of the photos I've taken of dogs. I was certain that this information was in a SQLite database somewhere, and this week I finally figured out where: the &lt;code&gt;~/Pictures/Photos\ Library.photoslibrary/database/search/psi.sqlite&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;After a &lt;a href="https://github.com/dogsheep/photos-to-sqlite/issues/16"&gt;whole lot of digging around&lt;/a&gt; I worked out how to join it against my photos database, and as a result I can now run SQL queries that select all of my photos of dogs! Or seals! Or pelicans!&lt;/p&gt;

&lt;p&gt;Combine that with some &lt;a href="https://github.com/dogsheep/photos-to-sqlite/issues/15"&gt;intriguing automated photo quality scores&lt;/a&gt; I found (&lt;code&gt;order by ZPLEASANTCOMPOSITIONSCORE&lt;/code&gt; is a thing) and my Dogsheep photos solution is beginning to take shape. I'll write it up in more detail over the next few days.&lt;/p&gt;

&lt;h3&gt;Other projects&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/conditional-get/releases/tag/0.2a"&gt;conditional-get 0.2a&lt;/a&gt; now streams large downloads to disk and shows a progress bar in &lt;code&gt;-v&lt;/code&gt; mode.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/datasette-atom/releases/tag/0.5"&gt;datasette-atom 0.5&lt;/a&gt; adds support for the Atom author element, so you can add author information to feed entries.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/sqlite-utils"&gt;sqlite-utils&lt;/a&gt; went through versions 2.7, 2.7.1, 2.7.2 and 2.8 - see &lt;a href="https://sqlite-utils.readthedocs.io/en/stable/changelog.html#v2-8"&gt;release notes&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/dogsheep/github-to-sqlite/releases"&gt;github-to-sqlite&lt;/a&gt; 2.1 and 2.2 introduced two new commands:&lt;ul&gt;&lt;li&gt;&lt;code&gt;github-to-sqlite scrape-dependents&lt;/code&gt; scrapes the GitHub &lt;a href="https://github.com/simonw/datasette/network/dependents"&gt;dependents page&lt;/a&gt; (which isn't yet in an API) to gather details of repos that depend on your repos&lt;/li&gt;&lt;li&gt;&lt;code&gt;github-to-sqlite stargazers&lt;/code&gt; fetches the users who have starred a specific list of repositories&lt;/li&gt;&lt;li&gt;I also added several new SQL views for better exploring the retrieved data&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;My &lt;a href="https://github-to-sqlite.dogsheep.net/"&gt;github-to-sqlite demo&lt;/a&gt; is updated daily via a &lt;a href="https://github.com/dogsheep/github-to-sqlite/blob/master/.github/workflows/deploy-demo.yml"&gt;scheduled GitHub Action&lt;/a&gt; and retrieves all kinds of data about my various &lt;a href="https://github.com/dogsheep"&gt;Dogsheep&lt;/a&gt; projects plus sqlite-utils and Datasette. Being able to see things like the &lt;a href="https://github-to-sqlite.dogsheep.net/github/dependent_repos"&gt;most recent dependent repos&lt;/a&gt; across my projects in one place is &lt;em&gt;extremely&lt;/em&gt; useful.&lt;/p&gt;

&lt;h3&gt;TIL this week&lt;/h3&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/node/constant-time-compare-strings.md"&gt;Constant-time comparison of strings in Node&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/sqlite/list-all-columns-in-a-database.md"&gt;List all columns in a SQLite database&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/sqlite/compile-sqlite3-ubuntu.md"&gt;Compile a new sqlite3 binary on Ubuntu&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/firefox/search-across-all-resources.md"&gt;Search across all loaded resources in Firefox&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/github/dependencies-graphql-api.md"&gt;Accessing repository dependencies in the GitHub GraphQL API&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/python/platform-specific-dependencies.md"&gt;Use setup.py to install platform-specific dependencies&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="https://github.com/simonw/til/blob/master/python/installing-upgrading-plugins-with-pipx.md"&gt;Installing and upgrading Datasette plugins with pipx&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sqlite"&gt;sqlite&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/observable"&gt;observable&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/dogsheep"&gt;dogsheep&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/weeknotes"&gt;weeknotes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/apple-photos"&gt;apple-photos&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="sqlite"/><category term="datasette"/><category term="observable"/><category term="dogsheep"/><category term="weeknotes"/><category term="apple-photos"/></entry></feed>