Weird things you'll find when reading EXIF data

I wrote Python code that uses Pillow to extract EXIF data from thousands of my personal photographs. These are some of the oddities I encountered:

  • Null bytes
    The Make, Model and various Date/TimeStamp fields can have null bytes at the end. It's a very common problem, and it can break things further down the pipeline. I ended up calling .replace('\x00', '').strip() on most text fields.
  • Invalid date formats
    The standard EXIF date format is '%Y:%m:%d %H:%M:%S' (colons in the date part). Some photos used '%Y-%m-%d %H:%M:%S' instead.
  • Missing GPS attributes
    One photo had a GPSDateStamp, but no GPSTimeStamp. One had a GPSLongitude but no GPSLongitudeRef.
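
A minimal sketch of how these three cases could be handled defensively. The helper names (clean_text, parse_exif_datetime, gps_to_decimal) are made up for illustration and are not part of Pillow. The sketch assumes the values have already been read from the file, and that GPS coordinates arrive as a (degrees, minutes, seconds) sequence of numbers that float() accepts. It only covers the missing-ref case for coordinates; a GPSDateStamp without a GPSTimeStamp would need a similar "return None" check.

```python
from datetime import datetime

# Formats seen in my photos: the standard EXIF one first, then the dashed variant.
EXIF_DATE_FORMATS = ("%Y:%m:%d %H:%M:%S", "%Y-%m-%d %H:%M:%S")


def clean_text(value):
    """Return an EXIF text value without null bytes or surrounding whitespace."""
    if isinstance(value, bytes):
        value = value.decode("ascii", errors="replace")
    return str(value).replace("\x00", "").strip()


def parse_exif_datetime(value):
    """Try each known format in turn; return None if none of them match."""
    text = clean_text(value)
    for fmt in EXIF_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def gps_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere ref to decimal degrees.

    Returns None when either piece is missing or malformed, rather than
    guessing a hemisphere.
    """
    if not dms or not ref or len(dms) != 3:
        return None
    ref = clean_text(ref).upper()
    if ref not in ("N", "S", "E", "W"):
        return None
    degrees, minutes, seconds = (float(x) for x in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -decimal if ref in ("S", "W") else decimal
```

Returning None instead of raising lets a batch job keep going and skip only the broken field, which seems like the right trade-off when processing thousands of files.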

That's all for now. With these workarounds in place, the code seems to process tens of thousands of JPG photos without issues.