I wrote Python code that uses Pillow to extract EXIF data from thousands of my personal photographs. These are some of the oddities I encountered:
- Null bytes
  The Make, Model and various Date/TimeStamp fields can have null bytes at the end. It's a very common problem, and it can break things further down the pipeline. You must use .replace('\x00', '').strip() on most fields.
- Invalid date formats
  The default EXIF date format is '%Y:%m:%d %H:%M:%S'. Some photos used '%Y-%m-%d %H:%M:%S' instead.
- Missing GPS attributes
  One photo had a GPSDateStamp, but no GPSTimeStamp. One had a GPSLongitude but no GPSLongitudeRef.
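
Here is a minimal sketch of how the three cases above could be handled. It is not the original code. It assumes the tags have already been read and that GPS values sit in a dict keyed by tag name (for example after mapping numeric IDs through Pillow's `PIL.ExifTags.GPSTAGS`). The helper names `clean_text`, `parse_exif_datetime`, `gps_longitude` and `gps_datetime` are hypothetical, and returning `None` for incomplete data is just one possible policy.

```python
from datetime import datetime, timedelta

# Colon-separated is the standard EXIF layout; the dash variant turned up in some photos.
EXIF_DATE_FORMATS = ("%Y:%m:%d %H:%M:%S", "%Y-%m-%d %H:%M:%S")


def clean_text(value):
    """Drop null bytes and surrounding whitespace from a string-like EXIF value."""
    if isinstance(value, bytes):
        value = value.decode("utf-8", errors="replace")
    return str(value).replace("\x00", "").strip()


def parse_exif_datetime(value):
    """Return a datetime, or None when the value matches none of the known formats."""
    text = clean_text(value)
    for fmt in EXIF_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def gps_longitude(gps):
    """Return signed decimal degrees, or None if the value or its reference is missing."""
    dms = gps.get("GPSLongitude")
    ref = gps.get("GPSLongitudeRef")
    if dms is None or ref is None:
        return None  # without E/W the sign is unknown, so don't guess (assumed policy)
    degrees, minutes, seconds = (float(part) for part in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if clean_text(ref).upper() == "W" else decimal


def gps_datetime(gps):
    """Combine GPSDateStamp and GPSTimeStamp, or return None if either is missing."""
    date = gps.get("GPSDateStamp")
    time = gps.get("GPSTimeStamp")
    if date is None or time is None:
        return None  # assumed policy: a date without a time is treated as unusable
    hours, minutes, seconds = (float(part) for part in time)
    day = datetime.strptime(clean_text(date), "%Y:%m:%d")
    return day + timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

For example, `parse_exif_datetime("2015-06-01 12:30:00\x00")` strips the trailing null byte and falls back to the dash format, while `gps_longitude({"GPSLongitude": (10, 30, 0)})` returns `None` because the hemisphere is unknown.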
That's all for now. The code seems to process tens of thousands of JPG photos without issues.