Before selling my Nexus 4, I made a backup of all the text messages I sent or received since I bought the device 2 years ago. Although the effort was mainly precautionary, I thought I might get some interesting data out of it.
Organizing the data
The messages were exported to XML using SMS Backup and Restore. Each line contained the contact name, the phone number, the direction of the message, the date and the body of the message. It only took a few lines of Python to export the data to a more suitable SQLite database. Normalization be damned, everything was dumped in a single table, although the phone numbers were properly formatted first.
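For the curious, the import step could look roughly like the sketch below. It is not the original script. It assumes the backup is a flat list of `<sms>` elements carrying `contact_name`, `address`, `type` (with `2` meaning sent), `date` (milliseconds since the Unix epoch) and `body` attributes. The `messages` table, its column names and the `normalize_number` rule (digits only, North American country code dropped) are hypothetical choices made for illustration.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET


def normalize_number(raw):
    """Keep digits only and drop a leading North American country code (assumed rule)."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def load_messages(xml_text, conn):
    """Insert every <sms> element of the backup into one denormalized table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "contact TEXT, number TEXT, direction TEXT, sent_at INTEGER, body TEXT)"
    )
    rows = []
    for sms in ET.fromstring(xml_text).iter("sms"):
        rows.append((
            sms.get("contact_name"),
            normalize_number(sms.get("address", "")),
            # Assumption: type "2" marks a sent message, anything else is received.
            "sent" if sms.get("type") == "2" else "received",
            int(sms.get("date", "0")) // 1000,  # milliseconds -> seconds
            sms.get("body", ""),
        ))
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Calling `load_messages(open("backup.xml", encoding="utf-8").read(), sqlite3.connect("sms.db"))` would fill the table in one pass; both file names are placeholders.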
The big picture
The database contains 17 658 text messages exchanged with 130 contacts between April 15 2013 and March 15 2015.
On average, I sent 12.11 messages per day and received 12.76 for a total of 24.87 exchanges per day over 710 days. There were 87 days without any message exchanges.
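Numbers like these can be pulled out of the database with a single grouped query. The sketch below is only an illustration: it assumes a `messages` table with a hypothetical `sent_at` column holding Unix timestamps in seconds, it groups by UTC calendar day, and it expects every message to fall inside the given date range.

```python
import sqlite3
from datetime import date


def daily_activity(conn, first_day, last_day):
    """Average messages per day and number of days without any message."""
    total_days = (last_day - first_day).days + 1  # inclusive range
    rows = conn.execute(
        "SELECT date(sent_at, 'unixepoch') AS day, COUNT(*) "
        "FROM messages GROUP BY day"
    ).fetchall()
    total = sum(count for _, count in rows)
    return {
        "total_days": total_days,
        "per_day": round(total / total_days, 2),
        # Days that never show up in the grouped result had no messages.
        "silent_days": total_days - len(rows),
    }
```

Adding a `WHERE direction = 'sent'` clause (again a hypothetical column) would split the daily rate into sent and received.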
In general, I tend to favour short replies. My messages average 28.02 characters while the messages I receive average 43.26 characters. I tend to be far more verbose with professional contacts, since I spare them the usual "ok", "lol" and "haha".
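The length comparison boils down to an average over `LENGTH(body)`, grouped by direction. A minimal sketch, assuming a `messages` table with hypothetical `direction` and `body` columns:

```python
import sqlite3


def average_length(conn):
    """Average number of characters per message, keyed by direction."""
    rows = conn.execute(
        "SELECT direction, AVG(LENGTH(body)) FROM messages GROUP BY direction"
    )
    return {direction: round(avg, 2) for direction, avg in rows}
```

SQLite's `LENGTH` counts characters rather than bytes for text values, so accented letters count once.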
Love in 160 characters
Aside from single-word responses, "Je t'aime" and its variations were the most exchanged messages by a landslide, so I took it upon myself to investigate further.
Unsurprisingly, a majority of the messages were exchanged with significant others. These 10 732 messages split between two people account for 61% of all messages. A significant chunk of the remaining messages were exchanged with my close friends and family.
Taking a closer look at my messaging rate over time, it's easy to see the effect of love on my rate of messaging. The peak around January 1st, 2014 represents the days around the moment a relationship became official. My busiest day was on December 26, 2014, when we found in each other a convenient way to escape the least exciting parts of the holidays.
The correlation becomes even more obvious once we only look at the messages exchanged with that person. Key events in the relationship are marked with either peaks or dead silence. We met on November 24 2013, made it official on January 1 2014 and broke up on November 4 2014. The lull in March also sticks out like a sore thumb. Although we remained good friends, we mostly communicate via Facebook, hence the lack of data points after December 2014.
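Peaks like these are easy to find by counting messages per day, either overall or for a single contact. The sketch below is illustrative only: it assumes a `messages` table with hypothetical `contact` and `sent_at` (Unix seconds) columns, and it groups by UTC calendar day.

```python
import sqlite3


def busiest_days(conn, contact=None, limit=3):
    """Return (day, count) pairs for the busiest days, optionally for one contact."""
    where, params = "", []
    if contact is not None:
        where, params = "WHERE contact = ?", [contact]
    query = (
        "SELECT date(sent_at, 'unixepoch') AS day, COUNT(*) AS n "
        f"FROM messages {where} GROUP BY day ORDER BY n DESC, day LIMIT ?"
    )
    return conn.execute(query, params + [limit]).fetchall()
```

Dropping the `LIMIT` and sorting by day instead of by count gives the full series needed to plot the messaging rate over time.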
Looking at messages exchanged with contacts without names, I can clearly see the single-day March 2014 apartment hunt followed by the June 2014 scramble to get out of it after finding mice in my pantry.
As discovered earlier, there is a significant margin of error in the data since a lot of messages were exchanged via Facebook. My family also tends to prefer phone calls to text messages and I rarely text my best friend since I see him almost every day.
Nonetheless, I found the data pretty darn interesting. There is an "Artifacts" folder on my computer in which I put historically significant documents such as old conversations, letters of note and school projects. This massive compilation of text messages will be a fantastic addition to this growing collection.
The people concerned by these statistics approved the publication of this article.