How your smartphone tells your story: A dive into Android activity data


Beautiful Soup to the rescue.

It's a really popular Python library for web scraping.

UPDATE: As of May 2019, the newer export comes in JSON format, but about a tenth of the records from more than 2 years back are missing, seemingly at random.

Strange how logs disappear!

The only 2 important pieces of information in this file were the app name and the time at which it was opened.

Here is the div associated with one use of an app from the HTML file:

I parsed the HTML using Beautiful Soup, extracted the app names and timestamps, and dumped them into a pandas DataFrame.
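The parsing step described above might look roughly like the sketch below. The markup, the class names (`entry`, `app`, `time`) and the timestamp format are all hypothetical stand-ins, since the real export's structure is not reproduced here; only the Beautiful Soup calls themselves are standard.

```python
from datetime import datetime

from bs4 import BeautifulSoup

# Hypothetical markup: the real export's tags, class names and
# timestamp format will differ and need to be adapted.
SAMPLE_HTML = """
<div class="entry"><a class="app">Instagram</a><span class="time">2018-08-01 09:15:00</span></div>
<div class="entry"><a class="app">Gmail</a><span class="time">2018-08-01 12:40:00</span></div>
"""


def parse_activity(html):
    """Return one {'app', 'opened_at'} record per activity div."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for entry in soup.find_all("div", class_="entry"):
        app = entry.find("a", class_="app")
        time = entry.find("span", class_="time")
        if app is None or time is None:
            continue  # skip divs that do not describe an app launch
        records.append({
            "app": app.get_text(strip=True),
            "opened_at": datetime.strptime(
                time.get_text(strip=True), "%Y-%m-%d %H:%M:%S"
            ),
        })
    return records


records = parse_activity(SAMPLE_HTML)
print(records)
```

A list of dictionaries like this can be passed straight to `pandas.DataFrame(records)` to get one row per app launch.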

Over to cleaning the data.

Skadoosh!

Panda (by Mélody P on Unsplash)

With about 80,000 data points over a period of 3.5 years from 474 unique apps (when did I install this many?!), it was pretty evident something was wrong.

A peek into the data showed that many system apps were included in the list, like the home launcher, face unlock, system clock, etc.; there were 102 of these.

Here I also realized that the data came from 3 devices: a Xiaomi Mi 4, a Moto G5, and a OnePlus 6.

After removing all the entries associated with the system apps, I was left with 372 apps and 67236 app interactions over a period of 1134 days.
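As a rough illustration of this cleaning step, here is a minimal sketch in plain Python. The sample records and the `SYSTEM_APPS` blocklist are made up for the example; in practice the blocklist would be assembled by hand after looking through the app names.

```python
from datetime import date

# Hypothetical records: (app name, date opened).
interactions = [
    ("Instagram", date(2018, 8, 1)),
    ("Face unlock", date(2018, 8, 1)),
    ("System clock", date(2018, 8, 2)),
    ("Gmail", date(2018, 8, 3)),
    ("Instagram", date(2018, 8, 5)),
]

# Assumed blocklist of system apps, compiled manually.
SYSTEM_APPS = {"Face unlock", "System clock"}


def drop_system_apps(records, system_apps):
    """Keep only the records whose app is not a system app."""
    return [(app, day) for app, day in records if app not in system_apps]


def summarize(records):
    """Return (unique apps, total interactions, days spanned inclusive)."""
    apps = {app for app, _ in records}
    days = [day for _, day in records]
    span_days = (max(days) - min(days)).days + 1
    return len(apps), len(records), span_days


user_records = drop_system_apps(interactions, SYSTEM_APPS)
unique_apps, total_interactions, span_days = summarize(user_records)
print(unique_apps, total_interactions, span_days)
```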

Here is a word cloud generated from the different words in the names of the apps:

Of the 372 apps, I was pretty sure that I did not have most of them installed for long periods of time.

Some apps I installed only to try them out.

Only 21 apps had over 500 interactions:

['Wear OS by Google Smartwatch (was Android Wear)', 'Instagram', 'Google', 'Twitter', 'Google Chrome: Fast & Secure', 'Gmail', 'WhatsApp Messenger', 'Android', 'Google Photos', 'Splitwise', 'YouTube', 'Messenger – Text and Video Chat for Free', 'Contacts', 'Google Calendar', 'Home', 'Moto Camera', 'Facebook', 'Truecaller: Caller ID, spam blocking & Call Record', 'Calorie Counter – MyFitnessPal', 'com.


camera', 'Inshorts – 60 words News summary']

MyFitnessPal and Inshorts surprised me; maybe they had so many interactions because of their constant push notifications.

Similarly, Truecaller spun up with each call and hence made it onto the list.

The list includes 8 Google apps, 3 social media apps, 2 messaging apps, and 2 camera apps.

Also, the number of Google Apps was about a tenth of all the other apps.

For further analysis, I only considered apps that had more than 100 interactions (those that I used about once every week on average).

This brought the unique app-count down to 59.

Here are the interaction counts for these 59 apps:

Also, when I say ‘interactions’, I only mean the number of times I opened the app; it in no way indicates how long I spent on it.
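The "more than 100 interactions" cut-off can be sketched with a simple counter. The function name `frequent_apps` and the toy log are illustrative assumptions, not the original analysis code.

```python
from collections import Counter


def frequent_apps(app_names, min_interactions=100):
    """Count launches per app and keep apps with MORE than the threshold."""
    counts = Counter(app_names)
    return {app: n for app, n in counts.items() if n > min_interactions}


# Toy log with one entry per app launch (made-up numbers).
log = (
    ["Instagram"] * 150
    + ["Gmail"] * 101
    + ["Splitwise"] * 100
    + ["Trial app"] * 3
)

kept = frequent_apps(log)
print(kept)
```

Note that an app with exactly 100 launches is dropped here, because the threshold is read strictly as "more than 100".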

History and Lore

Here is a plot of my daily app interactions over the 3.5 years for which this data was collected:

Let's try to break this down.

Chapter 1: Smartphones through the ages

I seemed to be really excited about my first smartphone.

On average, I used it 33% more in the first two months than in the remaining 22 months.

But my overall cellphone usage increased significantly when I got new cellphones, about 32% more than when I had the older phone.
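Comparisons like "33% more" boil down to comparing average daily interaction counts between two periods. A minimal sketch, with made-up daily counts chosen only to make the arithmetic easy to follow:

```python
def percent_more(period_a, period_b):
    """How much higher (in %) the mean of period_a is than that of period_b."""
    mean_a = sum(period_a) / len(period_a)
    mean_b = sum(period_b) / len(period_b)
    return (mean_a / mean_b - 1) * 100


# Hypothetical daily interaction counts for the two periods.
first_two_months = [80, 80]
remaining_months = [60, 60, 60]

increase = percent_more(first_two_months, remaining_months)
print(round(increase, 1))
```

Comparing means rather than totals matters here, because the two periods are of very different lengths.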

My very first Android phone lasted 753 days, the next one 253 days, and the current one is at the tender age of 136 days.

Chapter 2: Blank Spaces

Knowing myself, I was curious about the blank spaces in the timeline: how could I have not used the phone at all for more than a couple of days?

So I dug a little deeper to figure out what happened during these time periods.

December 2015

I went into these particular dates in the data, searched for the apps that I used, and the only social media app I found was Instagram.

I went in there to check and found only one post, with my longtime school friends.

February 2017

Again, I resorted to social media to track where I was during this period in 2017.

I saw only one post on Facebook, and that too just before the blackout:

I had just returned from India and got back to my studies, starting with my very first Master's course in Computer Science.

This was the first time I had decided to keep my cellphone away while studying, and it was quite productive.

Chapter 3: The United States of Apps: Part 1

From here on, I only used the data from the time I was in the United States, as it better reflected my social media consumption and my current smartphone usage.

While in India, I had lectures at college five days a week and was living with my parents.

Result? Somewhat controlled sleep cycles: waking up around 6-7 in the morning.

Casual social media usage was evident throughout the day, but I had my college friends to cut down the time spent on social media.

WhatsApp was the communication app I used, mainly because almost everyone in my friend circle was on it, with the majority of the use at night before winding up the day.

The light patch between 1 am and 7 am shows my average sleep time.
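The data behind a heat map like this is essentially a table of launch counts per app and hour of day. The sketch below assumes one row per app and one column per hour, which is a guess at the layout of the plot; the sample timestamps are made up.

```python
from collections import defaultdict
from datetime import datetime


def hourly_matrix(records):
    """Map each app to a list of 24 launch counts, one per hour of day."""
    matrix = defaultdict(lambda: [0] * 24)
    for app, opened_at in records:
        matrix[app][opened_at.hour] += 1
    return dict(matrix)


# Hypothetical (app, timestamp) records.
sample = [
    ("WhatsApp", datetime(2016, 3, 1, 22, 5)),
    ("WhatsApp", datetime(2016, 3, 2, 22, 40)),
    ("WhatsApp", datetime(2016, 3, 2, 8, 15)),
    ("Instagram", datetime(2016, 3, 1, 13, 0)),
]

matrix = hourly_matrix(sample)
print(matrix["WhatsApp"])
```

A run of zeros in the early-morning columns is what shows up as the sleep patch when such a table is drawn as a heat map.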

Coming to the US was a new experience.

I didn’t know many people; well, almost no one to start with.

So I had to resort to social media to stay in touch with my friends and family and to keep myself occupied in my spare time; WhatsApp usage increased drastically.

Facebook and Instagram usage also increased, as those were the only means by which I could keep up with what was happening in the lives of those back home with whom I couldn’t communicate regularly.

Another evident thing was my sleep schedule: totally messed up!

I didn’t have lectures every day, and when I did, they were late in the day, so I had no reason to get up early (other than the obvious fact that it’s healthy!).

Also, I felt more productive working late nights.

The gloomy Seattle weather didn’t help much either, getting me to wake up just before or around noon.

Late night messaging also increased due to the time zone difference between Seattle and all my friends back in India.

The dark patch between 5-6 am and 11 am-12 pm shows my average sleep time during my first year in the US!

Later, after the first 2 semesters, my schedule improved a lot.

While the above heat map was only for my first year in the US, the one below shows data for my stay till now.

Facebook Messenger usage also saw some increase as most of my friends here in the US didn’t use WhatsApp, so this was our means of communication other than calls.

As for productivity apps, they weren’t used as much as they should have been.

Contacts (or calls) were predominant during late nights, when it was morning back home in India.

Gmail was really bright just after noon: as soon as I got up (during the first year), and while I was at work or school at noon (during the second year).

I began using email many times more than I did back in India, mainly for my internship applications and other communication from the university.

Chapter 4: Work

Once I started working 40 hours a week during my internship, some apps’ usage saw a stark change, especially the productivity apps.

I had presumed that I would be using email more while at work, which I did, but not on my phone.

Most of the emailing happened on the company’s computer.

Email usage during the 2 months before getting an internship was twice the usage after getting one.

Before getting a job, each and every email notification felt like a notification from a prospective interviewer so I made it a point to read every email I received on my phone.

The overall weekly trend stayed the same: rare usage during the weekends.

The most astounding realization I had concerned something I always used to think was just a myth whenever I heard it from someone: “Your daily schedule will get back to normal once you start working”.

Well, it did, somewhat.

I now sleep around 1 am and get up around 7–8 am.

Not quite a healthy practice, but still better than sleeping at 4 or 5 am.

It's been just 2 months, and I am still getting used to this change in my schedule.

These violin plots show the distribution of app interactions for different periods.

The thinner regions show my sleeping hours, where app use is minimal (these do not appear as zeros because violin plots use interpolation and approximations to plot the density).

Even though the sleep cycle is almost back to how it was in India, the overall social media interactions have increased since the internship began.
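On the point about violin plots never quite reaching zero: they are typically drawn from a kernel density estimate, which smooths each observation into a small bump. The sketch below is a hand-rolled Gaussian KDE in plain Python, not the plotting library's actual implementation, and the sample hours and bandwidth are made up. It also ignores the fact that hours wrap around at midnight.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian bumps on each sample."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical launch hours: nothing at all between 2 am and 7 am.
hours = [0, 1, 8, 9, 9, 12, 13, 18, 21, 22, 23]

density = gaussian_kde(hours, bandwidth=1.5)
raw_count_at_4 = hours.count(4)
density_at_4 = density(4)
density_at_9 = density(9)
print(raw_count_at_4, density_at_4, density_at_9)
```

Even though there are no launches at 4 am, the estimated density there is small but positive, because the bumps from the neighbouring hours spill over. That is why the sleeping hours appear as a thin neck in the violin rather than a gap.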

Chapter 5: The United States of Apps: Part 2

I stopped using Facebook on my phone when I got the new device; I felt that I spent way too much time on it, endlessly scrolling through the feed.

I only use Facebook on my computer now.

Instagram and Twitter saw a dip in usage towards August/September 2018, when I was preparing for my interviews in full swing.

Twitter, on the other hand, saw a steady increase in use since early 2018.

I felt it was the fastest source of information on current affairs and other topics.

I started the #100DaysOfML challenge by Siraj Raval in August, and since then Twitter has become a big source of links to Machine Learning articles and current happenings in that domain.

YouTube interactions in the USA

Around December 2016, I started subscribing to tech channels, vloggers, and late-night talk shows on YouTube, and since then my YouTube usage has just kept going up.

By mid 2018, there weren’t any good new YouTube channels coming up so my subscriptions have stagnated since, and also because I actively stopped diving into rabbit holes on YouTube.

Chapter 6: Click!

Some of my camera interactions.

May 2019

Since writing this article 6 months ago, I have made it a point to keep my smartphone usage in check.

Since then, YouTube and many other social media platforms have come up with a feature that gives you notifications when you have spent a certain amount of time on the app.

On the other hand, night mode on these apps has proved to be a bane in disguise.

I haven’t had a look at my new stats, but I’m pretty sure my night-time use of the apps has shot up.

Also, now that the data is readily available in JSON format, I plan on doing a much more detailed analysis using Tableau, a faster way of producing the same visualizations with better interactivity.

Various other data points like email interactions, location-based data, etc. are also available, making this a good source of data for training ML algorithms on real-life data.

Periodicity analysis and seasonal variations are some of the studies that I would surely look into.

I’m currently reading the book 21 Lessons for the 21st Century by Yuval Noah Harari.

Reading this book after completing Sapiens, another book by the same author, just opens up your mind about how much the human race has evolved.

How much it has changed, for better or worse.

How the human race, which once tamed jungles and animals and built tools to help civilizations grow, is now being held captive by the very tools it's creating.

The pace at which innovation is burgeoning cannot be stopped and neither should it be, but we as humans must be wary of the unintended ill effects that these means of convenience pose to us, as individuals and as a society.

