Data Storage Policy for Android Apps

Dinesh Shanmugam C
Jun 19, 2021

Almost every Android app needs to persist data. With this comes the challenge of choosing among the various ways of storing it, and no single solution fits every storage requirement. This is where a data storage policy can help keep the app healthy as it scales.

There are a few questions that we need to ask ourselves when dealing with persistent data.

  • Type of data: What kind of data am I storing? E.g. text, structured, binary, or relational data?
  • Volume: How much data am I expected to store?
  • Scale: Will the data grow in volume or change in structure over time?
  • Freshness: How critical is fresh data to the app? (Social media apps can live with stale data for an offline experience, but mission-critical apps cannot.)

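And then there is a choice of APIs that allow us to persist data. Data persistence in Android apps is facilitated by APIs such as SharedPreferences, DataStore, SQLite (with Room), Content Providers, and disk storage.

In addition to these, there are NoSQL databases like Couch, Realm, and Firebase database.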

For building high-quality applications, it is necessary to know which API to use when.

  • Key values: SharedPreferences / DataStore
  • Structured data that has a schema: SQLite / Room
  • Data shared between applications: Content Providers
  • Non-textual data like images and music: Disk storage (scoped storage)
  • Unstructured data, or data whose structure changes often: NoSQL
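The mapping above could be written down as a small lookup that a team keeps next to its data storage policy. This is only a sketch in plain Java: the names `DataKind`, `Store`, and `StoragePolicy` are made up for illustration and are not part of any Android API.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical categories of data, mirroring the list above.
enum DataKind { KEY_VALUE, STRUCTURED, SHARED_BETWEEN_APPS, BINARY_MEDIA, UNSTRUCTURED }

// Hypothetical labels for the storage options named in this article.
enum Store { SHARED_PREFERENCES_OR_DATASTORE, SQLITE_OR_ROOM, CONTENT_PROVIDER, DISK_STORAGE, NOSQL }

final class StoragePolicy {
    private static final Map<DataKind, Store> POLICY = new EnumMap<>(DataKind.class);

    static {
        POLICY.put(DataKind.KEY_VALUE, Store.SHARED_PREFERENCES_OR_DATASTORE);
        POLICY.put(DataKind.STRUCTURED, Store.SQLITE_OR_ROOM);
        POLICY.put(DataKind.SHARED_BETWEEN_APPS, Store.CONTENT_PROVIDER);
        POLICY.put(DataKind.BINARY_MEDIA, Store.DISK_STORAGE);
        POLICY.put(DataKind.UNSTRUCTURED, Store.NOSQL);
    }

    private StoragePolicy() {}

    // Returns the storage option the policy recommends for the given kind of data.
    static Store recommend(DataKind kind) {
        return POLICY.get(kind);
    }
}
```

Calling `StoragePolicy.recommend(DataKind.STRUCTURED)` returns `Store.SQLITE_OR_ROOM`. Keeping the policy in one place like this makes exceptions visible in code review instead of leaving them scattered across the app.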

Byte sized data

Store only small values that are not modified too often in SharedPreferences. Boolean and integer values are good candidates.

It is very easy to overuse SharedPreferences. A classic case is storing stringified JSON in it. Nothing is strictly wrong with doing this, but over time it can cause nasty ANRs that are hard to track down. There is a reason for this. A preferences file is an XML file that is read from disk, parsed, and loaded into RAM as a HashMap, and the first read blocks the calling thread, which is usually the UI thread, until that load finishes. The heavier the file, the longer the load takes and the longer the UI thread stays blocked, which can end in an ANR.

Once a SharedPreferences file grows beyond roughly 20 KB, it is time to be wary.
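A rule of thumb like this is easy to turn into a debug-time check. The sketch below is plain Java and assumes you can point it at the preferences XML file (on Android these files usually live in the app's `shared_prefs` directory). `PrefsSizeCheck` and `WARN_BYTES` are hypothetical names, and the threshold is just the rough figure from the text, not a platform limit.

```java
import java.io.File;

final class PrefsSizeCheck {
    // Rough warning threshold from the text (~20 KB); an assumption, not a platform limit.
    static final long WARN_BYTES = 20 * 1024;

    private PrefsSizeCheck() {}

    // True when a file of the given size is larger than the warning threshold.
    static boolean isTooLarge(long sizeInBytes) {
        return sizeInBytes > WARN_BYTES;
    }

    // Convenience overload: File.length() returns 0 for a missing file, so that case reports false.
    static boolean isTooLarge(File prefsFile) {
        return isTooLarge(prefsFile.length());
    }
}
```

A check like this could run in debug builds and log a warning, so a growing preferences file is noticed long before it shows up as an ANR.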

This is being solved to a good extent by Jetpack DataStore, which replaces synchronous SharedPreferences access with an asynchronous API. This requires a mental shift for developers who have been reading SharedPreferences synchronously in their code, but it is all for the greater good of making apps smooth and performant.

The trend is to access all data asynchronously.
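To make the idea of asynchronous access concrete, here is a minimal sketch in plain Java. It is not DataStore, which is a Kotlin-first API built on coroutines and Flow; it only illustrates the general pattern of running a slow load on a background thread and handing the result back later. `AsyncSettings` and its loader parameter are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

final class AsyncSettings {
    // One daemon background thread for disk work, so the caller's thread is never blocked.
    private final ExecutorService io = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "settings-io");
        t.setDaemon(true);
        return t;
    });

    // Hypothetical slow loader, e.g. reading and parsing a file from disk.
    private final Supplier<Map<String, String>> slowLoader;

    AsyncSettings(Supplier<Map<String, String>> slowLoader) {
        this.slowLoader = slowLoader;
    }

    // Starts the load on the background thread and returns immediately with a future.
    CompletableFuture<Map<String, String>> load() {
        return CompletableFuture.supplyAsync(slowLoader, io);
    }

    // Stops accepting new work; call when the settings are no longer needed.
    void shutdown() {
        io.shutdown();
    }
}
```

A caller would attach a continuation, for example `settings.load().thenAccept(values -> render(values))`, where `render` is a placeholder for whatever consumes the data. In an Android app the continuation would additionally have to hop back to the main thread before touching views.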

Structured Data

Highly relational, structured data should be stored in an SQL-backed database. SQLite is low level and needs a lot of boilerplate, but it is optimized for mobile and is highly recommended. The Room persistence library suits this use case: it is a wrapper around SQLite that provides useful functionality such as

  • ORM
  • Migration paths
  • Annotations that are convenient to use

Example use case for Room:

  • A music application that can be used offline. Metadata about songs is a good example of relational data. It spans many entities and attributes, such as artists, genres, song durations, and thumbnails.
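The relational shape of such data can be sketched in plain Java. With Room, classes like these would typically be `@Entity` classes and the lookup would be a `@Query` method on a `@Dao`; the version below leaves the annotations out so it runs without Android, and every class and field name in it is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// One row per artist.
final class Artist {
    final long id;
    final String name;

    Artist(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// One row per song; artistId plays the role of a foreign key to Artist.id.
final class Song {
    final long id;
    final String title;
    final long artistId;
    final String genre;
    final int durationSeconds;

    Song(long id, String title, long artistId, String genre, int durationSeconds) {
        this.id = id;
        this.title = title;
        this.artistId = artistId;
        this.genre = genre;
        this.durationSeconds = durationSeconds;
    }
}

final class MusicCatalog {
    private final List<Song> songs = new ArrayList<>();

    void add(Song song) {
        songs.add(song);
    }

    // In-memory stand-in for a query such as "SELECT * FROM song WHERE artistId = ?".
    List<Song> songsBy(Artist artist) {
        List<Song> result = new ArrayList<>();
        for (Song s : songs) {
            if (s.artistId == artist.id) {
                result.add(s);
            }
        }
        return result;
    }
}
```

The point of the sketch is the foreign key: songs refer to artists by id instead of embedding them, which is exactly the kind of relationship an SQL-backed store handles well.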

Unstructured data

Offline First Applications:

There is a thin line between Offline first applications and Offline applications.

Let’s consider the example of a music application, where freshness of data is not mandatory but it is important that the user can use the app without an internet connection.

Let’s consider another example: an eCommerce application that shows recommendations. The data should be as fresh as possible, but it is still OK for it to be relevant rather than real time.

Now we have a database that needs to be refreshed as often as possible and presented to the user without much latency. This is where cloud-synced databases like Couch, Realm, and Firebase database are a good fit. The data is synced even when the app is not actively in use and is presented to the user as soon as the app is opened.

Example use cases for cloud-synced databases:

Chat applications

  • Supports real-time chat when the network is available
  • Supports loading of chats when the network is not available
  • Supports refreshing data when remote data has changed
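
The three behaviours listed above boil down to "always read locally, refresh from remote when possible". The sketch below shows that pattern in plain Java. It is not how Couch, Realm, or Firebase implement syncing; `ChatRepository` and its remote supplier are hypothetical, and the remote source is assumed to return an empty `Optional` when the network is unavailable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

final class ChatRepository {
    // Local copy of the messages; a real app would persist this on disk.
    private final List<String> localCache = new ArrayList<>();

    // Hypothetical remote source: empty when the network is unavailable.
    private final Supplier<Optional<List<String>>> remote;

    ChatRepository(Supplier<Optional<List<String>>> remote) {
        this.remote = remote;
    }

    // Always answers from the local copy, so it also works offline.
    List<String> messages() {
        return new ArrayList<>(localCache);
    }

    // Pulls remote data when reachable and replaces the local copy.
    // Returns true only if the local copy actually changed.
    boolean refresh() {
        Optional<List<String>> fresh = remote.get();
        if (!fresh.isPresent() || fresh.get().equals(localCache)) {
            return false;
        }
        localCache.clear();
        localCache.addAll(fresh.get());
        return true;
    }
}
```

The UI reads only through `messages()`, and `refresh()` can be triggered in the background whenever connectivity returns, which keeps the offline and online paths identical from the screen's point of view.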

Choice of data storage APIs for different types of data

The outliers

One of the most frequently encountered situations is needing to store some JSON so that it can be accessed later across app sessions. The commonly used route is to stringify the JSON and store it in SharedPreferences.

Is this right or wrong?

There is no straight answer to this. We can debate forever. Let’s try to debate a bit.

Droid 1 : JSON is not a primitive data type. Hence it’s not a candidate to be stored in SharedPreferences.

Droid 2 : But what’s wrong with storing it in SharedPreferences?

Droid 1 : It may cause performance issues!

Droid 2 : What performance issues?

Droid 1 : Well, reading SharedPreferences can block the UI thread. If your JSON is heavy (large), the disk read and load may take some time. On top of that, converting the string back to JSON (a.k.a. parsing) often happens on the UI thread as well. This may lead to janky frames or, in the worst case, an ANR.

Droid 2 : Okayyyy. But what if my JSON is just a tiny one? Like only 2 KB?

Droid 1 : Well, how can you promise that your JSON will never cross 2 KB? Can you promise that it will stay that way forever? What’s the problem with storing the JSON in a NoSQL DB like Couch or Realm?

Droid 2 : Well, mine is a very small use case. I don’t want to add boatloads of boilerplate code or some extra library and increase the app size.

… Conversation continues

As you can see, the conversation could go on forever without reaching a conclusion. Software engineering is all about trade-offs, and we need to choose between them. Each situation demands its own analysis, and no single solution fits all use cases.

It is very important to come up with a data storage policy for any application. Deciding democratically on the APIs and mechanisms for storing each type of data helps the app stay smooth in the long run.

But whatever you do, don’t do expensive work on the UI thread. That will hurt your application in the long run. Small operations quickly add up, and you will end up spending considerable time dealing with them later.

Image Credit : Photo by Patrick Lindenberg on Unsplash
