Mobile app back-end database with large number of entries

Posted on 2013-01-12
Last Modified: 2014-11-12
I am looking to develop a mobile phone application that will need to send data to a back-end database every 30 seconds. The data is relatively simple and consists of a user ID, location information and a few simple numbers. If I base my estimates on having 100,000 users and the app being used 1 hour per day, that is potentially 12 million rows per day, and we need to keep the data for at least 2 years, meaning we have potentially 8.7 billion rows of data. The database is logically very simple, with a user table and a larger table holding the captured data. There may be other tables with relationships to the user table, but it will always be quite simple. We then need to perform normal SQL queries and also time-series analysis on it using standard and bespoke tools we will develop. I am very familiar with relational databases such as SQL Server, Oracle and MySQL, but was wondering if there are better solutions for this, since I assume other applications have already built similar back-ends for data storage.

Ideally since we are just starting up we want a free or low initial cost solution.

Thanks for any help

Question by:prices1964
LVL 25

Accepted Solution

Tomas Helgi Johannsson earned 450 total points
ID: 38771825

For data of this magnitude I strongly recommend that you use partitioning, both for maintenance and for queries: queries will take advantage of partition pruning, which gives you optimal performance on such large data.

Also consider using the InnoDB engine instead of MyISAM, as InnoDB gives you row-level locking, whereas MyISAM locks the entire table when executing queries.
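As a minimal sketch of both points (table and column names here are illustrative assumptions, not something from the question), a MySQL capture table on InnoDB, RANGE-partitioned by month on the capture timestamp, could look like this:

```sql
-- Hypothetical schema: names and column types are assumptions for illustration.
-- InnoDB gives row-level locking; RANGE partitioning on the capture timestamp
-- lets queries with a date predicate prune the partitions they cannot match.
CREATE TABLE capture_data (
    user_id      INT UNSIGNED NOT NULL,
    capture_time DATETIME     NOT NULL,
    latitude     DECIMAL(9,6),
    longitude    DECIMAL(9,6),
    reading_a    SMALLINT,
    PRIMARY KEY (user_id, capture_time)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(capture_time)) (
    PARTITION p201301 VALUES LESS THAN (TO_DAYS('2013-02-01')),
    PARTITION p201302 VALUES LESS THAN (TO_DAYS('2013-03-01')),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);
```

A query with a predicate like `capture_time >= '2013-01-01' AND capture_time < '2013-02-01'` then touches only the p201301 partition, and an expired month can be dropped with `ALTER TABLE capture_data DROP PARTITION p201301` instead of a huge DELETE. Note that MySQL requires every unique key (including the primary key) to contain the partitioning column, which is why `capture_time` is part of the primary key here.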

     Tomas Helgi
LVL 29

Expert Comment

ID: 38772359
You should consider another frequency for updates. Although data needs to be captured every 30 seconds on the mobile, you probably need a lower frequency for the transmission to the database. The burden on the database would then be lower, since you would pay a single SQL overhead for several records instead of one per record.

As an example: in an application I am working on, GPS data is captured by the vehicle-mounted box twice a minute but sent to the monitoring system only once every 3 minutes, so the overhead of data transmission, bandwidth consumption and SQL updates is incurred 6 times less often than if we transmitted every 30 seconds.
By doing so you may occasionally lose some data (e.g. if the mobile is out of network coverage at transmission time), so you need to assess the best interval for your needs.

Of course, this is possible only if the mobile can keep the data for the chosen interval (or even more).
- If you are writing a native app, this should not be a problem.
- If you are writing a web app in HTML5 flavor, you can use the local mobile database, except AFAIK on Windows Phone, since its IE browser does not handle this.
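To make the batching idea concrete, here is a sketch (table, columns and values are illustrative assumptions matching no particular schema) of six buffered 30-second samples flushed as a single multi-row INSERT every 3 minutes — one statement and one round trip instead of six:

```sql
-- One round trip for 3 minutes of data (6 samples at 30-second intervals).
-- Table name, columns and values are illustrative assumptions.
INSERT INTO capture_data (user_id, capture_time, latitude, longitude, reading_a)
VALUES
    (42, '2013-01-12 10:00:00', 51.507400, -0.127800, 7),
    (42, '2013-01-12 10:00:30', 51.507420, -0.127790, 7),
    (42, '2013-01-12 10:01:00', 51.507455, -0.127760, 8),
    (42, '2013-01-12 10:01:30', 51.507490, -0.127740, 8),
    (42, '2013-01-12 10:02:00', 51.507530, -0.127710, 8),
    (42, '2013-01-12 10:02:30', 51.507570, -0.127680, 9);
```

Besides fewer network round trips, a multi-row INSERT also amortizes statement parsing and index maintenance over the whole batch on the server side.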

Author Comment

ID: 38772430

Thanks for that suggestion, it makes a lot of sense.

In terms of then querying the data for statistical analysis, would a normal database such as SQL Server / Oracle be the best solution?
LVL 29

Expert Comment

ID: 38772473
For statistical analysis over big volumes of data, SQL is in some cases not the best solution: its normal forms avoid logical and physical redundancy (consequently keeping the data volume to its minimum), but "OLAP" analytics over huge numbers of records is most effective with hypercubes of redundant data (so that, e.g., a city name is available directly in the record rather than through a relation to an additional table).
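To illustrate the difference (the schema here is a made-up example, not the asker's): a normalized query must resolve the city name through a join for every one of billions of rows, while a denormalized analytics table carries the redundant name inline and aggregates without any join:

```sql
-- Normalized (OLTP) style: the city name lives in a separate table,
-- so the aggregation pays for a join across every fact row.
SELECT c.city_name, COUNT(*)
FROM   capture_data d
JOIN   cities c ON c.city_id = d.city_id
GROUP  BY c.city_name;

-- Denormalized (OLAP) style: city_name is stored redundantly in every
-- fact row, so the same aggregation needs no join at all.
SELECT city_name, COUNT(*)
FROM   capture_facts
GROUP  BY city_name;
```

The denormalized table costs more storage, which is exactly the redundancy the comment above describes trading for query speed.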

Note in addition that if you run SQL queries over huge volumes of data directly on your production database, the one that collects your permanent data, you might eat all of the CPU with your statistics, thus preventing the database from being updated (even if it is partitioned).

A good strategy might be to collect/record data in SQL (partitioned if needed), then back it up to some other place at a quiet time (e.g. at 1 am). The backup data can then be placed on another machine (remember, statistics and analytics are CPU intensive), where it will be queried in SQL if the performance is OK, or otherwise translated into a denormalized hypercube.

If you partition your SQL data, consider, among the different solutions, partitioning on date (e.g. create a new partition, or even a new database, every 24 hours, so that inserts work only on today's data, not on the last x days). In that case the conversion to denormalized data simply adds to the previous hypercubes.
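As a hedged sketch of that nightly step (all table and column names are assumptions for illustration): roll up yesterday's rows into a denormalized summary table on the analytics machine, so the production database never has to serve the statistical queries:

```sql
-- Run once per night (e.g. at 1 am) on the analytics machine.
-- capture_data and daily_user_summary are illustrative names.
INSERT INTO daily_user_summary (user_id, day, samples, avg_reading)
SELECT user_id, DATE(capture_time), COUNT(*), AVG(reading_a)
FROM   capture_data
WHERE  capture_time >= CURRENT_DATE - INTERVAL 1 DAY
  AND  capture_time <  CURRENT_DATE
GROUP  BY user_id, DATE(capture_time);
```

Because each night's run touches only yesterday's partition, the job stays the same size as the data grows, and each run simply appends to the previously accumulated aggregates.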

You have probably done the maths to compute the rate at which your data will be created and what the average additional daily volume will be. When this becomes large, remember that the time needed to add a record to N existing records is, at best, proportional to log(N), and very often worse than that...
