mobile app back end database with large number of entires

Posted on 2013-01-12
Last Modified: 2014-11-12
I am looking to develop a mobile phone application that will need to send data to a back-end database every 30 seconds. The data is relatively simple and consist of a userid, location information and a few simple numbers. If I base my estimates on having 100,000 users and that the app is used 1 hour per day then that is potentially 12 million rows per day and we need to keep the data for at least 2 years meaning we have potentially 8.7 billion rows of data. The database logically is very simple with a user table and a larger table with the captured data in it. There may be other tables with relationships to the user table but it will always be quite simple We then need to perform normal SQL queries and also time series analysis on it using standard and bespoke tools we will develop. I am very familiar with relational databases such as SQl Server , Oracle and MySQl but was wondering if there are better solutions for this since I assume other applications have already built similar back-ends for data storage

Ideally since we are just starting up we want a free or low initial cost solution.

Thanks for any help

Question by:prices1964
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
LVL 25

Accepted Solution

Tomas Helgi Johannsson earned 450 total points
ID: 38771825

For a such magnitude of data I strongly recommend that you use partitioning, both for maintenance and also for the queries as the queries will take advantage of the partition pruning which gives you optimal performance of such large data.

Also consider using InnoDB engine instead of MyISAM as you will get row-level locking in InnoDB vs locking on the entire table in MyISAM when executing queries.

     Tomas Helgi
LVL 29

Expert Comment

ID: 38772359
You should consider another frequency for updates; although you need data to be captured twice a minute on the mobile, you probably need a lower frequency for the transmission to the database, and so the burden to the database would be lower, since you would have a single sql overhead for several records, instead of 1 for each record.

As an example: for an application I am working with, GPS data is captured by the vehicle-embarked box twice a minute, but sent to the monitoring system only once every 3 minutes: the overhead of data transmission / bandwidth consumption / sql update happens 6 times less than if we were capturing everys 30 seconds.
By doing so you may occasionally lose some data (eg if the mobile is out of network at transmission time): you need to assess the best interval for your needs.

Of course, this is possible only if the mobile can keep the data for the chosen interval (or even more).
- If you are writing a native app, this should not be a problem.
- If you are writing a web app in HTML5 flavor, you can use the local mobile database except AFAIK for windows phone, since its IE browser does not handle this.

Author Comment

ID: 38772430

thanks for that suggestion, it makes a lot of sense

in terms of then querying the data for statistical analysis would a normal database such as SQL server / oracle be the best solution ?
LVL 29

Expert Comment

ID: 38772473
For statistical analysis with big volumes of data, sql  in some cases is not the best solution: its normal forms avoid logical and physical redundancy (consequently keeping the data volume to its minimum), but "OLAP" analytics processes when dealing with huge numbers of records are most effective with hypercubes of redundant data (so that eg, a city name is available directly in the record rather than thru a relation with an additional table).

Note in addition that if you sql-query huge volumes of data directly on your production database, the one that collects your permanent data, you might eat all of the CPU for your statistics, thus preventing the database to be updated (even if it is partitioned)

A good strategy might be to collect / record data in sql (partitioned if needed), then to backup it to some other place (at a peaceful time, eg, at 1am). The backup data can then be placed on another machine (remember, statistics and analytics are cpu intensive) where it will be used in sql if the performance is OK, otherwise translated into a denormalized hypercube.

If you partition your sql data, consider amongst the different solution the partition on date (eg, create a new database every 24 hours so that new data work only on the data of today, not on the last x days). In that case conversion to denormalized data will simply add to the previous hypercubes.

You have probably done the maths to compute the rate at which your data will be created, and what will be the average additional daily volume. When this becomes large, remember that the time needed to add records to N existing records is, at best, proportional to Log (N)  and very often worse than that...

Featured Post

U.S. Department of Agriculture and Acronis Access

With the new era of mobile computing, smartphones and tablets, wireless communications and cloud services, the USDA sought to take advantage of a mobilized workforce and the blurring lines between personal and corporate computing resources.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A company’s centralized system that manages user data, security, and distributed resources is often a focus of criminal attention. Active Directory (AD) is no exception. In truth, it’s even more likely to be targeted due to the number of companies …
This article shows the steps required to install WordPress on Azure. Web Apps, Mobile Apps, API Apps, or Functions, in Azure all these run in an App Service plan. WordPress is no exception and requires an App Service Plan and Database to install
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

730 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question