Solved

Mobile app back-end database with a large number of entries

Posted on 2013-01-12
327 Views
Last Modified: 2014-11-12
I am looking to develop a mobile phone application that will need to send data to a back-end database every 30 seconds. The data is relatively simple and consists of a user ID, location information and a few simple numbers. If I base my estimates on having 100,000 users and the app being used 1 hour per day, that is potentially 12 million rows per day, and we need to keep the data for at least 2 years, meaning we have potentially 8.7 billion rows of data.

The database is logically very simple, with a user table and a larger table holding the captured data. There may be other tables with relationships to the user table, but it will always be quite simple. We then need to perform normal SQL queries on it and also time-series analysis, using standard tools and bespoke tools we will develop.

I am very familiar with relational databases such as SQL Server, Oracle and MySQL, but I was wondering whether there are better solutions for this, since I assume other applications have already built similar back ends for data storage.
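Roughly, I am imagining something like this (MySQL syntax just as an example; table and column names are purely illustrative, not a final design):

    CREATE TABLE app_user (
        user_id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
        created_at  DATETIME     NOT NULL,
        PRIMARY KEY (user_id)
    );

    CREATE TABLE location_reading (
        user_id     INT UNSIGNED NOT NULL,
        captured_at DATETIME     NOT NULL,
        latitude    DECIMAL(9,6) NOT NULL,
        longitude   DECIMAL(9,6) NOT NULL,
        value1      INT          NULL,   -- the "few simple numbers"
        value2      INT          NULL,
        PRIMARY KEY (user_id, captured_at)
    );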

Ideally, since we are just starting up, we want a free or low-initial-cost solution.

Thanks for any help

Steve
Question by:prices1964
4 Comments
 
LVL 24

Accepted Solution

by:
Tomas Helgi Johannsson earned 450 total points
ID: 38771825
Hi!

For data of this magnitude I strongly recommend that you use partitioning, both for maintenance and for queries: queries will take advantage of partition pruning, which gives you optimal performance on such a large data set.
http://dev.mysql.com/doc/refman/5.5/en/partitioning.html

Also consider using the InnoDB engine instead of MyISAM, as InnoDB gives you row-level locking whereas MyISAM locks the entire table when executing queries.
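As a rough sketch (assuming a DATETIME column such as captured_at to partition on; table, column and partition names are only illustrative, not your actual schema, and other columns are omitted), monthly range partitioning could look like this:

    CREATE TABLE location_reading (
        user_id     INT UNSIGNED NOT NULL,
        captured_at DATETIME     NOT NULL,
        latitude    DECIMAL(9,6) NOT NULL,
        longitude   DECIMAL(9,6) NOT NULL,
        PRIMARY KEY (user_id, captured_at)   -- the partitioning column must be part of every unique key
    ) ENGINE=InnoDB
    PARTITION BY RANGE (TO_DAYS(captured_at)) (
        PARTITION p201301 VALUES LESS THAN (TO_DAYS('2013-02-01')),
        PARTITION p201302 VALUES LESS THAN (TO_DAYS('2013-03-01')),
        PARTITION p201303 VALUES LESS THAN (TO_DAYS('2013-04-01')),
        PARTITION pmax    VALUES LESS THAN MAXVALUE
    );

    -- A query constrained to one month only scans that month's partition (partition pruning)
    SELECT COUNT(*)
    FROM   location_reading
    WHERE  captured_at >= '2013-02-01'
      AND  captured_at <  '2013-03-01';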

Regards,
     Tomas Helgi
 
LVL 29

Expert Comment

by:fibo
ID: 38772359
You should consider a different frequency for updates: although you need data to be captured twice a minute on the mobile, you probably need a lower frequency for transmission to the database. The load on the database would then be lower, since you would pay a single SQL overhead for several records instead of one per record (see the sketch at the end of this comment).

As an example: for an application I am working with, GPS data is captured by the box on board the vehicle twice a minute, but it is sent to the monitoring system only once every 3 minutes, so the overhead of data transmission, bandwidth consumption and SQL updates is incurred 6 times less often than if we transmitted every 30 seconds.
By doing so you may occasionally lose some data (e.g. if the mobile is out of network coverage at transmission time): you need to assess the best interval for your needs.

Of course, this is possible only if the mobile can keep the data for the chosen interval (or even more).
- If you are writing a native app, this should not be a problem.
- If you are writing a web app in HTML5 flavor, you can use the local database on the mobile, except, AFAIK, on Windows Phone, since its IE browser does not handle this.
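To illustrate the database-side saving (a minimal sketch using the illustrative reading table from the question; the values are made up), six 30-second readings buffered on the phone go in as a single multi-row INSERT instead of six separate statements, so there is one round trip, one parse and one commit:

    INSERT INTO location_reading (user_id, captured_at, latitude, longitude)
    VALUES
        (42, '2013-01-12 10:00:00', 51.507351, -0.127758),
        (42, '2013-01-12 10:00:30', 51.507362, -0.127761),
        (42, '2013-01-12 10:01:00', 51.507374, -0.127765),
        (42, '2013-01-12 10:01:30', 51.507385, -0.127769),
        (42, '2013-01-12 10:02:00', 51.507396, -0.127772),
        (42, '2013-01-12 10:02:30', 51.507407, -0.127776);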
 

Author Comment

by:prices1964
ID: 38772430
Hi,

Thanks for that suggestion, it makes a lot of sense.

In terms of then querying the data for statistical analysis, would a normal database such as SQL Server / Oracle be the best solution?
 
LVL 29

Expert Comment

by:fibo
ID: 38772473
For statistical analysis on big volumes of data, SQL is in some cases not the best solution: its normal forms avoid logical and physical redundancy (thereby keeping the data volume to a minimum), but "OLAP" analytic processing over huge numbers of records is most effective with hypercubes of redundant data (so that, e.g., a city name is available directly in the record rather than through a relation to an additional table).

Note in addition that if you run SQL queries over huge volumes of data directly on your production database, the one that collects your permanent data, you might eat all of the CPU for your statistics and thus prevent the database from being updated (even if it is partitioned).

A good strategy might be to collect and record data in SQL (partitioned if needed), then back it up to some other place at a quiet time (e.g. at 1 am). The backed-up data can then be placed on another machine (remember, statistics and analytics are CPU-intensive), where it is queried in SQL if the performance is OK, or otherwise translated into a denormalized hypercube.
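As a rough sketch of that idea (it assumes a hypothetical city dimension linked to the user table; all table and column names are illustrative), a nightly job could roll yesterday's normalized rows into a redundant summary table that the analytics side then queries without any joins:

    CREATE TABLE daily_user_summary (
        summary_date  DATE         NOT NULL,
        user_id       INT UNSIGNED NOT NULL,
        city_name     VARCHAR(64)  NOT NULL,   -- redundant copy, so reports need no join
        reading_count INT UNSIGNED NOT NULL,
        PRIMARY KEY (summary_date, user_id)
    ) ENGINE=InnoDB;

    INSERT INTO daily_user_summary (summary_date, user_id, city_name, reading_count)
    SELECT DATE(r.captured_at), r.user_id, c.city_name, COUNT(*)
    FROM   location_reading r
    JOIN   app_user u ON u.user_id = r.user_id
    JOIN   city     c ON c.city_id = u.city_id   -- hypothetical dimension table
    WHERE  r.captured_at >= CURDATE() - INTERVAL 1 DAY
      AND  r.captured_at <  CURDATE()
    GROUP BY DATE(r.captured_at), r.user_id, c.city_name;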

If you partition your SQL data, consider, among the different solutions, partitioning on date (e.g. create a new partition or database every 24 hours, so that new writes only work on today's data, not on the last x days). In that case, conversion to denormalized data simply adds to the previous hypercubes.
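If you go the MySQL-partitioning route suggested above, that rolling window becomes plain partition maintenance (a sketch against the illustrative partitioned table from the earlier comment; partition names are made up): add the next period ahead of time and drop whatever has aged out of the 2-year retention, which is far cheaper than a huge DELETE.

    -- Carve next month's partition out of the catch-all MAXVALUE partition
    ALTER TABLE location_reading
        REORGANIZE PARTITION pmax INTO (
            PARTITION p201304 VALUES LESS THAN (TO_DAYS('2013-05-01')),
            PARTITION pmax    VALUES LESS THAN MAXVALUE
        );

    -- Retire a (hypothetical) month partition older than the 2-year retention window
    ALTER TABLE location_reading DROP PARTITION p201101;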

You have probably done the maths to compute the rate at which your data will be created and the average additional daily volume. When this becomes large, remember that the time needed to add a record to N existing records is, at best, proportional to log(N), and very often worse than that...
