Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

mobile app back end database with large number of entires

Posted on 2013-01-12
4
Medium Priority
?
391 Views
Last Modified: 2014-11-12
I am looking to develop a mobile phone application that will need to send data to a back-end database every 30 seconds. The data is relatively simple and consist of a userid, location information and a few simple numbers. If I base my estimates on having 100,000 users and that the app is used 1 hour per day then that is potentially 12 million rows per day and we need to keep the data for at least 2 years meaning we have potentially 8.7 billion rows of data. The database logically is very simple with a user table and a larger table with the captured data in it. There may be other tables with relationships to the user table but it will always be quite simple We then need to perform normal SQL queries and also time series analysis on it using standard and bespoke tools we will develop. I am very familiar with relational databases such as SQl Server , Oracle and MySQl but was wondering if there are better solutions for this since I assume other applications have already built similar back-ends for data storage

Ideally since we are just starting up we want a free or low initial cost solution.

Thanks for any help

Steve
0
Comment
Question by:prices1964
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
4 Comments
 
LVL 25

Accepted Solution

by:
Tomas Helgi Johannsson earned 1350 total points
ID: 38771825
Hi!

For a such magnitude of data I strongly recommend that you use partitioning, both for maintenance and also for the queries as the queries will take advantage of the partition pruning which gives you optimal performance of such large data.
http://dev.mysql.com/doc/refman/5.5/en/partitioning.html

Also consider using InnoDB engine instead of MyISAM as you will get row-level locking in InnoDB vs locking on the entire table in MyISAM when executing queries.

Regards,
     Tomas Helgi
0
 
LVL 29

Expert Comment

by:fibo
ID: 38772359
You should consider another frequency for updates; although you need data to be captured twice a minute on the mobile, you probably need a lower frequency for the transmission to the database, and so the burden to the database would be lower, since you would have a single sql overhead for several records, instead of 1 for each record.

As an example: for an application I am working with, GPS data is captured by the vehicle-embarked box twice a minute, but sent to the monitoring system only once every 3 minutes: the overhead of data transmission / bandwidth consumption / sql update happens 6 times less than if we were capturing everys 30 seconds.
By doing so you may occasionally lose some data (eg if the mobile is out of network at transmission time): you need to assess the best interval for your needs.

Of course, this is possible only if the mobile can keep the data for the chosen interval (or even more).
- If you are writing a native app, this should not be a problem.
- If you are writing a web app in HTML5 flavor, you can use the local mobile database except AFAIK for windows phone, since its IE browser does not handle this.
0
 

Author Comment

by:prices1964
ID: 38772430
hi

thanks for that suggestion, it makes a lot of sense

in terms of then querying the data for statistical analysis would a normal database such as SQL server / oracle be the best solution ?
0
 
LVL 29

Expert Comment

by:fibo
ID: 38772473
For statistical analysis with big volumes of data, sql  in some cases is not the best solution: its normal forms avoid logical and physical redundancy (consequently keeping the data volume to its minimum), but "OLAP" analytics processes when dealing with huge numbers of records are most effective with hypercubes of redundant data (so that eg, a city name is available directly in the record rather than thru a relation with an additional table).

Note in addition that if you sql-query huge volumes of data directly on your production database, the one that collects your permanent data, you might eat all of the CPU for your statistics, thus preventing the database to be updated (even if it is partitioned)

A good strategy might be to collect / record data in sql (partitioned if needed), then to backup it to some other place (at a peaceful time, eg, at 1am). The backup data can then be placed on another machine (remember, statistics and analytics are cpu intensive) where it will be used in sql if the performance is OK, otherwise translated into a denormalized hypercube.

If you partition your sql data, consider amongst the different solution the partition on date (eg, create a new database every 24 hours so that new data work only on the data of today, not on the last x days). In that case conversion to denormalized data will simply add to the previous hypercubes.

You have probably done the maths to compute the rate at which your data will be created, and what will be the average additional daily volume. When this becomes large, remember that the time needed to add records to N existing records is, at best, proportional to Log (N)  and very often worse than that...
0

Featured Post

 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this series, we will discuss common questions received as a database Solutions Engineer at Percona. In this role, we speak with a wide array of MySQL and MongoDB users responsible for both extremely large and complex environments to smaller singl…
This article shows how to get a list of available printers for display in a drop-down list, and then to use the selected printer to print an Access report or a Word document filled with Access data, using different syntax as needed for working with …
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Director of Solution Engineering Jon Tobin discusses the function and features of Percona Server for MongoDB. How Percona can help Percona can help you determine if Percona Server for MongoDB is the right solution for …
Suggested Courses

610 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question