API and command line help

We have an API that receives around 10,000 hits per minute from mid-October until the end of January. We process each hit with a third-party tool and save it in the DB.
Last year we were unable to process all the records.

Probable reasons:
1> The huge number of hits.
2> Before storing the API details we run the data through the third-party tool, which is too slow.

Our problem

1> We can't do anything about the tool, nor can we look for a substitute for this third-party tool.
2> We can't use a load balancer, for certain reasons.

The solution we thought of

Monitor the system periodically: check how much data has been received in each 15-minute window, and find the minimum and maximum time taken to process 100,000 records.

But this would get slow once the table holds more than 10,000,000 records, and fetching data every 15 minutes would surely affect the system.

An example request:

http://localhost/index.php?mobile=9012093492&custid=3&codeid=8&text=PLAY+GAMES


CREATE TABLE mobile_record (
  id int NOT NULL auto_increment,
  mobile varchar(20),    -- mobile number from the request
  custid int,            -- customer id from the request
  codeid int,            -- code id from the request
  text varchar(255),     -- message text from the request
  token varchar(60),     -- token returned by the third-party tool
  created_at DATETIME,
  PRIMARY KEY(id)
);
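
For illustration, the example request above maps onto the table roughly like the sketch below. This is only a sketch: the mysqli credentials are placeholders, input validation is omitted, and the token column is left empty until the third-party tool responds.

<?php
// index.php - record an incoming hit (placeholder credentials).
$db = new mysqli('localhost', 'user', 'password', 'mydb');

// Map the query-string parameters onto the mobile_record table.
// token stays NULL here; it is filled in once the third-party tool answers.
$stmt = $db->prepare(
    "INSERT INTO mobile_record (mobile, custid, codeid, text, created_at)
     VALUES (?, ?, ?, ?, NOW())"
);
$stmt->bind_param('siis', $_GET['mobile'], $_GET['custid'],
                  $_GET['codeid'], $_GET['text']);
$stmt->execute();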

Is there some good method to resolve this issue? One that would process and store the hits in the DB, plus a command-line tool that would give me the list of records received and stored in the DB, and another tool that would give me the list of records that were not saved in the DB.

Tags: PHP, MySQL
OS: Ubuntu, CentOS
Insoftservice asked:

Ray Paseur commented:
I don't understand why you would not try to use a load balancer - it's the most obvious solution.  Without knowing any more specifics, such as the third party tool, here is something to consider.  You might try taking this process apart into two or more component applications.

The first part, which must be done in real time, is to receive and record the requests that are coming to your site.  This should be feasible - it's under 200 hits per second.  The main part of this is to write the request data into a database table that you will use to log the requests.

The second, and perhaps 3rd, 4th, or Nth parts of the application will be the processing of the requests.  In this part of the application, you'll query the log of requests, looking for requests that have not been processed yet.  As each one is acquired, call the third party tool (or do whatever is needed to satisfy the request) and when you've gotten a response, mark that entry in the log "complete."

This design _may_ allow you to receive the requests at a higher rate than if you try to complete the processing of each one as it is received.
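
A sketch of what the receive/process split could look like, assuming a status column (values such as 'pending' and 'complete') is added to the log table, and with call_third_party() as a hypothetical stand-in for the real tool's interface:

<?php
// CLI worker - run one or more of these in the background.
// ASSUMES: a `status` column on mobile_record, and call_third_party()
// as a hypothetical wrapper around the third-party tool.
$db = new mysqli('localhost', 'user', 'password', 'mydb');  // placeholders

while (true) {
    $rows = $db->query("SELECT id, mobile, custid, codeid, text
                          FROM mobile_record
                         WHERE status = 'pending' LIMIT 100");
    if ($rows->num_rows === 0) {
        sleep(5);           // nothing pending; wait for new requests
        continue;
    }
    while ($row = $rows->fetch_assoc()) {
        $token = call_third_party($row);    // hypothetical wrapper
        $stmt  = $db->prepare("UPDATE mobile_record
                                  SET token = ?, status = 'complete'
                                WHERE id = ?");
        $stmt->bind_param('si', $token, $row['id']);
        $stmt->execute();
    }
}

With more than one worker you would also need to claim rows first (for example an UPDATE setting status = 'processing') so that two workers never process the same request.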

Insoftservice (Author) commented:
So you mean: store the API data in the DB and log the information, then send it to the third-party tool for processing and log its success or failure, and later scan the log to get the output of failures and successes?
The third-party tool processes this information on another server, whose response is also stored in the DB in the token field.
Gerwin Jansen, EE MVE, Topic Advisor, commented:
The issue you have here is a very common one: you have a bottleneck in your system. All systems have bottlenecks, and one of the ways to design a system is to start with the bottlenecks. In your case the bottleneck is an external system that you now suffer from. The external system is most likely designed to handle x requests on average over a year, meaning x/12 requests per month. If you have 3 months of triple the usual load, your external system is not able to handle it, simply because it was not designed for that peak load.

What you want to achieve (and what Ray above is describing) is to lower the average load on the external system so it doesn't run into its (design) limits. Storing data (requests) temporarily is one of the ways to do that; the important thing here is that your temporary storage solution must be capable of storing 3x the monthly average load.

A dependency that you may run into is that the external system is also used for lookup or reporting purposes and only allows x hours of delay. If that is not an issue, you can go ahead and create the intermediate storage. If the delay is an issue, then there is nothing else you can do but determine that the system has grown beyond its (external) design limits and that you cannot support it anymore. The external system must then be upgraded, improved or managed (= accept delay).
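
To put rough numbers on that reasoning: the question states about 10,000 hits per minute at peak, but the third-party tool's real throughput is unknown, so the 100/second figure below is purely an assumption used to illustrate how the backlog grows.

<?php
// Back-of-envelope backlog arithmetic; only the incoming rate comes from
// the question, the processing rate is an ASSUMPTION and must be measured.
$incoming_per_sec  = 10000 / 60;   // ~167 hits/second at peak
$processed_per_sec = 100;          // assumed third-party throughput

$growth_per_hour = ($incoming_per_sec - $processed_per_sec) * 3600;

printf("Backlog grows by ~%d records per peak hour.\n", $growth_per_hour);
printf("Each peak hour needs ~%.2f quiet hours to drain.\n",
       $growth_per_hour / ($processed_per_sec * 3600));

If the measured gap never closes outside the peak window, the buffer only postpones the failure; that is the "grown beyond design limits" case described above.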
Insoftservice (Author) commented:
We can't delay the output.
Is there any way to handle the query on such a huge DB?
We can be asked for a report of any month. Would a partition range help with this issue?
Gerwin Jansen, EE MVE, Topic Advisor, commented:
What query? Partition range? I don't understand.

Maybe investing in faster DB hardware is an option?

Without any details we can hardly offer assistance here.
Insoftservice (Author) commented:
DB hardware, what's that? Sorry, I have no idea.

Partition range is a MySQL method for splitting a table into ranges over a column's values.
Gerwin Jansen, EE MVE, Topic Advisor, commented:
Can you explain what you mean by that query again?

DB hardware is database hardware; better/faster hardware means more performance. Nothing beats iron...
Insoftservice (Author) commented:
It's a monitoring system that checks whether the data received via the API is stored in the DB or not.


"To monitor the system after periodic time and check how much data has been received within 15 min of time and to know
what is the min and max time that has taken to process 1,00,000 records.

But it would get slow as record in db gets more 1,00,00,000 records and fetching data after every 15 min would surely affect the system"
Bernard S., CTO, commented:
You should really consider Ray's suggestion.
I would probably log all the incoming hits not in a MySQL table but in some NoSQL store such as Redis or others, and pull data for processing from there.
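
For example, with the phpredis extension, incoming hits could be pushed onto a Redis list by the web endpoint and popped by a command-line worker at whatever pace the third-party tool allows. A sketch; the key name and timeout are arbitrary choices.

<?php
// Producer (web endpoint): enqueue the raw request.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
$redis->lPush('incoming_hits', json_encode($_GET));

// Consumer (CLI worker): block until a hit arrives, then process it.
while (true) {
    $item = $redis->brPop(['incoming_hits'], 30);  // wait up to 30 s
    if (empty($item)) {
        continue;                   // timed out, poll again
    }
    $hit = json_decode($item[1], true);  // [0] = key, [1] = payload
    // ... call the third-party tool, then record the result in MySQL ...
}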
Shalom Carmel, CTO, commented:
Gerwin Jansen wrote an excellent answer explaining the constraints you have; let me rephrase it in other words.
You have 2 problems.
1. Your external 3rd party system has a certain processing speed, and you cannot exceed that speed.
2. Your internal buffer, currently deployed in a database table, becomes unmanageable as it grows too large.

We can only partly solve the 1st problem. You should have a buffer to enable different rates of data flow, and you do have it in the database. However, we cannot make your 3rd party service go any faster. If you tell us what it does, maybe we can suggest substitutes, even though you say that this is impossible.

For the 2nd problem, instead of using a database table to store the events for processing, use alternatives.
* a NoSQL solution - MongoDB looks like a good fit, as it is a document database.
NoSQL databases have a much larger capacity than a MySQL database and better performance for very large data sets.

* a good queue system like RabbitMQ or WebSphere MQ (see the sketch below)
A queue has the advantage that you do not have to manage the list of unprocessed events, as you read the queue at the rate imposed by your 3rd party. It has the disadvantage of not exposing the contents of unread messages, so you can't query for a list of unprocessed events.
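
As an illustration of the queue option, here is a minimal producer/consumer pair using the php-amqplib client for RabbitMQ. A sketch only: the queue name, credentials and the third-party call are placeholders.

<?php
require_once __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();
$channel->queue_declare('hits', false, true, false, false);  // durable queue

// Producer side (web endpoint): publish each hit as a persistent message.
$msg = new AMQPMessage(json_encode($_GET),
    array('delivery_mode' => AMQPMessage::DELIVERY_MODE_PERSISTENT));
$channel->basic_publish($msg, '', 'hits');

// Consumer side (CLI worker): one message at a time, ack after processing.
$channel->basic_qos(null, 1, null);
$channel->basic_consume('hits', '', false, false, false, false, function ($msg) {
    // ... call the third-party tool here ...
    $msg->delivery_info['channel']->basic_ack($msg->delivery_info['delivery_tag']);
});
while (count($channel->callbacks)) {
    $channel->wait();
}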
Insoftservice (Author) commented:
Changing the DB structure would be a real pain, as this system's DB is used by everyone, and every process would have to be changed.
Does anyone have an idea about partition ranges and partition pruning, and whether they can resolve this issue?
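
For reference, range partitioning of this table could look like the following. These monthly ranges are hypothetical, and note that MySQL requires the partitioning column to be part of every unique key, so the primary key has to change as well - which is itself the kind of schema change being discussed.

-- The partitioning column must appear in every unique key:
ALTER TABLE mobile_record DROP PRIMARY KEY, ADD PRIMARY KEY (id, created_at);

-- Hypothetical monthly ranges; partition pruning lets a one-month report
-- scan a single partition instead of the whole table.
ALTER TABLE mobile_record PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION p2015_10 VALUES LESS THAN (TO_DAYS('2015-11-01')),
  PARTITION p2015_11 VALUES LESS THAN (TO_DAYS('2015-12-01')),
  PARTITION p2015_12 VALUES LESS THAN (TO_DAYS('2016-01-01')),
  PARTITION pmax     VALUES LESS THAN MAXVALUE
);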
Gerwin Jansen, EE MVE, Topic Advisor, commented:
You are pretty set on not changing anything in the database, yet you want to solve your issue by creating partitions. That is also a database change. How about you show us your monitoring query so we can try to help you with that. Or some details of the database, such as which columns have indexes. "You can't make an omelette without breaking eggs."
Insoftservice (Author) commented:
The DB schema is already there.

// Count the hits received in the last 15 minutes.
$t15m_ago = new DateTime("15 minutes ago");
$s = $t15m_ago->format("Y-m-d H:i:s");
$result = mysql_query("SELECT COUNT(*) FROM mo WHERE created_at > '$s'");
$response['last_15_min_mo_count'] = current(mysql_fetch_row($result));

// Time span of the most recent 10,000 rows. The aggregate must run over a
// subquery; MIN()/MAX() on the outer query would ignore the ORDER BY/LIMIT.
$result = mysql_query("SELECT MIN(created_at), MAX(created_at) FROM
    (SELECT created_at FROM mo ORDER BY id DESC LIMIT 10000) AS last10k");
$response['time_span_last_10k'] = mysql_fetch_row($result);

I hope Redis would help in this case. Can anyone provide a PHP script that connects to Redis and inserts data into and fetches data from the Redis server?

[root@localhost ~]# netstat -tulpn | grep :6379
tcp        0      0 127.0.0.1:6379              0.0.0.0:*                   LISTEN      2508/redis-server



I tried the code below but it's not working:
<?php
   // Connect to the Redis server on localhost.
   $redis = new Redis();
   $redis->connect('127.0.0.1', 6379);
   echo "Connection to server successful";
   // Set the data in a Redis string.
   $redis->set("tutorial-name", "Redis tutorial");
   // Get the stored data and print it. The original line used
   // `+ jedis.get(...)`, a Java idiom; in PHP, concatenate with `.`
   // and read the value back via $redis->get().
   echo "Stored string in redis: " . $redis->get("tutorial-name");
?>
Gerwin Jansen, EE MVE, Topic Advisor, commented:
Instead of looking at the date, maybe you can look at the ID field, assuming that field is based on increasing numbers. Get one record that is 15 minutes old and get the most recent record; subtract the two IDs to get the number of records for those 15 minutes. This would save you the COUNT(*) as well as most of the date comparison work. Given that no database changes are allowed, no modification of external programs, no investment, etc., consider this my final suggestion.
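
One way to read that suggestion, assuming id is the auto-increment column of the mo table from the snippet above: one lookup finds the newest row that is at least 15 minutes old, MAX(id) finds the newest row overall, and the difference approximates the rows inserted in the window.

<?php
// Approximate rows received in the last 15 minutes via ID arithmetic;
// assumes `id` increases monotonically (gaps from rolled-back inserts
// make this an estimate, which may be fine for monitoring).
$db = new mysqli('localhost', 'user', 'password', 'mydb');  // placeholders

$old = $db->query("SELECT id FROM mo
                    WHERE created_at <= NOW() - INTERVAL 15 MINUTE
                    ORDER BY id DESC LIMIT 1")->fetch_row();
$new = $db->query("SELECT MAX(id) FROM mo")->fetch_row();

$count_last_15_min = $new[0] - $old[0];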
Gerwin Jansen, EE MVE, Topic Advisor, commented:
I would suggest deleting this question, despite the fact that several experts, including myself, have put effort into this one.

Sadly the asker cannot, will not, or is not allowed to do anything with our suggestions, which leaves this question unresolved.
Insoftservice (Author) commented:
I hope in my previous comment I specified that I am ready to move to Redis, as suggested by @fibo.
The DB schema you asked for was already provided in my question. I have implemented Ray's suggestion from my previous question, but I am stuck on the second part. Currently I have moved to both MySQL and Redis. I can't leave MySQL, as the system is already built, and changing the DB for a small task is simply not acceptable when the DB is already used in other modules of the project; hence I was fixed on my DB and was asking for help to do it in MySQL itself.
Insoftservice (Author) commented:
Only the first part has been resolved, to a certain extent. The second part still remains.
Ray Paseur commented:
Recommend deleting the question - theoretical, opinionated, lacking in specifics, nothing to learn here.
Insoftservice (Author) commented:
Have created a new question for Redis + PHP as suggested by @SouthMod:

http://www.experts-exchange.com/questions/28789579/Redis-php-help.html