Consolidating MySQL data into a new table or?

Posted on 2014-10-23
Last Modified: 2014-11-18
In an application I have, we collect ping data from various networks to keep track of any problems.
We basically need an ongoing history of how circuits are doing.
The servers are sending ping results every 5 minutes and since there are many servers, that ping data is adding up and becomes somewhat useless after a while.

I currently have 73687 rows of data in my pings table and much of that data is now useless and that is the point of this question.

I order to keep an ongoing history which has some value, I am trying to figure out what I should be keeping and what I should be deleting. In terms of keeping, I think that I need to run a script which consolidates the data on a regular basis.

Here is an example of what the data looks like;

	id 	time 	client_id 	loss 	min 	max 	avg 	outage
	1 	2014-09-11 00:18:14 	9 	0 	25.081 	73.032 	37.190 	
	2 	2014-09-11 00:18:45 	41 	0 	24.113 	30.712 	26.753 	
	3 	2014-09-11 00:18:58 	162 	0 	21.935 	22.399 	22.112 	
	4 	2014-09-11 00:19:10 	157 	0 	99.395 	101.644 	100.854 	
	5 	2014-09-11 00:19:13 	8 	0 	21.843 	105.745 	45.488 	
	6 	2014-09-11 00:19:23 	9 	0 	24.918 	25.789 	25.234 	
	8 	2014-09-11 00:19:52 	41 	0 	21.632 	34.260 	26.638 	
	9 	2014-09-11 00:20:06 	162 	0 	22.106 	23.326 	22.835 	
	10 	2014-09-11 00:20:21 	8 	0 	23.290 	79.051 	38.768 	
	11 	2014-09-11 00:20:28 	157 	0 	99.585 	102.277 	100.737 	
	12 	2014-09-11 00:20:32 	9 	0 	24.993 	25.453 	25.193 	

Open in new window

The clientid simply means which server sent the data.

This is what the ongoing graph looks like;

The above graph is simply to show a range of time but beyond that, I would like to consolidate this somehow and in some meaningful way.
There is another section however for long term ongoing averages which shows min/max/avg and this is where I would like to display stats in some meaningful manner.

For example, when the user clicks on ping averages, I would like to show averages for the entire term, for the last year and the last month. This is where I would remove old data from the current table and create new data in a new table specifically for ongoing historical data.

This is where I seek help from anyone who has done this sort of thing before. I am not sure at all how I would consolidate this as useful data.
Question by:projects
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 3

Accepted Solution

Stampel earned 250 total points
ID: 40401189
You could split / archive your data in archive_ping_table, to get a reasonnable active history in the main table and have always fast queries response on it.
Then you could have a stored procedure that would move the data from active table to archive table on a regular basis.
This way when you need full history you can query both tables using union in your query
LVL 25

Assisted Solution

by:Tomas Helgi Johannsson
Tomas Helgi Johannsson earned 250 total points
ID: 40402198

Have you considered to partitioning the table(s).
That could help you in maintaining the data and roll out old data when no longer needed.
Partition pruning would then help in your queries shortening the execution time as the query only reads the partitions that meet it's predicates and not the whole table(s).

     Tomas Helgi

Author Comment

ID: 40402476
Yes, one of the problems I am facing at the moment is server load. Ping data is being saved every 5 minutes. So if I show a default of 48hrs worth of pings for each graph/person, that ends up being quite the Mysql read.

It would be much more efficient to only keep what ever amount of data I need in the raw table at any given time. So if my default view is set to 48hrs, then only have 48hrs worth of ping data for any given clientid in the table, the rest removed/consolidated.

The stored procedure you mention, is this something the mysql server would do on it's own or via an externally run task such as a cron job?

Yes, the server does support this function so this could be used.

I basically just don't want to waste the data since we've collected it but it's worthless as raw data beyond the 'graph' for a quick view of a short range of time. After that, it would have more value if it could be consolidated, taking up way less rows as well.

The thing is, part of my question is how to deal with such data, in order to make it useful, historical. Solving the problem with this table will help me to find a solution for other tables I am having a similar issue with.
Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!


Expert Comment

ID: 40403181
Yes sure, that is exactly what i mean : keep only minimu data you need in the active table, and archive the rest that you would query only in rare case (when you need more history).

You can call a MySQL stored procedure from a MySQL trigger. But you can also launch a cron job to call the stored proc on a regular time basis.

Author Comment

ID: 40403276
Is this something I can do on my own right now by following some simple steps or do I need a DBA?
I am often able to get things done with enough input.

Expert Comment

ID: 40403872
Depending on your will, you probably can sort this alone by yourself.
I will do my best to help you.
First you need to create an archive table. For that i would use the following :
Create table ping_archive as (select * from ping);

Then try to create your first stored procedure to understand the basic things with this URL

Also have a look at triggers (you may use scheduled event trigger)
Let me know what you need help with

Author Comment

ID: 40404939
I'll work on this tomorrow and post as I do.

Featured Post

Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Background Still having to process all these year-end "csv" files received from all these sources (including Government entities), sometimes we have the need to examine the contents due to data error, etc... As a "Unix" shop, our only readily …
When table data gets too large to manage or queries take too long to execute the solution is often to buy bigger hardware or assign more CPUs and memory resources to the machine to solve the problem. However, the best, cheapest and most effective so…
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
In a recent question ( here at Experts Exchange, a member asked how to run an AutoHotkey script (.AHK) directly from Notepad++ (aka NPP). This video…

617 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question