erzoolander
asked on
MySQL Optimization/Options
Hi -
We have a tracking system on our website that's begun to weigh on performance. Essentially - when someone hits a certain type of page - there's a sequence that:
1. Checks the IP address of the person hitting the page
2. Checks to see if that IP address has hit the page before
3. If the IP address is new/unique - logs it into the database for that page
4. If the IP address is not unique - increments a counter for that IP address
The system works - but we're now at about 750k records and growing...and it's slowing down (obviously).
I know enough about SQL to make that process work - but need some guidance on how to optimize it. How exactly do you deal with large recordsets like that without a degradation in performance? Any suggestions/solutions that have worked for you?
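For reference, the four-step sequence above boils down to something like this - the `hits` table and its columns here are just placeholders, not the asker's actual schema:

```sql
-- Hypothetical tracking table: one row per (ip, page)
-- CREATE TABLE hits (ip VARCHAR(15), page_id INT, hit_count INT);

-- Step 2: has this IP hit this page before?
SELECT hit_count FROM hits
 WHERE ip = '203.0.113.7' AND page_id = 42;

-- Step 3: new IP for this page -> log it
INSERT INTO hits (ip, page_id, hit_count)
VALUES ('203.0.113.7', 42, 1);

-- Step 4: known IP -> bump the counter
UPDATE hits SET hit_count = hit_count + 1
 WHERE ip = '203.0.113.7' AND page_id = 42;
```

Without an index on (ip, page_id), steps 2 and 4 each scan the table, which is where a 750k-row table starts to hurt.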
don't forget to index the combo-column in step 2 ;)
Another optimization idea: break the IP into its distinct octets:
When you store the IPs, don't store them as text. Instead, break them into their constituent parts. Index the ipHistory table on (octet1, octet2, octet3, octet4). This prevents you from having to use a text-based index (like varchar), and should provide some very nice speed improvements.
Of course, this does require some significant changes to your routine, but I think you'll find it is worth it.
CREATE TABLE ipHistory (
  octet1 TINYINT UNSIGNED,
  octet2 TINYINT UNSIGNED,
  octet3 TINYINT UNSIGNED,
  octet4 TINYINT UNSIGNED
);
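To illustrate, a lookup against such a table might look like this - the index name and the example address are assumptions, the column names come from the CREATE TABLE above:

```sql
-- Composite index over all four octets, as suggested above
CREATE INDEX idx_octets ON ipHistory (octet1, octet2, octet3, octet4);

-- Checking whether 203.0.113.7 has been seen becomes four
-- integer comparisons, all answerable from the index
SELECT 1 FROM ipHistory
 WHERE octet1 = 203 AND octet2 = 0 AND octet3 = 113 AND octet4 = 7;
```

Comparing four small integers is cheaper than comparing varchar strings, which is where the speed-up comes from.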
But you can't use those as a unique key unless each IP exists only once in the table.
I'd guess the biggest cost for MySQL is the searching.
You could use indexes in (at least) three ways:
1. Use a table with unique IPs
Create a table where every IP is mentioned only once.
Index this column as the key.
What happens when a visitor comes over:
1. the script checks the new table for the IP.
2a. if it exists, it runs an INSERT ... ON DUPLICATE KEY UPDATE.
2b. if it doesn't exist, it inserts a record into the table you're already using, plus it inserts the IP into the new table.
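A sketch of option 1 - the table names `unique_ips` and `hits` and all columns are hypothetical stand-ins for the asker's real schema:

```sql
-- Lookup table: every IP appears exactly once, keyed on ip
CREATE TABLE unique_ips (
  ip VARCHAR(15) NOT NULL,
  PRIMARY KEY (ip)
);

-- Step 1: cheap primary-key probe on the small table
SELECT 1 FROM unique_ips WHERE ip = '203.0.113.7';

-- Step 2b: IP never seen before -> record it in both tables
INSERT INTO unique_ips (ip) VALUES ('203.0.113.7');
INSERT INTO hits (ip, page_id, hit_count)
VALUES ('203.0.113.7', 42, 1);
```

The point of the extra table is that the per-hit existence check runs against a small, primary-keyed table instead of the full 750k-row log.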
2. Add a new unique column from the merging of two columns
Add a new column to your table and fill it with "IP - page".
IPs may exist multiple times, pages too.. but the combination of both shouldn't. ;)
So when a visitor comes over:
1. the script just runs an INSERT ... ON DUPLICATE KEY UPDATE
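Option 2 might look something like this - again the `hits` table, the `ip_page` column name, and the unique-key name are placeholders, and on an existing table you'd have to backfill `ip_page` before adding the unique key:

```sql
-- Combined "IP - page" column with a unique index over it
ALTER TABLE hits
  ADD COLUMN ip_page VARCHAR(50) NOT NULL,
  ADD UNIQUE KEY uq_ip_page (ip_page);

-- One statement now covers both cases: insert on first visit,
-- bump the counter on repeat visits
INSERT INTO hits (ip, page_id, hit_count, ip_page)
VALUES ('203.0.113.7', 42, 1, '203.0.113.7 - 42')
ON DUPLICATE KEY UPDATE hit_count = hit_count + 1;
```

Note that a multi-column unique key on (ip, page_id) would achieve the same thing without the synthetic column; the merged column is just the quick retrofit described above.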
3. Break the table into parts
Duplicate the table 9x (for example):
in the 1st table, keep every record where the IP starts with a 1
in the 2nd table, every record where the IP starts with a 2
in the..... etc.
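Option 3 is manual sharding; a sketch, with the `hits` table and the routing rule as assumptions (the application code would pick the table from the IP's first digit):

```sql
-- One clone of the tracking table per leading digit, e.g.:
CREATE TABLE hits_1 LIKE hits;  -- IPs starting with 1
CREATE TABLE hits_2 LIKE hits;  -- IPs starting with 2
-- ... and so on through hits_9

-- ip = '203.0.113.7' -> first digit is 2 -> query hits_2 only,
-- so each lookup scans roughly a fraction of the data
SELECT hit_count FROM hits_2
 WHERE ip = '203.0.113.7' AND page_id = 42;
```

MySQL also ships native table partitioning, which achieves a similar effect without the application having to route queries itself.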
Although the 2nd option is the quickest fix, the 1st one may give you the best performance.
(Well, actually the 3rd idea will, however you break up the parts, but that won't be necessary - it's more for huge tables (> millions of records).)