Link to home
Start Free TrialLog in
Avatar of frasierphilips
frasierphilips

asked on

Update speed issues on SQL database maintained by PERL script

We have an online database, hosted on a dedicated UNIX-BASED web server by the German company '1 and 1' and we are having an intermittent problem with the speed of writes when we update any records in the database.  Reads from the database are always fine - it is writes only.

The database is used to drive an ecommerce website and contains approx. 120,000 records.  As part of the routine magement of the database, a retired guy comes in a couple of hours a day and modifies such things as catalogue numbers, stock quantities, prices etc.

I cannot stress this enough - this is a very low-key business - it is not Amazon - at his busiest, he probably updates 50 - 100 records a day.  Record fields are accessed by an html form with a search box pulling up a sheet with a record's details (approx. 15 fields in a record).  Individual fields are then updated on-screen and an update button saves the record back via an SQL 'Update' statement constructed by a receiving perl script (called als.cgi).

The problem seems to occur after variable amounts of time - sometimes he can update 30 records, sometimes 10, sometimes 2 or 3 - normally when he clicks the update button, the write update takes 1-2 seconds - when this issue arises an update can take 2-3 minutes!  No suprise, the hosting company's support line was totally unhelpful - they sent me a 150 page PDF file about using SQL and told me that I was using 'Slow Queries'.

The one helpful thing they did tell me was to use SSH and the TOP command to see what processes were running on the Server at the time of the slow updates - I did and was amazed to find that when the update issue arose, for 2 -3 minutes there were up to 6 versions of the script (als.cgi) showing in the process list (all taking 0% processor time).  After a few minutes of them bobbing up and down the list, they intermittently vanished one at a time and the updates went back to normal speed until the next time.

As the code for the admin routines used to make the updates is on the back-end of the ecommerce website, in the same perl file as the routines that drive the front-end, I assumed the other instances of als.cgi were people browsing the products on the site (which is not very busy usually) but when this happpens there are always the same 4 - 6 instances of the script in the list - it was my understanding that PERL is an epoch level language, i.e. everytime the script is run, all resources are created on entry and destroyed on exit - how can there be any versions of the script hanging about in the process list?

To be sure, I re-coded the admin screen (which I wrote about 4 years ago) using modern techniques such as AJAX for the user interface and seperate .CGI files for read (get.cgi) and write access (put.cgi) to the database . I also coded the script to only update database fields which had been changed on the form (rather than all fields in the record as it does at present) - this should have dramatically reduced the load on the server. To my amazement after a while it did exactly the same thing - began to run slowly for updates, with multiple copies of als.cgi in the process list - bear in mind, als.cgi wasn't even being executed for updates at this point, put.cgi/get.cgi were used instead but no sign of them in the process list! - I presume that this confirms my hypothesis that there are people browsing the site (the routines for which are contained in als.cgi) at the same time as we are updating records on the back-end.

The hosting company seem to be sticking to the line that the site is inefficient and it is a quota issue - I think this is rubbish, we have our own server and I would expect it to be able to handle one guy slowly updating a few dozen record fields a day without the server breaking a sweat.
Avatar of Fairlight2cx
Fairlight2cx
Flag of United States of America image

Perl programs do pre-compile at startup, allocate their stuff, and destroy everything on exit (unless you're using full mod_perl, which is another story).  That, however, simply means that each -instance- of a program follows this behaviour for its own resources.  Unless a program bothers to check a PID file or some other mechanism to ensure single-instance operations, there's nothing stopping 6 or 6000 instances from running concurrently, short of system RAM and CPU cycles.

As for what may be going on...  If reads are always fine, but writes are problematic, my first suspicioin would be a cluster that's not performing correctly.  The normal configuration for a cluster would be multiple machines for reads, but a single machine for writes--and therein lies the potential bottleneck.  Unless his MySQL server is on a VPS dedicated to him, it is likely a shared solution, which means it doesn't necessarily matter how busy this one person is, if thirty others are bombarding the cluster with writes.  I'm not saying it -is- a cluster, I'm saying that what you've described sounds like that kind of behaviour.

As for the scripts...  If you're on a system with strace(1), try attaching strace to the PID of the scripts when they're sitting there, and see where they're hanging--or at least spending a lot of time.  If it's a system that supports the proc filesystem, you may also want to look in /proc/[PID]/fd at the file descriptors in a long-listing format, to see -what- is being accessed by file descriptors if something is attempting to be read or written.  Depending on unix variant, there are other analogs to strace, but last I heard, 1 and 1 uses linux.
Avatar of frasierphilips
frasierphilips

ASKER

Thanks for this - I know where they're hanging - on the line that actually invokes the update query.  I'm not quite as knowldgeable as you re the debugging infrastructure available on UNIX but all my scripts have a 'showEnv' routine which basically dumps the environment variables to the display and exits the script - I use it to find choke points all the time.  If I put the call to showEnv before it runs the query it executes immediately, after it, it executes 2-3 minutes later (in line with the speed issue).
Avatar of arnold
I think part of the options 1 and 1 provide with in their control panel is the use of PHPmyADMIN.
You could use that to see how you can improve the performance of your mysql.  The issue might be RACE conditions (locks if used) as well as I/O which fairligh2cx covered.  
Why would this be an issue with such as straightforward database?  I could understand it if there were thousands of queries being executed per day but there are no more than a few hundred per week.  When I went onto the PHPmyAdmin screen I did notice the database had an 'overhead' of 450000 (I can't remember the exact amount) - what is this?  I searched the internet but I couldn't find a clear definition.  I did optimize the database (as some of the articles instructed) and the overhead vanished.
If it's a shared hosting/virtualised environment, could be as simple as disk resource contention with other activity on the same physical node.  Could be a bottleneck in the internal network on NAS drives, if that's how the provider is running their storage.  It's really hard to say without knowing the exact topography and configuration over at this particular site.  If the database server itself is shared, it could be oversold, and in high contention as a resource.

Most straightforward approach that will waste the least amount of speculation would be to open a support ticket with 1 and 1, let them know what's going on, and ask for a resolution.
Contacting the support helpdesk was the first thing I did - they responded by sending me a 150 page mySQL manual in PDF format and suggesting I was running 'slow queries' - they ignored all my follow up emails and when I phoned them they just told me to read the PDF file and repeated over and over that it wasn't a hardware problem, it was a script issue - I was speechless by the end of the call!
I'd suggest moving it to a VPS.  I use ViUX (viux.com), and have been quite pleased with them.  Totally unmetered bandwidth, as well, on their VPS solutions...and I've clocked mine at 44mbit, which may have actually been a limitation of the remote end, not my server.  I lease an entirely virtualised server.  And all their servers, even dedicated ones, are Virtuozzo based virtuals, so if you need to migrate from a VPS to dedicated, it's simple for them to do.  They're really responsive on support issues, as well.

I've had other people report issues with being displeased with 1 and 1.  This isn't the first time I've heard of their service allegedly being sub-par.
Check the phpmyadmin and if you could post the output/statistics with where the concern might be.

Do you have triggers/constraints on your tables which might also explain the slowness of inserts/updates ?
Actually, the "slow queries" thing doesn't make sense, if you're only hitting problems on writes.  *boggle*

What I'd be looking at is whether or not the table(s) in question need(s) a lock, and if that's not what you're ending up waiting on.
Why would I need a lock?  At present there is only one person updating the table at any one time but several users could be reading from the same table - would this make a difference?  Also, I've just noticed that when the slow access problem is happening, phpAdmin also suffers from the same speed problems.
The issue might be two fold as fairlight2cx explained.  Depending on how your database is hosted, could explain the issue.
I.e. the resources of the system are being depleted with your complex queries. The server is shared and they have too many high use users during some period of time. etc.
The only thing you can do to eliminate your setup schema/queries as the cause is to optimize what you control.
use: explain your query and see what if any suggestions could be taken
i.e. adding another index on a column might speed up the query.
On insert/update is there a trigger/constraint that delays the process?
Guys - thanks for all your help, but I was worried this might happen - I'm getting all these involved and complex solutions yet you all seem to be ignoring my main point - this is a simple,simple,simple database on a realtively quiet website maintained by a retired guy who works part-time and at a very sedate pace - even when I modified the script so that the updates were basically to one field, the system still ground to a halt - surely given the circumstamces, this should never happen?  There has been nothing outside a basic setup because this is a basic database - records only contain 15 fields.  Like I said in my initial blurb, this isn't amazon.com.
You said 4-6 instances of the script can be in the process table at once.  If even one of them gets as far as locking the table to do an update, the rest need to wait until the table is unlocked.

As for the simplicity, if it's really that simple and still running that slow, it again seems to point towards resource contention.  You still haven't given a clue as to the topography.  Is the actual database engine specific to -your- server, or is it 1 and 1's server on which you have a database?

Have you tried something simple...like grabbing a copy of VirtualBox, installing it, throwing a quick and basic linux install onto it, plugging the web application and database into it, and seeing if the same problems exhibit themselves.  If they don't, then you can rule out the application code.  If they do, then you at least know it's not limited to 1 and 1, and it's something else.  You need to break the stalemate in terms of, "What might be wrong?"  Start taking steps to eliminate possibilities.  What I've just described doing costs nothing in terms of software, little in terms of time, and is a lot faster than guessing at the problem indefinitely.
I have another install of this exact code running on XAMPP on a Windows XP Pro machine in the warehouse of a local enginering business which is far busier than this and it has been flawless.  Also, I've just been on to the 1and1 server and did 20 stock updates with no problems whatsoever - I'm fairly convinced it's a 1and1 sharing issue/config problem - I fired off a fairly blistering email threatening Trading Standards etc last night so we'll see what happens next.
ASKER CERTIFIED SOLUTION
Avatar of arnold
arnold
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial