Link to home
Start Free TrialLog in
Avatar of t3chguy
t3chguyFlag for United States of America

asked on

Optimizing Script Performance

I have a script that I"ve written in PHP.

I wrote a script for a company, but it's taking extremely long.  All table columns are indexed properly, but sometimes the script can take up to 20 minutes to run.

I can provide code examples if needed, but first allow me to explain how the script runs to see if it can be optimized or organized a little better.

A user will select a start date, end date, and location from a form.  Upon submission, the form submits to an ajax call that passes the POST variables to the main script called run_audit.php

The script begins by connecting to an external API and returns XML data.  From there, it loops through the XML and inserts records into the database.  I then make another call to the API to return different data, for the specified location. I need to update the table I just wrote to with the correct type for each bus record I've inserted.  Unfortunately these are two separate API calls and actions.

I notice that the update script takes a little while to run, I've tried pulling the bus numbers out, looping through there and updating, and I've also tried looping through the returned API results and updating.  Both take about the same time.

The program then makes a call to an MSSQL server and runs a query that returns the employee name, employee id, and job title. This part takes a little while as well, but I'm assuming that the MSSQL tables I'm pulling from are not indexed properly [this part is not my issue, as I'm just connecting to another person's table].

With the employee query I just ran, I put the results in a while loop.  Within this while loop, I have a foreach loop that loops through each day of the week.

Now within each day, I have to run queries to get the data initially inserted from the API call.

For each person, for each day of the week, I run queries for the bus number that they drove OUT of the lot time and back IN the lot, for each day segment.

Day Segments are broken up like this: AM, Mid, PM, and Late PM.

The loop now is looping for each employee [~200] per day [5 days] per segment [4 per day] with both and IN and OUT query.

By my calculations, my script is running about 8,000 queries, which could be the reason the script is extremely slow.

As each piece of data is being collected, I am adding it to an array, and then doing a json_encode on that array to pass back to the javascript and building HTML tables dynamically based on the returned data.

I'm wondering if there is a better way to structure the process so this script doesn't take 20 minutes.

I have attached an image below of the JSON array that is being returned from the script.

Again, I know without code it's almost impossible to visualize, and will be happy to provide if needed.

Thank you in advance.
SH.png
Avatar of Ray Paseur
Ray Paseur
Flag of United States of America image

... running about 8,000 queries
That is likely to explain at least part of why it is slow.  How many results sets are created from the 8,000 queries?  Have you used EXPLAIN SELECT (or similar - I'm not sure about MSSQL's capabilities here)?
Avatar of t3chguy

ASKER

In the example where there are 200 employees at a location, then 1,000 keys exist in the final array, 44 keys nested in each of those 1,000 keys.

I have not given the Explain SELECT a shot but will try that shortly and post back later this evening with some more detailed information.
Maybe show us the CREATE TABLE statements, too.  I have a feeling that anything getting data about 200 employees should be lightning fast.
Avatar of t3chguy

ASKER

Will do, thank you!
Avatar of t3chguy

ASKER

Sorry it has taken so long to get back to this.  I've sort of torn the script apart and put it together piece by piece.

I'm thinking the problem is I am connecting to both a MySQL and MSSQL database server.

My initial code is a call to an API which returns data in 1,089ms.

Then I loop through that data  and if the bus is a certain subtype, I run an update query.

		#Loop through the new array to update bus sub types.
		foreach($asset_array['asset'] AS $key => $arr) {
			if($arr['subtype'] == 'Wheelchair') {
				echo $arr['fleet'] . ' - ' . $arr['subtype'] . '<br />';
				$upd_bus_types = mysqli_query($link, "UPDATE auto_pay_data SET bus_type = 'Wheelchair' WHERE fleet_number = '".$arr['fleet']."'") or die("Cannot update bus type - " . $arr['fleet'] . ": " . mysqli_error($link));
			}
		}

Open in new window


The API plus the DB updates completes in about 2400 ms.

From here, I go back to MSSQL and run a query to pull all employees out of a table and loop through them.

I first run the following query (field / table names changed for security:

$emp_sql = sqlsrv_query($mssql_link, "SELECT [tbl_Profile].i_EmployeeID, EmployeeFirstName,  EmployeeLastName, EmployeeJobTitle  FROM [dbname].[dbo].[tbl_Profile] LEFT JOIN [dbname].[dbo].[tbl_EmpProfile] ON [dbname].[dbo].[tbl_Profile].i_EmployeeID = [".$cic_db_name."].[dbo].[tbl_EmpProfile].i_EmployeeID WHERE EmployeeJobStatus = '1' AND (i_EmployeeJobTitle = '1' OR i_EmployeeJobTitle = '6')", array(), array("Scrollable"=>"buffered"));

Open in new window


This returns 246 rows.  I'm able to loop through two different employee time tables as well as the bus number table, and this by itself returns in 7493ms.

The problem comes in when I run both the API + update query in addition to the previous mssql script, the script just takes a long, long time to run...about 20 minutes.

It's important to note that within the employee loop I just posted, I'll run 3 queries in mssql and about 10 queries in mysql.

Do I need to do something differently when I'm switching between mysql queries and mssql queries?

Thanks.
Avatar of t3chguy

ASKER

After digging into this a lot more, allow me to explain correctly, what is happening.

As I previously stated, I'm looping through ~246 employees, and within that loop, I'm running queries for each day of the week.

As an example, my loop looks like this:

foreach($employee as $v) {
foreach($day as $d) {

//Do a bunch of queries, data manipulations, etc.

}

Open in new window


In addition, right before I close the "day" loop, I'm building an array to store all the data generated on this page.

At the end of the loop, I do a
json_encode($report_data)

Open in new window

, which returns data back to the initial ajax call and displays the data in an html table.

The queries and data manipulation happen almost instantly...~30 seconds to run all these queries.

The bottleneck occurs when I build that final array.  It seems to come to a complete standstill, in fact.

Again, in that second loop:

foreach($employee as $v) {
foreach($day as $d) {

//Do a bunch of queries, data manipulations, etc.
//...
$report_data[] = array(
	#Common Fields
	"day" => $day, 
	"date" => $date, 
	"first_name" => trim($row['c_EmployeeFirstName']), 
	"last_name" => trim($row['c_EmployeeLastName']), 
	"employee_id" => $row['i_EmployeeID'], 
	"job_title" => trim($job_title), 
	
	#AM Segment Specific
	"am_bus_number" => $am_bus_number, 
	"am_in_time" => $am_in_time, 
	"am_out_time" => $am_out_time,
	"am_zonar_out" => $am_zonar_out,
	"am_time_diff_in" => $am_time_diff_in,
	"am_zonar_in" => $am_zonar_in,
	"am_time_diff_out" => $am_time_diff_out,
	"am_zonar_out_violation" => $am_zonar_out_violation,
	"am_zonar_in_violation" => $am_zonar_in_violation,
	
	#Mid Segment Specific
	"mid_bus_number" => $mid_bus_number, 
	"mid_in_time" => $mid_in_time, 
	"mid_out_time" => $mid_out_time,
	"mid_zonar_out" => $mid_zonar_out,
	"mid_time_diff_in" => $mid_time_diff_in,
	"mid_zonar_in" => $mid_zonar_in,
	"mid_time_diff_out" => $mid_time_diff_out,
	"mid_zonar_out_violation" => $mid_zonar_out_violation,
	"mid_zonar_in_violation" => $mid_zonar_in_violation,

	#PM Segment Specific
	"pm_bus_number" => $pm_bus_number, 
	"pm_in_time" => $pm_in_time, 
	"pm_out_time" => $pm_out_time,
	"pm_zonar_out" => $pm_zonar_out,
	"pm_time_diff_in" => $pm_time_diff_in,
	"pm_zonar_in" => $pm_zonar_in,
	"pm_time_diff_out" => $pm_time_diff_out,
	"pm_zonar_out_violation" => $pm_zonar_out_violation,
	"pm_zonar_in_violation" => $pm_zonar_in_violation,

	
	#LPM Segment Specific
	"lpm_bus_number" => $lpm_bus_number, 
	"lpm_in_time" => $lpm_in_time, 
	"lpm_out_time" => $lpm_out_time,
	"lpm_zonar_out" => $lpm_zonar_out,
	"lpm_time_diff_in" => $lpm_time_diff_in,
	"lpm_zonar_in" => $lpm_zonar_in,
	"lpm_time_diff_out" => $lpm_time_diff_out,
	"lpm_zonar_out_violation" => $lpm_zonar_out_violation,
	"lpm_zonar_in_violation" => $lpm_zonar_in_violation
	);
}
}

Open in new window


How can I get this data put into the array quicker, so that I can still echo it through with json_encode and pass it back to my html page?
ASKER CERTIFIED SOLUTION
Avatar of Ray Paseur
Ray Paseur
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Francisco Igor
Can you comment the array bulding and json_encode stages to verify if the script time is affected?

which is the expected time to run ?
Thinking in the amount of iterations (20min for ~4000 iterations)  you have about ~200ms per iteration (at some point of view, faster).
So the best optimization is to reduce the amount of iterations, or reducing the iteration time to 100ms or less, to reach your expectations.