ambientsbs
asked on
Faster MySQL Update Query to update field in table based on other table
Hello Experts,
I reached out here earlier to get some code for the following problem, and it's worked great, however I'm now using it on a table with about 3 Million records and it takes forever and eventually times out.
Here's what I have. I have a table with Cities and States and an empty Zip field. I'm using the statement below to populate the zip field based on the name of the city and state and a table that stores all my city, state, zip info:
Is there any way that I can maybe order the Z_MyTable within this query in order to make it run faster? I'm at a loss, but it just doesn't seem to work great with that many records...
Thanks for the help!
I reached out here earlier to get some code for the following problem, and it's worked great, however I'm now using it on a table with about 3 Million records and it takes forever and eventually times out.
Here's what I have. I have a table with Cities and States and an empty Zip field. I'm using the statement below to populate the zip field based on the name of the city and state and a table that stores all my city, state, zip info:
UPDATE Z_MyTable S
STRAIGHT_JOIN CityStateZip_LatLong SA
ON S.State = SA.state
SET S.Zip = SA.zipcode
WHERE S.City LIKE CONCAT('%',SA.city,'%')
AND S.State = SA.state
Is there any way that I can maybe order the Z_MyTable within this query in order to make it run faster? I'm at a loss, but it just doesn't seem to work great with that many records...
Thanks for the help!
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
This WHERE clause is incorrect and unoptimal as you are asking MySQL
to do a substring search within a string:
WHERE S.City LIKE CONCAT('%',SA.city,'%')
Incorrect? Now, we don't know depends on the contents of the city fields
Unoptimal ! Yes sure, but every where, order by or group by is Unoptimal if the index in not available, But using wildcard is very unoptimal :-)
btw ambientsbs
Why do you use this %like% ?
to do a substring search within a string:
WHERE S.City LIKE CONCAT('%',SA.city,'%')
Incorrect? Now, we don't know depends on the contents of the city fields
Unoptimal ! Yes sure, but every where, order by or group by is Unoptimal if the index in not available, But using wildcard is very unoptimal :-)
btw ambientsbs
Why do you use this %like% ?
I am presuming the CityStateZip_LatLong table (which is aliased to "SA")
is a reference table of cities, states, zip codes, etc .
I am presuming that the city field contains the name of each city in the
U.S. and therefore starts in column 1. It is therefore more correct -- as
I stated -- for proposed query to use a WHERE clause like this:
WHERE S.City LIKE CONCAT('%',SA.city,'%')
As it turns out it should be more optimal as the MySQL optimizer can now
internally do a match against a city index (should it exist) instead of
a substring search through the entire table.
is a reference table of cities, states, zip codes, etc .
I am presuming that the city field contains the name of each city in the
U.S. and therefore starts in column 1. It is therefore more correct -- as
I stated -- for proposed query to use a WHERE clause like this:
WHERE S.City LIKE CONCAT('%',SA.city,'%')
As it turns out it should be more optimal as the MySQL optimizer can now
internally do a match against a city index (should it exist) instead of
a substring search through the entire table.
In that case you need the '%',SA.city,'%'
SA.city,'%' is not enough, there are loads of countries that have the ZIP code in front of the city name.
By the way, this will not work if one of the cities is also part of a larger city name:
eg. "new york" like '%york%' is also correct
yes I know it's a different state, but I'm not smart enough to find a better example :-)
SA.city,'%' is not enough, there are loads of countries that have the ZIP code in front of the city name.
By the way, this will not work if one of the cities is also part of a larger city name:
eg. "new york" like '%york%' is also correct
yes I know it's a different state, but I'm not smart enough to find a better example :-)
ASKER
Perhaps I'm doing all of this wrong and there's a simpler way to do it.....let me clarify what my data looks like. I have a list of a couple million vehicles with their cities and states but no zip codes. The cities and states will show things like:
city state
san francisco bay area CA
new york city NY
greenville/upstate SC
I also have the CityStateZip_LatLong table that has a listing of every single city, state, and zip code in the US.
I'm using the LIKE statement because I never know what the city is going to look like, but I think it will work fine to just use the WHERE S.City LIKE CONCAT(SA.city,'%') statement.
Is there a cleaner/faster way to assign a zip code to every record based on city/state than what I am doing? Hope this helps clarify....
I'll work on the suggestions given and provide feedback. Thanks for the initial responses!
city state
san francisco bay area CA
new york city NY
greenville/upstate SC
I also have the CityStateZip_LatLong table that has a listing of every single city, state, and zip code in the US.
I'm using the LIKE statement because I never know what the city is going to look like, but I think it will work fine to just use the WHERE S.City LIKE CONCAT(SA.city,'%') statement.
Is there a cleaner/faster way to assign a zip code to every record based on city/state than what I am doing? Hope this helps clarify....
I'll work on the suggestions given and provide feedback. Thanks for the initial responses!
If this must be like it is, you can add a "where s.zip is null" and limit the number of rows being updated in one step.
This won't reduce the overall time, but you can successfully update all records without timeouts.