Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win

x
?
Solved

Access 2007, complex query

Posted on 2010-09-23
24
Medium Priority
?
462 Views
Last Modified: 2013-11-27
I have a table that consists of 3 columns: state, rate and zip. Each record in this table is unique. Every state has several zip codes and each one of those has a rate assigned to it. The table has about 40,000 records and I need to get it to a much smaller number. I need to combine some of the records and use zip ranges that have the same rate. Example:
State    Zip       Rate
AL       02103   .07
AL       02550   .07
AL       05982   .07

I need to make it look like this:
State    Zip_from     Zip_to      Rate
AL         02103        05982       .07

That would've been easy as I could do group by state and group by zip query and then add 2 columns Min(zip) and Max(zip), BUT there are occasions when we could find in the same state there is another record with the zip code that falls into the same range, but with different rate: to add to example above there could be a record like this:

State    Zip       Rate
AL       03276   .08

It falls within this range:

State    Zip_from     Zip_to      Rate
AL         02103        05982       .07

but it has a different rate. So, these ranges have to be somehow evaluated to see if there  are other records with different zip codes and rates that would fall within the range. If so, the range would have to be stopped and then restarted. So, following the example above we would have 2 ranges instead of 1:

this range:
State    Zip_from     Zip_to      Rate
AL         02103        05982       .07
will become 2 ranges:
State    Zip_from     Zip_to      Rate
AL         02103        02550       .07
AL         05982        05982       .07

That way this record:
State    Zip       Rate
AL       03276   .08
will be in it's own range.

I hope I explained it well enough.
Thanks
0
Comment
Question by:gkhotline
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 7
  • 6
  • +3
24 Comments
 
LVL 48

Expert Comment

by:Dale Fye
ID: 33744526
Why do you feel the need to make this transition?  

I would advise against it for just the reason you are proposing.  What happens when the rate changes for a particular zip that is in the middle of a range.  You now have to parse that one record into 3.  40,000 records is a small table for a well designed relational database.

Leave it the way it is, it is far easier to use in this format, despite a slight database size decrease.
0
 

Author Comment

by:gkhotline
ID: 33744655
Well, if the rate changes it would just be assigned into already existing range or to a new range of it's own. So it could only create one more record at the maximum. 40,000 records is only for now. The total number may increase significantly and we need to make the operation as quick as possible with the fewest records possible. I have been given this task and it has to be done. I have been trying different things, but cant figure it out.
Thanks
0
 
LVL 93

Expert Comment

by:Patrick Matthews
ID: 33744837
gkhotline,I have to second fyed's recommendation to leave your table as is.  40k records is not a lot in the grand scheme of things, and your current design is exactly what I would have recommended a priori.Think about it: today you might have a valid range of 02103 - 05982, but if tomorrow that range changes you might have to do a heck of a lot of work to catch up.Patrick
0
NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

 

Author Comment

by:gkhotline
ID: 33744876
as a followup: i need some sort of a code that would go through every record and for every state where we have same rate - look at the zip codes 1 by 1 and see if any of them go in order (ex. 02110,02111, 02112). If that is the case - put them into a range. So range would only be created if there are consecutive zip codes for a same state and rate. The rest of them would just be individual ranges of 1 record
0
 

Author Comment

by:gkhotline
ID: 33744928
Basically, I would be importing about 70,000 records every month into this tool that I am building and then will run all these queries to try to get it down to a much smaller number. Right now it is down to 40,000 and I need to make it even smaller - that is my requirement. Then I would export this file to be used in a different application. And the structure that it needs to have is:

State   zip_from   zip_to    rate

Any suggestions?
0
 
LVL 48

Expert Comment

by:Dale Fye
ID: 33750397
So what you are basically saying is that you have no choice, and management will not listen to common sense that says the current structure, when indexed properly will produce excellent results, without the added headache of this unnecessary step.
0
 
LVL 31

Expert Comment

by:Helen Feddema
ID: 33750439
That does happen, from time to time.  Sometimes, after the warned-about problems occur, they will changes their minds.
0
 
LVL 48

Accepted Solution

by:
Dale Fye earned 1000 total points
ID: 33750484
Given that, I would do something that looks like the following.

   
Public Sub MergeZipRates()

    Dim db As DAO.Database
    Dim rsS As DAO.Recordset, rsD As DAO.Recordset
    Dim strPrevState As String, strPrevZip As String, sngPrevRate As Single
    
    On Error GoTo ProcError
    
    Set db = CurrentDb
    Set rsS = db.OpenRecordset("SELECT * FROM tbl_Zip ORDER BY [State], [Zip]", , dbFailOnError)
    
    db.Execute "DELETE * FROM tbl_ZipNew"
    Set rsD = db.OpenRecordset("SELECT State, Zip_From, Zip_To, Rate FROM tbl_ZipNew", , dbFailOnError)

    While Not rsS.EOF
        If rsS!State = strPrevState And rsS!Rate = sngPrevRate Then
            strPrevZip = rsS!Zip
        Else
            rsD!Zip_To = strPrevZip
            rsD.Update
            rsD.AddNew
            rsD!State = rsS!State
            strPrevState = rsS!State
            rsD!Zip_From = rsS!Zip
            strPrevZip = rsS!Zip
            rsD!Rate = rsS!Rate
            sngPrevRate = rsS!Rate
        End If
        rsS.MoveNext
    Wend
    
ProcExit:
    rsS.Close
    Set rsS = Nothing
    
    rsD!Zip_To = strPrevZip
    rsD.Update
    rsD.Close
    Set rsD = Nothing
        
    Exit Sub
ProcError:
    If Err.Number = 3020 Then   'Update or CancelUpdate without AddNew or Edit.
        Resume Next
    Else
        MsgBox Err.Number & vbCrLf & Err.Description
        Resume ProcExit
    End If
    
End Sub

Open in new window

0
 
LVL 48

Expert Comment

by:Dale Fye
ID: 33750498
sample database attached.
ZipRanges.mdb
0
 
LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 1000 total points
ID: 33750782
Can do that in a straight query...



select state, min(zip) as start_zip, max(zip) as end_zip, rate
from (
      SELECT top 100 percent z.state,z.zip,z.rate,(select top 1 n.zip from tbl_zip n where n.state = z.state and n.zip > z.zip and n.rate <> z.rate order by n.state, n.zip) as next_rate_break
      FROM tbl_Zip as Z
      ORDER BY z.state, z.zip, z.rate
     ) as SQ
Group by state, rate, next_rate_break
order by 1,2,3,4

Open in new window

0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33750801
Oops, forgot to add the database back, and hope fyed doesnt mind too much that I borrowed it :D
ZipRanges.mdb
0
 
LVL 48

Expert Comment

by:Dale Fye
ID: 33752389
Mark,

I prefer SQL solutions over code, but I just cannot get my brain to work this early.  I'll have to take a look at this later.  The vba took me about 5 minutes, it would probably have taken me an hour to come up with the SQL.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33752485
G'Day fyed,

For me, it is the other way round :) And didnt mean anything by that comment "it can be done as a straight query" in reference to your code, just that it could also be done as a query. Hope I didnt offend. And yes, it was a bit cheeky to grab your database - ta muchly for posting it though :)
0
 
LVL 48

Expert Comment

by:Dale Fye
ID: 33754109
No offense taken.  The point is to help the OP, and learn a bit at the same time.

I'm pretty good at nested queries, but start to get confused with 3rd order nesting.
0
 

Author Comment

by:gkhotline
ID: 33754624
You guys are good! It will take me some time to understand step by step what you did... How do you want me to split the points? Given the fact fyed gave his solution first... Just want to be fair...
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33754749
However you like - it is your question :)  

And please just ask if there is anything I can help with explaining what the heck that query is really doing :) :)
0
 

Author Comment

by:gkhotline
ID: 33754860
Thank you Mark,
Would that be too much to ask to tell me step by step what you are doing? I could possible create a new question for this so that you dont waist your time for nothing.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33754939
No worries, and no need for another question... Explaination is an integral part of the solution :)
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33755204
So, first step is understanding what the real problem is. Believe it or not, you actually had the solution, just didnt recognise it as a query...

>> That would've been easy as I could do group by state and group by zip query and then add 2 columns Min(zip) and Max(zip),

followed by :

>> BUT there are occasions when we could find in the same state there is another record...

So, the basis is we can use min and max except when there is a break point. That break point is an intercepting zip code with a different rate. And you provided that information, you knew my query before I did *laughing*.

Step 1 was to then start constructing the "raw" elements, gathering the necessary information so we can run a query.

What it meant in this case was starting with a fairly "raw" query to see if we can get those breakpoints.

      SELECT top 100 percent z.state,z.zip,z.rate,  '???????'  as next_rate_break
      FROM tbl_Zip as Z
      ORDER BY z.state, z.zip, z.rate

the challenge then becomes how do we find that breakpoint ? We might be able to join, but in reality, we are really looking for just the very next instance of a zip code with a different rate from the "current" row.

So we introduce an in-line query :

     (select top 1 n.zip from tbl_zip n where n.state = z.state and n.zip > z.zip and n.rate <> z.rate order by n.state, n.zip) as next_rate_break

Now it doesnt run on its own because it needs the "current" row as a reference. We also dont have to be hugely concerned where it lives in the overall scheme of things (just yet) so long as we do know that for "this" current series, it will stop at "next_rate_break" if there is one.

Lets now incorporate that back into our query replacing the question marks:

      SELECT top 100 percent z.state,z.zip,z.rate,(select top 1 n.zip from tbl_zip n where n.state = z.state and n.zip > z.zip and n.rate <> z.rate order by n.state, n.zip) as next_rate_break
      FROM tbl_Zip as Z
      ORDER BY z.state, z.zip, z.rate

And running that we see that the detail rows are in fact picking up an instance of "next break point" when there really is one. But the rows look a bit of a mess.

So, now we need to filter out that information, so we wrap up the above inside another query. ie, it becomes a subquery

select state, min(zip) as start_zip, max(zip) as end_zip, rate
from (
      SELECT top 100 percent z.state,z.zip,z.rate,(select top 1 n.zip from tbl_zip n where n.state = z.state and n.zip > z.zip and n.rate <> z.rate order by n.state, n.zip) as next_rate_break
      FROM tbl_Zip as Z
      ORDER BY z.state, z.zip, z.rate
     ) as SQ
Group by state, rate, next_rate_break
order by 1,2,3,4


Notice how we wrap up our supquery in brackets and give it an alias (a name) of "SQ". This is also where Access can be a bit tiresome because it reckons it knows better and ssometimes tries to replace some of our brackets with sqare brackets next time you edit it. Just be careful of that ;)

So, we now have an outer query selecting from our inner or subquery "SQ" and we need to (now) find the appropriate pieces from that raw information to report on.

Back to your original observations. We can use min and max except where there is a breakpoint. Guess what, we now know that there is a breakpoint, and can use that vital piece of information to help the MIN and MAX aggregate functions.

For our outer query, we already know we want to group things by State and rate, and as observed, that might be broken up because of those breakpoints, and so, that also must be included into our grouping...

   select state, min(zip) as start_zip, max(zip) as end_zip, rate
   from (...) as SQ
   Group by state, rate, next_rate_break
   order by 1,2,3,4

Which gives us the group by clause  "   Group by state, rate, next_rate_break " and all we really have to do know is get the state and rate, and then the min(zip) and max(zip) within that breakpoint.

Oh, and to finish it off we also have to sort it in some sequence and I tend to use the "column poistion" type notation rather than column names (seen in the order by)

So, it really becomes a compound select based on facts / information as we can gather...
1) get the "raw" rows (our subquery)
2) add in those breakpoints that will upset the sequence (ie a different rate within that state ) (add an inline query into our subquery)
3) Now select and present that information (the outer query)

Does that help explain it a bit more ?
0
 

Author Comment

by:gkhotline
ID: 33756283
Wow,
I didnt expect such a detailed answer! Thank you very much! I will go through it a little bit later, but it looks like if I can't understand this then all is lost for me.
Again, thank you! I will post my comments later...
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 33758299
No problems....

I do apologise for all the typos in there - and didnt explain why I was selecting top 100 percent.

That part is an old habit. In SQL if you use an ORDER BY then you normally need a selection qualifier for that ORDER BY to work within a sub query or view in a lot of different databases (e.g. MS SQL Server). So, when / if using an order by as part of a subquery, then I normally throw in the top 100 percent.

In reality I am not so sure that the order by nor the top 100 percent is actually required in the subquery apart from wanting to make sure that we do have all the postcodes in correct sequence for the state (regardless of when the postcodes were added to the table).

In those type of situations, if you are not sure what the sequence will really be, then always best to take control of it yourself to avoid any unexpected results. In the above, however, the inline query will always get the "next-break-point" regardless of sequence because that too is also using an order by and so could get some performance improvement if we really needed to by testing the need for the "top 100 percent" and "order by" within the "SQ" subquery (it is definitely needed in the inline query).

Also, if you have any "where" qualifiers then you would first look to add them to the "SQ" subquery part to minimise the amount of "raw" data, or, you can add it to the end as well if it is performing OK given the volume of data you have.

See ? Always room for more explanation, so, if it doesnt make sense, simply ask away :)
0
 
LVL 31

Expert Comment

by:hnasr
ID: 33758784
I go with mark_wills's comment ID:33750782. It is easier for me to check. Did not check other comments, whicj may give you the same accepted result.

If you still need more clarification to this comment, I may help.
0
 

Author Closing Comment

by:gkhotline
ID: 33788576
Thank you for your solutions and explanations. Very, very helpful!
0

Featured Post

New feature and membership benefit!

New feature! Upgrade and increase expert visibility of your issues with Priority Questions.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Traditionally, the method to display pictures in Access forms and reports is to first download them from URLs to a folder, record the path in a table and then let the form or report pull the pictures from that folder. But why not let Windows retr…
If you need to forecast numbers -- typically for finance -- the Windows and Mac versions of Excel 2016 have a basket of tools to get the job done.
This Micro Tutorial will demonstrate how to use longer labels with horizontal bar charts instead of the vertical column chart.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

610 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question