Solved

Parse non-delimited Address field

Posted on 2013-11-10
11
473 Views
Last Modified: 2013-11-26
I have a field in an Access record data-set that contain the addresses of businesses. The table contains 7,000 records.

Current Record:

fldBusAddress
|646 RACOON RDG SCOTTSVILLE VA 24590              |
|1120 KINGSWAY RD AFTON VA 22920                  |
|3606 MILLINGTON RD FREE UNION VA 22940           |
|248 REAS FORD RD EARLYSVILLE VA 22936            |
|2684 CHAPEL SPRINGS LN FREE UNION VA 22940       |
|2030 AVON CT UNIT 8 CHARLOTTESVILLE VA 22902     |
|1667 BAILEYS RETREAT RD CHARLOTTESVILLE VA 22901 |

Desired Record:

HseNum|Street Name       |Suite  |City            |State |Zip
646   |RACOON RDG        |       |SCOTTSVILLE     |VA    |24590
1120  |KINGSWAY RD       |       |AFTON           |VA    |22920
3606  |MILLINGTON RD     |       |FREE UNION      |VA    |22940
248   |REAS FORD RD      |       |EARLYSVILLE     |VA    |22936
2684  |CHAPEL SPRINGS LN |       |FREE UNION      |VA    |22940
2030  |AVON CT           |UNIT 8 |CHARLOTTESVILLE |VA    |22902
1667  |BAILEYS RETREAT RD|       |CHARLOTTESVILLE |VA    |22901

I have uploaded two .txt files. One is of the example above and the other is a 100 record data-set.

I have been successful using Left() to get the HseNum and InStrRev() to get the State and Zip but the records that have a two-part City name and/or a two-part Street name has me stumped. I have also checked the knowledge base but could not find a satisfactory solution.

I am getting near a deadline and need help to move forward. Thanks in advance!!
BusAddrExample.txt
BusinessAddress.txt
0
Comment
Question by:ergenbgr
  • 3
  • 2
  • 2
  • +4
11 Comments
 
LVL 57
ID: 39637033
<<but the records that have a two-part City name and/or a two-part Street name has me stumped. I have also checked the knowledge base but could not find a satisfactory solution.>>

 That's because there is none really.   An address like this:

|2684 CHAPEL SPRINGS LN FREE UNION VA 22940  

is very though to break up.

Your best bet in these cases is to have zip code data base, lookup the city and then see if it matches some part of the string.

That doesn't always work however because many times the zip given has a city name that's different from the "city" used in the address.

 For example, 14051 gives "Amherst, NY", but someone might use the city of Williamsville, NY

 What I do in these cases is have a flag for non-validated addresses and then have someone break them up manually.

Generally, by the time you get down to the real exceptions, there's only a handful in comparison to the full list.

Jim.
0
 
LVL 57
ID: 39637038
The other thing you can do is look for things like "Lane", "Road", "Route", "LN", "RD", "RTE", etc  and try and determine where the address info ends.

Anything between there and the state is the city.

Jim.
0
 
LVL 70

Expert Comment

by:KCTS
ID: 39637095
I can't see how you are going to automate this. Its   pretty simple task to split the data into individual words but the problem here is that there is no consistency, By that I mean there is no way to tell where one part of the address end and the other starts since the street address can be one, two or three words, suite is sometimes present and sometimes missing, city again can be one, two or perhaps three words.

All I can suggest is that you revisit the record and manually insert a delimiter - perhaps use the '|' symbol as a delimiter, then use that to split the strings.
0
 
LVL 39

Expert Comment

by:als315
ID: 39637167
Look at sample. Idea is the same as proposed by JDettman: list of address info ends.
I have problem with one line:
355 RIO RD W STE 102 CHARLOTTESVILLE VA 22901
STE 102 seems to be Suite, but where should be W?
DBImportFile.accdb
0
 
LVL 57
ID: 39637198
<<STE 102 seems to be Suite, but where should be W? >>

 It's part of the address:

355 RIO ROAD WEST
Suite 102
Charlottesville
VA
22901

 and that's all you can do.   Look at the exceptions, add code to compensate, and then do it again.

  At some point there is diminishing returns and it's simplest just to handle them manually.   But at what point that ends up happening is hard to say.

 Out of 7,000, if I got it down to a couple hundred and it was a one time deal, I'd stop there.   But if it was a repeating process, then I might try harder with code.

 Really dicey with pushing with code though; next time through, something might not match up with the code you have and then you have in-accurate addresses.   Often I find it's just simpler to go the manual route.

Jim.
0
Complete Microsoft Windows PC® & Mac Backup

Backup and recovery solutions to protect all your PCs & Mac– on-premises or in remote locations. Acronis backs up entire PC or Mac with patented reliable disk imaging technology and you will be able to restore workstations to a new, dissimilar hardware in minutes.

 
LVL 21
ID: 39637289
I am getting near a deadline and need help to move forward.

If this data is going to be used to do a mailing?

If yes and this were my project I would get with a local post office or mailing services that can spit out the data for you and verify CASS)  it.  I would also check the USPS web site to see what it offers. Since it appears that time is not on your side, I would definitely look into using a service.
0
 
LVL 31

Expert Comment

by:Helen_Feddema
ID: 39637299
Try this:  

SELECT tblContactsTest.WholeAddress, Left([WholeAddress],InStr([WholeAddress],",")-1) AS Street, Mid([WholeAddress],InStr([WholeAddress],",")+1,([StatePos]-[StreetPos]-4)) AS City, Mid([WholeAddress],Len([WholeAddress])-7,2) AS State, Right([WholeAddress],5) AS Zip, InStr([WholeAddress],",")-1 AS StreetPos, Len([WholeAddress])-7 AS StatePos
FROM tblContactsTest;

WholeAddress is the field with the entire address in it.  This will split the address into Street, City, State and Zip.  You may have to tweak the expression a bit -- I think it was done for a case where there was a comma after the city.  Without the comma, it is very difficult to split the address correctly.
0
 
LVL 70

Expert Comment

by:KCTS
ID: 39637419
no commas or any other delimiters in the data - hence my previous comments
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39637907
As already noted there probably isn't any 100% method for dividing those strings onto all the parts you would like, and additionaly I can't produce the Access code. But below using sql server I could get this far with the sample data
| FIRST_NUM |                    ADDR1 |                            ADDR2 | STATE |    ZIP | EXCEPTION |
|-----------|--------------------------|----------------------------------|-------|--------|-----------|
|     2755  |                   (null) |   SCHELFORD FARM CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1171  |                   (null) |  LITTLE MTN FARM CHARLOTTESVILLE |    VA |  22911 |     CHECK |
|      675  |     BERKMAR CIR STE 101  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|      355  |        RIO RD W STE 102  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|      355  |        RIO RD W STE 102  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1710  |      SEMINOLE TRL STE 2  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1410  |  INCARNATION DR STE 201  |                           CVILLE |    VA |  22901 |     CHECK |
|     2030  |          AVON CT UNIT 8  |                  CHARLOTTESVILLE |    VA |  22902 |     CHECK |
|      503  |      FAULCONER DR STE 5  |                  CHARLOTTESVILLE |    VA |  22903 |     CHECK |
|     1667  |      BAILEYS RETREAT RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      518  |        BARRACKS FARM RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3002  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3002  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3036  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2902  |            BROOKMERE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1334  |            HIGH VIEW DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      465  |             IVY FARM DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      445  |             IVY FARM DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      125  |            IVY RIDGE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1844  |              IVYWEST LN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2655  |           MERIWETHER DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2735  |           MERIWETHER DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|       84  |          OAK FOREST CIR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1710  |            OLD FORGE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1100  |             PEN PARK LN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2408  |          PINE GARTH RUN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      460  |             PREMIER CIR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2271  |            SEMINOLE TRL  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2271  |            SEMINOLE TRL  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      881  |               TILMAN RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2655  |          BUCK ISLAND RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2624  |          BUCK ISLAND RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|      912  |        MICHIE TAVERN LN  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     4392  |                NAHOR RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2390  |     ROSE HILL CHURCH LN  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2962  |          SCOTTSVILLE RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2866  |          SCOTTSVILLE RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|        1  |           BOARS HEAD PL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      635  |               BOONE TRL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3610  |              COLSTON DR  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1100  |               DRYDEN LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1345  |      DUDLEY MOUNTAIN RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1805  |          FIDDLESTICK LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      719  |            GRASSMERE RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     2220  |                  IVY RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3244  |        OLD LYNCHBURG RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3635  |         RALEIGH MTN TRL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     4189  |             RED HILL RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      644  |          TAYLORS GAP RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      410  |         WHITE GABLES LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     2840  |                CINDY LN  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      310  |           CRESTFIELD CT  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     1900  |             FRANKLIN DR  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     1705  |           GOLDENTREE PL  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2831  |        RIGGORY RIDGE RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      597  |         ROCKY HOLLOW RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      175  |            S PANTOPS DR  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     3661  |          STONY POINT RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2040  |              TREMONT RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     3363  |           WATTS PASSAGE  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2693  |            HEARTWOOD RD  |                            AFTON |    VA |  22920 |           |
|     1120  |             KINGSWAY RD  |                            AFTON |    VA |  22920 |           |
|     4640  |           EAGLE NEST LN  |                    BARBOURSVILLE |    VA |  22923 |           |
|     5327  |        MONACAN TRAIL RD  |                       COVESVILLE |    VA |  22931 |           |
|     3224  |         FOX MOUNTAIN RD  |                           CROZET |    VA |  22932 |           |
|     4438  |                GARTH RD  |                           CROZET |    VA |  22932 |           |
|     5784  |               LOCUST LN  |                           CROZET |    VA |  22932 |           |
|     6310  |                ESTES LN  |                             DYKE |    VA |  22935 |           |
|      555  |            ARROWHEAD CT  |                      EARLYSVILLE |    VA |  22936 |           |
|     5854  |        BUFFALO RIVER RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     2610  |          EARLYSVILLE RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     3458  |            REAS FORD LN  |                      EARLYSVILLE |    VA |  22936 |           |
|      248  |            REAS FORD RD  |                      EARLYSVILLE |    VA |  22936 |           |
|      963  |            RED CREST LN  |                      EARLYSVILLE |    VA |  22936 |           |
|     6040  |              VINTAGE RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     7814  |       CHESTNUT GROVE RD  |                           ESMONT |    VA |  22937 |           |
|     7395  |              PORTERS RD  |                           ESMONT |    VA |  22937 |           |
|     3066  |              RED ROW LN  |                           ESMONT |    VA |  22937 |           |
|     2118  |        BUCK MOUNTAIN RD  |                       FREE UNION |    VA |  22940 |           |
|     4869  |            CATTERTON RD  |                       FREE UNION |    VA |  22940 |           |
|     2684  |       CHAPEL SPRINGS LN  |                       FREE UNION |    VA |  22940 |           |
|     3606  |           MILLINGTON RD  |                       FREE UNION |    VA |  22940 |           |
|     7154  |             JOHNSTON LN  |                     GORDONSVILLE |    VA |  22942 |           |
|     8311  |       ROCKFISH GAP TPKE  |                        GREENWOOD |    VA |  22943 |           |
|     6751  |              FORTUNE LN  |                            KEENE |    VA |  22946 |           |
|      572  |              MAXWELL RD  |                          KESWICK |    VA |  22947 |           |
|     2779  |        MONACAN TRAIL RD  |                     NORTH GARDEN |    VA |  22959 |           |
|     3401  |        OLD LYNCHBURG RD  |                     NORTH GARDEN |    VA |  22959 |           |
|     2210  |           FRAYS MILL RD  |                     RUCKERSVILLE |    VA |  22968 |           |
|     6627  |             BLENHEIM RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|     5499  |       JEFFERSON MILL RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|      646  |              RACOON RDG  |                      SCOTTSVILLE |    VA |  24590 |           |
|     2350  |           SECRETARYS RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|     2582  |           SECRETARYS RD  |                      SCOTTSVILLE |    VA |  24590 |           |
		

Open in new window

I had to build a small table to assist, it contains some of the roadway abbreviations used (but not 'farm' that's a problem for "farm st" etc). So here is the query, I used several layers of nesting to try an make it more legible.  Maybe it will help.
CREATE TABLE road
	([abbrev] varchar(7))
;
	
INSERT INTO road
	([abbrev])
VALUES
	('CIR'),
	('CT'),
	('DR'),
	('LN'),
	('PASSAGE'),
	('PL'),
	('RDG'),
	('RD'),
	('RUN'),
	('TPKE'),
	('TRL')
;

SELECT
        first_num
      , CASE WHEN has_nums > 0 THEN left(portion, charindex(' ',portion,has_nums))
             WHEN abbrev IS NOT NULL THEN left(portion, charindex(' ' + abbrev +' ', portion) + len(abbrev)+1 )
        END AS addr1
      , CASE WHEN has_nums > 0 THEN substring(portion,charindex(' ',portion,has_nums),islong)
             WHEN abbrev IS NOT NULL THEN substring(portion, charindex(' ' + abbrev +' ', portion) + len(abbrev)+1, islong )
             ELSE portion
        END AS addr2
      , STATE
      , zip
      --, abbrev
      , case when abbrev is null or has_nums > 0 then 'CHECK' else '' end as Exception
FROM (
      SELECT
              fldBusAddress
            , first_num
            , substring(fldBusAddress,pos_start, (islong - (pos_zip + pos_state + pos_start - 1))) AS portion
            , PatIndex('%[0-9]%',substring(fldBusAddress,pos_start, (islong - (pos_zip + pos_state + pos_start - 1))) ) AS has_nums
            , right(substring(fldBusAddress,1,(islong - pos_zip)),pos_state) AS STATE
            , zip
            , islong
            , abbrev
      FROM (
            SELECT
                    fldBusAddress
                  , islong
                  , charindex(' ',fldBusAddress) AS pos_start
                  , left(fldBusAddress, charindex(' ',fldBusAddress)) AS first_num
                  , charindex(' ',revAddress) AS pos_zip
                  , right(fldBusAddress, charindex(' ',revAddress)) AS zip
                  , charindex(' ',reverse(substring(fldBusAddress,1,islong-charindex(' ',revAddress)))) AS pos_state
            FROM (
                  SELECT
                         fldBusAddress
                       , reverse(fldBusAddress)                                  AS revAddress
                       , len(fldBusAddress)                                      AS islong
                       , len(fldBusAddress) - len(replace(fldBusAddress,' ','')) AS spaces
                  FROM yourtable
                 ) a
            WHERE spaces > 3
            ) b
      LEFT JOIN road ON fldBusAddress LIKE '% ' + road.abbrev + ' %'
     ) c
ORDER BY
        CASE WHEN abbrev IS NULL THEN 0 ELSE 1 END
      , CASE WHEN has_nums > 0 THEN 0 ELSE 1 END
      , zip
      , addr2
      , addr1
;

-- http://sqlfiddle.com/#!3/01a910/18

Open in new window

0
 

Author Closing Comment

by:ergenbgr
ID: 39679518
Thanks!!
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39679529
Pleased I could assist. Thanks for the grading. Cheers, Paul
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Tags from access to excel 3 24
Sub Reports 8 21
Prevent use of Microsoft Office application 9 52
SQL Query Conversion of IIF statement into CASE - Syntax issue 17 28
If you have heard of RFC822 date formats, they can be quite a challenge in SQL Server. RFC822 is an Internet standard format for email message headers, including all dates within those headers. The RFC822 protocols are available in detail at:   ht…
In this article we discuss how to recover the missing Outlook 2011 for Mac data like Emails and Contacts manually.
In Microsoft Access, when working with VBA, learn some techniques for writing readable and easily maintained code.
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

947 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

22 Experts available now in Live!

Get 1:1 Help Now