Solved

Parse non-delimited Address field

Posted on 2013-11-10
11
470 Views
Last Modified: 2013-11-26
I have a field in an Access record data-set that contain the addresses of businesses. The table contains 7,000 records.

Current Record:

fldBusAddress
|646 RACOON RDG SCOTTSVILLE VA 24590              |
|1120 KINGSWAY RD AFTON VA 22920                  |
|3606 MILLINGTON RD FREE UNION VA 22940           |
|248 REAS FORD RD EARLYSVILLE VA 22936            |
|2684 CHAPEL SPRINGS LN FREE UNION VA 22940       |
|2030 AVON CT UNIT 8 CHARLOTTESVILLE VA 22902     |
|1667 BAILEYS RETREAT RD CHARLOTTESVILLE VA 22901 |

Desired Record:

HseNum|Street Name       |Suite  |City            |State |Zip
646   |RACOON RDG        |       |SCOTTSVILLE     |VA    |24590
1120  |KINGSWAY RD       |       |AFTON           |VA    |22920
3606  |MILLINGTON RD     |       |FREE UNION      |VA    |22940
248   |REAS FORD RD      |       |EARLYSVILLE     |VA    |22936
2684  |CHAPEL SPRINGS LN |       |FREE UNION      |VA    |22940
2030  |AVON CT           |UNIT 8 |CHARLOTTESVILLE |VA    |22902
1667  |BAILEYS RETREAT RD|       |CHARLOTTESVILLE |VA    |22901

I have uploaded two .txt files. One is of the example above and the other is a 100 record data-set.

I have been successful using Left() to get the HseNum and InStrRev() to get the State and Zip but the records that have a two-part City name and/or a two-part Street name has me stumped. I have also checked the knowledge base but could not find a satisfactory solution.

I am getting near a deadline and need help to move forward. Thanks in advance!!
BusAddrExample.txt
BusinessAddress.txt
0
Comment
Question by:ergenbgr
  • 3
  • 2
  • 2
  • +4
11 Comments
 
LVL 57
ID: 39637033
<<but the records that have a two-part City name and/or a two-part Street name has me stumped. I have also checked the knowledge base but could not find a satisfactory solution.>>

 That's because there is none really.   An address like this:

|2684 CHAPEL SPRINGS LN FREE UNION VA 22940  

is very though to break up.

Your best bet in these cases is to have zip code data base, lookup the city and then see if it matches some part of the string.

That doesn't always work however because many times the zip given has a city name that's different from the "city" used in the address.

 For example, 14051 gives "Amherst, NY", but someone might use the city of Williamsville, NY

 What I do in these cases is have a flag for non-validated addresses and then have someone break them up manually.

Generally, by the time you get down to the real exceptions, there's only a handful in comparison to the full list.

Jim.
0
 
LVL 57
ID: 39637038
The other thing you can do is look for things like "Lane", "Road", "Route", "LN", "RD", "RTE", etc  and try and determine where the address info ends.

Anything between there and the state is the city.

Jim.
0
 
LVL 70

Expert Comment

by:KCTS
ID: 39637095
I can't see how you are going to automate this. Its   pretty simple task to split the data into individual words but the problem here is that there is no consistency, By that I mean there is no way to tell where one part of the address end and the other starts since the street address can be one, two or three words, suite is sometimes present and sometimes missing, city again can be one, two or perhaps three words.

All I can suggest is that you revisit the record and manually insert a delimiter - perhaps use the '|' symbol as a delimiter, then use that to split the strings.
0
 
LVL 39

Expert Comment

by:als315
ID: 39637167
Look at sample. Idea is the same as proposed by JDettman: list of address info ends.
I have problem with one line:
355 RIO RD W STE 102 CHARLOTTESVILLE VA 22901
STE 102 seems to be Suite, but where should be W?
DBImportFile.accdb
0
 
LVL 57
ID: 39637198
<<STE 102 seems to be Suite, but where should be W? >>

 It's part of the address:

355 RIO ROAD WEST
Suite 102
Charlottesville
VA
22901

 and that's all you can do.   Look at the exceptions, add code to compensate, and then do it again.

  At some point there is diminishing returns and it's simplest just to handle them manually.   But at what point that ends up happening is hard to say.

 Out of 7,000, if I got it down to a couple hundred and it was a one time deal, I'd stop there.   But if it was a repeating process, then I might try harder with code.

 Really dicey with pushing with code though; next time through, something might not match up with the code you have and then you have in-accurate addresses.   Often I find it's just simpler to go the manual route.

Jim.
0
Free Trending Threat Insights Every Day

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

 
LVL 21
ID: 39637289
I am getting near a deadline and need help to move forward.

If this data is going to be used to do a mailing?

If yes and this were my project I would get with a local post office or mailing services that can spit out the data for you and verify CASS)  it.  I would also check the USPS web site to see what it offers. Since it appears that time is not on your side, I would definitely look into using a service.
0
 
LVL 31

Expert Comment

by:Helen_Feddema
ID: 39637299
Try this:  

SELECT tblContactsTest.WholeAddress, Left([WholeAddress],InStr([WholeAddress],",")-1) AS Street, Mid([WholeAddress],InStr([WholeAddress],",")+1,([StatePos]-[StreetPos]-4)) AS City, Mid([WholeAddress],Len([WholeAddress])-7,2) AS State, Right([WholeAddress],5) AS Zip, InStr([WholeAddress],",")-1 AS StreetPos, Len([WholeAddress])-7 AS StatePos
FROM tblContactsTest;

WholeAddress is the field with the entire address in it.  This will split the address into Street, City, State and Zip.  You may have to tweak the expression a bit -- I think it was done for a case where there was a comma after the city.  Without the comma, it is very difficult to split the address correctly.
0
 
LVL 70

Expert Comment

by:KCTS
ID: 39637419
no commas or any other delimiters in the data - hence my previous comments
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39637907
As already noted there probably isn't any 100% method for dividing those strings onto all the parts you would like, and additionaly I can't produce the Access code. But below using sql server I could get this far with the sample data
| FIRST_NUM |                    ADDR1 |                            ADDR2 | STATE |    ZIP | EXCEPTION |
|-----------|--------------------------|----------------------------------|-------|--------|-----------|
|     2755  |                   (null) |   SCHELFORD FARM CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1171  |                   (null) |  LITTLE MTN FARM CHARLOTTESVILLE |    VA |  22911 |     CHECK |
|      675  |     BERKMAR CIR STE 101  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|      355  |        RIO RD W STE 102  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|      355  |        RIO RD W STE 102  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1710  |      SEMINOLE TRL STE 2  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1410  |  INCARNATION DR STE 201  |                           CVILLE |    VA |  22901 |     CHECK |
|     2030  |          AVON CT UNIT 8  |                  CHARLOTTESVILLE |    VA |  22902 |     CHECK |
|      503  |      FAULCONER DR STE 5  |                  CHARLOTTESVILLE |    VA |  22903 |     CHECK |
|     1667  |      BAILEYS RETREAT RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      518  |        BARRACKS FARM RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3002  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3002  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3036  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2902  |            BROOKMERE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1334  |            HIGH VIEW DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      465  |             IVY FARM DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      445  |             IVY FARM DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      125  |            IVY RIDGE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1844  |              IVYWEST LN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2655  |           MERIWETHER DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2735  |           MERIWETHER DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|       84  |          OAK FOREST CIR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1710  |            OLD FORGE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1100  |             PEN PARK LN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2408  |          PINE GARTH RUN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      460  |             PREMIER CIR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2271  |            SEMINOLE TRL  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2271  |            SEMINOLE TRL  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      881  |               TILMAN RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2655  |          BUCK ISLAND RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2624  |          BUCK ISLAND RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|      912  |        MICHIE TAVERN LN  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     4392  |                NAHOR RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2390  |     ROSE HILL CHURCH LN  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2962  |          SCOTTSVILLE RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2866  |          SCOTTSVILLE RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|        1  |           BOARS HEAD PL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      635  |               BOONE TRL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3610  |              COLSTON DR  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1100  |               DRYDEN LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1345  |      DUDLEY MOUNTAIN RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1805  |          FIDDLESTICK LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      719  |            GRASSMERE RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     2220  |                  IVY RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3244  |        OLD LYNCHBURG RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3635  |         RALEIGH MTN TRL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     4189  |             RED HILL RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      644  |          TAYLORS GAP RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      410  |         WHITE GABLES LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     2840  |                CINDY LN  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      310  |           CRESTFIELD CT  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     1900  |             FRANKLIN DR  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     1705  |           GOLDENTREE PL  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2831  |        RIGGORY RIDGE RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      597  |         ROCKY HOLLOW RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      175  |            S PANTOPS DR  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     3661  |          STONY POINT RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2040  |              TREMONT RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     3363  |           WATTS PASSAGE  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2693  |            HEARTWOOD RD  |                            AFTON |    VA |  22920 |           |
|     1120  |             KINGSWAY RD  |                            AFTON |    VA |  22920 |           |
|     4640  |           EAGLE NEST LN  |                    BARBOURSVILLE |    VA |  22923 |           |
|     5327  |        MONACAN TRAIL RD  |                       COVESVILLE |    VA |  22931 |           |
|     3224  |         FOX MOUNTAIN RD  |                           CROZET |    VA |  22932 |           |
|     4438  |                GARTH RD  |                           CROZET |    VA |  22932 |           |
|     5784  |               LOCUST LN  |                           CROZET |    VA |  22932 |           |
|     6310  |                ESTES LN  |                             DYKE |    VA |  22935 |           |
|      555  |            ARROWHEAD CT  |                      EARLYSVILLE |    VA |  22936 |           |
|     5854  |        BUFFALO RIVER RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     2610  |          EARLYSVILLE RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     3458  |            REAS FORD LN  |                      EARLYSVILLE |    VA |  22936 |           |
|      248  |            REAS FORD RD  |                      EARLYSVILLE |    VA |  22936 |           |
|      963  |            RED CREST LN  |                      EARLYSVILLE |    VA |  22936 |           |
|     6040  |              VINTAGE RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     7814  |       CHESTNUT GROVE RD  |                           ESMONT |    VA |  22937 |           |
|     7395  |              PORTERS RD  |                           ESMONT |    VA |  22937 |           |
|     3066  |              RED ROW LN  |                           ESMONT |    VA |  22937 |           |
|     2118  |        BUCK MOUNTAIN RD  |                       FREE UNION |    VA |  22940 |           |
|     4869  |            CATTERTON RD  |                       FREE UNION |    VA |  22940 |           |
|     2684  |       CHAPEL SPRINGS LN  |                       FREE UNION |    VA |  22940 |           |
|     3606  |           MILLINGTON RD  |                       FREE UNION |    VA |  22940 |           |
|     7154  |             JOHNSTON LN  |                     GORDONSVILLE |    VA |  22942 |           |
|     8311  |       ROCKFISH GAP TPKE  |                        GREENWOOD |    VA |  22943 |           |
|     6751  |              FORTUNE LN  |                            KEENE |    VA |  22946 |           |
|      572  |              MAXWELL RD  |                          KESWICK |    VA |  22947 |           |
|     2779  |        MONACAN TRAIL RD  |                     NORTH GARDEN |    VA |  22959 |           |
|     3401  |        OLD LYNCHBURG RD  |                     NORTH GARDEN |    VA |  22959 |           |
|     2210  |           FRAYS MILL RD  |                     RUCKERSVILLE |    VA |  22968 |           |
|     6627  |             BLENHEIM RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|     5499  |       JEFFERSON MILL RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|      646  |              RACOON RDG  |                      SCOTTSVILLE |    VA |  24590 |           |
|     2350  |           SECRETARYS RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|     2582  |           SECRETARYS RD  |                      SCOTTSVILLE |    VA |  24590 |           |
		

Open in new window

I had to build a small table to assist, it contains some of the roadway abbreviations used (but not 'farm' that's a problem for "farm st" etc). So here is the query, I used several layers of nesting to try an make it more legible.  Maybe it will help.
CREATE TABLE road
	([abbrev] varchar(7))
;
	
INSERT INTO road
	([abbrev])
VALUES
	('CIR'),
	('CT'),
	('DR'),
	('LN'),
	('PASSAGE'),
	('PL'),
	('RDG'),
	('RD'),
	('RUN'),
	('TPKE'),
	('TRL')
;

SELECT
        first_num
      , CASE WHEN has_nums > 0 THEN left(portion, charindex(' ',portion,has_nums))
             WHEN abbrev IS NOT NULL THEN left(portion, charindex(' ' + abbrev +' ', portion) + len(abbrev)+1 )
        END AS addr1
      , CASE WHEN has_nums > 0 THEN substring(portion,charindex(' ',portion,has_nums),islong)
             WHEN abbrev IS NOT NULL THEN substring(portion, charindex(' ' + abbrev +' ', portion) + len(abbrev)+1, islong )
             ELSE portion
        END AS addr2
      , STATE
      , zip
      --, abbrev
      , case when abbrev is null or has_nums > 0 then 'CHECK' else '' end as Exception
FROM (
      SELECT
              fldBusAddress
            , first_num
            , substring(fldBusAddress,pos_start, (islong - (pos_zip + pos_state + pos_start - 1))) AS portion
            , PatIndex('%[0-9]%',substring(fldBusAddress,pos_start, (islong - (pos_zip + pos_state + pos_start - 1))) ) AS has_nums
            , right(substring(fldBusAddress,1,(islong - pos_zip)),pos_state) AS STATE
            , zip
            , islong
            , abbrev
      FROM (
            SELECT
                    fldBusAddress
                  , islong
                  , charindex(' ',fldBusAddress) AS pos_start
                  , left(fldBusAddress, charindex(' ',fldBusAddress)) AS first_num
                  , charindex(' ',revAddress) AS pos_zip
                  , right(fldBusAddress, charindex(' ',revAddress)) AS zip
                  , charindex(' ',reverse(substring(fldBusAddress,1,islong-charindex(' ',revAddress)))) AS pos_state
            FROM (
                  SELECT
                         fldBusAddress
                       , reverse(fldBusAddress)                                  AS revAddress
                       , len(fldBusAddress)                                      AS islong
                       , len(fldBusAddress) - len(replace(fldBusAddress,' ','')) AS spaces
                  FROM yourtable
                 ) a
            WHERE spaces > 3
            ) b
      LEFT JOIN road ON fldBusAddress LIKE '% ' + road.abbrev + ' %'
     ) c
ORDER BY
        CASE WHEN abbrev IS NULL THEN 0 ELSE 1 END
      , CASE WHEN has_nums > 0 THEN 0 ELSE 1 END
      , zip
      , addr2
      , addr1
;

-- http://sqlfiddle.com/#!3/01a910/18

Open in new window

0
 

Author Closing Comment

by:ergenbgr
ID: 39679518
Thanks!!
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39679529
Pleased I could assist. Thanks for the grading. Cheers, Paul
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Suggested Solutions

PaperPort has a feature called the "Send To Bar". It provides a convenient, drag-and-drop interface for using other installed software, such as Microsoft Office. However, this article shows that the latest Office 2016 apps (installed with an Office …
Using Word 2013, I was experiencing some incredible lag when typing.  Here's what worked for me....
With Microsoft Access, learn how to specify relationships between tables and set various options on the relationship. Add the tables: Create the relationship: Decide if you’re going to set referential integrity: Decide if you want cascade upda…
The viewer will learn how to  create a slide that will launch other presentations in Microsoft PowerPoint. In the finished slide, each item launches a new PowerPoint presentation and when each is finished it automatically comes back to this slide: …

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now