Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

Parse non-delimited Address field

Posted on 2013-11-10
11
Medium Priority
?
493 Views
Last Modified: 2013-11-26
I have a field in an Access record data-set that contain the addresses of businesses. The table contains 7,000 records.

Current Record:

fldBusAddress
|646 RACOON RDG SCOTTSVILLE VA 24590              |
|1120 KINGSWAY RD AFTON VA 22920                  |
|3606 MILLINGTON RD FREE UNION VA 22940           |
|248 REAS FORD RD EARLYSVILLE VA 22936            |
|2684 CHAPEL SPRINGS LN FREE UNION VA 22940       |
|2030 AVON CT UNIT 8 CHARLOTTESVILLE VA 22902     |
|1667 BAILEYS RETREAT RD CHARLOTTESVILLE VA 22901 |

Desired Record:

HseNum|Street Name       |Suite  |City            |State |Zip
646   |RACOON RDG        |       |SCOTTSVILLE     |VA    |24590
1120  |KINGSWAY RD       |       |AFTON           |VA    |22920
3606  |MILLINGTON RD     |       |FREE UNION      |VA    |22940
248   |REAS FORD RD      |       |EARLYSVILLE     |VA    |22936
2684  |CHAPEL SPRINGS LN |       |FREE UNION      |VA    |22940
2030  |AVON CT           |UNIT 8 |CHARLOTTESVILLE |VA    |22902
1667  |BAILEYS RETREAT RD|       |CHARLOTTESVILLE |VA    |22901

I have uploaded two .txt files. One is of the example above and the other is a 100 record data-set.

I have been successful using Left() to get the HseNum and InStrRev() to get the State and Zip but the records that have a two-part City name and/or a two-part Street name has me stumped. I have also checked the knowledge base but could not find a satisfactory solution.

I am getting near a deadline and need help to move forward. Thanks in advance!!
BusAddrExample.txt
BusinessAddress.txt
0
Comment
Question by:ergenbgr
  • 3
  • 2
  • 2
  • +4
11 Comments
 
LVL 58
ID: 39637033
<<but the records that have a two-part City name and/or a two-part Street name has me stumped. I have also checked the knowledge base but could not find a satisfactory solution.>>

 That's because there is none really.   An address like this:

|2684 CHAPEL SPRINGS LN FREE UNION VA 22940  

is very though to break up.

Your best bet in these cases is to have zip code data base, lookup the city and then see if it matches some part of the string.

That doesn't always work however because many times the zip given has a city name that's different from the "city" used in the address.

 For example, 14051 gives "Amherst, NY", but someone might use the city of Williamsville, NY

 What I do in these cases is have a flag for non-validated addresses and then have someone break them up manually.

Generally, by the time you get down to the real exceptions, there's only a handful in comparison to the full list.

Jim.
0
 
LVL 58
ID: 39637038
The other thing you can do is look for things like "Lane", "Road", "Route", "LN", "RD", "RTE", etc  and try and determine where the address info ends.

Anything between there and the state is the city.

Jim.
0
 
LVL 70

Expert Comment

by:KCTS
ID: 39637095
I can't see how you are going to automate this. Its   pretty simple task to split the data into individual words but the problem here is that there is no consistency, By that I mean there is no way to tell where one part of the address end and the other starts since the street address can be one, two or three words, suite is sometimes present and sometimes missing, city again can be one, two or perhaps three words.

All I can suggest is that you revisit the record and manually insert a delimiter - perhaps use the '|' symbol as a delimiter, then use that to split the strings.
0
Get your Conversational Ransomware Defense e‑book

This e-book gives you an insight into the ransomware threat and reviews the fundamentals of top-notch ransomware preparedness and recovery. To help you protect yourself and your organization. The initial infection may be inevitable, so the best protection is to be fully prepared.

 
LVL 40

Expert Comment

by:als315
ID: 39637167
Look at sample. Idea is the same as proposed by JDettman: list of address info ends.
I have problem with one line:
355 RIO RD W STE 102 CHARLOTTESVILLE VA 22901
STE 102 seems to be Suite, but where should be W?
DBImportFile.accdb
0
 
LVL 58
ID: 39637198
<<STE 102 seems to be Suite, but where should be W? >>

 It's part of the address:

355 RIO ROAD WEST
Suite 102
Charlottesville
VA
22901

 and that's all you can do.   Look at the exceptions, add code to compensate, and then do it again.

  At some point there is diminishing returns and it's simplest just to handle them manually.   But at what point that ends up happening is hard to say.

 Out of 7,000, if I got it down to a couple hundred and it was a one time deal, I'd stop there.   But if it was a repeating process, then I might try harder with code.

 Really dicey with pushing with code though; next time through, something might not match up with the code you have and then you have in-accurate addresses.   Often I find it's just simpler to go the manual route.

Jim.
0
 
LVL 21
ID: 39637289
I am getting near a deadline and need help to move forward.

If this data is going to be used to do a mailing?

If yes and this were my project I would get with a local post office or mailing services that can spit out the data for you and verify CASS)  it.  I would also check the USPS web site to see what it offers. Since it appears that time is not on your side, I would definitely look into using a service.
0
 
LVL 31

Expert Comment

by:Helen Feddema
ID: 39637299
Try this:  

SELECT tblContactsTest.WholeAddress, Left([WholeAddress],InStr([WholeAddress],",")-1) AS Street, Mid([WholeAddress],InStr([WholeAddress],",")+1,([StatePos]-[StreetPos]-4)) AS City, Mid([WholeAddress],Len([WholeAddress])-7,2) AS State, Right([WholeAddress],5) AS Zip, InStr([WholeAddress],",")-1 AS StreetPos, Len([WholeAddress])-7 AS StatePos
FROM tblContactsTest;

WholeAddress is the field with the entire address in it.  This will split the address into Street, City, State and Zip.  You may have to tweak the expression a bit -- I think it was done for a case where there was a comma after the city.  Without the comma, it is very difficult to split the address correctly.
0
 
LVL 70

Expert Comment

by:KCTS
ID: 39637419
no commas or any other delimiters in the data - hence my previous comments
0
 
LVL 49

Accepted Solution

by:
PortletPaul earned 2000 total points
ID: 39637907
As already noted there probably isn't any 100% method for dividing those strings onto all the parts you would like, and additionaly I can't produce the Access code. But below using sql server I could get this far with the sample data
| FIRST_NUM |                    ADDR1 |                            ADDR2 | STATE |    ZIP | EXCEPTION |
|-----------|--------------------------|----------------------------------|-------|--------|-----------|
|     2755  |                   (null) |   SCHELFORD FARM CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1171  |                   (null) |  LITTLE MTN FARM CHARLOTTESVILLE |    VA |  22911 |     CHECK |
|      675  |     BERKMAR CIR STE 101  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|      355  |        RIO RD W STE 102  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|      355  |        RIO RD W STE 102  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1710  |      SEMINOLE TRL STE 2  |                  CHARLOTTESVILLE |    VA |  22901 |     CHECK |
|     1410  |  INCARNATION DR STE 201  |                           CVILLE |    VA |  22901 |     CHECK |
|     2030  |          AVON CT UNIT 8  |                  CHARLOTTESVILLE |    VA |  22902 |     CHECK |
|      503  |      FAULCONER DR STE 5  |                  CHARLOTTESVILLE |    VA |  22903 |     CHECK |
|     1667  |      BAILEYS RETREAT RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      518  |        BARRACKS FARM RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3002  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3002  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     3036  |              BERKMAR DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2902  |            BROOKMERE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1334  |            HIGH VIEW DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      465  |             IVY FARM DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      445  |             IVY FARM DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      125  |            IVY RIDGE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1844  |              IVYWEST LN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2655  |           MERIWETHER DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2735  |           MERIWETHER DR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|       84  |          OAK FOREST CIR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1710  |            OLD FORGE RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     1100  |             PEN PARK LN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2408  |          PINE GARTH RUN  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      460  |             PREMIER CIR  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2271  |            SEMINOLE TRL  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2271  |            SEMINOLE TRL  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|      881  |               TILMAN RD  |                  CHARLOTTESVILLE |    VA |  22901 |           |
|     2655  |          BUCK ISLAND RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2624  |          BUCK ISLAND RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|      912  |        MICHIE TAVERN LN  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     4392  |                NAHOR RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2390  |     ROSE HILL CHURCH LN  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2962  |          SCOTTSVILLE RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|     2866  |          SCOTTSVILLE RD  |                  CHARLOTTESVILLE |    VA |  22902 |           |
|        1  |           BOARS HEAD PL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      635  |               BOONE TRL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3610  |              COLSTON DR  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1100  |               DRYDEN LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1345  |      DUDLEY MOUNTAIN RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     1805  |          FIDDLESTICK LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      719  |            GRASSMERE RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     2220  |                  IVY RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3244  |        OLD LYNCHBURG RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     3635  |         RALEIGH MTN TRL  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     4189  |             RED HILL RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      644  |          TAYLORS GAP RD  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|      410  |         WHITE GABLES LN  |                  CHARLOTTESVILLE |    VA |  22903 |           |
|     2840  |                CINDY LN  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      310  |           CRESTFIELD CT  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     1900  |             FRANKLIN DR  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     1705  |           GOLDENTREE PL  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2831  |        RIGGORY RIDGE RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      597  |         ROCKY HOLLOW RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|      175  |            S PANTOPS DR  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     3661  |          STONY POINT RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2040  |              TREMONT RD  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     3363  |           WATTS PASSAGE  |                  CHARLOTTESVILLE |    VA |  22911 |           |
|     2693  |            HEARTWOOD RD  |                            AFTON |    VA |  22920 |           |
|     1120  |             KINGSWAY RD  |                            AFTON |    VA |  22920 |           |
|     4640  |           EAGLE NEST LN  |                    BARBOURSVILLE |    VA |  22923 |           |
|     5327  |        MONACAN TRAIL RD  |                       COVESVILLE |    VA |  22931 |           |
|     3224  |         FOX MOUNTAIN RD  |                           CROZET |    VA |  22932 |           |
|     4438  |                GARTH RD  |                           CROZET |    VA |  22932 |           |
|     5784  |               LOCUST LN  |                           CROZET |    VA |  22932 |           |
|     6310  |                ESTES LN  |                             DYKE |    VA |  22935 |           |
|      555  |            ARROWHEAD CT  |                      EARLYSVILLE |    VA |  22936 |           |
|     5854  |        BUFFALO RIVER RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     2610  |          EARLYSVILLE RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     3458  |            REAS FORD LN  |                      EARLYSVILLE |    VA |  22936 |           |
|      248  |            REAS FORD RD  |                      EARLYSVILLE |    VA |  22936 |           |
|      963  |            RED CREST LN  |                      EARLYSVILLE |    VA |  22936 |           |
|     6040  |              VINTAGE RD  |                      EARLYSVILLE |    VA |  22936 |           |
|     7814  |       CHESTNUT GROVE RD  |                           ESMONT |    VA |  22937 |           |
|     7395  |              PORTERS RD  |                           ESMONT |    VA |  22937 |           |
|     3066  |              RED ROW LN  |                           ESMONT |    VA |  22937 |           |
|     2118  |        BUCK MOUNTAIN RD  |                       FREE UNION |    VA |  22940 |           |
|     4869  |            CATTERTON RD  |                       FREE UNION |    VA |  22940 |           |
|     2684  |       CHAPEL SPRINGS LN  |                       FREE UNION |    VA |  22940 |           |
|     3606  |           MILLINGTON RD  |                       FREE UNION |    VA |  22940 |           |
|     7154  |             JOHNSTON LN  |                     GORDONSVILLE |    VA |  22942 |           |
|     8311  |       ROCKFISH GAP TPKE  |                        GREENWOOD |    VA |  22943 |           |
|     6751  |              FORTUNE LN  |                            KEENE |    VA |  22946 |           |
|      572  |              MAXWELL RD  |                          KESWICK |    VA |  22947 |           |
|     2779  |        MONACAN TRAIL RD  |                     NORTH GARDEN |    VA |  22959 |           |
|     3401  |        OLD LYNCHBURG RD  |                     NORTH GARDEN |    VA |  22959 |           |
|     2210  |           FRAYS MILL RD  |                     RUCKERSVILLE |    VA |  22968 |           |
|     6627  |             BLENHEIM RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|     5499  |       JEFFERSON MILL RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|      646  |              RACOON RDG  |                      SCOTTSVILLE |    VA |  24590 |           |
|     2350  |           SECRETARYS RD  |                      SCOTTSVILLE |    VA |  24590 |           |
|     2582  |           SECRETARYS RD  |                      SCOTTSVILLE |    VA |  24590 |           |
		

Open in new window

I had to build a small table to assist, it contains some of the roadway abbreviations used (but not 'farm' that's a problem for "farm st" etc). So here is the query, I used several layers of nesting to try an make it more legible.  Maybe it will help.
CREATE TABLE road
	([abbrev] varchar(7))
;
	
INSERT INTO road
	([abbrev])
VALUES
	('CIR'),
	('CT'),
	('DR'),
	('LN'),
	('PASSAGE'),
	('PL'),
	('RDG'),
	('RD'),
	('RUN'),
	('TPKE'),
	('TRL')
;

SELECT
        first_num
      , CASE WHEN has_nums > 0 THEN left(portion, charindex(' ',portion,has_nums))
             WHEN abbrev IS NOT NULL THEN left(portion, charindex(' ' + abbrev +' ', portion) + len(abbrev)+1 )
        END AS addr1
      , CASE WHEN has_nums > 0 THEN substring(portion,charindex(' ',portion,has_nums),islong)
             WHEN abbrev IS NOT NULL THEN substring(portion, charindex(' ' + abbrev +' ', portion) + len(abbrev)+1, islong )
             ELSE portion
        END AS addr2
      , STATE
      , zip
      --, abbrev
      , case when abbrev is null or has_nums > 0 then 'CHECK' else '' end as Exception
FROM (
      SELECT
              fldBusAddress
            , first_num
            , substring(fldBusAddress,pos_start, (islong - (pos_zip + pos_state + pos_start - 1))) AS portion
            , PatIndex('%[0-9]%',substring(fldBusAddress,pos_start, (islong - (pos_zip + pos_state + pos_start - 1))) ) AS has_nums
            , right(substring(fldBusAddress,1,(islong - pos_zip)),pos_state) AS STATE
            , zip
            , islong
            , abbrev
      FROM (
            SELECT
                    fldBusAddress
                  , islong
                  , charindex(' ',fldBusAddress) AS pos_start
                  , left(fldBusAddress, charindex(' ',fldBusAddress)) AS first_num
                  , charindex(' ',revAddress) AS pos_zip
                  , right(fldBusAddress, charindex(' ',revAddress)) AS zip
                  , charindex(' ',reverse(substring(fldBusAddress,1,islong-charindex(' ',revAddress)))) AS pos_state
            FROM (
                  SELECT
                         fldBusAddress
                       , reverse(fldBusAddress)                                  AS revAddress
                       , len(fldBusAddress)                                      AS islong
                       , len(fldBusAddress) - len(replace(fldBusAddress,' ','')) AS spaces
                  FROM yourtable
                 ) a
            WHERE spaces > 3
            ) b
      LEFT JOIN road ON fldBusAddress LIKE '% ' + road.abbrev + ' %'
     ) c
ORDER BY
        CASE WHEN abbrev IS NULL THEN 0 ELSE 1 END
      , CASE WHEN has_nums > 0 THEN 0 ELSE 1 END
      , zip
      , addr2
      , addr1
;

-- http://sqlfiddle.com/#!3/01a910/18

Open in new window

0
 

Author Closing Comment

by:ergenbgr
ID: 39679518
Thanks!!
0
 
LVL 49

Expert Comment

by:PortletPaul
ID: 39679529
Pleased I could assist. Thanks for the grading. Cheers, Paul
0

Featured Post

Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

We live in a world of interfaces like the one in the title picture. VBA also allows to use interfaces which offers a lot of possibilities. This article describes how to use interfaces in VBA and how to work around their bugs.
Microsoft Access has a limit of 255 columns in a single table; SQL Server allows tables with over 255 columns, but reading that data is not necessarily simple.  The final solution for this task involved creating a custom text parser and then reading…
Do you want to know how to make a graph with Microsoft Access? First, create a query with the data for the chart. Then make a blank form and add a chart control. This video also shows how to change what data is displayed on the graph as well as form…
Visualize your data even better in Access queries. Given a date and a value, this lesson shows how to compare that value with the previous value, calculate the difference, and display a circle if the value is the same, an up triangle if it increased…

876 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question