  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 298

Detect Address String.

Does anybody know of a way to detect an address from a string entered in a random order?

For example the user can simply type their address into a text field, and we can locate and extract the following components:

postcode
house number

I'll up the points for a detailed response :)
Asked by: MoOTottle

mosphat commented:
The first thing that comes to mind is regular expressions. However, there are many different ways to write an address. Which kinds of addresses (that is, which countries) do you want to support?

MoOTottle (author) commented:
They will all be UK addresses, "normally" in the format:

Joe Blogs, 55 Springfield Road, Springfield, Manchester, MJ15 4FB

mosphat commented:
Most people will probably write it like:

Joe Blogs
55 Springfield Road
Springfield
Manchester
MJ15 4FB

But I'm sure there are (many) other ways. (Different people, different habits)

Is the user going to get some kind of confirmation page on which they can make corrections, in case the address is entered in a way the parser you're about to write doesn't anticipate?

Which language do you intend to use? (Not that it matters to the problem, it just might be more convenient when giving examples)
 
MoOTottle (author) commented:
It will be a single-line input, as the user interface is very limited.

It's basically just like a command prompt.

And it's a one-way process, I'm afraid: once the data has been submitted, we have one shot to grab the info as best we can.

I am able to use ColdFusion, Classic ASP or VB 6.0, although ASP is my preference.

I managed to hack this together for the postcode from reading around, but it doesn't seem to be working.

MoOTottle (author) commented:
<%
Sub GetAdd(strInput)

  Dim objRegExp, objMatch
  Set objRegExp = New RegExp
 
  objRegExp.IgnoreCase = True
  objRegExp.Pattern = "^((([A-PR-UWYZ])([0-9][0-9A-HJKS-UW]?))|(([A-PR-UWYZ][A-HK-Y])([0-9][0-9ABEHMNPRV-Y]?))\s{0,2}(([0-9])([ABD-HJLNP-UW-Z])([ABD-HJLNP-UW-Z])))|(((GI)(R))\s{0,2}((0)(A)(A)))$"
  objRegExp.Global = True


  set colMatches = objRegExp.Execute(strInput)

  'Print the # of matches we found
  Response.Write colMatches.Count & " matches found...<P>"

  'Step through our matches
  For Each objMatch in colMatches
     Response.Write objMatch.Value & "<BR>"
  Next

  'Clean up
  Set colMatches = Nothing
  Set objRegExp = Nothing



End Sub


strAdd = "Joe Blogs, 55 Springfield Road, Springfield, Manchester, MJ15 4FB"
%>
<%GetAdd(strAdd)%>

mosphat commented:
That's because the ^ at the start of the pattern and the $ at the end anchor the match to the start and end of the input string. The postcode in your example obviously doesn't start at the beginning of the string.
Try removing the ^ and $.
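
To illustrate the point about anchors, here is a minimal sketch of a search-style pattern that can find a postcode anywhere in the line. The function name FindPostcode and the simplified pattern are assumptions made for illustration, not the pattern from the question. It uses \b word boundaries instead of ^ and $, checks only the general shape of a UK postcode (not whether the area code exists), and sets Global to False so that only the first match is returned.

```vbscript
' Hypothetical helper: returns the first postcode-shaped token in strInput,
' or an empty string when none is found.
Function FindPostcode(strInput)
  Dim objRegExp, colMatches
  Set objRegExp = New RegExp

  objRegExp.IgnoreCase = True
  objRegExp.Global = False   ' stop after the first match

  ' Simplified shape check (an assumption, not the official format):
  ' 1-2 letters, a digit, an optional digit or letter, optional
  ' whitespace, then a digit and two letters.
  objRegExp.Pattern = "\b[A-Z]{1,2}[0-9][0-9A-Z]?\s*[0-9][A-Z]{2}\b"

  Set colMatches = objRegExp.Execute(strInput)
  If colMatches.Count > 0 Then
    FindPostcode = colMatches(0).Value
  Else
    FindPostcode = ""
  End If

  'Clean up
  Set colMatches = Nothing
  Set objRegExp = Nothing
End Function
```

Called as Response.Write FindPostcode(strAdd) with the sample address from the question, this should write MJ15 4FB. Because the match is no longer tied to the ends of the string, the postcode can appear before or after the other address parts.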

MoOTottle (author) commented:
Excellent, it's working a treat now. :)

So is there a better way to return one match, rather than an array of matches?

And how would I go about pulling out the house number?

(btw thanks for the help so far :] )

mosphat commented:
Something along the lines of this?

objRegExp.Pattern = "([0-9\-])+\s+([A-Z\s])(?:,)"

The first group would contain the house number, the second one the street name.
Mind you, this is a very basic regex. I'm sure the UK has some exotic street names that won't be found this way. But it's a start.

MoOTottle (author) commented:
That one doesn't seem to be working.

It's returning 0 matches.

Sub GetAddress(strInput)

  Dim objRegExp, objMatch
  Set objRegExp = New RegExp
 
  objRegExp.IgnoreCase = True
  objRegExp.Pattern = "([0-9\-])+\s+([A-Z\s])(?:,)"
  objRegExp.Global = True


  set colMatches = objRegExp.Execute(strInput)

  'Print the # of matches we found
  Response.Write colMatches.Count & " matches found...<P>"

  'Step through our matches
  For Each objMatch in colMatches
     Response.Write objMatch.Value & "<BR>"
  Next

  'Clean up
  Set colMatches = Nothing
  Set objRegExp = Nothing



End Sub

mosphat commented:
Sorry, get rid of the (?:,)

MoOTottle (author) commented:
It now gives an output of "55 S"

when I feed
"Joe Blogs, 55 Springfield Road, Springfield, Manchester, MJ15 4FB"

into it?

mosphat commented:
Hmm, I took the liberty of actually testing what I'm claiming here :)

This really works: ([0-9\-]+)\s+([A-Z\s\-\.]+)
On my machine that is...
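
The earlier pattern returned "55 S" because its second character class had no quantifier, so it matched a single character, and the + after the first group repeated the group rather than extending what it captured. With the + inside the parentheses, each group captures the whole run. A minimal sketch of reading the two groups back through SubMatches, assuming the pattern above; the Sub name ShowHouseAndStreet is hypothetical.

```vbscript
' Hypothetical helper: writes the house number and street name found in
' strInput, using the two capture groups of the pattern above.
Sub ShowHouseAndStreet(strInput)
  Dim objRegExp, colMatches, objMatch
  Set objRegExp = New RegExp

  objRegExp.IgnoreCase = True
  objRegExp.Global = False   ' only the first number/street pair is wanted
  objRegExp.Pattern = "([0-9\-]+)\s+([A-Z\s\-\.]+)"

  Set colMatches = objRegExp.Execute(strInput)
  If colMatches.Count > 0 Then
    Set objMatch = colMatches(0)
    ' SubMatches is zero-based: 0 = first group, 1 = second group
    Response.Write "House number: " & objMatch.SubMatches(0) & "<BR>"
    ' Trim, because the second class also accepts whitespace
    Response.Write "Street: " & Trim(objMatch.SubMatches(1)) & "<BR>"
  Else
    Response.Write "No house number found<BR>"
  End If

  'Clean up
  Set colMatches = Nothing
  Set objRegExp = Nothing
End Sub
```

For the sample address this should report 55 as the house number and Springfield Road as the street, since the second group stops at the first comma. Setting Global to False also covers the earlier question about getting one match instead of a collection of them.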

MoOTottle (author) commented:
Nice one, thanks for the help. Points upped to 200 ;)

mosphat commented:
You're very welcome and thank you for the extra points.