chunky_uk
Regular expression for UK postcode.
Hi,
Does anyone have a current regular expression for checking UK postcodes? I was using:
(GIR 0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]|[A-HK-Y][0-9]([0-9]|[ABEHMNPRV-Y]))|[0-9][A-HJKPS-UW]) [0-9][ABD-HJLNP-UW-Z]{2})
but according to http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/postcode.aspx this is now out of date?
Thanks,
C
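For anyone who wants to try the expression from the question, here is a minimal sketch. It assumes Python's `re` module rather than the Oracle environment discussed later in the thread, and the names `ORIGINAL` and `is_match` are only illustrative. The sample postcodes are ones that appear elsewhere in this thread, and `re.fullmatch` is used so that the whole string has to match.

```python
import re

# The expression quoted in the question, written without any whitespace
# inside the character classes.
ORIGINAL = re.compile(
    r"(GIR 0AA|[A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]|[A-HK-Y][0-9]"
    r"([0-9]|[ABEHMNPRV-Y]))|[0-9][A-HJKPS-UW]) [0-9][ABD-HJLNP-UW-Z]{2})"
)


def is_match(postcode: str) -> bool:
    """Return True if the whole string matches the quoted expression."""
    return ORIGINAL.fullmatch(postcode) is not None


# Samples taken from this thread.
for sample in ["EC1A 1BB", "W1A 1HQ", "GIR 0AA", "AAAA 1AA", "KI1 8SH"]:
    print(sample, is_match(sample))
```

This only shows what the quoted expression does under Python's regex rules; it says nothing about whether the expression is up to date with the referenced standard.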
ASKER
Hi Raj - I quoted that regular expression in my question, it's out of date.
Thanks,
C
Yes, I know you are looking for an updated regular expression for UK postcodes.
I googled and found some different UK postcode regular expressions, which I posted above.
Did you try those two?
Thanks
Raj
How about:
(GIR 0AA|(([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-JLNP-UW-X]{2})
Seems ok on my test.
Chris
ASKER
I settled on this as a solution: -
^([A-PR-UWYZ][A-HK-Y0-9][A-HJKS-UW0-9]?[ABEHMNPRVWXY0-9]?{1,2} [0-9][ABD-HJLNP-UW-Z]{2})$
Hi chunky,
Congrats on figuring out the solution :-)
Raj
With the solution I posted, I tried to match the provided reference, which is why it includes, for example, the traditional Girobank postcode. That aside, the asker's own solution does not address why my proposal was wrong and, out of interest, the asker's solution passes this invalid postcode whereas mine does not:
AAAA 1AA
I would appreciate some guidance as to why my solution is rejected and the author's should be accepted.
Chris
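The comparison described above can be sketched as follows. This assumes Python's `re` module, not the asker's Oracle environment, and the names `CHRIS` and `ASKER` are only labels for the two expressions. Note one loud assumption: Python rejects the stacked quantifier `?{1,2}` in the asker's expression, so it is rewritten here as `(...?){1,2}`, which is a guess at how the original engine reads it.

```python
import re

# Chris's expression, searched unanchored as posted.
CHRIS = re.compile(
    r"(GIR 0AA|(([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}"
    r"|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-JLNP-UW-X]{2})"
)

# The asker's expression. ASSUMPTION: "[...]?{1,2}" is rewritten as
# "([...]?){1,2}" because Python raises an error on the stacked quantifier.
ASKER = re.compile(
    r"^([A-PR-UWYZ][A-HK-Y0-9][A-HJKS-UW0-9]?([ABEHMNPRVWXY0-9]?){1,2}"
    r" [0-9][ABD-HJLNP-UW-Z]{2})$"
)

sample = "AAAA 1AA"
print("Chris:", CHRIS.search(sample) is not None)
print("Asker:", ASKER.search(sample) is not None)
```

Under these assumptions the first line reports no match and the second reports a match, which is consistent with the behaviour Chris describes.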
ASKER
Hi Chris - thanks for your solution. The truth is I ran out of time with this, I was just posting my solution to close the thread. I'll give your expression a go today and update accordingly.
Thanks.
With reference to the supplied link, http://www.cabinetoffice.gov.uk/govtalk/schemasstandards/e-gif/datastandards/address/postcode.aspx, I tested the regex I supplied in http:#32353186 to check that it met all the examples.
I tested the author-supplied solution from http:#32631157 and noted that it passed an invalid sample, AAAA 1AA.
Given the author's comment that they did not have time to test the regex I provided, and that they then sought to close the question over 4 days later based on a flawed solution of their own, I do not believe the author's explanation makes sense.
There being no indication that my prior post is incorrect, I believe the only correct course is to accept my own post, http:#32353186.
Chris
ASKER
Apologies for the confusion and delay in closing this, thanks for the solution Chris...
Sorry I was difficult over the closure, but I'm glad you have a solution that meets your needs and I hope I didn't offend you in the process.
Chris
ASKER
Hi Chris,
A couple of defects have arisen with this expression, now that we've had time to test fully.
These are: -
1. It's possible to enter I, J or Z in the second position (e.g. KI1 8SH); this should not be allowed.
2. The only letters to appear in the fourth position are A, B, E, H, M, N, P, R, V, W, X and Y; in fact only I, L, X and Z are excluded from the fourth position.
Any ideas?
Thanks...
Initially I would think the problem comes with word boundaries so see if:
(GIR 0AA|(\b([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2}\b)
Resolves both issues.
Chris
ASKER
Hi Chris - no that didn't work, all postcodes fail now with the \b added?
Slight change in flavour then, as I am unfamiliar with the anchors in Oracle. I would expect, however, that the previous regex would also fail similarly:
(GIR 0AA|(\m([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2}$)
Chris
ASKER
Hi Chris, no same result, \b is the word boundary anchor in Oracle though?
I've just found an Oracle reference and \b is definitely valid syntax for word boundary so:
(GIR 0AA|(\b([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2}\b)
should have worked, and I'm afraid I can't do any more, therefore.
Chris
I have just tried the original regex in your initial post and as expected that also fails the same way, which is as I surmised.
Chris
ASKER
Hi Chris,
Indeed, I was aware mine didn't fully meet the new requirements, but I have no idea why the expression you provided doesn't meet the 2 requirements I mentioned above (sorry, I actually meant to quote the 3rd position, not the 4th position).
* The letters I, J and Z are not used in the second position.
* The only letters to appear in the third position are A, B, C, D, E, F, G, H, J, K, S, T, U and W.
So EC1I 8SH is allowed when it shouldn't be?
Thanks...
The supplied change does not accept the EC1I 8SH structure either, so it should still do the job. Given the validity of the \b token for word boundaries, there is no logic to the failure. Are you sure, therefore, that there are no additional codes anywhere in it?
Chris
ASKER
Do you mean that: -
(GIR 0AA|(\b([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2}\b)
satisfies: -
* The letters I, J and Z are not used in the second position.
* The only letters to appear in the third position are A, B, C, D, E, F, G, H, J, K, S, T, U and W.
in your non-Oracle environment? The expression above runs OK in my Oracle environment, but doesn't validate any postcodes?
Thanks..
Some revision later, it looks as though there is no word boundary, and I haven't been able to create one with a group either.
Is there anything else that can be used? For example, will there always be a space before the postcode?
Chris
ASKER
Hi,
The postcode is entered through a web form, and then passed into a PL/SQL procedure to be validated, so I could append e.g. $ before and after the postcode before it is validated, would that help?
In that case assuming no extraneous characters try the following which anchors to the line start and end:
^(GIR 0AA|(([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2})$
Chris
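To illustrate why anchoring matters here, the sketch below runs the same expression with and without `^` and `$` against KI1 8SH, the sample reported earlier in the thread. It assumes Python's `re` module, whose `search` looks for a match anywhere in the string; whether the Oracle call used by the asker behaves the same way is an assumption. The names `BODY`, `UNANCHORED` and `ANCHORED` are only illustrative.

```python
import re

# The expression body, without anchors.
BODY = (
    r"(GIR 0AA|(([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}"
    r"|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2})"
)
UNANCHORED = re.compile(BODY)
ANCHORED = re.compile("^" + BODY + "$")

sample = "KI1 8SH"

# Without anchors, a match may start part-way through the input.
hit = UNANCHORED.search(sample)
print("unanchored matched:", hit.group(0) if hit else None)

# With anchors, the whole input has to fit the pattern.
print("anchored matched:", ANCHORED.search(sample) is not None)
```

In this sketch the unanchored search succeeds by skipping the leading K and matching the remainder, while the anchored version rejects the input. That would be one possible explanation for the earlier report that an invalid second letter was accepted.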
If not, a dollar prefix and suffix would be:
\$(GIR 0AA|(([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2})\$
Chris
ASKER
Hi Chris,
Ah so close! Thought that was it cracked, I'm using the \$, but this postcode is still accepted: -
EC1X 8SH
ASKER
Looks to be just that 3rd character position that is not quite right now?
As far as I can see from the reference document, EC1X 8SH is valid.
Chris
ASKER
Hi,
No it is a little confusing, the way they refer to 3rd/4th position.
This is the rule that I think is being broken: -
* The only letters to appear in the third position are A, B, C, D, E, F, G, H, J, K, S, T, U and W.
By third, they mean the third character position (e.g. EC1A 1BB). So EC1X 8SH should fail?
To me, from the spec and common expectation:
AANA NAA EC1A 1BB
i.e. a numeric is an option for the third character, so a numeric is the only option for the third character when the first component has 4 characters, and the third-letter limitation only applies to postcodes with three characters in the first group (e.g. W1A 1HQ).
Chris
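The reading above can be checked against the anchored expression with a small sketch. It assumes Python's `re` module rather than Oracle, and `ANCHORED` and `accepted` are illustrative names. W1X 1HQ is a made-up sample, added only to exercise the third-position rule on a three-character outward code; the other samples come from this thread.

```python
import re

# The anchored expression from this thread.
ANCHORED = re.compile(
    r"^(GIR 0AA|(([A-PR-UWYZ]([0-9]([0-9]|[A-HJKS-UW]){0,1}"
    r"|[A-HK-Y]([0-9]([0-9]|[ABEHMNPRV-Y]){0,1})))) [0-9][ABD-HJLNP-UW-X]{2})$"
)


def accepted(postcode: str) -> bool:
    """Return True if the anchored expression matches the whole input."""
    return ANCHORED.search(postcode) is not None


# W1X 1HQ is a hypothetical sample, not one from the thread.
for sample in ["W1A 1HQ", "W1X 1HQ", "EC1A 1BB", "EC1X 8SH", "EC1I 8SH"]:
    print(sample, accepted(sample))
```

In this sketch X is refused as the third character of W1X 1HQ but allowed as the fourth character of EC1X 8SH, while I is refused in the fourth position of EC1I 8SH. That matches the interpretation that the third-position and fourth-position letter lists are separate rules.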
ASKER
Yes that does make sense, thanks so much for your help :o)
From http://www.regxlib.com/REDetails.aspx?regexp_id=260
^([A-PR-UWYZ0-9][A-HK-Y0-9
--------------------------
A regular expression is given in the comments of the schema, which implements full checking of all the stated BS 7666 postcode format rules. That regular expression can be restated as a "traditional" regular expression:
(GIR 0AA|[A-PR-UWYZ]([0-9]{1,2}
British Forces Post Office postcodes do not follow the BS 7666 rules, but have the format "BFPO NNN" or "BFPO c/o NNN", where NNN is 1 to 4 numerical digits. A regular expression to implement the BS 7666 rules:
(GIR 0AA)|((([A-Z-[QVX]][0-9][0
Alternative short regular expression from BS7666 Schema is:
[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}
Courtesy:- http://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom
Raj
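As a rough illustration of how much looser the short expression is, the sketch below runs it against a few samples from this thread. It assumes Python's `re` module, and `SHORT` is just an illustrative name. `re.fullmatch` is used so the whole string must match.

```python
import re

# The short expression quoted above: it checks the overall shape only,
# not the per-position letter restrictions.
SHORT = re.compile(r"[A-Z]{1,2}[0-9R][0-9A-Z]? [0-9][ABD-HJLNP-UW-Z]{2}")

# Samples taken from this thread.
for sample in ["EC1A 1BB", "GIR 0AA", "KI1 8SH", "AAAA 1AA"]:
    print(sample, SHORT.fullmatch(sample) is not None)
```

In this sketch the short form accepts KI1 8SH, which the asker reported as invalid, so it is probably better suited to a quick shape check than to full validation.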