create table `country_postalcodes` (
`id` integer unsigned not null auto_increment,
`country` varchar(3) not null,
`postalcode` varchar(10) not null,
primary key(`id`)
)
;
2. Insert sample data into table from previous step.
insert into country_postalcodes(country, postalcode)
select 'OMN', 'PC-133'
union select 'DEU', '45128'
union select 'POL', '80-958'
union select 'USA', '78759'
union select 'FIN', '00950'
union select 'GTM', '01057'
union select 'SWE', 'AB85176'
union select 'CHL', '282-1255'
union select 'NLD', '3980CD'
union select 'CHN', '215021'
union select 'USA', '33166'
union select 'ISR', '66851'
union select 'ARG', '1674'
union select 'KWT', '40180'
union select 'ISL', 'A-987'
union select 'VEN', '1040A'
union select 'GER', '60549'
union select 'CZE', '10200'
union select 'CHE', 'CH8127'
union select 'ITA', '04011'
union select 'SGP', '338763'
union select 'MEX', 'CP76020'
union select 'IDN', '76116'
union select 'USA', '77043'
union select 'GRC', '12243'
union select 'USA', '49417'
union select 'ASM', '96799'
union select 'TUR', '38030'
union select 'OMN', 'PC-A12'
union select 'AUT', 'A-1014'
;
3. Verify.
select id, country, postalcode
from country_postalcodes
order by postalcode
;
Results should appear similar to the below, barring any differences in auto-increment id:
id country postalcode
9 FIN 00950
12 GTM 01057
16 ITA 04011
7 CZE 10200
30 VEN 1040A
11 GRC 12243
1 ARG 1674
6 CHN 215021
5 CHL 282-1255
26 USA 33166
23 SGP 338763
25 TUR 38030
19 NLD 3980CD
17 KWT 40180
8 DEU 45128
27 USA 49417
10 GER 60549
15 ISR 66851
13 IDN 76116
28 USA 77043
29 USA 78759
22 POL 80-958
2 ASM 96799
3 AUT A-1014
14 ISL A-987
24 SWE AB85176
4 CHE CH8127
18 MEX CP76020
20 OMN PC-133
21 OMN PC-A12
11 GRC 12243
1 ARG 1674
3 AUT A-1014
14 ISL A-987
6 CHN 215021
5 CHL 282-1255
Second example also plants the seed for our more complex need of sorting on either side of a separator like hyphen ("-"). The third is another example for our complex case as values with numeric only data on both sides of "-" will need to be treated/sorted as numeric so that for example United States zip+4 style postal code 45458-1234 would be sorted with shortened code 45458 as they would as text but additionally in the correct numerical sequence with the remainder of the data.
7 CZE 10200
30 VEN 1040A
Aside from wanting these separated, note how the grouping of alphanumeric and numeric only data causes seemingly incorrect numerical ordering. As another learning experience, you can probably explore sorting this text (beginning with numbers) to correctly align with numerical values just with a suffix; however, we will continue to keep heading down our original path where these values should be considered as text.
20 OMN PC-133
21 OMN PC-A12
Not really a problem with the fact that numbers are first in an alphanumeric sort, but drawing to your attention here as we will explore changing this behavior to show some added benefit of our methodology.
select id, country, postalcode
from country_postalcodes
order by isnumeric(postalcode), postalcode
;
And with the results, you will see that the new sort now has all text based values at the top of the list ordered alphabetically; then, numeric ones.
id country postalcode
30 VEN 1040A
5 CHL 282-1255
19 NLD 3980CD
22 POL 80-958
3 AUT A-1014
14 ISL A-987
24 SWE AB85176
4 CHE CH8127
18 MEX CP76020
20 OMN PC-133
21 OMN PC-A12
9 FIN 00950
12 GTM 01057
16 ITA 04011
7 CZE 10200
11 GRC 12243
1 ARG 1674
6 CHN 215021
26 USA 33166
23 SGP 338763
25 TUR 38030
17 KWT 40180
8 DEU 45128
27 USA 49417
10 GER 60549
15 ISR 66851
13 IDN 76116
28 USA 77043
29 USA 78759
2 ASM 96799
So without a lot of effort, IsNumeric() has just eliminated two-thirds of our task or at least fifty percent of our worries since our last task has two parts: numerical sequencing; handling left/right side of "-" based on three other rules.
select id, country, postalcode
from country_postalcodes
order by
case isnumeric(postalcode)
when 0 then 0
else cast(postalcode as int)
end, postalcode
;
or
select id, country, postalcode
from country_postalcodes
order by isnumeric(postalcode),
case isnumeric(postalcode)
when 0 then postalcode
else right(replicate('0',10)+postalcode, 10)
end
;
Resulting in (top 15 shown):
id country postalcode
30 VEN 1040A
5 CHL 282-1255
19 NLD 3980CD
22 POL 80-958
3 AUT A-1014
14 ISL A-987
24 SWE AB85176
4 CHE CH8127
18 MEX CP76020
20 OMN PC-133
21 OMN PC-A12
9 FIN 00950
12 GTM 01057
1 ARG 1674
16 ITA 04011
Wonderful!
select id, country, postalcode
from country_postalcodes
order by
case isnumeric(postalcode)
when 0 then 9999999999
else cast(postalcode as int)
end desc, postalcode
;
Resulting in (top 15 shown):
id country postalcode
30 VEN 1040A
5 CHL 282-1255
19 NLD 3980CD
22 POL 80-958
3 AUT A-1014
14 ISL A-987
24 SWE AB85176
4 CHE CH8127
18 MEX CP76020
20 OMN PC-133
21 OMN PC-A12
23 SGP 338763
6 CHN 215021
2 ASM 96799
29 USA 78759
Notice that the first text values are unchanged, but with our numbers are listed highest to lowest.
^(([0-9+-.$]{1})|([+-]?[$]?[0-9]*(([.]{1}[0-9]*)|([.]?[0-9]+))))$
From my testing, this got me what I needed to mimic the majority if not all of the IsNumeric() functionality in MySQL other than what I mention in "How it works." segment.
([0-9+-.$]{1})
The "{1}" denotes that there is one and only one character present. For minimum of one character and unlimited or other fixed maximum, the syntax changes to "{1, }" or "{1,n}", respectively.
([+-]?[$]?[0-9]*(([.]{1}[0-9]*)|([.]?[0-9]+)))
*Note: one known functionality not replicated, -$6.00 and $-6.00 should work, but didn't get into that for my version to obey the KISS ("keep it simple, silly") principle.
^[0-9]+$
The IsInteger() style implementation of IsNumeric() will lessen the amount of code needed, again to follow the KISS principle since this shortened expression is all that is needed to ensure that our value begins, includes, and ends with only the digits 0-9 so that we match on numbers only and not alphanumeric strings.
delimiter $$
drop function if exists `isnumeric` $$
create function `isnumeric` (s varchar(255)) returns int
begin
set @match =
'^(([0-9+-.$]{1})|([+-]?[$]?[0-9]*(([.]{1}[0-9]*)|([.]?[0-9]+))))$';
return if(s regexp @match, 1, 0);
end $$
delimiter ;
Voilà !
isnumeric(postalcode)
MySQL without UDF:
postalcode regexp '^[0-9]+$'
cast(postalcode as int)
MySQL:
cast(postalcode as signed)
if(instr(postalcode, '-') > 0,
left(postalcode, instr(postalcode, '-')-1),
postalcode
)
This portion of code uses Instr() to find the index of the hyphen ("-") within our value and then takes the left side using the Left() function. We can do this directly before comparing to our regular expression in the order by case statement, although, is not shown as such below to keep the comparison to the T-SQL version very clean and clear. Since there may not be a hyphen, the Instr() test is wrapped with the If() control function that ensures the index of hyphen is greater than 0 (it exists) before taking the left portion, otherwise use our entire value.
if(instr(postalcode, '-') > 0,
right(postalcode, instr(reverse(postalcode), '-')-1),
null
)
The use of Reverse() is the main difference in the Right() function than we did in the Left(). The reasoning here is if there is more than one hyphen, we will parse out the right most portion of the string. This way you can support multiple hyphens without having to support code for each level -- just left most and right most values; everything in the middle will sort as text. Again, keep it simple.
select id, country, postalcode
from (
select *,
/* split off left side */
if(instr(postalcode, '-') > 0,
left(postalcode, instr(postalcode, '-')-1),
postalcode
) as postalcode_left,
/* split off right side */
if(instr(postalcode, '-') > 0,
right(postalcode, instr(reverse(postalcode), '-')-1),
null
) as postalcode_right
from country_postalcodes
) derived
order by
/* handles ordering of text to left of hyphen */
case isnumeric(postalcode_left)
when 0 then 9999999999
else cast(postalcode_left as signed)
end desc, postalcode_left /* sort non-numeric left text */,
/* handles ordering of text to right of hyphen */
case isnumeric(postalcode_right)
when 0 then 9999999999
else cast(postalcode_right as signed)
end desc, postalcode_right /* sort non-numeric right text - catchall */
;
Resulting in our desired sort:
id country postalcode
16 VEN 1040A
9 NLD 3980CD
30 AUT A-1014
15 ISL A-987
7 SWE AB85176
19 CHE CH8127
22 MEX CP76020
29 OMN PC-A12
1 OMN PC-133
21 SGP 338763
10 CHN 215021
27 ASM 96799
4 USA 78759
24 USA 77043
23 IDN 76116
12 ISR 66851
17 GER 60549
26 USA 49417
2 DEU 45128
14 KWT 40180
28 TUR 38030
11 USA 33166
25 GRC 12243
18 CZE 10200
20 ITA 04011
13 ARG 1674
6 GTM 01057
5 FIN 00950
8 CHL 282-1255
3 POL 80-958
select id, country, postalcode
from (
select *,
/* split off left side */
if(instr(postalcode, '-') > 0,
left(postalcode, instr(postalcode, '-')-1),
postalcode
) as postalcode_left,
/* split off right side */
if(instr(postalcode, '-') > 0,
right(postalcode, instr(reverse(postalcode), '-')-1),
null
) as postalcode_right
from country_postalcodes
) derived
order by
/* handles ordering of text to left of hyphen */
case postalcode_left regexp '^[0-9]+$'
when 0 then 9999999999
else cast(postalcode_left as signed)
end desc, postalcode_left /* sort non-numeric left text */,
/* handles ordering of text to right of hyphen */
case postalcode_right regexp '^[0-9]+$'
when 0 then 9999999999
else cast(postalcode_right as signed)
end desc, postalcode_right /* sort non-numeric right text - catchall */
;
*Very simple update, but included here for convenience of copy and paste.
delimiter $$
drop function if exists `isinteger` $$
create function `isinteger` (s varchar(255)) returns int
begin
return if(s regexp '^[0-9]+$', 1, 0);
end $$
delimiter ;
select id, country, postalcode
from country_postalcodes
order by
case regexp_like(postalcode, '^[0-9]+$')
when 1 then cast(postalcode as integer)
else 9999999999
end, postalcode
;
Oracle below 9i:
select id, country, postalcode
from country_postalcodes
order by
if regexp_like(postalcode, '^[0-9]+$')
then cast(postalcode as integer)
else 9999999999
end if, postalcode
;
Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.
Comments (10)
Commented:
Big ol' yes vote!
Commented:
you can also use cast in mysql and many other database engines instead of relying on regexpr
just cast and compare the casted value to the original one will produce a reverse cast so the values will differ if the cast produced data loss which would happen if you cast a postal code with letters as an integer
select '01264abcd' == cast ( '01264abcd' as unsigned)
Author
Commented:Commented:
Commented:
select 3e2 REGEXP '^300$';
+--------------------+
| 3e2 REGEXP '^300$' |
+--------------------+
| 1 |
+--------------------+
1 row in set (0.00 sec)
... but if you quote the '3e2', it won't happen.
View More