scarface7
asked on
Regular Expressions in SQL statements for bulk column data modification
Hi,
I have extracted some data from a flat file source and uploaded into a SQL Server 2005 relational table. My goal is to extract certain parts of a string in each row of a specific column, and then place those in the associated row/column location in another SQL server destination (relational table). I am wondering if using embedded regular expressions in SQL statements is the best way to achieve the bulk data modification/retrieval. Here is an example of what I am trying to accomplish:
ADV LAT LON TIME WIND PR STAT
1 17.00 -84.20 03/00Z 30 1005 TROPICAL DEPRESSION
2 17.50 -84.50 03/03Z 30 1005 TROPICAL DEPRESSION
3 18.80 -84.60 03/09Z 30 1005 TROPICAL DEPRESSION
3A 19.50 -85.00 03/12Z 30 1005 TROPICAL DEPRESSION
4 20.00 -85.30 03/15Z 35 1004 TROPICAL STORM
4A 20.60 -85.80 03/18Z 40 1002 TROPICAL STORM
5 21.10 -85.90 03/21Z 45 1002 TROPICAL STORM
In the "TIME" column, I want to extract the first two characters as a substring from every row i.e. "03" and place it in a seperate column in another table. Then extract the two characters after "/", add two zeros and then place it in another column of a certain relational table.
I have SQL Server business intelligence 2005 tools handy, so if you feel there is some easier way to accomplish the same goal by using some in built features your aware of, let me know. If not, is it technically possible to execute a single SQL statement to accomplish this. If yes, can you give me some idea of how to write this SQL statement. I am prepared to a write a C# program to accomplish this if required.
Thanks !
I have extracted some data from a flat file source and uploaded into a SQL Server 2005 relational table. My goal is to extract certain parts of a string in each row of a specific column, and then place those in the associated row/column location in another SQL server destination (relational table). I am wondering if using embedded regular expressions in SQL statements is the best way to achieve the bulk data modification/retrieval. Here is an example of what I am trying to accomplish:
ADV LAT LON TIME WIND PR STAT
1 17.00 -84.20 03/00Z 30 1005 TROPICAL DEPRESSION
2 17.50 -84.50 03/03Z 30 1005 TROPICAL DEPRESSION
3 18.80 -84.60 03/09Z 30 1005 TROPICAL DEPRESSION
3A 19.50 -85.00 03/12Z 30 1005 TROPICAL DEPRESSION
4 20.00 -85.30 03/15Z 35 1004 TROPICAL STORM
4A 20.60 -85.80 03/18Z 40 1002 TROPICAL STORM
5 21.10 -85.90 03/21Z 45 1002 TROPICAL STORM
In the "TIME" column, I want to extract the first two characters as a substring from every row i.e. "03" and place it in a seperate column in another table. Then extract the two characters after "/", add two zeros and then place it in another column of a certain relational table.
I have SQL Server business intelligence 2005 tools handy, so if you feel there is some easier way to accomplish the same goal by using some in built features your aware of, let me know. If not, is it technically possible to execute a single SQL statement to accomplish this. If yes, can you give me some idea of how to write this SQL statement. I am prepared to a write a C# program to accomplish this if required.
Thanks !
ASKER
Thanks for the solution.
In your opinion is it better to approach the problem in this fashion or do you think I should write a C# program where I would store all the values in a result sets and temporary variables and then apply string functions and then load the data back into another staging relational table ?
In your opinion is it better to approach the problem in this fashion or do you think I should write a C# program where I would store all the values in a result sets and temporary variables and then apply string functions and then load the data back into another staging relational table ?
It really depends on your personal preferences and strengths. If it was me and the values are in the table as you describe than I would just use a SQL INSERT or UPDATE statement using something like I posted.
ASKER
I tried executing your solution against the data.
I get:
Expr1 Expr2
0 /000
0 /000
0 /000
0 /100
The SQL query gets executed in this form by the Query designer:
SELECT LEFT(TIME, 2) AS Expr1, SUBSTRING(TIME, 4, 2) + '00' AS Expr2
FROM Hurricane_Staging_1
Could this be happening because TIME might be a used name?
Also in response to your last comment, could there be a situation where you might choose to write a C# program with result sets, temporary variables and string functions instead of a direct SQL statement like this one?
I get:
Expr1 Expr2
0 /000
0 /000
0 /000
0 /100
The SQL query gets executed in this form by the Query designer:
SELECT LEFT(TIME, 2) AS Expr1, SUBSTRING(TIME, 4, 2) + '00' AS Expr2
FROM Hurricane_Staging_1
Could this be happening because TIME might be a used name?
Also in response to your last comment, could there be a situation where you might choose to write a C# program with result sets, temporary variables and string functions instead of a direct SQL statement like this one?
Than there is something you are not telling us. Compare your code/table structure to the following and you should be able to figure out what is going on:
Declare @Hurricane_Staging_1 table(
ADV char(2),
LAT numeric(9,2),
LON numeric(9,2),
[TIME] char(6),
WIND tinyint,
PR smallint,
STAT varchar(20))
SET NOCOUNT ON
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('1', 17.00, -84.20, '03/00Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('2', 17.50, -84.50, '03/03Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('3', 18.80, -84.60, '03/09Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('3A', 19.50, -85.00, '03/12Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('4', 20.00, -85.30, '03/15Z', 35, 1004, 'TROPICAL STORM')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('4A', 20.60, -85.80, '03/18Z', 40, 1002, 'TROPICAL STORM')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('5', 21.10, -85.90, '03/21Z', 45, 1002, 'TROPICAL STORM')
Select * From @Hurricane_Staging_1
Select LEFT([TIME], 2),
SUBSTRING([TIME], 4, 2) + '00'
From @Hurricane_Staging_1
This give me the following output:
ADV LAT LON TIME WIND PR STAT
1 17.00 -84.20 03/00Z 30 1005 TROPICAL DEPRESSION
2 17.50 -84.50 03/03Z 30 1005 TROPICAL DEPRESSION
3 18.80 -84.60 03/09Z 30 1005 TROPICAL DEPRESSION
3A 19.50 -85.00 03/12Z 30 1005 TROPICAL DEPRESSION
4 20.00 -85.30 03/15Z 35 1004 TROPICAL STORM
4A 20.60 -85.80 03/18Z 40 1002 TROPICAL STORM
5 21.10 -85.90 03/21Z 45 1002 TROPICAL STORM
03 0000
03 0300
03 0900
03 1200
03 1500
03 1800
03 2100
Declare @Hurricane_Staging_1 table(
ADV char(2),
LAT numeric(9,2),
LON numeric(9,2),
[TIME] char(6),
WIND tinyint,
PR smallint,
STAT varchar(20))
SET NOCOUNT ON
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('1', 17.00, -84.20, '03/00Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('2', 17.50, -84.50, '03/03Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('3', 18.80, -84.60, '03/09Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('3A', 19.50, -85.00, '03/12Z', 30, 1005, 'TROPICAL DEPRESSION')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('4', 20.00, -85.30, '03/15Z', 35, 1004, 'TROPICAL STORM')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('4A', 20.60, -85.80, '03/18Z', 40, 1002, 'TROPICAL STORM')
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('5', 21.10, -85.90, '03/21Z', 45, 1002, 'TROPICAL STORM')
Select * From @Hurricane_Staging_1
Select LEFT([TIME], 2),
SUBSTRING([TIME], 4, 2) + '00'
From @Hurricane_Staging_1
This give me the following output:
ADV LAT LON TIME WIND PR STAT
1 17.00 -84.20 03/00Z 30 1005 TROPICAL DEPRESSION
2 17.50 -84.50 03/03Z 30 1005 TROPICAL DEPRESSION
3 18.80 -84.60 03/09Z 30 1005 TROPICAL DEPRESSION
3A 19.50 -85.00 03/12Z 30 1005 TROPICAL DEPRESSION
4 20.00 -85.30 03/15Z 35 1004 TROPICAL STORM
4A 20.60 -85.80 03/18Z 40 1002 TROPICAL STORM
5 21.10 -85.90 03/21Z 45 1002 TROPICAL STORM
03 0000
03 0300
03 0900
03 1200
03 1500
03 1800
03 2100
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Perfect !
Yes, there were spaces that needed to be taken consideration.
I get the correct output data now :)
Thank you very much.
Yes, there were spaces that needed to be taken consideration.
I get the correct output data now :)
Thank you very much.
Select LEFT([TIME], 2),
SUBSTRING([TIME], 4, 2) + '00'
From YourTableName