Link to home
Start Free TrialLog in
Avatar of scarface7
scarface7

asked on

Regular Expressions in SQL statements for bulk column data modification

Hi,

I have extracted some data from a flat file source and uploaded into a SQL Server 2005 relational table. My goal is to extract certain parts of a string in each row of a specific column, and then place those in the associated row/column location in another SQL server destination (relational table).  I am wondering if using embedded regular expressions in SQL statements is the best way to achieve the bulk data modification/retrieval. Here is an example of what I am trying to accomplish:

ADV  LAT   LON     TIME    WIND  PR  STAT
 1   17.00  -84.20 03/00Z   30  1005 TROPICAL DEPRESSION
 2   17.50  -84.50 03/03Z   30  1005 TROPICAL DEPRESSION
 3   18.80  -84.60 03/09Z   30  1005 TROPICAL DEPRESSION
 3A  19.50  -85.00 03/12Z   30  1005 TROPICAL DEPRESSION
 4   20.00  -85.30 03/15Z   35  1004 TROPICAL STORM
 4A  20.60  -85.80 03/18Z   40  1002 TROPICAL STORM
 5   21.10  -85.90 03/21Z   45  1002 TROPICAL STORM

In the "TIME" column, I want to extract the first two characters as a substring from every row i.e. "03" and place it in a seperate column in another table. Then extract the two characters after "/", add two zeros and then place it in another column of a certain relational table.

I have SQL Server business intelligence 2005 tools handy, so if you feel there is some easier way to accomplish the same goal by using some in built features your aware of, let me know. If not, is it technically possible to execute a single SQL statement to accomplish this. If yes, can you give me some idea of how to write this SQL statement. I am prepared to a write a C# program to accomplish this if required.

Thanks !
Avatar of Anthony Perkins
Anthony Perkins
Flag of United States of America image

Use something like this:

Select      LEFT([TIME], 2),
      SUBSTRING([TIME], 4, 2) + '00'
From      YourTableName
Avatar of scarface7
scarface7

ASKER

Thanks for the solution.

In your opinion is it better to approach the problem in this fashion or do you think I should write a C# program where I would store all the values in a result sets and temporary variables and then apply string functions and then load the data back into another staging relational table ?

It really depends on your personal preferences and strengths.  If it was me and the values are in the table as you describe than I would just use a SQL INSERT or UPDATE statement using something like I posted.
I tried executing your solution against the data.

I get:

Expr1   Expr2
0          /000
0          /000
0          /000
0          /100

The SQL query gets executed in this form by the Query designer:
SELECT     LEFT(TIME, 2) AS Expr1, SUBSTRING(TIME, 4, 2) + '00' AS Expr2
FROM         Hurricane_Staging_1

Could this be happening because TIME might be a used name?

Also in response to your last comment, could there be a situation where you might choose to write a C# program with result sets, temporary variables and string functions instead of a direct SQL statement like this one?
 
Than there is something you are not telling us.  Compare your code/table structure to the following and you should be able to figure out what is going on:

Declare @Hurricane_Staging_1 table(
            ADV char(2),
            LAT numeric(9,2),
            LON numeric(9,2),
            [TIME] char(6),
            WIND tinyint,
            PR smallint,
            STAT varchar(20))
SET NOCOUNT ON
Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('1', 17.00, -84.20, '03/00Z', 30, 1005, 'TROPICAL DEPRESSION')

Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('2', 17.50, -84.50, '03/03Z', 30, 1005, 'TROPICAL DEPRESSION')

Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('3', 18.80, -84.60, '03/09Z', 30, 1005, 'TROPICAL DEPRESSION')

Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('3A', 19.50, -85.00, '03/12Z', 30, 1005, 'TROPICAL DEPRESSION')

Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('4', 20.00, -85.30, '03/15Z', 35, 1004, 'TROPICAL STORM')

Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('4A', 20.60, -85.80, '03/18Z', 40, 1002, 'TROPICAL STORM')

Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
Values ('5', 21.10, -85.90, '03/21Z', 45, 1002, 'TROPICAL STORM')

Select * From @Hurricane_Staging_1

Select      LEFT([TIME], 2),
      SUBSTRING([TIME], 4, 2) + '00'
From      @Hurricane_Staging_1


This give me the following output:
ADV      LAT      LON      TIME      WIND      PR      STAT
1       17.00      -84.20      03/00Z      30      1005      TROPICAL DEPRESSION
2       17.50      -84.50      03/03Z      30      1005      TROPICAL DEPRESSION
3       18.80      -84.60      03/09Z      30      1005      TROPICAL DEPRESSION
3A      19.50      -85.00      03/12Z      30      1005      TROPICAL DEPRESSION
4       20.00      -85.30      03/15Z      35      1004      TROPICAL STORM
4A      20.60      -85.80      03/18Z      40      1002      TROPICAL STORM
5       21.10      -85.90      03/21Z      45      1002      TROPICAL STORM

      
03      0000
03      0300
03      0900
03      1200
03      1500
03      1800
03      2100
ASKER CERTIFIED SOLUTION
Avatar of Anthony Perkins
Anthony Perkins
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Perfect !

Yes, there were spaces that needed to be taken consideration.

I get the correct output data now :)

Thank you very much.