Regular Expressions in SQL statements for bulk column data modification

Posted on 2006-04-08
Last Modified: 2012-06-27

I have extracted some data from a flat file source and uploaded into a SQL Server 2005 relational table. My goal is to extract certain parts of a string in each row of a specific column, and then place those in the associated row/column location in another SQL server destination (relational table).  I am wondering if using embedded regular expressions in SQL statements is the best way to achieve the bulk data modification/retrieval. Here is an example of what I am trying to accomplish:

 1   17.00  -84.20 03/00Z   30  1005 TROPICAL DEPRESSION
 2   17.50  -84.50 03/03Z   30  1005 TROPICAL DEPRESSION
 3   18.80  -84.60 03/09Z   30  1005 TROPICAL DEPRESSION
 3A  19.50  -85.00 03/12Z   30  1005 TROPICAL DEPRESSION
 4   20.00  -85.30 03/15Z   35  1004 TROPICAL STORM
 4A  20.60  -85.80 03/18Z   40  1002 TROPICAL STORM
 5   21.10  -85.90 03/21Z   45  1002 TROPICAL STORM

In the "TIME" column, I want to extract the first two characters as a substring from every row i.e. "03" and place it in a seperate column in another table. Then extract the two characters after "/", add two zeros and then place it in another column of a certain relational table.

I have SQL Server business intelligence 2005 tools handy, so if you feel there is some easier way to accomplish the same goal by using some in built features your aware of, let me know. If not, is it technically possible to execute a single SQL statement to accomplish this. If yes, can you give me some idea of how to write this SQL statement. I am prepared to a write a C# program to accomplish this if required.

Thanks !
Question by:scarface7
    LVL 75

    Expert Comment

    by:Anthony Perkins
    Use something like this:

    Select      LEFT([TIME], 2),
          SUBSTRING([TIME], 4, 2) + '00'
    From      YourTableName

    Author Comment

    Thanks for the solution.

    In your opinion is it better to approach the problem in this fashion or do you think I should write a C# program where I would store all the values in a result sets and temporary variables and then apply string functions and then load the data back into another staging relational table ?

    LVL 75

    Expert Comment

    by:Anthony Perkins
    It really depends on your personal preferences and strengths.  If it was me and the values are in the table as you describe than I would just use a SQL INSERT or UPDATE statement using something like I posted.

    Author Comment

    I tried executing your solution against the data.

    I get:

    Expr1   Expr2
    0          /000
    0          /000
    0          /000
    0          /100

    The SQL query gets executed in this form by the Query designer:
    SELECT     LEFT(TIME, 2) AS Expr1, SUBSTRING(TIME, 4, 2) + '00' AS Expr2
    FROM         Hurricane_Staging_1

    Could this be happening because TIME might be a used name?

    Also in response to your last comment, could there be a situation where you might choose to write a C# program with result sets, temporary variables and string functions instead of a direct SQL statement like this one?
    LVL 75

    Expert Comment

    by:Anthony Perkins
    Than there is something you are not telling us.  Compare your code/table structure to the following and you should be able to figure out what is going on:

    Declare @Hurricane_Staging_1 table(
                ADV char(2),
                LAT numeric(9,2),
                LON numeric(9,2),
                [TIME] char(6),
                WIND tinyint,
                PR smallint,
                STAT varchar(20))
    Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
    Values ('1', 17.00, -84.20, '03/00Z', 30, 1005, 'TROPICAL DEPRESSION')

    Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
    Values ('2', 17.50, -84.50, '03/03Z', 30, 1005, 'TROPICAL DEPRESSION')

    Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
    Values ('3', 18.80, -84.60, '03/09Z', 30, 1005, 'TROPICAL DEPRESSION')

    Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
    Values ('3A', 19.50, -85.00, '03/12Z', 30, 1005, 'TROPICAL DEPRESSION')

    Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
    Values ('4', 20.00, -85.30, '03/15Z', 35, 1004, 'TROPICAL STORM')

    Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
    Values ('4A', 20.60, -85.80, '03/18Z', 40, 1002, 'TROPICAL STORM')

    Insert @Hurricane_Staging_1 (ADV, LAT, LON, [TIME], WIND, PR, STAT)
    Values ('5', 21.10, -85.90, '03/21Z', 45, 1002, 'TROPICAL STORM')

    Select * From @Hurricane_Staging_1

    Select      LEFT([TIME], 2),
          SUBSTRING([TIME], 4, 2) + '00'
    From      @Hurricane_Staging_1

    This give me the following output:
    ADV      LAT      LON      TIME      WIND      PR      STAT
    1       17.00      -84.20      03/00Z      30      1005      TROPICAL DEPRESSION
    2       17.50      -84.50      03/03Z      30      1005      TROPICAL DEPRESSION
    3       18.80      -84.60      03/09Z      30      1005      TROPICAL DEPRESSION
    3A      19.50      -85.00      03/12Z      30      1005      TROPICAL DEPRESSION
    4       20.00      -85.30      03/15Z      35      1004      TROPICAL STORM
    4A      20.60      -85.80      03/18Z      40      1002      TROPICAL STORM
    5       21.10      -85.90      03/21Z      45      1002      TROPICAL STORM

    03      0000
    03      0300
    03      0900
    03      1200
    03      1500
    03      1800
    03      2100
    LVL 75

    Accepted Solution

    You appear to have a leading space in your data for [TIME]  If this is consistent for all rows than just simple do this:

    Select     SUBSTRING([TIME], 2, 2),
         SUBSTRING([TIME], 5, 2) + '00'
    From     Hurricane_Staging_1

    Author Comment

    Perfect !

    Yes, there were spaces that needed to be taken consideration.

    I get the correct output data now :)

    Thank you very much.

    Featured Post

    Enabling OSINT in Activity Based Intelligence

    Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

    Join & Write a Comment

    When you hear the word proxy, you may become apprehensive. This article will help you to understand Proxy and when it is useful. Let's talk Proxy for SQL Server. (Not in terms of Internet access.) Typically, you'll run into this type of problem w…
    This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
    Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
    Viewers will learn how the fundamental information of how to create a table.

    746 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    19 Experts available now in Live!

    Get 1:1 Help Now