parse individual words from text field

Posted on 2005-04-05
Last Modified: 2008-03-17
Using only MSSQL I need to take the words in a text/varchar and pull each individual word out, check to see if it is in the "noise" list (a separate table) and ignore it if it is in the "noise" table, if it is not noise then put each word into another table with the ID field of the record the text/varchar field belongs to. So basically there would be a list of the non-noise words in a separate table with the ID of the original record where they resided in the text/varchar field. Basically creating a list of keywords that links back to the original record. I don't know the string fuctions in sql to be able to separate out the data into individual words and such.
Question by:Donzella

    Author Comment

    Tinkering around I put this together. Is this something that looks effeicient enough? There are about one million records I need to run this on so efficiency would be important. This isn't doing the actual insert, just breaking up the strings into individual words.

    Create table [TempTable] (TempField nvarchar(100))

    Declare @EndWhen int,
            @TempPosition int,
            @TempString varchar(8000),
            @TempHoldString varchar(8000)
    Set @EndWhen = 0
    Set @TempString = 'this is a test'

    If DataLength(@TempString) <= 1
      Set @EndWhen = 1

    While @EndWhen != 1

      Set @TempPosition = Charindex(' ',@TempString)
        Set @TempHoldString = @TempString
        Set @TempString = ''
        Set @TempHoldString = left(@TempString,@TempPosition)
        Set @TempString = Right(@TempString,DataLength(@TempString) - @TempPosition)

      Insert Into [TempTable] (TempField)
      Values (@TempHoldString)

      If DataLength(@TempString) <= 1
             Set @EndWhen = 1


    select * from [TempTable]

    DROP TABLE [TempTable]
    LVL 68

    Accepted Solution

    I think the code below will be considerably more efficient since it avoids assigning values to string variables, which is slow in SQL Server.

    CREATE TABLE [TempTable] (TempField VARCHAR(100))

    --**** setup/initialization (params for a sp or function, for example)

    DECLARE @TempString VARCHAR(8000)

    SET @TempString = '  this is a test string  '

    --**** actual processing

    DECLARE @WordStart INT
    DECLARE @WordEnd INT
    DECLARE @TempStringLen INT

    --add a delim to end of string (last word)
    --so don't have to check every time to see if it exists or not
    SET @TempString = LTRIM(RTRIM(@TempString)) + ' '
    SET @TempStringLen = LEN(@TempString)
    SET @wordStart = 1

    WHILE @wordStart <= @TempStringLen
          SET @wordEnd = CHARINDEX(' ', @TempString, @wordStart + 1)
          INSERT INTO [TempTable] (TempField)
                VALUES (SUBSTRING(@TempString, @WordStart, @WordEnd - @WordStart))
          SET @wordStart = @wordEnd + 1

    SELECT TempField
    FROM [TempTable]

    DROP TABLE [TempTable]

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    What Should I Do With This Threat Intelligence?

    Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

    The steps for moving the system databases to a new location are documented in the following technical article: However sometimes after the moving process is finished, though SQL i…
    I wrote this interesting script that really help me find jobs or procedures when working in a huge environment. I could I have written it as a Procedure but then I would have to have it on each machine or have a link to a server-related search that …
    Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
    Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

    737 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    21 Experts available now in Live!

    Get 1:1 Help Now