Solved

counting strings in ntext field

Posted on 2004-08-16
Last Modified: 2012-05-05
I'm trying to count the occurrences of a string in an ntext field.

For example, the ntext field test contains "ted, bob, test, sue, ted".
I want to count the occurrences of "ted" in test.

I've looked through the various answers provided here re: counting substrings and haven't been successful in modifying them to work for this purpose.
Question by:slinman2
4 Comments
 
LVL 9

Expert Comment

by:paelo
ID: 11813875
The major problem as I see it is working with the text field.  I think there is some way of chunking it up and parsing the entire thing but anything I can conceive would be terribly inefficient.

If it's possible to store the text in a varchar(8000) then you can do something like this:

--declarations
DECLARE @X varchar(30),
 @N varchar(8000)

--set find value, convert text field (anything past 8000 characters is truncated)
--yourtable is a placeholder name; add a WHERE clause to pick a single row
SELECT @X='Ted', @N=CONVERT(varchar(8000),textfld)
FROM yourtable

--print number of occurrences of that string (including instances within a word, i.e. this will count the TED in faTED)
PRINT (LEN(@N)-LEN(REPLACE(@N,@X,'')))/LEN(@X)


-Paul.
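
As a quick check of the length-difference trick, here is a small sketch that runs it against the sample string from the question, held in a local variable. The variable names @text and @find are made up for illustration, and the switch from LEN to DATALENGTH is an assumption on my part rather than part of Paul's post: LEN ignores trailing spaces, so counting bytes is safer when the search string or the stripped text ends in blanks.

```sql
-- Illustration only: @text and @find are hypothetical names
DECLARE @text nvarchar(4000), @find nvarchar(30)
SELECT @text = N'ted, bob, test, sue, ted', @find = N'ted'

-- DATALENGTH counts bytes (2 per nvarchar character) and includes trailing spaces.
-- 24 characters before, 18 after both copies of 'ted' are removed: (48 - 36) / 6 = 2
SELECT (DATALENGTH(@text) - DATALENGTH(REPLACE(@text, @find, N''))) / DATALENGTH(@find) AS Occurs
```

REPLACE removes non-overlapping matches from left to right, so this counts non-overlapping occurrences only.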
 
LVL 15

Expert Comment

by:jdlambert1
ID: 11813883
The only ways I know of to do this are to use a cursor in a Stored Procedure, or to create the logic in an Extended Stored Procedure.

In an SP, create two counter variables, one to track the starting position in the string and the other to count the hits. Plug the first one into SubString's starting position and loop (with a cursor) until you get to the end of the string, then store the total and go to the next string.
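
A rough sketch of that two-counter idea for a single row, using a plain WHILE loop rather than a cursor. It assumes a table t(id int, test ntext) with unique id values, like the test table in the accepted solution in this thread; the variable names are hypothetical. It issues one SUBSTRING per character position, so expect it to be slow on long text.

```sql
-- Sketch only: assumes table t(id int, test ntext) with unique id values
DECLARE @search nvarchar(10), @id int, @pos int, @hits int, @len int
SELECT @search = N'ted', @id = 1, @pos = 1, @hits = 0

-- ntext stores 2 bytes per character, so DATALENGTH / 2 is the length in characters
SELECT @len = DATALENGTH(test) / 2 FROM t WHERE id = @id

-- @pos tracks the starting position, @hits counts the matches
WHILE @pos <= @len - LEN(@search) + 1
BEGIN
      IF (SELECT SUBSTRING(test, @pos, LEN(@search)) FROM t WHERE id = @id) = @search
            SET @hits = @hits + 1   -- overlapping matches are each counted
      SET @pos = @pos + 1
END

PRINT @hits
```

To cover every row, the same loop could be wrapped in a cursor over t.id, as described above.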
 
LVL 12

Accepted Solution

by:
kselvia earned 125 total points
ID: 11814386
--This is one way to do it. It is slow, but probably faster than a cursor/loop
--You need a table of numbers. Run this once to create one;

SELECT TOP 50000 identity(int,1,1) as ID
INTO Numbers
FROM sysobjects s1, sysobjects s2, sysobjects s3

--Some test data
create table t (id int, test ntext )

--Generate 2 ntext entries
insert t (id, test) select 1, 'ted, bob, test, sue, ted, '  -- ted occurs twice
insert t (id, test) select 2, 'jim, kevin, joe, ted, bob, ' -- ted occurs once

DECLARE @ptrval binary(16)
DECLARE @repeat int

-- Generate 1000 duplications of initial data for row 1
SELECT @ptrval = TEXTPTR(test) , @repeat = 0
FROM t
WHERE id = 1
WHILE @repeat < 1000
BEGIN
      UPDATETEXT t.test @ptrval 0 0 'ted, bob, test, sue, ted, '
      SET @repeat = @repeat + 1
END

-- Generate 1000 duplications of initial data for row 2
SELECT @ptrval = TEXTPTR(test) , @repeat = 0
FROM t
WHERE id = 2
WHILE @repeat < 1000
BEGIN
      UPDATETEXT t.test @ptrval 0 0 'jim, kevin, joe, ted, bob, '
      SET @repeat = @repeat + 1
END

-- Sample - 2 rows of text data
SELECT * FROM t
id          test
----------- ---------------------------------------------------...
1           ted, bob, test, sue, ted, ted, bob, test, sue, ted,...
2           jim, kevin, joe, ted, bob, jim, kevin, joe, ted, bob...

--Count the number of times 'ted,' occurs.

DECLARE @search varchar(10)
SET @search = 'ted,'

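-- Numbers needs at least as many rows as the longest text has characters
SELECT tid TextRow, count(1) Occurs , @search SearchString
FROM (
      SELECT
      t.id tid, n.id nid, substring(test, n.id, Len(@search)) part
      FROM t, Numbers n
      -- ntext stores 2 bytes per character, so datalength / 2 is the length in characters
      WHERE n.ID <= datalength(test) / 2
      -- keep only the start positions where the search string is found
      AND substring(test, n.id, Len(@search)) = @search
) Lookup
GROUP BY tid
ORDER BY tid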

TextRow     Occurs      SearchString
----------- ----------- ------------
1           2002        ted,
2           1001        ted,



--P.S. I tried to do this more efficiently by breaking the text into 4000-character strings, but if the
--text broke in the middle of 'ted' that occurrence was not matched. The version below could be made to break on
--word separators (comma or blank), but that will take more work, and the version above already solves the problem.

SELECT tid TextRow, sum(Occurs) Occurs
FROM (
      SELECT tid, (Len(part) - Len (Replace(part,'ted',''))) / len ('ted') Occurs
      FROM  (
            SELECT TOP 100 PERCENT
            t.id tid, n.id nid, datalength(test) dl, substring(test, (n.id -1) * 4000 + 1 , 4000) part
            FROM t, Numbers n
            WHERE (n.ID -1) * 4000 + 1 < datalength(test)
            ORDER BY t.ID
            ) CountOccurs
      ) Lookup
GROUP BY tid

TextRow     Occurs      
----------- -----------
1           2000            <-- 2 short of 2002 because 'ted' straddled a 4000-character chunk boundary
2           1001
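
One possible way to keep the chunked version from losing matches at chunk boundaries, sketched under the assumption that the t and Numbers tables above exist: start each chunk LEN(@search) - 1 characters before the previous one ends. Every match then lies wholly inside exactly one chunk, so nothing is missed or counted twice. The names @chunk and @step are made up for this sketch, and DATALENGTH replaces LEN so that trailing spaces in a chunk do not skew the arithmetic.

```sql
-- Sketch only: overlapping chunks; assumes tables t and Numbers from above
DECLARE @search nvarchar(10), @chunk int, @step int
SET @search = N'ted'
SET @chunk = 4000                            -- characters read per chunk
SET @step = @chunk - (LEN(@search) - 1)      -- chunks overlap by LEN(@search) - 1 characters

SELECT tid AS TextRow, SUM(Occurs) AS Occurs
FROM (
      -- bytes removed by REPLACE divided by bytes per match = matches in this chunk
      SELECT tid,
             (DATALENGTH(part) - DATALENGTH(REPLACE(part, @search, N''))) / DATALENGTH(@search) AS Occurs
      FROM (
            SELECT t.id AS tid,
                   SUBSTRING(t.test, (n.ID - 1) * @step + 1, @chunk) AS part
            FROM t, Numbers n
            -- only chunks that start inside the text (DATALENGTH / 2 = length in characters)
            WHERE (n.ID - 1) * @step + 1 <= DATALENGTH(t.test) / 2
           ) Chunks
     ) Lookup
GROUP BY tid
ORDER BY tid
```

On the test data above this should agree with the per-position query (2002 and 1001). Because REPLACE only removes non-overlapping matches, a search string that can overlap itself (for example 'aa') would be counted differently than in the per-position query.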

 

Author Comment

by:slinman2
ID: 11854922
Thanks for all the responses.  I took kselvia's approach and it worked out great.  Thanks.