TEXTPTR? PATINDEX? READTEXT?

Need advice on the "best-approach" to accomplish ... this ...

TEXT var types in a MS SQL 8.0 database.
My TEXT fields can contain (literally) up to 2 gb each.

Using TSQL in a Stored Proc, I need to

"walk" through all records meeting a condition
and for each record
"walk" through every instance of the words "Hello World" appearing anywhere in the TEXT column
and for each instance ...
capture the 500 characters BEFORE "Hello World" and the 500 characters AFTER "Hello World"

So, if a single record's 2gb TEXT field has 10 instances of "Hello World", then I need each instance's 500chars+"Hello World"+500chars

I assume I'll start with a #tempTable, and use a cursor to walk the records ...
Finding 10 instances, I'll write 500chars+"Hello World"+500chars to the temp table .... 10 times ....

But ... Using TSQL ... how the heck do I walk the 2gb of each TEXT field to find the 10 instances?????

Advice is welcome.
Speed will be important ...

LVL 5
volkingAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

volkingAuthor Commented:
Also concerned about calculating 500+"Hello World"+500 when "Hello World" is the very-first or very-last words in the TEXT field.
0
Anthony PerkinsCommented:
Do yourself a big favor and look up Full-Text Search in BOL.  You are going about it the wrong way.
0
Anthony PerkinsCommented:
Full-Text Search will resolve the first part of your question.  You can then add all the rows to a Table variable and cycle through them using PATINDEX and SUBSTRING to get the values you need.  There is no need for TEXTPTR, let alone READTEXT.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

volkingAuthor Commented:
@acperkins - thanks for the advice ... how does it address the underlying problem?

Finding which particular records contain "Hello World" SOMEPLACE is rather easy. You're right, Full-Text search would work great.

HOWEVER ... let's assume the record is identified ... How to I locate ALL TEN INSTANCES of the words "Hello World" burried SOMEPLACE in a 2 gigabyte text bucket ... using TSQL?

Any Idea's?
0
volkingAuthor Commented:
@acperkins - remember that PATINDEX does not accept a Start-Look-At offset location.
0
volkingAuthor Commented:
@acperkins - Obviously, using a solution based in the Client is not feasible, since such would require over-the-wire transmissions of theoretically dozens-and-dozens of gigabytes from the server back to the client for additional sub-processing. A solution that works at-the-server is the only reasonable approach? Right?
0
Anthony PerkinsCommented:
>> how does it address the underlying problem?<<
Which one?  There was two:
1. "I assume I'll start with a #tempTable, and use a cursor to walk the records ..."
2. "Finding 10 instances, I'll write 500chars+"Hello World"+500chars to the temp table .... 10 times ...."

As I stated "Full-Text Search will resolve the first part of your question.  You can then add all the rows to a Table variable and cycle through them using PATINDEX and SUBSTRING to get the values you need. "

Assuming that you now have all the rows in a variable of type table with an IDENTITY column (ID) you can now cycle through all the rows by iterateing over the ID's and also cycling through all the values within the text column.  This second part you can use PATINDEX with SUBSTRING(YourTextColumn, @Pos, LEN(YourTextColumn)) or if you prefer CHARINDEX with @Pos. Where @Pos is the offset you are iterating.

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
volkingAuthor Commented:
declare @StartChar as bigint
set @StartChar = (select PATINDEX('%Hello World%', TextField) from table where key = 'value')
while @StartChar <> 0
BEGIN
    select substring(TextField, @StartChar, 1000)
    from table
    where key = 'value'

    set @StartChar = (select PatIndex('%Hello World%', Substring(TextField, @StartChar, 20000000000000000000)) from table where key = 'value')
END
0
volkingAuthor Commented:
Rats! Sometimes it's SO SIMPLE!
I've written your sample about 5 times.
Every time I wrote it, it failed!

BUT, I did not use a BIGINT
Changed my code from int to bigint.
Shazam!
Worked first time!

So close, yet so far!
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.