Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
?
Solved

TEXTPTR? PATINDEX? READTEXT?

Posted on 2006-03-26
9
Medium Priority
?
753 Views
Last Modified: 2008-01-09
Need advice on the "best-approach" to accomplish ... this ...

TEXT var types in a MS SQL 8.0 database.
My TEXT fields can contain (literally) up to 2 gb each.

Using TSQL in a Stored Proc, I need to

"walk" through all records meeting a condition
and for each record
"walk" through every instance of the words "Hello World" appearing anywhere in the TEXT column
and for each instance ...
capture the 500 characters BEFORE "Hello World" and the 500 characters AFTER "Hello World"

So, if a single record's 2gb TEXT field has 10 instances of "Hello World", then I need each instance's 500chars+"Hello World"+500chars

I assume I'll start with a #tempTable, and use a cursor to walk the records ...
Finding 10 instances, I'll write 500chars+"Hello World"+500chars to the temp table .... 10 times ....

But ... Using TSQL ... how the heck do I walk the 2gb of each TEXT field to find the 10 instances?????

Advice is welcome.
Speed will be important ...

0
Comment
Question by:volking
  • 6
  • 3
9 Comments
 
LVL 5

Author Comment

by:volking
ID: 16296014
Also concerned about calculating 500+"Hello World"+500 when "Hello World" is the very-first or very-last words in the TEXT field.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 16296829
Do yourself a big favor and look up Full-Text Search in BOL.  You are going about it the wrong way.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 16296840
Full-Text Search will resolve the first part of your question.  You can then add all the rows to a Table variable and cycle through them using PATINDEX and SUBSTRING to get the values you need.  There is no need for TEXTPTR, let alone READTEXT.
0
Configuration Guide and Best Practices

Read the guide to learn how to orchestrate Data ONTAP, create application-consistent backups and enable fast recovery from NetApp storage snapshots. Version 9.5 also contains performance and scalability enhancements to meet the needs of the largest enterprise environments.

 
LVL 5

Author Comment

by:volking
ID: 16304866
@acperkins - thanks for the advice ... how does it address the underlying problem?

Finding which particular records contain "Hello World" SOMEPLACE is rather easy. You're right, Full-Text search would work great.

HOWEVER ... let's assume the record is identified ... How to I locate ALL TEN INSTANCES of the words "Hello World" burried SOMEPLACE in a 2 gigabyte text bucket ... using TSQL?

Any Idea's?
0
 
LVL 5

Author Comment

by:volking
ID: 16304894
@acperkins - remember that PATINDEX does not accept a Start-Look-At offset location.
0
 
LVL 5

Author Comment

by:volking
ID: 16304967
@acperkins - Obviously, using a solution based in the Client is not feasible, since such would require over-the-wire transmissions of theoretically dozens-and-dozens of gigabytes from the server back to the client for additional sub-processing. A solution that works at-the-server is the only reasonable approach? Right?
0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 2000 total points
ID: 16307361
>> how does it address the underlying problem?<<
Which one?  There was two:
1. "I assume I'll start with a #tempTable, and use a cursor to walk the records ..."
2. "Finding 10 instances, I'll write 500chars+"Hello World"+500chars to the temp table .... 10 times ...."

As I stated "Full-Text Search will resolve the first part of your question.  You can then add all the rows to a Table variable and cycle through them using PATINDEX and SUBSTRING to get the values you need. "

Assuming that you now have all the rows in a variable of type table with an IDENTITY column (ID) you can now cycle through all the rows by iterateing over the ID's and also cycling through all the values within the text column.  This second part you can use PATINDEX with SUBSTRING(YourTextColumn, @Pos, LEN(YourTextColumn)) or if you prefer CHARINDEX with @Pos. Where @Pos is the offset you are iterating.

0
 
LVL 5

Author Comment

by:volking
ID: 16342232
declare @StartChar as bigint
set @StartChar = (select PATINDEX('%Hello World%', TextField) from table where key = 'value')
while @StartChar <> 0
BEGIN
    select substring(TextField, @StartChar, 1000)
    from table
    where key = 'value'

    set @StartChar = (select PatIndex('%Hello World%', Substring(TextField, @StartChar, 20000000000000000000)) from table where key = 'value')
END
0
 
LVL 5

Author Comment

by:volking
ID: 16342237
Rats! Sometimes it's SO SIMPLE!
I've written your sample about 5 times.
Every time I wrote it, it failed!

BUT, I did not use a BIGINT
Changed my code from int to bigint.
Shazam!
Worked first time!

So close, yet so far!
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In part one, we reviewed the prerequisites required for installing SQL Server vNext. In this part we will explore how to install Microsoft's SQL Server on Ubuntu 16.04.
This shares a stored procedure to retrieve permissions for a given user on the current database or across all databases on a server.
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.
Suggested Courses

580 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question