We help IT Professionals succeed at work.

need SQL code to extract hyperlink URL and clickable text from nvarchar field populated with html

bbraman
bbraman asked
on
I have a table of product descriptions (details) where most every record includes some html formatting and one or more hyperlinks.  I need to extract from these product descriptions a list of all the hyperlink URLs contained in all of the product description records in my table, and as a separate result set column, the clickable text associated with each hyperlink URL.

A good answer to this questions will be a SQL statement executable against SQL Server 2005 that generates a list of URLs of <a href> style hyperlinks contained in all the records of the product descriptions (details), and as a separate field in the result, the clickable text that activates each URL.  Note that some records in my table contain no html formatting and some records will contain one or more hyperlinks whose URL may be enclosed in single, <a href=' '>, or double <a href=" ">, quotes.

example table:
create table productTable
(
  details nvarchar(2650)
)
go
INSERT productTable VALUES ('<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>') 
INSERT productTable VALUES ('<b>test product 2</b><br><i>example text</i>test<a href=''/cp/pc/product?product=2''>click here</a>example text: example text<h2>example text</h2>')
INSERT productTable VALUES ('test product 3 (example text)<br><br><i>example text</i>test<br><a href='' domain.com/cp/pc/product?product=3''>click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2>') 

INSERT productTable VALUES ('<b>test product 4 (example text)</b><br><br><i>example text</i>test<br>example text: example text<br><h2>example text</h2>')
INSERT productTable VALUES ('test product 4 (example text)<br><br><i>example text</i>test<br><a href=''/cp/product?product=2010''>temp</a>example text: example text<br><a href="domain.com/cp/pc/product?product=3">click here</a><h2>example text</h2>') 
 
go

Open in new window


result set:
cp/product?product=2010                                                      example text
/cp/pc/product?product=2                                                     click here
domain.com/cp/pc/product?product=3                                           click here
http://www.domain.com/pc/Search?keyWord=test&type=1&                         click here
product?product=2010                                                         temp

Open in new window


Thank You for considering my question.
Comment
Watch Question

Commented:
this should get you started, I have to take off but I can look more at it tomrorow.  The variable is just an example of your data set.

DECLARE @TESTVariable as varchar(5000)
Set @TESTVariable = '<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>'


select SUBSTRING(@TESTVariable, PATINDEX('%<a href="%', @TESTVariable), PATINDEX('%</a>%', @TESTVariable) - PATINDEX('%<a href="%', @TESTVariable) + 4)

Author

Commented:
Thanks for an initial response and agreeing to look at it more closely tomorrow.  A couple of notes which you are probably already aware of, when I altered your testvariable to contain 2 hyperlinks, instead of one, the result set only contained the first hyperlink.  Also when I changed testvariable so that the URL was enclosed in single quotes and then executed your select statement I got both an 'Incorrect syntax near '/'.' error and a 'The label 'http' has already been declared. Label names must be unique within a query batch or stored procedure.' error.

Commented:
Is there always going to be 2 links in each one or can it be 1 or 2 but no more then 2?

Commented:
Ok have a bit further but you will have to expand it to use all possible iterations of how the URL's are created in the table, like you have <a href="   and <a href=''  and so on.

Still have to work on taking out the URL and the text from the table I created but let me know if this is looking good for what you want and I will work further on it (and also the answer to my question above.

DECLARE @productTable table (
  details nvarchar(4000)
)

INSERT INTO @productTable VALUES ('<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>')
INSERT INTO @productTable VALUES ('<b>test product 2</b><br><i>example text</i>test<a href=''/cp/pc/product?product=2''>click here</a>example text: example text<h2>example text</h2>')
INSERT INTO @productTable VALUES ('test product 3 (example text)<br><br><i>example text</i>test<br><a href='' domain.com/cp/pc/product?product=3''>click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2>')
INSERT INTO @productTable VALUES ('<b>test product 4 (example text)</b><br><br><i>example text</i>test<br>example text: example text<br><h2>example text</h2>')
INSERT INTO @productTable VALUES ('test product 4 (example text)<br><br><i>example text</i>test<br><a href=''/cp/product?product=2010''>temp</a>example text: example text<br><a href="domain.com/cp/pc/product?product=3">click here</a><h2>example text</h2>')


---------
DECLARE @OutputTable table (
  URL nvarchar(4000),
  URLText nvarchar(4000)
)



-- this finds the first instance of using "<a href="" and "</a>"
Insert INto @OutputTable
select REVERSE(SUBSTRING(REVERSE(details),  PATINDEX(Reverse('%</a>%'), REVERSE(details)), PATINDEX(Reverse('%<a href="%'), REVERSE(details))-PATINDEX(Reverse('%</a>%'), REVERSE(details))+9))
,NULL
FROM @productTable
WHERE (PATINDEX(Reverse('%</a>%'), REVERSE(details)) > 0)
AND (PATINDEX(Reverse('%<a href="%'), REVERSE(details)) > 0)
AND ((PATINDEX(Reverse('%<a href="%'), REVERSE(details))-PATINDEX(Reverse('%</a>%'), REVERSE(details))+9) > 0)


-- this finds the second instance of using "<a href="" and "</a>"
Insert INto @OutputTable
select SUBSTRING(details, PATINDEX('%<a href="%', details), PATINDEX('%</a>%', details) - PATINDEX('%<a href="%', details) + 4)
,NULL
FROM @productTable
WHERE (PATINDEX('%</a>%', details) > 0)
AND (PATINDEX('%<a href="%', details) > 0)
AND ((PATINDEX('%</a>%', details) - PATINDEX('%<a href="%', details) + 4) > 0)



select URL
From @OutputTable
Group By URL

Author

Commented:
I ran your code and it looks like you are on the right track, though as you are aware it is only returning the first link in each record.  To answer your question, there can be quite a few links in each record, and a few records will contain no links.

Let's add this record to your table to be a better example of a record with multiple links:

INSERT INTO @productTable VALUES ('test product 5 (example text)<br><br><i>example text</i>test<br><a href="domain.com/cp/pc/product?product=3">click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2><br><a href="domain.com/cp/pc/product?product=3">click  here</a>')

Thank you.

Commented:
If there can be more then 2 I will have to re-write the code to do it differently.

Is this going to be a one time script or something you have to run multiple times/on a consistant basis?

Also can you use a script with cursors/loops/etc or does it have to be a single query (because I dont think you will be able to have a single query do what you want with an unknown number of links in each field.



Commented:
I think this will do what you want, more complicated then the plain select but more accurate and more customizable:


DECLARE @productTable table (
	productID int,
	details nvarchar(4000)
)

INSERT INTO @productTable VALUES (1, '<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>') 
INSERT INTO @productTable VALUES (2, '<b>test product 2</b><br><i>example text</i>test<a href=''/cp/pc/product?product=2''>click here</a>example text: example text<h2>example text</h2>')
INSERT INTO @productTable VALUES (3, 'test product 3 (example text)<br><br><i>example text</i>test<br><a href='' domain.com/cp/pc/product?product=3''>click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (4, '<b>test product 4 (example text)</b><br><br><i>example text</i>test<br>example text: example text<br><h2>example text</h2>')
INSERT INTO @productTable VALUES (5, 'test product 4 (example text)<br><br><i>example text</i>test<br><a href=''/cp/product?product=2010''>temp</a>example text: example text<br><a href="domain.com/cp/pc/product?product=3">click here</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (6, 'test product 5 (example text)<br><br><i>example text</i>test<br><a href="domain.com/cp/pc/product?product=3">click  here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2><br><a href="domain.com/cp/pc/product?product=3">click   here</a>')


---------
DECLARE @MyCursor as cursor
DECLARE @ProductID as Int
DECLARE @Details as nvarchar(4000)
DECLARE @FindCharacterFirst as varchar(50)
DECLARE @FindCharacterSecond as varchar(50)

Set @FindCharacterFirst = '<a href='
Set @FindCharacterSecond = '</a>'


-- table to hold output data
DECLARE @OutputTable table (
	ProductID int, 
	URL nvarchar(4000),
	URLText nvarchar(4000)
)



-- fastforward so you can only move the curser forward and can not update/edit the curser data
SET @MyCursor = CURSOR FAST_FORWARD
FOR

SELECT ProductID, details
FROM @productTable


-- puts the data into the curser for the above select 
OPEN @MyCursor
FETCH NEXT FROM @MyCursor
INTO @ProductID, @Details

-- loops through all the recrods retruend
WHILE @@FETCH_STATUS = 0
BEGIN
	-- ******** do loop stuff here	****************************************************************

		DECLARE @LookupTypeList NVARCHAR(4000)
		DECLARE @LookupTypeSingle NVARCHAR(4000)
		DECLARE @Count as int
		Set @Count = 0


		-- this is the value you want to get the value from
		SET @LookupTypeList = @Details

		-- this will be the result field you want
		DECLARE @ValueToGet as nvarchar(4000)

		-- loop through the array of values
		WHILE CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0  BEGIN --LEN( @LookupTypeList ) > 0 BEGIN 
			-- *********************************************************
			-- ********** CODE TO Get Each Array Item value ************
			-- *********************************************************
				-- check to see if there are more then 1 value left
				IF CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0 				
					SELECT @LookupTypeSingle = SubString(@LookupTypeList, CHARINDEX(@FindCharacterFirst, @LookupTypeList), CHARINDEX(@FindCharacterSecond, @LookupTypeList)-CHARINDEX(@FindCharacterFirst, @LookupTypeList) + LEN(@FindCharacterSecond)),
						@LookupTypeList = RIGHT(@LookupTypeList, LEN(@LookupTypeList) - CHARINDEX(@FindCharacterSecond, @LookupTypeList) ) 
				ELSE
					-- there is only 1 value left so set it to the final item in the list
					SELECT @LookupTypeSingle = @LookupTypeList, @LookupTypeList = SPACE(0)		
			-- *********************************************************
			-- ********** END CODE TO Get Each Array Item value ********
			-- *********************************************************			
			
			Set @count = @count + 1

			-- save data to putput table
			Insert Into @OutputTable
			Select @ProductID, @LookupTypeSingle, SUBSTRING(@LookupTypeSingle, PATINDEX('%>%', @LookupTypeSingle)+ 1, LEN(@LookupTypeSingle) - PATINDEX('%>%', @LookupTypeSingle)-4) 
			
			
		END -- end of while loop
		
	-- *********** end loop stuff ****************************************************************
	-- gets the next recrods in the curser
	FETCH NEXT FROM @MyCursor
	INTO @ProductID, @Details
END

-- close curser 
CLOSE @MyCursor
DEALLOCATE @MyCursor




Select *
From @OutputTable

Open in new window

Commented:
Ok tweaked it to strip off the A href stuff from the URL as well.
DECLARE @productTable table (
	productID int,
	details nvarchar(4000)
)

INSERT INTO @productTable VALUES (1, '<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>') 
INSERT INTO @productTable VALUES (2, '<b>test product 2</b><br><i>example text</i>test<a href=''/cp/pc/product?product=2''>click here</a>example text: example text<h2>example text</h2>')
INSERT INTO @productTable VALUES (3, 'test product 3 (example text)<br><br><i>example text</i>test<br><a href='' domain.com/cp/pc/product?product=3''>click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (4, '<b>test product 4 (example text)</b><br><br><i>example text</i>test<br>example text: example text<br><h2>example text</h2>')
INSERT INTO @productTable VALUES (5, 'test product 4 (example text)<br><br><i>example text</i>test<br><a href=''/cp/product?product=2010''>temp</a>example text: example text<br><a href="domain.com/cp/pc/product?product=3">click here</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (6, 'test product 5 (example text)<br><br><i>example text</i>test<br><a href="domain.com/cp/pc/product?product=3">click  here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2><br><a href="domain.com/cp/pc/product?product=3">click   here</a>')


---------
DECLARE @MyCursor as cursor
DECLARE @ProductID as Int
DECLARE @Details as nvarchar(4000)
DECLARE @FindCharacterFirst as varchar(50)
DECLARE @FindCharacterSecond as varchar(50)

Set @FindCharacterFirst = '<a href='
Set @FindCharacterSecond = '</a>'


-- table to hold output data
DECLARE @OutputTable table (
	ProductID int, 
	URL nvarchar(4000),
	URLText nvarchar(4000),
	URLONLY nvarchar(4000)
)



-- fastforward so you can only move the curser forward and can not update/edit the curser data
SET @MyCursor = CURSOR FAST_FORWARD
FOR

SELECT ProductID, details
FROM @productTable


-- puts the data into the curser for the above select 
OPEN @MyCursor
FETCH NEXT FROM @MyCursor
INTO @ProductID, @Details

-- loops through all the recrods retruend
WHILE @@FETCH_STATUS = 0
BEGIN
	-- ******** do loop stuff here	****************************************************************

		DECLARE @LookupTypeList NVARCHAR(4000)
		DECLARE @LookupTypeSingle NVARCHAR(4000)
		DECLARE @Count as int
		Set @Count = 0


		-- this is the value you want to get the value from
		SET @LookupTypeList = @Details

		-- this will be the result field you want
		DECLARE @ValueToGet as nvarchar(4000)

		-- loop through the array of values
		WHILE CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0  BEGIN --LEN( @LookupTypeList ) > 0 BEGIN 
			-- *********************************************************
			-- ********** CODE TO Get Each Array Item value ************
			-- *********************************************************
				-- check to see if there are more then 1 value left
				IF CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0 				
					SELECT @LookupTypeSingle = SubString(@LookupTypeList, CHARINDEX(@FindCharacterFirst, @LookupTypeList), CHARINDEX(@FindCharacterSecond, @LookupTypeList)-CHARINDEX(@FindCharacterFirst, @LookupTypeList) + LEN(@FindCharacterSecond)),
						@LookupTypeList = RIGHT(@LookupTypeList, LEN(@LookupTypeList) - CHARINDEX(@FindCharacterSecond, @LookupTypeList) ) 
				ELSE
					-- there is only 1 value left so set it to the final item in the list
					SELECT @LookupTypeSingle = @LookupTypeList, @LookupTypeList = SPACE(0)		
			-- *********************************************************
			-- ********** END CODE TO Get Each Array Item value ********
			-- *********************************************************			
			
			Set @count = @count + 1

			-- save data to putput table
			Insert Into @OutputTable
			Select @ProductID, @LookupTypeSingle, 
			SUBSTRING(@LookupTypeSingle, PATINDEX('%>%', @LookupTypeSingle)+ 1, LEN(@LookupTypeSingle) - PATINDEX('%>%', @LookupTypeSingle)-4) as URL,
			SUBSTRING(@LookupTypeSingle, 10, PATINDEX('%>%', @LookupTypeSingle)-11)  as URLONLY
			
			
		END -- end of while loop
		
	-- *********** end loop stuff ****************************************************************
	-- gets the next recrods in the curser
	FETCH NEXT FROM @MyCursor
	INTO @ProductID, @Details
END

-- close curser 
CLOSE @MyCursor
DEALLOCATE @MyCursor




Select *
From @OutputTable

Open in new window

Author

Commented:
Looking good, but I am not sure how to modify your code so that I can test it against real data in my database.  The production table is 'products' and the field that contains the product descriptions is called 'details'.  Thanks.

Commented:
You would just get rid of the code that inserts into the testing table you created and alter this line here for the select to select from your table instead of the test one:

SELECT ProductID, details
FROM @productTable


Select SomeIDOnTable, details
From Products

Author

Commented:
OK, so I modified it as you suggested and ran it against a copy of my real data on an unused server and it executed for 136 minutes, It resulted in a bunch of (1 row(s) affected) messages and the message 'query completed with errors'.  Also my C drive only has 2mb left, which appears to be due to a ballooned tempdb.mdf file (15.7GB).

I don't have any experience dealing with heavy duty SQL like this, but if we take out the @ sign in front of the OutputTable will it create an actual table instead of a temp table.  If it is going to take a couple of hours for each execution it would probably be best to have an actual table then I could just use your code to rebuild it as needed.  If we used an actual table would it need to use tempdb.mdf?

It strikes me as odd that SQL server would need 15.7GB and 2+ hours to run this so let me provide some details about this data.  My table contains 2161 record, and if I export this fields 2161 records, it creates a text file that is just under 1MB.  Not an especially large amount of data.

Thanks for staying with me on this.

Commented:
You can create a real table to hold the data instead of a variable table and that should remove it from the tempdb.

Once you do that to see if it is working you can change this:

Select SomeIDOnTable, details
From Products

to just so you will only test on a couple records in the database first to see if/how it is working out for you.
Select top 10 SomeIDOnTable, details
From Products

Commented:
Ok I tweaked it a bit and put some comments in it on a few spots you can tweak it more:


DECLARE @productTable table (
	productID int,
	details nvarchar(4000)
)

INSERT INTO @productTable VALUES (1, '<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>') 
INSERT INTO @productTable VALUES (2, '<b>test product 2</b><br><i>example text</i>test<a href=''/cp/pc/product?product=2''>click here</a>example text: example text<h2>example text</h2>')
INSERT INTO @productTable VALUES (3, 'test product 3 (example text)<br><br><i>example text</i>test<br><a href='' domain.com/cp/pc/product?product=3''>click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (4, '<b>test product 4 (example text)</b><br><br><i>example text</i>test<br>example text: example text<br><h2>example text</h2>')
INSERT INTO @productTable VALUES (5, 'test product 4 (example text)<br><br><i>example text</i>test<br><a href=''/cp/product?product=2010''>temp</a>example text: example text<br><a href="domain.com/cp/pc/product?product=3">click here</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (6, 'test product 5 (example text)<br><br><i>example text</i>test<br><a href="domain.com/cp/pc/product?product=3">click  here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2><br><a href="domain.com/cp/pc/product?product=3">click   here</a>')


---------
DECLARE @MyCursor as cursor
DECLARE @ProductID as Int
DECLARE @Details as nvarchar(4000)
DECLARE @FindCharacterFirst as nvarchar(50)
DECLARE @FindCharacterSecond as nvarchar(50)

-- used in looping below
DECLARE @LookupTypeList NVARCHAR(4000)
DECLARE @LookupTypeSingle NVARCHAR(4000)

Set @FindCharacterFirst = '<a href='
Set @FindCharacterSecond = '</a>'


-- table to hold output data
-- you can change this to be a permenant table in your database vs using a variable, 
-- if you create the table ahead of time you wont need to do this but would want to do a truncate TableNameHere  to make sure it is empty.
DECLARE @OutputTable table (
	ProductID int, 
	URL nvarchar(4000),
	URLText nvarchar(4000),
	URLONLY nvarchar(4000)
)

-- fastforward so you can only move the curser forward and can not update/edit the curser data
SET @MyCursor = CURSOR FAST_FORWARD
FOR

SELECT ProductID, details
FROM @productTable


-- puts the data into the curser for the above select 
OPEN @MyCursor
FETCH NEXT FROM @MyCursor
INTO @ProductID, @LookupTypeList -- this is the @Details item

-- loops through all the recrods retruend
WHILE @@FETCH_STATUS = 0
BEGIN
	-- ******** do loop stuff here	****************************************************************
					
		-- loop through the array of values
		WHILE CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0  
		BEGIN
			-- ********** CODE TO Get Each Array Item value ************
			
			SELECT @LookupTypeSingle = SubString(@LookupTypeList, CHARINDEX(@FindCharacterFirst, @LookupTypeList), CHARINDEX(@FindCharacterSecond, @LookupTypeList)-CHARINDEX(@FindCharacterFirst, @LookupTypeList) + LEN(@FindCharacterSecond)),
				@LookupTypeList = RIGHT(@LookupTypeList, LEN(@LookupTypeList) - CHARINDEX(@FindCharacterSecond, @LookupTypeList) ) 
			-- ********** END CODE TO Get Each Array Item value ********
			
			-- save data to oputput table
			Insert Into @OutputTable
			Select @ProductID, @LookupTypeSingle, 
			--  instead of doing this here (the below 2 lines) you can just save the values in @ProductID, @LookupTypeSingle  into 
			--		your table and have it all saved then run just the select below on that table once the URL with the HTML is output to that table
			--		it will just be an extra step but should reduce time and speed things up.  You just have to change @LookupTypeSingle to your tabvle field name 
			SUBSTRING(@LookupTypeSingle, PATINDEX('%>%', @LookupTypeSingle)+ 1, LEN(@LookupTypeSingle) - PATINDEX('%>%', @LookupTypeSingle)-4) as URL,
			SUBSTRING(@LookupTypeSingle, 10, PATINDEX('%>%', @LookupTypeSingle)-11)  as URLONLY
			
			
		END -- end of while loop
		
	-- *********** end loop stuff ****************************************************************
	-- gets the next recrods in the curser
	FETCH NEXT FROM @MyCursor
	INTO @ProductID, @LookupTypeList
END

CLOSE @MyCursor
DEALLOCATE @MyCursor


Select *
From @OutputTable

Open in new window

Author

Commented:
So after outputting the results out to a real table I was able to see that we are getting into an infinite while loop scenario due to some malformed hyperlinks in my table (missing closing tags, closing tag without opening tag, etc).  After explicitly excluding those records your SQL ran against all my data in less than 2 seconds.

This data is constantly being modified, so to prevent a infinite loop scenario from tripping the code up in the future please alter the code so it exits the loop after say 100 iterations.  Being able to handle 100 or less hyperlinks per product record should be plenty sufficient.

Thank you for putting all this together for me.

Commented:
Instead of putting a limit on the loops I am making sure that the string it is searching has "<a href="  AND "</a>".  So it would require both to keep looping if it does not it should skip those links.  Try this with the bad data records and let me know if this works for you.  If it does it will save unnecessary looping by just looping 100 times no matter what for those bad data records.
DECLARE @productTable table (
	productID int,
	details nvarchar(4000)
)

INSERT INTO @productTable VALUES (1, '<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>') 
INSERT INTO @productTable VALUES (2, '<b>test product 2</b><br><i>example text</i>test<a href=''/cp/pc/product?product=2''>click here</a>example text: example text<h2>example text</h2>')
INSERT INTO @productTable VALUES (3, 'test product 3 (example text)<br><br><i>example text</i>test<br><a href='' domain.com/cp/pc/product?product=3''>click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (4, '<b>test product 4 (example text)</b><br><br><i>example text</i>test<br>example text: example text<br><h2>example text</h2>')
INSERT INTO @productTable VALUES (5, 'test product 4 (example text)<br><br><i>example text</i>test<br><a href=''/cp/product?product=2010''>temp</a>example text: example text<br><a href="domain.com/cp/pc/product?product=3">click here</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (6, 'test product 5 (example text)<br><br><i>example text</i>test<br><a href="domain.com/cp/pc/product?product=3">click  here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2><br><a href="domain.com/cp/pc/product?product=3">click   here</a>')


---------
DECLARE @MyCursor as cursor
DECLARE @ProductID as Int
DECLARE @FindCharacterFirst as nvarchar(50)
DECLARE @FindCharacterSecond as nvarchar(50)

-- used in looping below
DECLARE @LookupTypeList NVARCHAR(4000)
DECLARE @LookupTypeSingle NVARCHAR(4000)

Set @FindCharacterFirst = '<a href='
Set @FindCharacterSecond = '</a>'


-- table to hold output data
-- you can change this to be a permenant table in your database vs using a variable, 
-- if you create the table ahead of time you wont need to do this but would want to do a truncate TableNameHere  to make sure it is empty.
DECLARE @OutputTable table (
	ProductID int, 
	URL nvarchar(4000),
	URLText nvarchar(4000),
	URLONLY nvarchar(4000)
)

-- fastforward so you can only move the curser forward and can not update/edit the curser data
SET @MyCursor = CURSOR FAST_FORWARD
FOR

SELECT ProductID, details
FROM @productTable


-- puts the data into the curser for the above select 
OPEN @MyCursor
FETCH NEXT FROM @MyCursor
INTO @ProductID, @LookupTypeList -- this is the @Details item

-- loops through all the recrods retruend
WHILE @@FETCH_STATUS = 0
BEGIN
	-- ******** do loop stuff here	****************************************************************
					
		-- loop through the array of values
		WHILE CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0 AND CHARINDEX(@FindCharacterSecond, @LookupTypeList ) > 0  
		BEGIN
			-- ********** CODE TO Get Each Array Item value ************
			
			SELECT @LookupTypeSingle = SubString(@LookupTypeList, CHARINDEX(@FindCharacterFirst, @LookupTypeList), CHARINDEX(@FindCharacterSecond, @LookupTypeList)-CHARINDEX(@FindCharacterFirst, @LookupTypeList) + LEN(@FindCharacterSecond)),
				@LookupTypeList = RIGHT(@LookupTypeList, LEN(@LookupTypeList) - CHARINDEX(@FindCharacterSecond, @LookupTypeList) ) 
			-- ********** END CODE TO Get Each Array Item value ********
			
			-- save data to oputput table
			Insert Into @OutputTable
			Select @ProductID, @LookupTypeSingle, 
			--  instead of doing this here (the below 2 lines) you can just save the values in @ProductID, @LookupTypeSingle  into 
			--		your table and have it all saved then run just the select below on that table once the URL with the HTML is output to that table
			--		it will just be an extra step but should reduce time and speed things up.  You just have to change @LookupTypeSingle to your tabvle field name 
			SUBSTRING(@LookupTypeSingle, PATINDEX('%>%', @LookupTypeSingle)+ 1, LEN(@LookupTypeSingle) - PATINDEX('%>%', @LookupTypeSingle)-4) as URL,
			SUBSTRING(@LookupTypeSingle, 10, PATINDEX('%>%', @LookupTypeSingle)-11)  as URLONLY
			
			
		END -- end of while loop
		
	-- *********** end loop stuff ****************************************************************
	-- gets the next recrods in the curser
	FETCH NEXT FROM @MyCursor
	INTO @ProductID, @LookupTypeList
END

CLOSE @MyCursor
DEALLOCATE @MyCursor


Select *
From @OutputTable

Open in new window

Author

Commented:
I haven't had a chance to test your latest code against my data, but I did want to mention that the nice thing about the infinite loop :-) was that it let me know that I had a malformed hyperlink (missing closing tags, closing tag without opening tag, etc) in my data so that I could get it fixed.  I suppose the cleanest way to flag that a record may contain a malformed hyperlink would be If you reverted back to the code you posted in comment 32996956, and added additional code to detect that an infinite loop was happening and then recorded that out as a TRUE value to a new LinkError field in the output table for the effected record.  Either this or limiting the loop to 100 iterations works for me.  Thank You.
Commented:
Try this, it should set the variable to ERROR in the table if it is not formatted correctly.
DECLARE @productTable table (
	productID int,
	details nvarchar(4000)
)

INSERT INTO @productTable VALUES (1, '<b>test product 1</b><br><i>example text</i>test<a href="http://www.domain.com/pc/Search?keyWord=test&type=1&">click here</a>example text: example text<h2>example text</h2>') 
INSERT INTO @productTable VALUES (2, '<b>test product 2</b><br><i>example text</i>test<a href=''/cp/pc/product?product=2''>click here</a>example text: example text<h2>example text</h2>')
INSERT INTO @productTable VALUES (3, 'test product 3 (example text)<br><br><i>example text</i>test<br><a href='' domain.com/cp/pc/product?product=3''>click here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (4, '<b>test product 4 (example text)</b><br><br><i>example text</i>test<br>example text: example text<br><h2>example text</h2>')
INSERT INTO @productTable VALUES (5, 'test product 4 (example text)<br><br><i>example text</i>test<br><a href=''/cp/product?product=2010''>temp</a>example text: example text<br><a href="domain.com/cp/pc/product?product=3">click here</a><h2>example text</h2>') 
INSERT INTO @productTable VALUES (6, 'test product 5 (example text)<br><br><i>example text</i>test<br><a href="domain.com/cp/pc/product?product=3">click  here</a>example text: example text<br><a href="product?product=2010">temp</a><h2>example text</h2><br><a href="domain.com/cp/pc/product?product=3">click   here</a>')


---------
DECLARE @MyCursor as cursor
DECLARE @ProductID as Int
DECLARE @FindCharacterFirst as nvarchar(50)
DECLARE @FindCharacterSecond as nvarchar(50)

-- used in looping below
DECLARE @LookupTypeList NVARCHAR(4000)
DECLARE @LookupTypeSingle NVARCHAR(4000)

Set @FindCharacterFirst = '<a href='
Set @FindCharacterSecond = '</a>'


-- table to hold output data
-- you can change this to be a permenant table in your database vs using a variable, 
-- if you create the table ahead of time you wont need to do this but would want to do a truncate TableNameHere  to make sure it is empty.
DECLARE @OutputTable table (
	ProductID int, 
	URL nvarchar(4000),
	URLText nvarchar(4000),
	URLONLY nvarchar(4000)
)

-- fastforward so you can only move the curser forward and can not update/edit the curser data
SET @MyCursor = CURSOR FAST_FORWARD
FOR

SELECT ProductID, details
FROM @productTable


-- puts the data into the curser for the above select 
OPEN @MyCursor
FETCH NEXT FROM @MyCursor
INTO @ProductID, @LookupTypeList -- this is the @Details item

-- loops through all the recrods retruend
WHILE @@FETCH_STATUS = 0
BEGIN
	-- ******** do loop stuff here	****************************************************************
					
		-- loop through the array of values
		WHILE CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0 OR CHARINDEX(@FindCharacterSecond, @LookupTypeList ) > 0  
		BEGIN
			-- ********** CODE TO Get Each Array Item value ************
			-- this makes sure it finds the first characters you are searcing for and the second set you are looking for is AFTER the first set.
			IF CHARINDEX(@FindCharacterFirst, @LookupTypeList ) > 0 AND CHARINDEX(@FindCharacterSecond, @LookupTypeList ) > CHARINDEX(@FindCharacterFirst, @LookupTypeList )
				BEGIN 			
					SELECT @LookupTypeSingle = SubString(@LookupTypeList, CHARINDEX(@FindCharacterFirst, @LookupTypeList), CHARINDEX(@FindCharacterSecond, @LookupTypeList)-CHARINDEX(@FindCharacterFirst, @LookupTypeList) + LEN(@FindCharacterSecond)),
						@LookupTypeList = RIGHT(@LookupTypeList, LEN(@LookupTypeList) - CHARINDEX(@FindCharacterSecond, @LookupTypeList) ) 
					-- ********** END CODE TO Get Each Array Item value ********
					
				END 
			ELSE					
				BEGIN
					-- missing opening or closing searching for patterns so stop looping
					Set @LookupTypeSingle = 'ERROR'
					Set @LookupTypeList = ''
				END
					
			
			-- save data to oputput table
			Insert Into @OutputTable
			Select @ProductID, @LookupTypeSingle, 
			--  instead of doing this here (the below 2 lines) you can just save the values in @ProductID, @LookupTypeSingle  into 
			--		your table and have it all saved then run just the select below on that table once the URL with the HTML is output to that table
			--		it will just be an extra step but should reduce time and speed things up.  You just have to change @LookupTypeSingle to your tabvle field name 
			CASE WHEN PATINDEX('%>%', @LookupTypeSingle) > 0 THEN SUBSTRING(@LookupTypeSingle, PATINDEX('%>%', @LookupTypeSingle)+ 1, LEN(@LookupTypeSingle) - PATINDEX('%>%', @LookupTypeSingle)-4) END as URL,
			CASE WHEN PATINDEX('%>%', @LookupTypeSingle) > 0 THEN SUBSTRING(@LookupTypeSingle, 10, PATINDEX('%>%', @LookupTypeSingle)-11) END  as URLONLY
			
			
		END -- end of while loop
		
	-- *********** end loop stuff ****************************************************************
	-- gets the next recrods in the curser
	FETCH NEXT FROM @MyCursor
	INTO @ProductID, @LookupTypeList
END

CLOSE @MyCursor
DEALLOCATE @MyCursor


Select *
From @OutputTable

Open in new window