extract data from database table column

Hi,
Am using sqlserver2008.

I have a table t1 with columns A and B with data like column A having value 1 and column b
having value like the data which I attached here.

I would like to extract the data from this like starting with "http" and ending with "plv"
ie I would like to fetch the data from t1 like below

1  http://deheremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
1  http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv

In the attached file these two lines are there. Like this i have many columns from every column of "B" I need to fetch like this. It can be any number of links(data like this). That many times column A will repeat. Column "B" will be having data continuously. for clarity I gave space before "http" and after "plv" in the data attached.


Thanks.
a1.txt
GouthamAnandAsked:
Who is Participating?
 
jonnidipConnect With a Mentor Commented:
Look at this:
declare @myTable as table
(
	a varchar(10),
	b varchar(max)
)
insert into @myTable
select 'test1' as a,
'</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:http://heremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:http://heremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">' as b
insert into @myTable
select 'test2' as a,
'</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">' as b
insert into @myTable
select 'test3' as a,
'</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:http://remap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">http://emap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:http://remap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">http://emap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv' as b

-- I replace the string "http://" with "§http://", then I concatenate all results
-- into a single string.
declare @result as varchar(max)
select @result = isnull(@result, '') + isnull(replace(b, 'http://', '§http://'), '')
from 
(select b
from @myTable
where B like '%http://%' 
	and (B like '%.plv%' 
		or B like '%.dmn%' 
		or B like '%.att%')) a

-- I split the result into values, when I find the "§" char
declare @sItem varchar(max)
declare @List as TABLE (item VARCHAR(MAX))
while charindex('§',@result,0) <> 0
 begin
	select @sItem=RTRIM(LTRIM(SUBSTRING(@result,1,charindex('§',@result,0)-1))),
		   @result=RTRIM(LTRIM(SUBSTRING(@result,charindex('§',@result,0)+LEN('§'),LEN(@result))))
 if len(@sItem) > 0
  insert into @List select @sItem
 end
 if len(@result) > 0
  insert into @List select @result -- Put the last item in

-- Then I select the distinct values from the splitted string
select m.a,
		u.item
from 
(
	select distinct (t.item)
	from
	(					
		select substring(item,
								0,
								charindex(case when item like '%.plv%' then '.plv'
										  when item like '%.dmn%' then '.dmn'
										  when item like '%.att%' then '.att' end, item)+4) as item
		from @List
		where item like 'http://%'
	) t
) u 
left outer join @myTable m on (charindex(u.item, m.b) > 0)

Open in new window

0
 
GouthamAnandAuthor Commented:
Also not only ending with "plv" if it is ending with "dmn" or  "att" those also required. But always line starting with "http". only ending file extension can change. Please suggest.
0
 
jonnidipCommented:
I think I don't understand totally what you need, but I would start with the select of the "B" column this way:
select * from myTable where ltrim(B) like 'http://%' and (rtrim(B) like '%.plv' or rtrim(B) like '%.dmn' or rtrim(B) like '%.att')

Open in new window


Regards.
0
Cloud Class® Course: Certified Penetration Testing

This CPTE Certified Penetration Testing Engineer course covers everything you need to know about becoming a Certified Penetration Testing Engineer. Career Path: Professional roles include Ethical Hackers, Security Consultants, System Administrators, and Chief Security Officers.

 
GouthamAnandAuthor Commented:
Hi,

select * from myTable where ltrim(B) like 'http://%'  then

Its fetching records where column 'B' starting with 'http://'

But in my attached file for data for coulumn b http:// in in mid of the data.

ie <p>adsfasdf<p>ddd http://

if data is like above I need to fetch "http://" data from above.

So the text "<p>adsfasdf<p>ddd " needs to get trimmed.

Is Ltrim is used to trim text also or its used only for leading "spaces" in sql server??

because for my case text is not getting trimmed. its simply fetching the data starting with http://

thanks,
0
 
jonnidipCommented:
Sorry, I misunderstood your question.
ltrim and rtrim only remove spaces.
If you want to fetch a string in the middle of your field you may want to use:
like '%http://%'
but please note that this may lead to serious performance issues.


Regards.
0
 
GouthamAnandAuthor Commented:
Just for confirmation, attached text file data of column B is in one single record of column B.
0
 
GouthamAnandAuthor Commented:
like fetch the complete data in the column B where column B is having text "http://".
But my requirement is I need to fetch the data if column B is contains the data like the data in the attachment, then my select query needs to fech like

 http://deheremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
 http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv

ie only the data starting from http:// and ending with "plv" and ignore all the other data in that row of that column B.

If still my requirement is not clear pls let me know.
0
 
GouthamAnandAuthor Commented:
Also I require the data in different rows ie fetching from single row but if we find mutiple matches (if 2 matches like above example) then in multiple row( ie 2 rows for our example)
0
 
jonnidipCommented:
Please tell me if I'm wrong.
You have a column B that looks like: "aaaa-some-valuesbbbccchttp://someurl.plvaaacccxxx" and you need to extract only "http://someurl.plv". Isn't it?

Then:
select substring(B, 
					charindex('http://', B),
					charindex(case when B like '%.plv%' then '.plv'
									when B like '%.dmn%' then '.dmn'
									when B like '%.att%' then '.att' end, B, charindex('http://', B)) - charindex('http://', B)
					)
from myTable 
where B like '%http://%' 
	and (B like '%.plv%' 
		or B like '%.dmn%' 
		or B like '%.att%')

Open in new window



...this is just "air code", I have not tested it, but it should be the way.
Please note that this can lead to performance issues and it can produce wrong results.
You may also need to check in your "where" clause if the charindex (that is the "index-of") of the extensions is > than the charindex of the "http" string.

Regards.
0
 
GouthamAnandAuthor Commented:
This is not fetching the http:// strings
0
 
jonnidipCommented:
I have tested my air-code query and it seems to work, except for the fact that the extension is not taken.
Have you tried it? What is really not working?

Please take a look at my example:
declare @myTable as table
(
	a varchar(10),
	b varchar(1024)
)
insert into @myTable
select 'test1' as a,
'[View:
http://deheremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
:4999999:999]' as b
insert into @myTable
select 'test2' as a,
'[View:
http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
:4999999:999]' as b

select substring(B, 
               charindex('http://', B),
               charindex(case when B like '%.plv%' then '.plv'
                              when B like '%.dmn%' then '.dmn'
                              when B like '%.att%' then '.att' end, B, charindex('http://', B)) - charindex('http://', B)
               )
from @myTable 
where B like '%http://%' 
	and (B like '%.plv%' 
		or B like '%.dmn%' 
		or B like '%.att%')

Open in new window


This is the result:
http://deheremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999
http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA

Open in new window

Really almost what you needed, I think.
You need only to correct the length of the end charindex (just add +4):
select substring(B, 
               charindex('http://', B),
               charindex(case when B like '%.plv%' then '.plv'
                              when B like '%.dmn%' then '.dmn'
                              when B like '%.att%' then '.att' end, B, charindex('http://', B)) + 4 - charindex('http://', B)
               )
from @myTable 
where B like '%http://%' 
	and (B like '%.plv%' 
		or B like '%.dmn%' 
		or B like '%.att%')

Open in new window


Is it ok?

Regards.
0
 
GouthamAnandAuthor Commented:
For me its fetching  only one time eventhough my coulmn for that particular record has two times...not sure how you got two times....

Also when I apply for the whole table, am getting the error "Invalid length parameter passed to the LEFT or SUBSTRING function."

i think this with regard to some records where there are not data starting with 'http://'.
My requirement is if no matching is there do not fetch anything if matching is there for 10 times then disply 10 records ie stating with http:// and ending with .plv.

Is there any why using regular expression or something. Am not concerned about performace issue. Because this is one time activity and total records also are around 1 million only.
0
 
jonnidipCommented:
For me its fetching  only one time eventhough my coulmn for that particular record has two times
Do you mean that a single record can contain multiple URLs?

Invalid length parameter
is for sure when there are no "http" and/or "plv"/other ext. strings found in the B column.

Could you please provide some of your records to test?

Regards.
0
 
GouthamAnandAuthor Commented:
In your case you inserted two records but pls. insert the data which I attached in the attachment in one single column as one single record and then return the data in two separate records.
0
 
jonnidipCommented:
Sorry, but I don't understand.
0
 
GouthamAnandAuthor Commented:
Please find the test data for only as 3 records
TestData.txt
0
 
GouthamAnandAuthor Commented:
Yes. I mean it. single column can contain multiple urls or no urls.

But in the results only single url per record needs to display.
0
 
jonnidipCommented:
Try this:
declare @myTable as table
(
	a varchar(10),
	b varchar(max)
)
insert into @myTable
select 'test1' as a,
'</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:http://heremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:http://heremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">' as b
insert into @myTable
select 'test2' as a,
'</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">' as b
insert into @myTable
select 'test3' as a,
'</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:http://remap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">http://emap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
</a></p>  <p><a taraet="_blank" href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/2337.PP_5F999999_MainPoints_5F999999_de.pdf">German</a></p>  <p><b><br />Video</b></p>  <p>Perbert Painer''s speeMMh (18:3999 min)</p>  <p>[View:http://remap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
:4999999:999]</p>  <p>Q&amp;A Session (17:33 min) - this is a MMombination of both sessions</p>  <p>[View:http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
:4999999:999]</p>  <p>&nbsp;</p>  <p><b>Imaaes</b></p>  <p><a href="/JJS299911/MMfs-file.ashx/__key/JJommunityServer.Bloas.JJomponents.WebloaFiles/MMorporateMMommuniMMationsbloa/3326.AEM99989999_5F999999_JJollaae.jpa">http://emap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv' as b

-- I replace the string "http://" with "§http://", then I concatenate all results
-- into a single string.
declare @result as varchar(max)
select @result = isnull(@result, '') + isnull(replace(b, 'http://', '§http://'), '')
from 
(select a, b
from @myTable
where B like '%http://%' 
	and (B like '%.plv%' 
		or B like '%.dmn%' 
		or B like '%.att%')) a

-- I split the result into values, when I find the "§" char
declare @sItem varchar(max)
declare @List as TABLE (item VARCHAR(MAX))
while charindex('§',@result,0) <> 0
 begin
	select @sItem=RTRIM(LTRIM(SUBSTRING(@result,1,charindex('§',@result,0)-1))),
		   @result=RTRIM(LTRIM(SUBSTRING(@result,charindex('§',@result,0)+LEN('§'),LEN(@result))))
 if len(@sItem) > 0
  insert into @List select @sItem
 end
 if len(@result) > 0
  insert into @List select @result -- Put the last item in

-- Then I select the distinct values from the splitted string
select distinct (item)
from
(					
	select substring(item,
							0,
							charindex(case when item like '%.plv%' then '.plv'
									  when item like '%.dmn%' then '.dmn'
									  when item like '%.att%' then '.att' end, item)+4) as item
	from @List
	where item like 'http://%'
) t

Open in new window



The result is:
http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
http://emap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
http://heremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
http://heremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
http://remap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv

Open in new window



Regards.
0
 
GouthamAnandAuthor Commented:
Yeah. This is fetching exactly the required links as per my requirement.
Thank you very much.
But I need to fetch column A also.
In our example what links fetched from test1 and what links fetched from test2 and test3.
means first two records show test1 repeating 2 times and next 3 records should show test3.
As test2 do not contain any records anyway it should not get populate.

Result req. is

test1  http://deheremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
test1  http://emap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
test3  http://heremap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv
test3  http://heremap4163bloaFiles/MM/6644.AEM_5F999999_AUG9999_5F999999_QA.plv
test3  http://remap4163/JJS299911/MMfs-file.ashx/__/2742.AEM_5F999999_AUG9999.plv

Can you please help this also?

I tried to modify but getting errors while trying to fetch column A also along with this script.

Thank you.
0
 
GouthamAnandAuthor Commented:
Hi,

when I only fetch item column its taking only 38 seconds. but when I added the condition below

"left outer join @myTable m on (charindex(u.item, m.b) > 0) "

to fetch the column "A" also, then its taking more than one hour and still not completed.  I have 1 million records.

Is there any other way to fech the column A along with "item colunm" of the above example to avoid
the condition on (charindex(u.item, m.b) > 0)?

Also as my column b is ntext and contain some single quotes(') in the text I was forced to use the above condition as
left outer join @myTable on ( charindex(u.item, replace(cast(m.b as nvarchar(max)),'''','|')) > 0 )

So can you please suggest any other way to avoid the above join to fetch column A also.?
0
 
jonnidipCommented:
Am not concerned about performace issue. Because this is one time activity and total records also are around 1 million only.
...This is why I did not mind about performances...
And the join used to get the A value back is very expensive; it looks for the content of the B column at any position, thus without using any kind of index. I believe it will do a full table scan.

My approach was to "split" the content of the B column to retrieve multiple URLs contained in it. This means that B column is not processed with A column, and this led to do another join to the base table to get the A value.

I see very difficult to do this kind of job (multiple values in single column) without using this method (splitting column value by char).
You may get performance gain by analyzing the query execution plan (CTRL+L in ManagementStudio) and see if there is a suggested index. Creating an index on your table may let you get your results faster.


Please let me know if you need further info.
0
 
GouthamAnandAuthor Commented:
Ok. Now query got completed and now am ok with 1 hour execution.

But at the end of the query am getting the below message.

Query completed with errors.

and the error message is

Msg 8152, Level 16, State 4, Line 38
String or binary data would be truncated.
The statement has been terminated.

Can you pls. suggest why I am geting this message and how can I avoid this?

Because if am taking the results into a table using "into clause" then no records are getting inserted because of this error.
0
 
jonnidipCommented:
Just go at your "Line 38" and see what is the statement that you are executing.
If your new table has a field with varchar type, try using varchar(max) for the "B" field.
I think that it would be sufficient to use the length of the original "B" field too...

Regards.
0
 
GouthamAnandAuthor Commented:
Line 38 is
select m.a, u.item

When I do not select column a and only select distinct(item) of column b, the query is getting executed successfully and getting 185 records.

But when i select column A also with another join to the base table then am getting the above error that "Query completed with errors" and getting only 182 records.(Without INTO table1 clause). ie Just select Query.

So I put
SELECT m.a, u.item
INTO table1
--(here table1 I did not create already.)
 It is getting created at the time of query execution.
then also its giving that error "Query completd with errors" and showing that error also. and rolling back all the already inserted records also and table1 is getting created with empty records.

So even query execution time is not of concern, am not ending up with selecting of column A along with column b links completely ie without the error that "Query completed with errors"
0
 
GouthamAnandAuthor Commented:
My column B is nText data type. This is for your information.
0
 
GouthamAnandAuthor Commented:
Thank you very much for your continuous help.
There is some data problem so getting that error. Now got resolved.
0
 
jonnidipCommented:
You're welcome!
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.