[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

convert html tags into text

Posted on 2011-10-06
13
Medium Priority
?
325 Views
Last Modified: 2012-05-12
Hi,
i am pulling a data from wordpress and one of the columns has html tags content in the column.  I am also copying the data from wordpress into sql server.  so i would like to format the data in sql server or in wordpress but my goal is to clean up and convert in text so it could be something readable.  Here is what it look like the data:
Note, i have researched this but could not get anything useful so i am hoping the EE can help me on this.  thanks for your time.
<font size="1">

<table border=3>
 <tr><b>
  <td>Chg #</td>
  <td>Planned Start</td>
  <td>Planned End</td>
  <td>Risk Level</td>
  <td>Brief Description</td>
  <td>Assign Dept</td>
  <td>Requested by</td>
  <td>Status</td> </b>
 </tr>
 <tr>
  <td>C10269</td>
  <td>9/16/2011 3:00 PM</td>
  <td>9/17/2011 11:00 PM</td>
  <td>-</td>
  <td>Backup for VNETS Refresh</td>
  <td>DBA</td>
  <td>GKERN</td>
  <td>pending</td>
 </tr>
 <tr>
  <td>C10274</td>
  <td>9/16/2011 8:30 PM</td>
  <td>9/17/2011 10:00 PM</td>
  <td>2</td>
  <td>EAI Adapter outage</td>
  <td>Application-Middleware  Tech</td>
  <td>janders4</td>
  <td>pending</td>
 </tr>
 <tr>
  <td>C10273</td>
  <td>9/17/2011 8:00 AM</td>
  <td>9/17/2011 2:30 PM</td>
  <td>-</td>

Open in new window

0
Comment
Question by:karinos57
  • 6
  • 5
  • 2
13 Comments
 
LVL 14

Accepted Solution

by:
quizwedge earned 750 total points
ID: 36926676
Given that dataset, what would you like the output to look like (the part stored in SQL)? I'm thinking you'll want to use multiple Replace statements http://msdn.microsoft.com/en-us/library/ms186862.aspx
0
 

Author Comment

by:karinos57
ID: 36926758
i am thinking something like this but open to any other way or easier way to do this.  i am thinking something like this:
Chg #	Planned Start	Planned End	Risk Level	Brief Description	Assign Dept	Requested by	Status

C10212	9/9/11 10:00 AM	11/11/11 10:30 AM	2	Something here	DBA	moe	approved
C10209	9/9/11 6:00 PM	9/10/11 5:00 PM	3	SharePoint 2010 upgrade	Web Technologies	krubi	approved

Open in new window

0
 

Author Comment

by:karinos57
ID: 36926766
i am also using this sql code to modify the data
SELECT distinct [post_date]
      ,convert(varchar(MAX),[post_content]) [post_content]

Open in new window

0
 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

 
LVL 14

Expert Comment

by:quizwedge
ID: 36926852
Looks like you're trying to take the data from one field in WordPress and turn it into a table in SQL. That's probably possible in SQL, but it's probably not the best place to put that logic. It'd be better to use whatever programming language you're using to get the data from WordPress to SQL. Then, you could loop through the lines one by one. If you hit a "<tr>" you know that you're about to grab data for the next record. A "<td>" is one of the fields for that record. A "</tr>" means that you're done with that record and should add it to SQL.

How are you getting the data out of WordPress?
0
 
LVL 14

Expert Comment

by:quizwedge
ID: 36926865
I think I see, based on your last comment, but just to clarify. You're trying to take a field from WordPress and add it to a new table in the same SQL database that WordPress is using?
0
 

Author Comment

by:karinos57
ID: 36926989
actually the original data is sitting in wordpress and i am getting the data from there into sql server using mysql code.  Once i get the data in the  sql server then i would like to clean it up.  I saw some example on this website but could not understand how to implement it.  I have 3 columns in this table and only one of the columns has the html tags.  Also not all the cells in this column has html tags; some of them are text.  If u can explain to me how can i modify this code would be great.  thanks
http://lazycoders.blogspot.com/2007/06/stripping-html-from-text-in-sql-server.html
0
 
LVL 60

Assisted Solution

by:Kevin Cross
Kevin Cross earned 750 total points
ID: 36927163
I would have to read through that link carefully, but by the title and quick scan, I would suspect it is returning the text all as one blob of text. Not sure that is what you want. Since HTML can be well-formed like XML, I would probably try something around parsing XML. The trick will be that not everything in HTML has to be well-formed, so if you have a way to get down to the exact table you want that would help as that is likely a good piece of XML snippet. Therefore, if there is a common piece of content before or after or if this is the only table in the text ...

- You can grab only that piece of string.
- Converted to XML, you can use the XML data type capabilities to parse XML values '//table/tr' XPath.
- You can dump everything into text, so having heading row does not hurt OR you can filter that out.
- Whether you use a staging table or not, you can then CONVERT each of the 'td' values per your needs.
0
 
LVL 14

Expert Comment

by:quizwedge
ID: 36927220
By my quick scan, I agree with mwvisa1 that it's going to transform that text into just one big block of text. Is your goal to have all of that information in a new table or as one field in the existing table, but in a more readable format?
0
 

Author Comment

by:karinos57
ID: 36927472
well, i would like to test it.  how would i tell the code to run this table and check this column?   For instance, i don't see in the code below where they specify db name/table name and column name.
CREATE FUNCTION [dbo].[udf_StripHTML]
(@HTMLText VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
WHILE @Start > 0
AND @End > 0
AND @Length > 0
BEGIN
SET @HTMLText = STUFF(@HTMLText,@Start,@Length,'')
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
END
RETURN LTRIM(RTRIM(@HTMLText))
END

Open in new window

0
 
LVL 14

Expert Comment

by:quizwedge
ID: 36927707
In SQL Server, create a new user function. To do this, you can click the "New Query" button and select your database from the drop down menu. Paste the function above in that window and then click the "Execute" button.

Then, in your select statement, instead of "select MyField" you need "select dbo.udf_StripHTML(MyField)"
0
 

Author Closing Comment

by:karinos57
ID: 36965293
thnx
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 36965843
You are welcome! Did you need something additional? Just curious since you marked this a "B". The grade is fine, but if there is something you feel would help you, please do ask or if you run into trouble later you can use a related question; however, do not be afraid to ask for clarification in the future as I would have been happy to keep commenting. I had not responded to your last comment because it looked like you were still going down the stripHTML path and http:#36927707 explained what to do there.

Anyway, glad that I could help in some way.

Best regards and happy coding,

Kevin
0
 

Author Comment

by:karinos57
ID: 36969314
thanks mwvisa1, appreciate your efforts.  i am good now.  thanks anyway
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
This shares a stored procedure to retrieve permissions for a given user on the current database or across all databases on a server.
Integration Management Part 2
With just a little bit of  SQL and VBA, many doors open to cool things like synchronize a list box to display data relevant to other information on a form.  If you have never written code or looked at an SQL statement before, no problem! ...  give i…

834 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question