Link to home
Start Free TrialLog in
Avatar of karinos57
karinos57Flag for Afghanistan

asked on

convert html tags into text

Hi,
i am pulling a data from wordpress and one of the columns has html tags content in the column.  I am also copying the data from wordpress into sql server.  so i would like to format the data in sql server or in wordpress but my goal is to clean up and convert in text so it could be something readable.  Here is what it look like the data:
Note, i have researched this but could not get anything useful so i am hoping the EE can help me on this.  thanks for your time.
<font size="1">

<table border=3>
 <tr><b>
  <td>Chg #</td>
  <td>Planned Start</td>
  <td>Planned End</td>
  <td>Risk Level</td>
  <td>Brief Description</td>
  <td>Assign Dept</td>
  <td>Requested by</td>
  <td>Status</td> </b>
 </tr>
 <tr>
  <td>C10269</td>
  <td>9/16/2011 3:00 PM</td>
  <td>9/17/2011 11:00 PM</td>
  <td>-</td>
  <td>Backup for VNETS Refresh</td>
  <td>DBA</td>
  <td>GKERN</td>
  <td>pending</td>
 </tr>
 <tr>
  <td>C10274</td>
  <td>9/16/2011 8:30 PM</td>
  <td>9/17/2011 10:00 PM</td>
  <td>2</td>
  <td>EAI Adapter outage</td>
  <td>Application-Middleware  Tech</td>
  <td>janders4</td>
  <td>pending</td>
 </tr>
 <tr>
  <td>C10273</td>
  <td>9/17/2011 8:00 AM</td>
  <td>9/17/2011 2:30 PM</td>
  <td>-</td>

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of quizwedge
quizwedge
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of karinos57

ASKER

i am thinking something like this but open to any other way or easier way to do this.  i am thinking something like this:
Chg #	Planned Start	Planned End	Risk Level	Brief Description	Assign Dept	Requested by	Status

C10212	9/9/11 10:00 AM	11/11/11 10:30 AM	2	Something here	DBA	moe	approved
C10209	9/9/11 6:00 PM	9/10/11 5:00 PM	3	SharePoint 2010 upgrade	Web Technologies	krubi	approved

Open in new window

i am also using this sql code to modify the data
SELECT distinct [post_date]
      ,convert(varchar(MAX),[post_content]) [post_content]

Open in new window

Looks like you're trying to take the data from one field in WordPress and turn it into a table in SQL. That's probably possible in SQL, but it's probably not the best place to put that logic. It'd be better to use whatever programming language you're using to get the data from WordPress to SQL. Then, you could loop through the lines one by one. If you hit a "<tr>" you know that you're about to grab data for the next record. A "<td>" is one of the fields for that record. A "</tr>" means that you're done with that record and should add it to SQL.

How are you getting the data out of WordPress?
I think I see, based on your last comment, but just to clarify. You're trying to take a field from WordPress and add it to a new table in the same SQL database that WordPress is using?
actually the original data is sitting in wordpress and i am getting the data from there into sql server using mysql code.  Once i get the data in the  sql server then i would like to clean it up.  I saw some example on this website but could not understand how to implement it.  I have 3 columns in this table and only one of the columns has the html tags.  Also not all the cells in this column has html tags; some of them are text.  If u can explain to me how can i modify this code would be great.  thanks
http://lazycoders.blogspot.com/2007/06/stripping-html-from-text-in-sql-server.html
SOLUTION
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
By my quick scan, I agree with mwvisa1 that it's going to transform that text into just one big block of text. Is your goal to have all of that information in a new table or as one field in the existing table, but in a more readable format?
well, i would like to test it.  how would i tell the code to run this table and check this column?   For instance, i don't see in the code below where they specify db name/table name and column name.
CREATE FUNCTION [dbo].[udf_StripHTML]
(@HTMLText VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @Start INT
DECLARE @End INT
DECLARE @Length INT
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
WHILE @Start > 0
AND @End > 0
AND @Length > 0
BEGIN
SET @HTMLText = STUFF(@HTMLText,@Start,@Length,'')
SET @Start = CHARINDEX('<',@HTMLText)
SET @End = CHARINDEX('>',@HTMLText,CHARINDEX('<',@HTMLText))
SET @Length = (@End - @Start) + 1
END
RETURN LTRIM(RTRIM(@HTMLText))
END

Open in new window

In SQL Server, create a new user function. To do this, you can click the "New Query" button and select your database from the drop down menu. Paste the function above in that window and then click the "Execute" button.

Then, in your select statement, instead of "select MyField" you need "select dbo.udf_StripHTML(MyField)"
thnx
You are welcome! Did you need something additional? Just curious since you marked this a "B". The grade is fine, but if there is something you feel would help you, please do ask or if you run into trouble later you can use a related question; however, do not be afraid to ask for clarification in the future as I would have been happy to keep commenting. I had not responded to your last comment because it looked like you were still going down the stripHTML path and http:#36927707 explained what to do there.

Anyway, glad that I could help in some way.

Best regards and happy coding,

Kevin
thanks mwvisa1, appreciate your efforts.  i am good now.  thanks anyway