Solved

How to import the HTML page body contents into sql Data Table

Posted on 2013-12-25
Last Modified: 2016-06-15
The page_Details column is declared as nvarchar(max) in the SQL table. How can I import the HTML page body contents into the SQL table?

Thanks
Question by:KavyaVS
LVL 19

Expert Comment

by:Rikin Shah
ID: 39739075
Hi,

Where exactly is the HTML page being loaded from?

Author Comment

by:KavyaVS
ID: 39739076
The HTML page is on the C: drive of the SQL Server machine.

Thanks
LVL 19

Assisted Solution

by:Rikin Shah
Rikin Shah earned 500 total points
ID: 39739090
And you want the whole HTML file to be dumped into the SQL column?

I assume you already have the code to read the contents of the file. All you need to do is remove the HTML tags from the content. Here is a function that will help you get plain text from an HTML string:

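// Requires: using System.Text.RegularExpressions;
private string GetPlainTextFromHtml(string htmlString)
{
    // Matches any single tag, e.g. <div> or </p>.
    string htmlTagPattern = "<.*?>";
    // Matches <script> and <style> blocks together with their contents.
    var regexCss = new Regex("(\\<script(.+?)\\</script\\>)|(\\<style(.+?)\\</style\\>)", RegexOptions.Singleline | RegexOptions.IgnoreCase);
    htmlString = regexCss.Replace(htmlString, string.Empty);
    // Strip the remaining tags, leaving only the text between them.
    htmlString = Regex.Replace(htmlString, htmlTagPattern, string.Empty);
    // Drop lines that contain only whitespace.
    htmlString = Regex.Replace(htmlString, @"^\s+$[\r\n]*", "", RegexOptions.Multiline);
    htmlString = htmlString.Replace("&nbsp;", string.Empty);

    return htmlString;
}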


Author Comment

by:KavyaVS
ID: 39739357
I don't want to remove the HTML tags from the HTML page. I want to save it as it is into the SQL table column. I don't want the whole HTML page, though. I want to save only the contents of the body tag in the SQL column (data type nvarchar(max)).
Any suggestions, please?


The following query inserts the HTML page content into the SQL table when the page_Details column is declared as (XML(.), null). (The content inside the body tags of the .aspx page was saved as an XML file.)
Ex:
<PageContents>
<![CDATA[
<div>
</div>
]]>
</PageContents>
Now the page_Details column is declared as nvarchar(max), and the query below does not insert the data. The column type cannot be changed. How can I insert the HTML data there?

UPDATE [Content_Site].[dbo].t_Page_List
SET Page_Details = (
    SELECT * FROM OPENROWSET(
        BULK 'C:\PagedETAILS_Xml\Page1content.xml',
        SINGLE_BLOB
    ) AS x
)
WHERE PageID = 1
GO

Thanks
LVL 19

Accepted Solution

by:Rikin Shah
Rikin Shah earned 500 total points
ID: 39739659
Hi,

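I'm not proficient in SQL, but you can do something like this:

DECLARE @xml NVARCHAR(MAX)

-- OPENROWSET(BULK ...) returns the file contents in a single column named BulkColumn.
-- The subquery must be wrapped in parentheses to be assigned with SET.
SET @xml = (
    SELECT BulkColumn FROM OPENROWSET(
        BULK 'C:\SampleFolder\SampleData3.txt',
        SINGLE_BLOB
    ) AS x
)

UPDATE [Content_Site].[dbo].t_Page_List
SET Page_Details = @xml
WHERE PageID = 1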

LVL 19

Expert Comment

by:Rikin Shah
ID: 39739660
You might need to cast x to nvarchar.
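
A minimal sketch of the cast mentioned above, assuming the file is stored in a single-byte (ANSI) encoding rather than UTF-16. It swaps SINGLE_BLOB for SINGLE_CLOB, which is a substitution and not something posted in this thread: SINGLE_CLOB returns the file as varchar(max), which converts cleanly to nvarchar(max), whereas casting the varbinary(max) result of SINGLE_BLOB straight to nvarchar would reinterpret the raw bytes as UTF-16. The @html variable name is illustrative; the path, table, and column names come from the posts above.

```sql
-- Sketch only: assumes a single-byte encoded file and the names used in the thread.
DECLARE @html NVARCHAR(MAX);

-- SINGLE_CLOB reads the whole file as varchar(max) in a column named BulkColumn.
SELECT @html = CAST(BulkColumn AS NVARCHAR(MAX))
FROM OPENROWSET(
    BULK 'C:\SampleFolder\SampleData3.txt',
    SINGLE_CLOB
) AS x;

UPDATE [Content_Site].[dbo].t_Page_List
SET Page_Details = @html
WHERE PageID = 1;
```

If the file is saved as UTF-16 (Unicode), SINGLE_NCLOB would be the option to try instead, since it returns nvarchar(max) directly.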

Author Comment

by:KavyaVS
ID: 39746243
I've requested that this question be closed as follows:

Accepted answer: 167 points for rikin_shah's comment #a39739659
Assisted answer: 166 points for rikin_shah's comment #a39739075
Assisted answer: 0 points for KavyaVS's comment #a39739076
Assisted answer: 167 points for rikin_shah's comment #a39739090

for the following reason:

Thanks

Author Closing Comment

by:KavyaVS
ID: 39746244
Thanks

Expert Comment

by:Safak KAYA
ID: 41654852
Hello, I am new to SQL, but I have the same issue. I want to import particular data from a web page's HTML source code into a SQL table.

Is it possible?

Thanks
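
For a simple, well-delimited fragment such as the body contents asked about earlier in this thread, plain string functions may be enough. The sketch below is an illustration, not a tested solution from the thread: it assumes the HTML is already in an nvarchar(max) variable (here a hard-coded sample), that the page has exactly one body element, and that the database collation is case-insensitive so that BODY and body both match. All variable names are hypothetical.

```sql
-- Sketch only: the sample HTML and all variable names are made up for illustration.
DECLARE @html NVARCHAR(MAX) =
    N'<html><head><title>t</title></head><body class="x"><div>Hello</div></body></html>';

-- Position of the opening <body ...> tag (0 if not found).
DECLARE @bodyOpen INT = CHARINDEX(N'<body', @html);

-- First character after the ">" that closes the opening tag.
DECLARE @bodyStart INT =
    CASE WHEN @bodyOpen > 0 THEN CHARINDEX(N'>', @html, @bodyOpen) + 1 ELSE 0 END;

-- Position of the closing </body> tag (0 if not found).
DECLARE @bodyEnd INT = CHARINDEX(N'</body>', @html);

SELECT CASE
           WHEN @bodyOpen > 0 AND @bodyEnd >= @bodyStart
               THEN SUBSTRING(@html, @bodyStart, @bodyEnd - @bodyStart)
           ELSE @html  -- no body element found: fall back to the full text
       END AS BodyContents;
-- For the sample above this returns: <div>Hello</div>
```

The same pattern works for any pair of start and end markers. For anything less regular, parsing the HTML outside SQL Server with a proper HTML parser and inserting the result is likely to be more robust.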