Solved

How to import the body contents of an HTML page into a SQL table

Posted on 2013-12-25
Last Modified: 2016-06-15
The page_Details column is declared as nvarchar(max) in the SQL table. How can I import the body contents of an HTML page into this column?

Thanks
Question by:KavyaVS
9 Comments
 
LVL 19

Expert Comment

by:Rikin Shah
ID: 39739075
Hi,

Where exactly is the HTML page being loaded from?
 

Author Comment

by:KavyaVS
ID: 39739076
The HTML page is on the C: drive of the SQL Server machine.

Thanks
 
LVL 19

Assisted Solution

by:Rikin Shah
Rikin Shah earned 500 total points
ID: 39739090
And you want the whole HTML file to be dumped into the SQL column?

I think you already have the code to read the content of the file. All you need to do is remove the HTML tags from the content. Here is a function that will help you get plain text from the HTML string:

// Requires: using System.Text.RegularExpressions;
private string GetPlainTextFromHtml(string htmlString)
{
    // Matches any single HTML tag.
    string htmlTagPattern = "<.*?>";
    // Matches whole <script>...</script> and <style>...</style> blocks, including their contents.
    var regexCss = new Regex("(\\<script(.+?)\\</script\\>)|(\\<style(.+?)\\</style\\>)", RegexOptions.Singleline | RegexOptions.IgnoreCase);
    htmlString = regexCss.Replace(htmlString, string.Empty);
    htmlString = Regex.Replace(htmlString, htmlTagPattern, string.Empty);
    // Drop lines that contain only whitespace.
    htmlString = Regex.Replace(htmlString, @"^\s+$[\r\n]*", "", RegexOptions.Multiline);
    htmlString = htmlString.Replace("&nbsp;", string.Empty);

    return htmlString;
}

 

Author Comment

by:KavyaVS
ID: 39739357
I don't want to remove the HTML tags from the HTML page. I want to save it as it is into
the SQL table column. I don't want the whole HTML page either. I want to save only the body tag contents in the SQL column (data type nvarchar(max)).
Any suggestions, please?


The following query inserted the HTML page content into the SQL table
when the page_Details column was declared as (XML(.), null). (The content
inside the body tags of the .aspx page was saved as an XML file.)
Ex:
<PageContents>
  <![CDATA[
<div>
</div>
  ]]>
</PageContents>
Now the page_Details column is declared as nvarchar(max), and the query
below no longer inserts the data. The column type cannot be changed. How can I
insert the HTML data there?

UPDATE [Content_Site].[dbo].t_Page_List
SET Page_Details = (
    SELECT * FROM OPENROWSET(
        BULK 'C:\PagedETAILS_Xml\Page1content.xml',
        SINGLE_BLOB
    ) AS x
)
WHERE PageID = 1
GO

Thanks

 
LVL 19

Accepted Solution

by:Rikin Shah
Rikin Shah earned 500 total points
ID: 39739659
Hi,

I'm not very proficient in SQL, but you can do something like this:

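DECLARE @xml NVARCHAR(MAX)

-- SINGLE_CLOB reads the file as varchar(max); use SINGLE_NCLOB for a UTF-16 file.
-- The single column returned by OPENROWSET(BULK ...) is named BulkColumn.
SELECT @xml = CONVERT(NVARCHAR(MAX), x.BulkColumn)
FROM OPENROWSET(
   BULK 'C:\SampleFolder\SampleData3.txt',
           SINGLE_CLOB
) AS x

UPDATE [Content_Site].[dbo].t_Page_List
SET Page_Details = @xml
WHERE PageID = 1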
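
Since the question asks for only the contents of the body tag, the idea above could be extended roughly as follows. This is a minimal sketch, not a tested solution: it assumes the file is single-byte ANSI/ASCII text, that it contains exactly one <body ...> tag and one </body> tag, and that the database collation is case-insensitive. The variable names (@html, @bodyTag, @startPos, @endPos) are made up for illustration, and the path is the sample one used above.

```sql
-- Sketch only: read the file as character data (SINGLE_CLOB = varchar(max)).
DECLARE @html NVARCHAR(MAX);

SELECT @html = CONVERT(NVARCHAR(MAX), x.BulkColumn)
FROM OPENROWSET(BULK 'C:\SampleFolder\SampleData3.txt', SINGLE_CLOB) AS x;

-- Find the opening <body ...> tag, the end of that tag, and the closing </body>.
DECLARE @bodyTag  BIGINT = CHARINDEX('<body', @html);
DECLARE @startPos BIGINT = CHARINDEX('>', @html, @bodyTag) + 1;
DECLARE @endPos   BIGINT = CHARINDEX('</body>', @html, @startPos);

-- Only update when both tags were found; the HTML tags inside the body are kept as-is.
IF @bodyTag > 0 AND @endPos > 0
    UPDATE [Content_Site].[dbo].t_Page_List
    SET Page_Details = SUBSTRING(@html, @startPos, @endPos - @startPos)
    WHERE PageID = 1;
```

String searching like this is fragile for unusual markup (for example, a '>' inside an attribute of the body tag), so for anything beyond simple pages it may be safer to extract the body in application code before inserting.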

 
LVL 19

Expert Comment

by:Rikin Shah
ID: 39739660
You might need to cast the column returned by x (BulkColumn) to nvarchar.
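
For illustration, a cast along those lines might look like the sketch below. It assumes the file is single-byte ANSI/ASCII text: SINGLE_BLOB returns varbinary(max), and converting that straight to nvarchar would read the bytes as UTF-16, so the bytes are converted to varchar first. The variable name @page is made up, and the path is the sample one from the answer above.

```sql
-- Sketch only: SINGLE_BLOB returns the file as varbinary(max) in BulkColumn.
DECLARE @page NVARCHAR(MAX);

-- varbinary -> varchar interprets the bytes in the database's code page;
-- varchar -> nvarchar then widens the text to Unicode.
SELECT @page = CONVERT(NVARCHAR(MAX), CONVERT(VARCHAR(MAX), x.BulkColumn))
FROM OPENROWSET(BULK 'C:\SampleFolder\SampleData3.txt', SINGLE_BLOB) AS x;

-- Quick check of what was read before using it in an UPDATE.
SELECT LEN(@page) AS CharCount, LEFT(@page, 200) AS Preview;
```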
 

Author Comment

by:KavyaVS
ID: 39746243
I've requested that this question be closed as follows:

Accepted answer: 167 points for rikin_shah's comment #a39739659
Assisted answer: 166 points for rikin_shah's comment #a39739075
Assisted answer: 0 points for KavyaVS's comment #a39739076
Assisted answer: 167 points for rikin_shah's comment #a39739090

for the following reason:

Thanks
 

Author Closing Comment

by:KavyaVS
ID: 39746244
Thanks
 

Expert Comment

by:Safak KAYA
ID: 41654852
Hello, I am new to SQL, but I have the same issue. I want to import particular data from a web page's HTML source code into a SQL table.

Is it possible?

Thanks