Solved

C# HTML to PDF

Posted on 2013-11-11
Last Modified: 2013-12-12
Hello experts,
I have HTML code that needs to be converted to PDF.
The HTML contains a table populated from a recordset.
Please check the code. It works, except that for some reason the PDF cuts off the last row of the table.
using System.IO;
using System.Web;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

private void GenerateReport(string Html, HttpContext context)
{
    // Render the HTML to an in-memory PDF and send it as a download.
    MemoryStream stream = createPDF(Html);

    context.Response.ContentType = "application/pdf";
    context.Response.AddHeader("Content-Disposition", "attachment; filename=\"Report.pdf\"");
    context.Response.BinaryWrite(stream.ToArray());
}

private MemoryStream createPDF(string html)
{
    MemoryStream msOutput = new MemoryStream();
    TextReader reader = new StringReader(html);

    // A4 page; margins are left, right, top, bottom (in points).
    Document document = new Document(PageSize.A4, 10f, 10f, 10f, 0f);

    // The writer streams the document into msOutput.
    PdfWriter writer = PdfWriter.GetInstance(document, msOutput);

    HTMLWorker worker = new HTMLWorker(document);

    document.Open();
    worker.StartDocument();

    worker.Parse(reader);
    worker.EndDocument();
    worker.Close();
    document.Close();

    // ToArray() still works on a closed MemoryStream.
    return msOutput;
}

If I render just the HTML, all rows are displayed. The generated PDF document doesn't include the last row.
Please, help.
Thank you.
Question by:Galina Besselyanova
 
LVL 33

Expert Comment

by:Robberbaron (robr)
ID: 39674535
1. Are you sure your HTML is terminated correctly? Many browsers fix bad HTML for you.

2. Can you paste the last part of the HTML as displayed in 'view-source'?

3. What library are you using? I had problems with iTextSharp and ended up using the WebKit command line on a separate thread, which gives excellent, consistent results.

     /// <summary>
        /// Runs WebKit PDF command line convertor
        /// http://code.google.com/p/wkhtmltopdf/
        /// </summary>
        /// <param name="sRawUrl"></param>
        /// <returns></returns>
        ///
 

Author Comment

by:Galina Besselyanova
ID: 39675103
The HTML page is not displayed. When someone clicks on the link
<a id="Summary" title="List" href="/handlers/file.ashx" target="_blank">Click</a>


it doesn't generate an HTML page; it generates an HTML string and then converts it into a PDF (see the code above). How can I check the source of the generated HTML?
 

Author Comment

by:Galina Besselyanova
ID: 39675155
We do use iTextSharp.

 
LVL 33

Expert Comment

by:Robberbaron (robr)
ID: 39676053
I do exactly the same process.
Create the HTML as a string and then convert it to PDF. You could write the string to the console or a text file to test.
As I said, I tried iTextSharp but gave up as it couldn't handle my formatted HTML with CSS.
So I write the string to a temp file and then send that file to WebKit for PDF output. It works well but needs the external WebKit files to be available.
That's OK for me as my app is intranet only.
 

Author Comment

by:Galina Besselyanova
ID: 39692978
We actually convert an HTML string into a PDF, and this is on our website. Also, our website is based on iParts, so we will have to stay with iTextSharp, at least for now.
However, your idea to write the string to the console or a text file might work. It could show us what is wrong.
Can you please show me an example of how to do this? I would really appreciate it.
Thanks!
 
LVL 33

Expert Comment

by:Robberbaron (robr)
ID: 39694785
This includes my calls to WebKit on a separate thread, but it writes the incoming HTMLCode to a temporary file. You can use whatever path/filename you want.

                StreamWriter sWriter = File.CreateText(myPathFile);
                sWriter.WriteLine(HTMLCode);
                sWriter.Close();
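
For checking the generated markup on its own, the same idea can be wrapped in a small helper. This is only a sketch: the class name `HtmlDebug`, the method `DumpHtml`, and the default file name are hypothetical, and the file is written to the system temp directory as an assumption.

```csharp
using System;
using System.IO;
using System.Text;

public static class HtmlDebug
{
    // Writes the generated HTML to a file in the temp directory and
    // returns the full path so it can be opened in a browser or editor.
    // The default file name is a placeholder, not part of the original code.
    public static string DumpHtml(string html, string fileName = "report-debug.html")
    {
        string path = Path.Combine(Path.GetTempPath(), fileName);
        File.WriteAllText(path, html, Encoding.UTF8);
        return path;
    }
}
```

Calling `HtmlDebug.DumpHtml(html)` just before the string is handed to the PDF converter should give a file that can be opened directly or pasted into a validator.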


        #region WebKit
        /// <summary>
        /// Runs the WebKit PDF command-line converter
        /// http://code.google.com/p/wkhtmltopdf/
        /// </summary>
        /// <param name="HTMLCode">HTML markup to convert</param>
        // Note: _DocInfo_title, _PDFName, WbKitFileLocation, _threadArgs, _threadWorkingDir,
        // _threadApp, ThreadStart, TemporaryFile and RG_Utils are not shown in this excerpt.
        private string _WebKitFiles = "DocMan_Files";
        private void ConvertHTMLToPDF_Wk(string HTMLCode)
        {
            string sFileName = ""; //GetNewName();
            string sPage = sFileName + ".html";
            //docman_files

            // Fall back to a placeholder page so the converter always has input.
            if (HTMLCode == "")
            {
                HTMLCode = "<HTML><HEAD><title>Blank data</title></head><body><h1>Blank document</h1></body></html>";
            }
            // wkhtmltopdf long options take a double dash.
            string GlobOptions = "--orientation Portrait --page-size A4 --title " + _DocInfo_title;
            StringWriter sw = new StringWriter();

            //Server.Execute(sUrlVirtual, sw);
            using (TemporaryFile htmlfile = new TemporaryFile(false, "HTML"))
            {
                // Write the incoming HTML to the temporary file.
                using (StreamWriter sWriter = File.CreateText(htmlfile.Path))
                {
                    sWriter.WriteLine(HTMLCode);
                }

                // Arguments for wkhtmltopdf: input HTML file, then output PDF file.
                _threadArgs = RG_Utils.StringManip.Quoted(htmlfile.Path) + " " + RG_Utils.StringManip.Quoted(_PDFName);
                _threadWorkingDir = RG_Utils.StringManip.Quoted(System.AppDomain.CurrentDomain.BaseDirectory + WbKitFileLocation);
                _threadApp = RG_Utils.StringManip.Quoted(System.AppDomain.CurrentDomain.BaseDirectory + WbKitFileLocation + @"\" + "wkhtmltopdf.exe");

                System.Threading.ThreadStart job = new System.Threading.ThreadStart(ThreadStart);
                System.Threading.Thread thread = new System.Threading.Thread(job);
                thread.Start();

                // Wait for NewThread to terminate.
                thread.Join();
            }

        }
        #endregion

 

Author Comment

by:Galina Besselyanova
ID: 39696466
Great. I'll try and will let you know how it goes.
Thanks!
 

Author Comment

by:Galina Besselyanova
ID: 39708863
Hi,
We tested the HTML, and the generated HTML contains all records. So the issue is probably in the conversion to PDF.
Please, any ideas? The code is above.
 
LVL 33

Expert Comment

by:Robberbaron (robr)
ID: 39710557
As before, look very carefully at the end of the HTML. Can you post the last 2 rows?
Are all rows properly terminated?

1. Try pasting your HTML into http://validator.w3.org/#validate_by_input

2. Try a small set of your data through iTextSharp. As I said, I had problems with it parsing HTML.
 

Author Comment

by:Galina Besselyanova
ID: 39711736
Here is the generated HTML string. As you can see, there are 3 rows with 5 records.
But when this string is converted into PDF, the last row is missing from the PDF document.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<body style="font-family: Arial, Helvetica, sans-serif; font-size: 9px; line-height: 1.1em;" bgcolor="FFFFFF#" link="CC3300#" vlink="333300"  leftmargin="0" topmargin="5" marginwidth="0" marginheight="0" alink="333300">
<p align="center" style="color:003366;font-size:16px;font-weight:bold;font-family: Arial, Helvetica, sans-serif;">
NEW YORK CITY<br />
<em>Committee List</em></p>
<br />
<p align="center" style="color:003366;font-size:12px;font-weight:bold;font-family: Arial, Helvetica, sans-serif;">AIDS Committee (5)</p>
<div style="clear:both;"></div><br />
<div>
<table>
<tr>
	<td valign="top">
	<b><em>Chair</em></b>- <b><em>10-28-2011</em></b><br />
	Ly Neuer, Esq.<br />
	Safe Law Proj<br />
	150 Court St<br />Rm 1600<br />Brooklyn, NY &nbsp; 10001<br />
	Phone: (555) 555-55555<br />Fax: (555) 555-55555<br />
	Email: <a style="color:blue;" href="mailto:ler@san.org">ler@san.org</a><br />
	</td>
	<td valign="top">
	<b><em>Member</em></b>- <b><em>05-01-2012</em></b><br />
	Alt Ren, Esq.<br />
	860 E 63rd St<br />
	New York, NY &nbsp; 10011<br />
	Phone: (555) 555-55555<br />
	Fax: (555) 555-55555<br />
	Email: <a style="color:blue;" href="mailto:albertrchen@gmail.com">an@gmail.com</a><br />
	</td>
</tr>
<tr>
	<td valign="top">
	<b><em>Member</em></b>- <b><em>05-01-2012</em></b><br />
	Doy Chr, Esq.<br />The Bronx Defenders<br />
	1760 Ave<br />Bronx, NY &nbsp; 10651<br />
	Phone: (555) 555-55555<br />Fax: (555) 555-55555<br />
	Email: <a style="color:blue;" href="mailto:chy@yahoo.com">chy@yahoo.com</a><br />
	</td>
	<td valign="top">
	<b><em>Member</em></b>- <b><em>06-25-2013</em></b><br />
	Last Join<br />42 east 46th Street<br />New York, NY &nbsp; 10011<br />
	United States<br />Email: <a style="color:blue;" href="mailto:as@as.com">as@as.com</a><br />
	</td>
</tr>
<tr>
	<td valign="top"><b><em>Member</em></b>- <b><em>03-06-2013</em></b><br />
	Chott Ho, Esq.<br />1180 Heat St<br />New York, NY &nbsp; 10011<br />
	United States<br />Phone: (555) 555-55555<br />
	Fax: (555) 555-55555<br />
	Email: <a style="color:blue;" href="mailto:sgl@nyc.org">sgl@nyc.org</a><br />
	</td>
</tr>
</table>
</div>
</body>
</html>



I can't find anything wrong. Am I missing something?
Thank you.
 
LVL 33

Accepted Solution

by:
Robberbaron (robr) earned 2000 total points
ID: 39713123
I ran it through the validator.

There are lots of warnings, but the one that stands out is that the last row of the table has only one cell, while all the others have 2, and no colspan is specified.

iTextSharp is probably very picky about this.


<tr>
	<td valign="top"><b><em>Member</em></b>- <b><em>03-06-2013</em></b><br />
	Chott Ho, Esq.<br />1180 Heat St<br />New York, NY &nbsp; 10011<br />
	United States<br />Phone: (555) 555-55555<br />
	Fax: (555) 555-55555<br />
	Email: <a style="color:blue;" href="mailto:sgl@nyc.org">sgl@nyc.org</a><br />
	</td>
	<td> ************** Test **********</td>
</tr>
</table>
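
A check like the one the validator surfaced can also be done in code before the string goes to the converter. The following is a rough sketch, not part of iTextSharp: the class `TableRowChecker` and its methods are hypothetical names, and the regular expressions assume a single, non-nested table with no `colspan` attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TableRowChecker
{
    // Counts the <td>/<th> cells in each <tr>, in document order.
    // Assumes rows are not nested inside other rows.
    public static List<int> CountCellsPerRow(string html)
    {
        var counts = new List<int>();
        var rows = Regex.Matches(html, @"<tr\b[^>]*>(.*?)</tr>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        foreach (Match row in rows)
        {
            int cells = Regex.Matches(row.Groups[1].Value, @"<t[dh]\b",
                RegexOptions.IgnoreCase).Count;
            counts.Add(cells);
        }
        return counts;
    }

    // Returns the zero-based indexes of rows that have fewer cells
    // than the widest row in the markup.
    public static List<int> FindShortRows(string html)
    {
        List<int> counts = CountCellsPerRow(html);
        if (counts.Count == 0) return new List<int>();
        int widest = counts.Max();
        return Enumerable.Range(0, counts.Count)
            .Where(i => counts[i] < widest)
            .ToList();
    }
}
```

Run against the HTML posted earlier in this thread, `FindShortRows` should flag the third row, since it has one cell while the others have two.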

 

Author Closing Comment

by:Galina Besselyanova
ID: 39714228
Great! Thank you so much!
When generating the HTML, I have a variable that keeps track of the number of columns.
So I added the following at the end, and everything is working.
            if (columnCounter == 2)
            {
                // The last row already has both cells.
                html += "</tr></table></div></body></html>";
            }
            else
            {
                // Pad the last row with an empty cell before closing it.
                html += "<td></td></tr></table></div></body></html>";
            }
Thank you!
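
The same padding idea can be generalised to any column count. This is a minimal sketch under stated assumptions: `TableBuilder` and `BuildTable` are hypothetical names, and each string in `cellContents` is assumed to be ready-made cell markup like the entries in the HTML above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class TableBuilder
{
    // Builds a <table> with a fixed number of cells per row and pads
    // the final row with empty <td></td> cells so every row has the
    // same cell count.
    public static string BuildTable(IList<string> cellContents, int columnsPerRow)
    {
        if (columnsPerRow < 1)
            throw new ArgumentOutOfRangeException("columnsPerRow");

        var sb = new StringBuilder("<table>");
        for (int i = 0; i < cellContents.Count; i++)
        {
            if (i % columnsPerRow == 0) sb.Append("<tr>");
            sb.Append("<td valign=\"top\">").Append(cellContents[i]).Append("</td>");
            if (i % columnsPerRow == columnsPerRow - 1) sb.Append("</tr>");
        }

        // A non-zero remainder means the last row is still open and short.
        int remainder = cellContents.Count % columnsPerRow;
        if (remainder != 0)
        {
            for (int i = remainder; i < columnsPerRow; i++) sb.Append("<td></td>");
            sb.Append("</tr>");
        }

        sb.Append("</table>");
        return sb.ToString();
    }
}
```

With five records and two columns, as in this report, the last row would get one empty cell, which matches the fix described above.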