Download pdf file ASP.NET

Posted on 2012-03-28
Medium Priority
Last Modified: 2012-06-21
Okay, I am working in VB.NET using VS 2010 on an ASP.NET 4.0 website. I am looking to create a service that downloads a publicly available PDF file from another website and then converts it into text for display. I am currently using HttpWebRequest and HttpWebResponse for my downloads, and that has worked well, but there is a problem with the file it brings down.

The test file is 17 KB, but when I download it, it shows a size of 21 KB. When I attempt to open the downloaded PDF file, I get a warning saying that the file could not be opened because it is either not a supported file type or has been damaged. I know the test file is good, but I suspect that somewhere along the line the header is getting bloated with a couple of KB worth of junk.

Below is the code I am using to download and write.

Dim fName As String = Server.MapPath("Programs/") & Date.Now.Ticks.ToString & ".pdf"
Dim wr As HttpWebRequest = CType(WebRequest.Create("http://www.somesite.com/test.pdf"), HttpWebRequest)
Dim ws As HttpWebResponse = CType(wr.GetResponse(), HttpWebResponse)

Dim memStream As MemoryStream = New MemoryStream

Dim length As Integer = 1024
Dim buffer As [Byte]() = New [Byte](length - 1) {}
Dim bytesRead As Integer = ws.GetResponseStream.Read(buffer, 0, length - 1)

' write the required bytes
While bytesRead > 0
    memStream.Write(buffer, 0, bytesRead)
    bytesRead = ws.GetResponseStream.Read(buffer, 0, length)
End While

Using fstr As FileStream = New FileStream(fName, FileMode.CreateNew, FileAccess.ReadWrite)
End Using

'Delete the PDF - Currently disabled for testing


Help me Obi Wan Kenobi... I mean help me EE, I am lost and cannot find the answer on my own. I suspect it comes from the improper handling of the stream, but I can't figure it out.
Question by:Thomas_Hawkins
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37780190
Have you considered using the WebClient class? I think it would make the task a bit simpler.



Dim fName As String = Server.MapPath("Programs/") & Date.Now.Ticks.ToString & ".pdf"

Using client As New System.Net.WebClient()
    client.DownloadFile("http://www.somesite.com/test.pdf", fName)
End Using





Author Comment

ID: 37780308
Kaufmed, I tried that solution just after you suggested it, with the same results. The resulting PDF is 21 KB and unreadable. Here is the code:

Dim fName As String = Server.MapPath("Programs/") & Date.Now.Ticks.ToString & ".pdf"
Using client As New System.Net.WebClient()
    client.DownloadFile("http://www.equibase.com/premium/eqbPDFChartPlus.cfm?RACE=1&BorP=P&TID=AQU&CTRY=USA&DT=02/09/2011&DAY=D&STYLE=EQB.pdf", fName)
End Using


LVL 16

Expert Comment

ID: 37780484
If the solution from kaufmed (download the file and save it) is not working, maybe the file that equibase pushes is being sent incorrectly. Try downloading a real PDF file like this:

If that doesn't work, something else is wrong (maybe the parsePDF method?)
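
One way to test whether the server is sending something other than a PDF is to look at the response itself. The following is a minimal sketch, not code from the thread: it downloads the bytes with WebClient.DownloadData, prints the Content-Type header, and checks for the "%PDF-" signature that a PDF file normally begins with. The module name, the LooksLikePdf helper and the 200-byte preview are illustrative assumptions, and the URL is the placeholder from the question.

```vbnet
Imports System.Net
Imports System.Text

Module PdfDownloadCheck
    ' Hypothetical helper: True when the data begins with the "%PDF-" signature.
    Function LooksLikePdf(data As Byte()) As Boolean
        If data Is Nothing OrElse data.Length < 5 Then Return False
        Return Encoding.ASCII.GetString(data, 0, 5) = "%PDF-"
    End Function

    Sub Main()
        ' Placeholder URL from the question; substitute the real address.
        Dim url As String = "http://www.somesite.com/test.pdf"

        Using client As New WebClient()
            Dim data As Byte() = client.DownloadData(url)

            ' ResponseHeaders is populated once the request has completed.
            Console.WriteLine("Content-Type: " & client.ResponseHeaders("Content-Type"))
            Console.WriteLine("Bytes received: " & data.Length.ToString())
            Console.WriteLine("Starts with %PDF-: " & LooksLikePdf(data).ToString())

            If Not LooksLikePdf(data) Then
                ' Print the start of the payload as text so that an HTML
                ' error or login page is easy to recognise.
                Dim previewLength As Integer = Math.Min(200, data.Length)
                Console.WriteLine(Encoding.ASCII.GetString(data, 0, previewLength))
            End If
        End Using
    End Sub
End Module
```

If the preview shows HTML markup, the extra kilobytes are probably a web page returned in place of the document rather than junk added to a PDF header.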

LVL 20

Accepted Solution

BuggyCoder earned 1500 total points
ID: 37780502
Try this:-

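Dim request = WebRequest.Create("<your path>")

' Using blocks close the response, reader and file even if an exception is thrown
Using response = TryCast(request.GetResponse(), HttpWebResponse)
    If response IsNot Nothing Then
        Using sReader = New BinaryReader(response.GetResponseStream())
            ' ContentLength is -1 when the server sends no Content-Length header
            Dim bytes = sReader.ReadBytes(CInt(response.ContentLength))

            Using fs = New FileStream("c:/test.pdf", FileMode.CreateNew)
                fs.Write(bytes, 0, bytes.Length)
            End Using
        End Using
    End If
End Using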


LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37781148
I just tried the code in a new project, and it downloads the file correctly for me: 17 KB. If you put a breakpoint on the call to parsePDF, and then go to the download folder and view the file, is the size correct? Can you view the PDF in Adobe Reader before parsePDF works with the file?

Expert Comment

ID: 37781588
How are you trying to open the PDF file? Using code?
Then you need a third-party tool to read the PDF. .NET does not have any built-in functionality for reading PDFs.

One of the best free ones is iTextSharp.
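
For reference, text extraction with iTextSharp might look roughly like the sketch below. It assumes the iTextSharp 5.x assembly is referenced in the project, and the module name, function name and pdfPath parameter are illustrative, not taken from the thread. It will only work on a file that is a valid PDF to begin with.

```vbnet
Imports System.Text
Imports iTextSharp.text.pdf
Imports iTextSharp.text.pdf.parser

Module PdfTextSketch
    ' Hypothetical helper: returns the text of every page in the PDF at pdfPath.
    Function ExtractText(pdfPath As String) As String
        Dim sb As New StringBuilder()
        Dim reader As New PdfReader(pdfPath)
        Try
            ' iTextSharp numbers pages starting at 1.
            For pageNumber As Integer = 1 To reader.NumberOfPages
                sb.AppendLine(PdfTextExtractor.GetTextFromPage(reader, pageNumber))
            Next
        Finally
            ' Release the file handle even if extraction fails.
            reader.Close()
        End Try
        Return sb.ToString()
    End Function
End Module
```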

Author Comment

ID: 37783210
kaufmed, I've had the solution work for me as well, but for some reason sometimes (most times) it does not. When I've used stream and binary readers, the content length shows as -1. And yes, I've set a breakpoint right at the call to parsePDF(), and I am attempting to open the file in Adobe Reader 9.

StephanOnline, that PDF downloaded perfectly, 89 KB as intended. It is true that my file is a .cfm page rendered as a PDF, but I have successfully downloaded it before. Does anyone know of a way to do this properly, or am I going to have to scrape the page?

BuggyCoder, I've used your code successfully on numerous PDF files now. Sadly, it gives me a ContentLength of -1 on my intended files. I suppose the fault is in the file I've chosen, a .cfm file presented as a PDF.

darjmaulik, I have used both SautinSoft's PDF Focus and PDFBox, but I've not dabbled in iTextSharp at all.
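
HttpWebResponse.ContentLength returns -1 when the server does not send a Content-Length header, which is common for dynamically generated pages. A download does not have to depend on that value. The sketch below is an illustration rather than code from the thread: it copies the response stream to disk until the stream ends, using Stream.CopyTo (available from .NET 4.0). The module name, DownloadToFile and its parameters are hypothetical.

```vbnet
Imports System.IO
Imports System.Net

Module StreamDownloadSketch
    ' Hypothetical helper: saves whatever the server returns for url to targetPath.
    Sub DownloadToFile(url As String, targetPath As String)
        Dim request As WebRequest = WebRequest.Create(url)

        Using response As WebResponse = request.GetResponse()
            Using source As Stream = response.GetResponseStream()
                Using target As New FileStream(targetPath, FileMode.Create, FileAccess.Write)
                    ' CopyTo reads until the end of the stream, so the
                    ' Content-Length header is never consulted.
                    source.CopyTo(target)
                End Using
            End Using
        End Using
    End Sub
End Module
```

This only removes the dependency on ContentLength. If the server returns something other than a PDF, the saved file will still be unreadable.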

Author Closing Comment

ID: 37831009
I would've given an A, but everyone dropped out on me. This solution was almost perfect; however, it did not fix my issue.

What did solve it was sending a content type along with the request (request.ContentType = "application/pdf") and then grabbing the Content-Length header (response.GetResponseHeader("Content-Length")) to use with the FileStream object fs.
