Reading a string from a PDF without searching the whole file

Posted on 2008-06-20
Last Modified: 2013-11-26
I am trying to strip a row of characters out of a PDF. I have code that I obtained from this site, and it works well, but it loops through the whole file. I would like to read the characters from a fixed part of the file every time, just to obtain an account number that is different in each file.
Question by:thomashospital
LVL 27

Expert Comment

ID: 21832227
You say: "I would like to be able to strip characters from a certain part of the file every time just to obtain an account number that constantly changes with each file."
Do you want to replace an account number?
Do you want to just read the account number?
What are you trying to do?

Author Comment

ID: 21832771
Sorry, I want to read the account number into a variable and then rename the file to match the account number.

Author Comment

ID: 21832797
I am pretty sure I can figure out the file rename part, and if I can't I will ask in another question. I just want to read the account number from the PDF at a certain location.

Expert Comment

ID: 21833281
What are you using to read the PDF file?

Author Comment

ID: 21833377
I have something called PDFBox.

Expert Comment

ID: 21833613
Do you have a PDF file that I can test with?

Author Comment

ID: 21837892
I will create a test PDF. How can I get it to you?

Expert Comment

ID: 21845855
I have not looked, but there is a way to post a file on EE for people to read.

Author Comment

ID: 21984870
It turns out that our problems run deeper than this solution. Also, I was having trouble editing a PDF so that I could post it without exposing patient information. Do you have any example code that shows how to extract a string from any PDF, so I can go ahead and give you the points for the solution?

Expert Comment

ID: 21997378
All of the examples I have are made with the full version of Adobe Acrobat, which can read all parts of a PDF file.

Author Comment

ID: 22088288
That is fine.

Accepted Solution

planocz earned 500 total points
ID: 22131233
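Sample code with a test file:

Imports Microsoft.VisualBasic

Public Class frmW9
    Inherits System.Windows.Forms.Form

    ' Fields used by the methods below. Their declarations were not part of the post;
    ' the types and starting values here are assumptions.
    Private FdfAcX As FDFACXLib.FdfApp
    Private FdfDoc As FDFACXLib.FdfDoc
    Private AcroExchAVDoc As Object   ' late-bound AcroExch.AVDoc
    Private AcroExchPDDoc As Object   ' late-bound AcroExch.PDDoc
    Private bOK As Boolean
    Private iCount As Integer
    Private sAppPath As String = System.Windows.Forms.Application.StartupPath
    Private sMasterPath As String     ' folder holding fW9.pdf, with a trailing backslash
    Private sFDFPath As String        ' path of the FDF file opened in SavePDF
    Private sPDFPath As String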

#Region " Windows Form Designer generated code "

    Public Sub New()
        MyBase.New()

        'This call is required by the Windows Form Designer.
        InitializeComponent()

        'Add any initialization after the InitializeComponent() call
    End Sub

    'Form overrides dispose to clean up the component list.
    Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)
        If disposing Then
            If Not (components Is Nothing) Then
                components.Dispose()
            End If
        End If
        MyBase.Dispose(disposing)
    End Sub

    'Required by the Windows Form Designer
    Private components As System.ComponentModel.IContainer

    'NOTE: The following procedure is required by the Windows Form Designer
    'It can be modified using the Windows Form Designer.
    'Do not modify it using the code editor.
    Friend WithEvents Label1 As System.Windows.Forms.Label
    Friend WithEvents TextBox1 As System.Windows.Forms.TextBox
    Friend WithEvents Button1 As System.Windows.Forms.Button
    Public WithEvents PdfRpt As AxPdfLib.AxPdf

    <System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()
        Dim resources As System.Resources.ResourceManager = New System.Resources.ResourceManager(GetType(frmW9))
        Me.Label1 = New System.Windows.Forms.Label
        Me.TextBox1 = New System.Windows.Forms.TextBox
        Me.Button1 = New System.Windows.Forms.Button
        Me.PdfRpt = New AxPdfLib.AxPdf
        CType(Me.PdfRpt, System.ComponentModel.ISupportInitialize).BeginInit()
        Me.SuspendLayout()
        '
        'Label1
        '
        Me.Label1.Location = New System.Drawing.Point(4, 12)
        Me.Label1.Name = "Label1"
        Me.Label1.Size = New System.Drawing.Size(40, 16)
        Me.Label1.TabIndex = 0
        Me.Label1.Text = "Name:"
        '
        'TextBox1
        '
        Me.TextBox1.Location = New System.Drawing.Point(56, 8)
        Me.TextBox1.Name = "TextBox1"
        Me.TextBox1.Size = New System.Drawing.Size(168, 20)
        Me.TextBox1.TabIndex = 1
        Me.TextBox1.Text = ""
        '
        'Button1
        '
        Me.Button1.Location = New System.Drawing.Point(184, 96)
        Me.Button1.Name = "Button1"
        Me.Button1.TabIndex = 2
        Me.Button1.Text = "Start"
        '
        'PdfRpt
        '
        Me.PdfRpt.Enabled = True
        Me.PdfRpt.Location = New System.Drawing.Point(12, 72)
        Me.PdfRpt.Name = "PdfRpt"
        Me.PdfRpt.OcxState = CType(resources.GetObject("PdfRpt.OcxState"), System.Windows.Forms.AxHost.State)
        Me.PdfRpt.Size = New System.Drawing.Size(100, 50)
        Me.PdfRpt.TabIndex = 3
        Me.PdfRpt.Visible = False
        '
        'frmW9
        '
        Me.AutoScaleBaseSize = New System.Drawing.Size(5, 13)
        Me.ClientSize = New System.Drawing.Size(292, 145)
        Me.Controls.Add(Me.PdfRpt)
        Me.Controls.Add(Me.Button1)
        Me.Controls.Add(Me.TextBox1)
        Me.Controls.Add(Me.Label1)
        Me.Name = "frmW9"
        Me.Text = "frmW9"
        CType(Me.PdfRpt, System.ComponentModel.ISupportInitialize).EndInit()
        Me.ResumeLayout(False)
    End Sub

#End Region

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

    End Sub

    Private Sub frmw9_Resize(ByVal eventSender As System.Object, ByVal eventArgs As System.EventArgs) Handles MyBase.Resize
        PdfRpt.SetBounds(0, 0, Me.ClientRectangle.Width - 15, Me.ClientRectangle.Height - 15)
    End Sub

    Private Sub AddData()
        If TextBox1.Text = String.Empty Then Exit Sub
        FdfAcX = New FDFACXLib.FdfApp
        FdfDoc = FdfAcX.FDFCreate
    End Sub

    Private Sub PDFView()
        'show PDF file here
        With Me
            .PdfRpt.Visible = True
            Cursor.Current = Cursors.Default
        End With
        Me.WindowState = FormWindowState.Maximized
    End Sub

    Public Sub SavePDF()
        Cursor.Current = Cursors.WaitCursor
        sPDFPath = sAppPath & "\Acrobat\W9Test.PDF"

        FdfDoc = Nothing
        FdfAcX = Nothing
        AcroExchAVDoc = CreateObject("AcroExch.AVDoc")
        bOK = AcroExchAVDoc.Open(sFDFPath, "")
        AcroExchPDDoc = AcroExchAVDoc.GetPDDoc
        ' bOK = AcroExchPDDoc.Save(Acrobat.__MIDL___MIDL_itf_acrobat_0000_0005.PDSaveFull, sPDFPath)
        bOK = AcroExchPDDoc.Save(1, sPDFPath)

        Cursor.Current = Cursors.Default
    End Sub

    Public Sub W9_Page1()
        iCount = 1 'FOR TESTING
        'HERE is where you add your fields
        With FdfDoc
            'Add a template to your form: in the Adobe Acrobat menu go to Tools, Forms,
            'then to Page Templates, and create a template named W9_page_1
            .FDFAddTemplate(True, sMasterPath & "fW9.pdf", "W9_page_1", True)
            .FDFSetValue("f1-" & iCount, TextBox1.Text, False)
        End With
    End Sub

End Class
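
The sample above fills in form fields and does not read text back out of a PDF. For the goal stated in the question (read the account number into a variable, then rename the file to match), a minimal sketch might look like the following. It assumes the text of the relevant page has already been extracted into a string by whatever PDF library is in use, and that the number is a run of digits following a label such as "Account Number". The module name, function names, label and sample text are all hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module AccountRename

    ' Returns the first run of digits that follows the label, or Nothing if
    ' no match is found. The label and the digits-only format are assumptions.
    Function FindAccountNumber(ByVal pageText As String, ByVal label As String) As String
        Dim pattern As String = Regex.Escape(label) & "\s*:?\s*(\d+)"
        Dim m As Match = Regex.Match(pageText, pattern, RegexOptions.IgnoreCase)
        If m.Success Then Return m.Groups(1).Value
        Return Nothing
    End Function

    ' Renames the PDF, in its current folder, to <accountNumber>.pdf
    ' and returns the new path. Refuses to overwrite an existing file.
    Function RenameToAccount(ByVal pdfPath As String, ByVal accountNumber As String) As String
        Dim folder As String = Path.GetDirectoryName(Path.GetFullPath(pdfPath))
        Dim target As String = Path.Combine(folder, accountNumber & ".pdf")
        If File.Exists(target) Then
            Throw New IOException("Target file already exists: " & target)
        End If
        File.Move(pdfPath, target)
        Return target
    End Function

    Sub Main()
        ' Hypothetical page text; in practice this comes from the PDF library.
        Dim pageText As String = "Statement" & Environment.NewLine & _
                                 "Account Number: 00412877" & Environment.NewLine & _
                                 "Balance due"
        Dim account As String = FindAccountNumber(pageText, "Account Number")
        Console.WriteLine(account)   ' prints 00412877
    End Sub

End Module
```

If the number always sits at the same character offset in the extracted text, pageText.Substring(start, length) could stand in for the regular expression. Once the number is found, a call such as RenameToAccount(path, account) would do the rename. Restricting extraction to the one page that carries the number, if the library allows it, avoids walking the whole file.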
