Reading a string from a PDF without searching the whole file

Posted on 2008-06-20
Last Modified: 2013-11-26
I am trying to strip a row of characters out of a PDF.  I have code that I obtained from this site, and it works well, but it loops through the whole file.  I would like to be able to strip characters from the same part of the file every time, just to obtain an account number that changes with each file.
Question by:thomashospital

Expert Comment

ID: 21832227
You say: "I would like to be able to strip characters from a certain part of the file everytime just to obtain an account number that constantly changes with each file."
Do you want to replace an account number?
Do you want to just read the account number?
What are you trying to do?

Author Comment

ID: 21832771
Sorry, I want to read the account number into a variable and then rename the file to match the account number.

Author Comment

ID: 21832797
I am pretty sure I can figure out the file rename part, and if I can't I will ask in another question.  I just want to read the account number from the PDF at a certain location.
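
For readers following along, here is a minimal sketch of the "read the account number into a variable, then rename the file" step described above. It is not the asker's code and makes several assumptions: the text of the relevant page has already been extracted by whatever PDF library is in use (limiting extraction to that one page is what avoids looping through the whole file), the number follows a label such as "Account Number:", and the names AccountNumberSketch, FindAccountNumber and BuildRenamedPath are hypothetical. Only standard .NET classes are used.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module AccountNumberSketch

    ' Assumed label and digit pattern; adjust both to match the real document.
    Private ReadOnly AccountPattern As New Regex( _
        "Account\s*(?:Number|No\.?|#)?\s*:?\s*(\d+)", RegexOptions.IgnoreCase)

    ' Returns the first account number found in the text, or Nothing if none.
    Function FindAccountNumber(ByVal pageText As String) As String
        If String.IsNullOrEmpty(pageText) Then Return Nothing
        Dim m As Match = AccountPattern.Match(pageText)
        If m.Success Then Return m.Groups(1).Value
        Return Nothing
    End Function

    ' Builds the new path: same folder, account number as the file name.
    Function BuildRenamedPath(ByVal pdfPath As String, ByVal accountNumber As String) As String
        Dim folder As String = Path.GetDirectoryName(pdfPath)
        If folder Is Nothing Then folder = String.Empty
        Return Path.Combine(folder, accountNumber & ".pdf")
    End Function

    Sub Main()
        ' Stand-in for text already extracted from one page of the PDF.
        Dim pageText As String = "Statement" & vbCrLf & _
                                 "Account Number: 00123456" & vbCrLf & _
                                 "Balance due"

        Dim account As String = FindAccountNumber(pageText)
        If account IsNot Nothing Then
            ' The actual rename would be File.Move(oldPath, newPath).
            Console.WriteLine(BuildRenamedPath("input.pdf", account))
        End If
    End Sub

End Module
```

If the number always sits at a fixed character offset instead of after a label, pageText.Substring(start, length) could replace the regular expression, with start and length taken from a sample file.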

Expert Comment

ID: 21833281
What are you using to read the PDF file?

Author Comment

ID: 21833377
I have something called pdfbox.

Expert Comment

ID: 21833613
Do you have a PDF file that I can test with?

Author Comment

ID: 21837892
I will create a test PDF.  How can I get it to you?

Expert Comment

ID: 21845855
I have not looked into it, but there is a way to post a file on EE for people to read.

Author Comment

ID: 21984870
It turns out that our problems run deeper than this solution.  Also, I was having trouble producing a test PDF that does not expose patient information.  Do you have any example code that shows how to extract a string from any PDF, so I can go ahead and give you the points for the solution?

Expert Comment

ID: 21997378
All of the examples I have were made with the full version of Adobe Acrobat, which can read all parts of a PDF file.

Author Comment

ID: 22088288
That is fine.

Accepted Solution

planocz earned 500 total points
ID: 22131233
Sample code with a test file:

Imports Microsoft.VisualBasic

Public Class frmW9

    Inherits System.Windows.Forms.Form

#Region " Windows Form Designer generated code "

    Public Sub New()

        MyBase.New()

        'This call is required by the Windows Form Designer.

        InitializeComponent()

        'Add any initialization after the InitializeComponent() call

    End Sub

    'Form overrides dispose to clean up the component list.

    Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)

        If disposing Then

            If Not (components Is Nothing) Then

                components.Dispose()

            End If

        End If

        MyBase.Dispose(disposing)

    End Sub

    'Required by the Windows Form Designer

    Private components As System.ComponentModel.IContainer

    'NOTE: The following procedure is required by the Windows Form Designer

    'It can be modified using the Windows Form Designer.  

    'Do not modify it using the code editor.

    Friend WithEvents Label1 As System.Windows.Forms.Label

    Friend WithEvents TextBox1 As System.Windows.Forms.TextBox

    Friend WithEvents Button1 As System.Windows.Forms.Button

    Public WithEvents PdfRpt As AxPdfLib.AxPdf

    <System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()

        Dim resources As System.Resources.ResourceManager = New System.Resources.ResourceManager(GetType(frmW9))

        Me.Label1 = New System.Windows.Forms.Label

        Me.TextBox1 = New System.Windows.Forms.TextBox

        Me.Button1 = New System.Windows.Forms.Button

        Me.PdfRpt = New AxPdfLib.AxPdf

        CType(Me.PdfRpt, System.ComponentModel.ISupportInitialize).BeginInit()





        Me.Label1.Location = New System.Drawing.Point(4, 12)

        Me.Label1.Name = "Label1"

        Me.Label1.Size = New System.Drawing.Size(40, 16)

        Me.Label1.TabIndex = 0

        Me.Label1.Text = "Name:"




        Me.TextBox1.Location = New System.Drawing.Point(56, 8)

        Me.TextBox1.Name = "TextBox1"

        Me.TextBox1.Size = New System.Drawing.Size(168, 20)

        Me.TextBox1.TabIndex = 1

        Me.TextBox1.Text = ""




        Me.Button1.Location = New System.Drawing.Point(184, 96)

        Me.Button1.Name = "Button1"

        Me.Button1.TabIndex = 2

        Me.Button1.Text = "Start"




        Me.PdfRpt.Enabled = True

        Me.PdfRpt.Location = New System.Drawing.Point(12, 72)

        Me.PdfRpt.Name = "PdfRpt"

        Me.PdfRpt.OcxState = CType(resources.GetObject("PdfRpt.OcxState"), System.Windows.Forms.AxHost.State)

        Me.PdfRpt.Size = New System.Drawing.Size(100, 50)

        Me.PdfRpt.TabIndex = 3

        Me.PdfRpt.Visible = False




        Me.AutoScaleBaseSize = New System.Drawing.Size(5, 13)

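        Me.ClientSize = New System.Drawing.Size(292, 145)

        'Controls only appear on the form once they are added to its Controls collection.

        Me.Controls.Add(Me.PdfRpt)

        Me.Controls.Add(Me.Button1)

        Me.Controls.Add(Me.TextBox1)

        Me.Controls.Add(Me.Label1)

        Me.Name = "frmW9"

        Me.Text = "frmW9"

        CType(Me.PdfRpt, System.ComponentModel.ISupportInitialize).EndInit()

    End Sub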

#End Region

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click



    End Sub

    Private Sub frmw9_Resize(ByVal eventSender As System.Object, ByVal eventArgs As System.EventArgs) Handles MyBase.Resize

        PdfRpt.SetBounds(0, 0, Me.ClientRectangle.Width - 15, Me.ClientRectangle.Height - 15)

    End Sub

    Private Sub AddData()

        If TextBox1.Text = String.Empty Then Exit Sub

        FdfAcX = New FDFACXLib.FdfApp

        FdfDoc = FdfAcX.FDFCreate



    End Sub

    Private Sub PDFView()


        'show PDF file here

        With Me

            .PdfRpt.Visible = True



            Cursor.Current = Cursors.Default


        End With

        Me.WindowState = FormWindowState.Maximized

    End Sub

    Public Sub SavePDF()

        Cursor.Current = Cursors.WaitCursor

        sPDFPath = sAppPath & "\Acrobat\W9Test.PDF"



        FdfDoc = Nothing

        FdfAcX = Nothing

        AcroExchAVDoc = CreateObject("AcroExch.AVDoc")

        bOK = AcroExchAVDoc.Open(sFDFPath, "")

        AcroExchPDDoc = AcroExchAVDoc.GetPDDoc

        ' bOK = AcroExchPDDoc.Save(Acrobat.__MIDL___MIDL_itf_acrobat_0000_0005.PDSaveFull, sPDFPath)

        bOK = AcroExchPDDoc.Save(1, sPDFPath)



        Cursor.Current = Cursors.Default

    End Sub

    Public Sub W9_Page1()

        iCount = 1 'FOR TESTING

        'HERE is where you add your fields

        With FdfDoc

            ' ADD Template to your Form--- On Adobe Acrobat Menu goto the Tools,Forms

            'then to page Templates... create a template name...W9_page_1

            .FDFAddTemplate(True, sMasterPath & "fW9.pdf", "W9_page_1", True)

            .FDFSetValue("f1-" & (iCount), TextBox1.Text.ToString, False)

        End With

    End Sub

End Class
