jtovar3 asked:

How can I import a text file into excel then format the data easily?

Hello all!

I originally had a horrible PDF file with a bunch of tables and information in it, as seen in the standard pdf picture.

I then used PDFZilla to convert the pdf to a text file. The output can be seen in sampletext.

Then this is where my problem is...

Whenever I try to import the text file into Excel, I have a lot of trouble getting it to format correctly. I tried Text to Columns, but I can't get it to separate the numbers correctly.

I don't care about the column headers, but I do need the row data to be separated so that I can write a macro to extract the data and then copy it to another workbook.

Please help! I hope that I have provided enough explanation.
Richard2k4:

If it were me, I would write a script in PowerShell or VB to parse the text and separate the word sections from the number sections. Then I would import only the numbers as space-delimited, insert a blank column, and paste in the word sections.
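
A rough sketch of that parsing idea, written in Python rather than PowerShell or VB. It assumes each row of the converted text is a label followed by whitespace-separated numbers; that row shape, the number pattern, and the names `split_row` and `rows_from_text` are all hypothetical, since the sample file's actual layout isn't shown here.

```python
import re

# Assumption: a "number" may carry thousands separators, a decimal part,
# a leading minus or dollar sign, a trailing percent sign, or parentheses.
NUMBER = r"\(?-?\$?\d[\d,]*(?:\.\d+)?%?\)?"

# One or more number tokens at the end of the line, each preceded by
# whitespace (or by the start of the line).
TRAILING_NUMBERS = re.compile(rf"(?:(?:^|\s+){NUMBER})+\s*$")


def split_row(line):
    """Return (label, [number tokens]) for one line of converted text."""
    line = line.rstrip("\n")
    match = TRAILING_NUMBERS.search(line)
    if match is None:
        # No trailing numbers: treat the whole line as a label.
        return line.strip(), []
    label = line[:match.start()].strip()
    numbers = match.group(0).split()
    return label, numbers


def rows_from_text(text):
    """Yield [label, number, number, ...] for every non-blank line."""
    for line in text.splitlines():
        if line.strip():
            label, numbers = split_row(line)
            yield [label, *numbers]
```

Each yielded row could then be written out with the standard `csv` module so that Excel opens it with the label in the first column and one number per column after it. Rows whose numbers are not at the end of the line would need a different pattern.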
Have you looked at the text file that gets created?  There are no separators.
I've seen that a lot when copying pdf files and consider it a problem with the automation mechanism used to examine the document.
Anyway, it seems that your PDFZilla tool is not working on that pdf document.
shahzadbux:

[This solution is only available to Experts Exchange members.]
jtovar3:


Actually, the copy and paste didn't work out too badly, surprisingly.

Do you have any tips on writing a script to automatically open a PDF, copy all, then paste into Excel?

I also need to work on code to find specific sets of data and then transfer them to a separate spreadsheet, but I think I should move that into a different question thread.
Hmm... the only way I know off the top of my head is to use AutoIt, or maybe a VBScript?

Anyone else have another suggestion?
I tried it by adding a PDF reference to VBA and opening the PDF, and it was excessively complicated to interpret because of the way PDFs are created. It is much easier to copy/paste by hand.
redmondb:

Do you have an OCR program?

For example, I have used ABBYY FineReader a lot for converting PDFs and TIFFs to text. This particular OCR program (I can't speak for others, obviously) recognises when the text is already available and uses that, bypassing any actual character recognition.


B, I don't think that comment was meant for me.  I haven't had an OCR program in about 10 years and really don't use one, but it sounds useful for pdf interpretations.
Apologies, rspahitz!
We have the full version of Adobe Acrobat Pro. It allows you to convert the PDF, and then you can copy and paste directly from it into Excel/Word. Quite a worthwhile tool.


Presumably you had tried that before using PDFZilla?

I have looked at PDFZilla and I don't think it will work in this scenario.  What jtovar3 needs is something that will convert the pdf to a csv file. I found this, it might work.

1. Open the desired PDF document in Acrobat Standard or Professional.
2. Select "Export" under "File" and choose "Text." Some versions of Acrobat include options for "Text (accessible)" and "Text (plain);" choose "Text (accessible)" to preserve basic formatting.
3. Type the file name for the converted document and click the "Save" button. Acrobat saves text files as tab-delimited files.
4. Launch the spreadsheet application (such as Microsoft Excel or OpenOffice Calc) and select "Open" under "File" in the top menu bar.
5. Select the text file created in Step 3 and click the "Open" button to launch an Import Wizard.
6. Review the pages in the Import Wizard to select how the data is organized in columns and click the "Next" button to navigate through the wizard. For example, select "Delimited" to specify fields and click the option next to "Space" or "Comma" to specify how the fields are separated.
7. Click the "Finish" button.
8. Select the "Save As" function (usually under "File" in the top menu bar) and select the file type as "CSV (Comma-Separated Values)." Select "CSV (Windows)" instead of "CSV (MS-DOS)" if this option is displayed.
9. Click the "Save" button.
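
If the exported text file really is tab-delimited, as Step 3 says, Steps 4 to 9 could also be scripted instead of clicked through. A minimal sketch in Python using only the standard `csv` module; the function name `tab_text_to_csv`, the file paths, and the UTF-8 encoding are assumptions, not anything Acrobat or Excel prescribes.

```python
import csv


def tab_text_to_csv(src_path, dst_path, encoding="utf-8"):
    """Copy a tab-delimited text file to a comma-separated file.

    Blank lines are skipped and surrounding whitespace is trimmed from
    every cell. Returns the number of rows written.
    """
    count = 0
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst)  # quotes cells that contain commas
        for row in reader:
            if not any(cell.strip() for cell in row):
                continue  # nothing but whitespace on this line
            writer.writerow([cell.strip() for cell in row])
            count += 1
    return count
```

Calling something like `tab_text_to_csv("export.txt", "export.csv")` (hypothetical file names) would produce a file that Excel opens directly as columns. If the export turns out to be space-separated rather than tab-separated, this will leave each line in a single column, and the rows would need to be split some other way first.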

Please read my last post again - I was querying why he was using PDFZilla when he has Adobe Acrobat Pro!

jtovar3:


I'll post another question about scripts to open Adobe Acrobat and copy and paste.