Solved

Reading files from a folder into an Excel spreadsheet

Posted on 2013-11-16
Last Modified: 2013-12-05
I am looking for help with an Excel script that:

1. reads all files from a given folder (path can be hardcoded)
2. opens them as text and copies the contents into an Excel spreadsheet with 2 columns:
a. FileName
b. FileContents

The resulting spreadsheet should have as many rows as there are files and should contain the contents of each file in the "FileContents" column.

Thank you, Experts!
Question by:cyber-33
18 Comments
 

Author Comment

by:cyber-33
ID: 39654008
Note that some files can be large and contain all sorts of weird characters. The trick is to insert the contents of each file into a single cell. I am attaching a few files as samples.
0004514.html

Author Comment

by:cyber-33
ID: 39654014
File2
0011102.html
LVL 14

Expert Comment

by:Faustulus
ID: 39654088
What is the format of the files you wish to import? Are they all of the same format? What is their extension?
Note that the maximum number of characters that Excel can write into a single cell is 32,767. No trick is needed to write that many, and no trick will let you write more.

Author Comment

by:cyber-33
ID: 39654191
The files are HTML, but I want to treat them as text. I can deal with the size limitations by importing a subset of each file appearing within some HTML tags. For example, open a section of the file appearing between <claims> and </claims> tags. This part will be significantly smaller than the entire file.

Thank you for your help!

Author Comment

by:cyber-33
ID: 39654193
Also, as the subsets are being imported, all the strange characters can be replaced with spaces.
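A minimal sketch of that replacement step, for illustration only. The thread never defines "strange characters", so this assumes it means anything outside printable ASCII (including tabs and line breaks). The function name `ReplaceOddChars` is hypothetical and does not come from the accepted solution.

```vba
Option Explicit

' Hypothetical helper (not from the thread): replaces every character
' outside printable ASCII (codes 32 to 126) with a single space.
' ASSUMPTION: "strange characters" means control and non-ASCII characters.
Public Function ReplaceOddChars(ByVal Text As String) As String
    Dim i As Long
    Dim code As Long
    Dim result As String

    result = Text
    For i = 1 To Len(result)
        code = AscW(Mid$(result, i, 1))
        ' AscW returns a negative Integer for code points above 32767,
        ' so the "< 32" test also catches those.
        If code < 32 Or code > 126 Then
            Mid$(result, i, 1) = " "   ' Mid statement overwrites in place
        End If
    Next i

    ReplaceOddChars = result
End Function
```

Each replaced character becomes exactly one space, so the length of the text is unchanged and a later length check against the cell limit still applies to the same number of characters.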
LVL 14

Expert Comment

by:Faustulus
ID: 39654724
Can we say that you want only the part of the file between <claims> tags?
And that you will be happy to truncate any part of that part that might exceed the maximum acceptable length?
Can we say that you want only those exceptional characters replaced which Excel might refuse to accept?
I would really appreciate a sample of such a file - duly sanitised for public view. Your links lead to all kinds of advertising. Perhaps you can upload them in txt format.

Author Comment

by:cyber-33
ID: 39655375
My comments are below:
Can we say that you want only the part of the file between <claims> tags?
[YES]

And that you will be happy to truncate any part of that part that might exceed the maximum acceptable length?
[YES]

Can we say that you want only those exceptional characters replaced which Excel might refuse to accept?
[YES]

I would really appreciate a sample of such a file - duly sanitised for public view. Your links lead to all kinds of advertising. Perhaps you can upload them in txt format.
[A duly sanitized file would be different from the sample I would like the import process to work with. The 2 files attached with everything in them represent a good sample of the data I will be working with]

Thank you for your help.
LVL 14

Expert Comment

by:Faustulus
ID: 39655600
I haven't been able to get at your two files.

Author Comment

by:cyber-33
ID: 39657933
They are just text files in html format...
LVL 14

Expert Comment

by:Faustulus
ID: 39660245
When I click on them they open a Web site. Perhaps they execute. If they are text files change their extension to txt.

Author Comment

by:cyber-33
ID: 39666196
LVL 14

Expert Comment

by:Faustulus
ID: 39668152
Thank you.
I have now received your two files. Unfortunately, I will be travelling these next four days, leaving even my laptop behind, and expect to return to a back log of work. More likely than not it will be a week before I can get back to you.
Faustulus
LVL 14

Expert Comment

by:Faustulus
ID: 39668167
There seems to be a little problem with the claims tags. Neither file has <claims>, </claims> tags. The nearest I can find is <!--      Claims  --> and >Claims:<.
It seems that quite substantial parts of the files are not between these tags. Do you wish to revise the instruction?

Author Comment

by:cyber-33
ID: 39669016
The instructions are still valid. I used "claims" as a sample tag. The idea is that I can use some delimiters within the text to select the subsets within the files. The tags that you identified are perfect for my needs.

Thank you for your help!
LVL 14

Accepted Solution

by:
Faustulus earned 2000 total points
ID: 39676675
Thank you for your patience. The attached workbook contains the solution you asked for.
Please set the Const FilePath to point at the folder where your htm files are. The program will write to the worksheet. You can change the sheet's name and assign different columns. Note that the macro formats the two output columns.

To run the program call the procedure 'ExtractFromFile'.
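The attached workbook is not reproduced in the thread. As a rough illustration only, a procedure of this kind could be sketched as below. The name `ImportFolderSketch`, the folder path, the use of the active sheet, columns A and B, and the `*.htm*` filter are all assumptions; only the constant name FilePath and the two column headings come from the posts above.

```vba
Option Explicit

' Hypothetical sketch, NOT the code from the attached workbook.
' Writes one row per file: file name in column A, file text in column B.
Public Sub ImportFolderSketch()
    Const FilePath As String = "C:\Temp\HtmlFiles\"  ' assumed folder; must end with "\"
    Const MaxCellChars As Long = 32767               ' Excel's per-cell character limit

    Dim ws As Worksheet
    Dim currentFile As String
    Dim fileNum As Integer
    Dim contents As String
    Dim rowNum As Long

    Set ws = ActiveSheet                             ' assumed output sheet
    ws.Cells(1, 1).Value = "FileName"
    ws.Cells(1, 2).Value = "FileContents"
    rowNum = 2

    currentFile = Dir(FilePath & "*.htm*")           ' matches .htm and .html
    Do While Len(currentFile) > 0
        ' Read the whole file as text in one call.
        fileNum = FreeFile
        Open FilePath & currentFile For Binary Access Read As #fileNum
        contents = Space$(LOF(fileNum))
        Get #fileNum, , contents
        Close #fileNum

        ws.Cells(rowNum, 1).Value = currentFile
        ' Truncate to what a single cell can hold.
        ws.Cells(rowNum, 2).Value = Left$(contents, MaxCellChars)

        rowNum = rowNum + 1
        currentFile = Dir                            ' next matching file
    Loop
End Sub
```

In this sketch the raw file text goes straight into the cell. In practice the extraction and character clean-up discussed in this thread would be applied to `contents` before the truncation.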
The program looks for one string that marks the beginning of the excerpt and another that marks its end. You can experiment with different strings, which are set in the procedure Private Function GetExtract(TextStream As String) As String:

    Const TxtStart As String = "<!--    Claims  -->"
    Const TxtEnd As String = ">Claims:</div>"

The above two strings follow each other very closely in your files, so the extracted text is quite short. The extract could be processed further in several ways, for example by eliminating all <> brackets. For the moment the code only removes leading non-printing characters (like carriage returns).
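The marker-based extraction described here can be illustrated with a short sketch. This is not the workbook's GetExtract. `ExtractBetween` is a hypothetical name, it takes the two markers as parameters rather than as constants, and the case-insensitive comparison and the empty-string result for a missing marker are assumptions.

```vba
Option Explicit

' Hypothetical sketch, not the workbook's GetExtract: returns the text
' between the first TxtStart marker and the next TxtEnd marker.
' Returns "" when either marker is missing (an assumed convention).
Public Function ExtractBetween(ByVal TextStream As String, _
                               ByVal TxtStart As String, _
                               ByVal TxtEnd As String) As String
    Dim startPos As Long
    Dim endPos As Long

    startPos = InStr(1, TextStream, TxtStart, vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(TxtStart)      ' first character after the start marker

    endPos = InStr(startPos, TextStream, TxtEnd, vbTextCompare)
    If endPos = 0 Then Exit Function

    ExtractBetween = Mid$(TextStream, startPos, endPos - startPos)
End Function
```

Passing the markers as arguments makes it easy to try different delimiters without editing constants, which suits the experimenting suggested above.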
EXX-131126-Extract-From-HTML-Fil.xlsm

Author Comment

by:cyber-33
ID: 39686180
Thank you! I will test this solution on Monday and assign points then. Looking forward to it!

Author Comment

by:cyber-33
ID: 39699706
Verified - the code is clean, easy to read, follow and modify. Thank you!

Author Closing Comment

by:cyber-33
ID: 39699707
Elegant solution. Excellent coding style. Knowledgeable expert.
