Solved

Reading files from a folder into excel spreadsheet

Posted on 2013-11-16
Last Modified: 2013-12-05
I am looking for help with an Excel script that

1. reads all files from a given folder (path can be hardcoded)
2. opens them as text and copies the contents into an Excel spreadsheet with 2 columns:
a. FileName
b. FileContents

The resulting spreadsheet should have as many rows as there are files and should contain the contents of each file in the "FileContents" column.

Thank you, Experts!
Question by:cyber-33
 

Author Comment

by:cyber-33
ID: 39654008
Note that some files can be large and contain all sorts of weird characters. The trick is to insert the contents of each file into a single cell. I am attaching a few files as samples.
0004514.html
 

Author Comment

by:cyber-33
ID: 39654014
File2
0011102.html
 
LVL 14

Expert Comment

by:Faustulus
ID: 39654088
What is the format of the files you wish to import? Are they all of the same format? What is their extension?
Note that the maximum number of characters that Excel can write into a single cell is 32,767. There is no trick to writing that many and none to write more.
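
A minimal sketch of guarding against that limit before writing to a cell. The helper name FitToCell is hypothetical and does not come from this thread; it simply keeps the first 32,767 characters.

```vba
Option Explicit

' Maximum number of characters Excel stores in a single cell.
Private Const MaxCellChars As Long = 32767

' Hypothetical helper: returns the text unchanged if it fits in a cell,
' otherwise only its first 32,767 characters.
Public Function FitToCell(ByVal content As String) As String
    If Len(content) > MaxCellChars Then
        FitToCell = Left$(content, MaxCellChars)
    Else
        FitToCell = content
    End If
End Function
```

Anything past the limit is dropped silently here; a caller that needs to know about truncation could compare Len(content) with Len(FitToCell(content)).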

 

Author Comment

by:cyber-33
ID: 39654191
The files are HTML, but I want to treat them as text. I can deal with the size limitation by importing a subset of each file that appears between certain HTML tags. For example, import the section of the file between the <claims> and </claims> tags. This part will be significantly smaller than the entire file.

Thank you for your help!
 

Author Comment

by:cyber-33
ID: 39654193
Also, as the subsets are being imported, all the strange characters can be replaced with spaces.
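
One way this replacement could look, as a sketch only. It assumes that "strange characters" means control characters (codes 0 to 31, which include tabs, carriage returns and line feeds, plus code 127); the thread does not define the set, and the function name ReplaceOddChars is made up for illustration.

```vba
Option Explicit

' Hypothetical helper: replaces control characters with spaces.
' Assumption: "strange" means codes 0-31 and 127; everything else is kept.
Public Function ReplaceOddChars(ByVal content As String) As String
    Dim i As Long
    Dim code As Long
    Dim result As String

    result = content
    For i = 1 To Len(result)
        code = AscW(Mid$(result, i, 1))
        ' AscW returns negative numbers for code points above 32767;
        ' those characters are left untouched.
        If (code >= 0 And code < 32) Or code = 127 Then
            Mid$(result, i, 1) = " "    ' overwrite the character in place
        End If
    Next i

    ReplaceOddChars = result
End Function
```

Because each control character is swapped for exactly one space, the length of the text does not change.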
 
LVL 14

Expert Comment

by:Faustulus
ID: 39654724
Can we say that you want only the part of the file between <claims> tags?
And that you will be happy to truncate any part of that part that might exceed the maximum acceptable length?
Can we say that you want only those exceptional characters replaced which Excel might refuse to accept?
I would really appreciate a sample of such a file - duly sanitised for public view. Your links lead to all kinds of advertising. Perhaps you can upload them in txt format.
 

Author Comment

by:cyber-33
ID: 39655375
My comments are below:
Can we say that you want only the part of the file between <claims> tags?
[YES]

And that you will be happy to truncate any part of that part that might exceed the maximum acceptable length?
[YES]

Can we say that you want only those exceptional characters replaced which Excel might refuse to accept?
[YES]

I would really appreciate a sample of such a file - duly sanitised for public view. Your links lead to all kinds of advertising. Perhaps you can upload them in txt format.
[A duly sanitized file would be different from the sample I would like the import process to work with. The 2 files attached with everything in them represent a good sample of the data I will be working with]

Thank you for your help.
 
LVL 14

Expert Comment

by:Faustulus
ID: 39655600
I haven't been able to get at your two files.
 

Author Comment

by:cyber-33
ID: 39657933
They are just text files in html format...
 
LVL 14

Expert Comment

by:Faustulus
ID: 39660245
When I click on them they open a Web site. Perhaps they execute. If they are text files change their extension to txt.
 

Author Comment

by:cyber-33
ID: 39666196
 
LVL 14

Expert Comment

by:Faustulus
ID: 39668152
Thank you.
I have now received your two files. Unfortunately, I will be travelling these next four days, leaving even my laptop behind, and expect to return to a backlog of work. More likely than not it will be a week before I can get back to you.
Faustulus
 
LVL 14

Expert Comment

by:Faustulus
ID: 39668167
There seems to be a little problem with the claims tags. Neither file has <claims>, </claims> tags. The nearest I can find is <!--      Claims  --> and >Claims:<.
It seems that quite substantial parts of the files are not between these tags. Do you wish to revise the instruction?
 

Author Comment

by:cyber-33
ID: 39669016
The instructions are still valid. I used "claims" as a sample tag. The idea is that I can use some delimiters within the text to select the subsets within the files. The tags that you identified are perfect for my needs.

Thank you for your help!
 
LVL 14

Accepted Solution

by:
Faustulus earned 500 total points
ID: 39676675
Thank you for your patience. The attached workbook contains the solution you asked for.
Please set the Const FilePath to point at the folder where your htm files are. The program will write to the worksheet. You can change the sheet's name and assign different columns. Note that the macro formats the two output columns.

To run the program call the procedure 'ExtractFromFile'.
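
The attached workbook is not reproduced in this thread, so the following is only a rough sketch of what a driver procedure of that name might look like, not the actual code. The folder in FilePath is a placeholder, the helper ReadAllText is hypothetical, and the sketch assumes single-byte (ANSI) text files and writes to the active sheet.

```vba
Option Explicit

' Placeholder folder (assumption); it must end with a path separator.
Private Const FilePath As String = "C:\Data\HtmlFiles\"
Private Const MaxCellChars As Long = 32767

Public Sub ExtractFromFile()
    Dim ws As Worksheet
    Dim fileName As String
    Dim rowNum As Long

    Set ws = ActiveSheet
    ws.Range("A1:B1").Value = Array("FileName", "FileContents")
    rowNum = 2

    ' Dir returns one matching file name per call and "" when none are left.
    fileName = Dir(FilePath & "*.htm*")
    Do While Len(fileName) > 0
        ws.Cells(rowNum, 1).Value = fileName
        ' The workbook extracts an excerpt at this point (GetExtract);
        ' this sketch only truncates the text to the cell limit.
        ws.Cells(rowNum, 2).Value = Left$(ReadAllText(FilePath & fileName), MaxCellChars)
        rowNum = rowNum + 1
        fileName = Dir()
    Loop
End Sub

' Hypothetical helper: reads a whole file into a string.
Private Function ReadAllText(ByVal fullName As String) As String
    Dim fileNum As Integer
    Dim byteCount As Long
    Dim buffer As String

    fileNum = FreeFile
    Open fullName For Binary Access Read As #fileNum
    byteCount = LOF(fileNum)
    If byteCount > 0 Then
        buffer = Space$(byteCount)      ' size the buffer to the file length
        Get #fileNum, , buffer          ' fill it with the file contents
    End If
    Close #fileNum

    ReadAllText = buffer
End Function
```

The result is one row per file, with the file name in column A and the text in column B. Text that begins with an equals sign would be read by Excel as a formula; the sketch does not guard against that.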
The program will look for a string that marks the beginning of the excerpt and another that marks its end. You can experiment with different strings, which are set in the function Private Function GetExtract(TextStream As String) As String:

    Const TxtStart As String = "<!--    Claims  -->"
    Const TxtEnd As String = ">Claims:</div>"

The above two strings follow each other very closely in your files. Therefore the extracted text is quite short. It would be possible to work on the extract in multiple ways, for example, eliminate all <> brackets. For the moment the code only removes leading non-characters (like carriage returns).
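
For readers without the workbook, here is a hedged sketch of how a function with that signature could find the excerpt using InStr and Mid$. It is a reconstruction, not the code from the attachment. The two constants are copied from above; the treatment of missing markers and the rule that leading characters with a code of 32 or below are stripped are assumptions.

```vba
Option Explicit

Private Function GetExtract(TextStream As String) As String
    Const TxtStart As String = "<!--    Claims  -->"
    Const TxtEnd As String = ">Claims:</div>"

    Dim startPos As Long
    Dim endPos As Long
    Dim extract As String

    ' Find the start marker; return "" if it is missing (assumption).
    startPos = InStr(1, TextStream, TxtStart, vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(TxtStart)         ' first character after the marker

    ' Find the end marker; if it is missing, take the rest of the text (assumption).
    endPos = InStr(startPos, TextStream, TxtEnd, vbTextCompare)
    If endPos = 0 Then endPos = Len(TextStream) + 1

    extract = Mid$(TextStream, startPos, endPos - startPos)

    ' Drop leading characters with a code of 32 or below (CR, LF, tab, space).
    Do While Len(extract) > 0
        If AscW(extract) > 32 Or AscW(extract) < 0 Then Exit Do
        extract = Mid$(extract, 2)
    Loop

    ' Keep the result within Excel's per-cell limit of 32,767 characters.
    GetExtract = Left$(extract, 32767)
End Function
```

The markers must match the file text exactly, including the number of spaces inside the comment, so it is worth checking them against the raw file rather than against what a browser displays.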
EXX-131126-Extract-From-HTML-Fil.xlsm
 

Author Comment

by:cyber-33
ID: 39686180
Thank you! I will test this solution on Monday and assign points then. Looking forward to it!
 

Author Comment

by:cyber-33
ID: 39699706
Verified - the code is clean, easy to read, follow and modify. Thank you!
 

Author Closing Comment

by:cyber-33
ID: 39699707
Elegant solution. Excellent coding style. Knowledgeable expert.


Question has a verified solution.
