Solved

text info gathering

Posted on 2014-04-03
12
153 Views
Last Modified: 2014-04-03
I am looking for an excel macro that will accept a folder as a parameter (it can be hard-coded within the script), it will perform the following:
-------------------------------
1. Recursively open each subfolder, including the current folder
2. In each opened folder, open each *.txt file
3. For each *.txt file do:
            a. List the name of the txt file (without the file extension) in column A
            b. list the first 200 characters from the txt file in column B
-----------------------------
In summary, the script above will generate a spreadsheet with as many records as there are text files in the directory tree, each record will identify a file name and first 200 characters of each file.

Thank you, experts!
0
Comment
Question by:cyber-33
  • 6
  • 4
  • 2
12 Comments
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976071
Here's a Powershell solution.
The script will create a csv file with 2 columns that you can open in Excel.
#get file names and first 200 characters
#script created for E-E, Q_28404384 
#it requires Powershell v3 (uses PSCustomObject)

$path = "X:\path\to\folder"
$csvFile = "X:\path\to\files.csv"

#test if the csv file exists. If it does, empty it, if it doesn't, create it
if(Test-Path $csvFile) { Clear-Content $csvFile }
else {New-Item $csvFile -ItemType file}

#get all text files
gci $path -recurse -filter "*.txt" | %{
    
    #get the first 200 characters from file. Ugly, but working. Encoding dependant
    $content = [char[]](Get-Content $_.FullName -Encoding byte -TotalCount 200) -join ''
    
    #delete the .txt part
    $name = $_.FullName.replace(".txt", "")

    #export the name and content to the csv file
    [PSCustomObject][ordered]@{Name=$name; Content=$content} | Export-Csv $csvFile -Append -NoTypeInformation
} 

Open in new window

Replace "X:\path\to\folder" and "X:\path\to\files.csv" with your data.

HTH,
Dan
0
 

Author Comment

by:cyber-33
ID: 39976206
Unfortunately, I don't have access to PowerShell. Any way to rewrite this to be used as a VB Excel Macro?
0
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976218
Yes, but not my field. You'll have to wait to the VBA experts.

What OS are you using? PS comes bundled with Windows 7 and onward.
Just click Start, type run, type powershell in the run box. You might be amazed :)
0
 

Author Comment

by:cyber-33
ID: 39976675
The problem is PowerShell scripts are disabled on my system as a matter of security.
0
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976687
If you can open a powershell console, you can just paste the code (all at once, not line by line) and it will run, regardless of your ExecutionPolicy.
0
 

Author Comment

by:cyber-33
ID: 39976711
Got an error:

PS C:\Users\admin> else {New-Item $csvFile -ItemType file}
The term 'else' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelli
ng of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:5
+ else <<<<  {New-Item $csvFile -ItemType file}
    + CategoryInfo          : ObjectNotFound: (else:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
0
How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

 
LVL 21

Accepted Solution

by:
Ejgil Hedegaard earned 500 total points
ID: 39976719
Try this VBA function, in the file

Option Explicit
Dim strTopDir As String, strDirNow As String
Dim strDirName As String, strFileName As String
Dim lngDirRow As Long, lngRowNow As Long
Dim lngRow As Long, lngMaxRow As Long
Dim strText As String, strInputText As String
Dim wsDirList As Worksheet, wsFileList As Worksheet


Sub TxtFileList()

    Set wsDirList = Worksheets("Dirlist")
    Set wsFileList = Worksheets("Filelist")
    
    strTopDir = SelectFolder("Select Folder")
    If strTopDir = vbNullString Then
        MsgBox "No Folder Selected, Program Stop"
        End
    End If
    strTopDir = strTopDir + "\"
    
    wsDirList.Cells.ClearContents
    wsDirList.Cells(1, 1) = "Level"
    wsDirList.Cells(1, 2) = "Folder"
    
    lngRowNow = 2
    strDirName = Dir(strTopDir, vbDirectory)
    Do While strDirName <> ""
        If strDirName <> "." And strDirName <> ".." Then
            If (GetAttr(strTopDir & strDirName) And vbDirectory) = vbDirectory Then
                wsDirList.Cells(lngRowNow, 1) = 1
                wsDirList.Cells(lngRowNow, 2) = strDirName
                lngRowNow = lngRowNow + 1
            End If
        End If
        strDirName = Dir
    Loop
    
    lngRow = 2
    Do
        strDirNow = strTopDir + wsDirList.Cells(lngRow, 2) + "\"
        strDirName = Dir(strDirNow, vbDirectory)
        Do While strDirName <> ""
            If strDirName <> "." And strDirName <> ".." Then
                If (GetAttr(strDirNow & strDirName) And vbDirectory) = vbDirectory Then
                    wsDirList.Cells(lngRowNow, 1) = wsDirList.Cells(lngRow, 1) + 1
                    wsDirList.Cells(lngRowNow, 2) = wsDirList.Cells(lngRow, 2) + "\" + strDirName
                    lngRowNow = lngRowNow + 1
                End If
            End If
            strDirName = Dir
        Loop
        lngRow = lngRow + 1
    Loop While lngRow < lngRowNow
    lngMaxRow = lngRow - 1
    
    wsFileList.Select
    wsFileList.Cells.ClearContents
    lngRowNow = 2
    wsFileList.Cells(1, 1) = "Filname"
    wsFileList.Cells(1, 2) = "First 200 Characters"
    
        strDirNow = strTopDir
        strDirName = Dir(strDirNow + "*.txt")
        Do While strDirName <> ""
            If strDirName <> "." And strDirName <> ".." Then
                wsFileList.Cells(lngRowNow, 1) = Left(strDirName, Len(strDirName) - 4)
                strText = ""
                Open strDirNow + "\" + strDirName For Input As #1
                    Do
                        Line Input #1, strInputText
                        strText = strText + strInputText
                        If Len(strText) > 200 Then
                            wsFileList.Cells(lngRowNow, 2) = Left(strText, 200)
                        End If
                    Loop Until EOF(1) Or Len(strText) > 200
                Close #1
                If Len(strText) <= 200 Then
                    wsFileList.Cells(lngRowNow, 2) = strText
                End If
                lngRowNow = lngRowNow + 1
            End If
            strDirName = Dir
        Loop
    
    
    For lngDirRow = 2 To lngMaxRow
        strDirNow = strTopDir + wsDirList.Cells(lngDirRow, 2) + "\"
        strDirName = Dir(strDirNow + "*.txt")
        Do While strDirName <> ""
            If strDirName <> "." And strDirName <> ".." Then
                wsFileList.Cells(lngRowNow, 1) = Left(strDirName, Len(strDirName) - 4)
                strText = ""
                Open strDirNow + "\" + strDirName For Input As #1
                    Do
                        Line Input #1, strInputText
                        strText = strText + strInputText
                        If Len(strText) > 200 Then
                            wsFileList.Cells(lngRowNow, 2) = Left(strText, 200)
                        End If
                    Loop Until EOF(1) Or Len(strText) > 200
                Close #1
                If Len(strText) <= 200 Then
                    wsFileList.Cells(lngRowNow, 2) = strText
                End If
                lngRowNow = lngRowNow + 1
            End If
            strDirName = Dir
        Loop
    Next lngDirRow
End Sub

Function SelectFolder(Title As String, _
        Optional InitialFolder As String = vbNullString, _
        Optional InitialView As Office.MsoFileDialogView = _
            msoFileDialogViewList) As String
    Dim V As Variant
    Dim InitFolder As String
    With Application.FileDialog(msoFileDialogFolderPicker)
        .Title = Title
        .InitialView = InitialView
        If Len(InitialFolder) > 0 Then
            If Dir(InitialFolder, vbDirectory) <> vbNullString Then
                InitFolder = InitialFolder
                If Right(InitFolder, 1) <> "\" Then
                    InitFolder = InitFolder & "\"
                End If
                .InitialFileName = InitFolder
            End If
        End If
        .Show
        On Error Resume Next
        Err.Clear
        V = .SelectedItems(1)
        If Err.Number <> 0 Then
            V = vbNullString
        End If
    End With
    SelectFolder = CStr(V)
End Function

Open in new window

TxtFileList.xlsm
0
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976729
You pasted line by line :)

Select all, then paste in the PS window, then press enter.
0
 

Author Comment

by:cyber-33
ID: 39976752
Dan -  I did select all and then pasted all by using the right mouse button click.
0
 

Author Comment

by:cyber-33
ID: 39976760
HGHOLT - does your solution work only with 2 levels of the tree? My tree can have many layers, hence there was a requirement for recursion....
0
 
LVL 21

Expert Comment

by:Ejgil Hedegaard
ID: 39976795
It will include all subfolders.
The sheet DirList list all subfolders, but not the topfolder.
Press the button and select the topfolder when asked.
Foldername (path) is not in the filelist, as this was your specific requirement.
0
 

Author Comment

by:cyber-33
ID: 39976888
Works like a charm! Thank you!
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Drop Down List with Unique/Distinct Values (Part II - ComboBox or ListBox and Data Validation List Bonus!) David Miller (dlmille) Intro This article focuses on delivering unique, sorted lists to list objects (e.g., ComboBox, ListBox) and Dat…
Introduction While answering a recent question (http:/Q_27311462.html), I created an alternative function to the Excel Concatenate() function that you might find useful.  I tested several solutions and share the results in this article as well as t…
The view will learn how to download and install SIMTOOLS and FORMLIST into Excel, how to use SIMTOOLS to generate a Monte Carlo simulation of 30 sales calls, and how to calculate the conditional probability based on the results of the Monte Carlo …
This Micro Tutorial will demonstrate how to use longer labels with horizontal bar charts instead of the vertical column chart.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now