Solved

text info gathering

Posted on 2014-04-03
12
154 Views
Last Modified: 2014-04-03
I am looking for an excel macro that will accept a folder as a parameter (it can be hard-coded within the script), it will perform the following:
-------------------------------
1. Recursively open each subfolder, including the current folder
2. In each opened folder, open each *.txt file
3. For each *.txt file do:
            a. List the name of the txt file (without the file extension) in column A
            b. list the first 200 characters from the txt file in column B
-----------------------------
In summary, the script above will generate a spreadsheet with as many records as there are text files in the directory tree, each record will identify a file name and first 200 characters of each file.

Thank you, experts!
0
Comment
Question by:cyber-33
  • 6
  • 4
  • 2
12 Comments
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976071
Here's a Powershell solution.
The script will create a csv file with 2 columns that you can open in Excel.
#get file names and first 200 characters
#script created for E-E, Q_28404384 
#it requires Powershell v3 (uses PSCustomObject)

$path = "X:\path\to\folder"
$csvFile = "X:\path\to\files.csv"

#test if the csv file exists. If it does, empty it, if it doesn't, create it
if(Test-Path $csvFile) { Clear-Content $csvFile }
else {New-Item $csvFile -ItemType file}

#get all text files
gci $path -recurse -filter "*.txt" | %{
    
    #get the first 200 characters from file. Ugly, but working. Encoding dependant
    $content = [char[]](Get-Content $_.FullName -Encoding byte -TotalCount 200) -join ''
    
    #delete the .txt part
    $name = $_.FullName.replace(".txt", "")

    #export the name and content to the csv file
    [PSCustomObject][ordered]@{Name=$name; Content=$content} | Export-Csv $csvFile -Append -NoTypeInformation
} 

Open in new window

Replace "X:\path\to\folder" and "X:\path\to\files.csv" with your data.

HTH,
Dan
0
 

Author Comment

by:cyber-33
ID: 39976206
Unfortunately, I don't have access to PowerShell. Any way to rewrite this to be used as a VB Excel Macro?
0
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976218
Yes, but not my field. You'll have to wait to the VBA experts.

What OS are you using? PS comes bundled with Windows 7 and onward.
Just click Start, type run, type powershell in the run box. You might be amazed :)
0
 

Author Comment

by:cyber-33
ID: 39976675
The problem is PowerShell scripts are disabled on my system as a matter of security.
0
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976687
If you can open a powershell console, you can just paste the code (all at once, not line by line) and it will run, regardless of your ExecutionPolicy.
0
 

Author Comment

by:cyber-33
ID: 39976711
Got an error:

PS C:\Users\admin> else {New-Item $csvFile -ItemType file}
The term 'else' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelli
ng of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:5
+ else <<<<  {New-Item $csvFile -ItemType file}
    + CategoryInfo          : ObjectNotFound: (else:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 21

Accepted Solution

by:
Ejgil Hedegaard earned 500 total points
ID: 39976719
Try this VBA function, in the file

Option Explicit
Dim strTopDir As String, strDirNow As String
Dim strDirName As String, strFileName As String
Dim lngDirRow As Long, lngRowNow As Long
Dim lngRow As Long, lngMaxRow As Long
Dim strText As String, strInputText As String
Dim wsDirList As Worksheet, wsFileList As Worksheet


Sub TxtFileList()

    Set wsDirList = Worksheets("Dirlist")
    Set wsFileList = Worksheets("Filelist")
    
    strTopDir = SelectFolder("Select Folder")
    If strTopDir = vbNullString Then
        MsgBox "No Folder Selected, Program Stop"
        End
    End If
    strTopDir = strTopDir + "\"
    
    wsDirList.Cells.ClearContents
    wsDirList.Cells(1, 1) = "Level"
    wsDirList.Cells(1, 2) = "Folder"
    
    lngRowNow = 2
    strDirName = Dir(strTopDir, vbDirectory)
    Do While strDirName <> ""
        If strDirName <> "." And strDirName <> ".." Then
            If (GetAttr(strTopDir & strDirName) And vbDirectory) = vbDirectory Then
                wsDirList.Cells(lngRowNow, 1) = 1
                wsDirList.Cells(lngRowNow, 2) = strDirName
                lngRowNow = lngRowNow + 1
            End If
        End If
        strDirName = Dir
    Loop
    
    lngRow = 2
    Do
        strDirNow = strTopDir + wsDirList.Cells(lngRow, 2) + "\"
        strDirName = Dir(strDirNow, vbDirectory)
        Do While strDirName <> ""
            If strDirName <> "." And strDirName <> ".." Then
                If (GetAttr(strDirNow & strDirName) And vbDirectory) = vbDirectory Then
                    wsDirList.Cells(lngRowNow, 1) = wsDirList.Cells(lngRow, 1) + 1
                    wsDirList.Cells(lngRowNow, 2) = wsDirList.Cells(lngRow, 2) + "\" + strDirName
                    lngRowNow = lngRowNow + 1
                End If
            End If
            strDirName = Dir
        Loop
        lngRow = lngRow + 1
    Loop While lngRow < lngRowNow
    lngMaxRow = lngRow - 1
    
    wsFileList.Select
    wsFileList.Cells.ClearContents
    lngRowNow = 2
    wsFileList.Cells(1, 1) = "Filname"
    wsFileList.Cells(1, 2) = "First 200 Characters"
    
        strDirNow = strTopDir
        strDirName = Dir(strDirNow + "*.txt")
        Do While strDirName <> ""
            If strDirName <> "." And strDirName <> ".." Then
                wsFileList.Cells(lngRowNow, 1) = Left(strDirName, Len(strDirName) - 4)
                strText = ""
                Open strDirNow + "\" + strDirName For Input As #1
                    Do
                        Line Input #1, strInputText
                        strText = strText + strInputText
                        If Len(strText) > 200 Then
                            wsFileList.Cells(lngRowNow, 2) = Left(strText, 200)
                        End If
                    Loop Until EOF(1) Or Len(strText) > 200
                Close #1
                If Len(strText) <= 200 Then
                    wsFileList.Cells(lngRowNow, 2) = strText
                End If
                lngRowNow = lngRowNow + 1
            End If
            strDirName = Dir
        Loop
    
    
    For lngDirRow = 2 To lngMaxRow
        strDirNow = strTopDir + wsDirList.Cells(lngDirRow, 2) + "\"
        strDirName = Dir(strDirNow + "*.txt")
        Do While strDirName <> ""
            If strDirName <> "." And strDirName <> ".." Then
                wsFileList.Cells(lngRowNow, 1) = Left(strDirName, Len(strDirName) - 4)
                strText = ""
                Open strDirNow + "\" + strDirName For Input As #1
                    Do
                        Line Input #1, strInputText
                        strText = strText + strInputText
                        If Len(strText) > 200 Then
                            wsFileList.Cells(lngRowNow, 2) = Left(strText, 200)
                        End If
                    Loop Until EOF(1) Or Len(strText) > 200
                Close #1
                If Len(strText) <= 200 Then
                    wsFileList.Cells(lngRowNow, 2) = strText
                End If
                lngRowNow = lngRowNow + 1
            End If
            strDirName = Dir
        Loop
    Next lngDirRow
End Sub

Function SelectFolder(Title As String, _
        Optional InitialFolder As String = vbNullString, _
        Optional InitialView As Office.MsoFileDialogView = _
            msoFileDialogViewList) As String
    Dim V As Variant
    Dim InitFolder As String
    With Application.FileDialog(msoFileDialogFolderPicker)
        .Title = Title
        .InitialView = InitialView
        If Len(InitialFolder) > 0 Then
            If Dir(InitialFolder, vbDirectory) <> vbNullString Then
                InitFolder = InitialFolder
                If Right(InitFolder, 1) <> "\" Then
                    InitFolder = InitFolder & "\"
                End If
                .InitialFileName = InitFolder
            End If
        End If
        .Show
        On Error Resume Next
        Err.Clear
        V = .SelectedItems(1)
        If Err.Number <> 0 Then
            V = vbNullString
        End If
    End With
    SelectFolder = CStr(V)
End Function

Open in new window

TxtFileList.xlsm
0
 
LVL 34

Expert Comment

by:Dan Craciun
ID: 39976729
You pasted line by line :)

Select all, then paste in the PS window, then press enter.
0
 

Author Comment

by:cyber-33
ID: 39976752
Dan -  I did select all and then pasted all by using the right mouse button click.
0
 

Author Comment

by:cyber-33
ID: 39976760
HGHOLT - does your solution work only with 2 levels of the tree? My tree can have many layers, hence there was a requirement for recursion....
0
 
LVL 21

Expert Comment

by:Ejgil Hedegaard
ID: 39976795
It will include all subfolders.
The sheet DirList list all subfolders, but not the topfolder.
Press the button and select the topfolder when asked.
Foldername (path) is not in the filelist, as this was your specific requirement.
0
 

Author Comment

by:cyber-33
ID: 39976888
Works like a charm! Thank you!
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
excel pivot question 4 40
Formula or Macro to determine variance 17 75
Fixing a embedded format 7 29
NEED TO UPDATE DATA IN EXCEL 18 29
Sparklines have been introduced with Excel 2010 and are a useful tool for creating small in-cell charts, used for example in dashboards. Excel 2010 offers three different types of Sparklines: Line, Column and Win/Loss. What it does not offer is a…
Some code to ensure data integrity when using macros within Excel. Also included code that helps secure your data within an Excel workbook.
This Micro Tutorial will demonstrate the scrolling table in Microsoft Excel using the INDEX function.
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.

911 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now