Solved

Powershell Script to separate text strings in a text file & transfer to Excel

Posted on 2015-02-21
9
210 Views
Last Modified: 2015-02-25
Dear Experts,
I need to automate a separation and transfer process from text files to an excel sheet. I am familiar with running automation scripts with Powershell ISE which I think would be the best solution because I use powershell frequently and I would use this script frequently. I'm not a programmer. If you respond to this question then  do not expect that I could learn programming because I will not be able to do this. The best I could do is simply use the script and add paths for it to work.

I have a lot of text files that contains text strings that are ordered as shown in the zip file example which you can download from the dropbox link at the foot of this message.  You will see that the format takes this structure where the <CRLF> is a line break.

TextStringPathForFile1Before.lbi<CRLF>
NewTextStringForFile1After.lbi<CRLF>
<CRLF>
TextStringPathForFile2Before.lbi<CRLF>
NewTextStringForFile2Afte.lbir<CRLF>
<CRLF>
TextStringPathForFile3Before.lbi<CRLF>
NewTextStringForFile3After.lbi<CRLF>
<CRLF>

I have many text files like this with records all ordered in this way.   These files contain  the original (Before) text string and then on the next line a new (After) text string, then a space separating the text string pair then another text string pair all the way down the text document.

What I want to do is automate the separation and transfer of the text strings in these text files  to an excel sheet such that the "before" path is in column 1 and the "after" path, for the corresponding text string pair, is in column 2 of the excel sheet for all entries in the text file.

The "after" text strings may be completely different from the "before" text strings or they may be the same. In addition the text strings may have different  text string before and after and  possibly different file extensions for the text strings. Therefore the matching criteria for separation and transfer would be the <CRLR>. The best way to view the <CRLR> characters we have inserted in the attached text file is to open the text file in NotePad++ then go to View/Show Symbol and tick the option called "Show End Of Line" . This will display all the end of line characters.  You will see that all strings follow the same ordered pattern.

The text files all have exactly the same order. i.e. a text "before"  text string is terminated by a <CRLR> the "after" text string is terminated by a <CRLR> then a single <CRLR> on the next line to separate the text string pair from the next text string pair and then the next text string pair is repeated.   This continues all the way down to the end  of the text file and is terminated by a single <CRLR> after the last text string.

As an example I have provided a dropbox download link for a zip file containing an example of both the text file and the excel file output  that would be generated by the script.

Download link
https://www.dropbox.com/s/9zmfgnyny9yg9cp/TextStringSepartor.zip?dl=0

I have only included 3 string pairs (that is 6 strings) only, however our text files could have several thousand pairs for separation.

What we would like is for the script to be able to process all text files in a folder and subfolders. We would specify the path to the path to be processed in the powershell script and it would then process all folders and subfolders. This would save us time having to run individual processes for each text file. It would be good if the output excel sheet could be placed in the same folder as the text file it was created from and name by the same name as the original text file.

I have windows 7 computer with powershell 3 ISE installed.  I look forward to seeing what you can come up with to solve this issue.

Download link for sample files:
https://www.dropbox.com/s/9zmfgnyny9yg9cp/TextStringSepartor.zip?dl=0


Thank you for your interest and taking the time to read this.

best regards

John
0
Comment
Question by:boltweb
  • 5
  • 4
9 Comments
 
LVL 16

Expert Comment

by:Learnctx
Comment Utility
I've hacked something together. The way it works is to assume based off your example file that the file will always have a before and after combo 1 line after each other and nothing else (no comments, headers, etc). The script imports the contents of the file stripping out empty lines and then iterates through the file checking whether the line is odd or even.

Array item 0 = even (as far as mod is concerned) .. becomes a before
Array item 1 = odd .. becomes an after
Array item 2 = even .. becomes an before
Array item 3 = odd .. becomes an after
And so on.

Results are exported to a CSV which can be opened in Excel.

Param (
	[Parameter(Mandatory=$true,Position=0)]
	[string]$Path = $null,
	[Parameter(Mandatory=$false,Position=1)]
	[string]$Outfile = "results.csv",
)

$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

$ArrResults = @()

0..$Contents.GetUpperBound(0) | % {
	if (($_ % 2) -eq 0)
	{
		$objValues = "" | Select Before, After
		$ObjValues.Before=$a[$_]
		$objValues.After=$a[$_+1]
		$ArrResults += $objvalues
	}
}
$ArrResults | Export-Csv -NoType "results.csv"

Open in new window


Usage is simple. Save as a .ps1. Run as any other script. Accepts a named or positional parameter for the file path to import. There is an outfile parameter which defaults to results.csv.

script.ps1 -Path c:\sourceTextFile.txt -Outfile c:\results.csv
0
 

Author Comment

by:boltweb
Comment Utility
Hello Learnctx,
thank you for your reply. Yes you are correct there are no comments, no headers and no footers in the text files to be processed. There are  just the 2 string pairs (before and after) and a single carriage return between the pairs as shown in the sample.


I hope you can give me some assistance on running your script. You have used some technical terms which I do not properly undestand since I have no experience or knowledge of programming.  Normally when I run a powershell script that has been written form me I run it from Powershell ISE.  When I open the script from within the ISE there is normally a line for me to enter  the path to the parent directory that contains the folders and subfolders to be processed. Then an output path which would mirror the folder structure of the input path and output the processed files. I normally just add those 2 paths and then click the RUN button.    Can I run your script in the same way? If not where do I input the paths in your script?  Thank you for writing this I think you've come up a  ingenious solution by using odd and even values for the logic of the script.

Best regards

John

I don't know how to run a powershell script from a command line.
0
 
LVL 16

Expert Comment

by:Learnctx
Comment Utility
When you execute the script it will prompt you to enter the path to a file. There was an error above I fixed and I've included an alternative where you can edit the file paths in ISE.

Param (
	[Parameter(Mandatory=$true,Position=0)]
	[string]$Path = $null,
	[Parameter(Mandatory=$false,Position=1)]
	[string]$Outfile = "results.csv"
)

$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

$ArrResults = @()

0..$Contents.GetUpperBound(0) | % {
	if (($_ % 2) -eq 0)
	{
		$objValues = "" | Select Before, After
		$ObjValues.Before=$a[$_]
		$objValues.After=$a[$_+1]
		$ArrResults += $objvalues
	}
}
$ArrResults | Export-Csv -NoType -Path $Outfile

Open in new window


That said if you would prefer to hard code the path into the script as well you can do that.

$Path = "C:\Pathtofile\SourceFile.txt"
$Outfile = "C:\Pathtoresults\results.csv"

$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

$ArrResults = @()

0..$Contents.GetUpperBound(0) | % {
	if (($_ % 2) -eq 0)
	{
		$objValues = "" | Select Before, After
		$ObjValues.Before=$a[$_]
		$objValues.After=$a[$_+1]
		$ArrResults += $objvalues
	}
}
$ArrResults | Export-Csv -NoType -Path $Outfile

Open in new window

0
 

Author Comment

by:boltweb
Comment Utility
Hello, I tried running the script. It ran however I received these errors which seem to be for every record. When I opened the CSV file the is only 2 words  "before" and "after"   Any ideas what is happening?


Cannot index into a null array.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\DicFileToCSV.ps1:13 char:3
+         $ObjValues.Before=$a[$_]
+         ~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : NullArray
 
Cannot index into a null array.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\DicFileToCSV.ps1:14 char:3
+         $objValues.After=$a[$_+1]
+         ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : NullArray
0
What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

 
LVL 16

Accepted Solution

by:
Learnctx earned 500 total points
Comment Utility
I do indeed. I forgot to replace $a with $contents. Modified below.

## Input file
$Path = "C:\Pathtofile\SourceFile.txt"
## Output CSV
$Outfile = "C:\Pathtoresults\results.csv"

## Get the contents of the input file excluding blank lines and lines which do not contain an alpha numeric character
$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

## Create an empty array
$ArrResults = @()

## From array item 0 until the last array item
0..$Contents.GetUpperBound(0) | % {
	## Check if the number is odd or even
	if (($_ % 2) -eq 0)
	{
		## If even (including 0) then get the before and after values
		$objValues = "" | Select Before, After
		$ObjValues.Before=$Contents[$_]
		$objValues.After=$Contents[$_+1]
		## Add the result to the results array
		$ArrResults += $objvalues
	}
}
## Export the results to CSV.
$ArrResults | Export-Csv -NoType -Path $Outfile

Open in new window

0
 

Author Comment

by:boltweb
Comment Utility
Hi Learnctx,   I have just run the new version and it worked perfectly.    This is a fantastic little script that will save me a lot of time.  Thank you for writing this.

JohnB
0
 

Author Closing Comment

by:boltweb
Comment Utility
Solution works perfectly. Thank you.    

After seeing the excel sheet I realised that I could use another  powershell script that could do the reverse i.e. create the text file from an excel sheet. The reason is that it is much easier to edit and QA check the before and after strings in Excel. Then once the QA checks / mods and updates are done I could use the second powershell script to convert  the Excel file back to the text file with the same format as the text file sample I sent to you (i.e. without the the Before and After headings) and with the CRLR exactly as the text file sample.     If you think you could  do this let me know and  I could send you the link to the new EE  question. It will also be for 500 points.  

All the best

John
0
 
LVL 16

Expert Comment

by:Learnctx
Comment Utility
It should be pretty straight forward to export back to the txt file.

$Path = "results.csv"
$Outfile = "file.txt"
$Import = Import-CSV -Path $Path
Foreach ($Item in $Import)
{
   $Item.Before | Out-File -Append -Path $Outfile
   $Item.After | Out-File -Append -Path $Outfile
   "" | Out-File -Append -Path $Outfile
}

Open in new window

0
 

Author Comment

by:boltweb
Comment Utility
Hello, I tried running it but there were a long list of error messages. Did it work for you when you tested on the files I sent?

Below is a sample of the messages.  I also need to make certain that the character encoding in the text file is UTF8 without BOM.   UTF8 with BOM won't work.

Here is the output.   Should I raise another Question at EE to fix the bug?

Out-File : A parameter cannot be found that matches parameter name 'Path'.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\CSVToDicFile.ps1:6 char:36
+    $Item.Before | Out-File -Append -Path $Outfile
+                                    ~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Out-File], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.OutFileCommand
 
Out-File : A parameter cannot be found that matches parameter name 'Path'.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\CSVToDicFile.ps1:7 char:35
+    $Item.After | Out-File -Append -Path $Outfile
+                                   ~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Out-File], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.OutFileCommand
 
Out-File : A parameter cannot be found that matches parameter name 'Path'.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\CSVToDicFile.ps1:8 char:26
+    "" | Out-File -Append -Path $Outfile
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Utilizing an array to gracefully append to a list of EmailAddresses
This script checks a path to see if a folder exists. If the folder does exist you will get output "The folder has previously been created. No action taken" If not it will create the folder. Then adds one user modify permission to the folder. It …
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…
Here's a very brief overview of the methods PRTG Network Monitor (https://www.paessler.com/prtg) offers for monitoring bandwidth, to help you decide which methods you´d like to investigate in more detail.  The methods are covered in more detail in o…

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now