Solved

Powershell Script to separate text strings in a text file & transfer to Excel

Posted on 2015-02-21
9
222 Views
Last Modified: 2015-02-25
Dear Experts,
I need to automate a separation and transfer process from text files to an excel sheet. I am familiar with running automation scripts with Powershell ISE which I think would be the best solution because I use powershell frequently and I would use this script frequently. I'm not a programmer. If you respond to this question then  do not expect that I could learn programming because I will not be able to do this. The best I could do is simply use the script and add paths for it to work.

I have a lot of text files that contains text strings that are ordered as shown in the zip file example which you can download from the dropbox link at the foot of this message.  You will see that the format takes this structure where the <CRLF> is a line break.

TextStringPathForFile1Before.lbi<CRLF>
NewTextStringForFile1After.lbi<CRLF>
<CRLF>
TextStringPathForFile2Before.lbi<CRLF>
NewTextStringForFile2Afte.lbir<CRLF>
<CRLF>
TextStringPathForFile3Before.lbi<CRLF>
NewTextStringForFile3After.lbi<CRLF>
<CRLF>

I have many text files like this with records all ordered in this way.   These files contain  the original (Before) text string and then on the next line a new (After) text string, then a space separating the text string pair then another text string pair all the way down the text document.

What I want to do is automate the separation and transfer of the text strings in these text files  to an excel sheet such that the "before" path is in column 1 and the "after" path, for the corresponding text string pair, is in column 2 of the excel sheet for all entries in the text file.

The "after" text strings may be completely different from the "before" text strings or they may be the same. In addition the text strings may have different  text string before and after and  possibly different file extensions for the text strings. Therefore the matching criteria for separation and transfer would be the <CRLR>. The best way to view the <CRLR> characters we have inserted in the attached text file is to open the text file in NotePad++ then go to View/Show Symbol and tick the option called "Show End Of Line" . This will display all the end of line characters.  You will see that all strings follow the same ordered pattern.

The text files all have exactly the same order. i.e. a text "before"  text string is terminated by a <CRLR> the "after" text string is terminated by a <CRLR> then a single <CRLR> on the next line to separate the text string pair from the next text string pair and then the next text string pair is repeated.   This continues all the way down to the end  of the text file and is terminated by a single <CRLR> after the last text string.

As an example I have provided a dropbox download link for a zip file containing an example of both the text file and the excel file output  that would be generated by the script.

Download link
https://www.dropbox.com/s/9zmfgnyny9yg9cp/TextStringSepartor.zip?dl=0

I have only included 3 string pairs (that is 6 strings) only, however our text files could have several thousand pairs for separation.

What we would like is for the script to be able to process all text files in a folder and subfolders. We would specify the path to the path to be processed in the powershell script and it would then process all folders and subfolders. This would save us time having to run individual processes for each text file. It would be good if the output excel sheet could be placed in the same folder as the text file it was created from and name by the same name as the original text file.

I have windows 7 computer with powershell 3 ISE installed.  I look forward to seeing what you can come up with to solve this issue.

Download link for sample files:
https://www.dropbox.com/s/9zmfgnyny9yg9cp/TextStringSepartor.zip?dl=0


Thank you for your interest and taking the time to read this.

best regards

John
0
Comment
Question by:boltweb
  • 5
  • 4
9 Comments
 
LVL 17

Expert Comment

by:Learnctx
ID: 40623600
I've hacked something together. The way it works is to assume based off your example file that the file will always have a before and after combo 1 line after each other and nothing else (no comments, headers, etc). The script imports the contents of the file stripping out empty lines and then iterates through the file checking whether the line is odd or even.

Array item 0 = even (as far as mod is concerned) .. becomes a before
Array item 1 = odd .. becomes an after
Array item 2 = even .. becomes an before
Array item 3 = odd .. becomes an after
And so on.

Results are exported to a CSV which can be opened in Excel.

Param (
	[Parameter(Mandatory=$true,Position=0)]
	[string]$Path = $null,
	[Parameter(Mandatory=$false,Position=1)]
	[string]$Outfile = "results.csv",
)

$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

$ArrResults = @()

0..$Contents.GetUpperBound(0) | % {
	if (($_ % 2) -eq 0)
	{
		$objValues = "" | Select Before, After
		$ObjValues.Before=$a[$_]
		$objValues.After=$a[$_+1]
		$ArrResults += $objvalues
	}
}
$ArrResults | Export-Csv -NoType "results.csv"

Open in new window


Usage is simple. Save as a .ps1. Run as any other script. Accepts a named or positional parameter for the file path to import. There is an outfile parameter which defaults to results.csv.

script.ps1 -Path c:\sourceTextFile.txt -Outfile c:\results.csv
0
 
LVL 1

Author Comment

by:boltweb
ID: 40624060
Hello Learnctx,
thank you for your reply. Yes you are correct there are no comments, no headers and no footers in the text files to be processed. There are  just the 2 string pairs (before and after) and a single carriage return between the pairs as shown in the sample.


I hope you can give me some assistance on running your script. You have used some technical terms which I do not properly undestand since I have no experience or knowledge of programming.  Normally when I run a powershell script that has been written form me I run it from Powershell ISE.  When I open the script from within the ISE there is normally a line for me to enter  the path to the parent directory that contains the folders and subfolders to be processed. Then an output path which would mirror the folder structure of the input path and output the processed files. I normally just add those 2 paths and then click the RUN button.    Can I run your script in the same way? If not where do I input the paths in your script?  Thank you for writing this I think you've come up a  ingenious solution by using odd and even values for the logic of the script.

Best regards

John

I don't know how to run a powershell script from a command line.
0
 
LVL 17

Expert Comment

by:Learnctx
ID: 40624109
When you execute the script it will prompt you to enter the path to a file. There was an error above I fixed and I've included an alternative where you can edit the file paths in ISE.

Param (
	[Parameter(Mandatory=$true,Position=0)]
	[string]$Path = $null,
	[Parameter(Mandatory=$false,Position=1)]
	[string]$Outfile = "results.csv"
)

$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

$ArrResults = @()

0..$Contents.GetUpperBound(0) | % {
	if (($_ % 2) -eq 0)
	{
		$objValues = "" | Select Before, After
		$ObjValues.Before=$a[$_]
		$objValues.After=$a[$_+1]
		$ArrResults += $objvalues
	}
}
$ArrResults | Export-Csv -NoType -Path $Outfile

Open in new window


That said if you would prefer to hard code the path into the script as well you can do that.

$Path = "C:\Pathtofile\SourceFile.txt"
$Outfile = "C:\Pathtoresults\results.csv"

$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

$ArrResults = @()

0..$Contents.GetUpperBound(0) | % {
	if (($_ % 2) -eq 0)
	{
		$objValues = "" | Select Before, After
		$ObjValues.Before=$a[$_]
		$objValues.After=$a[$_+1]
		$ArrResults += $objvalues
	}
}
$ArrResults | Export-Csv -NoType -Path $Outfile

Open in new window

0
Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

 
LVL 1

Author Comment

by:boltweb
ID: 40624275
Hello, I tried running the script. It ran however I received these errors which seem to be for every record. When I opened the CSV file the is only 2 words  "before" and "after"   Any ideas what is happening?


Cannot index into a null array.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\DicFileToCSV.ps1:13 char:3
+         $ObjValues.Before=$a[$_]
+         ~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : NullArray
 
Cannot index into a null array.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\DicFileToCSV.ps1:14 char:3
+         $objValues.After=$a[$_+1]
+         ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : NullArray
0
 
LVL 17

Accepted Solution

by:
Learnctx earned 500 total points
ID: 40624763
I do indeed. I forgot to replace $a with $contents. Modified below.

## Input file
$Path = "C:\Pathtofile\SourceFile.txt"
## Output CSV
$Outfile = "C:\Pathtoresults\results.csv"

## Get the contents of the input file excluding blank lines and lines which do not contain an alpha numeric character
$Contents = Get-Content -Path $Path | ? {$_ -and $_ -match "\w"}

## Create an empty array
$ArrResults = @()

## From array item 0 until the last array item
0..$Contents.GetUpperBound(0) | % {
	## Check if the number is odd or even
	if (($_ % 2) -eq 0)
	{
		## If even (including 0) then get the before and after values
		$objValues = "" | Select Before, After
		$ObjValues.Before=$Contents[$_]
		$objValues.After=$Contents[$_+1]
		## Add the result to the results array
		$ArrResults += $objvalues
	}
}
## Export the results to CSV.
$ArrResults | Export-Csv -NoType -Path $Outfile

Open in new window

0
 
LVL 1

Author Comment

by:boltweb
ID: 40625512
Hi Learnctx,   I have just run the new version and it worked perfectly.    This is a fantastic little script that will save me a lot of time.  Thank you for writing this.

JohnB
0
 
LVL 1

Author Closing Comment

by:boltweb
ID: 40625520
Solution works perfectly. Thank you.    

After seeing the excel sheet I realised that I could use another  powershell script that could do the reverse i.e. create the text file from an excel sheet. The reason is that it is much easier to edit and QA check the before and after strings in Excel. Then once the QA checks / mods and updates are done I could use the second powershell script to convert  the Excel file back to the text file with the same format as the text file sample I sent to you (i.e. without the the Before and After headings) and with the CRLR exactly as the text file sample.     If you think you could  do this let me know and  I could send you the link to the new EE  question. It will also be for 500 points.  

All the best

John
0
 
LVL 17

Expert Comment

by:Learnctx
ID: 40630165
It should be pretty straight forward to export back to the txt file.

$Path = "results.csv"
$Outfile = "file.txt"
$Import = Import-CSV -Path $Path
Foreach ($Item in $Import)
{
   $Item.Before | Out-File -Append -Path $Outfile
   $Item.After | Out-File -Append -Path $Outfile
   "" | Out-File -Append -Path $Outfile
}

Open in new window

0
 
LVL 1

Author Comment

by:boltweb
ID: 40630316
Hello, I tried running it but there were a long list of error messages. Did it work for you when you tested on the files I sent?

Below is a sample of the messages.  I also need to make certain that the character encoding in the text file is UTF8 without BOM.   UTF8 with BOM won't work.

Here is the output.   Should I raise another Question at EE to fix the bug?

Out-File : A parameter cannot be found that matches parameter name 'Path'.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\CSVToDicFile.ps1:6 char:36
+    $Item.Before | Out-File -Append -Path $Outfile
+                                    ~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Out-File], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.OutFileCommand
 
Out-File : A parameter cannot be found that matches parameter name 'Path'.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\CSVToDicFile.ps1:7 char:35
+    $Item.After | Out-File -Append -Path $Outfile
+                                   ~~~~~
    + CategoryInfo          : InvalidArgument: (:) [Out-File], ParameterBindingException
    + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.OutFileCommand
 
Out-File : A parameter cannot be found that matches parameter name 'Path'.
At C:\Work\SCM\Dev\Scripts\PowershellScripts\CSVToDicFile.ps1:8 char:26
+    "" | Out-File -Append -Path $Outfile
0

Featured Post

Does Powershell have you tied up in knots?

Managing Active Directory does not always have to be complicated.  If you are spending more time trying instead of doing, then it's time to look at something else. For nearly 20 years, AD admins around the world have used one tool for day-to-day AD management: Hyena. Discover why

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Microsoft Windows Server Update Service (WSUS) is free for everyone, but it lacks of some desirable features like send an e-mail to the administrator with the status of all computers on the WSUS server. This article is based on my PowerShell script …
This article explains how to prepare an HTML email signature template file containing dynamic placeholders for users' Azure AD data. Furthermore, it explains how to use this file to remotely set up a department-wide email signature policy in Office …
A short tutorial showing how to set up an email signature in Outlook on the Web (previously known as OWA). For free email signatures designs, visit https://www.mail-signatures.com/articles/signature-templates/?sts=6651 If you want to manage em…
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…

831 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question