Solved

Further modification of the PowerShell script

Posted on 2014-09-29
Last Modified: 2014-10-01
This continues my earlier requests for modifications to the PowerShell script.

The distinct line types in the file are as follows:

ST*850*0001~
BEG*00*SA*4610173073**20140929~
...

PO1**10*CA***UK*12345678912345~
CTP**UCP*10.20~
PID*F*08***Desc 1~
....
The three lines above (PO1, CTP, PID) can repeat any number of times.

IEA*1*000001234~
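
For readers unfamiliar with this layout: in the sample lines, values are separated by * and each line ends with ~. A minimal sketch of reading values by position, assuming those two characters never appear inside a value (the helper name Split-Segment is made up for illustration and is not part of the script below):

```powershell
# Hypothetical helper: split one line into its values.
# Assumes '*' separates values and '~' ends the line, as in the samples above.
function Split-Segment {
    param([string]$Segment)
    # Drop the trailing terminator, then split on the literal '*'.
    # -split keeps empty values, so positions stay stable.
    return ,($Segment.TrimEnd('~') -split '\*')
}

$po1Elements = Split-Segment 'PO1**10*CA***UK*12345678912345~'
$ctpElements = Split-Segment 'CTP**UCP*10.20~'
$pidElements = Split-Segment 'PID*F*08***Desc 1~'

"Qty:   $($po1Elements[2])"
"UPC:   $($po1Elements[7])"
"Price: $($ctpElements[3])"
"Desc:  $($pidElements[5])"
```

Positional access like this only holds while every line of a given type has the same number of values before the one being read.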

Script to work with:

$InputFile = "C:\Data\original.txt"
$OutputFile = "C:\Data\new.txt"

Function ParseText ($OutputFile,$InputFile){
  Begin{
    $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
    Set-Content $OutputFile $null
    $Data = Get-Content -Path $InputFile | ?{$_ -match "(BEG\*)|(PO1\*)|(CTP\*)|(PID\*)|(IEA\*)"}
  }
  Process{
    $Data | % {
      #Check the end of data set
      If ($_ -match "IEA\*"){
        If($poline -ne $null -and $total -ne $null){
          $Totline = "Total: `$$Total"
          Write-host "This is Total `$$total"
          #Write the data collection to output file
          "$poline$line`r`n$totline`r`n" | out-file $OutputFile -Encoding UTF8 -Append
          #reset the variables
          $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
        }
      }
      #Collect the PO Number
      If($_ -match "(SA\*)(\d{3,10})"){
        $poline = "PONumber: $($Matches[2])"
      }
      #Collect  G80 details and following G81 Description
      If($_ -match "(PO1\*{1,})(\d{1,})\*\w{2}\*{1,}\w{2}\*(\d{1,})"){
        $count = $Matches[2]
        $UPC = $Matches[3]
      }
      If($_ -match "(CTP\*{1,}).*\*(\d{1,}(\.\d{1,})?)"){
        $i++
        $total += [double]$count * [double]$Matches[2]
        $Desc = $Data[([Array]::IndexOf($Data,$_)+1)] -Replace "PID\*\w\*{1,}(\d+\*{1,})?|<NL>"
        [String[]]$line += "`r`nItem $i`: UPC: $UPC, Desc: $Desc, Qty: $count Price  `$$($Matches[2])"
      }
    }
  }
}

If(Test-Path $InputFile){
  ParseText $OutputFile $InputFile
}

The output desired is similar to this:
PONumber: 235454654
Item 1: UPC: 12345678945612, Desc: Desc 1, Qty: 88 Price  $10.00
Item 2: UPC: 12345678921111, Desc: Desc 2, Qty: 88 Price  $10.50
Item 3: UPC: 12345678955555, Desc: Desc 3, Qty: 88 Price  $12.00
Item 4: UPC: 12345644545455, Desc: Desc 4, Qty: 88 Price  $5.00
Item 5: UPC: 12345689989889, Desc: Desc 6, Qty: 88 Price  $1.00
Total: $100.00

PONumber: 235454100
Item 1: UPC: 12345678945612, Desc: Desc 1, Qty: 88 Price  $10.00
Item 2: UPC: 12345678921111, Desc: Desc 2, Qty: 88 Price  $10.50
Item 3: UPC: 12345678955555, Desc: Desc 3, Qty: 88 Price  $12.00
Item 4: UPC: 12345644545455, Desc: Desc 4, Qty: 88 Price  $5.00
Item 5: UPC: 12345689989889, Desc: Desc 6, Qty: 88 Price  $1.00
Total: $100.00

The script is not working correctly.

If there is one PO with, say, 5 item lines, the output repeats the first item's Description, Qty and Price 5 times instead of showing the data that belongs to each line in original.txt. The UPC is picked up correctly; only the rest of the data on each line is wrong.

What modification do I need to make to the PowerShell script to get the desired output?
Question by:100questions
LVL 40

Expert Comment

by:footech
ID: 40351076
If this is building off previous posts, you should include a link to the post(s), instead of leaving it for others to search through your previous questions to see what you might be referring to.

There's not enough sample data here to see what's different and where (at least I'm not seeing it). I'll repeat my earlier advice.
Go through all of your data and work out the variations that can occur. You can post all the lines that vary here (in fact that would probably be best). Group them according to what data should be extracted from each line. For example, here are a couple of examples of the PO line (note where the PO number is in each case):
BEG*00*SA*1234567**20140917<NL>
G20*N*20140902*98X123456~

At least one sample for each variation would be good.

Variations can include:
 -placement of a desired portion within a line
 -what characters are surrounding the portion
 -how the portion itself can vary, like
    -what kind of characters it can include (e.g. numbers only, letters and numbers, etc.)
    -the length of the portion
 All these things are needed to come up with a regex pattern that will match your data under various circumstances.  Otherwise you're stuck with modifying the regex pattern again and again each time something slightly different occurs.
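
One way to act on this advice is to keep the collected sample lines in a list and run a candidate pattern against all of them before touching the script. A minimal sketch, using the two PO lines quoted above and the PO-number pattern from the posted script; the variable names are made up for illustration:

```powershell
# Sample lines taken from the comment above; add one per variation found in the real data.
$samples = @(
    'BEG*00*SA*1234567**20140917<NL>',
    'G20*N*20140902*98X123456~'
)

# Pattern the posted script uses to pick up the PO number.
$pattern = '(SA\*)(\d{3,10})'

# Record, for every sample, whether the pattern matched and what it captured.
$results = foreach ($s in $samples) {
    $m = [regex]::Match($s, $pattern)
    [pscustomobject]@{
        Line    = $s
        Matched = $m.Success
        Value   = if ($m.Success) { $m.Groups[2].Value } else { $null }
    }
}

$results | Format-Table -AutoSize
```

With these two samples the BEG line matches and the G20 line does not, which is the kind of gap the list of variations is meant to expose.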
 

Author Comment

by:100questions
ID: 40351106
@subsun - since you were very much involved in previous solutions, I would also welcome assistance from you.
 
LVL 40

Expert Comment

by:Subsun
ID: 40352730
To confirm: the PONumber is 4610173073, taken from this line:
BEG*00*SA*4610173073**20140929~


The UPC (12345678912345) and Qty (10) come from this line:
PO1**10*CA***UK*12345678912345~

The Price ($10.20) comes from this line:
CTP**UCP*10.20~

The Desc (Desc 1) comes from this line:
PID*F*08***Desc 1~

And this line comes after each data set:
IEA*1*000001234~

Is that correct?
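
The mapping above can be checked directly against the patterns in the posted script. A minimal sketch: the first three patterns are copied unchanged from the script, while the description pattern has |~ added so the trailing terminator is stripped as well; the variable names are made up for illustration:

```powershell
# The four sample lines from the comment above.
$begLine = 'BEG*00*SA*4610173073**20140929~'
$po1Line = 'PO1**10*CA***UK*12345678912345~'
$ctpLine = 'CTP**UCP*10.20~'
$pidLine = 'PID*F*08***Desc 1~'

# PO number: the digits that follow 'SA*'.
if ($begLine -match '(SA\*)(\d{3,10})') { $po = $Matches[2] }

# Quantity and UPC from the PO1 line.
if ($po1Line -match '(PO1\*{1,})(\d{1,})\*\w{2}\*{1,}\w{2}\*(\d{1,})') {
    $qty = $Matches[2]
    $upc = $Matches[3]
}

# Price: the last number preceded by '*' on the CTP line.
if ($ctpLine -match '(CTP\*{1,}).*\*(\d{1,}(\.\d{1,})?)') { $price = $Matches[2] }

# Description: remove the leading codes and the terminator, keep the rest.
$desc = $pidLine -replace 'PID\*\w\*{1,}(\d+\*{1,})?|<NL>|~'

"PONumber: $po, UPC: $upc, Qty: $qty, Price: $price, Desc: $desc"
```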

Author Comment

by:100questions
ID: 40352776
Yes, that is all correct, and thanks for looking at this. To be more specific, each data set starts with a line that begins with ISA*00* ... and ends with a line that begins with IEA*1*...
Hope this helps.
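
Those two boundary lines are enough to split a file into data sets before any values are extracted. A minimal sketch under that assumption, with made-up sample lines (the rest of each ISA line is left out here, and the variable names are for illustration only):

```powershell
# Made-up sample input containing two data sets.
$lines = @(
    'ISA*00*~',                          # remaining ISA values omitted in this sample
    'BEG*00*SA*4610173073**20140929~',
    'PO1**10*CA***UK*12345678912345~',
    'IEA*1*000001234~',
    'ISA*00*~',
    'BEG*00*SA*4610173074**20140929~',
    'IEA*1*000001235~'
)

$sets = New-Object 'System.Collections.Generic.List[object]'
$current = $null

foreach ($l in $lines) {
    # An ISA line opens a new data set.
    if ($l -match '^ISA\*') { $current = New-Object 'System.Collections.Generic.List[string]' }
    # Lines outside an ISA..IEA pair are ignored.
    if ($null -ne $current) { $current.Add($l) }
    # An IEA line closes the data set and stores it.
    if ($l -match '^IEA\*' -and $null -ne $current) {
        $sets.Add($current.ToArray())
        $current = $null
    }
}

"Data sets found: $($sets.Count)"
```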
 
LVL 40

Expert Comment

by:Subsun
ID: 40352843
I don't think there will be a problem with the script unless the input file contains duplicate CTP lines. Do you expect duplicate CTP lines in the input file?

CTP**UCP*10.20~
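
The reason duplicates matter is that the posted script finds the description with [Array]::IndexOf, which searches by value and returns the position of the first equal line. A minimal sketch with made-up data for two items that share a price:

```powershell
# Filtered lines for two items with the same price (made-up sample data).
$Data = @(
    'PO1**10*CA***UK*11111111111111~',
    'CTP**UCP*10.20~',
    'PID*F*08***Desc 1~',
    'PO1**5*CA***UK*22222222222222~',
    'CTP**UCP*10.20~',
    'PID*F*08***Desc 2~'
)

# The second CTP line sits at index 4, but IndexOf compares by value
# and stops at the first equal string.
$firstHit = [Array]::IndexOf($Data, $Data[4])

"IndexOf for the second CTP line: $firstHit"
"Line the script would read the description from: $($Data[$firstHit + 1])"
```

Here the lookup lands on the first item's PID line, so the second item would be given the first item's description.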
 

Author Comment

by:100questions
ID: 40352993
Yes, it definitely will happen since some products will have the same price.
The price found in the CTP should always correspond to the PO1 information which is before it.
That's probably why there are issues with the output.
Is this something that could be fixed?
 
LVL 40

Expert Comment

by:Subsun
ID: 40353373
It can be fixed. Will there always be a PID line after the CTP line?
 

Author Comment

by:100questions
ID: 40353492
Yes, there will, thanks very much.
 
LVL 40

Accepted Solution

by:
Subsun earned 500 total points
ID: 40353536
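Try this. Each item line is now built when its own PID line is read, so the description no longer depends on looking up the CTP line by value: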
$InputFile = "C:\Data\original.txt"
$OutputFile = "C:\Data\new.txt"

Function ParseText ($OutputFile,$InputFile){
  Begin{
    $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
    #Start with an empty output file
    Set-Content $OutputFile $null
    #Keep only the lines the report needs
    $Data = Get-Content -Path $InputFile | ?{$_ -match "(BEG\*)|(PO1\*)|(CTP\*)|(PID\*)|(IEA\*)"}
  }
  Process{
    $Data | % {
      #Check the end of data set
      If ($_ -match "IEA\*"){
        If($poline -ne $null -and $total -ne $null){
          $Totline = "Total: `$$Total"
          Write-host "This is Total `$$total"
          #Write the data collection to output file
          "$poline$line`r`n$totline`r`n" | out-file $OutputFile -Encoding UTF8 -Append
          #reset the variables
          $line,$poline,$totline,$total,$i = $null,$null,$null,$null,0
        }
      }
      #Collect the PO Number
      If($_ -match "(SA\*)(\d{3,10})"){
        $poline = "PONumber: $($Matches[2])"
      }
      #Collect the quantity and UPC from the PO1 line
      If($_ -match "(PO1\*{1,})(\d{1,})\*\w{2}\*{1,}\w{2}\*(\d{1,})"){
        $count = $Matches[2]
        $UPC = $Matches[3]
      }
      #Collect the price from the CTP line and add it to the running total
      If($_ -match "(CTP\*{1,}).*\*(\d{1,}(\.\d{1,})?)"){
        $Price = $Matches[2]
        $total += [double]$count * [double]$Price
      }
      #The PID line completes the item: take its description and build the item line
      If($_ -match "PID\*"){
        $i++
        $Desc = $_ -Replace "PID\*\w\*{1,}(\d+\*{1,})?|<NL>|~"
        [String[]]$line += "`r`nItem $i`: UPC: $UPC, Desc: $Desc, Qty: $count Price  `$$Price"
      }
    }
  }
}

If(Test-Path $InputFile){
  ParseText $OutputFile $InputFile
}
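
The desired output in the question shows amounts with two decimal places (for example $10.00), whereas a [double] total is printed with however many digits it happens to have. One possible refinement, which is not part of the accepted script, is to accumulate in [decimal] and format the total explicitly. A minimal sketch with made-up quantities and prices:

```powershell
# Made-up quantity and price strings, as they would come out of the PO1 and CTP matches.
$items = @(
    @{ Count = '10'; Price = '10.20' },
    @{ Count = '3';  Price = '0.10'  }
)

$inv = [System.Globalization.CultureInfo]::InvariantCulture
[decimal]$total = 0

foreach ($item in $items) {
    # Casting a string to [decimal] uses the invariant culture, so '.' is the decimal separator.
    $total += [decimal]$item.Count * [decimal]$item.Price
}

# 'F2' always prints two decimal places; the invariant culture keeps '.' in the output.
$totline = 'Total: $' + $total.ToString('F2', $inv)
$totline
```

The same ToString('F2', ...) call also works on a [double], so the formatting could be applied to the Total line on its own if the switch to [decimal] is not wanted.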


PS: As I explained earlier (and as my fellow expert footech also explained), if you work out the possible variations of the input files, you can get a single script that parses all types of input file. Otherwise you will end up with a separate script for each set of input files.
 

Author Comment

by:100questions
ID: 40354445
Thanks for the advice. I only have one more left, which I've posted after this. If I have a handful in the future, I will group the requests together. For now the separate PowerShell files are working well and serve my purpose.
As for this solution, it works very well.  Excellent work.
 

Author Closing Comment

by:100questions
ID: 40354446
Excellent work.