D B asked:

PowerShell Query to calculate MD5 and add to an audit file.

I actually have two issues. I want to calculate checksums for files I'm receiving, based on the contents of an audit file we receive with them. Each row of the audit file has a filename, a row count, and the MD5 checksum from the source. I want to calculate the MD5 on our end and add it to the audit file.
The first issue is that I'm trying to read the file as a stream, but I'm getting this error:
Get-FileHash : A parameter cannot be found that matches parameter name 'inputstream'.
The code is:
$stream = [System.IO.File]::OpenRead('D:\Input\account_data.txt')
$hash = Get-FileHash -inputstream $stream -Algorithm MD5
$stream.Close()

I am running v4.0 of PowerShell. It works using a file path, but some of our files could be quite large and I'd rather use a stream to avoid memory issues.

The second is how to add the calculated value to the end of the proper row and maintain the original format of the audit file (with the exception of adding our calculated checksum).

The code I'm using for that is:
$AuditContents = import-csv $AuditPathAndName -Delimiter "|" -Header filename,rowcount,source_checksum

ForEach ($row in $AuditContents) {
    $file = $DataInputPath + $row.filename
    $fileHash = Get-FileHash -Path $file -Algorithm MD5
    $hash = $fileHash.Hash
    # add code here to add 'our' checksum to the row.
    }

So, if a certain row of incoming data had the following:
account_data.txt|1234|593D6592BD9B7F9174711AB136F5E751
and I'm processing account_data.txt, then after running my code, I would want the audit file to contain:
account_data.txt|1234|593D6592BD9B7F9174711AB136F5E751|593D6592BD9B7F9174711AB136F5E751

There are about 10 files listed in the audit file that will be processed. It is also important that each row in the audit file be terminated with just a linefeed if possible.
J0rtIT

Just change lines 3 and 4 (the two default paths in the param block).

[CmdletBinding()]
param(
    [Parameter(Mandatory=$false,Position=0,ValueFromPipeline=$true)]$AuditPathAndName="D:\lib\Desktop\text.csv",
    [Parameter(Mandatory=$false,Position=1,ValueFromPipeline=$true)]$TxtFilePath="D:\lib\Desktop\text.txt"
)

function Get-Hash{
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true,Position=0,ValueFromPipeline=$true)]$AuditPathName,
        [Parameter(Mandatory=$true,Position=1,ValueFromPipeline=$true)]$FilePath
    )
    begin{
        if(!(Test-Path "$AuditPathName")){
            Write-Error "The CSV file couldn't be found at $AuditPathName"
            exit
        }

        if(!(Test-Path "$FilePath")){
            Write-Error "The file doesn't exist: $FilePath"
            exit
        }
    }
    process{
        try{
            $stream = [System.IO.File]::OpenRead('D:\lib\Desktop\text.txt')
            $hash = Get-FileHash -inputstream $stream -Algorithm MD5
            $stream.Close()
        }
        catch{
            Write-Error "There was an error while calculating the MD5 hash with the message: $($_.exception.Message)"
            exit
        }
    }
    end{
        return $hash
    }
}

$AuditContents = import-csv $AuditPathAndName -Delimiter '|' -Header filename,rowcount,source_checksum

ForEach ($row in $AuditContents) {
    $Hash=  Get-Hash $AuditPathAndName $TxtFilePath
    $row| Add-Member -Name "InternalCheckSum" -Value $Hash.Hash -MemberType NoteProperty
}

$AuditContents | ConvertTo-Csv -Delimiter '|' -NoTypeInformation | % {$_.Replace('"','')} | Out-File "D:\lib\Desktop\here2.csv"

This works for a single file and a single row.

As for the first error, the file must exist and be accessible for the command to work.
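
On the first error itself: the message says the parameter name is not recognised, which suggests that the Get-FileHash shipped with PowerShell 4.0 simply has no -InputStream parameter (as far as I know it only arrived in 5.0). Also, to my knowledge, Get-FileHash -Path already reads the file through a stream internally, so a large file should not be loaded into memory all at once. If you still want to hash from a stream on 4.0, here is a minimal sketch that calls the .NET MD5 class directly. The function name Get-StreamMD5 is hypothetical, not a built-in cmdlet, and the path should be a full path because .NET does not follow the PowerShell current location.

```powershell
# Hypothetical helper (not a built-in cmdlet): MD5 of a file, read as a stream.
function Get-StreamMD5 {
    param(
        [Parameter(Mandatory=$true)][string]$Path   # assumed to be a full path
    )
    $md5    = [System.Security.Cryptography.MD5]::Create()
    $stream = [System.IO.File]::OpenRead($Path)
    try {
        # ComputeHash(Stream) reads the stream in blocks rather than all at once.
        $bytes = $md5.ComputeHash($stream)
        # Format as upper-case hex without dashes, like Get-FileHash's .Hash.
        return [System.BitConverter]::ToString($bytes).Replace('-', '')
    }
    finally {
        # Always release the file handle, even if hashing throws.
        $stream.Dispose()
        $md5.Dispose()
    }
}
```

It would be called as Get-StreamMD5 -Path 'D:\Input\account_data.txt' and returns the hash as a plain string.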
D B (ASKER)

A few tweaks: you have 'D:\lib\Desktop\text.txt' hard-coded as the file to compute the hash from, but the filename should come from the audit file. I'm sure that was just an oversight and that it is supposed to be $FilePath.
I'm not sure why the function requires two parameters. I would think all it needs is the full path to the file I want the hash of. The only thing I see being done with $AuditPathName is checking whether it exists, and if it didn't, the function would never be called, since the foreach is passing rows from it.

I think I can make my existing code work from this, with just a couple of questions. Is there a way to create an output file whose rows are terminated with just LF? The output from Out-File terminates rows with CRLF. I currently have the following as the last line of code:
$AuditContents | ConvertTo-Csv -Delimiter '|' -NoTypeInformation | % {$_.replace('"','')} | select -skip 1 | Out-File "D:\NewAudit.txt"

to suppress the header row.
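
One way to sidestep the header and quote clean-up altogether might be to skip the CSV round trip: read the audit file as plain lines and append the hash to each line, so the original columns are never re-serialised. This is only a sketch under a few assumptions: the first pipe-delimited field is the filename, the data files live in one folder, and Get-FileHash -Path is available (it is in 4.0). The function name Add-LocalChecksum and its parameter names are hypothetical.

```powershell
# Hypothetical helper: emits each audit row with a locally computed MD5 appended.
function Add-LocalChecksum {
    param(
        [Parameter(Mandatory=$true)][string]$AuditPath,      # full path to the audit file
        [Parameter(Mandatory=$true)][string]$DataInputPath   # folder holding the data files
    )
    # ReadAllLines accepts LF as well as CRLF line endings.
    foreach ($line in [System.IO.File]::ReadAllLines($AuditPath)) {
        if ($line.Trim() -eq '') { continue }                # skip blank lines
        $fileName = $line.Split('|')[0]                      # first field is the filename
        $file     = Join-Path $DataInputPath $fileName
        $hash     = (Get-FileHash -Path $file -Algorithm MD5).Hash
        "$line|$hash"                                        # original row plus our checksum
    }
}
```

Because the original row text is reused as-is, there is no header row to skip and no quotes to strip. The result is an array of strings, one per audit row, which can then be written out however you like.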
ASKER CERTIFIED SOLUTION
J0rtIT

This solution is only available to members.
D B (ASKER)

I had already got it working. My last post was mainly a comment, except for the question of how to terminate lines with just a linefeed rather than CRLF. Any help there?

J0rtIT

I don't really know what you mean by a "lineFeed".
Do you want to have all of that in a single line?
D B (ASKER)

Each line of text data is terminated with either a CRLF (typically on Windows systems) or just an LF (Unix systems). This data comes in with just LF, but when I write the calculated MD5 value and create a new file, Out-File terminates each line with a CRLF. I want to terminate each line with just a linefeed.
D B (ASKER)

This worked for calculating the hash. I was able to change my format file to expect CR-LF instead of just LF, so that part is okay. Wish I could have figured out how to write the output with just a linefeed.
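
For anyone who lands here with the same linefeed question, one approach that should work on PowerShell 4.0 is to join the rows with "`n" yourself and write the text through .NET, bypassing Out-File. A minimal sketch follows. The function name Write-LfFile is hypothetical, and UTF-8 without a byte order mark is an assumption about what the downstream loader expects, so swap the encoding if yours differs.

```powershell
# Hypothetical helper: writes lines terminated with LF only (no CR).
function Write-LfFile {
    param(
        [Parameter(Mandatory=$true)][string]$Path,      # full path of the output file
        [Parameter(Mandatory=$true)][string[]]$Lines    # one string per output row
    )
    # Join with LF and add a trailing LF so the last row is terminated too.
    $text = ($Lines -join "`n") + "`n"
    # Assumption: UTF-8 without BOM; change this if another encoding is needed.
    $encoding = New-Object System.Text.UTF8Encoding($false)
    [System.IO.File]::WriteAllText($Path, $text, $encoding)
}
```

For example, Write-LfFile -Path 'D:\NewAudit.txt' -Lines $rows, where $rows is whatever array of pipe-delimited strings was going to Out-File. WriteAllText writes the string exactly as given, so no CR characters are added.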