Replacing a bunch of characters in a text file using PowerShell

I am working on PowerShell code that will replace a set of special characters ('¿', 'Ù', 'À', 'Ú', '³', 'Ä') in the text files in a folder. Each text file is 300 MB in size and the characters are repeated many times in it. I tried doing it manually, but since there are many such files, that is almost impossible. I am using Windows 7 and PowerShell version 3. The code I have is below.
The issue is that when I run the code it creates a new file (NOV_1995_NEW.txt), but none of the characters in the new file are changed as specified in the code. Help is very much appreciated.

$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}

$original_file = 'C:\FilePath\NOV_1995.txt'
$destination_file =  'C:\FilePath\NOV_1995_NEW.txt'

Get-Content -Path $original_file | ForEach-Object {
    $line = $_

    $lookupTable.GetEnumerator() | ForEach-Object {
        if ($line -match $_.Key)
        {
            $line = $line -replace $_.Key, $_.Value
        }
    }
   $line
} | Set-Content -Path $destination_file
Flex TronDeveloperAsked:
David Johnson, CD, MVPRetiredCommented:
function Replace-Characters
{
    <#
        .SYNOPSIS
        Short Description
        .DESCRIPTION
        Detailed Description
        .EXAMPLE
        Remove-Something
        explains how to use the command
        can be multiple lines
        .EXAMPLE
        Remove-Something
        another example
        can have as many examples as you like
    #>
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory=$false, Position=0)]
        [Object]
        $lookupTable = @{ '¿' = '|';'Ù' = '|';'À' = '|';'Ú' = '|';'³' = '|';'Ä' = '-'},
        
        [Parameter(Mandatory=$false, Position=1)]
        [System.String]
        $original_file = 'C:\FilePath\NOV_1995.txt',
        
        [Parameter(Mandatory=$false, Position=2)]
        [System.String]
        $destination_file = 'C:\FilePath\NOV_1995_NEW.txt',
        
        [Parameter(Mandatory=$false, Position=3)]
        [Object]
        $contents = (Get-Content -Path $original_file),
        
        [Parameter(Mandatory=$false, Position=4)]
        [Object]
        $output=@()
    )
    
    
    
    foreach ($line in $contents){
        $lookupTable.GetEnumerator() | ForEach-Object {
            if ($line -match $_.Key) {
                $line = $line -replace $_.Key, $_.Value
            }
        }
        $output += $line
    } 
    $output | out-file -FilePath $destination_file
}

$filelist = Get-ChildItem c:\inputdir
foreach ($file in $filelist) {
    $original_file = $file.FullName
    $destination_file = $original_file.Substring(0, $original_file.Length - 4) + ' NEW.txt'
    # Bind by name: positionally, the first argument would land in $lookupTable
    Replace-Characters -original_file $original_file -destination_file $destination_file
}
Flex TronDeveloperAuthor Commented:
Thanks David,
I took your code (shown in grey) and changed the source and destination file paths. But when I execute it, nothing happens in the PowerShell terminal. I am attaching the sample text file.
NOV_1995.txt
David Johnson, CD, MVPRetiredCommented:
function Replace-Characters 
{
     
    <#
        .SYNOPSIS
        Short Description
        .DESCRIPTION
        Detailed Description
        .EXAMPLE
        Remove-Something
        explains how to use the command
        can be multiple lines
        .EXAMPLE
        Remove-Something
        another example
        can have as many examples as you like
    #>
    [CmdletBinding()]
    param     (
        $lookupTable = @{ '¿' = '|';'Ù' = '|';'À' = '|';'Ú' = '|';'³' = '|';'Ä' = '-'},
     [string]  $original_file,
     [string]  $destination_file
        )
    $output=@()
    $contents = Get-Content -Path $original_file
    $linecounter = 0
    $max = $contents.Count
    
    foreach ($line in $contents){
    $linecounter++
    write-progress -Activity 'Checking lines' -status "line # $linecounter of $max" -PercentComplete ($linecounter/$max * 100)
          $lookupTable.GetEnumerator() | ForEach-Object {
            if ($line -match $_.Key) {
                $line = $line -replace $_.Key, $_.Value
            }
        }
        $output += $line
    } 
    $output | out-file -FilePath $destination_file
}
replace-characters -original_file $original_file -destination_file $destination_file

$filelist = get-childitem c:\filepath\
foreach ($file in $filelist){
$original_file = $file.fullname
$destination_file = $original_file.Substring(0,$original_file.length-4) + '_NEW.txt'
$destination_file
G:\Documents\WindowsPowershell\Scripts\replace-characters.ps1 -original_file $original_file -destination_file $destination_file
} 

David Johnson, CD, MVPRetiredCommented:
I changed the output line to
 $output | Out-File -FilePath $destination_file -Encoding ascii

so it doesn't write the file as UTF-16 ("Unicode"), which is Out-File's default in Windows PowerShell.
NOV_1995_NEW.txt
Flex TronDeveloperAuthor Commented:
Thanks. But will this work in the PowerShell window?
Suppose I save your first code as string_replacer1.ps1 and the second code as string_replacer2.ps1.
Will it work if I run

ps> .\string_replacer1.ps1

I am not sure how to execute your code.

Thanks
David Johnson, CD, MVPRetiredCommented:
Add the calling code to the bottom of the original script, after the function:
function Update-Characters
{
     [CmdletBinding()]
    param     (
        $lookupTable = @{ '¿' = '|';'Ù' = '|';'À' = '|';'Ú' = '|';'³' = '|';'Ä' = '-'},
        [string] $source,
        [string] $destination
        )
    $output=@()
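    # A [string] parameter is never $null and -IsValid only checks path syntax,
    # so test that the path is non-empty and actually exists.
    if ($source -and (Test-Path -Path $source))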
    {
    write-debug('Source: '+$source)
    $contents = Get-Content $source
        $linecounter = 0
        $max = $contents.Count
    
        foreach ($line in $contents){
            $linecounter++
            write-progress -Activity 'Checking lines' -status "line # $linecounter of $max" -PercentComplete ($linecounter/$max * 100)
            $lookupTable.GetEnumerator() | ForEach-Object {
                if ($line -match $_.Key) {
                    $line = $line -replace $_.Key, $_.Value
                }
            }
            $output += $line
        } 
        $output | out-file -FilePath $destination -Encoding ascii
    }
    else {
        Write-Output ('Original File: ' + $source + ' not found')
        return
    }

}

 
$sourcedir = 'C:\filepath'
$filelist = Get-ChildItem $sourcedir -File
foreach ($file in $filelist) {
    # Build "<name>_NEW.txt" in the same folder as the source file
    $destination = Join-Path $file.DirectoryName ($file.BaseName + '_NEW.txt')

    Write-Output ('Source: ' + $file.FullName + ' Destination: ' + $destination)
    Update-Characters -source $file.FullName -destination $destination
}
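
A note on running code like this from a PowerShell window: a .ps1 file that only defines a function produces no output when it is run, which may be why nothing appeared to happen earlier. The following is a minimal sketch, assuming a throwaway script in the temp folder. The file name string_replacer_demo.ps1 and the Get-Greeting function are hypothetical and only there for illustration. Dot-sourcing loads the function into the current session so it can then be called by name.

```powershell
# Hypothetical demo script: it only defines a function and never calls it.
$scriptPath = Join-Path ([System.IO.Path]::GetTempPath()) 'string_replacer_demo.ps1'
Set-Content -Path $scriptPath -Value 'function Get-Greeting { param([string]$Name) "Hello, $Name" }'

# Running the script normally defines the function in a child scope that is
# discarded when the script ends, so nothing visible happens.
& $scriptPath
$definedAfterRun = [bool](Get-Command -Name Get-Greeting -CommandType Function -ErrorAction SilentlyContinue)

# Dot-sourcing (a dot, a space, then the path) runs the script in the current
# scope, so the function stays available afterwards.
. $scriptPath
$definedAfterDotSource = [bool](Get-Command -Name Get-Greeting -CommandType Function -ErrorAction SilentlyContinue)

# Now the function can be called by name.
$greeting = Get-Greeting -Name 'world'
$greeting

Remove-Item -Path $scriptPath
```

The alternative used above, putting the calling loop at the bottom of the same file as the function, avoids dot-sourcing altogether because the definition and the call run in the same scope.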

footechCommented:
I tested the code in the original question and it worked fine, replacing all the characters.  However, for performance reasons I would suggest making greater use of regex capabilities: search for any of the matching characters in one pass rather than searching for one specific character at a time.
The simplest example is
Get-Content -Path $original_file -ReadCount 0 | ForEach-Object {
    $_ -replace '¿|Ù|À|Ú|³','|' -replace 'Ä','-'
} | Set-Content -Path $destination_file
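
The same idea can also be written with a regex character class. This is a minimal sketch under two assumptions: the characters are given by their code points (0xBF and so on) so that the result does not depend on how the .ps1 file itself is saved, and Convert-BoxCharacters is a hypothetical helper name. It uses -creplace because -replace is case-insensitive and would typically also change lower-case letters such as 'ä'.

```powershell
# Code points of the characters from the question: ¿ Ù À Ú ³ (to '|') and Ä (to '-').
$pipeChars   = [char[]](0xBF, 0xD9, 0xC0, 0xDA, 0xB3)
$pipePattern = '[' + (-join $pipeChars) + ']'     # a character class like [¿ÙÀÚ³]
$dashPattern = [string][char]0xC4

function Convert-BoxCharacters {                   # hypothetical helper name
    param([string[]]$Line)
    # -creplace is the case-sensitive form of -replace; it works on arrays too,
    # so it can be fed the output of Get-Content -ReadCount 0.
    $Line -creplace $pipePattern, '|' -creplace $dashPattern, '-'
}

# Small in-memory check: Ú Ä Ä ¿ becomes |--|
$sample    = -join ([char[]](0xDA, 0xC4, 0xC4, 0xBF))
$converted = Convert-BoxCharacters -Line $sample
$converted
```

None of these characters is special inside a character class, so no escaping is needed here; a different character set might need it.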

It is possible to take your lookup (hash) table and generate the regex patterns from that, but it's a bit of a mess.
$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}
$original_file = 'C:\FilePath\NOV_1995.txt'
$destination_file =  'C:\FilePath\NOV_1995_NEW.txt'
$patterns = $lookupTable.GetEnumerator() | Select -ExpandProperty Value -Unique | ForEach `
 {
    $value = $_
    [pscustomobject] @{ 
                pattern = ($lookupTable.GetEnumerator() | Where { $_.value -eq $value } | Select -ExpandProperty Name) -join "|"
                replacement = $value
                }
 }
Get-Content -Path $original_file -ReadCount 0 | ForEach-Object {
    $line = $_
    foreach ($pattern in $patterns)
    {
        Write-Host "Matching pattern ""$($pattern.pattern)"" and replacing with ""$($pattern.replacement)""" -ForegroundColor Yellow
        $line = $line -replace $pattern.pattern,$pattern.replacement
    }
    $line
} | Set-Content -Path $destination_file
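
An alternative to grouping the keys by replacement value is to build one pattern from all the keys and look each match up in the table. The following is only a sketch of that variation, not the approach used above. Convert-WithLookup is a hypothetical helper name, and the keys are written as code points so the example does not depend on the encoding of the script file.

```powershell
# Same mapping as the lookup table in the question, keyed by code point.
$lookupTable = @{}
foreach ($code in 0xBF, 0xD9, 0xC0, 0xDA, 0xB3) {
    $lookupTable[[string][char]$code] = '|'
}
$lookupTable[[string][char]0xC4] = '-'

# One alternation built from every key; Escape guards against regex metacharacters.
$pattern = ($lookupTable.Keys | ForEach-Object { [regex]::Escape($_) }) -join '|'
$regex   = [regex]$pattern

# The script block is called once per match and returns the replacement text.
$evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
    param($m)
    $lookupTable[$m.Value]
}

function Convert-WithLookup {                      # hypothetical helper name
    param([string]$Text)
    $regex.Replace($Text, $evaluator)
}

# À Ä Ä Ù followed by plain text
$sample    = (-join ([char[]](0xC0, 0xC4, 0xC4, 0xD9))) + ' total'
$converted = Convert-WithLookup -Text $sample
$converted
```

This makes a single pass over the text however many distinct replacement values the table holds, at the cost of a script block call per match.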

Flex TronDeveloperAuthor Commented:
Both solutions by footech and David work. Thank you.
Flex TronDeveloperAuthor Commented:
Thanks very much for your input, footech.
Is there a way to loop this code over all the text files in a directory?
footechCommented:
Of course.  David shows an example of that in the loop at the end of his last code block, which takes care of dynamically naming the output files.  I've reworked it a bit and adjusted my code accordingly.
$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}

$patterns = $lookupTable.GetEnumerator() | Select -ExpandProperty Value -Unique | ForEach `
 {
    $value = $_
    [pscustomobject] @{ 
                pattern = ($lookupTable.GetEnumerator() | Where { $_.value -eq $value } | Select -ExpandProperty Name) -join "|"
                replacement = $value
                }
 }

$sourcedir = 'C:\filepath'
Get-ChildItem $sourcedir -File | ForEach `
{
    $filepath = Split-Path $_.FullName -Parent
    $destinationfile = "$($_.BaseName)_NEW$($_.Extension)"
    $destination = Join-Path $filepath $destinationfile
  
    Write-Host "Source: $($_.Name)  Destination: $destinationfile"

    Get-Content -Path $_.FullName -ReadCount 0 | ForEach-Object {
        $line = $_
        foreach ($pattern in $patterns)
        {
            Write-Host "Matching pattern ""$($pattern.pattern)"" and replacing with ""$($pattern.replacement)""" -ForegroundColor Yellow
            $line = $line -replace $pattern.pattern,$pattern.replacement
        }
        $line
    } | Set-Content -Path $destination
}
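
With files of around 300 MB, -ReadCount 0 holds each whole file in memory as one array. If that turns out to be a problem, the file can be streamed line by line with the .NET StreamReader and StreamWriter classes instead. This is a sketch only: Convert-FileStreaming is a hypothetical helper name, and the encoding is an assumption, since the thread does not say how the files are encoded. The demo at the end uses UTF-8 on a small temp file.

```powershell
function Convert-FileStreaming {                   # hypothetical helper name
    param(
        [string]$Source,       # use full paths: .NET ignores PowerShell's current location
        [string]$Destination,
        [System.Text.Encoding]$Encoding = [System.Text.Encoding]::Default   # assumption
    )
    $pipeChars   = [char[]](0xBF, 0xD9, 0xC0, 0xDA, 0xB3)   # ¿ Ù À Ú ³
    $pipePattern = '[' + (-join $pipeChars) + ']'
    $dashPattern = [string][char]0xC4                        # Ä

    $reader = New-Object System.IO.StreamReader($Source, $Encoding)
    try {
        $writer = New-Object System.IO.StreamWriter($Destination, $false, $Encoding)
        try {
            # ReadLine returns $null at end of file; only one line is held at a time.
            while ($null -ne ($line = $reader.ReadLine())) {
                $writer.WriteLine(($line -creplace $pipePattern, '|' -creplace $dashPattern, '-'))
            }
        }
        finally { $writer.Dispose() }
    }
    finally { $reader.Dispose() }
}

# Demo on a small temp file (hypothetical file names).
$utf8 = [System.Text.Encoding]::UTF8
$src  = Join-Path ([System.IO.Path]::GetTempPath()) 'stream_demo.txt'
$dst  = Join-Path ([System.IO.Path]::GetTempPath()) 'stream_demo_NEW.txt'
[string[]]$sampleLines = @(
    (-join ([char[]](0xDA, 0xC4, 0xBF))),          # Ú Ä ¿
    (-join ([char[]](0xB3, 0x78, 0xB3)))           # ³ x ³
)
[System.IO.File]::WriteAllLines($src, $sampleLines, $utf8)

Convert-FileStreaming -Source $src -Destination $dst -Encoding $utf8
$result = [System.IO.File]::ReadAllLines($dst, $utf8)
$result

Remove-Item -Path $src, $dst
```

Inside the directory loop above, this would be called with $_.FullName as the source and $destination as the destination, passing whatever encoding the files really use.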
