Replacing a bunch of characters in a text file using PowerShell

Flex Tron
I am working on PowerShell code that replaces a set of special characters ('¿', 'Ù', 'À', 'Ú', '³', 'Ä') in a text file in a folder. Each text file is 300 MB in size and the characters are repeated many times in it. I tried doing it manually, but since there are many such files, that is almost impossible. I am using Windows 7 and PowerShell version 3. I am attaching the code I have.
The issue is that when I run the code it creates a new file (NOV_1995_NEW.txt), but none of the characters listed in the code are changed in the new file. Help is very much appreciated.

$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}

$original_file = 'C:\FilePath\NOV_1995.txt'
$destination_file =  'C:\FilePath\NOV_1995_NEW.txt'

Get-Content -Path $original_file | ForEach-Object {
    $line = $_

    $lookupTable.GetEnumerator() | ForEach-Object {
        if ($line -match $_.Key)
        {
            $line = $line -replace $_.Key, $_.Value
        }
    }
    $line
} | Set-Content -Path $destination_file

Top Expert 2016

Commented:
function Replace-Characters
{
    <#
        .SYNOPSIS
        Replaces a set of characters in a text file.
        .DESCRIPTION
        Reads $original_file, applies each $lookupTable key/value pair as a
        -replace pattern and replacement, and writes the result to $destination_file.
        .EXAMPLE
        Replace-Characters -original_file 'C:\FilePath\NOV_1995.txt' -destination_file 'C:\FilePath\NOV_1995_NEW.txt'
    #>
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory=$false, Position=0)]
        [Object]
        $lookupTable = @{ '¿' = '|';'Ù' = '|';'À' = '|';'Ú' = '|';'³' = '|';'Ä' = '-'},
        
        [Parameter(Mandatory=$false, Position=1)]
        [System.String]
        $original_file = 'C:\FilePath\NOV_1995.txt',
        
        [Parameter(Mandatory=$false, Position=2)]
        [System.String]
        $destination_file = 'C:\FilePath\NOV_1995_NEW.txt',
        
        [Parameter(Mandatory=$false, Position=3)]
        [Object]
        $contents = (Get-Content -Path $original_file),
        
        [Parameter(Mandatory=$false, Position=4)]
        [Object]
        $output=@()
    )
    
    
    
    foreach ($line in $contents){
        $lookupTable.GetEnumerator() | ForEach-Object {
            if ($line -match $_.Key) {
                $line = $line -replace $_.Key, $_.Value
            }
        }
        $output += $line
    } 
    $output | out-file -FilePath $destination_file
}



$filelist = Get-ChildItem c:\inputdir
foreach ($file in $filelist) {
    $original_file = $file.FullName
    $destination_file = $original_file.Substring(0, $original_file.Length - 4) + ' NEW.txt'
    # Pass the paths by name: position 0 of the function belongs to $lookupTable.
    Replace-Characters -original_file $original_file -destination_file $destination_file
}
Flex Tron, Developer

Author

Commented:
Thanks David,
I took your code in the grey box and changed the file paths for the source and destination. But when I execute it, nothing happens in the PowerShell terminal. I am attaching the sample text file.
NOV_1995.txt
Top Expert 2016

Commented:
# Saved as replace-characters.ps1. The script-level param block receives the
# -original_file and -destination_file arguments passed by the calling loop.
param (
    [string] $original_file,
    [string] $destination_file
)

function Replace-Characters
{
    <#
        .SYNOPSIS
        Replaces a set of characters in a text file.
        .DESCRIPTION
        Reads $original_file, applies each $lookupTable key/value pair as a
        -replace pattern and replacement, and writes the result to $destination_file.
        .EXAMPLE
        Replace-Characters -original_file 'C:\FilePath\NOV_1995.txt' -destination_file 'C:\FilePath\NOV_1995_NEW.txt'
    #>
    [CmdletBinding()]
    param (
        $lookupTable = @{ '¿' = '|'; 'Ù' = '|'; 'À' = '|'; 'Ú' = '|'; '³' = '|'; 'Ä' = '-' },
        [string] $original_file,
        [string] $destination_file
    )
    $output = @()
    $contents = Get-Content -Path $original_file
    $linecounter = 0
    $max = $contents.Count

    foreach ($line in $contents) {
        $linecounter++
        Write-Progress -Activity 'Checking lines' -Status "line # $linecounter of $max" -PercentComplete ($linecounter / $max * 100)
        $lookupTable.GetEnumerator() | ForEach-Object {
            if ($line -match $_.Key) {
                $line = $line -replace $_.Key, $_.Value
            }
        }
        $output += $line
    }
    $output | Out-File -FilePath $destination_file
}

# Run the function with the paths this script was called with.
Replace-Characters -original_file $original_file -destination_file $destination_file


$filelist = Get-ChildItem c:\filepath\
foreach ($file in $filelist) {
    $original_file = $file.FullName
    $destination_file = $original_file.Substring(0, $original_file.Length - 4) + '_NEW.txt'
    $destination_file
    G:\Documents\WindowsPowershell\Scripts\replace-characters.ps1 -original_file $original_file -destination_file $destination_file
}

Top Expert 2016

Commented:
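I changed the output line to
 $output | out-file -FilePath $destination_file -Encoding ascii

so it no longer writes the file in Out-File's default encoding, which in Windows PowerShell is UTF-16 ("Unicode").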
NOV_1995_NEW.txt
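
To see what the -Encoding switch changes, here is a minimal sketch that writes the same line twice and compares the results. The file names and the sample text are made up for the demonstration, and both encodings are named explicitly so the outcome does not depend on which PowerShell version's default applies.

```powershell
# Hypothetical demo files in the temp folder; the names are illustrative only.
$tempDir     = [System.IO.Path]::GetTempPath()
$unicodePath = Join-Path $tempDir 'encoding_demo_unicode.txt'
$asciiPath   = Join-Path $tempDir 'encoding_demo_ascii.txt'

# One line containing a character outside 7-bit ASCII (0xC4 is 'Ä').
$text = 'A' + [char]0xC4 + 'B'

$text | Out-File -FilePath $unicodePath -Encoding unicode   # UTF-16 LE with a byte order mark
$text | Out-File -FilePath $asciiPath   -Encoding ascii     # 7-bit ASCII

$unicodeBytes = (Get-Item $unicodePath).Length
$asciiBytes   = (Get-Item $asciiPath).Length
$readBack     = Get-Content -Path $asciiPath

"unicode: $unicodeBytes bytes, ascii: $asciiBytes bytes, ascii content: $readBack"
```

The UTF-16 file is roughly twice the size, which matters with 300 MB inputs. The ASCII file cannot hold 'Ä', so that character comes back as '?'. In other words, -Encoding ascii is only safe once every non-ASCII character has been replaced.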
Flex Tron, Developer

Author

Commented:
Thanks. But will this work in the PowerShell window?
Suppose I save your first code block as string_replacer1.ps1 and the second as string_replacer2.ps1.
Will it work if I run the following?

ps> .\string_replacer1.ps1

I am not sure how to execute your code.

Thanks
Top Expert 2016
Commented:
Add the calling code to the bottom of the script that defines the function:
function Update-Characters
{
    [CmdletBinding()]
    param (
        $lookupTable = @{ '¿' = '|'; 'Ù' = '|'; 'À' = '|'; 'Ú' = '|'; '³' = '|'; 'Ä' = '-' },
        [string] $source,
        [string] $destination
    )
    # Proceed only if a path was supplied and it points to an existing file.
    if ($source -and (Test-Path -Path $source -PathType Leaf))
    {
        Write-Debug ('Source: ' + $source)
        $contents = Get-Content $source
        $linecounter = 0
        $max = $contents.Count

        # Collect the loop output directly instead of growing an array with +=.
        $output = foreach ($line in $contents) {
            $linecounter++
            Write-Progress -Activity 'Checking lines' -Status "line # $linecounter of $max" -PercentComplete ($linecounter / $max * 100)
            $lookupTable.GetEnumerator() | ForEach-Object {
                if ($line -match $_.Key) {
                    $line = $line -replace $_.Key, $_.Value
                }
            }
            $line
        }
        $output | Out-File -FilePath $destination -Encoding ascii
    }
    else {
        Write-Warning ('Original file: ' + $source + ' not found')
    }
}

$sourcedir = 'C:\filepath'
$filelist = Get-ChildItem $sourcedir -File
foreach ($file in $filelist) {
    # Build "<folder>\<name>_NEW.txt" next to the source file.
    $destination = Join-Path $file.DirectoryName ($file.BaseName + '_NEW.txt')

    Write-Output ('Source: ' + $file.FullName + ' Destination: ' + $destination)
    Update-Characters -source $file.FullName -destination $destination
}
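
On running this from the PowerShell window: a script file that only defines a function shows nothing when run as .\script.ps1, because the function lives in the script's own scope and is discarded when the script ends. Either put the calling code in the same file, as above, or dot-source the file so the function stays available in the session. A minimal sketch follows; the script name and the Convert-Line function are hypothetical and exist only for this demonstration.

```powershell
# Hypothetical demo: create a tiny script file that only defines a function.
$scriptPath = Join-Path ([System.IO.Path]::GetTempPath()) 'string_replacer_demo.ps1'
Set-Content -Path $scriptPath -Value @'
function Convert-Line {
    param([string] $Text)
    $Text -replace 'a', 'b'
}
'@

# Dot-sourcing (a dot, a space, then the path) runs the script in the current
# scope, so the function it defines remains available afterwards.
. $scriptPath

$result = Convert-Line -Text 'banana'
$result
```

If the shell refuses to run .ps1 files at all, the execution policy is probably still at its restrictive default. Running Set-ExecutionPolicy -Scope CurrentUser RemoteSigned once is a common way to allow local scripts.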


Top Expert 2014
Commented:
I tested the code in the original question and it worked fine, replacing all the characters.  However, for performance reasons I would suggest making greater use of regex capabilities: search for any of the matching characters in one pass rather than iterating over the characters and searching for one at a time.
The simplest example is:
Get-Content -Path $original_file -ReadCount 0 | ForEach-Object {
    $_ -replace '¿|Ù|À|Ú|³','|' -replace 'Ä','-'
} | Set-Content -Path $destination_file
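
A variation on the same idea, shown as a sketch rather than a tested drop-in: because every search term is a single character, the alternation can also be written as a regex character class. The sample lines are invented for illustration.

```powershell
# Hypothetical sample lines that use the characters from the question.
$sample = 'Ú ÄÄÄ ¿', '³ x ³', 'À ÄÄÄ Ù'

# [¿ÙÀÚ³] matches any one of the listed characters, so two -replace
# operations cover all six entries of the lookup table.
$fixed = $sample -replace '[¿ÙÀÚ³]', '|' -replace 'Ä', '-'
$fixed
```

When -replace is applied to an array it processes each element, which is why it also works on the whole file content returned by -ReadCount 0.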



It is possible to take your lookup (hash) table and generate the regex patterns from that, but it's a bit of a mess.
$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}
$original_file = 'C:\FilePath\NOV_1995.txt'
$destination_file =  'C:\FilePath\NOV_1995_NEW.txt'
$patterns = $lookupTable.GetEnumerator() | Select -ExpandProperty Value -Unique | ForEach `
 {
    $value = $_
    [pscustomobject] @{ 
                pattern = ($lookupTable.GetEnumerator() | Where { $_.value -eq $value } | Select -ExpandProperty Name) -join "|"
                replacement = $value
                }
 }
Get-Content -Path $original_file -ReadCount 0 | ForEach-Object {
    $line = $_
    foreach ($pattern in $patterns)
    {
        Write-Host "Matching pattern ""$($pattern.pattern)"" and replacing with ""$($pattern.replacement)""" -ForegroundColor Yellow
        $line = $line -replace $pattern.pattern,$pattern.replacement
    }
    $line
} | Set-Content -Path $destination_file
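
For comparison, here is a sketch of another way to derive the patterns from the lookup table, using Group-Object to gather the keys that share a replacement. The property names Pattern and Replacement and the sample lines are assumptions made for this example. [regex]::Escape is added so that a key which happens to be a regex metacharacter would still be matched literally.

```powershell
$lookupTable = @{ '¿' = '|'; 'Ù' = '|'; 'À' = '|'; 'Ú' = '|'; '³' = '|'; 'Ä' = '-' }

# One group per distinct replacement value; each group lists the entries to search for.
$patterns = $lookupTable.GetEnumerator() | Group-Object -Property Value | ForEach-Object {
    $group = $_
    $escapedKeys = $group.Group | ForEach-Object { [regex]::Escape($_.Key) }
    [pscustomobject] @{
        Pattern     = $escapedKeys -join '|'
        Replacement = $group.Name
    }
}

# Hypothetical sample lines to show the patterns being applied.
$fixed = 'Ú ÄÄÄ ¿', '³ x ³', 'À ÄÄÄ Ù'
foreach ($pattern in $patterns) {
    $fixed = $fixed -replace $pattern.Pattern, $pattern.Replacement
}
$fixed
```

Group-Object compares values case-insensitively by default, which makes no difference for '|' and '-' but would matter if two replacements differed only in case.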


Flex Tron, Developer

Author

Commented:
Both solutions, by footech and David, work. Thank you.
Flex Tron, Developer

Author

Commented:
Thanks very much for your input, footech.
Is there a way to loop this code over all the text files in a folder?
Top Expert 2014

Commented:
Of course.  David shows an example of that in the loop at the end of his last code block (from $sourcedir down to the Update-Characters call), which takes care of dynamically naming the output files.  I've reworked it a bit and adjusted my code accordingly.
$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}

$patterns = $lookupTable.GetEnumerator() | Select -ExpandProperty Value -Unique | ForEach `
 {
    $value = $_
    [pscustomobject] @{ 
                pattern = ($lookupTable.GetEnumerator() | Where { $_.value -eq $value } | Select -ExpandProperty Name) -join "|"
                replacement = $value
                }
 }

$sourcedir = 'C:\filepath'
Get-ChildItem $sourcedir -File | ForEach `
{
    $filepath = Split-Path $_.FullName -Parent
    $destinationfile = "$($_.BaseName)_NEW$($_.Extension)"
    $destination = Join-Path $filepath $destinationfile
  
    Write-Host "Source: $($_.Name)  Destination: $destinationfile"

    Get-Content -Path $_.FullName -ReadCount 0 | ForEach-Object {
        $line = $_
        foreach ($pattern in $patterns)
        {
            Write-Host "Matching pattern ""$($pattern.pattern)"" and replacing with ""$($pattern.replacement)""" -ForegroundColor Yellow
            $line = $line -replace $pattern.pattern,$pattern.replacement
        }
        $line
    } | Set-Content -Path $destination
}
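
One thing to watch when the output files are written into the source folder: a second run would also pick up the *_NEW files from the first run. A minimal sketch of one way to filter them out is below, assuming the _NEW suffix used above. The temp folder and the file names are hypothetical and are created only for this demonstration.

```powershell
# Hypothetical demo folder holding an input file, an earlier output file and a non-text file.
$demoDir = Join-Path ([System.IO.Path]::GetTempPath()) ('replace_demo_' + [guid]::NewGuid().ToString('N'))
New-Item -ItemType Directory -Path $demoDir | Out-Null
Set-Content -Path (Join-Path $demoDir 'NOV_1995.txt')     -Value 'sample'
Set-Content -Path (Join-Path $demoDir 'NOV_1995_NEW.txt') -Value 'sample'
Set-Content -Path (Join-Path $demoDir 'notes.log')        -Value 'sample'

# Keep only .txt files whose base name does not already end in _NEW.
$inputFiles = @(Get-ChildItem -Path $demoDir -File -Filter '*.txt' |
    Where-Object { $_.BaseName -notlike '*_NEW' })

$inputFiles | ForEach-Object { $_.Name }
```

The same Where-Object filter could sit between Get-ChildItem and ForEach in the loop above.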
