Flex Tron

Replacing a bunch of string characters in a text file using PowerShell

I am working on a PowerShell script that replaces a set of special characters ('¿', 'Ù', 'À', 'Ú', '³', 'Ä') in the text files in a folder. Each text file is about 300 MB, and the characters occur many times in each one. I tried doing it manually, but with so many files that is almost impossible. I am using Windows 7 and PowerShell version 3. The code I have is below.
The issue is that running the code creates a new file (NOV_1995_NEW.txt), but none of the characters listed in the code are changed in it. Any help is very much appreciated.

$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}

$original_file = 'C:\FilePath\NOV_1995.txt'
$destination_file =  'C:\FilePath\NOV_1995_NEW.txt'

Get-Content -Path $original_file | ForEach-Object {
    $line = $_

    $lookupTable.GetEnumerator() | ForEach-Object {
        if ($line -match $_.Key)
        {
            $line = $line -replace $_.Key, $_.Value
        }
    }
   $line
} | Set-Content -Path $destination_file
Powershell
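
One way to narrow down why nothing matches is to check which byte values the special characters actually have in the file. The following is a minimal sketch, not part of the original script: `Get-NonAsciiByte` is a hypothetical helper name, and the 1 MB sample size is an assumption chosen so that a 300 MB file does not have to be read in full. If it reports 0xB3, 0xBF, 0xC0, 0xC4, 0xD9 and 0xDA, the file probably contains box-drawing characters from OEM code page 437 that are being displayed as Windows-1252 letters.

```powershell
# Hypothetical helper: list the distinct non-ASCII byte values found in
# the first $MaxBytes bytes of a file. Pass a full path, because .NET
# resolves relative paths against the process working directory.
function Get-NonAsciiByte {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [int]$MaxBytes = 1MB   # assumed sample size
    )

    $stream = [System.IO.File]::OpenRead($Path)
    try {
        $buffer = New-Object byte[] $MaxBytes
        # Read returns how many bytes were actually placed in the buffer
        $read = $stream.Read($buffer, 0, $MaxBytes)
    }
    finally {
        $stream.Dispose()
    }

    # Mark every byte value that occurs in the sample
    $seen = New-Object bool[] 256
    for ($i = 0; $i -lt $read; $i++) {
        $seen[$buffer[$i]] = $true
    }

    # Report only the values above the ASCII range, formatted as hex
    128..255 | Where-Object { $seen[$_] } | ForEach-Object { '0x{0:X2}' -f $_ }
}

# Example call (path taken from the question):
# Get-NonAsciiByte -Path 'C:\FilePath\NOV_1995.txt'
```

Knowing the byte values makes it easier to tell whether the problem is the replacement logic or the encoding used to read the file.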


8/22/2022 - Mon
David Johnson, CD

function Replace-Characters
{
    <#
        .SYNOPSIS
        Replaces special characters in a text file.
        .DESCRIPTION
        Reads $original_file, replaces each key of $lookupTable with its
        value on every line, and writes the result to $destination_file.
        .EXAMPLE
        Replace-Characters -original_file 'C:\FilePath\NOV_1995.txt' -destination_file 'C:\FilePath\NOV_1995_NEW.txt'
    #>
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory=$false, Position=0)]
        [Object]
        $lookupTable = @{ '¿' = '|';'Ù' = '|';'À' = '|';'Ú' = '|';'³' = '|';'Ä' = '-'},
        
        [Parameter(Mandatory=$false, Position=1)]
        [System.String]
        $original_file = 'C:\FilePath\NOV_1995.txt',
        
        [Parameter(Mandatory=$false, Position=2)]
        [System.String]
        $destination_file = 'C:\FilePath\NOV_1995_NEW.txt',
        
        [Parameter(Mandatory=$false, Position=3)]
        [Object]
        $contents = (Get-Content -Path $original_file),
        
        [Parameter(Mandatory=$false, Position=4)]
        [Object]
        $output=@()
    )
    
    
    
    foreach ($line in $contents){
        $lookupTable.GetEnumerator() | ForEach-Object {
            if ($line -match $_.Key) {
                $line = $line -replace $_.Key, $_.Value
            }
        }
        $output += $line
    } 
    $output | out-file -FilePath $destination_file
}



$filelist = Get-ChildItem c:\inputdir
foreach ($file in $filelist) {
    $original_file = $file.FullName
    $destination_file = $original_file.Substring(0, $original_file.Length - 4) + '_NEW.txt'
    # Pass the paths by name: position 0 of Replace-Characters is $lookupTable
    Replace-Characters -original_file $original_file -destination_file $destination_file
}
Flex Tron

ASKER
Thanks David,
I took your code in grey and changed the file paths for the source and destination. But when I execute it, nothing happens in the PowerShell terminal. I am attaching the sample text file.
NOV_1995.txt
David Johnson, CD

function Replace-Characters 
{
     
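    <#
        .SYNOPSIS
        Replaces special characters in a text file.
        .DESCRIPTION
        Reads $original_file, replaces each key of $lookupTable with its
        value on every line, and writes the result to $destination_file.
        Progress is reported per line with Write-Progress.
        .EXAMPLE
        Replace-Characters -original_file $original_file -destination_file $destination_file
    #>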
    [CmdletBinding()]
    param     (
        $lookupTable = @{ '¿' = '|';'Ù' = '|';'À' = '|';'Ú' = '|';'³' = '|';'Ä' = '-'},
     [string]  $original_file,
     [string]  $destination_file
        )
    $output=@()
    $contents = Get-Content -Path $original_file
    $linecounter = 0
    $max = $contents.Count
    
    foreach ($line in $contents){
    $linecounter++
    write-progress -Activity 'Checking lines' -status "line # $linecounter of $max" -PercentComplete ($linecounter/$max * 100)
          $lookupTable.GetEnumerator() | ForEach-Object {
            if ($line -match $_.Key) {
                $line = $line -replace $_.Key, $_.Value
            }
        }
        $output += $line
    } 
    $output | out-file -FilePath $destination_file
}
replace-characters -original_file $original_file -destination_file $destination_file


$filelist = get-childitem c:\filepath\
foreach ($file in $filelist){
$original_file = $file.fullname
$destination_file = $original_file.Substring(0,$original_file.length-4) + '_NEW.txt'
$destination_file
G:\Documents\WindowsPowershell\Scripts\replace-characters.ps1 -original_file $original_file -destination_file $destination_file
} 
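
With files of around 300 MB, the way the function above collects its results may matter: `$output += $line` creates a new array and copies every existing element on each iteration. The following is a minimal sketch of two alternatives, using made-up sample lines rather than a real file. The variable names and the sample data are assumptions for illustration only.

```powershell
# Hypothetical sample data standing in for the lines of a text file.
# [char]0xB3 is the superscript-three character from the lookup table.
$contents = 1..5000 | ForEach-Object { "line $_ " + [char]0xB3 }
$pattern = [string][char]0xB3

# Approach used in the function above: += rebuilds the array every time
$slow = @()
foreach ($line in $contents) {
    $slow += ($line -creplace $pattern, '|')
}

# Alternative 1: a generic list grows in place instead of being copied
$fast = New-Object 'System.Collections.Generic.List[string]'
foreach ($line in $contents) {
    $fast.Add(($line -creplace $pattern, '|'))
}

# Alternative 2: let PowerShell collect the loop output directly
$collected = foreach ($line in $contents) {
    $line -creplace $pattern, '|'
}
```

All three variables end up holding the same lines. The difference is only in how much copying happens while they are built, which grows with the number of lines.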

David Johnson, CD

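I changed the output line to
 $output | out-file -FilePath $destination_file -Encoding ascii

so that the file is written as ASCII instead of in Out-File's default encoding (UTF-16 in Windows PowerShell).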
NOV_1995_NEW.txt
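
The effect of the `-Encoding` parameter can be checked by writing the same text twice and comparing the bytes on disk. This is a minimal sketch using temporary files; the sample text and variable names are assumptions. It requests `unicode` (UTF-16 LE) explicitly, since that is what Out-File uses by default in Windows PowerShell.

```powershell
# Sample text containing only the replacement characters and letters
$text = 'A|B-C'

$unicodeFile = [System.IO.Path]::GetTempFileName()
$asciiFile = [System.IO.Path]::GetTempFileName()

# Write the same text with two different encodings
$text | Out-File -FilePath $unicodeFile -Encoding unicode
$text | Out-File -FilePath $asciiFile -Encoding ascii

$unicodeBytes = [System.IO.File]::ReadAllBytes($unicodeFile)
$asciiBytes = [System.IO.File]::ReadAllBytes($asciiFile)

# The UTF-16 file starts with a byte order mark (FF FE) and uses two
# bytes per character, so it is roughly twice the size of the ASCII file
"UTF-16 file: $($unicodeBytes.Length) bytes; ASCII file: $($asciiBytes.Length) bytes"

Remove-Item $unicodeFile, $asciiFile
```

Note that `-Encoding ascii` writes a question mark for any character outside the ASCII range, so it is only safe once every special character has been replaced.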
Flex Tron

ASKER
Thanks. But will this work in the PowerShell window?
Suppose I save your first code as string_replacer1.ps1 and the second code as string_replacer2.ps1.
Will it work like this?

ps> .\string_replacer1.ps1

I am not sure how to execute your code.

Thanks
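
On the general question of how a saved .ps1 file is run: a script that only defines a function does nothing visible when executed, because the function disappears again when the script's scope ends. Below is a minimal sketch of the difference between running and dot-sourcing a script. The file name `string_replacer_demo.ps1` and the function `Get-Greeting` are hypothetical and exist only for this illustration.

```powershell
# Create a small demo script that only defines a function
$scriptPath = Join-Path ([System.IO.Path]::GetTempPath()) 'string_replacer_demo.ps1'
@'
function Get-Greeting {
    param([string]$Name)
    "Hello, $Name"
}
'@ | Set-Content -Path $scriptPath

# On Windows, scripts may be blocked by the execution policy. It can be
# relaxed for the current session only with:
#   Set-ExecutionPolicy -Scope Process -ExecutionPolicy RemoteSigned

# Running the script with & executes it in a child scope: the function
# is defined there and is gone again once the script finishes
& $scriptPath

# Dot-sourcing runs the script in the current scope, so the function
# stays available afterwards and can be called like any other command
. $scriptPath
$greeting = Get-Greeting -Name 'Flex'
$greeting

Remove-Item $scriptPath
```

In an interactive window the equivalent commands would be `.\string_replacer_demo.ps1` to run the script and `. .\string_replacer_demo.ps1` to dot-source it.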
SOLUTION
David Johnson, CD

ASKER CERTIFIED SOLUTION
footech

Flex Tron

ASKER
Both solutions by footech and David work. Thank you.
Flex Tron

ASKER
Thanks very much for your input, footech.
Is there a way to loop this code over all the text files in a directory?
footech

Of course. David shows an example of that in his code on lines 37-48, which takes care of dynamically naming the output files. I've reworked it a bit and adjusted my code accordingly.
$lookupTable = @{
'¿' = '|'
'Ù' = '|'
'À' = '|'
'Ú' = '|'
'³' = '|'
'Ä' = '-'
}

$patterns = $lookupTable.GetEnumerator() | Select -ExpandProperty Value -Unique | ForEach `
 {
    $value = $_
    [pscustomobject] @{ 
                pattern = ($lookupTable.GetEnumerator() | Where { $_.value -eq $value } | Select -ExpandProperty Name) -join "|"
                replacement = $value
                }
 }

$sourcedir = 'C:\filepath'
Get-ChildItem $sourcedir -File | ForEach `
{
    $filepath = Split-Path $_.FullName -Parent
    $destinationfile = "$($_.BaseName)_NEW$($_.Extension)"
    $destination = Join-Path $filepath $destinationfile
  
    Write-Host "Source: $($_.Name)  Destination: $destinationfile"

    Get-Content -Path $_.FullName -ReadCount 0 | ForEach-Object {
        $line = $_
        foreach ($pattern in $patterns)
        {
            Write-Host "Matching pattern ""$($pattern.pattern)"" and replacing with ""$($pattern.replacement)""" -ForegroundColor Yellow
            $line = $line -replace $pattern.pattern,$pattern.replacement
        }
        $line
    } | Set-Content -Path $destination
}
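
For files of several hundred megabytes, a streaming variant that never holds the whole file in memory may be worth trying. The following is a minimal sketch of that alternative, not the code posted above. `Convert-BoxCharacter` is a hypothetical function name, and reading the file as ISO-8859-1 is an assumption made so that every byte maps to the character with the same code point. It also uses `-creplace`, because `-replace` is case-insensitive and would therefore also change lowercase letters such as 'à' or 'ä'.

```powershell
# Hypothetical streaming replacement. Pass full paths, because .NET
# resolves relative paths against the process working directory.
function Convert-BoxCharacter {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$Destination
    )

    # Assumption: ISO-8859-1 maps each byte to the same code point
    $encoding = [System.Text.Encoding]::GetEncoding('iso-8859-1')

    # Characters from the lookup table, built from their code points so
    # the script does not depend on how the .ps1 file itself is encoded
    $chars = [char[]](0xBF, 0xD9, 0xC0, 0xDA, 0xB3)
    $verticals = '[' + (-join $chars) + ']'   # replaced with '|'
    $horizontal = [string][char]0xC4          # replaced with '-'

    $reader = $null
    $writer = $null
    try {
        $reader = New-Object System.IO.StreamReader($Path, $encoding)
        $writer = New-Object System.IO.StreamWriter($Destination, $false, $encoding)
        # Process one line at a time; ReadLine returns $null at end of file
        while ($null -ne ($line = $reader.ReadLine())) {
            $writer.WriteLine(($line -creplace $verticals, '|' -creplace $horizontal, '-'))
        }
    }
    finally {
        if ($reader) { $reader.Dispose() }
        if ($writer) { $writer.Dispose() }
    }
}

# Example call (paths taken from the question):
# Convert-BoxCharacter -Path 'C:\FilePath\NOV_1995.txt' -Destination 'C:\FilePath\NOV_1995_NEW.txt'
```

The output uses the platform's line ending for every line, including the last one, so the result can differ from the input in its line terminators even where no character was replaced.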
