Use PowerShell to modify text in cp866 codepage

I have a lot of batch scripts (inherited from a former admin). All of those scripts are in the cp866 (Cyrillic DOS) codepage.
The scripts work fine and do what they are expected to do.

We recently changed some infrastructure objects, so I need to modify all those scripts to replace the infrastructure object names.

For example, the source file (cp866) contains text that looks like this from PowerShell:
PS C:\tests> get-content .\source.txt
'к?им ?йс нвЁе ┐п?ЄЁе да -жг§бЄЁе Ўг<RЄ, ¤  ўлЇ?c з о.

But actually, the content is:
Съешь ещё этих мягких французских булок, да выпей чаю.

In this .\source.txt file I need to replace "булок" with "плюшек" and get this result:
Съешь ещё этих мягких французских плюшек, да выпей чаю.

P.S. I understand that my example doesn't look like a batch script, but the main idea is to replace content within a Cyrillic DOS-encoded file.
Petr Poleshko (Systems administrator) asked:
Qlemo ("Batchelor", Developer and EE Topic Advisor) commented:
Does it display properly if you change the encoding when reading the file? Use
   get-content source.txt -Encoding X
with X one of: Unicode, UTF8, UTF7, UTF32, Ascii, Default, Oem.
Petr Poleshko (Systems administrator, author) commented:
PS C:\tests> get-content source.txt -Encoding Unicode
???????????????????????????
PS C:\tests> get-content source.txt -Encoding utf8
???? ??? ??? ???? ?????? ???, ?? ??? ??.
PS C:\tests> get-content source.txt -Encoding utf7
?e?ei ?en ia?a ┐i???a aa -?a§a??a ?a<R?, ¤  ?e??c c i.
PS C:\tests> get-content source.txt -Encoding ascii
????? ??? ???? ?????? ??????????? ?????, ?? ????? ???.
PS C:\tests> get-content source.txt -Encoding String
'к?им ?йс нвЁе ┐п?ЄЁе да -жг§бЄЁе Ўг<RЄ, ¤  ўлЇ?c з о.
PS C:\tests> get-content source.txt -Encoding byte
145
234
... ...
238
46

The oem and default encodings are not supported and cause an error:
+ get-content source.txt -Encoding <<<<  oem (default)
    + CategoryInfo          : InvalidArgument: (:) [Get-Content], ParameterBindingException
    + FullyQualifiedErrorId : CannotConvertArgumentNoMessage,Microsoft.PowerShell.Commands.GetContentCommand
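
The byte listing above can be used to confirm which code page the file is in. A minimal sketch, assuming code page 866 and code page 1251 are both available to .NET on the machine, and using only the four byte values visible in the listing (the elided middle is left out):

```powershell
# Only the four byte values shown in the listing; the elided bytes are omitted.
[byte[]]$bytes = 145, 234, 238, 46

# 866 is "Cyrillic (DOS)"; 1251 is the Windows ANSI Cyrillic code page.
$cp866  = [System.Text.Encoding]::GetEncoding(866)
$cp1251 = [System.Text.Encoding]::GetEncoding(1251)

# Same bytes, two interpretations.
$asCp866  = $cp866.GetString($bytes)    # Съю.  (start and end of the real sentence)
$asCp1251 = $cp1251.GetString($bytes)   # mojibake, like the start and end of the garbled line

$asCp866
$asCp1251
```

Decoded as cp866, the first two bytes give "Съ" and the last two give "ю.", which matches the real sentence. Decoded as windows-1251, the same bytes give the kind of garbage Get-Content shows.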
Petr Poleshko (Systems administrator, author) commented:
Here is what I found on the Internet:

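# Re-decodes text that was read with the wrong code page.
# Assumes each input string holds $From-encoded bytes that were mis-decoded as $To.
function ConvertTo-Encoding ([string]$From, [string]$To){
    Begin{
        $encFrom = [System.Text.Encoding]::GetEncoding($From)
        $encTo = [System.Text.Encoding]::GetEncoding($To)
    }
    Process{
        # Recover the original file bytes from the mis-decoded string
        $bytes = $encTo.GetBytes($_)
        # Transcode those bytes from the real encoding to the target one
        $bytes = [System.Text.Encoding]::Convert($encFrom, $encTo, $bytes)
        $encTo.GetString($bytes)
    }
}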

If I run this on my example:
PS C:\tests> Get-Content .\source.txt | ConvertTo-Encoding cp866 windows-1251
Съешь ещё этих мягких французских булок, да выпей чаю.

So now I can actually run the replacement on the output with the parameters I need:
PS C:\tests> Get-Content .\source.txt | ConvertTo-Encoding cp866 windows-1251 | foreach-object {$_ -replace 'булок', 'плюшек'}
Съешь ещё этих мягких французских плюшек, да выпей чаю.

The replacement works, so the last thing I need is to write the output to a text file in cp866 encoding.
The problem is that if I convert cp866 -> windows-1251 -> cp866, I get:
'ъ?шь ?щё этих мя?ких фра-цу§ских п<юш?к, да вып?c чаю.

As you can see, the original cp866 text and the text converted back to cp866 are different.
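
A likely explanation for the failed round trip is that ConvertTo-Encoding only works on strings that were first mis-decoded as the target code page. Given a correctly decoded string, it scrambles it. The sketch below avoids the trick by passing an explicit encoding object to the .NET file APIs, so the text is decoded once on read and encoded once on write. The function name `Update-Cp866File` and its parameters are hypothetical, and the demo uses a temporary file instead of the real scripts:

```powershell
# Hypothetical helper: replace literal text in a cp866 file and keep the file in cp866.
function Update-Cp866File {
    param(
        [string]$Path,        # use a full path: .NET does not follow PowerShell's current location
        [string]$Find,
        [string]$ReplaceWith
    )
    $cp866 = [System.Text.Encoding]::GetEncoding(866)
    $text  = [System.IO.File]::ReadAllText($Path, $cp866)    # decode the bytes as cp866
    $text  = $text.Replace($Find, $ReplaceWith)              # literal, case-sensitive replace
    [System.IO.File]::WriteAllText($Path, $text, $cp866)     # encode back to cp866
}

# Demo on a temporary file that stands in for .\source.txt
$cp866 = [System.Text.Encoding]::GetEncoding(866)
$demo  = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($demo, 'Съешь ещё этих мягких французских булок, да выпей чаю.', $cp866)

Update-Cp866File -Path $demo -Find 'булок' -ReplaceWith 'плюшек'

$result = [System.IO.File]::ReadAllText($demo, $cp866)
$result
Remove-Item $demo
```

Unlike the `-replace` operator, `String.Replace` does not treat the search text as a regular expression, which is usually what is wanted for object names containing dots or backslashes.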
Petr Poleshko (Systems administrator, author) commented:
I was stubborn this morning, so I tried again and again.

PS C:\tests> Get-Content .\source.txt | ConvertTo-Encoding cp866 windows-1251 | foreach-object {$_ -replace 'булок', 'плюшек'} | out-file -encoding oem .\result.txt
PS C:\tests> Get-Content .\result.txt | ConvertTo-Encoding cp866 windows-1251
Съешь ещё этих мягких французских плюшек, да выпей чаю.
PS C:\tests> Get-Content .\source.txt | ConvertTo-Encoding cp866 windows-1251
Съешь ещё этих мягких французских булок, да выпей чаю.

Both the cp866 source text and the converted and replaced text, written back with the oem encoding (I don't actually know exactly what that oem encoding means), display the right text :)
Petr Poleshko (Systems administrator, author) commented:
Extract from get-help out-file -full:

-Encoding <string>
    Specifies the type of character encoding used in the file. Valid values are "Unicode", "UTF7", "UTF8", "UTF32",
     "ASCII", "BigEndianUnicode", "Default", and "OEM". "Unicode" is the default.

    "Default" uses the encoding of the system's current ANSI code page.

    "OEM" uses the current original equipment manufacturer code page identifier for the operating system.

    Required?                    false
    Position?                    2
    Default value
    Accept pipeline input?       false
    Accept wildcard characters?  false

Can anyone explain what the "current original equipment manufacturer code page identifier for the operating system" actually is?

Maybe it works because of this:
PS P:\tests> chcp
Active code page: 866

and if I run this, the list includes:
PS P:\tests> [System.Text.Encoding]::GetEncodings()

CodePage Name                                    DisplayName
-------- ----                                    -----------
866 cp866                                   Cyrillic (DOS)
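
One way to see which code page "OEM" refers to on a given machine is to ask .NET for the OEM and ANSI code pages of the current culture. This is a sketch and rests on an assumption: the session culture matches the system locale. That is typical but not guaranteed, and Out-File may consult the system-wide setting instead. On a Russian-locale system the values would presumably be 866 and 1251.

```powershell
# TextInfo exposes the code pages associated with the current culture.
$textInfo     = [System.Globalization.CultureInfo]::CurrentCulture.TextInfo
$oemCodePage  = $textInfo.OEMCodePage    # the "OEM" (DOS/console) code page
$ansiCodePage = $textInfo.ANSICodePage   # the "Default" (ANSI) code page

"OEM code page:  $oemCodePage"
"ANSI code page: $ansiCodePage"

# Human-readable name of code page 866, for comparison with the GetEncodings() listing.
[System.Text.Encoding]::GetEncoding(866).EncodingName
```

If the OEM value printed here is 866, that would explain why `out-file -encoding oem` produced a valid cp866 file in the earlier test.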

Petr Poleshko (Systems administrator, author) commented:
I am not very familiar with encoding and decoding or with PowerShell itself. While working through this with logic, search results from the Internet and something "else", I suddenly got what I needed by, in effect, talking to myself in public :)