APD Toronto
asked on
Replacing Unicode Characters
Hi Experts,
I have a 100+ page book in MS Word that was written in a Cyrillic font over 10 years ago. The font is nothing fancy; it was simply designed so that the standard US keyboard outputs Cyrillic characters. Nowadays, as we all know, every OS has Cyrillic and the rest of the world's characters built in, and it is just a matter of enabling them.
My question is: how would I convert my book to Unicode Cyrillic?
I used to code in VBA, but haven't done so in years. Now I code in PHP, and if I were to code it in PHP, I would build a 2-D array, with the first level having 64 elements (the Macedonian alphabet has 32 letters), and it would look something like this:
[0]
    [letter] => A
    [Ascii] => 65
    [Uni] => 1040
[1]
    [letter] => B
    [Ascii] => 66
    [Uni] => 1041
Then I would loop through the array and use str_replace.
Would this be possible in VBA, and could it maybe even be driven from Access? Other files used other fonts with slightly different mappings.
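The array-and-replace idea described above translates to VBA fairly directly. The following is only a minimal sketch, not a tested solution: the function name MapLegacyToCyrillic is made up, the two mappings are the illustrative ones from the array above, and Scripting.Dictionary is assumed to be available (Windows only). It works on a plain string, so it says nothing yet about preserving formatting in a Word document.

```vba
' Sketch only: converts a plain string one character at a time.
' ASSUMPTION: the two mappings are placeholders taken from the example array;
' a real table would need one entry per key in the legacy font.
Function MapLegacyToCyrillic(ByVal legacyText As String) As String
    Dim lookup As Object
    Dim i As Long
    Dim ch As String
    Dim result As String

    ' Late-bound dictionary; keys are compared case-sensitively by default.
    Set lookup = CreateObject("Scripting.Dictionary")
    lookup.Add "A", ChrW(1040)   ' U+0410 CYRILLIC CAPITAL LETTER A
    lookup.Add "B", ChrW(1041)   ' U+0411 CYRILLIC CAPITAL LETTER BE

    For i = 1 To Len(legacyText)
        ch = Mid$(legacyText, i, 1)
        If lookup.Exists(ch) Then
            result = result & lookup(ch)   ' mapped character
        Else
            result = result & ch           ' leave unmapped characters alone
        End If
    Next i

    MapLegacyToCyrillic = result
End Function
```

Walking the string character by character, rather than calling Replace once per mapping, means a character that has already been converted can never be picked up again by a later mapping.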
By the way, I am using the following Unicode Table
https://www.rapidtables.com/code/text/unicode-characters.html
Thank you
Can you provide a sample of your Word document? I would like to run some tests before answering.
ASKER
gr8gonzo - exactly
Might be good to provide both your document + a copy of your font.
I suppose in that case, you could probably use this VBA code:
Original source:
https://wordmvp.com/FAQs/MacrosVBA/FindReplaceSymbols.htm
The ReplaceAllSymbols function is what performs the replacement in the document for a single character, and ReplaceAllDeltaSymbolsWithBetaSymbols is just an example of how to use it.
You would come up with your own version of ReplaceAllDeltaSymbolsWithBetaSymbols (e.g. "ConvertToTrueCyrillic" or something), and then inside, you would copy the "Call" statement for each character that needs to be re-mapped.
Sub ReplaceAllDeltaSymbolsWithBetaSymbols()
    'Call the main "ReplaceAllSymbols" macro (below),
    'and tell it which character code and font to search for, and which to replace with
    Call ReplaceAllSymbols(FindChar:=ChrW(-3996), FindFont:="Symbol", _
        ReplaceChar:=-3998, ReplaceFont:="Symbol")
End Sub

Sub ReplaceAllSymbols(FindChar As String, FindFont As String, _
        ReplaceChar As String, ReplaceFont As String)
    Dim FoundFont As String, OriginalRange As Range, strFound As Boolean
    Application.ScreenUpdating = False
    Set OriginalRange = Selection.Range
    'start at beginning of document
    ActiveDocument.Range(0, 0).Select
    strFound = False
    With Selection.Find
        .ClearFormatting
        .Text = FindChar
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        Do While .Execute
            'keep searching until nothing found
            If Dialogs(wdDialogInsertSymbol).Font = FindFont Then
                'Insert the replacement symbol where the found symbol was
                Selection.InsertSymbol Font:=ReplaceFont, _
                    CharacterNumber:=ReplaceChar, Unicode:=True
            Else
                Selection.Collapse wdCollapseEnd
            End If
        Loop
    End With
    OriginalRange.Select
    Set OriginalRange = Nothing
    Application.ScreenUpdating = True
End Sub
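As a rough illustration of the "write your own version" suggestion, a wrapper might look like the sketch below. Everything specific in it is an assumption: "LegacyCyrillic" stands in for the real font name, the two mappings are invented, and it only applies if the old font behaves like a symbol font, which is the case the ReplaceAllSymbols macro was written for. Word stores symbol-font characters at &HF000 plus the ASCII code. If the font is an ordinary text font, the characters are stored as plain ASCII and this search would not match.

```vba
' Sketch only: one Call per character to be re-mapped.
' ASSUMPTION: "LegacyCyrillic" is a placeholder font name and both
' mappings are hypothetical; replace them with the font's real layout.
Sub ConvertToTrueCyrillic()
    ' "a" is ASCII &H61, so in a symbol font it is searched for as &HF061.
    Call ReplaceAllSymbols(FindChar:=ChrW(&HF061), FindFont:="LegacyCyrillic", _
        ReplaceChar:=1103, ReplaceFont:="Arial")   ' 1103 = U+044F

    ' "b" is ASCII &H62, so it is searched for as &HF062.
    Call ReplaceAllSymbols(FindChar:=ChrW(&HF062), FindFont:="LegacyCyrillic", _
        ReplaceChar:=1073, ReplaceFont:="Arial")   ' 1073 = U+0431
End Sub
```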
ASKER
Here is a little excerpt and the font used.
Basically, I need to be able to change the font to Arial and still see the same Cyrillic characters, not `, ~, \ and so on.
I'm not exactly sure how to use the above code or how to set up the mappings.
Ideally, as mentioned, I would like to set up an Access table for the different font mappings.
EE won't let me attach, but here is a link: http://test.aces-project.com/EE-Cyrillic.zip
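If the old font turns out to be an ordinary text font (so the document really contains plain ASCII such as `, ~ and \), one possible approach is a font-restricted Find and Replace. The sketch below is untested against the attached sample. The font name "LegacyCyrillic" and the two mappings are placeholders, and the macro name is made up.

```vba
' Sketch only: replace each legacy key with its Unicode character,
' but only where the text is formatted in the legacy font.
Sub ConvertLegacyFontToUnicode()
    Const LEGACY_FONT As String = "LegacyCyrillic"   ' ASSUMPTION: placeholder name
    Const TARGET_FONT As String = "Arial"
    Dim pairs As Variant
    Dim i As Long

    ' ASSUMPTION: hypothetical mappings (legacy key, Unicode code point).
    pairs = Array("a", 1103, "b", 1073)

    For i = LBound(pairs) To UBound(pairs) Step 2
        With ActiveDocument.Content.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            ' "^" is special in Find text, so a literal caret is doubled.
            .Text = Replace(CStr(pairs(i)), "^", "^^")
            .Font.Name = LEGACY_FONT
            .Replacement.Text = ChrW(pairs(i + 1))
            .Replacement.Font.Name = TARGET_FONT
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = True          ' "a" and "A" map to different letters
            .MatchWildcards = False
            .Execute Replace:=wdReplaceAll
        End With
    Next i
End Sub
```

Because each replacement also switches the font to Arial, converted text no longer matches the legacy-font search, so later mappings cannot touch it again. ActiveDocument.Content covers only the main text; headers, footers and footnotes would need the same treatment separately. Characters with no mapping, such as spaces, stay in the old font.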
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Thank you. I also added the Access layer.
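The thread does not show how the Access layer was built, so the following is only a guess at one way to do it. It assumes a table named FontMap with fields FontName, LegacyChar and UnicodeCode (all hypothetical names), and it assumes the ACE OLE DB provider is installed so that Word VBA can read the .accdb file through late-bound ADO.

```vba
' Sketch only: load one font's mapping from an Access table into a dictionary.
' ASSUMPTION: table FontMap(FontName, LegacyChar, UnicodeCode) is hypothetical.
Function LoadFontMap(ByVal dbPath As String, ByVal fontName As String) As Object
    Dim conn As Object
    Dim rs As Object
    Dim lookup As Object

    Set lookup = CreateObject("Scripting.Dictionary")
    Set conn = CreateObject("ADODB.Connection")
    conn.Open "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & dbPath

    ' Single quotes in the font name are doubled so the SQL stays valid.
    Set rs = conn.Execute("SELECT LegacyChar, UnicodeCode FROM FontMap " & _
        "WHERE FontName = '" & Replace(fontName, "'", "''") & "'")

    Do Until rs.EOF
        ' Key: the character typed in the old font. Value: its Unicode code point.
        lookup(CStr(rs.Fields("LegacyChar").Value)) = CLng(rs.Fields("UnicodeCode").Value)
        rs.MoveNext
    Loop

    rs.Close
    conn.Close
    Set LoadFontMap = lookup
End Function
```

Keeping one row per font and character means a document written in a different legacy font only needs new rows in the table, not a new macro.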
So if you're reading the document with the font installed on your computer, you might see "яблоко" but if you didn't have the font installed, you would see "abxded", because the font displays the letter "a" as "я". Is that correct?