Robert Berke
asked on
excel Code function is weird for non-breaking hyphen
I expected =code($a$1) to return 30 because $a$1 has non-breaking hyphen (which was copied from word).
Instead, it contains a 63 which is supposed to be a question mark.
Other times I get different results which are more like what I expect.
What gives
temp.xls
Instead, it contains a 63 which is supposed to be a question mark.
Other times I get different results which are more like what I expect.
What gives
temp.xls
Excel reads that character as "?"
If you copy that and place that in VBA Code editor, it will change to a "?" However if you copy it an place it in notepad then it is read as "-"
Seems like Excel is not able to understand the character
Sid
If you copy that and place that in VBA Code editor, it will change to a "?" However if you copy it an place it in notepad then it is read as "-"
Seems like Excel is not able to understand the character
Sid
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Here come the points mostly to Rorya
"Excel worksheet functions don't understand Unicode (your character is actually character code 8209)"
Aha !!! Code(x) is not what I want. ascw(x) is what I want. I needed to create a UDF for it since Excel 2003 did not provide it.
function codew(x)
codew=ascw(x)
end function
I already knew the stuff in YellowShoe's link, but it compliments this topic, so he gets points also.
Finally, this link had a useful warning
http://www.dailydoseofexcel.com/archives/2010/11/19/unicode-and-vbas-chrw-and-ascw-functions/
"for ASCII(128) to ASCII(159) Chr(CharCode) and ChrW(CharCode) produce different results.
I haven't yet decided whether to incorporate that quirk into my UDF or not, but it is good to know.
"Excel worksheet functions don't understand Unicode (your character is actually character code 8209)"
Aha !!! Code(x) is not what I want. ascw(x) is what I want. I needed to create a UDF for it since Excel 2003 did not provide it.
function codew(x)
codew=ascw(x)
end function
I already knew the stuff in YellowShoe's link, but it compliments this topic, so he gets points also.
Finally, this link had a useful warning
http://www.dailydoseofexcel.com/archives/2010/11/19/unicode-and-vbas-chrw-and-ascw-functions/
"for ASCII(128) to ASCII(159) Chr(CharCode) and ChrW(CharCode) produce different results.
I haven't yet decided whether to incorporate that quirk into my UDF or not, but it is good to know.
ASKER
So, how do I convert a cell's content to Ascii?
The help function for strconv says
vbFromUnicode Converts the string from Unicode to the default code page of the system. (Not available on the Macintosh.)
I think the following seems to do the job, it turns my non-breaking hyphen into a simple hyphen(in other words, it turns 8209 into 45), but I don't know enough about default code pages to be sure.
Function StrConvFromUnicode(cell As range)
StrConvFromUnicode = StrConv(cstr(cell.value), vbFromUnicode)
End Function
If the answer is not easy, I will post this as an additional question.
The help function for strconv says
vbFromUnicode Converts the string from Unicode to the default code page of the system. (Not available on the Macintosh.)
I think the following seems to do the job, it turns my non-breaking hyphen into a simple hyphen(in other words, it turns 8209 into 45), but I don't know enough about default code pages to be sure.
Function StrConvFromUnicode(cell As range)
StrConvFromUnicode = StrConv(cstr(cell.value), vbFromUnicode)
End Function
If the answer is not easy, I will post this as an additional question.
That should work, as far as I can see.
ASKER
When I used that function on a worksheet function went BONKERS. Everything was in Japanese.
Also, MSGBOX strConvFromUnicode(text) was not what I expected. The following double conversion works better.
Function StrConvFromUnicodeThenBack (x As String) As String
' converts many unicode like characters 8209 to their 8 bit equivalent then back to unicode
' so it can be displayed
StrConvFromUnicodeThenBack = StrConv(StrConv(x, vbFromUnicode), vbUnicode)
End Function
Now that I understand what is going on, I can finally state my full goal.
I wanted to create a file name based upon the user input. The input might come from excel cell, or selected text in MS Word, or an MS Access textbox, or maybe even the clipboard.
So, I wanted to clean the text and replace special characters like : \ etc and turn smartquotes into dumbquotes etc.
I was using a regular expression to do this, but it didn't always work. I think this double conversion will solve the problem.
Anybody interested can view that post at https://www.experts-exchange.com/questions/26662970/turn-non-breaking-hyphens-smart-quotes-etc-into-equivalent-ascii-characters.html.
But wait a few minutes till I post the final solution there.
Also, MSGBOX strConvFromUnicode(text) was not what I expected. The following double conversion works better.
Function StrConvFromUnicodeThenBack
' converts many unicode like characters 8209 to their 8 bit equivalent then back to unicode
' so it can be displayed
StrConvFromUnicodeThenBack
End Function
Now that I understand what is going on, I can finally state my full goal.
I wanted to create a file name based upon the user input. The input might come from excel cell, or selected text in MS Word, or an MS Access textbox, or maybe even the clipboard.
So, I wanted to clean the text and replace special characters like : \ etc and turn smartquotes into dumbquotes etc.
I was using a regular expression to do this, but it didn't always work. I think this double conversion will solve the problem.
Anybody interested can view that post at https://www.experts-exchange.com/questions/26662970/turn-non-breaking-hyphens-smart-quotes-etc-into-equivalent-ascii-characters.html.
But wait a few minutes till I post the final solution there.
Try a copy to notepad to excel. It will eliminate any control characters and also avoid unicode issues.