Jeff Geiselman
asked on
Unable to remove an Invisible character in a worksheet cell using VBA.
I am unable to identify and remove an extended character from cells in a worksheet.
The data comes from a download (Oracle, I think).
The unwanted characters are invisible (I assume they are in the extended character set).
These characters are in the first and last character positions of each cell.
I was able to identify them with the =LEN() function, which reports a length 2 characters greater than the number of visible characters.
When I check the first character from VB with the 'ASC' function, it incorrectly indicates that it is ASCII character 63 (the question mark, which is a visible character).
But if I test that first character against a "?", it resolves to FALSE (further indicating that VB incorrectly identifies this character).
The main reason for trying to remove these characters is that the VLOOKUP() function fails to match these cells against another worksheet with these data strings that originated from another source and truly do not have these extended characters.
My goal is to be able to identify and remove these unwanted characters with a VB macro, so a VLOOKUP() can be properly performed.
(A workbook is attached with a sample of the problem cells)
Unidentified-Extended-characters.xlsx
You didn't succeed in attaching your workbook. You need to upload the workbook before clicking 'Submit'.
What you want to use is the Clean() function. See the following Microsoft article for more information: WorksheetFunction.Clean Method (Excel).
Range("A1").Value = Application.WorksheetFunction.Clean(Range("A1"))
Possible alternative solution:
Export the data as a CSV file. Then (assuming that the hidden characters are all the same strings), remove them using a text editor. Then read the CSV back in.
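The ASCW values for the characters in A9 are:
Position  ASCW  Character
1         8237  (invisible, U+202D LEFT-TO-RIGHT OVERRIDE)
2         78    N
3         32    (space)
4         53    5
5         54    6
6         50    2
7         50    2
8         53    5
9         55    7
10        8236  (invisible, U+202C POP DIRECTIONAL FORMATTING)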
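A listing like the one above can be produced with a short loop over the cell text. The sketch below is an illustration only, not necessarily the code the poster used; `DumpCharCodes` is a hypothetical name. It relies on `AscW`, which returns the UTF-16 code unit of a character. `Asc`, by contrast, first converts the character to the system ANSI code page and reports 63 ("?") when no mapping exists, which is consistent with the behaviour described in the question.

```vba
' Hypothetical helper: prints the position and AscW code of every
' character in the supplied text to the Immediate window.
Public Sub DumpCharCodes(ByVal cellText As String)
    Dim i As Long
    For i = 1 To Len(cellText)
        ' AscW returns a signed 16-bit Integer, so mask it to get 0..65535
        Debug.Print i, AscW(Mid$(cellText, i, 1)) And &HFFFF&
    Next i
End Sub
```

Calling `DumpCharCodes Range("A9").Value` from the Immediate window should list one line per character, assuming the cell holds text rather than an error value.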
If you need to preserve some Cr, Lf, or Tab characters, you can extend the range like this:
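Public Function ReallyClean(ByVal parmCellText) As String
    ' Late-bound VBScript RegExp, created once and cached across calls
    Static oRE As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True                 ' replace every match, not just the first
        oRE.Pattern = "[^\x00-\x7f]"      ' any character outside the 7-bit ASCII range
    End If
    ReallyClean = oRE.Replace(parmCellText, vbNullString)
End Function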
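To run ReallyClean over the lookup column in one pass, a small driver along these lines might help. This is a minimal sketch: `CleanRange` is a hypothetical name, and the range passed in the usage line is an assumption about where the data sits. It also assumes the ReallyClean function above lives in the same project.

```vba
' Hypothetical driver: cleans every text constant in the given range in place.
Public Sub CleanRange(ByVal target As Range)
    Dim c As Range
    For Each c In target.Cells
        ' Leave formulas alone so they are not overwritten with their results
        If Not c.HasFormula Then
            ' Only touch cells that actually hold text (skips numbers, errors, blanks)
            If VarType(c.Value) = vbString Then
                c.Value = ReallyClean(c.Value)
            End If
        End If
    Next c
End Sub
```

For example, `CleanRange ActiveSheet.Range("A1:A100")`. Two caveats: the pattern strips every non-ASCII character, so legitimate accented letters would be removed too, and a cleaned string that looks like a number may be converted to a number by Excel when it is written back.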
ASKER
Rgonzo1971, thanks for pointing me to the ASCW() function. I had overlooked the possibility of it being UNICODE.
aikimark, thanks for the alternate function.
PS: The Clean() function does not appear to remove this Unicode character.
Thanks to all for the quick responses!
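Clean() is documented as removing only the first 32 non-printing characters of 7-bit ASCII (codes 0 to 31), which would explain why it leaves these characters in place. If only the two code points reported for A9 need to go, a narrower alternative to the regular expression is a pair of Replace calls. This is a sketch under that assumption; `StripBidiMarks` is a hypothetical name, and other cells may contain different invisible characters.

```vba
' Hypothetical helper: removes only the two directional formatting
' characters found in the sample cell, leaving all other text intact.
Public Function StripBidiMarks(ByVal cellText As String) As String
    Dim result As String
    ' U+202D LEFT-TO-RIGHT OVERRIDE (AscW 8237)
    result = Replace(cellText, ChrW$(8237), vbNullString, , , vbBinaryCompare)
    ' U+202C POP DIRECTIONAL FORMATTING (AscW 8236)
    result = Replace(result, ChrW$(8236), vbNullString, , , vbBinaryCompare)
    StripBidiMarks = result
End Function
```

The explicit vbBinaryCompare keeps the match exact even in a module that declares Option Compare Text.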