Control Characters Visible on one particular system and no others

We have an accounting package installed on multiple Windows servers.
  • Windows 2003 Server
  • Windows 2008 Server
  • Windows 2012 R2 Server

On our Windows 2003 Server, control characters (e.g. CR-LF) that are inadvertently entered into a character field of this accounting package are visible as vertical bars.  In the attached snippet these can be seen in the description field.  This typically happens when a user copies an Excel spreadsheet cell and pastes into the field.  We like that these control characters are visible because it alerts the user to the issue and these control characters create problems.  If we could figure out why they show up on that one box we would try to reproduce this behavior on the newer servers.

On none of the other systems, running the same version of the software, are these characters visible (although they are present).  On demonstrating this to the senior support team for the accounting package, I have learned that they have never seen this behavior on any of their servers, going back to when 2003 Server was their primary platform.  This is not an application that permits you to configure fonts or character sets.  Whatever is different seems likely to be a system-wide setting.

I have reset the Windows display theme to Windows Classic, but that had no effect.
Capture.JPG
fakaulAsked:
Michael PfisterCommented:
Could be the system displaying the bars is using a different font for the edit control text.

This can be adjusted in Control Panel under Color and Appearance, Window Colors and Metrics. Click on the part you want to change (I assume it's "Window Text") and change the font.
Check which font the working system is using and see whether it's available on the newer systems.

HTH
Michael PfisterCommented:
https://mcmw.abilitynet.org.uk/windows-7-changing-fonts/

Probably better than my description :)
David Johnson, CD, MVPOwnerCommented:
This typically happens when a user copies an Excel spreadsheet cell and pastes into the field.  We like that these control characters are visible because it alerts the user to the issue and these control characters create problems

Same version of the client? Allowing control characters means that the programmer is not validating/sanitizing the input from the user and is allowing non-alphanumeric input. This is a bug that they should address.

This breaks a cardinal programming rule: never trust user input. It means that your database is corrupted. Imagine searching for 'something' and getting no results, while a search for 'something[cr][lf]' or 'something||' returns results.
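
To illustrate the kind of input check described above, here is a minimal Python sketch. It is not the accounting package's code; the function names are hypothetical, and it assumes the field is meant to hold a single line of printable text, so every character in Unicode category Cc (the C0 and C1 controls, including tab, CR and LF) is treated as unwanted.

```python
import unicodedata


def find_control_chars(text):
    """Return (index, code point) for every control character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cc"  # C0/C1 control characters
    ]


def strip_control_chars(text):
    """Return text with all control characters removed."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


# A value pasted from a spreadsheet cell often carries a trailing CR-LF.
pasted = "something\r\n"
print(find_control_chars(pasted))
print(repr(strip_control_chars(pasted)))
```

Running a check like this before saving would let "something" and "something[cr][lf]" be stored, and therefore searched, as the same value.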
fakaulAuthor Commented:
Adjusting the various Windows fonts in the Display settings control panel does not have any impact on this issue, at least none that I have tried.
fakaulAuthor Commented:
I agree the input validation is not what it should be.  I have been pushing the company that developed the software to address this.  Fortunately the field involved is not heavily used in queries but the corruption does result in some issues.
fakaulAuthor Commented:
Another anomaly is that on this system, masked passwords appear as vertical bars as opposed to round "bullets" typically shown.
DansDadUKCommented:
Another anomaly is that on this system, masked passwords appear as vertical bars as opposed to round "bullets" typically shown

Another indication that there are differences in the fonts and/or coded character sets used on the different systems.
fakaulAuthor Commented:
I agree.  The fonts, however, have been reset to their defaults.  I am still at a loss as to how to identify and adjust the character sets in play.
DansDadUKCommented:
How non-graphic characters (control-code characters) and 'missing' characters (those not present in a particular font or coded-character set) are displayed will depend on:
  • The display font in use.
  • The coded-character set in force (this may be selectable, or the chosen font may be 'bound' to a particular set).
  • The application in use - this may apply some filtering to 'convert' characters with code-point values which it considers to be outside the normal range.

The attached file contains 16 lines of text, each line containing:
  • A header (to indicate the (hexadecimal) range of the 16 characters in that line).
  • Two horizontal tab control-code characters (to space out the line).
  • 16 bytes, containing the code-point values indicated by the header for that line.
  • The two control-bytes CarriageReturn (0x0D) and LineFeed (0x0A) to terminate the line.
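
For anyone who cannot download the attachment, a file with the layout described above could be generated with a short Python sketch like the one below. The exact header text ("0x00-0x0F" and so on) and the output file name are assumptions; the attached file may format its headers differently.

```python
def build_test_bytes():
    """Build 16 lines, each holding 16 consecutive byte values (0x00-0xFF)."""
    lines = []
    for row in range(16):
        start = row * 16
        # Header naming the range covered by this line (format is assumed).
        header = f"0x{start:02X}-0x{start + 15:02X}".encode("ascii")
        payload = bytes(range(start, start + 16))
        # header + two tabs + 16 raw bytes + CR LF terminator
        lines.append(header + b"\t\t" + payload + b"\r\n")
    return b"".join(lines)


def write_test_file(path):
    """Write the test bytes in binary mode so nothing gets translated."""
    with open(path, "wb") as f:
        f.write(build_test_bytes())


# Example (hypothetical file name):
# write_test_file("8bit_test.txt")
```

The file has to be written in binary mode; text mode would re-encode the bytes and might translate the line endings.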

To illustrate the points above regarding font, etc., I opened the file in NotePad (on a Windows Pro 10 64-bit system).

With the font selected in NotePad as Courier New, with Script=Western (the default choice):

NotePad view - Courier New - Western
Note that:
  • Although the Courier New font is supposed to be a fixed-pitch font (all characters the same width), the first two rows (where non-graphic control-code characters are in the text) appear to be proportionally-spaced (suggesting, perhaps, that a different font has been invoked?).
  • Most of the control-code characters are displayed as a thin rectangle; this shape is typically used as the 'notdef' character in TrueType / OpenType fonts, which is used if the font doesn't contain a glyph corresponding to the code-point of the character; some fonts have a blank 'notdef' glyph.
  • At least one of these characters is displayed differently, suggesting that the font actually contains glyphs for some of those code-points (albeit blank in some cases).
  • In a different application (e.g. Word) some of the control-code bytes (e.g. FormFeed (0x0C)) would be treated differently.
  • Some of the bytes (specifically those in the ranges 0x80-0x8F and 0x90-0x9F) display characters which are not the same as those in the corresponding Unicode range (in this case U+0080 -> U+009F) - this is because Script = Western indicates that the coded-character set in use is the Windows ANSI set.
  • The characters in the 0x80 -> 0x9F ranges are also shown as proportionally-spaced.


If I change the font in use by NotePad to System Bold, with Script=Western (the only choice), this is what the file then displays as:
NotePad view - System Bold - Western
Note that:
The characters in the 0x80 -> 0x9F ranges are different to those shown with the Courier New font, suggesting that the fonts define different sets of characters in these ranges.

... and this is what the file looks like if viewed in NotePad using the Terminal font (with Script = OEM/DOS, the only choice):
NotePad view - Terminal- OEM/DOS
which shows that this font/character set is very different.


Perhaps you could try viewing this test file in NotePad (and/or other applications) on your different systems to see if it yields any clues?
8-bit_raw_with_Rowheaders.txt
fakaulAuthor Commented:
I have followed your suggestion DansDadUK and gained the following additional insights.
It appears the display font used by this accounting package is MS Sans Serif.  On all systems, this font looks identical to the input field font, and the corresponding control code (which I know to be 0D hex) is visible on our Windows 2003 servers (retired) as a vertical black bar.  This is how the program displays it on that OS.
2003 Server MS Sans Serif
On the newer OS this character is invisible between the Female and Note symbols in the text file and invisible in the program.
2008 Server MS Sans Serif
I have verified I am not looking at an OpenType version of MS Sans Serif.  For the regular font, Western Script is the only one available.  I expect that my issue is that the font itself changed between versions, not that there is a different font or character set involved.  I am going to experiment with attempting to substitute the old version of the font for the new one on a test system and see what effect that has.
Michael PfisterCommented:
Nice finding. You could try copying the MS Sans Serif font from the old system to the new one (make a backup of the existing font first).

Should be C:\Windows\fonts\sserife.fon
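
Before swapping files, it may be worth confirming that the two font files really differ. A minimal Python sketch, assuming you have copied sserife.fon from each server to a working folder (the folder and file names in the example are hypothetical):

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_file_contents(path_a, path_b):
    """True if both files are byte-for-byte identical (by hash)."""
    return file_sha256(path_a) == file_sha256(path_b)


# Example (hypothetical paths to the two copies):
# print(same_file_contents(r"C:\temp\2003\sserife.fon",
#                          r"C:\temp\2008\sserife.fon"))
```

If the hashes match, the difference in rendering is coming from somewhere other than this font file.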
DansDadUKCommented:
the corresponding control code (which I know to be 0D hex) is visible on our Windows 2003 servers (retired) as a vertical black bar.  This is how the program displays it on that OS

Looking more closely at your image, for the 'characters' in the range 0x00 - 0x0F:

extract from NotePad image
it appears that "the program" (i.e. NotePad) has shown narrow vertical rectangles for all of those bytes except:
0x00 - shown as space
0x09 - interpreted as horizontal tab

i.e. it has shown the 'bar' glyph  for the individual 0x0A (LineFeed) and 0x0D (CarriageReturn) characters.
But each line in the test file is terminated with a 0x0D0A (CarriageReturn LineFeed) pair, and these characters are not shown as bars - they are interpreted as the control codes they are.
So there is some sort of level of filtering occurring within the application.
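
To see which kind of line break a piece of stored data actually contains, something like the following Python sketch could be used. It only counts CR LF pairs versus lone CR and lone LF bytes; it is an illustration of the distinction, not a description of what NotePad or the accounting package does internally.

```python
import re

# Try the two-byte CR LF pair first, then fall back to a lone CR or LF.
LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def classify_line_breaks(data):
    """Count CR LF pairs, lone CR bytes and lone LF bytes in data."""
    counts = {"CRLF": 0, "CR": 0, "LF": 0}
    for match in LINE_BREAK.finditer(data):
        token = match.group()
        if token == b"\r\n":
            counts["CRLF"] += 1
        elif token == b"\r":
            counts["CR"] += 1
        else:
            counts["LF"] += 1
    return counts


print(classify_line_breaks(b"a\rb\nc\r\n"))
```

Applied to the first row of the test file, this would report one lone LF, one lone CR and one CR LF pair, matching the bars and the line end seen in the image.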


... and just a few general comments:
Most operating systems (including Windows) now use Unicode as the 'internal' encoding.
In the Unicode coded-character set, the range U+0000 -> U+001F is reserved for the C0 control-code characters.
Similarly, the range U+0080 -> U+009F is reserved for the C1 control-code characters.
The 8-bit ISO-8859-1 "Latin-1" character set is a strict subset of Unicode; the range 0x00 --> 0xFF exactly matches U+0000 -> U+00FF.
The "Windows ANSI" (codepage 1252) character set (which is probably what matches "Western script") is a superset of ISO-8859-1 (in that it replaces some of the C1 control-code characters with Unicode graphic characters defined in ranges above U+0100), but that means that it departs from the Unicode coded character set.
Most systems now use Unicode, or one of its encoded representations (typically UTF-8), so it is probably unwise to rely on encodings which are not compliant.
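
The difference between ISO-8859-1 and codepage 1252 in the 0x80 -> 0x9F range can be checked with a few lines of Python. This is a sketch using Python's built-in "latin-1" and "cp1252" codecs; the helper name is hypothetical. Python's cp1252 codec leaves a handful of bytes (such as 0x81) undefined, so the sketch substitutes U+FFFD for those rather than raising an error.

```python
def compare_decodings(byte_value):
    """Return the code points a byte maps to under Latin-1 and cp1252."""
    raw = bytes([byte_value])
    latin1 = raw.decode("latin-1")
    # errors="replace" yields U+FFFD for bytes cp1252 leaves undefined.
    cp1252 = raw.decode("cp1252", errors="replace")
    return f"U+{ord(latin1):04X}", f"U+{ord(cp1252):04X}"


for value in (0x80, 0x95, 0xE9):
    print(f"0x{value:02X}", compare_decodings(value))
```

Byte 0x80 stays at U+0080 (a C1 control) under Latin-1 but becomes U+20AC (the euro sign) under cp1252, while 0xE9 maps to U+00E9 under both.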
Seth SimmonsSr. Systems AdministratorCommented:
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I have recommended this question be closed as follows:

Accept: DansDadUK (https:#a42057847)

If you feel this question should be closed differently, post an objection and the moderators will review all objections and close it as they feel fit. If no one objects, this question will be closed automatically the way described above.

seth2740
Experts-Exchange Cleanup Volunteer