Solved

utf8everywhere.org

Posted on 2014-12-14
Last Modified: 2015-05-04
Hi experts,
Does anyone here follow the utf8everywhere.org manifesto on how to do Unicode and text properly on Windows? They say that you should not use wstring, should refrain from using L"", and should follow a number of strict rules to write cross-platform code. I'm wondering whether I want to go through the pain of rewriting some libraries to do this. Opinions, help and comments please! :-)

For sure, my ultimate goal is to write cross-platform, worldwide software, but not if it puts me in a depression and prevents me from getting anything done.

Thanks,
Mike
Question by:thready
11 Comments
 
LVL 19

Assisted Solution

by:mrwad99
mrwad99 earned 333 total points
I worked at a company that did heavyweight MFC software, and we used UNICODE and the _T macro everywhere.  And they were doing that for years before I joined.  It never caused any issues.

SetWindowText is actually SetWindowTextA or SetWindowTextW, depending on whether UNICODE is defined.  So I don't quite see the problem the author of that page has with it...
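
To make that mechanism concrete, here is a minimal, portable sketch of how this kind of macro selection works. SetTitleA, SetTitleW, SetTitle and T_ are hypothetical stand-ins invented for this illustration; they mimic the shape of SetWindowTextA/SetWindowTextW and _T, but they are not the actual Windows declarations.

```cpp
#include <string>

// Hypothetical stand-ins for a Win32 A/W function pair. They only report
// which variant was called, so the sketch compiles on any platform.
inline std::string SetTitleA(const char* text) {
    return std::string("narrow:") + text;
}
inline std::string SetTitleW(const wchar_t* text) {
    return "wide:" + std::to_string(std::wstring(text).size());
}

// One generic name is mapped to one variant at preprocessing time,
// depending on whether UNICODE is defined for the build.
#ifdef UNICODE
  #define SetTitle SetTitleW
  #define T_(s) L##s   // hypothetical analogue of _T: makes a wide literal
#else
  #define SetTitle SetTitleA
  #define T_(s) s      // leaves the literal narrow
#endif

// The same source line compiles to a different call and a different
// string type depending on the build flag.
inline std::string title_for_this_build() {
    return SetTitle(T_("hello"));
}
```

Built without UNICODE defined, title_for_this_build() goes through the narrow variant; built with -DUNICODE it goes through the wide one. The character type of every such string therefore depends on a compiler flag rather than on the code itself.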

I can't comment on what will be standard in years to come, only on what has been standard recently. Overall, if I were still writing MFC apps, I would still be using _T et al.

Others' opinions may vary though; it would be interesting to hear those.
 
LVL 1

Author Comment

by:thready
It doesn't sound like the code was cross-platform though, right?
 
LVL 19

Assisted Solution

by:mrwad99
mrwad99 earned 333 total points
The apps weren't cross-platform.

I've read the article linked from this site (http://www.joelonsoftware.com/articles/Unicode.html), which makes some good points. After this, I re-read utf8everywhere.org and it made more sense, with compelling arguments.

I wouldn't feel comfortable recommending you follow the advice on this site as I haven't done so myself. What I would recommend is researching more into software internationalisation with C++ to get a better feel for it. Additionally, with any luck other experts will be along here to give input on what is evidently a complex topic.
 
LVL 1

Author Comment

by:thready
Yep, I've read that one too.  It certainly is a complex topic.  It's sad that we still have these issues.  I really need to get to the bottom of this.
 
LVL 19

Expert Comment

by:mrwad99
If you want to do cross-platform work, that site isn't necessarily the best thing to be looking at, since its advice is targeted at Windows.

What kind of software are you writing, and what is the functionality of the libraries you say you would need to rewrite?
 
LVL 1

Author Comment

by:thready
The advice on that site doesn't seem to be targeted at Windows; they appear to be making a real effort to get people to agree on how everyone should be storing strings.

My own libraries are the ones I would need to rewrite. I'm most interested at the moment in writing cross-platform SQLite code using the Kompex library, but that's the tip of the iceberg. I need to build up my libraries again and have something that is developed cross-platform from the start. Basically, I never want to touch this code again.
 
LVL 19

Expert Comment

by:mrwad99
I would suggest requesting attention to this question (link at top) to get more input from others, as I have reached the end of my (limited) knowledge on this, I am afraid.
 
LVL 1

Author Comment

by:thready
I would like to leave this one open. I do believe it's probably one of the most important topics to advance in computer science. If we can't truly reuse what we do, we never advance as a whole, and we waste our time with every function we add in the future by duplicating work, bugs, etc. Everything we do should be cross-platform. At least, that's what is true in an ideal world.

To be cross-platform, one has to think about just about every aspect of the final product: the native user interface (which should not be cross-platform, for obvious reasons), some common cross-platform business logic, and the "how do they talk to each other" interface. That separation does not cover the encoding logic, so the encoding needs to be known for each operating system call or API that sits between the common base code and the layer wrapping the platform-specific native code. All common code uses UTF-8 and converts only where absolutely necessary. Each component should only ever encode its strings in one way, but there are probably libraries out there that can easily produce inconsistent encodings.

I think I may have answered my own question here. There's probably no other way to do this than to think clearly about all the necessary separations, and sometimes, unfortunately, to discover that the underlying libraries you've chosen are also going to be the source of your headaches with Unicode or cross-platform"ness".
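
The "UTF-8 in all common code, convert only at the boundary" idea can be sketched as a single conversion helper that the native layer calls just before it hands a string to a UTF-16 API. This is a minimal illustration under stated assumptions, not code from utf8everywhere.org or from any library: the name widen is hypothetical, and the decoder is hand-written so that it stays portable. On Windows the same job is normally done with MultiByteToWideChar and CP_UTF8.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical boundary helper: common code keeps std::string in UTF-8,
// and only the platform layer calls this before a UTF-16 API.
// Simplification: overlong forms and encoded surrogates are not rejected;
// a production decoder should reject them.
inline std::u16string widen(const std::string& utf8) {
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        // The lead byte determines the sequence length and the first payload bits.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > utf8.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        // Each continuation byte contributes six more bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += len;
        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of Unicode range");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));  // one UTF-16 code unit
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With a helper like this, the business logic never sees a wide string; only the thin native wrapper around each operating system call does.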
 
LVL 40

Accepted Solution

by:evilrix
evilrix earned 167 total points
Interestingly enough (and quite unrelated) I just wrote an article about my thoughts on Microsoft's flirtations with "Unicode".

http://www.experts-exchange.com/articles/18363/When-is-Unicode-not-Unicode-When-Microsoft-gets-involved.html

Basically, my view is: always use UTF-8 if your code stands any chance whatsoever of being ported from Windows. In fact, even if it doesn't, use UTF-8 internally and only convert to UTF-16 when you absolutely have to. Avoid UTF-16 like the plague because, well, that's exactly what it is! There's a good reason why all other sensible OSs use UTF-8: it's the only sensible and portable character encoding to use.

Microsoft jumped on the UTF-16 bandwagon without really thinking through the consequences. They have also brainwashed millions of Windows software engineers into thinking Unicode is something it is not.
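
One concrete fact that is often cited in this debate, offered here as an illustration rather than as a summary of the linked article: UTF-16 is a variable-width encoding too, so a wchar_t or char16_t element is not always a whole character. The sketch below counts the code units each encoding needs for a single code point; the function names are hypothetical.

```cpp
#include <cstdint>

// Number of 8-bit code units UTF-8 needs for one code point.
constexpr int utf8_units(std::uint32_t cp) {
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Number of 16-bit code units UTF-16 needs for one code point:
// anything above U+FFFF is stored as a surrogate pair.
constexpr int utf16_units(std::uint32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}
```

For example, U+0041 (the letter A) takes one unit in either encoding, while U+1F600 takes four UTF-8 units and two UTF-16 units, so code that assumes one UTF-16 unit per character is wrong for such input.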
 
LVL 1

Author Closing Comment

by:thready
Thanks, everyone.
