Question

Need width of a string, how? (for asian fonts)

Asked by: totsubo

I am trying to write an email containing data in a table. The email will be read in a mail viewer that uses a fixed-width font so I would like to line up the data in columns.

If I were dealing with ASCII-only data that would be easy, as each printing character always has a width of exactly 1.

However I am writing out Japanese characters as data. In Japanese most characters have a width of two, but some have a width of only 1.

Is there any way for me to figure out (assuming I am using a fixed-width font):

#1 the width of a string

OR

#2 if a character is single-width or double-width (then I could just loop through all the characters in a string to figure out the width).


Asked On: 2003-04-01 at 18:32:02 (ID 20570949)
Topics: Java Programming Language, Font Creator
Participating Experts: 4
Points: 250
Comments: 67


Answers

 

by: objectsPosted on 2003-04-01 at 18:50:54ID: 8250529

You can calculate the width of a string using the FontMetrics class.

 

by: totsuboPosted on 2003-04-01 at 18:52:59ID: 8250539

Please show me how. Furthermore, how do I get a FontMetrics object, since it is an abstract class?

 

by: objectsPosted on 2003-04-01 at 18:56:52ID: 8250568

FontMetrics fm = comp.getFontMetrics(font);
int width = fm.stringWidth(s);

 

by: totsuboPosted on 2003-04-01 at 19:04:18ID: 8250605

I don't have any components to call getFontMetrics() on ... This application has no GUI.

 

by: objectsPosted on 2003-04-01 at 19:09:35ID: 8250633

Then how is the string getting displayed?

 

by: totsuboPosted on 2003-04-01 at 19:12:02ID: 8250651

Read the question again. I am sending emails. Nothing needs to be displayed. The application is run from the console or called externally from another program.

 

by: objectsPosted on 2003-04-01 at 19:20:42ID: 8250695

Sorry I thought your application was the mail viewer :)

There is no way you can find the width then. In fact, the width will differ depending on which application is viewing the mail and which fonts are installed on the reader's system.

 

by: totsuboPosted on 2003-04-01 at 19:26:33ID: 8250730

Yes you are absolutely right :) But ...

The mail viewer will be using a fixed-width font. So as long as I line things up on my end they will line up in the mail viewer.

 

by: burtdavPosted on 2003-04-01 at 19:51:45ID: 8250876

You might still be able to use a FontMetrics to calculate your width - something like
FontMetrics fm = Toolkit.getDefaultToolkit().getFontMetrics(new Font(...));

 

by: totsuboPosted on 2003-04-01 at 20:02:40ID: 8250917

Yes, that would work but that method is deprecated ...

What is the "new" (non-deprecated) way of getting a FontMetrics object?

The API docs talk of getting a LineMetrics object but that object does not have a stringWidth() method ....

Did the java people forget to include such a method?

 

by: objectsPosted on 2003-04-01 at 20:17:09ID: 8250966

LineMetrics are not used for calculating rendered string width; that's what FontMetrics are for.

 

by: totsuboPosted on 2003-04-01 at 21:46:57ID: 8251343

Ok, so FontMetrics it is.

I still can't figure out how to get a FontMetrics object though.

FontMetrics fm = Toolkit.getDefaultToolkit().getFontMetrics(new Font(...));

Would seem the way to go but this method is deprecated.

 

by: objectsPosted on 2003-04-01 at 21:52:22ID: 8251366

You could create an image and get the font metrics objects from the associated Graphics object.
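
One way to sketch this suggestion without any visible component is to create a tiny off-screen BufferedImage and ask its Graphics2D for the metrics. The class and method names below (OffscreenMetrics, measure) are illustrative, and the logical font name "Monospaced" is an assumption. The numbers returned depend on the fonts installed on the machine running the code, not on the recipient's mail viewer.

```java
import java.awt.Font;
import java.awt.FontMetrics;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

class OffscreenMetrics {
    static {
        // Assumption: the program runs without a display (e.g. from cron).
        System.setProperty("java.awt.headless", "true");
    }

    // Returns the pixel width of s, measured against a 1x1 off-screen image.
    static int measure(String s, String fontName, int pointSize) {
        Font font = new Font(fontName, Font.PLAIN, pointSize);
        BufferedImage img = new BufferedImage(1, 1, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            FontMetrics fm = g.getFontMetrics(font);
            return fm.stringWidth(s);
        } finally {
            g.dispose(); // release the graphics context when done
        }
    }

    public static void main(String[] args) {
        System.out.println(measure("AAAAAAAAAA", "Monospaced", 16));
        System.out.println(measure("1234567890", "Monospaced", 16));
    }
}
```

The image is never drawn to or saved; it exists only to supply a rendering context.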

 

by: totsuboPosted on 2003-04-01 at 22:02:02ID: 8251395

I agree, but there must be a way to get a FontMetrics object without creating anything I don't need?

Or is this a bug in Java (i.e. they forgot a method to get a FontMetrics object when one has no visual components).

 

by: objectsPosted on 2003-04-01 at 22:10:46ID: 8251437

The size of a font is dependent on the attributes of what you are rendering to, thus you need to know what you are rendering to in order to get a FontMetrics instance.

 

by: objectsPosted on 2003-04-01 at 22:11:37ID: 8251442

> Or is this a bug in Java

No, it's not a bug. A method exists but it has been deprecated for the above reasons.

 

by: ksivananthPosted on 2003-04-01 at 22:13:43ID: 8251454

But through the LineMetrics also you can get the string width, right?

 

by: totsuboPosted on 2003-04-01 at 22:18:27ID: 8251474

ksivananth:

No, LineMetrics does not have a corresponding stringWidth() method

objects:

Hum ... To know how wide a string is all one needs to know is what Font is being used, no?

There is no need for a Component.

So there should be a way to get the width of a string knowing only which Font will be used to render it.

Or are things more complicated than I think?

 

by: objectsPosted on 2003-04-01 at 22:27:18ID: 8251509

> Or are things more complicated than I think?

Things are more complicated than you think :)
The width will vary depending on the device it is being rendered to.
This may not be a problem in your case, as you not only do not know what device the font will be rendered on, but you also do not know the details of the actual font being used. So whatever you use, it is not going to be accurate anyway.

 

by: totsuboPosted on 2003-04-01 at 22:38:51ID: 8251542

You're right.

I had forgotten about that since in my case I don't care about the actual size, but the relative size, i.e. I just want to know if a character takes up 1 "space" or 2, or if two Strings are the same width or not; if not, I pad them with spaces until they are.

I guess I'll get a FontMetrics object from a dummy Component and use that.

Things were indeed not as simple as I had imagined (or hoped ... ;)

Points go to objects unless there are object(ions)s? :)

 

by: burtdavPosted on 2003-04-02 at 02:34:08ID: 8252516

Are you using my suggestion?

 

by: totsuboPosted on 2003-04-02 at 02:54:03ID: 8252599

No because it is deprecated. Though it *is* the solution I *would* like to use since it means I don't have to create a dummy component ...

But objects explained pretty well why that method got deprecated and why it makes no sense to get a FontMetrics object without something that will actually display that font.

 

by: mcogan1966Posted on 2003-04-02 at 03:43:15ID: 8252783

I'm thinking the only direction to go is taking your value and doing a length() on it, then turning your string into, say, a ByteArrayInputStream using getBytes(). You can then take each character individually using their byte values. You'll probably have to use your solution #2 as the basis, but that way you'll end up with the data you want.

 

by: burtdavPosted on 2003-04-02 at 03:49:35ID: 8252809

Interesting suggestion - but do the half-width characters correspond to single-byte characters? I think not, and I think objects has earnt the points. Have fun implementing, totsubo. It's kind of a strange thing to be doing... using AWT in a console app...
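
This doubt can be checked for at least one encoding. The sketch below (class and method names are illustrative) counts the UTF-8 bytes of an ASCII letter, a half-width katakana and a full-width hiragana. In UTF-8 both Japanese characters take three bytes, so byte length alone cannot separate half-width from full-width there. Other encodings may give different counts, which this sketch does not cover.

```java
import java.nio.charset.StandardCharsets;

class ByteCountCheck {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));       // ASCII letter: 1
        System.out.println(utf8Length("\uFF71"));  // half-width katakana: 3
        System.out.println(utf8Length("\u3042"));  // full-width hiragana: 3
    }
}
```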

 

by: totsuboPosted on 2003-04-02 at 04:10:50ID: 8252895

burtdav:

It might seem strange but it's not :)

The app is run from a cron job that picks up invoicing data from a database then automatically generates the emails to send out.

I want the invoicing data to line up properly, in a tabular form.

True, that if the user uses a mail viewer that doesn't use a fixed-width font this solution has no impact. But for those that do it will make for a nicely-formatted email :)

It's a lot of work for little pay-back, but it's what the customer wants ... :(

 

by: burtdavPosted on 2003-04-02 at 04:15:01ID: 8252912

I was referring to the simple fact of using windowing components in a non-windowed app being weird. Well, it's the customer who pays the bills, isn't it? I hope you're able to provide value even in this case.
Cheers!

 

by: totsuboPosted on 2003-04-02 at 04:35:51ID: 8253000

I agree it's weird.

As for value ... I guess at the rate I get paid the company is getting value.

But I still have a long way to go before I can say my Java programs are well-written. Lots more practice needed.

 

by: totsuboPosted on 2003-04-02 at 07:08:41ID: 8253984

objects:

I had run some quick tests and your method seemed to be working, but now that I am using it on real data it doesn't anymore. Can you help?

I use the function below many times as I build up a line to make the columns line up. I pad with "+" at the end of the line until I reach the beginning of the next column.

Font font = new Font("Courier", Font.PLAIN, 2);
FontMetrics fm = Toolkit.getDefaultToolkit().getFontMetrics(font);
int LINE_LENGTH = 72;
String mpc, title, dsc, qty, price, total;

String line = mpc;
line = pad(line, 10) + title;
line = pad(line, 50) + dsc;
line = pad(line, 60) + qty;
line = pad(line, 63) + price;
line = pad(line, LINE_LENGTH - 5) + total;

// Pads s with "+" until its measured width reaches len, then returns the padded string.
String pad(String s, int len) {
  while (fm.stringWidth(s) < len) {s += "+";}
  System.out.println(s + " (this line is " + fm.stringWidth(s) + " wide)");
  return s;
}

Here is some sample output. If you copy-paste these lines into a japanese text editor using a fixed-width font the lines do not line up but Java says they are the same width:

AIO-048++&#12472;&#12515;&#12452;&#12450;&#12531;&#12488;&#12502;&#12483;&#12484;+++++++++++++++++++GOODS++1++5800++++++5800 (this line is 77 wide)
DOLL-010+&#23569;&#22899;&#12356;&#12383;&#12378;&#12425;&#12288;&#12371;&#12435;&#12394;&#12467;&#12488;&#12377;&#12427;&#12398;&#21021;&#12417;&#12390;&#12384;&#12424; (DVD)
++++++++++++++++++++++++++++++++++++++++++++++++++DVD+++++++1++3700++++  3700 (this line is 77 wide)
DDGB-016+&#25335;&#21839;&#35386;&#23519;&#23460;&#12288;&#32654;&#23569;&#22899;&#12463;&#12522;&#12491;&#12483;&#12463;&#12288;16++++++++DVD+++++++1++4800++++++4800 (this line is 77 wide)


Help! :)

 

by: objectsPosted on 2003-04-02 at 14:07:17ID: 8257072

> Font font = new Font("Courier", Font.PLAIN, 2);

Shouldn't you be using a Japanese font?

 

by: totsuboPosted on 2003-04-02 at 20:12:23ID: 8258714

Good point. I changed my code to use a japanese font but I still have the same problem:

Font font = new Font("FixedSys", Font.PLAIN, 16);
FontMetrics fm = Toolkit.getDefaultToolkit().getFontMetrics(font);

System.out.println("AAAAAAAAAA" + " (" + fm.stringWidth("AAAAAAAAAA") + " wide)");
System.out.println("1234567890" + " (" + fm.stringWidth("1234567890") + " wide)");
System.out.println("&#12354;&#12354;&#12354;&#12354;&#12354;" + " (" + fm.stringWidth("&#12354;&#12354;&#12354;&#12354;&#12354;") + " wide)");
System.out.println("&#65297;&#65298;&#65299;&#65300;&#65301;" + " (" + fm.stringWidth("&#65297;&#65298;&#65299;&#65300;&#65301;") + " wide)");
System.out.println("&#31169;&#12399;&#38263;&#12356;&#12449;" + " (" + fm.stringWidth("&#31169;&#12399;&#38263;&#12356;&#12449;") + " wide)");


OUTPUT:

AAAAAAAAAA (110 wide)
1234567890 (90 wide)
&#12354;&#12354;&#12354;&#12354;&#12354; (80 wide)
&#65297;&#65298;&#65299;&#65300;&#65301; (80 wide)
&#31169;&#12399;&#38263;&#12356;&#12449; (80 wide)

Though it seems that the japanese characters are always 16 pixels wide ...

 

by: objectsPosted on 2003-04-02 at 20:16:57ID: 8258733

(As a test) have you tried displaying them using Java to see whether they do in fact line up?

 

by: totsuboPosted on 2003-04-02 at 20:58:29ID: 8258882

No, as whether they line up in Java or not is not important.

One of the specs is that in the mail viewer one ASCII character takes up one space and one full-width japanese character takes up two spaces.

Unfortunately there are also japanese half-width characters that take up one space, so I can't just check to see if a character falls in the ASCII range or not :(

I've tried to find a list of the unicode ranges for half-width characters but with no luck. As far as I can tell they are all over the place ...

 

by: objectsPosted on 2003-04-02 at 21:08:07ID: 8258914

> One of the specs is that in the mail viewer one ASCII
> character takes up one space and one full-width japanese character takes up two spaces.

Does the Java fixed width font follow the same rules?

 

by: totsuboPosted on 2003-04-02 at 21:44:47ID: 8259022

Yes, as far as I can tell it does. All fonts that support japanese that I have tested follow the same rules.

 

by: objectsPosted on 2003-04-02 at 21:56:26ID: 8259060

Then testing if it lines up in Java should be useful then.
As it should line up in Java.

 

by: totsuboPosted on 2003-04-02 at 22:12:40ID: 8259108

No, I guess I didn't quite catch your question.

The test case I gave shows that in Java the characters do not line up:

AAAAAAAAAA (110 wide)
1234567890 (90 wide)

What I meant in my answer to your question was this:

If a character is double-width in the email viewer, it will be double-width in Java, and the same for half-width characters.

But whereas in the email viewer all half-width characters have the same width (and the same for the full-width characters), I have yet to find in Java a truly fixed-width font where all half-width characters have the same width (in pixels) when using fm.stringWidth() to measure the width.

 

by: burtdavPosted on 2003-04-03 at 00:37:11ID: 8259685

Can you populate a boolean[] reference array with false for single-width characters and true for double-width characters? Like this:
// in a class
private static boolean[] charIsDoubleWidth;
// in a constructor or method, before it needs to be used
if (charIsDoubleWidth == null) {
    charIsDoubleWidth = new boolean[65536];
    for (int i = 0; i < charIsDoubleWidth.length; i++) {
        charIsDoubleWidth[i] = (i > 0xff && i != 0x1234 && i != 0x1235 // ...
                                                                      );
    }
}
It would be somehow better to initialise it with an aggregate (public static final boolean[] cidw={false,...}), but that would be prohibitively huge.
Then testing a character is as simple as evaluating charIsDoubleWidth[charToTest]. But if you don't know that list, or if it's impractical to express in terms of exceptions like I've tried to show, then this is not your solution.

 

by: objectsPosted on 2003-04-03 at 00:43:07ID: 8259709

I'm getting confused. So are you saying that in the email viewer all japanese characters have the same width.
But the message may contain a mix of japanese and ascii characters.

 

by: totsuboPosted on 2003-04-03 at 01:25:41ID: 8259922

Objects:

You've almost got it. The text can contain a mix of japanese and ascii characters, *and* to make matters more complicated some japanese characters take up the same space as ASCII characters whereas others (most) take up twice as much space.

burtdav:

Your suggestion is good but how do you know if a character is half or double width? All characters in the ASCII range are half-width but not all characters above that are full width ...

 

by: burtdavPosted on 2003-04-03 at 03:16:46ID: 8260477

My rule above accounts for that: "i > 0xff"  says that double-width characters are all above '\u00ff', and the "!="s after that specify single-width characters; read it like this:
charIsDoubleWidth[i] = (i > 0xff && i != 0x1234 && i != 0x1235 ...)
double width if (above ascii range BUT not '\u1234' AND not '\u1235' etc.)
You could do this if it was practical to type in the character codes of all the exceptions. You could do this using a target mail client: generate an email with characters next to character codes on separate lines, and it will be easy to differentiate between the two types.

 

by: totsuboPosted on 2003-04-03 at 03:26:51ID: 8260532

I agree that your solution would work the only problem is that I don't know what all the half-width characters are ...

I can guess at most of them (all the half-width kana) but there are some I don't know about. There are many half-width punctuation marks and graphics that I don't know about.

I've tried looking for a chart of these but can't find one.

 

by: burtdavPosted on 2003-04-03 at 03:50:51ID: 8260624

You can make one by generating a (fairly long) email...
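import java.io.*;

public class ListCharacters {
    public static void main(String[] args) throws IOException {
        // Write as UTF-8 so the non-ASCII characters survive in the file.
        PrintWriter out = new PrintWriter(new OutputStreamWriter(new FileOutputStream("myoutputfile.txt"), "UTF-8"));
        // Count with an int: a char counter would wrap to 0 after 0xFFFF and never terminate.
        for (int i = 1; i <= 0xFFFF; i++) {
            out.println((char) i + " " + i); // one character and its code per line
        }
        out.close();
    }
}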
Hopefully that will make a unicode file you can copy into an email and view in your email client - widths should become apparent.

 

by: objectsPosted on 2003-04-03 at 12:01:25ID: 8264083

On a side note, what character encoding are you using to mix japanese and ascii characters?

 

by: burtdavPosted on 2003-04-03 at 13:53:28ID: 8264978

objects, I think that characters 0x00 through to 0x7f are fairly consistent between most modern encodings; so 0x0041 in an asian character set would represent 'A'.

 

by: totsuboPosted on 2003-04-03 at 19:05:40ID: 8266584

burtdav:

You would like me to go through 65,535 characters by hand?

Objects:

I'm using iso-2022-jp and though I am not an expert I believe that for all japanese encodings anything below char(256) is single-width.

 

by: objectsPosted on 2003-04-03 at 19:17:08ID: 8266641

Can you use the FontMetrics to determine which characters are single width and which are double width?

 

by: burtdavPosted on 2003-04-03 at 19:27:17ID: 8266693

It's just an idea: if there are relatively few single-width characters, you can set up rules for finding them like I've explained. You would only have to search through the limited range of characters that are actually used. If it's still a mixture (ie a lot of single-width characters, and not just in a few ranges), then obviously it's not practical.

Again, you might be able to set up the same kind of thing as a literal array using FontMetrics - you could have a once-off java program to produce the code for that array by checking the width using a FontMetrics in a graphical context.

 

by: burtdavPosted on 2003-04-03 at 19:29:01ID: 8266705

Thinking even more outside the square, can you use tab characters or html tables to do the formatting? Though I don't suppose you'd be here if you could.

 

by: totsuboPosted on 2003-04-03 at 19:34:40ID: 8266728

burtdav:

If I could use tabs I wouldn't be here :)

objects:

Using *anything* to find the width of a character would be fine. But as I showed with my little test, characters which have the same width in the email viewer (i.e. one "space") don't give the same width using FontMetrics ...

The following two strings take up the same width in the viewer but FontMetrics reports two different widths:

AAAAAAAAAA (110 wide)
1234567890 (90 wide)

 

by: objectsPosted on 2003-04-03 at 19:41:06ID: 8266764

I realise that the Java font you are using has varying widths, but you may still be able to use it to distinguish whether a character is single or double width. ie. you don't use the width directly, you just use it to determine if its a single or double width char.
You could then count how many single and double width characters there are and calculate width simply based on these numbers.

width = w * (s + (2 * d))
where
w = single char width in email viewer
s = # of single width chars
d = # of double width chars
(n = s + d is the total number of characters)
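
As a worked instance of this counting idea, take w as 1 so the result is in character cells: the width is then s + 2*d. The sketch below assumes a caller-supplied test for double-width characters. The isDouble predicate and the demo rule in main are hypothetical placeholders, since how to implement that test is exactly what the thread is still working out.

```java
import java.util.function.IntPredicate;

class CellCount {
    // Width in cells: each single-width char counts 1, each double-width char counts 2.
    static int cells(String text, IntPredicate isDouble) {
        int s = 0, d = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isDouble.test(text.charAt(i))) d++; else s++;
        }
        return s + 2 * d;
    }

    public static void main(String[] args) {
        // Placeholder rule for the demo only: treat hiragana (U+3040..U+309F) as double width.
        IntPredicate demoRule = c -> c >= 0x3040 && c <= 0x309F;
        System.out.println(cells("AB\u3042\u3042", demoRule)); // 2 singles + 2 doubles = 6
    }
}
```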

 

by: totsuboPosted on 2003-04-03 at 20:08:25ID: 8266864

"ie. you don't use the width directly, you just use it to determine if its a single or double width char"

That's what I've been trying to do all along :) So how does one use a character's width to decide if it's single or double sized? I think I see where you are going with this but I just want to make sure ...

 

by: burtdavPosted on 2003-04-03 at 20:08:51ID: 8266867

int w = getFontMetricsWidth(charToTest);
// compare to an arbitrary width in this font, below which all characters are "single-width" and at or above which all characters are "double-width"
boolean charIsDoubleWidth = w >= 150;

 

by: objectsPosted on 2003-04-03 at 20:17:38ID: 8266916

I'm assuming that a double width char will be about twice as wide as a single width character.
eg.

width= 9 -> single
width=19 -> double
width=11 -> single
width=18 -> double
width=22 -> double

 

by: totsuboPosted on 2003-04-03 at 20:37:42ID: 8266991

Yup, that's the hack I finally came up with last night at 2am. I'm assuming that any character that has a width < 16 is single and >=16 is double.

int getWidth(String s) {
  int l, width = 0;
  char c;
  Character character;
  for (int i = 0; i < s.length(); i++) {
    c = s.charAt(i);
    character = new Character(c);
    l = fm.stringWidth(character.toString());
    if (l == 16) width += 2;
    else width++;
  }
  return width;
}


Seems I got lucky and the font I picked uses the same width for all double-width characters (16) and it's only the single-width characters that have variable widths.

Horrible hack and I was hoping for a better solution but I guess there might not be one.

 

by: burtdavPosted on 2003-04-03 at 20:41:20ID: 8267009

You can make it a bit safer by changing (l == 16) to (l >= 16) or maybe even (l >= 15).

 

by: burtdavPosted on 2003-04-03 at 20:44:42ID: 8267020

You can also save some time by getting rid of all references to Character and changing (fm.stringWidth(character.toString())) to (fm.stringWidth(String.valueOf(c)))

 

by: objectsPosted on 2003-04-03 at 20:46:20ID: 8267024

> l = fm.stringWidth(character.toString());

There's a charWidth() function you can use instead of creating a string.

 

by: totsuboPosted on 2003-04-03 at 21:26:10ID: 8267182

Thanks for the optimisation tips, they've been incorporated.

Optimisation was the last thing on my mind last night. Just getting the bloody thing to work was an achievement :)

 

by: burtdavPosted on 2003-04-03 at 21:32:45ID: 8267211

// I'm curious about the character set... what does this method display for the font you're using?
private void printChangeCount() {
    // "new" is a reserved word in Java, so the flags are named wide/prevWide
    boolean wide, prevWide = false;
    int count = 0;
    for (int c = 1; c < 65535; c++) {
        wide = fm.charWidth(c) >= 16;
        if (wide ^ prevWide) { // I hope this is correct to XOR 2 booleans; if not, ((!(wide&&prevWide))&&(wide||prevWide))
            count++;
        }
        prevWide = wide;
    }
    System.out.println(count);
}

 

by: totsuboPosted on 2003-04-03 at 21:46:58ID: 8267270

it prints out 8652

What does your function check?

Also how can I print out all the single-width characters?

I can't find a way to convert an int to a char or Character ...

 

by: objectsPosted on 2003-04-03 at 21:55:02ID: 8267296

> I can't find a way to convert an int to a char or Character ...

char c = (char) i;

why do you need that?

 

by: burtdavPosted on 2003-04-03 at 22:30:46ID: 8267443

It checks how often a wide char is next to a narrow one or vice-versa, thus measuring how many contiguous blocks of narrow characters there are. There are 4326. That's a lot, and I can conjecture that they may be well-mixed within the used range of characters. How many of the characters are actually used?

 

by: totsuboPosted on 2003-04-03 at 22:39:10ID: 8267477

objects:

I was just curious as to what the half-width characters were so I wanted to print them out.

I still can't figure out what burtdav's function does though.

*and* I was able to finally find a table giving the widths for characters. PHP has a mb_strwidth() function that returns the width of a string. They use these values:

Unicode range     Character width
---------------------------------
U+0000 - U+0019   0
U+0020 - U+1FFF   1
U+2000 - U+FF60   2
U+FF61 - U+FF9F   1
U+FFA0 -          2

Now, I know this is a simple question, but how does one check the unicode value of a char?

Would I just do:

int getWidth(String s) {
 int width = 0, c;
  for (int i = 0; i < s.length(); i++) {
    c = (int)s.charAt(i);
    if (c >= 0x0020 && c <= 0x1FFF) {
      width++;
    }
    else if (c >= 0xFF61 && c <= 0xFF9F)   {
      width++;
    }
    else if (c >= 0x2000 && c <= 0xFF60) {
      width += 2;
    }
    else if (c >= 0xFFA0) {
      width += 2;
    }
  }
  return width;
}

 

by: objectsPosted on 2003-04-03 at 22:49:09ID: 8267515

Yes that looks reasonable.
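
Building on the range table, padding a field out to a given column might look like the sketch below. The names cellWidth, widthOf and padTo are illustrative. The ranges are copied from the table quoted in this thread and have not been checked against any particular mail viewer or font. Code units below U+0020 are counted as zero width, which extends the table's first row slightly.

```java
class ColumnPad {
    // Width in cells of one UTF-16 code unit, using the ranges quoted in the thread.
    static int cellWidth(char c) {
        if (c < 0x0020) return 0;   // control characters
        if (c <= 0x1FFF) return 1;
        if (c <= 0xFF60) return 2;
        if (c <= 0xFF9F) return 1;  // half-width katakana range
        return 2;
    }

    static int widthOf(String s) {
        int width = 0;
        for (int i = 0; i < s.length(); i++) width += cellWidth(s.charAt(i));
        return width;
    }

    // Appends fill characters (assumed single width) until s occupies at least `column` cells.
    static String padTo(String s, int column, char fill) {
        StringBuilder sb = new StringBuilder(s);
        for (int w = widthOf(s); w < column; w++) sb.append(fill);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(padTo("AIO-048", 10, '+') + "|");
        System.out.println(padTo("\u3042\u3042\u3042", 10, '+') + "|");
    }
}
```

If a field is already wider than the column, padTo returns it unchanged, so truncation or wrapping would need separate handling.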

 

by: burtdavPosted on 2003-04-03 at 23:13:15ID: 8267622

There's no need to cast to int (assigning to c) - that cast is implicit.
It might be "nicer" to declare c as char anyway, and compare with char literals: if (c >= '\u0020' && c <= '\u1FFF') etc
As char is an integer data type, char and int are almost interchangeable. (The only exception is that you can't implicitly cast int to char, because char is smaller.)

My function adds 1 to its count every time it finds a character with width >= 16 next to a narrower character,  ie if '\uff60' is wide and '\uff61' is narrow. If that table was going to produce the same results as your FontMetrics width >= 16 check, that function would return 3. So, either the table's wrong for this charset, or the width method is very unreliable.
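
The figure of 3 can be checked by running the same change-counting loop against the range table instead of FontMetrics. In this sketch isWideByTable is a hypothetical helper that encodes the width-2 rows of the table quoted earlier in the thread.

```java
class TableChangeCount {
    // True where the quoted table assigns width 2.
    static boolean isWideByTable(int c) {
        return (c >= 0x2000 && c <= 0xFF60) || c >= 0xFFA0;
    }

    // Counts how often the wide/narrow classification flips between consecutive code points.
    static int countChanges() {
        boolean prev = false;
        int count = 0;
        for (int c = 1; c < 65535; c++) {
            boolean wide = isWideByTable(c);
            if (wide != prev) count++;
            prev = wide;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChanges()); // prints 3
    }
}
```

The three flips fall at U+2000, U+FF61 and U+FFA0, the boundaries between the table's width-1 and width-2 rows.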

 

by: totsuboPosted on 2003-04-04 at 00:48:02ID: 8268040

I'm not sure what the reason is for the PHP table and your program's output not agreeing, but I would probably say it has to do with the charset.

Do you know of any font that uses the same code-space as Unicode? If so I could re-run your test using that font to see if it matches the table.

 

by: burtdavPosted on 2003-04-05 at 00:05:10ID: 8274397

MS Word (2000+ I think) comes with a "universal font" as an option under localisation in the install. I think that's a unicode font. Another issue might just be the arbitrary 16 point limit we've been using - a little higher or lower would change the results, maybe dramatically.
