TheOwner

asked on

JTextPane Control Character Formatting / Custom Markup

Hi,

I would please like to know what the best / easiest means of accomplishing the following would be:

A JTextPane that receives text that has custom characters for formatting text, for example @TEXT@

Would show the word "TEXT" in the JTextPane and be in bold.

I have looked into custom HTML tags, but they seem to follow HTML markup (for example, <B>TEXT</B> would accomplish the same result, but it has to be enclosed with the '<' and '>').

I unfortunately don't have code as I have no clue where to start and is just a simple JTextPane.

Any ideas? I am out of ideas here.


Regards,
CEHJ

What's wrong with using html tags rather than reinventing the wheel?
You'd be far better off using <span> and <div> with CSS. See

http://weblogs.java.net/blog/enicholas/archive/2008/07/introducing_jav.html
another example here

http://www.exampledepot.com/egs/javax.swing.text/tp_StyledText.html


you're going to need to parse your text and apply the appropriate styling

> You'd be far better off using <span> and <div> with CSS. See

actually offers no pros that I can see, and would actually seem trickier to implement in your case.
>>actually offers no pros that I can see,

TheOwner, if you're confused about the pros too, let me know and i'll go through it
can't you just post us some code then to show us what you mean, seems a terrible overkill to us for basic formatting (that is already supported)

TheOwner

ASKER

Hi there,

Thanks for the replies. As far as I understand, CSS would still need the enclosing tags with '<' and '>'. The thing is, if I iterate through all the words and replace the @, _, etc. control characters, then I would need to make sure that the word is in fact enclosed by those characters (for instance an e-mail address of test@domain.com must not become test<b>.com).

As for "re-inventing the wheel", I agree with you, but it is for a client that already implements these standards and he does not want to redo the backend.

Still stuck unfortunately :(
Sure, let me create some code for you.
my comment was actually directed at CEHJ and his suggestion to use html.

I don't see it's necessary, see the links I posted above for how to apply the styling as you parse the text.

Yes i sympathise. This is what happens when people don't use standards-based technologies.

>>for instance an e-mail address of test@domain.com must not become test<b>.com

Yes, you're right, so they've made a poor choice using '@' as a delimiter - maybe they chose it before email was invented ;-)

The first thing to do then is to approach the issue of parsing and it would be useful to see a complete example

The code below is just copy and paste from the example here, as I have not coded anything yet because I do not know in which direction to go with this.
 
JTextPane rxTextPane = new JTextPane(); //Receives the String
JScrollPane scrollingArea = new JScrollPane(rxTextPane);
 
StyledDocument doc = (StyledDocument)rxTextPane.getDocument();
 
Style style = doc.addStyle("BOLD", null);
StyleConstants.setBold(style, true);
 
doc.insertString(doc.getLength(), "Some Text", style); // Option one of input.
rxTextPane.setText("Some text"); // Option two of input (I would rather like to use this method). No formatting done here in this case.
 
rxTextPane.setText("@Some text"); // No formatting done here in this case.
 
rxTextPane.setText("@Some text@"); // "Some text" becomes bold.


No, we need to see the actual text you're going to be using
As far as the text that will be used, the text that will be used are responses from a server (it's not a "set" / final String). I.e. it dynamically changes.
If your text already contains your markup character then there's not much you can really do to determine which are markup and which are actual @'s.

>> (it's not a "set" / final String). I.e. it dynamically changes.

Yes, i'm aware of that. I mean just an example
Spot on, objects. The text already has the markup and is dynamic.

My best thought at the moment is to get each word in the String to a String array and then check the first character and last character to match the control chars - then substring that array element's first and last chars off and insert HTML code in their place.
that should work as long as any real @'s are embedded. Use split() to break up the line and then process each word


You don't need to insert any html though, that will just be messy

you could do something like this. may need to vary it a little depending on your exact requirements.

Style bold = doc.addStyle("BOLD", null);
StyleConstants.setBold(bold, true);

if (bold(word)) {
   style = bold;
} else if (italic(word)) {
   style = italic;
}
...

doc.insertString(doc.getLength(), word, style);

Thanks for the tip, I think that is the best option so far. I am just worried the server doesn't reply with:

@This is an example@

@This is an@ example

It is ideal for:

@This@ @is@ @an@ @example@

This @is@ an example

I'll need to capture the packets and see.
If it's doing that, you'll need to get a little more sophisticated and remember the current style.
So if you see @ at the start of a word you set the style to bold, then insert the text, then if you see @ at the end of a word you set the current style back to plain.
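
The "remember the current style" idea can be sketched as a small runnable class. This is only an illustration under some assumptions: the class and helper names are made up, a DefaultStyledDocument stands in for the JTextPane's document, and the markers are assumed to be attached to a word (a lone "@" between spaces is not handled here).

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

class CurrentStyleSketch {
    // Builds a document from one line, switching style at a leading or trailing '@'.
    static StyledDocument render(String line) {
        StyledDocument doc = new DefaultStyledDocument();
        SimpleAttributeSet plain = new SimpleAttributeSet();
        StyleConstants.setBold(plain, false);
        SimpleAttributeSet bold = new SimpleAttributeSet();
        StyleConstants.setBold(bold, true);

        SimpleAttributeSet current = plain;
        String[] words = line.split(" ");
        try {
            for (int i = 0; i < words.length; i++) {
                String word = words[i];
                boolean opens = word.startsWith("@");
                boolean closes = word.length() > 1 && word.endsWith("@");
                if (opens) current = bold;
                // Strip only the markers that were actually found on this word.
                String text = word.substring(opens ? 1 : 0,
                        closes ? word.length() - 1 : word.length());
                doc.insertString(doc.getLength(), text, current);
                if (closes) current = plain;
                // Put back the space that split() removed, in the style now in effect.
                if (i < words.length - 1) doc.insertString(doc.getLength(), " ", current);
            }
        } catch (BadLocationException e) {
            // Offsets come from getLength(), so this is not expected to happen.
            throw new IllegalStateException(e);
        }
        return doc;
    }

    static String textOf(StyledDocument doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean isBoldAt(StyledDocument doc, int offset) {
        return StyleConstants.isBold(doc.getCharacterElement(offset).getAttributes());
    }

    public static void main(String[] args) {
        StyledDocument doc = render("This @is an@ example");
        System.out.println(textOf(doc) + " / bold at 5: " + isBoldAt(doc, 5));
    }
}
```

With a real JTextPane the same loop would run against getStyledDocument(). An @ embedded in a word, as in an e-mail address, neither starts nor ends the word, so it is left alone.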

>>I'll need to capture the packets and see.

Yes, when you know what the actual markup is, let us know
I have captured the data and it unfortunately spans across multiple words, eg:

@This is an example@
This @is an@ example
That's not a problem in itself. The following would convert that to CSS (for example)
s = s.replaceAll("@(.*?)@", "<span class=\"x\">$1</span");


Thanks for that snippet. I am not the best with that kind of String handling, so I am just trying to understand the $1, but I'll try that out ;)
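
On the $1: in a Java regex replacement string, $1 stands for whatever the first parenthesised group matched, and .*? is a reluctant match, so each pair of @ is handled on its own. A minimal sketch, with a made-up class and method name and the CSS class x taken from the snippet:

```java
class GroupReferenceSketch {
    // Wraps every @...@ pair in a span; $1 is the text captured by (.*?).
    static String markUp(String s) {
        return s.replaceAll("@(.*?)@", "<span class=\"x\">$1</span>");
    }

    public static void main(String[] args) {
        System.out.println(markUp("This @is an@ example"));
        System.out.println(markUp("@one@ and @two@"));
        // Two unrelated '@' characters are also treated as a pair:
        System.out.println(markUp("a@b.com c@d.com"));
    }
}
```

The last call shows the caveat raised earlier in the thread: the pattern cannot tell markup from two e-mail addresses on one line, so some extra check on what surrounds the @ would be needed if that can occur in the data.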
SOLUTION
CEHJ
Hi,

I've been having great success with CEHJ's code above, however there is just one small problem: I have noticed that some of the server responses make use of the asterisk character '*' (yeah, it's absurd, I know).

Using the above method, I get the exception below:
java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0

Any idea for a way around that?
Oh wait I got it, I just use [ and ] to enclose the asterisks. Thanks guys, I'll give another update soon ^^
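
Besides wrapping the asterisk in [ and ], Pattern.quote() turns any delimiter into a literal, which avoids the "Dangling meta character" exception without having to know which characters are special. A small sketch; the class name, method name and the <b> replacement are only for illustration:

```java
import java.util.regex.Pattern;

class DelimiterQuotingSketch {
    // Marks up text enclosed by an arbitrary delimiter, e.g. "*" or "@".
    static String markUp(String s, String delimiter) {
        String d = Pattern.quote(delimiter); // "*" is now matched literally, not as a quantifier
        Pattern pair = Pattern.compile(d + "(.*?)" + d);
        return pair.matcher(s).replaceAll("<b>$1</b>");
    }

    public static void main(String[] args) {
        System.out.println(markUp("a *bold* word", "*"));
        System.out.println(markUp("a @bold@ word", "@"));
    }
}
```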
All you need is this:

Style current = plain;
String[] words = line.split(" ");
for (String word : words) {
   if (word.startsWith("@")) current = bold;
   doc.insertString(doc.getLength(), strip(word), current);
   if (word.endsWith("@")) current = plain;
}

Hi objects,

With your code, does it handle the situation where there is just one @ character, but no second one? For instance, would the String representing an e-mail address

account@example.com show the "example.com" in bold?

I am just worried for bugs in case data may genuinely have an @ symbol like in the above scenario. It seems a LOT neater than using HTML though.

I've created much of everything, I am just doing testing with stacked markup characters (e.g. @#STRING#@) using pattern matching, still too early to know for sure if those HTML tags will play nicely.
it would, it breaks up the sentences on spaces so embedded @'s would be left alone

If they use an asterisk, you can do
s = s.replaceAll("\\*(.*?)\\*", "<span class=\"x\">$1</span");


Oops - I copy/pasted the typo with the missing last angle bracket (it should end with </span>) - but you get the idea
The strip() method used in the code above would look like this (it just strips the @ before inserting into the document):

static String strip(String word) {
   int start = (word.startsWith("@") ? 1 : 0);
   int end = (word.endsWith("@") ? word.length() : word.length() - 1);
   return word.substring(start, end);
}

you could incorporate this in the loop instead of having a separate method if you liked, up to you.

Thanks for the help again guys, much of the app is working - I just have a few more hurdles to cross (text colouring is sent as #HEXCOLOUR) and only begins at the start of the String - I can obviously test this by checking charAt(0) for '#' but then it may be "#zzzzzz" which would be an invalid colour.

The way I see it, it can only be the characters from A-F and 0-9 due to hexadecimal. I am guessing a regex something like [a-fA-F[0-9]]

Program is otherwise coming along nicely ;)
Yes, regex would probably do it, but how does that marker end?
That's the thing - it doesn't if it is the entire sentence :( it just has that beginning marker segment
Awful ;-)
You might try something like
s = s.replaceAll("#[a-fA-F0-9]{6}([^\\.]*?)\\*", "<span class=\"clred\">$1</span>");


>  I can obviously test this by checking charAt(0) for '#' but then it may be "#zzzzzz" which would be an invalid colour.

Just test the colour if the first char is a #, no regexp needed.
And you need to parse the colour anyway

Hi objects,

I have spent the past day tweaking your code, etc and it's great but I have some serious issues with it cutting characters off when they are not enclosed with metacharacters, for example:

*test* second text *third test*

My output would be bolding correctly, but the middle String would be missing characters. Another thing is if there is spacing between the metacharacter and the String tokens - it breaks it. I tried using a scanner on the string and then getting all the tokens that are meant to be bolded, but then it started getting complicated when comparing those to the original string.

I've had to resort to the HTML formatting / replaceAll technique as it is very stable and takes into account multiple whitespacing.

As for the checking for colour, I can't accept a colour with #zzzzzz because that may actually be a response from the server - I have to literally check that it is indeed a colour (which is accomplished by the code from a few posts ago)

So far so good though!
>>(which is accomplished by the code from a few posts ago)

which post do you mean btw?
>  but the middle String would be missing characters.

Sorry, that's a little bug in the strip() method I posted; just adjust the index when it doesn't find a meta character.
Let me know if you don't understand.

> Another thing is if there is spacing between the metacharacter and the String tokens - it breaks it.

That's easy to fix, you just need to check for meta characters on their own.

> I can't accept a colour with #zzzzzz because that may actually be a response from the server - I have to literally check that it is indeed a colour

exactly why you need to use the approach I suggested as it allows you to check exactly what is there

> The way I see it, it can only be the characters from A-F and 0-9 due to hexadecimal. I am guessing a regex something like [a-fA-F[0-9]]

again no regex needed, just parse the hex value

http://helpdesk.objects.com.au/java/how-to-parse-a-hex-string
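
As a rough sketch of "just parse the hex value": check for a leading '#', then read exactly six hex digits and reject anything else, so "#zzzzzz" is treated as ordinary text. The class and method names here are invented, and returning an OptionalInt is just one way to signal "not a colour".

```java
import java.util.OptionalInt;

class HexColourPrefixSketch {
    // Returns the RGB value if s begins with '#' followed by six hex digits, otherwise empty.
    static OptionalInt parse(String s) {
        if (s == null || s.length() < 7 || s.charAt(0) != '#') {
            return OptionalInt.empty();
        }
        int rgb = 0;
        for (int i = 1; i <= 6; i++) {
            int digit = Character.digit(s.charAt(i), 16); // -1 when not a hex digit
            if (digit < 0) {
                return OptionalInt.empty();
            }
            rgb = (rgb << 4) | digit;
        }
        return OptionalInt.of(rgb);
    }

    public static void main(String[] args) {
        System.out.println(parse("#FF0000 some red text"));
        System.out.println(parse("#zzzzzz is just text"));
    }
}
```

When a value is present it can be handed to new java.awt.Color(rgb), and the remainder of the line starts at index 7.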

Hi there objects,

I see what you mean now, I was using Color.decode() but that makes sense;)

I would still like to use your code, but the problem is I still can't get the middle text to not have missing characters :(

I've tried tweaking the index, etc. but no luck. Also, to be safe, I need to know how to have it interpret @ word@ correctly and not just @word@.
> I would still like to use your code, but the problem is I still can't get the middle text to not have missing characters :(

the end index was the wrong way around in the code I posted earlier, needs to be

               int end = (word.endsWith("@") ? word.length()-1 : word.length());

Super, I noticed that after playing around with the index, etc. I messed up the core logic (I placed the insert somewhere else). That works super duper, but now I need to convert this HTML to that ;)

Back to the grinder for me!
Hi objects,

The thing is I need to make sure that the String segment is indeed enclosed in a pair of '*''s - it could just be a single * with no partner that isn't meant for formatting. Do you think it's best that I use regex to see if it is enclosed and then apply your code?
Hi there,

Just an update, the regex that I have figured out to do the check is

"(?<=\\*)([^\\*]*)(?=\\*)"

That will check for two '*' characters. Any idea, guys, how best to use this regex with objects' code above, though? That way it can ensure that the bolded sentence is indeed enclosed by the delimiter.
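
One thing to watch with the lookaround version: because the asterisks are not consumed by the match, the closing asterisk of one pair can serve as the opening asterisk of the next match, so the text between two pairs may be reported as enclosed too. The sketch below compares it with a pattern that consumes both delimiters; the class and method names are made up for the comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EnclosedSegmentsSketch {
    // The regex from the post: delimiters are only looked at, not consumed.
    static final Pattern LOOKAROUND = Pattern.compile("(?<=\\*)([^\\*]*)(?=\\*)");
    // Alternative: consume both asterisks so each one belongs to a single pair.
    static final Pattern CONSUMING = Pattern.compile("\\*([^*]*)\\*");

    // Collects group 1 of every match of p in s.
    static List<String> find(Pattern p, String s) {
        List<String> found = new ArrayList<>();
        Matcher m = p.matcher(s);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "*a* b *c*";
        System.out.println("lookaround: " + find(LOOKAROUND, s));
        System.out.println("consuming:  " + find(CONSUMING, s));
        System.out.println("unpaired:   " + find(CONSUMING, "a *test string"));
    }
}
```

For "*a* b *c*" the consuming pattern should report only "a" and "c", while the lookaround pattern also picks up " b ". A single asterisk with no partner matches neither.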
how would you decide which was meant for formatting and which wasn't?

a *test string *which is confusing*

Hi objects,

That is correct

a *test string *which is confusing*

In that example, it should have "test string *which is confusing" in bold only. I guess the outermost *'s, but then the code must not be confused with:

a *test string which is confusing

Must not make  "test string which is confusing" bold.
Sorry objects, I wasn't thinking logically, it should actually be:

a *test string *which is confusing*

=

just "test string " in bold

a *test string *which is confusing* with some added text

=

"test string " in bold and not " with some added text". Sorry for any confusion!
In that case you would toggle between bold and plain each time you encountered a @

            String[] words = line.split(" ");
            for (String word : words) {
                  if (word.equals("@")) {
                        bold = toggle(list, bold);
                  } else {
                        if (word.startsWith("@")) {
                              bold = toggle(list, bold);
                        }
                        list.add(strip(word));
                        if (word.endsWith("@")) {
                              bold = toggle(list, bold);
                        }
                  }
            }
            toggle(list, false);


where toggle would add the words from the list to the document using the flag to determine if they were bolded or not. It would also clear the list and toggle the bold flag.
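
The buffering-and-toggle idea can be tried out without any Swing code by collecting "runs" of text instead of inserting them. This is only a sketch of the approach described above, not the exact code from the thread: the Run class, the parse method and the way spaces are put back are assumptions. As in the thread's version, a marker with no partner is dropped and its text stays plain.

```java
import java.util.ArrayList;
import java.util.List;

class ToggleRunsSketch {
    /** A stretch of text and whether it should be shown in bold. */
    static final class Run {
        final String text;
        final boolean bold;
        Run(String text, boolean bold) { this.text = text; this.bold = bold; }
        @Override public String toString() { return (bold ? "BOLD" : "plain") + "[" + text + "]"; }
    }

    static List<Run> parse(String line, String marker) {
        List<Run> runs = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        boolean bold = false;
        String[] words = line.split(" ");
        for (int i = 0; i < words.length; i++) {
            String word = words[i];
            if (word.equals(marker)) {          // marker standing on its own, e.g. "@ word@"
                flush(runs, pending, bold);
                bold = !bold;
                continue;
            }
            if (word.startsWith(marker)) {      // marker opens (or re-toggles) before this word
                flush(runs, pending, bold);
                bold = !bold;
                word = word.substring(marker.length());
            }
            boolean closes = word.endsWith(marker);
            if (closes) {
                word = word.substring(0, word.length() - marker.length());
            }
            pending.append(word);
            if (closes) {                       // marker after this word: emit what was buffered
                flush(runs, pending, bold);
                bold = !bold;
            }
            if (i < words.length - 1) {
                pending.append(' ');            // restore the space removed by split()
            }
        }
        flush(runs, pending, false);            // anything left over was never closed: plain
        return runs;
    }

    private static void flush(List<Run> runs, StringBuilder pending, boolean bold) {
        if (pending.length() > 0) {
            runs.add(new Run(pending.toString(), bold));
            pending.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("a *test string *which is confusing*", "*"));
        System.out.println(parse("This is an *example sentence", "*"));
    }
}
```

Each Run could then be passed to doc.insertString(doc.getLength(), run.text, run.bold ? bold : plain) in the real application.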

Hi objects,

I am unclear as to the reason I would toggle from bold to plain after each @. It would then bring me back to the original problem of a response containing:

This is an *example sentence

making "example sentence" bold.

I would love to try out your code, but I am unsure about the toggle method, would be similar to
void toggle(String text, boolean isBold) {
   if (isBold)
      doc.insertString(doc.getLength(), text, bold);
   else
      doc.insertString(doc.getLength(), text, plain);
}


> This is an *example sentence
> making "example sentence" bold.

if you look at the last line of code it displays any trailing words always as plain.
Only wrapped words would get bolded

toggle could be done something like this:

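boolean toggle(List<String> list, boolean isBold) throws BadLocationException {
   // Insert the buffered words using the style that was in effect before the toggle
   for (String text : list) {
      if (isBold)
         doc.insertString(doc.getLength(), text, bold);
      else
         doc.insertString(doc.getLength(), text, plain);
   }
   list.clear();
   // Hand back the flipped flag so the caller can track the new state
   return !isBold;
}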

Ok that's awesome objects, I am going to get to this now then! I can finally put this misery behind me!

BTW is my code alright? That is how I guess I need to assume the toggle method. Let me know if it's wrong.
Ah thanks objects, seems like we replied close together ;)

I'll let you know how this works out and hopefully end this ;)
Hi objects,

Quick question, what should the initial values of bold and list be? For instance, I get a null pointer exception with my code below:
    public static void testInsertion(String message, JTextPane pane) throws Exception {
        String[] words = message.split(" ");
        Boolean bold = null;
        List tempList = null;
 
        for (String word : words) {
            if (word.equals("*")) {
                    bold = toggle(tempList, bold, pane);
            } else {
                if (word.startsWith("*")) {
                    bold = toggle(tempList, bold, pane);
                }
                tempList.add(strip(word));
                if (word.endsWith("*")) {
                    bold = toggle(tempList, bold, pane);
                }
            }
        }
        toggle(tempList, false, pane);
    }


Use a boolean (initially false) instead of Boolean
List could be an empty ArrayList
        List tempList = new ArrayList();

Hi objects,

I could get the code to work, but only if I changed the return !isBold to just return isBold. The only problem is it seems to go in reverse now, for instance:

This is an example * sentence

Makes "This is an example " go bold for me for some reason. The cool part though is mid sentence is stable, for example:

This is an example s*entence

leaves the formatting as is ;)

 Any ideas?
Hi there objects,

I think this is one of the few cases in programming where I am going to go the substandard route and use HTML with replaceAll (I really made a concerted effort to use your code over the past few days). It just seems to be the safer bet as it is rock solid and I don't want any bugs in the application.

I do have a slightly different question about inserting a SimpleAttributeSet as an ImageIcon into a JTextPane that has HTML formatting - should I open up a new question thread about that? I don't think it's possible though as it uses RTF tagging from my observations.

The reason I really want(ed) to use the insertString methodology was it kept things simple for inserting icons (and obviously cleaner coding).
What does your code currently look like?

> It just seems to be the safer bet as it is rock solid and I don't want any bugs in the application.

I'd actually see the regexp as the higher risk as you're doing a global replace on the text
Handling the parsing yourself is not only cleaner but also allows you to implement it exactly as you need it

Hi objects,

If I can get this insertString method going fine (handling the metacharacters correctly) then I am definitely all for it, but I don't want to cause you too much effort.

As for the previous question, I like the user to be able to copy and paste the text without losing the text that the image represents. You will see what I mean below.

As I said, I just don't want to cause too much hassle for you.
case '1':
doc2.remove(index,2); // remove old text, let's insert a picture there and then insert text below the picture.
sas=new SimpleAttributeSet();
StyleConstants.setIcon(sas,new ImageIcon("picture.png"));
doc2.insertString(index,"smalltext",smi);
start=(index+2);
break;

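
One way to keep the text that an image represents is to apply the icon attribute to the original characters instead of removing them first. The sketch below assumes that approach; the class, the BlankIcon stand-in and the ":1" code are invented for illustration, and in the real application an ImageIcon and the JTextPane's own document would be used. It only demonstrates that the characters stay in the document model; how the pane renders and copies them would still need checking.

```java
import java.awt.Component;
import java.awt.Graphics;
import javax.swing.Icon;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

class IconOverTextSketch {
    /** Stand-in icon so the sketch needs no image file (hypothetical). */
    static final class BlankIcon implements Icon {
        public void paintIcon(Component c, Graphics g, int x, int y) { }
        public int getIconWidth() { return 16; }
        public int getIconHeight() { return 16; }
    }

    // Inserts 'code' at 'offset' with an icon attribute, so the characters remain in the model.
    static void insertIcon(StyledDocument doc, int offset, String code, Icon icon) {
        SimpleAttributeSet attrs = new SimpleAttributeSet();
        StyleConstants.setIcon(attrs, icon);
        try {
            doc.insertString(offset, code, attrs);
        } catch (BadLocationException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Builds "Hi " in plain text followed by the code ":1" carrying the icon.
    static StyledDocument demo(Icon icon) {
        StyledDocument doc = new DefaultStyledDocument();
        try {
            doc.insertString(0, "Hi ", new SimpleAttributeSet());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        insertIcon(doc, doc.getLength(), ":1", icon);
        return doc;
    }

    static String textOf(StyledDocument doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf(demo(new BlankIcon())));
    }
}
```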

was asking to see the code to fix:
 "Makes "This is an example " go bold for me for some reason. "

you shouldn't get that problem

Hi objects,

Sure thing, here we go:
    public static boolean toggle(List<String> list, boolean isBold, JTextPane c) throws Exception {
 
        StyledDocument doc = c.getStyledDocument();
        Style bold = doc.addStyle("BOLD", null);
        StyleConstants.setBold(bold, true);
 
        Style plain = doc.addStyle("PLAIN", null);
 
        for (String text : list) {
            if (isBold) {
                doc.insertString(doc.getLength(), text, bold);
            } else {
                doc.insertString(doc.getLength(), text, plain);
            }
        }
 
        list.clear();
        return isBold;
    }
 
    public static void insertText(JTextPane c, String message) throws Exception {
        testInsertion(message, c);
    }
 
    public static void testInsertion(String message, JTextPane c) throws Exception {
        String[] words = message.split(" ");
        boolean bold = false;
        List<String> tempList = new ArrayList<String>();
        tempList.add(" ");
 
        for (String word : words) {
            if (word.equals("*")) {
                bold = toggle(tempList, bold, c);
            } else {
                if (word.startsWith("*")) {
                    bold = toggle(tempList, bold, c);
                }
                tempList.add(strip(word));
                if (word.endsWith("*")) {
                    bold = toggle(tempList, bold, c);
                }
            }
        }
        toggle(tempList, false, c);
    }


>         return isBold;

bold will never get set that way, needs to be

        return !isBold;

Hi objects,

I had to change that the other way around, otherwise it bolds the wrong text (simple enough to swap the logic around though).

I changed it to isBold but I still get the same effect - a single asterisk causes the bolding effect.
Add some debug output to see what's going on

                System.out.println("BOLD "+text);
                doc.insertString(doc.getLength(), text, bold);
            } else {
                System.out.println("plain "+text);
                doc.insertString(doc.getLength(), text, plain);

Insertion of "test String*" = // Incorrect

BOLD  
BOLD test
BOLD String*

Insertion of "*test String*" = // Correct, but opposite / no bold with !isBold

BOLD  
PLAIN *test
PLAIN String*

Insertion of "test *long String*" = // Correct, but opposite / no bold with !isBold

BOLD  
BOLD test
PLAIN *long
PLAIN String*
you sure you're running the same code that you posted, cause I get different result here when I run it :)

I am most definitely running that exact code ;) Are you using all three of my methods and not just yours? As I said, maybe I should just resort to HTML and use img tags...
am running the same 3 methods (with my strip method which you didn't post)
bold is initially false so I can't see how it is always true at the start in your output

output for the first one I get

plain  
plain test
plain STring

Your output also does not appear to be stripping the delimiters; what does your strip look like?

ASKER CERTIFIED SOLUTION
Oh yes, the Strip method:
public static String strip(String word) {
    int start = (word.startsWith("*") ? 1 : 0);
    int end = (word.endsWith("*") ? word.length()-1 : word.length());
    return word.substring(start, end);
}


What can I say, I ran your code and you are right, it works great. I then copied and pasted my methods with the ones you quoted and it works for me now too!

I am very grateful for your time and patience, you're very level-headed and helpful.

I think we can put this to rest now, thanks again ^^
No worries, it's been fun :)
I don't get to write a lot of code these days, it's been a refreshing change.
Thank you very much objects and CEHJ - both of you gave very good solutions in my opinion ;)