How does Experts Exchange format their text?

Posted on 2004-11-01
Last Modified: 2006-11-17
Meaning that, right now I am entering text into a textbox (textarea) and I put in some simple formatting in here ... nothing rich text though.

Just some line breaks ....

    - And some spaces for maybe code or example text

All I want to do is get the same simple HTML formatting into my pages ... I am using JSP to do the dynamic input and output for my pages and so far I am saving the results to the database without any problem. I am just looking for the best way to format it on the way back out ... or is it best to save the initial input as the formatted HTML that I want so I don't need to perform an action on the text every time I pull it out from the DB?

So far I am thinking of just doing a simple replace of a carriage return (character 32?) with a "<br>" and I think that I will look for instances of 2 spaces in a row and replace that with a "&nbsp;" ... regular expression ... I think.

What are your thoughts? Code snippets in Java?
Question by:dfu23
    LVL 2

    Expert Comment

    Do you want to save multiple spaces as sequential &nbsp;s?

    Also, how do you have your textareas configured?  Some parameters make a big difference in preserving CRLFs.
    LVL 14

    Author Comment

    Moving the question to the Community Support to be answered by the Engineering team would be fine ...


    This input is being saved into an Oracle DB from a web form/textarea. Right now I only really have to worry about IE as the primary browser, but I would like it to work fine with any browser's input. And to your first question, my answer is yes. I want to find the places where there are two or more spaces together and replace them with the same number of &nbsp;'s. At least that is the thought for now ... any better suggestions are welcome.
    LVL 2

    Expert Comment

    One or more consecutive spaces in HTML text will always be displayed as a single space (duh!).  A problem with regular expressions is that they are "single-pass".

    So, consider the following line:

    Beginning onespace onespaceagain  twospaces    fourspaces     fivespaces

    We can't search for "  " (two spaces) and recursively consume and match each space until we get something like this:

    Beginning onespace onespaceagain&nbsp; twospaces&nbsp;&nbsp;&nbsp; fourspaces&nbsp;&nbsp;&nbsp;&nbsp; fivespaces

    But, we can be tricky with one pass and still retain the spacing.  Consider this substitution phrase:

    /  / \&nbsp;/g

    on the original line, we would get:

    Beginning onespace onespaceagain &nbsp;twospaces &nbsp; &nbsp;fourspaces &nbsp; &nbsp; fivespaces

    Which would exactly retain the initial spacing used in our original example (and would save on some space over the
    recursively consuming example) when displayed in HTML.
    LVL 14

    Author Comment

    ok, so let's say that I have the input from the textarea stored in a String:

    String strInput = request.getParameter("txtMessage");

    How would I then run the regular expression you suggested against this string?
    LVL 2

    Accepted Solution

    (The following example is in Java 1.4)

        String strInput = request.getParameter("txtMessage");
        java.util.regex.Pattern p = java.util.regex.Pattern.compile("  ");  // Two spaces
        java.util.regex.Matcher m = p.matcher(strInput);  // Apply regular expression to string
        String newReplacedString = m.replaceAll(" &nbsp;");  // Replace all applicable matches
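
    To check the snippet above against the sample line from the earlier comment, a self-contained sketch could look like the following. The class name SpacePreserver and the method name preserveSpaces are made up for illustration, and the null check is an added assumption about how a missing form field should be handled; only Pattern, Matcher and replaceAll come from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the thread.
final class SpacePreserver {
    // Compile once and reuse; the pattern is two literal spaces.
    private static final Pattern TWO_SPACES = Pattern.compile("  ");

    // Replaces each non-overlapping pair of spaces with a space plus &nbsp;.
    static String preserveSpaces(String input) {
        if (input == null) {
            return "";  // assumption: treat a missing parameter as empty text
        }
        Matcher m = TWO_SPACES.matcher(input);
        return m.replaceAll(" &nbsp;");
    }

    public static void main(String[] args) {
        // Sample line from the thread: runs of 1, 1, 2, 4 and 5 spaces.
        String line = "Beginning onespace onespaceagain  twospaces    fourspaces     fivespaces";
        System.out.println(preserveSpaces(line));
    }
}
```

    Run on the sample line, this should print the same &nbsp;-spaced line shown in the earlier comment. An odd run of spaces keeps one plain trailing space, which is why a single pass is enough.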
    LVL 14

    Author Comment

    Very cool,

    I really like the thinking on saving space and doing:

     &nbsp; &nbsp;

    Instead of:


    I have had success now with getting the basic formatting of what I wanted with the following:

    java.util.regex.Pattern p1 = java.util.regex.Pattern.compile("  ");
    java.util.regex.Matcher m1 = p1.matcher(message.getMessage());
    String strFormattedMessage = m1.replaceAll(" &nbsp;");
    java.util.regex.Pattern p2 = java.util.regex.Pattern.compile("[\n]");
    java.util.regex.Matcher m2 = p2.matcher(strFormattedMessage);

    Where message.getMessage() retrieves the message from the DB for the matcher in the first instance and then a second pattern matcher is used to find new lines. I then spit out the results later in the page like this:


    Any suggestions on improving this? Also, I'm not handling characters yet that could mess with the HTML code ... do you know of any easy ways to replace these instances too?

    BTW - I'm increasing the points because you have been a great help to me and feel that you deserve more :)
    LVL 2

    Expert Comment

    Your solution for newlines is pretty much right on.  My only critique is you can use the pattern "\n" instead of
    "[\n]".  Brackets define a character class, which matches any single character listed inside them.  In your code,
    the class contains only "\n", so it will only match "\n".
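
    One thing worth checking with the newline pattern: browsers typically submit textarea line breaks as "\r\n", so matching only "\n" can leave a stray carriage return in the output. The sketch below is one way to handle that, not something taken from the thread; the class name LineBreaks and the method name toHtmlBreaks are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the thread.
final class LineBreaks {
    // CRLF is listed first so a Windows line ending becomes one <br>, not two.
    private static final Pattern LINE_END = Pattern.compile("\r\n|\r|\n");

    // Replaces every line ending (CRLF, lone CR, lone LF) with <br>.
    static String toHtmlBreaks(String text) {
        return LINE_END.matcher(text).replaceAll("<br>");
    }

    public static void main(String[] args) {
        System.out.println(toHtmlBreaks("line one\r\nline two\nline three"));
        // "[\n]" and "\n" behave the same, as the comment above points out.
        System.out.println("a\nb".replaceAll("[\n]", "<br>").equals("a\nb".replaceAll("\n", "<br>")));
    }
}
```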

    Check out:

    Now, for handling characters that would mess up HTML, first of all, don't forget characters that will mess up SQL.
    Second, I can't think of a way to do all the HTML special character substitutions in a single regular expression.

    But, in regards to SQL special characters, spaces and HTML special characters, you probably don't want to save
    the "&nbsp;"s and other HTML special characters in your database (like I said before about saving space).  Just
    do the translations after you pull the data out of your database.  But, beware of special characters that might mess
    up your SQL statements.  Escape special SQL characters and try to save all of the data from the page in the same
    format as what was typed in.  That way, if a comment includes "&quot;", you will save "&quot;" in the database
    and then when you output it to HTML, you'll output "&amp;quot;".

    (Check the HTML that EE created for my comment).

    Here is a summary in pseudo-code:

    1.  Read input and save as string
    2.  Escape all SQL characters
    3.  Save to database

    Then, when requested:

    1.  Read string from database (assuming you escaped the SQL chars correctly, this string should match the
    string from step 1 above)
    2.  Escape all HTML characters, with a series of regular expressions
       a.  You might even want to encapsulate the three statements (compile, matcher, replaceAll) into one method like:
              String substitute(String inputString, String expressionString, String replacementString);
    3.  Spit it out into HTML
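
    The output half of the summary above could be sketched roughly as follows. The substitute signature is the one suggested in the list; the class name HtmlSubstitutions, the method escapeHtml, and the choice to escape only four characters are assumptions for illustration. Matcher.quoteReplacement needs Java 5 or later, so on Java 1.4 that call would have to be dropped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the substitute signature comes from the thread.
final class HtmlSubstitutions {
    // The three statements (compile, matcher, replaceAll) wrapped in one method.
    static String substitute(String inputString, String expressionString, String replacementString) {
        Pattern p = Pattern.compile(expressionString);
        Matcher m = p.matcher(inputString);
        // quoteReplacement keeps any '$' or '\' in the replacement literal.
        return m.replaceAll(Matcher.quoteReplacement(replacementString));
    }

    // Escapes the four characters assumed here to matter for HTML output.
    static String escapeHtml(String raw) {
        String s = substitute(raw, "&", "&amp;");  // must run first, or later entities get re-escaped
        s = substitute(s, "<", "&lt;");
        s = substitute(s, ">", "&gt;");
        s = substitute(s, "\"", "&quot;");
        return s;
    }

    public static void main(String[] args) {
        // Text stored exactly as typed comes back out escaped for display.
        System.out.println(escapeHtml("&quot;"));
    }
}
```

    The ampersand substitution runs first so that the "&" introduced by the other entities is not escaped a second time.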
    LVL 14

    Author Comment

    I think that I have the trouble covered of handling special SQL characters because I am using a stored procedure to interact with Oracle on saving and retrieving the data. Let me know if you know differently on that subject ...

    From past experience with server-side web dev tools I have seen built-in methods, like HTMLEncode, which will take a string and replace the special characters with what is needed to display appropriately on a web page. Do you know if such a thing is available in JSP?
    LVL 2

    Expert Comment

    Stored procedures work fine.  I was just trying to be cautious (I've spent a few nights tracking down
    little problems like this).

    Try this for HTML Encoding:

    LVL 14

    Author Comment


    Thanks very much for all of your help. With a little further searching I found a nice little piece of code from an earlier post here in the Java Topic Area that will replace any special character in a string with its equivalent. If you are interested here is a link:

    So basically I do that first ... replace special characters.
    Then I do another sweep with the regular expression to replace two spaces with a space and a &nbsp;
    Then I do another sweep with the regular expression to replace newlines with a <br/>
    Then I do another sweep with the regular expression to replace tabs with a space, a &nbsp;, a space, a &nbsp; and a space (5 spaces)

    I am very pleased with the results.
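
    Put together, the four sweeps described above might look something like this. It is only a sketch: the linked code is not reproduced in the thread, so escapeSpecialChars is a stand-in that handles four characters, the class and method names are hypothetical, and the newline pattern also accepts "\r\n" and "\r" as an added assumption.

```java
// Hypothetical formatter; names and the escape routine are illustrative stand-ins.
final class MessageFormatter {
    // Single pass over the text, replacing HTML special characters with entities.
    static String escapeSpecialChars(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Applies the sweeps in the order listed in the comment above.
    static String formatForHtml(String raw) {
        String s = escapeSpecialChars(raw);        // 1. special characters first
        s = s.replaceAll("  ", " &nbsp;");         // 2. two spaces -> space + &nbsp;
        s = s.replaceAll("\r\n|\r|\n", "<br/>");   // 3. line endings -> <br/>
        s = s.replaceAll("\t", " &nbsp; &nbsp; "); // 4. tab -> five display spaces
        return s;
    }

    public static void main(String[] args) {
        System.out.println(formatForHtml("if (a < b)\n\treturn  \"ok\";"));
    }
}
```

    Escaping has to come first here; if it ran last, the "&" in the inserted &nbsp; and the "<" in <br/> would themselves be escaped and show up as literal text.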
    LVL 2

    Expert Comment

    Cool!  I'm glad I could help.

    I was thinking about the tab replacement.  I'm glad you found a solution to that, also.
