String concatenation in Java: syntactic sugar versus efficiency

CPColin, Senior Java Architect
EE Senior Architect
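Java provides two main ways to concatenate strings, and one of them can cause you some trouble. The first one most of us encounter is the + operator, where something like String value = "Hello " + "world"; results in a string with the value "Hello world". What most developers don't learn right away is that the + operator is just syntactic sugar: the compiler turns it into StringBuilder operations, and using StringBuilder directly is the other main way to concatenate strings in Java. (The bytecode described here is what compilers emit when targeting Java 8 and earlier. Since Java 9, javac emits an invokedynamic call instead of explicit StringBuilder code, but every concatenation statement still builds a brand-new string, so the advice about loops still holds.)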

This article uses examples from the Java source code in ConcatenationDemo.java. It also uses some of the same bytecode-reading techniques I used in my previous article.

Just about every time the compiler encounters a + operator that has String objects or literals as its arguments, it translates that code into a three-step sequence of operations:
 
  1. Create a new StringBuilder instance, passing the first argument to its constructor.
  2. Call StringBuilder.append() and pass the second argument.
  3. Call StringBuilder.toString().
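The three steps above can be written out by hand as a rough sketch. The class and method names here are hypothetical, and this is an illustration of the rewrite, not actual compiler output; the String.valueOf() call mirrors the form used in concatenationDemo4a() later in this article and keeps a null first argument from breaking the constructor.

```java
// A hand-written sketch of the three-step rewrite; names are hypothetical.
class DesugarSketch {
    // What the programmer writes.
    static String withOperator(String first, String second) {
        return first + second;
    }

    // Roughly what the three steps amount to.
    static String withBuilder(String first, String second) {
        // Step 1: new StringBuilder seeded with the first argument.
        StringBuilder builder = new StringBuilder(String.valueOf(first));
        // Step 2: append the second argument.
        builder.append(second);
        // Step 3: convert the result back to a String.
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(withOperator("Hello ", "world"));
        System.out.println(withBuilder("Hello ", "world"));
    }
}
```

Both methods return equal strings for the same inputs, including when an argument is null, which is appended as the text "null" in either form.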

The compiler does optimize concatenation when it can, though. For example, if a line of code creates a String value by concatenating several literals or constants (compile-time constant expressions, such as a static final String initialized with a literal), the compiler concatenates those values itself and places the resulting value in the compiled class:
 
   public String concatenationDemo1()
   {
      String value = "literal1" + " " + "literal2" + " " + STRING_CONSTANT;

      return value;
   }


The value that ends up in the compiled bytecode is "literal1 literal2 constant". This is exactly what syntactic sugar is meant for!
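One way to observe this folding without reading bytecode is to compare object references, since the Java Language Specification requires compile-time constant strings to be interned. The sketch below assumes STRING_CONSTANT holds "constant" (inferred from the result quoted above); the class name and the notFolded() method are hypothetical additions for contrast.

```java
// Sketch: compile-time constant folding seen through reference equality.
class ConstantFoldingSketch {
    // Assumed value, inferred from the article's stated result.
    static final String STRING_CONSTANT = "constant";

    // Every operand is a compile-time constant, so the compiler folds them.
    static String folded() {
        return "literal1" + " " + "literal2" + " " + STRING_CONSTANT;
    }

    // 'space' is not declared final, so it is not a constant variable and
    // the concatenation has to happen at run time.
    static String notFolded() {
        String space = " ";
        return "literal1" + space + "literal2" + space + STRING_CONSTANT;
    }

    public static void main(String[] args) {
        String expected = "literal1 literal2 constant";
        System.out.println(folded() == expected);         // prints "true"
        System.out.println(notFolded() == expected);      // prints "false"
        System.out.println(notFolded().equals(expected)); // prints "true"
    }
}
```

The reference comparison is only a diagnostic trick here; ordinary code should keep comparing strings with equals().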

If you break the statement in concatenationDemo1() into its individual parts, though, it doesn't work out so well:
 
   public String concatenationDemo2()
   {
      String value = "literal1";

      value += " ";
      value += "literal2";
      value += " ";
      value += STRING_CONSTANT;

      return value;
   }


Each use of the += operator is converted into the three-step sequence of operations described above! The resulting bytecode constructs four instances of StringBuilder, appends a single string value to each of them, and calls toString() on each one. That's a bit of a waste. You'd be much better off constructing your own single instance of StringBuilder and calling append() on it four times.
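As a minimal sketch of that suggestion, here is concatenationDemo2() reworked around one StringBuilder. The class name, the method name, and the value of STRING_CONSTANT are assumptions made so the example can stand alone.

```java
// Sketch: one StringBuilder and four append() calls instead of four +=.
class SingleBuilderSketch {
    // Assumed value, inferred from the article's stated result.
    static final String STRING_CONSTANT = "constant";

    static String concatenationDemo2WithBuilder() {
        // The builder starts with the first literal as its content.
        StringBuilder value = new StringBuilder("literal1");

        value.append(" ");
        value.append("literal2");
        value.append(" ");
        value.append(STRING_CONSTANT);

        // The only toString() call, made once at the end.
        return value.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatenationDemo2WithBuilder());
    }
}
```

The result is the same string as before; the difference is that only one builder and one final String are created along the way.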

The compiler will also optimize multiple string concatenations that occur within a single statement, since the intermediate values can't possibly be used anywhere. For example, the following two methods produce identical bytecode and can be thought of as exactly equivalent:
 
   public String concatenationDemo3()
   {
      String value = createRandomString()
         + ' '
         + createRandomString()
         + ' '
         + STRING_CONSTANT;

      return value;
   }

   public String concatenationDemo4a()
   {
      String value = new StringBuilder(String.valueOf(createRandomString()))
         .append(' ')
         .append(createRandomString())
         .append(' ')
         .append(STRING_CONSTANT).toString();

      return value;
   }


I have two asides to note here. First, it's probably more efficient to use append(' ') than append(" "), since the latter method has to iterate over a string value, adding each of its characters, while the former only has to add a single character. Second, as for that concatenationDemo4a() method, I've never been a huge fan of that style of programming, where a method on an object returns this so calls on that object can be chained. To my eye, this version looks better:
 
   public String concatenationDemo4b()
   {
      StringBuilder value = new StringBuilder();

      value.append(createRandomString());
      value.append(' ');
      value.append(createRandomString());
      value.append(' ');
      value.append(STRING_CONSTANT);

      return value.toString();
   }


This produces slightly longer bytecode when compiled, though, so it's a classic tradeoff of readability versus efficiency (and, if you read my previous article, it may not make a difference anyway).

If your code is called only occasionally, you probably don't have to worry about the exact style of string concatenation you're using; it won't affect the processing time or memory requirements much either way. If you're looping, though, this stuff makes a big difference. For example:
 
   public String concatenationDemo5()
   {
      String value = "";

      for (int i = 0; i < ITERATIONS; i++)
      {
         value += createRandomString();
      }

      return value;
   }


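This method performs that three-step sequence of operations on every iteration of the loop. If ITERATIONS equals 1000, this code creates 1,000 short-lived StringBuilder objects, then calls append() and toString() on each of them. Worse, each new StringBuilder starts by copying everything accumulated so far, and each toString() copies it again, so the total copying work grows roughly with the square of the number of iterations. The code would be much better off if it looked like this: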
 
   public String concatenationDemo6()
   {
      StringBuilder value = new StringBuilder();

      for (int i = 0; i < ITERATIONS; i++)
      {
         value.append(createRandomString());
      }

      return value.toString();
   }


This way, there's one StringBuilder instance per call to the method, saving plenty of processing time and memory overhead. (Where, by the way, would you find code that builds a string by looping and concatenating? It typically happens when reading text from a stream, like from a file or over a network.)
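To make the stream-reading case concrete, here is a minimal sketch that drains a Reader into a single StringBuilder. The class name, the readAll() method, and the 4096-character buffer size are all hypothetical choices; a StringReader stands in for a real file or network stream.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Sketch: accumulate text from a stream with one StringBuilder.
class ReadAllSketch {
    static String readAll(Reader source) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096]; // arbitrary chunk size
        int count;

        // read() returns the number of characters read, or -1 at the end.
        while ((count = source.read(buffer)) != -1) {
            // Append only the part of the buffer that was just filled.
            text.append(buffer, 0, count);
        }

        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        try (Reader reader = new StringReader("first line\nsecond line\n")) {
            System.out.print(readAll(reader));
        }
    }
}
```

Written with value += new String(buffer, 0, count) inside the loop instead, the same method would fall into the pattern shown in concatenationDemo5().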

So, as always, it's best to focus optimization efforts on areas of code that have proven to be problems. Hopefully, this article helps your string concatenation code avoid becoming one such problem area in the first place.