Java string concatenation speed tests

Looking through some code, I saw a lot of string additions being used. This is generally considered suboptimal, as it generates a lot of garbage and thus triggers a garbage collection sooner. In the past I remedied this by using the javolution library (TextBuilder class), which is built for real-time performance and is supposed to reuse memory instead of creating a lot of short-lived objects.

However, with the advances in Java compilers and (just-in-time) interpreters, is this still true?

I wrote a simple test, which I adapted to check the performance and memory usage of string concatenations.
The version with string additions looks like this:

  public static void main(String args[]) {
    long startMemory = Runtime.getRuntime().totalMemory();
    long startTime = System.currentTimeMillis();
    for (int i = 50000; i > 0; i--) {
      String base1 = "" + (char) (65 + (i % 26)); // assure we don't always start with the same letter
      String base2 = "" + (char) (65 + ((i+1) % 26)); // assure we don't always start with the same letter
      String calc = "";
      for (int j = 100; j > 0; j--) {
        calc = calc + base1 + base2;
        process(calc);
      }
    }
    long endTime = System.currentTimeMillis();
    long endMemory = Runtime.getRuntime().totalMemory();
    System.out.println("time spent " + (endTime - startTime) + "ms, memory used "+((endMemory-startMemory)/1024)+"kB");
  }

  private static void process(String value) {
    // we don't do anything here, as that may slow the timing too much, but need string!
    //System.out.println(value);
  }

The other versions are similar (the StringBuilder example is shown):

  public static void main(String args[]) {
    long startMemory = Runtime.getRuntime().totalMemory();
    long startTime = System.currentTimeMillis();
    for (int i = 50000; i > 0; i--) {
      String base1 = "" + (char) (65 + (i % 26)); // assure we don't always start with the same letter
      String base2 = "" + (char) (65 + ((i+1) % 26)); // assure we don't always start with the same letter
      String calc = "";
      for (int j = 100; j > 0; j--) {
        StringBuilder sb = new StringBuilder();
        sb.append(calc);
        sb.append(base1);
        sb.append(base2);
        calc = sb.toString();
        process(calc);
      }
    }
    long endTime = System.currentTimeMillis();
    long endMemory = Runtime.getRuntime().totalMemory();
    System.out.println("time spent " + (endTime - startTime) + "ms, memory used "+((endMemory-startMemory)/1024)+"kB");
  }

  private static void process(String value) {
    // we don't do anything here, as that may slow the timing too much, but need string!
    //System.out.println(value);
  }

I ran these tests on an Ubuntu system, using the Sun Java compiler/interpreter. All tests were run five times and the averages are shown.

Considering the advice to use StringBuilder instead of StringBuffer (as the latter is synchronized), I tested both.
I started out using javolution 3.7.10 for the TextBuilder case and was surprised by the results, so I also tried the latest version (5.3.1), hoping it would show improvements, only to be surprised again.


Sun Java 1.5.0.18

String addition                    5.542s   29.192MB
StringBuffer                       6.761s   18.841MB 
StringBuilder                      5.510s   24.768MB
javolution TextBuilder 3.7.10     30.724s    6.976MB
javolution TextBuilder 5.3.1       7.068s   25.638MB

Sun Java 1.6.0.14

String addition                    4.660s   35.084MB
StringBuffer                       5.678s   30.234MB
StringBuilder                      5.575s   26.846MB

I did only one run using javolution’s TextBuilder on Java 6, which indicated the same tendency as on Java 5.

Results:

When considering speed, the difference between string additions and StringBuilder is negligible on Java 5 (5.542s against 5.510s). However, string additions are actually faster on Java 6 (4.660s against 5.575s).
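
A likely explanation for the near-identical Java 5 numbers is that both tests end up doing much the same work: javac in Java 5 and 6 translates a non-constant string addition into StringBuilder.append calls, which can be checked with javap -c. A minimal sketch of the equivalence follows; the class and method names are made up for illustration and are not part of the measured tests.

```java
public class ConcatEquivalence {
    // What the string-addition test writes in its inner loop.
    static String withPlus(String calc, String base1, String base2) {
        return calc + base1 + base2;
    }

    // Roughly what a Java 5/6 javac generates for the expression above.
    static String withBuilder(String calc, String base1, String base2) {
        return new StringBuilder().append(calc).append(base1).append(base2).toString();
    }

    public static void main(String[] args) {
        String a = withPlus("ABAB", "A", "B");
        String b = withBuilder("ABAB", "A", "B");
        System.out.println(a + " " + b + " " + a.equals(b));
    }
}
```

Running it prints "ABABAB ABABAB true". The timing difference measured on Java 6 would then have to come from the runtime rather than from the source-level construct, but that is an assumption this test does not prove.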

When considering memory usage, javolution 3.7’s TextBuilder has a clear advantage (about 7MB against 19 to 29MB for the JDK variants), but with an almost sixfold speed penalty. The latest javolution loses this advantage entirely.

On Java 6, StringBuilder and StringBuffer are almost equally fast (5.575s against 5.678s), making the distinction largely irrelevant when there is no contention.

Surprisingly, the (synchronized) StringBuffer outperforms StringBuilder on memory usage on Java 5 (18.8MB against 24.8MB). On Java 6 this advantage is lost (30.2MB against 26.8MB), in exchange for a reduced speed penalty.
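
One caveat when reading these numbers: the StringBuilder test creates a fresh builder on every inner iteration and copies calc back into it, so it repeats the copying that string addition does. The usual advice is to keep one presized builder per string being assembled. Below is a sketch of that variant. It assumes a capacity of 200 characters (100 iterations of two one-character strings) and uses a hypothetical class name. It was not part of the measured runs, so no timing is claimed for it.

```java
public class ReusedBuilderTest {
    // Builds the same intermediate strings as the tests above, but appends to
    // a single presized builder instead of recreating it on every iteration.
    static String buildOnce(String base1, String base2, int rounds) {
        StringBuilder sb = new StringBuilder(rounds * (base1.length() + base2.length()));
        String calc = "";
        for (int j = rounds; j > 0; j--) {
            sb.append(base1).append(base2);
            calc = sb.toString(); // still one String per iteration, because process() needs it
            process(calc);
        }
        return calc;
    }

    static void process(String value) {
        // intentionally empty, as in the original tests
    }

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        String last = "";
        for (int i = 50000; i > 0; i--) {
            String base1 = "" + (char) (65 + (i % 26));
            String base2 = "" + (char) (65 + ((i + 1) % 26));
            last = buildOnce(base1, base2, 100);
        }
        long endTime = System.currentTimeMillis();
        System.out.println("time spent " + (endTime - startTime) + "ms, last length " + last.length());
    }
}
```

Because process() needs a String each time, toString() still copies the builder contents once per iteration. What this variant avoids is allocating a new builder and re-appending calc.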

10 Comments

  1. Gaurav says:

    The reason why StringBuffer and StringBuilder show almost equal performance on java 6 is lock elision.

  2. Casper Bang says:

    Personally I think the emphasis on bad String concatenation performance is a case of don’t-optimize-prematurely. It’s not often I experience a scenario that justifies using StringBuilder (StringBuffer if synchronization is needed).

    And btw. if you are going to use StringBuilder, you can gain a further 20%-30% in speed by pre-allocating the memory you need.

  3. Pekka Enberg says:

    Your microbenchmarks lack a warmup phase, for example, so the results are not reliable. You might want to check out Brian Goetz’s article on how to write microbenchmarks for Java before redoing the tests.

  4. Nick says:

    String addition is actually going to be faster in the most common case in which everything is added together in a single line. The compiler will optimize that, usually using a StringBuilder but in many cases it will combine the string literals. I’ve seen cases where people have gone as far as to create utility methods where you pass in an array of Strings and it will use a StringBuffer to add them together. Ironically there they are getting none of the StringBuffer advantages (the individual strings still have to be created) and a lot of extra disadvantages. In general, don’t try to outsmart the compiler. If you think of a cool way to optimize some relatively common code, then there is a good chance the compiler developers figured that out a long time ago.

  5. LCT says:

    The reason why string concatenation is fast is optimized bytecode. Look at the generated bytecode (javap -c) and you will see that the compiler is translating the code to StringBuilder (at least on 1.6).

    So unless you can predict what the compiler would do, you should never rely on this and effectively use StringBuilder instead.

  6. Kirk says:

    Jeroen Borger wrote a synchronization microbenchmark published on QCon (Do java 6 lock optimizations really work iirc). That bench was validated by myself, Cliff Click and a couple of others that have seen way too many broken benchmarks.

    I don’t have the time to properly dig through this bench but given the results and the conclusions I’m going to suggest that this bench isn’t answering the question you believe it is. For one, StringBuilder and StringBuffer share the same implementation. The only difference is the synchronized modifier on the method signatures. In the single threaded bench Jeroen clearly demonstrated that losing the synchronized keyword had a significant impact on performance in just about every case tested.
    You still cannot count on the compiler, HotSpot and the JIT to correct inefficient code. It does a pretty decent job most of the time but just as often it doesn’t manage to do what we’d hope it would.

    Most StringBuffer usage should really be StringBuilder. The compiler isn’t going to make that switch for you. Biased locking may or may not kick in as you’d hope it would thus making StringBuffer about the same cost as StringBuilder.

    Short story, I would always recommend using a presized StringBuilder and only back away from presizing when it isn’t practical. This is true for the 1.6 and will most likely be true for the 1.7.

    One last thing: if you are going to report averages, do us a favor and report variance. The variance in this benchmark should be close to 0. If not, it’s yet another sign that the bench is broken. Variance of 0 doesn’t mean the bench is answering the question asked of it but it’s a sign that the behavior is stable and that you’re not measuring random effects.

    If you want to learn how to do this right, take my performance tuning course! Gosh, that sounds too commercial 😉 Ok, read Cliff Click’s blog instead.

  7. joachim says:

    @Gaurav, I already assumed Java 6 was doing the synchronization better.

    @LCT This makes it only more remarkable that the automatic conversion etc is actually making it faster than directly using StringBuilder. Though admittedly the contrived test may make it possible for the compiler to do some extra optimizations.

    @Kirk Fortunately the test indeed always makes StringBuilder faster than StringBuffer. I would have very much doubted the results otherwise. The difference caused by the synchronization is well known.
    I would indeed not be surprised that some (for me) unexpected effects taint the result of this test.

  8. Carl says:

    The history of Java is full of people who optimize (or benchmark) the wrong phenomenon – and end up surprised. This post is another case in point.

    As others have pointed out, the compiler will reformulate a one-liner concatenating a small number of Strings into the corresponding construct using StringBuilder. For such cases, ‘+’ is the better alternative simply because it’s more readable.

    The poster child for StringBuilder is the case of assembling a lengthy string by piecewise addition of small chunks:

    String a = “”; for (int j=0; j

  9. Carl says:

    There’s a character limit on comments???

    String a = “”; for (int j=0; j

  10. Carl says:

    Oh, it’s the < sign.

    I’m not gonna brave the comment filter again. Build a line of code that concatenates 1000 or 10000 asterisks one at a time into a String and you’ll see what I mean.

    THIS is what you should have tested to demonstrate the performance benefit of StringBuilder or some other 3rd party lib.
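
The scenario described in prose in the last comment can be sketched as follows. This is not the commenter's original snippet, which the comment filter cut off. The class name, the method names and the count of 10000 are illustrative assumptions, and a single timed run without a warm-up phase gives only a rough indication.

```java
public class AsteriskTest {
    // Appends one asterisk at a time with string addition; every step copies
    // the whole string built so far.
    static String withAddition(int count) {
        String s = "";
        for (int j = 0; j < count; j++) {
            s = s + "*";
        }
        return s;
    }

    // Same result with a single presized StringBuilder that is filled in place.
    static String withBuilder(int count) {
        StringBuilder sb = new StringBuilder(count);
        for (int j = 0; j < count; j++) {
            sb.append('*');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int count = 10000;
        long t0 = System.nanoTime();
        String a = withAddition(count);
        long t1 = System.nanoTime();
        String b = withBuilder(count);
        long t2 = System.nanoTime();
        System.out.println("addition " + (t1 - t0) / 1000 + "us, builder " + (t2 - t1) / 1000
            + "us, equal " + a.equals(b));
    }
}
```

The difference from the tests in the post is that here no intermediate String is needed, so the builder version can skip toString() until the very end.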
