.Net 4.0 StringBuilder implementation change : StringBuilder is now a linked list


An interesting change in the .Net 4.0 framework is the new System.Text.StringBuilder implementation. The .Net 1.0-era version has been ditched (it was basically an unsafe string* with an exponential memory allocation strategy on append).


The new implementation is actually a linked list of StringBuilder instances, with the current instance pointing to the last part of the string. This strategy favors insertion speed as the list always point to the latest piece of data that was inserted (it's a reversed linked list if you will).

Let's dig deeper in the implications of the .Net 4.0 System.Text.StringBuilder implementation.





ToString implementation
The implementation of the StringBuilder.ToString method just creates an unsafe string pointer, then populates the string from end to beginning by walking each "Chunk" in the linked list.

Benchmark : StringBuilder in .Net 2.0 / 3.5 vs StringBuilder in .Net 4.0
Here's the benchmarking code:

using System;
using System.Diagnostics;
using System.Text;

namespace OneKStrongOxen
{
 internal class StringBuilderDemo
 {
  private static void Main(string[] args)
  {
   Console.Out.WriteLine("Runtime version :" + Environment.Version);

   Console.Out.WriteLine("--Memory tests--");
   for (int i = 0; i < 1024; i += 64)
    TestMem(i);

   Console.Out.WriteLine("--Append tests--");
   TestPerf(1024, "12345", "append", (x, y) => x.Append(y));
   TestPerf(1024, "12345", "head", (x, y) => x.Insert(0, y));
   TestPerf(1024, "12345", "mid", (x, y) => x.Insert(x.Length / 2, y));

  }

  private static void TestPerf(int iterations, 
                               string toAdd, 
                               string testName, 
                               Action<StringBuilder, string> action)
  {
   StringBuilder sb = new StringBuilder();

   // dry run so the JIT will have time to run
   TimeSpan total=new TimeSpan();
   int dryRuns = 100;
   for (int run = dryRuns; run >= 0; run--)
   {
    GC.Collect();
    var v = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
    {
     action(sb, toAdd);
    }
    v.Stop();
    total=total.Add(v.Elapsed);
    if (run == 0) // only sample the last run
    {
     Console.Out.WriteLine("{0,-5} {1} of '{2}' in {3,-5}ms - total for {4} runs : {5,-7}ms", 
                           iterations, testName, 
                           toAdd, v.Elapsed.TotalMilliseconds, 
                           dryRuns, total.TotalMilliseconds);
    }
   }
  }

  private static void TestMem(int max)
  {
   var startMem = GC.GetTotalMemory(true);
   StringBuilder sb = new StringBuilder();
   for (int i = 0; i < max; i++)
   {
    sb.Append("123456");
   }
   var delta = GC.GetTotalMemory(true) - startMem;

   Console.Out.WriteLine("Length : {0,5} - delta M : {1,5} bytes", 
                         sb.Length, delta);
  }
 }
}
.Net 2.0 - RELEASE output
Runtime version :2.0.50727.3082
--Memory tests--
Length :     0 - delta M :    72 bytes
Length :   384 - delta M :  1064 bytes
Length :   768 - delta M :  2088 bytes
Length :  1152 - delta M :  4136 bytes
Length :  1536 - delta M :  4136 bytes
Length :  1920 - delta M :  4136 bytes
Length :  2304 - delta M :  8232 bytes
Length :  2688 - delta M :  8232 bytes
Length :  3072 - delta M :  8232 bytes
Length :  3456 - delta M :  8232 bytes
Length :  3840 - delta M :  8232 bytes
Length :  4224 - delta M : 16424 bytes
Length :  4608 - delta M : 16424 bytes
Length :  4992 - delta M : 16424 bytes
Length :  5376 - delta M : 16424 bytes
Length :  5760 - delta M : 16424 bytes
--Append tests--
1024  append of '1234' in 0.0224ms - total for 100 runs : 2.6472 ms
1024  insert head of '1234' in 93.4164ms - total for 100 runs : 4707.8472ms
1024  insert mid of '1234' in 46.9041ms - total for 100 runs : 2248.8582ms
--ToString tests--
128   ToString() on string length 402 in 0.0002ms - total for 100 runs : 0.0209 ms

.Net 4.0 - RELEASE output

Runtime version :4.0.21006.1
--Memory tests--
Length :     0 - delta M :    72 bytes
Length :   384 - delta M :  1264 bytes
Length :   768 - delta M :  2328 bytes
Length :  1152 - delta M :  4416 bytes
Length :  1536 - delta M :  4416 bytes
Length :  1920 - delta M :  4416 bytes
Length :  2304 - delta M :  8552 bytes
Length :  2688 - delta M :  8552 bytes
Length :  3072 - delta M :  8552 bytes
Length :  3456 - delta M :  8552 bytes
Length :  3840 - delta M :  8552 bytes
Length :  4224 - delta M : 16784 bytes
Length :  4608 - delta M : 16784 bytes
Length :  4992 - delta M : 16784 bytes
Length :  5376 - delta M : 16784 bytes
Length :  5760 - delta M : 16784 bytes
--Append tests--
1024  append of '1234' in 0.0121ms - total for 100 runs : 1.5007 ms
1024  insert head of '1234' in 188.0165ms - total for 100 runs : 9541.9805ms
1024  insert mid of '1234' in 172.2325ms - total for 100 runs : 8673.9825ms
--ToString tests--
128   ToString() on string length 402 in 0.0005ms - total for 100 runs : 0.036  ms
Press any key to continue . . .

Conclusions
The .Net Framework V2.0 implementation of System.Text.StringBuilder is optimized towards conservative memory use and a high performance ToString() implementation, whereas the .Net 4.0 version of StringBuidler is built for high-performance Append operations. The .Net 4.0 framework StringBuilder.Append is approximatively 1.7 times faster than the 2.0 version.
The ToString calls are virtually free for each compared to the time it takes for an append operation, although the 2.0 framework edges-out 4.0 due to the fact that it's actually a wrapper around an unsafe string pointer in 2.0 and a linked list in 4.0.
For any other insertion scenario, once again, 2.0 wins hand-up against 4.0, but one has to consider the fact that real-world uses of StringBuilder reveal that the most likely use case is multiple Append calls followed by a final ToString() call. On paper, .Net 4.0's implementation is slower than .Net 2.0's, but in fine, overall performance is much nicer in 4.0.

This kind of real-world profiling and performance tweaking is pervasive throughout the 4.0 Framework and make it a really interesting upgrade. Now go on and have fun fixing all these StringBuidler.Insert optimizations you added in your 2.0-era code!

2 comments:

  1. What am I not seeing correctly? It looks to me like the 4.0 version uses more memory every time.

    in the 1st test 4.0 (0.0121ms - total for 100 runs : 1.5007 ms) is slightly faster than 2.0 (0.224ms - total for 100 runs : 2.6472ms)

    in the 2nd test 4.0 (188.0165ms - total for 100 runs : 9541.9805ms) is more than 2x *slower* than 2.0 (93.4164ms - total for 100 runs : 4707.8472ms)

    in the 3rd test 4.0 (172.2325ms - total for 100 runs : 8673.9825ms) is almost 4x *slower* than 2.0 (46.904ms - total for 100 runs : 2284.8582ms)

    So the change in 4.0 was to make StringBuilder a tiny bit faster in small use cases but significantly slower when bigger strings are packed into it?

    ReplyDelete
  2. You are correct, the change is to make it a tiny bit faster when appending. All other cases are worse.

    And this is a concious decision. Why? Well think about how you typically use StringBuilder: by repeatedly calling Append, our case #1, and very rarely calling Insert (case 2 for index 0 and 3 for index >0 and < Length).

    StringBuilder should be renamed StringAppender :)

    ReplyDelete

Please leave your comments in English or French and I will be pleased to answer them if you have any questions.

Spammers will be walked down the plank matey. Arrr!