Replace Multiple Strings Effectively

Update

Please see the comments below and be aware that this article is completely wrong. I also recommend reading a StackOverflow post about this subject and following the links there.

Apologies for any confusion.


Recently, I had to remove occurrences of several different substrings from a string. There is a nifty little method on the String class called Replace that lets you replace one string with another. But is it also suitable for multiple runs?

So, let’s write an extension method for it.

public static string Replace(this string value, IEnumerable<Tuple<string, string>> toReplace)
{
    string result = value;
    foreach (var item in toReplace)
    {
        // Chain on the running result; calling value.Replace here would keep only the last replacement.
        result = result.Replace(item.Item1, item.Item2);
    }
    return result;
}

What could possibly go wrong? Unfortunately, strings are immutable, so every call to Replace that changes something allocates a new string instance, and in some cases that could become a performance bottleneck.
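To make the allocation point concrete, here is a minimal sketch that checks whether Replace hands back the same string object or a new one. The class and method names are hypothetical and exist only for this illustration. The sketch relies on the documented behaviour that Replace returns the current instance unchanged when the old value is not found.

```csharp
using System;

public static class ReplaceAllocationDemo
{
    // Hypothetical helper: true when Replace returned the very same string object,
    // false when it had to build a new string.
    public static bool ReturnsSameInstance(string input, string oldValue, string newValue)
    {
        string replaced = input.Replace(oldValue, newValue);
        return ReferenceEquals(input, replaced);
    }
}
```

Calling ReturnsSameInstance("Lorem ipsum", "Other", "") should report the same instance, because nothing matched, whereas replacing "Lorem" has to produce a new string. In a chain of replacements, every step that matches therefore creates one more intermediate string.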

Now, let’s write another extension method, this time with the help of the StringBuilder class.

public static string ReplaceWithStringBuilder(this string value, IEnumerable<Tuple<string, string>> toReplace)
{
    var result = new StringBuilder(value);
    foreach (var item in toReplace)
    {
        result.Replace(item.Item1, item.Item2);
    }
    return result.ToString();
}
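Both approaches apply the pairs one after another, so a later pair also sees text produced by an earlier one. The following sketch can be used to check that the two variants agree on a given input. The class and method names are hypothetical, chosen for this illustration only, and the methods are written as plain static helpers rather than extension methods to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MultiReplaceSketch
{
    // Sequential replacement with String.Replace; each matching step yields a new string.
    public static string ReplaceAll(string value, IEnumerable<Tuple<string, string>> pairs)
    {
        string result = value;
        foreach (var pair in pairs)
        {
            result = result.Replace(pair.Item1, pair.Item2);
        }
        return result;
    }

    // The same sequence of replacements, performed on one mutable StringBuilder.
    public static string ReplaceAllWithBuilder(string value, IEnumerable<Tuple<string, string>> pairs)
    {
        var builder = new StringBuilder(value);
        foreach (var pair in pairs)
        {
            builder.Replace(pair.Item1, pair.Item2);
        }
        return builder.ToString();
    }
}
```

For example, with the pairs ("a", "b") and ("b", "c"), the input "ab" becomes "cc" in both helpers, because the second pair also rewrites the "b" introduced by the first.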

To compare the performance, we create a simple benchmark: we take one string and try to replace five other strings in it with empty ones. We repeat the call a million times and measure the whole run with the Stopwatch class.

var watch = new Stopwatch();
watch.Start();

for (int i = 0; i < 1000000; i++)
{
    string result = "Lorem ipsum dolor sit amet".Replace(new [] 
    { 
        Tuple.Create("Lorem", String.Empty), 
        Tuple.Create("ipsum", String.Empty), 
        Tuple.Create("dolor", String.Empty), 
        Tuple.Create("Other", String.Empty), 
        Tuple.Create("New", String.Empty), 
    });
}

watch.Stop();
Console.WriteLine("{0} ms with String Replace.", watch.ElapsedMilliseconds);
watch.Restart();

for (int i = 0; i < 1000000; i++)
{
    string result = "Lorem ipsum dolor sit amet".ReplaceWithStringBuilder(new[]
    { 
        Tuple.Create("Lorem", String.Empty), 
        Tuple.Create("ipsum", String.Empty), 
        Tuple.Create("dolor", String.Empty), 
        Tuple.Create("Other", String.Empty), 
        Tuple.Create("New", String.Empty), 
    });
}

watch.Stop();
Console.WriteLine("{0} ms with StringBuilder.", watch.ElapsedMilliseconds);
Console.ReadKey();

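Let’s take a look at the results: the Replace method took 710 ms to run the test, while the StringBuilder needed only 8 ms. These figures should be read with caution, because they appear to come from a run in which the StringBuilder loop executed 10,000 iterations against a million for Replace, so the two timings are not directly comparable.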
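Timings like these are sensitive to loop counts and warm-up, so it helps to run both variants through the same measuring code. The following is a minimal sketch of such a harness, not a rigorous benchmark. The names BenchmarkSketch, MeasureMilliseconds and Run are hypothetical, and the replacement pairs are built once outside the timed loops so that array allocation is not measured.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class BenchmarkSketch
{
    // Hypothetical helper: times `iterations` calls of `action` after one untimed warm-up call.
    public static long MeasureMilliseconds(int iterations, Action action)
    {
        action(); // warm-up, so JIT compilation is not part of the measurement

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Run(int iterations)
    {
        const string input = "Lorem ipsum dolor sit amet";

        // Created once and shared by both variants.
        var pairs = new[]
        {
            Tuple.Create("Lorem", String.Empty),
            Tuple.Create("ipsum", String.Empty),
            Tuple.Create("dolor", String.Empty),
            Tuple.Create("Other", String.Empty),
            Tuple.Create("New", String.Empty),
        };

        long withString = MeasureMilliseconds(iterations, () =>
        {
            string result = input;
            foreach (var pair in pairs)
            {
                result = result.Replace(pair.Item1, pair.Item2);
            }
        });

        long withBuilder = MeasureMilliseconds(iterations, () =>
        {
            var builder = new StringBuilder(input);
            foreach (var pair in pairs)
            {
                builder.Replace(pair.Item1, pair.Item2);
            }
            builder.ToString();
        });

        Console.WriteLine("{0} ms with String Replace.", withString);
        Console.WriteLine("{0} ms with StringBuilder.", withBuilder);
    }
}
```

Calling BenchmarkSketch.Run(1000000) gives both variants the same number of iterations. The actual numbers will depend on the runtime, the machine and the input, so none are quoted here.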

StringBuilder Replace is faster than String Replace

To sum up, the Replace method on string is handy when you are dealing with only a few replacements. If you need to replace many strings and do it repeatedly, or you work in a performance-critical section, it is better to go with the StringBuilder.

