Friday, April 27, 2007

Misleading performance comparisons: C# vs. Java

In the blog post The Ultimate Java Versus C# Benchmark, the author tried to prove that Java outperforms C# in a real-life benchmark.
The benchmark actually measures the performance of regular expressions in the two languages (or rather, in their frameworks).
He concludes the post by saying that the C# version failed to produce any results at all, consuming memory until an OutOfMemoryException was thrown.
I tried both code samples (C# on .NET 2.0, Java on the Java SE runtime version 1.6).
The Java version worked as described in the blog post (it took about 4031 milliseconds on my PC); the C# version ran for so long that I had to terminate it.
I noticed that the whole benchmark is about comparing these lines of code:
Regex regexpr = new Regex(matchthis, RegexOptions.Compiled);
Boolean b = regexpr.IsMatch(_doc);

Pattern regexpr = Pattern.compile(matchthis);
Matcher matcher = regexpr.matcher(_doc);
boolean b = matcher.find();

When I changed the C# code to:
Regex regexpr = new Regex(matchthis);
Boolean b = regexpr.IsMatch(_doc);

Execution took about 14672 milliseconds, which is acceptable, even if a lot longer than the Java execution time. Note that this version does not use RegexOptions.Compiled, so the regular expression is not compiled.

When I removed the regular expression matching from both versions of the code, the results were:
C#: 891 milliseconds
Java: 1750 milliseconds

See the difference!!
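
One way to see how much of a total run time comes from the regular expression step is to time the phases separately. The sketch below is a hypothetical Java harness: SplitTimingSketch, buildDocument and elapsedMillis are made-up names, and the workload is a stand-in, not the code from the original benchmark. Single-shot timings like this are noisy on a JIT-compiled runtime, so the numbers should be treated as rough.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: times a non-regex phase and a regex phase separately.
final class SplitTimingSketch {

    // Stand-in for the non-regex work of a benchmark: builds a long document.
    static String buildDocument(int words) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            sb.append("word").append(i).append(' ');
        }
        return sb.toString();
    }

    // Runs the given work once and returns the elapsed wall-clock time in milliseconds.
    static long elapsedMillis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Phase 1: everything except the regular expression.
        long buildMs = elapsedMillis(() -> buildDocument(100_000));

        // Phase 2: only the regular expression search, on a prebuilt document.
        String doc = buildDocument(100_000);
        Pattern pattern = Pattern.compile("word99999\\b");
        long regexMs = elapsedMillis(() -> pattern.matcher(doc).find());

        System.out.println("Without regex: " + buildMs + " milliseconds");
        System.out.println("Regex only: " + regexMs + " milliseconds");
    }
}
```

Reporting the two phases side by side makes it clear whether a difference between two runtimes comes from the regular expression engine or from the rest of the code.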

  1. It's unfair to compare two versions of code written in a way that is optimized for one language but not for the other. In .NET, compiling a regular expression pays off only when you are going to reuse it.
  2. A single point of comparison (regular expressions in this case) is not a valid measurement when you compare two huge frameworks like .NET and Java.
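
The reuse point in item 1 can be illustrated in Java, where Pattern.compile is the step worth doing once. This is only a sketch with hypothetical names (RegexReuseSketch, countMatchesRecompiling, countMatchesReusing); it is not code from the original benchmark, and Pattern.compile is not the same mechanism as .NET's RegexOptions.Compiled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the same search written with and without pattern reuse.
final class RegexReuseSketch {

    // Compiles the pattern for every document, so the compilation cost is paid each time.
    static int countMatchesRecompiling(String regex, String[] docs) {
        int hits = 0;
        for (String doc : docs) {
            Matcher matcher = Pattern.compile(regex).matcher(doc);
            if (matcher.find()) {
                hits++;
            }
        }
        return hits;
    }

    // Compiles the pattern once and reuses it for every document.
    static int countMatchesReusing(String regex, String[] docs) {
        Pattern pattern = Pattern.compile(regex);
        int hits = 0;
        for (String doc : docs) {
            if (pattern.matcher(doc).find()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] docs = {"alpha 123", "beta", "gamma 7"};
        System.out.println(countMatchesRecompiling("\\d+", docs));
        System.out.println(countMatchesReusing("\\d+", docs));
    }
}
```

Both methods return the same count; they differ only in how often the compilation cost is paid, which is why an up-front compilation step is hard to justify when a pattern is used just once.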
I'm not trying to defend C#, and I'm not claiming that .NET outperforms Java in general. I just suggest that performance comparisons should not be carried out by biased people. A test should not be optimized for one language and not the other.


Anonymous said...

True. Such benchmarks are most of the time biased in some way. It's a shame.

Radek Petrik said...

You're absolutely right. The post The Ultimate Java Versus C# Benchmark you are talking about made me angry. It has been there since 2003-08-17, and I think that the author, Carlos E. Perez, must already know about the mistake he made. Why is he not correcting or deleting the post? That's totally wrong.

x4m said...

performance tests based on regex?
i'm not sure, but there are different models, nfa & dfa...
googling for performance comparison i wanted to explore instruction level performance.