Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - Why is OfType<> faster than Cast<>?

In answer to the following question: How to convert MatchCollection to string array

Given the two LINQ expressions:

var arr = Regex.Matches(strText, @"[A-Za-z-']+")
    .OfType<Match>() //OfType
    .Select(m => m.Groups[0].Value)
    .ToArray();

and

var arr = Regex.Matches(strText, @"[A-Za-z-']+")
    .Cast<Match>() //Cast
    .Select(m => m.Groups[0].Value)
    .ToArray();

OfType<> was benchmarked by user Alex to be slightly faster (and confirmed by myself).

This seems counterintuitive to me, as I'd have thought OfType<> would have to do both an 'is' comparison, and a cast (T).

Any enlightenment would be appreciated as to why this is the case :)



1 Reply


My benchmarking does not agree with your benchmarking.

I ran an identical benchmark to Alex's and got the opposite result. I then tweaked the benchmark somewhat and again observed Cast being faster than OfType.

There's not much in it, but I believe that Cast does have the edge, as it should, because its iterator is simpler. (No 'is' check.)
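To make the "simpler iterator" point concrete, here is a minimal sketch of the per-element work each operator does. MyOfType and MyCast are hypothetical names for illustration, not the framework's actual source, and the real implementations include argument checks and other details omitted here.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class IteratorSketch
{
    // Hypothetical stand-in for OfType<T>: a type test followed by a cast.
    // Items that are not a T (including nulls) are silently skipped.
    public static IEnumerable<T> MyOfType<T>(IEnumerable source)
    {
        foreach (object item in source)
        {
            if (item is T) yield return (T)item;
        }
    }

    // Hypothetical stand-in for Cast<T>: an unconditional cast.
    // An item of the wrong type throws InvalidCastException.
    public static IEnumerable<T> MyCast<T>(IEnumerable source)
    {
        foreach (object item in source)
        {
            yield return (T)item;
        }
    }
}
```

In this sketch the only difference per item is the extra 'is' test in MyOfType, which is why I would expect Cast to come out slightly ahead when every element is already the right type.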

Edit: Actually after some further tweaking I managed to get Cast to be 50x faster than OfType.

Below is the code of the benchmark that gives the biggest discrepancy I've found so far:

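using System;
using System.Diagnostics;
using System.Linq;

Stopwatch sw1 = new Stopwatch(); // accumulates OfType time
Stopwatch sw2 = new Stopwatch(); // accumulates Cast time

// Build the input once, up front, so string generation is not timed.
var ma = Enumerable.Range(1, 100000).Select(i => i.ToString()).ToArray();

// Warm-up: run each operation once so JIT compilation is not timed.
var x = ma.OfType<string>().ToArray();
var y = ma.Cast<string>().ToArray();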

for (int i = 0; i < 1000; i++)
{
    if (i%2 == 0)
    {
        sw1.Start();
        var arr = ma.OfType<string>().ToArray();
        sw1.Stop();
        sw2.Start();
        var arr2 = ma.Cast<string>().ToArray();
        sw2.Stop();
    }
    else
    {
        sw2.Start();
        var arr2 = ma.Cast<string>().ToArray();
        sw2.Stop();
        sw1.Start();
        var arr = ma.OfType<string>().ToArray();
        sw1.Stop();
    }
}
Console.WriteLine("OfType: " + sw1.ElapsedMilliseconds.ToString());
Console.WriteLine("Cast: " + sw2.ElapsedMilliseconds.ToString());
Console.ReadLine();

Tweaks I've made:

  • Perform the "generate a list of strings" work once, at the start, and "crystallize" it.
  • Perform one of each operation before starting timing - I'm not sure if this is necessary but I think it means the JITter generates code beforehand rather than while we're timing?
  • Perform each operation multiple times, not just once.
  • Alternate the order in case this makes a difference.

On my machine this results in ~350ms for Cast and ~18000ms for OfType.

I think the biggest difference is that we're no longer timing how long MatchCollection takes to find the next match. (Or, in my code, how long int.ToString() takes.) This drastically improves the signal-to-noise ratio.

Edit: As sixlettervariables pointed out, the reason for this massive difference is that Cast will short-circuit and not bother casting individual items if it can cast the whole IEnumerable to IEnumerable<T>. When I switched from using Regex.Matches to an array to avoid measuring the regex processing time, I also switched to using something castable to IEnumerable<string> and thus activated this short-circuiting. When I altered my benchmark to disable this short-circuiting, I got a slight advantage to Cast rather than a massive one.
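A quick way to see the short-circuit is to check whether Cast hands back the very object it was given. The sketch below is my own illustration: ShortCircuitSketch and CastReturnsSource are hypothetical names, and using an object[] to defeat the shortcut is an assumption, not necessarily how the benchmark above was altered.

```csharp
using System;
using System.Collections;
using System.Linq;

static class ShortCircuitSketch
{
    // True when Cast<string>() returned the source object itself
    // instead of wrapping it in a per-item iterator.
    public static bool CastReturnsSource(IEnumerable source)
    {
        return ReferenceEquals(source.Cast<string>(), source);
    }

    public static void Main()
    {
        string[] typed = { "1", "2", "3" };
        // Same contents, but an object[] is not an IEnumerable<string>,
        // so Cast has to fall back to casting item by item.
        object[] untyped = { "1", "2", "3" };

        Console.WriteLine(CastReturnsSource(typed));
        Console.WriteLine(CastReturnsSource(untyped));
    }
}
```

On the framework versions I know of, this should print True for the string[] and False for the object[]. A fairer benchmark input would therefore be something like the object[] here, where both operators really do visit every element.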

