The difference between dt2 and dt3 is due to

dateTime :> IComparable<DateTime>

being several times slower than

dateTime :> IComparable

i.e. boxing a DateTime into an IComparable is much faster than boxing it into an IComparable<DateTime>. This difference is due to the JIT generating significantly slower code for boxing structs into instances of generic types. There's no obvious reason why boxing a struct into a "generic" type instance should be slower than boxing it into a "non-generic" one (in both cases the JIT should only need to compute the necessary information for constructing the boxed instance once), so this is either due to an oversight or due to a design limitation of the current JIT implementation (probably the latter). This also explains the differences between the 32-bit and the 64-bit JIT.

dt4 is so much faster because the CompareTo call is a direct instance method call which doesn't go through the IComparable interface and hence doesn't require boxing.
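
For readers without the benchmark at hand, here is a rough sketch of the three call shapes being compared. It assumes dt2, dt3 and dt4 correspond to the non-generic upcast, the generic upcast and the direct call; the function names are made up for illustration and this is not the original test code.

```fsharp
open System

let a = DateTime(2010, 7, 1)
let b = DateTime(2010, 7, 2)

// Upcast to the non-generic interface: x is boxed by the upcast,
// and y is boxed as well because CompareTo takes an obj.
let viaIComparable (x : DateTime) (y : DateTime) =
    (x :> IComparable).CompareTo(y)

// Upcast to the generic interface: x is boxed by the upcast,
// y is passed as a plain DateTime.
let viaGenericIComparable (x : DateTime) (y : DateTime) =
    (x :> IComparable<DateTime>).CompareTo(y)

// Direct instance method call on the struct: no upcast, no boxing.
let direct (x : DateTime) (y : DateTime) =
    x.CompareTo(y)
```

All three return the same ordering for the same inputs; only the way the call is dispatched differs.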

I've updated the post.

By on 7/2/2010 6:45 AM ()

i.e. boxing a DateTime into an IComparable is much faster than boxing it into an IComparable<DateTime>. I haven't checked whether this is due to a special optimization for IComparable or because boxing to a "generic" type is generally slower (the latter seems unlikely, but sometimes you just hit on code paths in the JIT which the implementers didn't deem important enough to optimize).

I believe (from reading the ECMA IL reference) that it is rather due to the generic variant being unoptimized. In particular, as I said above, when the cast's destination type is generic, the runtime has more work to do to check whether the two types match.

But of course, it all depends on the actual CLR implementation.

By on 7/2/2010 7:10 AM ()

Yeah, I just tested this and while you were writing that post I updated mine.

I probably underestimate the complexity and real-world design constraints behind the implementation of the JIT, but I don't understand why generic types are handled so inefficiently, at least in this case.

By on 7/2/2010 7:28 AM ()

I don't have FSI ready so I can't test it out now, but here are some things that should remedy your perf issues in cases such as #3, where an interface is implemented explicitly on a struct (which is all F# allows) and you have to upcast explicitly, which causes boxing.

First, get rid of the boxing of the structs from the explicit upcast by picking one of the subtype-constrained helpers below, which call CompareTo directly; they both mean the same thing.

Then make sure you test generic code by compiling with FSC and not in FSI; sometimes FSI produces slower code.

// Alternative 1: flexible-type (#) syntax
let compareOpt (x : #System.IComparable<'T>) (y : #System.IComparable<'T>) = x.CompareTo(y)

// Alternative 2: explicit subtype constraint on 'T
let compareOpt<'T when 'T :> System.IComparable<'T>> (x : 'T) (y : 'T) = x.CompareTo(y)

#1 and #2 will always call the non-specialized IComparable.CompareTo that uses casts that cause boxing like Brian says.

As for #4, I guess they have implemented System.IComparable<'T> as a regular implicit interface that you can call directly without boxing, so issue #3 shouldn't apply here.
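
A small usage sketch of the constrained-helper idea, untested against the original benchmark. The name compareConstrained is hypothetical; the point is only that the caller passes the struct as-is, with no explicit upcast at the call site.

```fsharp
open System

// 'T must itself implement IComparable<'T>, so CompareTo can be
// called on the value without an explicit upcast by the caller.
let compareConstrained<'T when 'T :> IComparable<'T>> (x : 'T) (y : 'T) =
    x.CompareTo(y)

let earlier = DateTime(2010, 7, 1)
let later = DateTime(2010, 7, 2)

let dateResult = compareConstrained earlier later   // negative: earlier < later
let intResult = compareConstrained 3 3              // zero: equal values
```

Whether this actually avoids the slowdown should be measured in an FSC-compiled build, as noted above.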

By on 7/2/2010 6:06 AM ()

It is useful to look at the source code for FSharp.Core (prim-types.fs). My general rule of thumb is that the F# '>' and '<' operators are 'slow', since they work for all types T and do a lot of reasoning/logic at runtime to figure out how to dispatch the call.

(Recall that in F# you can write generic algorithms that use '<' and are generic for all types T where T is comparable. Now consider what code must go into the '<' operator to make such a generic algorithm work for structural comparison of arrays, types that implement IComparable, types that only implement IComparable<T>, or IStructuralComparable, or native pointers which don't implement any interfaces but which are nonetheless considered comparable for some reason, or ... The moral is that '<' must take a very abstract view of types. I guess ideally the compiler would use static analysis to notice call sites where the types are known and constant-fold the compile-time types through the type-dispatch logic to create static dispatch, but that's an incredibly deep optimization.)
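
To make the "generic for all comparable T" point concrete, here is a minimal sketch. The helper maxOf is made up for illustration; it is compiled once, yet the same '<' has to do the right thing for ints, arrays and DateTime values.

```fsharp
open System

// Hypothetical helper: one compiled body serves every comparable 'T,
// so the '<' inside cannot assume anything about the concrete type.
let maxOf<'T when 'T : comparison> (xs : 'T list) : 'T =
    List.reduce (fun a b -> if a < b then b else a) xs

let largestInt = maxOf [3; 1; 2]
let largestArray = maxOf [ [|1; 2|]; [|1; 3|] ]   // structural array comparison
let largestDate = maxOf [ DateTime(2010, 7, 2); DateTime(2010, 7, 1) ]
```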

Also, I am just guessing, based on my own experience and reasoning and looking at the code and profiling data; I don't know a ton about the murkiest depths of the compiler or the CLR or the JIT to say much else.

By on 7/2/2010 5:58 AM ()

I knew that F# comparers were more than just calling op_GreaterThan :).

I looked at the source for the compare functions and I was thinking it was just a matter of having another "static optimization conditional" like "when 'T : IComparable" in the Generic...Fast functions, but I guess it's more complicated than that.

Meanwhile, in a way I'm quite pleased. The code in question sniffs, parses and MD5s packets off the network. To get a 4% perf boost just by changing a > to a CompareTo is pretty neat!
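
The change amounts to something like the following sketch. The function names are invented here and the real packet-handling code is not shown; both versions give the same answer, and the only difference is that the second calls DateTime.CompareTo directly instead of going through the generic '>' operator.

```fsharp
open System

// Before: generic F# comparison operator.
let isLaterGeneric (x : DateTime) (y : DateTime) = x > y

// After: direct call to the struct's own CompareTo.
let isLaterDirect (x : DateTime) (y : DateTime) = x.CompareTo(y) > 0
```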

By on 7/2/2010 10:29 AM ()

Mhh the difference in results between #2 and #3 is interesting, considering that the non-generic IComparable version has one more type check to do (and apart from that, they both just compare the tick count).
So it must be a cast issue.

Edit: both casts use the same IL instruction castclass.

The castclass instruction performs the same type check, but instead of pushing null onto the stack if the type check fails, it throws an InvalidCastException object.

I wonder if the castclass implementation can account for such a performance difference in case of a generic type argument. I guess it's possible since the runtime must check if the actual generic type matches the formal type, by checking type argument arity etc.

Try doing the cast once, and just iterate over CompareTo()...
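
Something along these lines, as a sketch only: the function name and loop shape are assumptions, not the original test. The upcast happens once before the loop, so the loop body measures only the interface call.

```fsharp
open System

// Hypothetical helper: upcast once, then call CompareTo n times.
let compareManyTimes (n : int) (x : DateTime) (y : DateTime) =
    // The cast (and the boxing it implies) happens once, outside the loop.
    let boxed = x :> IComparable<DateTime>
    let mutable acc = 0
    for _ in 1 .. n do
        acc <- acc + boxed.CompareTo(y)
    acc

let total = compareManyTimes 1000 (DateTime(2010, 7, 1)) (DateTime(2010, 7, 2))
```

If the gap between the generic and non-generic versions disappears with this shape, the cost is in the cast rather than in the CompareTo call itself.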

By on 7/2/2010 3:10 AM ()