The difference between dt2 and dt3 is due to
1
dateTime :> IComparable<DateTime>
being several times slower than
1
dateTime :> IComparable
i.e. boxing a DateTime into a IComparable is much faster than boxing it into a IComparable<DateTime>. This difference is due to the JIT generating significantly slower code for boxing structs into instances of generic types. There's no obvious reason why boxing a struct into a "generic" type instance should be slower than boxing it into a "non-generic" one (in both cases the JIT should only need to compute the necessary information for constructing the boxed instance once), so this is either due to an oversight or due to a design-limitation of the current JIT implementation (probably the latter). This also explains the differences between the 32-bit and the 64-bit
JIT.
dt4 is so much faster because the CompareTo call is a direct instance method call which doesn't go through the IComparable interface and hence doesn't require boxing.
I've updated the post.
i.e. boxing a DateTime into a IComparable is much faster than boxing it into a IComparable<DateTime>. I haven't checked whether this is due to a special optimization for IComparable or because boxing to a "generic" type is generally slower (the latter seems unlikely, but sometimes you just hit on code paths in the JIT which the implementers didn't deem important enough to optimize).
I believe (from reading the Ecma IL reference) that that is rather due to the 'unoptimization' of the generic variant. In particular, as I said above, when the cast destination type is generic, the runtime will have more work to do to check if the two types match.
But of course, it all depends on the actual CLR implementation.
I believe (from reading the Ecma IL reference) that that is rather due to the 'unoptimization' of the generic variant. In particular, as I said above, when the cast destination type is generic, the runtime will have more work to do to check if the two types match.
But of course, it all depends on the actual CLR implementation.
Yeah, I just tested this and while you were writing that post I updated mine.
I probably underestimate the complexity and real-world design constraints behind the implementation of the JIT, but I don't understand why generic types are handled so inefficiently, at least in this case.
I don't have FSI ready so I can't test it out now, but here are some things that should remedy your perf issues on cases such as #3 if you had explicitly implemented an interface on a struct that F# only allows and you have to explicitly upcast which causes boxing.
First get rid of the boxing of the structs from the explicit upcast by
picking one of these subtype constrained helpers below that call CompareTo directly, they both mean
the same thing.
Then make sure you test generic code by compiling
with FSC and not in FSI, sometimes FSI causes slower code.
1 2 3
let compareOpt (x : #System.IComparable<'T>) (y : #System.IComparable<'T>) = x.CompareTo(y) let compareOpt<'T when 'T :> System.IComparable<'T>> (x : 'T) (y : 'T) = x.CompareTo(y)
#1 and #2 will always call the non-specialized IComparable.CompareTo that uses casts that cause boxing like Brian says.
As for #4 I guess they have implemented System.IComparable<'T>> as a regular implicit
interface that you can call directly without boxing, so issue #3 shouldn't apply here.
It is useful to look at the source code for FSharp.Core (prim-types.fs). My general rule of thumb is that the F# '>' and '<' operators are 'slow', since they work for all types T and do a lot of reasoning/logic at runtime to figure out how to dispatch the call.
(Recall that in F# you can write generic algorithms that use '<' and are generic for all types T where T is comparable. Now consider what code must go into the '<' operator to make such a generic algorithm work for structural comparison of arrays, types that implement IComparable, types that only implement IComparable<T>, or IStructuralComparable<T>, or native pointers which don't implement any interfaces but which are nonetheless considered comparable for some reason, or ... The moral is that '<' must take a very abstract view of types. I guess ideally the compiler would use static analysis to notice call sites where the types are known and constant-fold the compile-time types through the type-dispatch logic to create static dispatch, but that's an incredibly deep optimization.)
Also, I am just guessing, based on my own experience and reasoning and looking at the code and profiling data; I don't know a ton about the murkiest depths of the compiler or the CLR or the JIT to say much else.
I knew that F# comparers were more than just calling op_GreaterThan :).
I looked at the source for the compare functions and I was thinking it was just a matter of having another "static optimization conditional" like "when 'T : IComparable" in the Generic...Fast functions, but I guess it's more complicated than that.
Meanwhile, in a way I'm quite pleased. The code in question sniffs, parses and MD5s packets off the network. To get a 4% perf boost just by changing a > to a CompareTo is pretty neat!
Mhh the difference in results between #2 and #3 is interesting, considering that the non-generic IComparable version has one more type check to do (and apart from that, they both just compare the tick count).
So it must be a cast issue.
Edit: both casts use the same IL instruction castclass.
The castclass instruction performs the same type check,
but instead of pushing null onto the stack if the type check fails, it
throws an InvalidCastException object.
I wonder if the castclass implementation can account for such a performance difference in case of a generic type argument. I guess it's possible since the runtime must check if the actual generic type matches the formal type, by checking type argument arity etc.
Try doing the cast once, and just iterate over CompareTo()...
Topic tags
- f# × 3705
- websharper × 1897
- compiler × 286
- functional × 201
- ui next × 139
- c# × 121
- classes × 97
- web × 97
- .net × 84
- book × 84
- async × 76
- ui.next × 67
- bug × 54
- core × 49
- website × 49
- server × 45
- parallel × 43
- ui × 43
- enhancement × 41
- parsing × 41
- testing × 41
- trywebsharper × 41
- typescript × 37
- html × 35
- javascript × 35
- owin × 35
- asynchronous × 30
- monad × 28
- ocaml × 28
- tutorial × 27
- warp × 27
- haskell × 26
- sitelet × 25
- linq × 22
- workflows × 22
- wpf × 20
- fpish × 19
- introduction × 19
- silverlight × 19
- sitelets × 19
- monodevelop × 17
- rpc × 17
- suave × 17
- piglets × 16
- collections × 15
- feature request × 15
- jquery × 15
- templates × 15
- getting started × 14
- pipeline × 14
- kendoui × 13
- reactive × 12
- 4.1.0.171 × 11
- monads × 11
- opinion × 10
- 4.0.190.100-rc × 9
- deployment × 9
- fixed × 9
- formlets × 9
- in × 9
- json × 9
- plugin × 9
- proposal × 9
- scheme × 9
- solid × 9
- basics × 8
- concurrent × 8
- highcharts × 8
- how-to × 8
- python × 8
- 4.1.1.175 × 7
- complexity × 7
- documentation × 7
- visual studio × 7
- 4.1.2.178 × 6
- lisp × 6
- real-world × 6
- released in 4.0.192.103-rc × 6
- remoting × 6
- resources × 6
- scala × 6
- websharper ui.next × 6
- workshop × 6
- xaml × 6
- 4.0.193.110 × 5
- 4.2.3.236 × 5
- aspnetmvc × 5
- authentication × 5
- azure × 5
- bootstrap × 5
- conference × 5
- dsl × 5
- formlet × 5
- java × 5
- list × 5
- metaprogramming × 5
- ml × 5
- released in Zafir.4.0.188.91-beta10 × 5
- sql × 5
- visualstudio × 5
- websharper.forms × 5
- zafir × 5
- 4.0.192.106 × 4
- 4.0.195.127 × 4
- 4.1.0.38 × 4
- 4.2.1.86 × 4
- 4.2.6.118 × 4
- css × 4
- example × 4
- extensions × 4
- fsi × 4
- fsx × 4
- html5 × 4
- jqueryui × 4
- lift × 4
- reflection × 4
- remote × 4
- rest × 4
- spa × 4
- teaching × 4
- template × 4
- websocket × 4
- wontfix × 4
- 4.0.196.147 × 3
- 4.1.0.34 × 3
- 4.1.6.207 × 3
- 4.2.1.223-beta × 3
- 4.2.11.258 × 3
- 4.2.4.114 × 3
- 4.2.4.247 × 3
- 4.2.5.115 × 3
- 4.2.6.253 × 3
- 4.2.9.256 × 3
- ajax × 3
- alt.net × 3
- aml × 3
- asp.net mvc × 3
- canvas × 3
- cloudsharper × 3
- compilation × 3
- database × 3
- erlang × 3
- events × 3
- extension × 3
- file upload × 3
- forums × 3
- inline × 3
- issue × 3
- kendo × 3
- macro × 3
- mono × 3
- msbuild × 3
- mvc × 3
- pattern × 3
- piglet × 3
- released in Zafir.4.0.187.90-beta10 × 3
- svg × 3
- type provider × 3
- view × 3
- 4.1.1.64 × 2
- 4.1.5.203 × 2
- 4.1.7.232 × 2
- 4.2.10.257 × 2
- 4.2.3.111 × 2
- 4.2.5.249 × 2
- android × 2
- asp.net × 2
- beginner × 2
- blog × 2
- chart × 2
- client × 2
- client server app × 2
- clojure × 2
- computation expressions × 2
- constructor × 2
- corporate × 2
- courses × 2
- cufp × 2
- d3 × 2
- debugging × 2
- direct × 2
- discriminated union × 2
- docs × 2
- elm × 2
- endpoint × 2
- endpoints × 2
- enterprise × 2
- entity framework × 2
- event × 2
- f# interactive × 2
- fable × 2
- flowlet × 2
- formdata × 2
- forms × 2
- fsc × 2
- google maps × 2
- hosting × 2
- http × 2
- https × 2
- iis 8.0 × 2
- install × 2
- interactive × 2
- interface × 2
- iphone × 2
- iteratee × 2
- jobs × 2
- jquery mobile × 2
- keynote × 2
- lens × 2
- lenses × 2
- linux × 2
- listmodel × 2
- mac × 2
- numeric × 2
- oauth × 2
- obfuscation × 2
- offline × 2
- oop × 2
- osx × 2
- packaging × 2
- pattern matching × 2
- performance × 2
- pipelines × 2
- q&a × 2
- quotation × 2
- reference × 2
- released in Zafir.4.0.185.88-beta10 × 2
- rx × 2
- script × 2
- security × 2
- self host × 2
- seq × 2
- sockets × 2
- stm × 2
- tcp × 2
- trie × 2
- tutorials × 2
- type × 2
- url × 2
- var × 2
- websharper.charting × 2
- websharper4 × 2
- websockets × 2
- wig × 2
- xna × 2
- zh × 2
- .net interop × 1
- 2012 × 1
- 4.0.194.126 × 1
- 4.1.3.184 × 1
- 4.1.4.189 × 1
- 4.2.0.214-beta × 1
- 4.2.12.259 × 1
- 4.2.2.231-beta × 1
- 4.2.8.255 × 1
- Canvas Sample Example × 1
- DynamicStyle Animated Style × 1
- Fixed in 4.0.190.100-rc × 1
- Released in Zafir.UI.Next.4.0.169.79-beta10 × 1
- SvgDynamicAttribute × 1
- WebComponent × 1
- abstract class × 1
- accumulator × 1
- active pattern × 1
- actor × 1
- addin × 1
- agents × 1
- aggregation × 1
- agile × 1
- alter session × 1
- animation × 1
- anonymous object × 1
- apache × 1
- api × 1
- appcelerator × 1
- architecture × 1
- array × 1
- arrays × 1
- asp.net 4.5 × 1
- asp.net core × 1
- asp.net integration × 1
- asp.net mvc 4 × 1
- asp.net web api × 1
- aspnet × 1
- ast × 1
- attributes × 1
- authorization × 1
- b-tree × 1
- back button × 1
- badimageformatexception × 1
- bash script × 1
- batching × 1
- binding-vars × 1
- bistro × 1
- body × 1
- bundle × 1
- camtasia studio × 1
- cas protocol × 1
- charts × 1
- clarity × 1
- class × 1
- cli × 1
- clipboard × 1
- clojurescript × 1
- closures × 1
- cloud × 1
- cms × 1
- coding diacritics × 1
- color highlighting × 1
- color zones × 1
- combinator × 1
- combinators × 1
- compile × 1
- compile code on server × 1
- config × 1
- confirm × 1
- content × 1
- context × 1
- context.usersession × 1
- continuation-passing style × 1
- coords × 1
- cordova × 1
- cors × 1
- coursera × 1
- cross-domain × 1
- csla × 1
- current_schema × 1
- custom content × 1
- data × 1
- data grid × 1
- datetime × 1
- debug × 1
- declarative × 1
- delete × 1
- devexpress × 1
- dhtmlx × 1
- dictionary × 1
- directattribute × 1
- disqus × 1
- distance × 1
- do binding × 1
- doc elt ui.next upgrade × 1
- docker × 1
- dojo × 1
- dol × 1
- dom × 1
- domain × 1
- du × 1
- duf-101 × 1
- dynamic × 1
- eastern language × 1
- eclipse × 1
- edsl × 1
- em algorithm × 1
- emacs × 1
- emotion × 1
- enums × 1
- error × 1
- etw × 1
- euclidean × 1
- eventhandlerlist × 1
- examples × 1
- ext js × 1
- extension methods × 1
- extra × 1
- facet pattern × 1
- failed to translate × 1
- fake × 1
- fantomas × 1
- fear × 1
- float × 1
- form × 1
- form-data × 1
- forum × 1
- fp × 1
- frank × 1
- fsdoc × 1
- fsharp × 1
- fsharp.core × 1
- fsharp.powerpack × 1
- fsharpx × 1
- fsunit × 1
- function × 1
- functional style × 1
- game × 1
- games × 1
- gc × 1
- generic × 1
- geometry × 1
- getlastwin32error × 1
- getting-started × 1
- google × 1
- google.maps × 1
- grid × 1
- group × 1
- guide × 1
- hash × 1
- headers × 1
- hello world example × 1
- heroku × 1
- highchart × 1
- history × 1
- how to × 1
- html-templating × 1
- http405 × 1
- httpcontext × 1
- hubfs × 1
- i18n × 1
- ie 8 × 1
- if-doc × 1
- iis × 1
- image × 1
- images × 1
- inheritance × 1
- initialize × 1
- input × 1
- install "visual studio" × 1
- installer × 1
- int64 × 1
- interfaces × 1
- internet explorer × 1
- interop × 1
- interpreter × 1
- io × 1
- iobservable × 1
- ios × 1
- iot × 1
- ipad × 1
- isomorphic × 1
- javascript optimization × 1
- javascript semanticui resources × 1
- jquery-plugin × 1
- jquery-ui × 1
- jquery-ui-datepicker × 1
- js × 1
- kendo datasource × 1
- kendochart × 1
- kendoui compiler × 1
- knockout × 1
- l10n × 1
- learning × 1
- library × 1
- libs × 1
- license × 1
- licensing × 1
- lineserieszonescfg × 1
- local setting × 1
- localization × 1
- logging × 1
- loop × 1
- macros × 1
- mailboxprocessor × 1
- mapping × 1
- maps × 1
- markerclusterer × 1
- markup × 1
- marshal × 1
- math × 1
- mathjax × 1
- message × 1
- message passing × 1
- message-passing × 1
- meta × 1
- metro style × 1
- micro orm × 1
- minimum-requirements × 1
- mix × 1
- mobile installation × 1
- mod_mono × 1
- modal × 1
- module × 1
- mouseevent × 1
- mouseposition × 1
- multidimensional × 1
- multiline × 1
- multithreading × 1
- mysql × 1
- mysqlclient × 1
- nancy × 1
- native × 1
- nested × 1
- nested loops × 1
- node × 1
- nunit × 1
- object relation mapper × 1
- object-oriented × 1
- om × 1
- onboarding × 1
- onclick × 1
- optimization × 1
- option × 1
- orm × 1
- os x × 1
- output-path × 1
- override × 1
- paper × 1
- parameter × 1
- persistence × 1
- persistent data structure × 1
- phonegap × 1
- pola × 1
- post × 1
- powerpack × 1
- prefix tree × 1
- principle of least authority × 1
- privacy × 1
- private × 1
- profile × 1
- programming × 1
- project × 1
- project euler × 1
- projekt_feladat × 1
- protected × 1
- provider × 1
- proxy × 1
- ptvs × 1
- public × 1
- pure f# × 1
- purescript × 1
- qna × 1
- quant × 1
- query sitelet × 1
- question × 1
- quotations × 1
- range × 1
- raphael × 1
- razor × 1
- rc × 1
- reactjs × 1
- real-time × 1
- ref × 1
- region × 1
- released in 4.0.190.100-rc × 1
- reporting × 1
- responsive design × 1
- rest api × 1
- rest sitelet × 1
- restful × 1
- round table × 1
- router × 1
- routing × 1
- rpc reverseproxy × 1
- runtime × 1
- sales × 1
- sample × 1
- sampleapp × 1
- scriptcs × 1
- scripting × 1
- search × 1
- self hosted × 1
- semanticui × 1
- sequence × 1
- serialisation × 1
- service × 1
- session-state × 1
- sharepoint × 1
- signals × 1
- sitelet website × 1
- sitelet.protect × 1
- sitlets × 1
- slickgrid × 1
- source code × 1
- sqlentityconnection × 1
- ssl × 1
- standards × 1
- static content × 1
- stickynotes × 1
- streamreader × 1
- stress × 1
- strong name × 1
- structures × 1
- submitbutton × 1
- subscribe × 1
- svg example html5 websharper.ui.next × 1
- sweetalert × 1
- system.datetime × 1
- system.reflection.targetinvocationexception × 1
- table storage × 1
- targets × 1
- tdd × 1
- templates ui.next × 1
- templating × 1
- text parsing × 1
- three.js × 1
- time travel × 1
- tls × 1
- tooltip × 1
- tracing × 1
- tsunamiide × 1
- turkish × 1
- twitter-bootstrap × 1
- type erasure × 1
- type inference × 1
- type providers × 1
- type-providers × 1
- typeprovider × 1
- ui next forms × 1
- ui-next × 1
- ui.next jqueryui × 1
- ui.next charting × 1
- ui.next formlets × 1
- ui.next forms × 1
- ui.next suave visualstudio × 1
- ui.next templating × 1
- unicode × 1
- unittest client × 1
- upload × 1
- usersession × 1
- validation × 1
- vb × 1
- vb.net × 1
- vector × 1
- view.map × 1
- visal studio × 1
- visual f# × 1
- visual studio 11 × 1
- visual studio 2012 × 1
- visual studio shell × 1
- vs2017 compiler zafir × 1
- vsix × 1
- web api × 1
- web-scraping × 1
- webapi × 1
- webcomponents × 1
- webforms × 1
- webgl × 1
- webrtc × 1
- webshaper × 1
- websharper async × 1
- websharper codemirror × 1
- websharper f# google × 1
- websharper forms × 1
- websharper reactive × 1
- websharper rpc × 1
- websharper sitelets routing × 1
- websharper warp × 1
- websharper-interface-generator × 1
- websharper.chartsjs × 1
- websharper.com × 1
- websharper.exe × 1
- websharper.owin × 1
- websharper.ui.next × 1
- websharper.ui.next jquery × 1
- websockets iis × 1
- why-websharper × 1
- windows 7 × 1
- windows 8 × 1
- windows-phone × 1
- winrt × 1
- www.grabbitmedia.com × 1
- xamarin × 1
- xml × 1
- yeoman × 1
- yield × 1
- zafir beta × 1
- zafir websharper4 × 1
- zarovizsga × 1
![]() |
Copyright (c) 2011-2012 IntelliFactory. All rights reserved. Home | Products | Consulting | Trainings | Blogs | Jobs | Contact Us | Terms of Use | Privacy Policy | Cookie Policy |
Built with WebSharper |
I came across this perf issue in profiling real code (HashCompare.GenericGreaterThanIntrinsic had a lot of samples). Using the normal F# generic greater-than is producing code that's several times slower than the IComparable implementation:
Here are the timing results (1M iterations):
Real: 00:00:00.601, CPU: 00:00:00.592, GC gen0: 10, gen1: 0, gen2: 0
Real: 00:00:00.097, CPU: 00:00:00.093, GC gen0: 10, gen1: 0, gen2: 0
Real: 00:00:01.134, CPU: 00:00:01.123, GC gen0: 5, gen1: 0, gen2: 0
Real: 00:00:00.034, CPU: 00:00:00.031, GC gen0: 0, gen1: 0, gen2: 0
The actual comparison on a DateTime is just comparing a 64-bit field, and that's reflected in the speed and memory usage of #4. I understand the generic compare in F# will use IComparable. But if so, why is it so much slower than #2, which explicitly uses IComparable? Why's it compiling to all this helper code?
As a side note, what's up with the incredibly slower #3, when using the generic IComparable? I'd have thought the less boxing/type test for the compare would have helped, and indeed there are less gen0s, but it's much slower.
Adding to the fun, when run on the x64 JIT (the above is on 32-bit), the F# generic greater-than (#1) goes even slower (1.3s) but the explicit generic IComparable (#3) gets faster (0.8s).
I have a feeling I'm missing something fundamental here.
Thanks,
-Michael