The reason is simply that composing functions indirectly in this way (i.e. via Seq) adds a lot of overhead compared to the trivial % and = ops that you're doing.

See F# for Scientists section 8.4.4.1 for a similar example and discussion.

Cheers,

Jon Harrop.

By on 9/20/2008 5:37 PM ()

Can you give a more detailed explanation of the type of overhead using Seq in the above manner generates. I understand I could find an explanation in the aforementioned book but I currently don't own a copy and it isn't free. If you don't mind a short explanation would be very useful.

By on 9/20/2008 10:43 PM ()

I don't have the time to profile your code in depth, but first experiments indicate the performance overhead (I get about 20-50% for your sample, not 300%) is mainly due to extra allocations. If you replace the Seq.map call in is_prime1 with a call to my optimized implementation of map attached below, the overhead more or less vanishes.

[Update: I'm not sure anymore whether it's the extra allocations or extra indirections, and how much faster the below implementation actually is. To get more reliable results, you probably need to reduce your sample to something simpler and then run it under a profiler.]

Best regards,
Stephan

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
type MapEnumerableator<'a, 'b>(f: 'a -> 'b, s: seq<'a>) =
     let mutable e = null
     
     interface IEnumerable<'b> with
         member this.GetEnumerator() =
             match e with 
             | null -> 
                 e <- s.GetEnumerator() 
                 this :> IEnumerator<'b> // return this as an IEnumerator 
             | _ ->
                 // this is already used as an enumerator, 
                 // so we need a new instance 
                 (new MapEnumerableator<_,_>(f, s) :> seq<_>).GetEnumerator()
            
     interface IEnumerable with
        member t.GetEnumerator() =
            (t :> seq<'b>).GetEnumerator() :> IEnumerator
            
     interface IEnumerator<'b> with
         member t.Current = f e.Current
     
     interface IEnumerator with 
         member t.Current = box (f e.Current)
         member t.MoveNext() = e.MoveNext()
         member t.Reset() = e.Reset()
     
     interface System.IDisposable with 
         member t.Dispose() = 
            match e with
            | null -> ()
            | _    -> e.Dispose()


let map f s = new MapEnumerableator<_,_>(f, s) :> seq<_>
By on 9/21/2008 3:56 AM ()

I just realized that most users probably expect GetEnumerator to be thread-safe. Therefore, one should better use the following version of map, which implements a thread-safe lazy initialization scheme:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
open System.Collections
open System.Collections.Generic
open System.Threading


type MapEnumerableator<'a, 'b>(f: 'a -> 'b, s: seq<'a>) =
     let mutable e = null          
     member private t.SetE(v) = e <- v     
     
     interface IEnumerable<'b> with
         member this.GetEnumerator() =            
             let newEnumerator = s.GetEnumerator()
             match Interlocked.CompareExchange<_>(&e, newEnumerator, null) with
             | null -> // e has been replaced with newEnumerator 
                 this :> IEnumerator<_>
             | _  -> // e was already initialized
                 let newEnum = new MapEnumerableator<_,_>(f, s)
                 newEnum.SetE(newEnumerator)
                 newEnum :> IEnumerator<_>
            
     interface IEnumerable with
        member t.GetEnumerator() =
            (t :> seq<'b>).GetEnumerator() :> IEnumerator
            
     interface IEnumerator<'b> with
         member t.Current = f e.Current
     
     interface IEnumerator with 
         member t.Current = box (f e.Current)
         member t.MoveNext() = e.MoveNext()
         member t.Reset() = e.Reset()
     
     interface System.IDisposable with 
         member t.Dispose() = 
            match e with
            | null -> ()
            | _    -> e.Dispose()


let map f s = new MapEnumerableator<_,_>(f, s) :> seq<_>
By on 9/21/2008 8:53 AM ()

The question is whether or not a lazy sequence stays lazy when being mapped and called exists on it. If is_prime1 at some point has to create the whole sequence before it starts looking, that would explain why it's 3 times slower.

Another thing, the sequence only has to be {2..n/2}.

By on 9/20/2008 9:41 AM ()

Anyone?

By on 9/20/2008 7:24 AM ()

Anyone?

See this paper on Stream Fusion, and the related F# thread for a possible explanation and potential fix for the compiler. Especially note my latest post on the subject.

By on 9/21/2008 6:06 PM ()
IntelliFactory Offices Copyright (c) 2011-2012 IntelliFactory. All rights reserved.
Home | Products | Consulting | Trainings | Blogs | Jobs | Contact Us | Terms of Use | Privacy Policy | Cookie Policy
Built with WebSharper