Hi, that is unrelated to the original post but as it is a comparison between C# and F# implementations of Sieve of Erastothenes I reasoned it will be better to post here than open a new thread.

So I was thinking of using some not so imperative techniques to implement the sieve on .NET. I came up with this for C# using LINQ:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
    class Program
    {
        public static List<int> Sieve(int max)
        {
            int[] numbers = new int[max - 1];
            for (int i = 0; i < max - 1; i++)
            {
                numbers[i] = i + 2;
            }
            var ceiling = Math.Sqrt(max);
            var primes = new List<int>();
            while (numbers[0] <= ceiling)
            {
                primes.Add(numbers[0]);
                numbers = (from r in numbers
                           where r % numbers[0] != 0
                           select r).ToArray();
                //alternatively
                //query = query.Where(i => i % first != 0).ToArray()
            }
            primes.AddRange(numbers);
            return primes;
        }
        static void Main(string[] args)
        {
            foreach (int i in Sieve(120))
            {
                Console.WriteLine(i);
            }
            DateTime start = DateTime.Now;
            Console.WriteLine(Sieve(10000000).Count);
            Console.WriteLine("Time = " + DateTime.Now.Subtract(start).TotalMilliseconds);
        }
    }

As you can see I also timed it and the result on my machine was like 13 seconds for max value of 10000000

I came up with this implementation of F# (I am still a beginner in F# and I havent even got to use sequences yet but...):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
let sieve max =
    let ceiling = sqrt (float max) in
    
    let rec sieveStep (numbers, currentPrimes) =
        if float (List.hd numbers) > ceiling 
        then 
            (numbers, currentPrimes)
        else
            let newPrimes = currentPrimes @ [numbers.Head] in
            sieveStep (List.filter (fun x -> x % numbers.Head <> 0) numbers, newPrimes) in
    
    let (primesLeft, primes) = sieveStep ([2..max], []) in
    primes @ primesLeft;;


let start = System.DateTime.Now;;
120 |> sieve |> List.iter (fun x -> printfn "%d" x);;
printfn "%f" (System.DateTime.Now.Subtract(start)).TotalMilliseconds;;

I also timed this and the result was more than 90 seconds.

So my question is what brings this performance difference. I perfectly understand that neither of this is the perfect implementation (I know you can do it with boolean array and use the indexes for the most efficient implementation). I perfectly understand that the two implementations are a lot different. Arrays vs linked lists, iteration vs recursion, etc. However what is the thing in the F# implementation that brings that huge performance penalty. I even instanciate new array objects for every iteration in the C# implementation so even object allocation in recursive implementation has somewhat of equivalent in C# code.

I also would like to know if these implementations are good enough and readable in both languages in your opinion and how can they be improved with functional constructs.

Thank you for your time

By on 10/9/2008 11:41 PM ()

Regarding performance:

I presume it took >90s for a number bigger than "120"? (For 120, it ran in 19ms on my machine.) Also, in your F# code, you are timing the printing of the entire list (the printing is what's slow), whereas for C# you just print the count. The use of "@" (append) in the loop is expensive; prefer to cons onto the front with "::". But the main difference is printing; compare apples to apples for a fairer comparison.

By on 10/10/2008 1:02 AM ()

Oh yes sorry I did not notice what I pasted as output code.

I tested with 10 000 000 (same as C# version) and I printed .Length

then I removed even that because I reasoned that it may take a lot of time to find the length of a linked list but that did not do it either.

So it is not related to printing and not even measuring the length.

And how do I append an element to the end of the list without @ ? I cannot :: because it will create a list with 2 elements list and an int, right?

By on 10/10/2008 1:12 AM ()

Wow, the performance difference between arrays and lists here seems to be pretty big.

Here it is with an array:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
#light 

let sieve max =
    let ceiling = sqrt (float max)
    
    let rec sieveStep (numbers : array<_>, currentPrimes) =
        if float numbers.[0] > ceiling 
        then 
            (numbers, currentPrimes)
        else
            let newPrimes = numbers.[0] :: currentPrimes 
            sieveStep (Array.filter (fun x -> x % numbers.[0] <> 0) numbers, newPrimes)
    
    let (primesLeft, primes) = sieveStep ([|2..max|], []) 
    primes @ (List.of_array primesLeft)

let start = System.DateTime.Now
10000000 |> sieve |> List.length |> printfn "%d"
printfn "%f" (System.DateTime.Now.Subtract(start)).TotalMilliseconds

And it runs in about 12s on my box. Changing "numbers" to a list:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
#light 

let sieve max =
    let ceiling = sqrt (float max)
    
    let rec sieveStep (numbers, currentPrimes) =
        let firstNum = List.hd numbers
        if float firstNum > ceiling 
        then 
            (numbers, currentPrimes)
        else
            let newPrimes = firstNum :: currentPrimes 
            sieveStep (List.filter (fun x -> x % firstNum <> 0) numbers, newPrimes)
    
    let (primesLeft, primes) = sieveStep ([2..max], []) 
    primes @ primesLeft

let start = System.DateTime.Now
10000000 |> sieve |> List.length |> printfn "%d"
printfn "%f" (System.DateTime.Now.Subtract(start)).TotalMilliseconds

takes about 82s.

By on 10/10/2008 3:12 AM ()

Any explaination as to why? I don't see a reason for that to happen. After all they both create objects how does creating a list differ from creating an array in this case. What is more the only element accessed is the head (or element 0) so there is no difference when searching for that element.

By on 10/10/2008 3:43 AM ()

I am guessing it's allocation/GC (the array solution generates a few large objects, whereas the F# list generates tons of tiny ones), but I don't have a profiler handy to dig deeper now.

By on 10/10/2008 4:43 AM ()

Seq.unfold is great if you have a well-defined sequence - e.g., the result of a method call. For more complex scenarios with 'yield return' you should prefer Sequence Computation Expressions. ([link:blogs.msdn.com])

Translated code (a little messily):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#light
open System.Collections.Generic
let primes = 
    seq { 
        // First prime
        yield 2
        let sieve = new Dictionary<INT, int>()
        // Loop through all odd numbers < 1E3
        for i in 3 .. 2 .. 1000 do
            // Check if its in our sieve (if not, its prime)
            let (found, value) = sieve.TryGetValue(i)
            let marker = if not found then i else value
            // Add to sieve
            let _ =
                let step = marker + marker
                let finalX =
                    let initX = marker + step
                    let rec incX x = 
                        if sieve.ContainsKey(x) then
                            incX (x + step)
                        else 
                            x
                    incX initX
                sieve.Add(finalX, marker)
                
            if not found then 
                yield i
    }
printfn "%A" (Seq.take 100 primes)
By on 4/24/2008 9:31 AM ()

Thanks a bunc! I wound up with this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
let Primes =
    seq {
        yield 2
        let sieve = new Dictionary<int,int>()
        for i in 3 |> Seq.unfold (fun x -> Some(x, (x + 2))) do
            let (found, value) = sieve.TryGetValue(i)
            let marker = if found then value else i
            let _ =
                let step = marker + marker
                let rec Next x = if sieve.ContainsKey(x) then Next (x + step) else x
                let next = Next (marker + step)
                sieve.Add(next, marker)
                
            if not found then yield i
    }

The only thing I don't quite understand is why the yield needs to be the last thing in the seq and why "let _" is needed.

By on 4/24/2008 10:56 PM ()

Hello.

Unfortunly, F# compiler does not allow to create recursive functions in sequence expressions, but we can create those into nested context (f.e. using "match ..." or "let _ = ...").
Also, we may not use "yield" into such nested contexts. So you code looks as it looks.
It's not necessary to use "yield" as last expression:

1
2
3
4
5
6
seq {
	let c = ref 1
	for i in 1..1000 do
		yield i,if 1 = (i&&&1) then 0 else !c
		do incr c
}
By on 4/25/2008 8:28 PM ()
IntelliFactory Offices Copyright (c) 2011-2012 IntelliFactory. All rights reserved.
Home | Products | Consulting | Trainings | Blogs | Jobs | Contact Us | Terms of Use | Privacy Policy | Cookie Policy
Built with WebSharper