1.2k
views9
comments

There is a sweet little discussion of the following problem at New Scientist:

Gary Foshee, a collector and designer of puzzles from Issaquah near Seattle walked to the lectern to present his talk. It consisted of the following three sentences: "I have two children. One is a boy born on a Tuesday. What is the probability I have two boys?"

The surprising answer is 13/27, which you will probably come to believe if you read the logic correctly. But who needs logic when you've got simulation?

 

open System

let r = new Random()
// 1 = male; 2 = female
// 1 = Sunday, 2 = Monday, 3 = Tuesday ...

let sims i = Seq.init i (fun n -> ((r.Next(1,3), r.Next(1,8)),(r.Next(1,3), r.Next(1,8))))

let could_be_my_son x =
    let (a,b) = x
    ( ((fst a) = 1) && ((snd a) = 3) || ((fst b) = 1) && ((snd b) = 3) )

let is_two_sons x =
    let (a, b) = x
    ((fst a) = 1) && ((fst b) = 1)

let two_sons_simulation n =
    let s = sims n
    let denom = s |> Seq.filter could_be_my_son |> Seq.map (fun x -> 1) |> Seq.sum
    let num   = s |> Seq.filter could_be_my_son |> Seq.filter is_two_sons |> Seq.map (fun x -> 1) |> Seq.sum
    float num / float denom

> two_sons_simulation 10000000;;
val it : float = 0.4803767456

> 13.0 / 27.0;;
val it : float = 0.4814814815

Interesting thread.

1
2

let denom = s |> Seq.filter could_be_my_son |> // etc ...
let num   = s |> Seq.filter could_be_my_son |> // etc. ...

I wonder whether this is entierly correct. As the s sequence is initialized with side-effecting method calls (Random.Next) you generate different "children-combination-sequences" each time you call Seq.filter on it. To avoid that you probably want to force the sequence evaluation once before filtering:

let s = sims n |> Seq.toList

Else your denominator and or nominator would be based on two different random experiments.

Or am I missing something here?

By Novox on 5/30/2010 6:43 AM (permalink)

Cool.

Instead of :

1
2
3

let could_be_my_son x =
  let (a,b) = x
  ( ((fst a) = 1) && ((snd a) = 3) || ((fst b) = 1) && ((snd b) = 3) )

try: (all untested)

1
2
3

let could_be_my_son = function
  | (1,3),(1,3) -> true
  | _ -> false

And rather than:

let num = s |> Seq.filter could_be_my_son |> Seq.filter is_two_sons |> Seq.map (fun x -> 1) |> Seq.sum

try:

let num = s |> Seq.sumBy (fun e -> if could_be_my_son e && is_two_sons e then 1 else 0)

1
2

let num = s |> Seq.fold (fun acc e-> acc + if could_be_my_son e && is_two_sons e) then 1 else 0) 0

Sorry I've been doing this alot today w/another fn :)

By anotheraccount on 5/26/2010 6:24 PM (permalink)

Nice; here are the functions with pattern matching instead:

let could_be_my_son = function

    | (1, 3), _ -> true

    |  _ , (1, 3) -> true

    | _ -> false

let is_two_sons = function

    | (1, _ ), (1, _ ) -> true

    | _ -> false

Summing by Seq.sumBy is faster than summing by Seq.sum, as you propose; but summing by Seq.fold is consistently slower than either of these. About a 10% difference in the 3 methods.

Thanks!

By will.fitzgerald on 5/27/2010 12:47 AM (permalink)

BTW, are you running a Release build? Have you tried "inline"ing your filter fn's?

By anotheraccount on 5/27/2010 9:27 AM (permalink)

Wow, that problem had me confused as well. In a good way.

I noticed that you did some double work, you calculate the denominator and numerator separately. That means that you traverse the sequence of pairs twice. You could calculate both num end denom in one pass using a fold:

let two_sons_simulation n =
    let s = sims n
    let denom,num =
        Seq.fold (fun (denom,num) x -> 
                    match could_be_my_son x, is_two_sons x with
                    | true, true -> (denom+1, num+1)
                    | true, false -> (denom+1, num)
                    | _ -> (denom,num)) (0,0) s
    float num / float denom

This version is approximately twice as fast as calculating num and denom after each other.

But if you really want to go fast, mutable wins:

let ugly_mutable n = 
    let mutable count = 0
    let mutable denom = 0
    let mutable num = 0
    while count < n do
        let x = (r.Next(1,3), r.Next(1,8)) , (r.Next(1,3), r.Next(1,8))
        match could_be_my_son x, is_two_sons x with
            | true, true -> denom <- denom + 1
                            num <- num + 1
            | true, false -> denom <- denom + 1
            | _ -> ()
        count <- count + 1
    float num / float denom

Ugly, but 3.5x faster than the paired fold above. But if you see yourself coding F# like this, maybe it's time to go to C instead. I prefer OP's original version for readability. That's the way I like to work: get a readable version first and then make it faster only if necessary.

Interesting note: in the ugly mutable version, inlining the filter functions makes it go 10% slower.

By cfern on 5/28/2010 12:52 AM (permalink)

<NonCompilerWriterSpouting>
I'm surprised (esp. seeing that fsc is meta-compiled (no?)) that there are no re-write rules in fsc that can take the fold version and re-write it in a loop/mutable form. Emitting TCO calls is a stop-gap for stack usage, but apparently the JIT can't do a rewrite at the IL level for better perf. (which is harder), but it should be possible in fsc. I'm not saying this is trivial, but I recall that the MS C/C++ compiler could do some pretty cool global optimizations in it's time - in fsc higher-order compiler rules should be easier, and possibly even extended by the developer in attributed quotation-based fns in DLL rule files or other forms. Basically the compiler could call user-specified codeDom->codeDom transformers in the local codebase or external plug-ins, and domain-experts could provide these for their usrers. (If the dom were sexprs, these would be cool typed macros of sorts.)
</NonCompilerWriterSpouting>

By anotheraccount on 5/28/2010 8:37 AM (permalink)

Nice; here are the functions with pattern matching instead:
1
2
3
let could_be_my_son = function
 | (1, 3), _ -> true
 | _ , (1, 3) -> true 

Sorry I read your code too fast and coded an && rather than an || in the pattern...

About a 10% difference in the 3 methods

10% isn't so bad - if there were more "guts" in the computation, this would disappear pretty quicly. Still, seeing the fold is the most primitive/general HOF, it should strive to be quick as possible.

By anotheraccount on 5/27/2010 7:04 AM (permalink)

It seems to me that it would be more natural (and maybe faster as well) to use Seq.filter followed by Seq.length rather than using sum or sumBy at all.

By Keith Battocchi on 5/27/2010 6:29 AM (permalink)

It seems to me that it would be more natural (and maybe faster as well) to use Seq.filter followed by Seq.length rather than using sum or sumBy at all.

I think the subBy makes sense to me - but so does your suggestion - as for faster, there's only one way to tell. (The sumBy has the advantage of only one loop, not two, though the loop overhead is probably pretty small.)

However, I never think of a Seq as having a length - seeing getting the length is O(n) and not O(1), I think it's better to do the counting explicitly. (I'm surpised Seq.length exists, but like list.[x] which is O(n), I also appreciate the convenenice - however, things are inconsistent in this regard, as I wouldn't mind list slices as well.)

By anotheraccount on 5/27/2010 6:59 AM (permalink)

Topic tags

Built with WebSharper

Home

Answers

Events

Courses

Groups and Conferences

Blogs

Jobs

Developers

Topic tags