7.0k
views8
comments

Functional While, Repeat/Until, and Monad-free Break/Continue

Was playing around with sort algorithms in F# this weekend to kill some time and again lamented the lack of repeat/until and break in F#. Then it hit me that recursive tail call functions could achieve exactly the same thing, with only a slight ding to readability. So, I threw out 'mutable bDone' and 'while not bDone' and tried writing the code without these imperative constructs. The following distills out just the looping construct and shows how to write repeat/until, do/while, while/do, break/continue, and test-in-the-middle style code using tailcalls. These all appear to run at exactly the same speed as a traditional F# 'while' statement, but your mileage may vary (some platforms may not properly implement tailcall). At the end is a (bad) sort algorithm written in both styles, for comparison.

Let's start with a 'do/while' loop, written in traditional F# imperative style, then look at functional variations which provide both the same type of loop, as well as different semantics like while/do, repeat/until, test in the middle, and even break/continue (without monads.. um, workflows!).

#light
(* something to work on... *)
let v = ref 0
let f() = incr v;
let g() = !v;
let N = 100000000

let imperDoWhile() =
    let mutable x = 0
    let mutable bDone = false
    while not bDone do
        f()
        x <- x + 1
        if x >= N then bDone <- true

Ok, that's easy enough. Keep in mind that f() is always called at least once (do/while).

Here is the same code, but in a functional style. Note that we don't need to declare a mutable here.

let funDoWhile() =
    let rec loop x =
        f()             (*Do*)
        if x < N then   (*While*)
            loop (x+1)
    loop 0

We can spin that to a traditional do/while by putting the function call inside the if block.

let funWhileDo() =
    let rec loop x =
        if x < N then   (*While*)
            f()         (*Do*)
            loop (x+1)
    loop 0

How about repeating a block until some condition is true (repeat/until)? Easy enough...

let funRepeatUntil() =
    let rec loop x =
        f()                 (*Repeat*)
        if x >= N then ()   (*Until*)
        else loop (x+1)
    loop 0

What was that about a monad-less break? Well, just introduce a conditional expression which returns 'unit', as in:

let funBreak() =
    let rec loop() =
        let x = g()
        if x > N then ()    (*break*)
        else
            f()
            loop()
    loop()

How about continue? Well, that's just another call to loop! First, with a syntax crutch:

let funBreakContinue() =
    let break' () = ()
    let rec continue' () =
        let x = g()
        if   x > N then break'()
        elif x % 2 = 0 then
            f(); f(); f();
            continue'()
        else
            f()
            continue'()
    continue'()

And then again without the (ugly) syntax crutch:

let funBreakContinue'() =
    let rec loop () =
        let x = g()
        if   x > N then ()
        elif x % 2 = 0 then
            f(); f(); f();
            loop()
        else
            f()
            loop ()
    loop()

Easy as pie!

One nice result of these loop forms is that it makes it easier to spot and implement states in your loops. For example, a bubble sort continually loops over an entire array, swapping values that are out of place as it finds them. It keeps track of whether a pass over the array produced any exchanges. If not, then every value must be in the right place, so the sort can terminate. As an optimization, on every pass thru the array, the last value in the array ends up sorted into the correct place. So, the loop can be shortened by one each time through. Most algorithms check for a swap and update a "bModified" flag every time there is one. However, once the first swap is done, there is no need for that assignment; it's already set to true!

Here is F# code which implements a bubble sort (yes, bubble sort is terrible algorithm; quicksort rocks). At the end is an imperative implementation which does not change state; it updates the bModified flag for every exchange. Interestingly, the imperative solution is faster on tiny arrays and just a percent or two slower on large ones. (Made for a good example, though).

let inline sort2 f i j (a:'a array) =
    let i' = a.[ i ]
    let j' = a.[ j ]
    if f i' j' > 0 then
        a.[ i ] <- j'
        a.[ j ] <- i'

let bubble f (xs:'a array) =
    if xs.Length = 0
    then ()
    
    let rec modified i endix =
        if i = endix then
            unmodified 0 (endix-1)
        else
            let j = i+1
            sort2 f i j xs
            modified j endix
    and unmodified i endix =
        if i = endix then
            ()
        else
            let j = i+1
            let i' = xs.[ i ]
            let j' = xs.[ j ]
            if f i' j' > 0 then
                xs.[ i ] <- j'
                xs.[ j ] <- i'
                modified j endix
            else
                unmodified j endix
    in unmodified 0 (xs.Length-1)


let bubble_imperitive f (xs:'a array) =
    let mutable bModified = true
    let mutable endix = xs.Length - 1
    while bModified do
        bModified <- false
        endix <- endix - 1
        for i in 0..endix do
            let j = i+1
            let i' = xs.[ i ]
            let j' = xs.[ j ]
            if f i' j' > 0 then
                xs.[ i ] <- j'
                xs.[ j ] <- i'
                bModified <- true
        done
    done

[Sorry about the wrong identations in the code. I am not sure how to fix it.]

Another approach is to use a "functional" "while do" (and "do .. while") loop:

let funWhileDo (body : 'a -> 'a)(condition : 'a -> bool) (initial : 'a) =

  let s = Seq.initInfinite (fun i -> i) 

            |> Seq.scan (fun s _ -> body s) initial 

            |> Seq.takeWhile (fun x -> condition x)

   let init2 = s |> Seq.fold (fun _ state -> state) initial

   if (condition init2) then

      body init2

   else

      init2

let funDoWhile (body : 'a -> 'a) (condition : 'a -> bool) (initial : 'a) =

   let init2 = body initial

   funWhileDo body condition init2

This approach encourages the programmer to be explicit about the state of the loop.

Then, BubbleSort can be defined as follows:

let bubblePass f (array : 'a []) (_, endix) = 

    let mutable modified = false

    for i in 0 .. (endix-1) do

        let j = i+1

        let ix = array.[ i ]

        let jx = array.[ j ]

        if f ix jx > 0 then

            array.[ i ] <- jx

            array.[ j ] <- ix

            modified <- true

    (modified, endix-1)

let bubbleNotDone (modified, _) = (modified = true)

let doBubbleSort array = 

    let _ = funDoWhile (bubblePass (fun x y -> x-y) array) bubbleNotDone (false, array.Length-1)

    array

By gantsich on 3/10/2010 2:53 PM (permalink)

I found Sequences very expressive and powerful as well, but then also quite slow compared to normal tail recursive calls. A simple monte carlo option pricer can be very easily expressed using infinite sequences, and boy is it readable :). However, it was also an order of magnitude slower than a tail recursive implementation.

I found sequences and sequence expressions excellent for prototyping though.

Ninja edit: Obviously I should not be making claims like sequences are slow. They can be significantly faster in the right scenario (e.g. where you can save time with lazy initialization and caching already computed items).

By tomf on 3/11/2010 9:29 AM (permalink)

seq {} is slow and if used incorrectly(using skip 1 to simulate head/tail), the slowdown is exponential.

LazyList in most case beats seq {} but lose the generator style(yield which can be very convenient) and is still noticeably slower than tail recursive call(which becomes loop, the fastest).

So F# is interesting in that while it should promote functional style programming, it penalize that form in terms of performance. I am not sure if it is the underlying VM(i.e. function calls and lambdas are slow).

In other words, function is first class object with a second class status in terms of performance.

By gary on 3/11/2010 2:25 PM (permalink)

I don't think it is the lambdas that are slow. In .NET 4, delegate call performance is right up there with callvirt. Afaik sequences are implemented as state machines, which means they are bound to be slower than simple function calls.

Nevertheless, I was also a bit disappointed that my delightful 4 line option pricer has to be "devolved" into a sprawling recursive function that is a lot less straightforward to read.

By tomf on 3/11/2010 2:36 PM (permalink)

Nevertheless, I was also a bit disappointed that my delightful 4 line option pricer has to be "devolved" into a sprawling recursive function that is a lot less straightforward to read.

If you don't mind sharing, I'd be interested to see a short real-world example of this (both the slow elegant code and the fast ugly code), to have a real example to noodle on (in my copious free time, heh) for thinking of ways that we can keep a great programming model but improve the perf.

By brianmcn on 3/11/2010 2:51 PM (permalink)

Brian,

I deleted the sequence based code since I had no use for it, but I tried to quickly rewrite it here. For some reason there is a tiny difference between the prices generated, but I can't immediately see where the problem is (it is definitely in CalculatePayoff2 though - PriceAsian2 works correctly). Nevertheless, the sequence based code is consistently an order of magnitude slower.

I know that the sequence based code is easier to parallelize, but then again, monte carlo methods are embarrassingly parallel. We need to price thousands of these options at a time, so we can just plop in the parallelism at the option level and cut down on overhead at the same time. I'm just saying this to clarify that there is little to gain in parallelizing the actual pricing algorithm.

#light

open System
open System.Diagnostics

module Main =

    // Pricing input
    type PricingInput = {
        Spot : float
        Strike : float
        TimeStep : float
        UpStepSize : float
        DownStepSize : float
        UpMovementProb : float
        Interest : float
        Expiry : float
        Steps : int32
        Trials : int32
        Seed : int
    }

    (********************************************)
    (*** Somewhat hard to read recursive code ***)
    (********************************************)
    let CalculatePayoff (input:PricingInput) (random:Random) =
        let rec EvaluatePath spot total step =
            match step with
            | 0 -> (total + spot)
            | _ ->
                match random.NextDouble() with
                | x when x < input.UpMovementProb -> EvaluatePath (spot * input.UpStepSize) (total + spot) (step - 1)
                | _                               -> EvaluatePath (spot * input.DownStepSize) (total + spot) (step - 1)
        let averageSpot = (EvaluatePath input.Spot 0.0 (input.Steps-1))/float(input.Steps)
        Math.Max(averageSpot - input.Strike, 0.0)

    let PriceAsian (input:PricingInput) =
        let random = new Random(input.Seed)
        let rec payoffSampler total trial =
            match trial with
            | 0 -> total
            | _ -> payoffSampler (total + (CalculatePayoff input random)) (trial - 1)
        let payoffSum = payoffSampler 0.0 input.Trials
        payoffSum/float input.Trials
        
    (************************************)
    (*** The nice sequence based code ***)
    (************************************)
    let CalculatePayoff2 (input:PricingInput) (random:Random) =
        let GetPayoff average = Math.Max(average - input.Strike, 0.0)
        input.Spot |> Seq.unfold (fun spot -> match random.NextDouble() with
                                              | x when x < input.UpMovementProb -> Some(spot, spot * input.UpStepSize)
                                              | _                               -> Some(spot, spot * input.DownStepSize))
                   |> Seq.take (input.Steps) |> Seq.average |> GetPayoff

    let PriceAsian2 (input:PricingInput) =
        let random = new Random(input.Seed)
        let GetAveragePrice total = total/float input.Trials
        Seq.initInfinite (fun i -> CalculatePayoff2 input random) |> Seq.take input.Trials 
        |> Seq.fold (fun price total -> total+price) 0.0   |> GetAveragePrice


    // Set up pricing parameters
    let steps = 100
    let expiry = 0.08
    let volatility = 0.15
    let interest = 0.04
    let timeStep = expiry/float steps
    let variance = volatility * Math.Sqrt(timeStep)
    let upSize = Math.Exp(variance)
    let downSize = Math.Exp(-variance)
    let growth = Math.Exp(interest*timeStep)
    let upMovementProb = (growth - downSize)/(upSize - downSize)

    let input = { Spot = 50.0;
                  Strike = 50.0;
                  TimeStep = timeStep;
                  UpStepSize = upSize;
                  DownStepSize = downSize;
                  UpMovementProb = upMovementProb;
                  Interest = interest;
                  Expiry = expiry;
                  Steps = steps;
                  Trials = 250000;
                  Seed = 1000}

    let GetPreciseMilisecs (watch:Stopwatch) =
        float(watch.ElapsedTicks)/float(Stopwatch.Frequency)*1000.0;

    let rec DoWhile (watch:Stopwatch) =
        watch.Start()
        let price = PriceAsian input
        let discountedPrice = price * Math.Exp(-interest*expiry)
        watch.Stop()
        printfn "Single-Pricer(recursive)>>Price is %f calculated in %f miliseconds" discountedPrice (GetPreciseMilisecs watch)
        watch.Reset()
        watch.Start()
        let price = PriceAsian2 input
        let discountedPrice = price * Math.Exp(-interest*expiry)
        watch.Stop()
        printfn "Single-Pricer(sequence)>>Price is %f calculated in %f miliseconds" discountedPrice (GetPreciseMilisecs watch)
        watch.Reset()
        let choice = Console.ReadKey().KeyChar
        match choice with
        | 'q' -> ()
        | _ -> DoWhile watch

    printfn "Press 'q' to quit or any other key to re-run the calculation"
    DoWhile (new Stopwatch())

By tomf on 3/12/2010 3:41 AM (permalink)

Nevertheless, the sequence based code is consistently an order of magnitude slower.

Ah, yes, when all you're doing is floating-point arithmetic in a tight loop, then any allocation kills you (once you say 'new' or 'seq' or anything that allocates, you've lost compared to solutions that don't allocate).

I suppose the only thing I have to offer is that it may be useful to capture common abstractions into useful efficient helpers. For example, the "unfold... take... average" bit of CalculatePayoff2 seems like it might perhaps be a common pattern that you might factor into "AverageN" as shown here:

 

    /// Compute the average of [x, f(x), f(f(x)), ...(n times)...]
    let AverageN f x n =
        let mutable s = x
        let mutable total = 0.0
        for i in 0..n-1 do
            total <- total + s
            s <- f s
        total/float n

    let CalculatePayoff2 (input:PricingInput) (random:Random) =
        let GetPayoff average = Math.Max(average - input.Strike, 0.0)
        AverageN (fun spot ->
                    match random.NextDouble() with
                    | x when x < input.UpMovementProb -> spot * input.UpStepSize
                    | _                               -> spot * input.DownStepSize
                ) input.Spot input.Steps |> GetPayoff

and maybe AverageN can be reused in other contexts. I guess there are lots of algorithms that are elegantly structured as "create a (possibly infinite) sequence, apply some function, (truncate to N elements), fold to yield a result" and the way to hand-optimize it is always to do loop fusion so that the "actual sequence" disappears: "unfold...blahblah...fold" reduces to a loop and an accumulator or two. To keep the code elegant, I guess you just need to recognize the most common idioms/patterns of "unfold...blahblah...fold" and encapsulate them in efficient functions with good names. No silver bullets here. :)

By brianmcn on 3/12/2010 1:48 PM (permalink)

The recursive implementations run at about the same speed because the F# compiler actually compiles them into imperative loops. There are two issues which prevent the recursive implementation technique from being a general replacement for imperative loops: 1) the loop body is translated into a separate function so that you incur the overhead of passing all explicit and implicit arguments to the function every time the loop is entered and 2) you can't reference mutable variable from inside the loops. However, I suppose a future version of the F# compiler might inline the body of local loop functions into the caller, which would eliminate the calling overhead and would also make tuple unpacking optimizations applicable to the return value, so that you could replace multiple mutable variables with a single tuple return value.

By Stephan on 4/28/2009 2:01 AM (permalink)

Topic tags

Built with WebSharper

Home

Answers

Events

Courses

Groups and Conferences

Blogs

Jobs

Developers

Topic tags