I thought I understood what let! and friends were all about, but now I have some doubts. Following is a code fragment:

async { ... let! count = inStream.ReadAsync(n) ...}

As I understand it, when the above workflow is executed (via Async.Run), the ReadAsync(n) returns its own async workflow, but that workflow is immediately executed and the resulting value bound to "count". Under the covers I'm guessing that this also causes the outer workflow to sleep until the result is produced. Further the ReadAsync is (may be) executed on a different thread. So, is this correct? If so, what is the advantage of using let! here?

In particular, how does the above win relative to:

async { ... let count = inStream.Read(n) ...}

In both cases the computation must wait for the Read/ReadAsync to complete, right? In the former case we also have the overhead of spawning a new task/thread -- so where's the win?

Thanks in advance,

Bill

This depends on your code. Just ask yourself what the gain of using ReadAsync instead of Read is; this has nothing to do with the async workflow.

By Carsten Koenig on 12/17/2008 9:39 PM (permalink)

In this instance I see *no* advantage to using ReadAsync vs. Read. Normally I'd use a ReadAsync when the program has some other work to do while the read progresses on a different thread. In this case, due to the semantics of let! (as I understand it) there's nothing to do but wait (in the original thread.) So, am I misunderstanding something -- or does the use of let! make the associated use of an Async version of the operation less than optimal?

Thanks,

Bill

By Bill_Cohagan on 12/18/2008 4:08 PM (permalink)

Yes, in both cases you must wait for the operation to complete before you have access to the "count". The difference is how you wait: in the first case the thread is blocked until the operation completes. This is not so good, as threads are expensive (by default on .NET each thread has a 1 MB stack). In the second case no thread is blocked; only an object holding a callback is registered with the thread pool (which costs only a few bytes), the original thread is free to carry on doing other work, and the end of the computation will be done by a thread from the thread pool.

By Robert on 12/18/2008 12:15 AM (permalink)

A potential point of misunderstanding is that Robert switched 'first' and 'second' above.

let count = stream.Read()

blocks a thread.

   let! count = stream.ReadAsync()

does not block a thread (it starts the request, but then schedules a callback on the IO completion port, so when the data arrives from the network/filesystem/whatever, the OS finds a free threadpool thread to invoke the callback on and the workflow continues from there).
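
A minimal sketch of that contrast in current F#. It assumes today's FSharp.Core names rather than the ones used in this thread: Async.RunSynchronously where the thread says Async.Run, and the Stream.AsyncRead extension where the thread says ReadAsync. The helpers readBlocking and readNonBlocking and the MemoryStream are made up for illustration; a MemoryStream completes at once, so this shows only the shape of the two versions, not a real wait on a disk or network.

```fsharp
open System.IO

// Hypothetical helper: synchronous read inside a workflow.
// The thread running the workflow sits inside Read until data arrives.
let readBlocking (stream: Stream) (buffer: byte[]) =
    async {
        let count = stream.Read(buffer, 0, buffer.Length)
        return count
    }

// Hypothetical helper: asynchronous read bound with let!.
// The workflow is suspended at the let! and resumed when the read completes.
let readNonBlocking (stream: Stream) (buffer: byte[]) =
    async {
        let! count = stream.AsyncRead(buffer, 0, buffer.Length)
        return count
    }

// Illustrative in-memory data standing in for a file or socket.
let data = Array.init 16 byte

let blockingCount =
    use s = new MemoryStream(data)
    readBlocking s (Array.zeroCreate 16) |> Async.RunSynchronously

let nonBlockingCount =
    use s = new MemoryStream(data)
    readNonBlocking s (Array.zeroCreate 16) |> Async.RunSynchronously

printfn "blocking read: %d bytes, async read: %d bytes" blockingCount nonBlockingCount
```

Both versions produce the same count; the only difference is whether a thread is held while the read is pending.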

By brianmcn on 1/21/2009 9:53 PM (permalink)

Hmm. So, what happens to the thread containing the let! after the request is started? Surely it doesn't continue past the let! -- so I assume that thread just terminates? Then when the callback (which is the continuation?) is executed a new thread is obtained and it is executed, right? Assuming this is correct, the use of let! has caused us to obtain a thread (for the ReadAsync), return one thread (the original thread containing the let!), and finally obtain another thread when it comes time to continue past the let!. Also we must have returned the thread associated with the ReadAsync to the pool.

If we just used let/Read wouldn't we avoid all the thread manipulation? Yes, we "block" a thread -- but unless that blocking is a busy/wait we're just putting the thread to sleep. So, clearly using let!/ReadAsync must have some advantage to compensate for all the thread cruft, but I don't understand what that might be.

So, bottom line, I think I do not understand the relative cost of "blocking" a thread via let/Read vs. the cost of thread manipulations with let!/ReadAsync. Can you elaborate? [Please feel free to assume I'm ignorant!]

Thanks,

Bill

By Bill_Cohagan on 1/22/2009 3:51 PM (permalink)

OK, so Robert said:

"The difference is how you wait: in the first case the thread is blocked until the operation completes. This is not so good, as threads are expensive (by default on .NET each thread has a 1 MB stack). In the second case no thread is blocked; only an object holding a callback is registered with the thread pool (which costs only a few bytes), the original thread is free to carry on doing other work, and the end of the computation will be done by a thread from the thread pool."

The point being, threads are expensive, you want to create as few as possible.

Now suppose that 1000 copies of this async workflow are running in parallel, and suppose the thread pool has 10 threads to start, and suppose each IO call (Read or whatever) will block for 10ms (on disk or network or whatever; the exact numbers don't matter much). In the 'let' case (synchronous IO call), either

1) we don't add any more threads to the thread pool, which means we start 10 reads, wait for them to complete, then start the next 10, wait, ... and end up taking more than a full second (1000 reads to do, 10 at a time, 10ms each = 1s) to do all this IO, most of the time with our CPU idle waiting, or

2) we add threads to the threadpool to keep the CPU busy... but threads are expensive, and we just increased the memory footprint of our app (as well as some other overhead), and it's unclear how many threads you should add (trading off memory and other overhead to attempt to keep CPU busy)

Compare to the 'let!' case (async IO call):

1) we start the first 10 (and release the thread after kicking off the read), then immediately start the next 10, etc, kicking off all the reads as fast as the CPU can go (but at most 10 BeginRead()s at a time, since just 10 threads), and probably get them all going before the first result even comes back

2) as the calls return, free threads from the threadpool are used to process the results, each using only a few bytes in the queue on the IOCompletionPort between the time when the 'data is ready' and 'a thread is available to process the result' (invoking the EndRead() callback on a free thread).

The end result is our CPUs are always busy AND we have not had to create any extra threads. You can do all this with the existing async programming model (BeginFoo/EndFoo), but it is a huge pain to author that code and get it correct, and the resultant code is always unreadable. With F#, all the goo is encapsulated, so you just change 'let' to 'let!' and 'Foo()' to 'FooAsync()' and you get the CPU/memory benefits of the async-style code with hardly any fuss.
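
The 1000-workflow scenario can be tried out with a rough sketch like the one below. It is only an approximation under stated assumptions: Async.Sleep stands in for a 10ms IO call (it suspends the workflow without holding a thread, much as a pending read would), simulatedRead is a made-up name, and Async.RunSynchronously is today's name for what this thread calls Async.Run. The elapsed time depends on the machine, so no figure is claimed here.

```fsharp
open System.Diagnostics

// Hypothetical stand-in for one IO-bound workflow: Async.Sleep suspends the
// workflow for roughly 10 ms without blocking a thread while it waits.
let simulatedRead (i: int) =
    async {
        do! Async.Sleep 10
        return i
    }

let stopwatch = Stopwatch.StartNew()

// Start all 1000 workflows, then block once until every one has finished.
let results =
    [ for i in 1 .. 1000 -> simulatedRead i ]
    |> Async.Parallel
    |> Async.RunSynchronously

stopwatch.Stop()

printfn "completed %d workflows in %d ms" results.Length stopwatch.ElapsedMilliseconds
```

Replacing the Async.Sleep with a blocking System.Threading.Thread.Sleep 10 gives the 'let' variant of the same experiment, where each waiting workflow holds a thread pool thread.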

Note that a line like

let! result = ReadAsync()

effectively means "on the current thread we are on prior to this line of code, kick off the read and schedule a callback to run the continuation of this workflow when the result arrives, and then release the current thread". So 'let!' does a little work and then lets go of the thread. Eventually something will call back, and so a (probably) new thread will run the continuation. You can see this happening by printing out the value of System.Threading.Thread.CurrentThread.ManagedThreadId before and after a 'let!'. In sum, this releases a thread and gets a (probably) new thread back, but while the long-running IO operation is pending, no thread is blocked, leaving that thread free to do other work rather than blocked and waiting.
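
A small sketch of the thread-id experiment described above, assuming current FSharp.Core (Async.RunSynchronously rather than Async.Run). The slowAnswer workflow is a made-up stand-in for a long-running IO call. Whether the two ids differ depends on the scheduler, so the output is not predicted here.

```fsharp
open System.Threading

// Hypothetical long-running operation: suspends for about 50 ms, then yields a value.
let slowAnswer =
    async {
        do! Async.Sleep 50
        return 42
    }

// Record the managed thread id on either side of a let! that really suspends.
let showThreads =
    async {
        let before = Thread.CurrentThread.ManagedThreadId
        let! answer = slowAnswer
        let after = Thread.CurrentThread.ManagedThreadId
        return before, after, answer
    }

let before, after, answer = Async.RunSynchronously showThreads

printfn "before let!: thread %d, after let!: thread %d (answer %d)" before after answer
```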

By brianmcn on 1/22/2009 4:43 PM (permalink)

Hi Bill,

Your question seems closely related to a question I asked a while ago as well. That thread might help:

[link:cs.hubfs.net]

cheers,

Kurt

By Kurt on 1/22/2009 12:59 PM (permalink)

Kurt

I took a look at the referenced thread and yes, it's almost exactly the question I have. From that thread I gather that a blocked thread is unavailable while an Async call releases the calling thread back to the thread pool. I can see the advantage there. What's misleading is that it's not the "Async-ness" that's buying you anything; rather it's the return of the thread to the pool. You're not speeding things up (assuming there are enough threads), but are conserving resources (that might cause that last assumption to become false.)

So, this is beginning to remind me of "cooperative multitasking" in the sense that the programmer has to know/understand this resource conservation technique. Having it presented in so many examples (having to do with performance improvements) without such an explanation must be causing confusion to more than just you and me!

Thanks for jumping in. The old thread helped a lot.

Bill

By Bill_Cohagan on 1/22/2009 4:27 PM (permalink)

I think that's a nice way of looking at it. Thanks.

And yes, there are subtle aspects at work here. For example, usually IO is done in a physically parallel thread, more or less down to the hardware level. No need to block the calling resource for that.

Also, as in Brian's example, there is an aspect of load balancing: you assume that all the processing and reading steps don't take exactly the same amount of time (if that were the case, theoretically, you wouldn't see a speedup at all). But by using async, the granularity of the tasks you're firing is much smaller, allowing the system to balance the load much better among available resources.

Kurt

By Kurt on 1/22/2009 11:50 PM (permalink)

So, Robert, it appears you are saying that using let! and an Async operation (the 1st case) is NOT the Right Thing, correct? Is the only justification for let! and friends that you sometimes need to run (to completion) some Async operation supplied from elsewhere?

I think I'm still confused...

Bill

By Bill_Cohagan on 12/18/2008 4:10 PM (permalink)

Yes, if you only have one operation to do then you will pay a small overhead for using the asynchronous technique. This is because Async.Run will block the main thread and wait for the result.

To take advantage of the async functionality you need to be in one of two cases:

1) You don't care about the result. In this case you can use Async.Spawn, which will execute the task asynchronously, continue processing on the main thread and throw away the result. This is sometimes called the fire and forget model.

2) You have multiple operations you need to execute and you don't care about the execution order. You can compose the operations using Async.Parallel and then use Async.Run to collect a list of the results. In this case the main thread will start all the operations, then block until all of them have been executed and processed by thread pool threads.
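
The two cases might look like this in current F#. This is a sketch under assumptions: as far as I know, Async.Start and Async.RunSynchronously are the present-day names for what this thread calls Async.Spawn and Async.Run, and squareAfterDelay is a made-up task with Async.Sleep standing in for real IO. Note that Async.Parallel hands the results back as an array, in the same order as the inputs.

```fsharp
// Hypothetical task: pretend to wait on IO, then return the square of n.
let squareAfterDelay (n: int) =
    async {
        do! Async.Sleep 10
        return n * n
    }

// Case 1: fire and forget. Async.Start returns immediately; the workflow
// must produce unit, so the result is dropped with Async.Ignore.
squareAfterDelay 7 |> Async.Ignore |> Async.Start

// Case 2: compose several operations with Async.Parallel, then block the
// calling thread once until all results are in.
let squares =
    [ 1; 2; 3 ]
    |> List.map squareAfterDelay
    |> Async.Parallel
    |> Async.RunSynchronously

printfn "%A" squares
```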

Is that any clearer?

Cheers,

Rob

By Robert on 12/19/2008 3:08 AM (permalink)

Robert,

I've thought further about your response. While it addresses the use of Async forms of methods, I think I'm still confused about let! (!). Can you provide an example where using let! in association with an async form of method (e.g., ReadAsync) WOULD be appropriate?

If not, then why are let! and friends even in the language?

Bill

By Bill_Cohagan on 12/19/2008 6:40 AM (permalink)

Hi Bill,

Don't nearly all the examples from Expert F# fall into the second category I mentioned? Most process a number of "tasks" in parallel.

(I'm going on memory here as I don't have the book with me) For example the image processing sample uses AsyncRead to process a number of images in a directory asynchronously. It doesn't matter too much about which order the images are processed in so we run them all in parallel. We get a performance gain because no threads are blocked while reading the images.

Thanks,

Robert

By Robert on 12/19/2008 7:20 AM (permalink)

Robert

I've been away for a while, but am back and would like to continue this thread as I'm still uncomfortable with the use of let!. In answer to your comment about the image processing example, the answer is yes and no. I'm looking at the code on p. 372 if you have your book handy. (I can type in the code if you don't have easy access to the book.)

The function ProcessImagesAsync() constructs a sequence of Async operations that are ultimately executed via calls to Async.Run (and Async.Parallel), and this makes perfect sense to me. The function ProcessImageAsync(i) is called to construct each of the Async operations. Now, that function makes less sense to me. In particular it uses let! to bind the result of a call to ReadAsync -- and this is exactly the pattern that makes no sense to me.

Can you explain to me the advantage to using let!/ReadAsync vs. using let/Read in this example? Given my current understanding the latter is a clear win since no multithreading overhead is incurred. In either case (as I understand it) the thread is blocked until the binding of the pixels variable.

I fear that I'm oversimplifying things somewhere, but for the life of me I just don't see the usefulness of let! (unless the expression supplying the value is passed in and is an Async<'a> type.)

Bill

By Bill_Cohagan on 1/21/2009 3:48 PM (permalink)

OK, that helps and is consistent with my assumptions. It's somewhat misleading that there are so many examples like the one I originally posted where use of let! seems inappropriate. Most of the async examples in _Expert F#_ have this characteristic, for instance. While I realize these are somewhat pedagogical, I would have expected some sort of disclaimer regarding the use of let! in order to avoid precisely the confusion I suffered!

Thanks again.

Bill

By Bill_Cohagan on 12/19/2008 6:25 AM (permalink)
