Lists and Seq's each have an outstanding "feature" which helps you decide which to use.

For a list, the "feature" is that it has the structure hd::tl, which makes left-to-right processing and pattern matching easy, and computationally efficient.

For a seq, the "feature" is that it lazy evaluated, ie. the "head" element is only computed when it is used. Thus a seq can be used for input, in a way that lists can't.

Example 1 - Use a seq for large, or infinite, sequences

Aim: Print the first leaf of a very large list of lists (or seq of seqs)

open List
let test_big_list =
let chunk = [1..10000000]
let mem_hog = [ for i in 1..10000000 -> chunk]
printf "First leaf = %i\n" (hd (hd mem_hog))
test_big_list

open Seq
let test_big_seq =
let chunk = seq {1..10000000}
let mem_hog = seq { for i in 1..10000000 -> chunk }
printf "First leaf = "%i\n" (hd (hd mem_hog))
test_big_list

If you put both of these in the interpreter, you'll find:
- The List version takes several seconds to compute, or causes an out-of-memory error, because the whole structure is built in memory before the "printf" is invoked.
- The seq version is evaluated instantly, because it only computes the first element.

Example 2 - Use a list for manipulating structured data

Aim: Given a list, or seq, "double" it, by making each element appear twice

let rec double_up = function
| [] -> []
| hd::tl -> hd::hd::(double_up tl)

With a list, this is too easy! :)

Seq is much less obvious, and involves application of two functions..

let rec double_up_s s = Seq.map_concat (fun a -> seq [a;a]) s

(it took me quite a while to work that out, after looking for a more direct solution)

--------------------------------

I agree that it is hard finding this stuff in the books, but it is there.

See Syme p 463 for file i/o with a Seq

Harrop (F# for Scientists) p. 130 has a beautiful example of file i/o with Seq.

let lines_of_file filename =
seq { use stream = File.OpenRead filename
use reader = new StreamReader(stream)
while not reader.EndOfStream do
yield reader.ReadLine() }

Then he reads the first three lines with...

lines_of_file "my_text.txt" |> Seq.take 3;;

Now that's when a compution expression is more than sytactic sugar!

By on 3/21/2009 6:46 PM ()

For me, the difference between Lists and Seqs is that List is a data structure and Seq is a computation.

I agree the outstanding feature of Lists is pattern matching or structural decomposition. This allows you to write recursive functions over the structure of Lists. You can't do this (efficiently) with Seqs because they have no structure, they're a computation (you can't pattern match a function).

I don't see lazy evaluation as the outstanding feature of Seqs though. You can have a LazyList, it is a data structure like a List, enjoys structural decomposition like a List, but it is lazyily evaluated. Seqs on the other hand are a computation, so if that computation is a generator function that evaluates the next element then yes that looks lazy, but if that computation just looks up the element from a List already in memory then it is not really any more lazy than the List itself.

What I do see as the outstanding feature of Seqs is stream fusion.

1
2
3
4
let list0 = [1..1000000]
let list1 = List.map (fun x -> 2 * x) list0
let list2 = List.filter (fun x -> x % 2 = 0) list1
let list3 = List.map (fun x -> x / 2) list2

In this code I allocate a list of a million elements (list0), then I map over it and allocate another list of one million elements (list1), then I filter it and there's another million, and then another. So at the end of the day I have 4 million list nodes in memory. If I wrote the same code with Seqs then

1
2
3
4
let list0 = [1..1000000]
let seq1 = Seq.map (fun x -> 2 * x) list0
let seq2 = Seq.filter (fun x -> x % 2 = 0) seq1
let seq3 = Seq.map (fun x -> x / 2) seq2

I still allocate a million nodes for my initial list. But each time I perform an operation on it, I am not copying the entire list because the Seq.* functions build a computation rather than transform a data structure. So at the end of this code the memory contains just 1 million list nodes and 3 Seq computation objects.

By on 3/22/2009 7:56 PM ()

So how fast are conversions to/from Seq, like List.of_seq or Seq.of_array?

And the only "pairwise" function in the library is Seq.pairwise but it's easy to follow the same "template" and write "pairwise" that operates on lists, or that operates on a list and returns pairs in ResizeArrays, etc. So unless conversions to/from Seq are very cheap, I wonder why there is not List.pairwise etc.

By on 3/25/2009 12:09 PM ()

Conversions to seq are no-ops (upcasts - Lists and Arrays and whatnot implement IEnumerable<T>).

Conversions from seq to concrete types are O(N) in time and space. But so is any 'creation of a new collection', so in a sense this is 'as cheap as possible' (from a big-oh perspective anyway).

Seq.pairwise works great for concrete collections.

By on 3/25/2009 12:13 PM ()

I like the description of the 'outstanding features'.

Seq is much less obvious, and involves application of two functions..

let rec double_up_s s = Seq.map_concat (fun a -> seq [a;a]) s

(it took me quite a while to work that out, after looking for a more direct solution)

Here's an easy solution:

1
2
3
4
5
6
let DoubleUp s =
    seq { 
        for x in s do 
            yield x
            yield x
    }
By on 3/21/2009 7:36 PM ()

I'll try to help out a bit.

The Seq type is IEnumerable, correct (it's just a type alias, AFAIK). F# List, BCL's List<T>(ResizeArray), Array -- they all implement the IEnumerable interface, and hence are all Seqs.

The important thing to understand about IEnumerable/seq is that it is simply a forward-only, read-only enumeration type. While it can be backed by arrays and lists, it could also be backed by code that generates values as needed. For example, reading a file or reading a database recordset could be viewed as IEnumerables.

Sequence computations (seq { for ... }) have counterparts: lists are "[ for ... ]" and array is "[| for ... |]".

Seq has more functions cause a seq function can work across all seq types, not just list, array, etc. The difference is that a generic sequence-based function won't always be as efficient. For example, doing length on an array is O(1). Doing length on a sequence is O(n). But, the F# team is also still cleaning up those modules so expect them to change some still.

In your example of doing List.of_seq on a ResizeArray, that's what's expected, because a ResizeArray IS a seq: it implements IEnumerable.

Does that help some?

By on 3/18/2009 5:24 PM ()

yes that helps, thanks.

By on 3/20/2009 2:24 PM ()
IntelliFactory Offices Copyright (c) 2011-2012 IntelliFactory. All rights reserved.
Home | Products | Consulting | Trainings | Blogs | Jobs | Contact Us | Terms of Use | Privacy Policy | Cookie Policy
Built with WebSharper