Why choose this representation? It seems like it is very inefficient both for creation (you must rebuild the entire tree for every insert) and lookup (each letter takes O(n), well ok, O(26)). It is interesting to experiment with such structures for practice working with immutable data structures, but I think a real application would be better served by a mutable data structure (which would be easier to author, and give better run-time performance).

By on 12/6/2008 2:22 PM ()

Why choose this representation? It seems like it is very inefficient both for creation (you must rebuild the entire tree for every insert) and lookup (each letter takes O(n), well ok, O(26)). It is interesting to experiment with such structures for practice working with immutable data structures, but I think a real application would be better served by a mutable data structure (which would be easier to author, and give better run-time performance).

The reason you gave about interesting to experiment for practice working with immutable data structures :P

Since Ocaml has been around for a while longer than F#, there's more learning material about it. Particularly books with exercises, which I find is the best way to feel like you have practical experience working with a language even when you don't. I found this one at the end of chapter 2 of the freely available Ocaml ebook.

By on 12/6/2008 3:27 PM ()

Ok, if just for practice, then here is my stab at it (I need practice with this too :) ).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
#light

open System.Text

type LexNode = Node of char * bool * LexNode list  
// throughout, I'll use "lnl" for a variable of type "LexNode list"

// handy helper
let Char (Node(c,_,_)) = c

// Note: functions below are not tail recursive; deep tree could stack overflow

// useful display of data structure, I can't imagine debugging/verifying without it
let PrettyPrint lnl =
    let acc = new StringBuilder()
    let rec PrettyPrint lnl revPrefix depth =
        let spaces = new string(Array.create depth ' ')
        let Word c revPrefix = new string(c::revPrefix |> List.rev |> Array.of_list)
        match lnl with
        | [] -> ()
        | (Node(c,isWord,children)) :: t ->
            if isWord then
                acc.AppendFormat("{0}{1}            {2}\n", spaces, c, Word c revPrefix) |> ignore
            else
                acc.AppendFormat("{0}{1}\n", spaces, c) |> ignore
            PrettyPrint (children |> List.sort_by Char) (c::revPrefix) (depth+1)
            PrettyPrint t revPrefix depth
    PrettyPrint (lnl |> List.sort_by Char) [] 0
    acc.ToString()
            
// test the printer
let example = [ Node('t', false, [
                    Node('h', false, [
                        Node('e', true, [
                            Node('n', true, [])])])])
                Node('h', false, [
                    Node('i', true, [])])]
printfn "%s" (PrettyPrint example)

// given lnl, return a new lnl with word inserted
let Insert (word:string) lnl =
    let rec Insert lnl letters =
        match letters with
        | [] -> lnl
        | c :: suffix ->
            let isLastChar = List.is_empty suffix
            let same,others = lnl |> List.partition (fun x -> c = Char x)
            let newSame =
                match same with
                | [] -> Node(c, isLastChar, Insert [] suffix)
                | [Node(_,isWord,children)] -> Node(c, isWord || isLastChar, Insert children suffix)
                | _ -> failwith "there should not be multiple entries in list with same char"
            newSame :: others
    Insert lnl (word |> Seq.to_list)

let example2 =
    []
    |> Insert "then"
    |> Insert "the"
    |> Insert "hi"
    |> Insert "high"

printfn "-------------"
printfn "%s" (PrettyPrint example2)
By on 12/6/2008 4:29 PM ()

Heh, I can't believe I didn't think to convert the string to a char list first. That was one of the more annoying implementation aspects when I was doing mine, passing the index around through the recursion and having to index the string, pattern matching on the string makes it much simpler.

The lack of tail recursion bothered me at first, but I guess it's forgivable in this example, unless you're doing a dictionary for Icelandic or something where words can be extremely long.

By on 12/6/2008 5:04 PM ()
IntelliFactory Offices Copyright (c) 2011-2012 IntelliFactory. All rights reserved.
Home | Products | Consulting | Trainings | Blogs | Jobs | Contact Us | Terms of Use | Privacy Policy | Cookie Policy
Built with WebSharper