In my own project, I've been using the technique described in
Expert F#, Chapter 9, "Using XML as a Concrete Language Format" (starting p. 212).

I have a feeling that, by using reflection, it might be possible to automate the (de)serialization of discriminated union types into an XML format, just as it's possible to automate the (de)serialization of record types into a CSV format.

Anyone up for the challenge? :-)

By on 3/26/2008 4:06 PM ()

Let me elaborate. For discriminated unions, perhaps some form of automatic XML format would make sense. The .NET serialization doesn't work out of the box, because discriminated unions don't have empty constructors.

By on 3/26/2008 10:58 AM ()

...automatic XML ... The .NET serialization doesn't work out of the box, because discriminated unions don't have empty constructors.

Having investigated the current status, the next step is to reproduce the problem namin encountered, and then to devise an appropriate solution. This first automated approach will likely generate an XML representation that is more verbose than the succinct one we wrote by hand.

Background info:

XmlSerializer is described here:

[link:msdn2.microsoft.com]

An example of the C# disassembly of a discriminated union, using Reflector, is here:

[link:blogs.msdn.com]

By on 3/28/2008 3:54 PM ()

Clearly, running XmlSerializer on F# structures isn't getting us to the desired result. Possibly this can all be fixed by internal changes to the F# compiler -- though the fact that XmlSerializer ignores the standard [NonSerialized] attribute is worrisome. XmlSerializer has its own attributes for controlling its behavior, but it shouldn't be necessary to use those for most .NET data.

EDIT: See "Using Reflection to Serialize F# Data to/from XML", [link:cs.hubfs.net] for the implementation of a generalized solution.
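
For readers who only want the flavor of the reflection idea, here is a minimal sketch. It is not the implementation behind the link above; it assumes a current FSharp.Core (the Microsoft.FSharp.Reflection API) and System.Xml.Linq, and the names toXml and fromXml are made up for this illustration. It handles only unions whose fields are primitives or other unions; arrays, records and attributes are left out.

```fsharp
open System
open System.Globalization
open System.Reflection
open System.Xml.Linq
open Microsoft.FSharp.Reflection

type NodeIndex = int

type Wires =
    | Complete
    | Input of NodeIndex
    | Output of NodeIndex
    | Thru of NodeIndex * NodeIndex

/// Union value -> element named after the case, with one child element per field.
let rec toXml (ty: Type) (value: obj) : XElement =
    if FSharpType.IsUnion(ty) then
        let unionCase, fieldValues = FSharpValue.GetUnionFields(value, ty)
        let children =
            Array.map2
                (fun (p: PropertyInfo) v -> toXml p.PropertyType v)
                (unionCase.GetFields())
                fieldValues
        XElement(XName.Get(unionCase.Name), box children)
    else
        // Assumption: every non-union field is a primitive such as int.
        let text = Convert.ToString(value, CultureInfo.InvariantCulture)
        XElement(XName.Get(ty.Name), box text)

/// Inverse of toXml: pick the case by element name, then rebuild its fields.
let rec fromXml (ty: Type) (el: XElement) : obj =
    if FSharpType.IsUnion(ty) then
        let unionCase =
            FSharpType.GetUnionCases(ty)
            |> Array.find (fun c -> c.Name = el.Name.LocalName)
        let args =
            Array.map2
                (fun (p: PropertyInfo) child -> fromXml p.PropertyType child)
                (unionCase.GetFields())
                (Seq.toArray (el.Elements()))
        FSharpValue.MakeUnion(unionCase, args)
    else
        Convert.ChangeType(el.Value, ty, CultureInfo.InvariantCulture)

// Round trip a single value.
let xml = toXml typeof<Wires> (box (Thru (1, 2)))
printfn "%s" (xml.ToString(SaveOptions.DisableFormatting))
let back = fromXml typeof<Wires> xml :?> Wires
printfn "%b" (back = Thru (1, 2))
```

The declared type is passed alongside each value, rather than calling GetType() on the value, so the sketch does not depend on how the compiler represents individual cases at run time.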

By on 3/29/2008 1:14 PM ()

Translated Micado serialization from C#, and used that to xml-serialize an int, and then [unsuccessfully!] a discriminated union containing ints:

type NodeIndex = int

type Wires =
    Complete 
  | Input of NodeIndex    // Input Terminal with wire coming out.
  | Output of NodeIndex   // Output terminal with wire coming in.
  | Thru of NodeIndex * NodeIndex

module IO = System.IO
module TEXT = System.Text
module X = System.Xml
module XS = System.Xml.Serialization

let UTF8ByteArrayToString (chars:byte[]) :string =
  let encoding = new TEXT.UTF8Encoding()
  encoding.GetString(chars)

/// return Xml string.
let SerializeObject (ob:obj) :string =
  let typ = ob.GetType()
  let mem = new IO.MemoryStream()
  let xs = new XS.XmlSerializer( typ )
  let writer = new X.XmlTextWriter(mem, TEXT.Encoding.UTF8)
  xs.Serialize(writer, ob)
  let mem2 = writer.BaseStream :?> IO.MemoryStream
  let resultStr = UTF8ByteArrayToString (mem2.ToArray())
  resultStr

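// --- interactive session ---
// Assumed helper (its definition is not shown in this post): print a value, then a label.
let pam x (msg:string) = printfn "%A = %s" x msg
pam (SerializeObject (box(12))) "Serial 12"
//> "<?xml version=\"1.0\" encoding=\"utf-8\"?><int>12</int>" = Serial 12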

let w1 = Input 1
pam w1 "w1"
//> Input 1 = w1

//SerializeObject (box(w1))
//==>   *** ERROR -- No Parameterless Constructor
 

The integer came through as "<int>12</int>", but the union threw an error. Disassembling in Reflector, we see that each alternative (here, "Input") of the union becomes a nested class, with a constructor that takes the needed data -- here, "Wires._Input(_Input1:int)".

[Serializable, CompilationMapping(SourceLevelConstruct.SumType)]
public class Wires : IStructuralHash, IComparable
{
...
  [Serializable]
  public class _Input : MyModule.Wires
  {
    // Fields
    public int _Input1;

    // Methods
    public _Input(int _Input1)
    {
        this._Input1 = _Input1;
    }
  }
...
}

This seems to be a bug in F#'s implementation of unions. For the sake of serialization, there should be a private parameterless constructor added to each alternative:

    private _Input(){}
By on 3/29/2008 11:28 AM ()

Further experimentation shows additional issues serializing F# types.

Record structure:

type WRecord =
  { In: NodeIndex; Out: NodeIndex }

This also lacks a private parameterless constructor for the serializer to use.

Class with fields:

type WClass() =
  [<DefaultValue>]
  val mutable In: NodeIndex
  [<DefaultValue>]
  val mutable Out: NodeIndex
  with
  member w.init (in0:NodeIndex) (out0:NodeIndex) =
    w.In <- in0; w.Out <- out0;

That is the ugly construct I came up with in order to be able to serialize. It has an empty constructor, plus an init member that the client must call to fill the mutable fields. (NOTE: if the implementation added a private parameterless constructor, this work-around would not be needed.) Even then, it yielded a verbose serialization:

<WClass><_In>11</_In><_Out>22</_Out><In>11</In><Out>22</Out></WClass>

The ideal output would be:

<WClass><In>11</In><Out>22</Out></WClass>

What happened is that F# implements each member as an internal field "_In" combined with a property "In", and both were given values in the serialization. The fix would be for F# to mark the internal fields as "NonSerialized". See "Object Serialization in the .NET Framework / Selective Serialization" [link:msdn2.microsoft.com]. Here, I will do so manually:

type WClass() =
  [<DefaultValue>][<System.NonSerialized>]
  val mutable In: NodeIndex
  [<DefaultValue>][<System.NonSerialized>]
  val mutable Out: NodeIndex
  with
  member w.init (in0:NodeIndex) (out0:NodeIndex) =
    w.In <- in0; w.Out <- out0;

Unfortunately, this doesn't change the XmlSerializer result. Examining the disassembly in Reflector, the fields are now marked "[NonSerialized, DefaultValue, NonSerialized]". I'm not sure why assembly space is being wasted by the double-marking of NonSerialized, and I don't know why XmlSerializer is ignoring the normal serialization markings.

As a separate issue, putting such a serialization into a file wouldn't yield human readable text, as no newlines or indentation is added.

By on 3/29/2008 11:49 AM ()

Added the following logic to convert the results of XmlSerialize into a human readable file. The approach is to let XmlDocument do the pretty-printing work, via LoadXML and Save:

module RE = System.Text.RegularExpressions

/// Strip the xml version line "<?...?>"
/// NOTE: Also stripping invalid char that appeared BEFORE that version info!
let stripXmlVersion (s:string) :string =
  //let re = new RE.Regex(@"<?.*?>")
  RE.Regex.Replace(s, @".*\<\?.*\?\>", "")

/// Convert from all-in-one-line string designed for PC interchange,
/// To a human readable file with newlines and indentation.
/// NOTE: XmlDocument can't handle the (standard!) xml version info
/// at the string start, so strip it out first.
let SavePrettyXml (s:string) (filename:string) :unit =
  let doc = new X.XmlDocument()
  let s2 = stripXmlVersion s
  doc.LoadXml s2
  doc.Save filename

/// XmlSerialize object and save to specified file.
let ser2file (ob:#obj) (filename:string) =
  let xmlString = SerializeObject (box(ob))
  SavePrettyXml xmlString filename


ser2file wc "wclass.xml"
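
A possible simplification, offered as a sketch rather than a tested replacement: the invalid character in front of the version info is most likely the UTF-8 byte order mark that XmlTextWriter emits when given Encoding.UTF8. If that assumption holds, the declaration itself should parse fine once that one character is trimmed, and no regular expression is needed. The helper name prettyXml is made up here, and the sketch uses System.Xml.Linq instead of XmlDocument.

```fsharp
open System.Xml.Linq

/// Re-indent an all-in-one-line XML string for human readers.
let prettyXml (xml: string) : string =
    // Assumption: the only junk before the declaration is a byte order mark.
    let cleaned = xml.TrimStart('\uFEFF')
    // XDocument.ToString() indents by default and leaves out the declaration.
    XDocument.Parse(cleaned).ToString()

let sample =
    "\uFEFF<?xml version=\"1.0\" encoding=\"utf-8\"?><WClass><In>11</In><Out>22</Out></WClass>"

printfn "%s" (prettyXml sample)
```

The result can then be written with System.IO.File.WriteAllText if a file is wanted.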

--- file "wclass.xml" contents: ---

<WClass xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <_In>11</_In>
  <_Out>22</_Out>
  <In>11</In>
  <Out>22</Out>
</WClass>
By on 3/29/2008 12:53 PM ()

NOTE: "scheme" => "schema"

How about a three-way interchange between a compact form that is easy to type, an XML representation, and F# data structures?

Sounds like a straightforward test of the parser generator I am constructing. Good fit to FParsec abilities, so this gives me further motivation to construct my generator on FParsec.

Even before the generator exists, manually constructing a parser for a specific schema (that is, F# code calling FParsec functions) should be simple enough; I will give that a go. If you want to try that yourself, see [link:quanttec.com]

Can you provide a short example of one such recursive data structure involving discriminated unions, so that I know I am heading on the track you had in mind? ~TMSteve

By on 3/26/2008 4:07 PM ()

The Expert F# example mentioned above (which is about Composite Scenes) is a good simple example.

In my case, here is an example of a discriminated union I am actually using:

type FlowBox =
  Primitive of Attachments * Used
  | Extended of Attachments * Used * FlowBox
  | Or of Attachments * FlowBox array * Ordering
  | And of Attachments * FlowBox array
  | Seq of Attachments * FlowBox array

In the same style as the Expert F# example, I parse an XML representation of this type as follows:

let rec extractBox (node : XmlNode) =
  let attribs = node.Attributes
  let childNodes = node.ChildNodes
  let attachments = extractAttachments(attribs)
  let childBoxes() =
    [| for child in childNodes -> extractBox(child) |]
  match node.Name with
  | "Primitive" ->
    FlowBox.Primitive(attachments, extractUsed(attribs))
  | "Extended" ->
    FlowBox.Extended(attachments, extractUsed(attribs), extractBox (childNodes.Item(0)))
  | "Or" ->
    FlowBox.Or(attachments, childBoxes(), extractOrdering(attribs))
  | "And" ->
    FlowBox.And(attachments, childBoxes())
  | "Seq" ->
    FlowBox.Seq(attachments, childBoxes())
  | s ->
    failwith ("unrecognized box type " ^ s)

I am omitting lots of code here :-) If you're interested, the export and import to XML is done in the module Serialization of the following file:
[link:code.google.com]

By on 3/26/2008 6:03 PM ()

Working towards a schema-driven reader/writer for F# discriminated unions.

The first step is to take a test case and write by hand the code to read and write a FlowBox. This version is NOT driven by a schema. It just shows what needs to occur. This code will be similar to what namin already did, in the file he linked. I've stripped out anything not needed for this example, and reorganized:

#light

type NodeIndex = int

type Wires =
    Complete 
  | Input of NodeIndex    // Input Terminal with wire coming out.
  | Output of NodeIndex   // Output terminal with wire coming in.
  | Thru of NodeIndex * NodeIndex
  with
    static member create inputWire outputWire =
        match inputWire, outputWire with
        | None, None -> Complete
        | None, Some output -> Input output // careful not to reverse
        | Some input, None -> Output input
        | Some input, Some output -> Thru (input, output)
    member w.inputOpt =
      match w with
      | Complete
      | Input _ -> None
      | Output inIndex
      | Thru (inIndex, _ ) -> Some inIndex
    member w.outputOpt =
      match w with
      | Complete
      | Output _ -> None
      | Input outIndex
      | Thru (_, outIndex ) -> Some outIndex

type FlowBox =
    Prim of Wires  // Primitive
  | Extd of Wires * FlowBox // Extended
  | And of Wires * FlowBox array
    
/// The XML Serialization is loosely inspired by Expert F#, Chapter 9, 
/// section Using XML As a Concrete Language Format (starting p.212)
module Serialization =
    open System.Xml
    let InputStr = "input";
    let OutputStr = "output";
    let PrimStr = "Primitive";
    let ExtdStr = "Extended";
    let AndStr = "And";
    
    // ----- Build data from XML. -----
    
    let importInts (text : string) =
        text.Split([|'[';']';';'|]) 
     |> Array.map (fun s -> s.Trim()) 
     |> Array.filter (fun s -> s <> "")
     |> Array.map System.Int32.Parse
    
    let importIntSet = importInts >> Set.of_array

    let extractIntOption attrName (attribs : XmlAttributeCollection) =
        let item = attribs.GetNamedItem(attrName)
        if item = null
        then None
        else Some (Int32.of_string(item.Value))

    let extractWires (attribs : XmlAttributeCollection) =
        let inputOpt = extractIntOption InputStr attribs
        let outputOpt = extractIntOption OutputStr attribs
        Wires.create inputOpt outputOpt

    let rec extractBox (node : XmlNode) =
        let attribs = node.Attributes
        let childNodes = node.ChildNodes
        let wires = extractWires(attribs)
        let childBoxes() =
            [| for child in childNodes -> extractBox(child) |]
        match node.Name with
        | "Primitive" ->
            FlowBox.Prim(wires)
        | "Extended" ->
            FlowBox.Extd(wires, extractBox (childNodes.Item(0)))           
        | "And" ->
            FlowBox.And(wires, childBoxes())
        | s ->
            failwith ("unrecognized box type " ^ s)
    
    let extractFlowBoxDoc (doc:XmlDocument) :FlowBox =
        //ALTERNATE let rootNode = doc.DocumentElement;
        let rootNode = doc.FirstChild;
        if rootNode = null
        then failwith "empty doc"
        extractBox rootNode
        
    // ----- Build XML tree from data. -----
    
    let rec box2XML (doc: XmlDocument) (fb: FlowBox ) :XmlElement =
        let mutable name = ""
        let wo =
            match fb with
                | Prim w -> name <- PrimStr; w
                | Extd (w, fb) -> name <- ExtdStr; w
                | And (w, fba) -> name <- AndStr; w
        let node = doc.CreateElement(name)
        let sOf (i:NodeIndex) = i.ToString()
        let setAttrib name w = node.SetAttribute(name, sOf w)
        match wo.inputOpt with Some w -> (setAttrib InputStr w) | _ -> ()
        match wo.outputOpt with Some w -> (setAttrib OutputStr w) | _ -> ()
        let appendFb fb = node.AppendChild( (box2XML doc fb) ) |> ignore
        match fb with
            | Prim w -> ()
            | Extd (w, fb) -> appendFb fb
            | And (w, fba) -> Array.iter appendFb fba
        node
    
    let buildFlowBoxDoc (fb: FlowBox) :XmlDocument =
        let doc = new XmlDocument()
        doc.AppendChild (box2XML doc fb) |> ignore
        doc
        

// ---------- Tests ----------
open System.Xml
open Serialization  // The module above.

let ps s = s |> printfn "%A"
let psm s msg = printfn "%A = %s" s msg

let C = Complete
let I w = Input w
let O w = Output w
let T w1 w2 = Thru (w1, w2)
let P w = Prim w
let E w fb = Extd (w, fb)
let A w fba = And (w, fba)
    
let w1 = I 1
let w2 = T 1 2
let w3 = O 2
let fb1 = P w1
let fb2 = E w2 fb1
// Not a very meaningful diagram...
let fb3 = A w3 [| fb1; fb2 |]
psm w1 "w1"
psm w2 "w2"
psm w3 "w3"
psm fb1 "fb1"
psm fb2 "fb2"
psm fb3 "fb3"

// Create XML for flowbox.
let doc = buildFlowBoxDoc fb3
ps doc.OuterXml
// TODO: Missing "<?xml version=\"1.0\" encoding=\"utf-8\" ?>"
doc.Save("flowbox.xml") // Save as file.

// Rebuild flowbox from XML.
let fbIn = extractFlowBoxDoc doc
psm fbIn "fbIn"

// Verify round trip.
psm (fbIn = fb3) "(fbIn = fb3)"

== output ==>

Input 1 = w1
Thru (1,2) = w2
Output 2 = w3
Prim Input 1 = fb1
Extd (Thru (1,2),Prim Input 1) = fb2
And (Output 2,[|Prim Input 1; Extd (Thru (1,2),Prim Input 1)|]) = fb3
"<And input=\"2\"><Primitive output=\"1\" /><Extended input=\"1\" output=\"2\"><Primitive output=\"1\" /></Extended></And>"
And (Output 2,[|Prim Input 1; Extd (Thru (1,2),Prim Input 1)|]) = fbIn
true = (fbIn = fb3)

Here is the generated file, "flowbox.xml" (TODO: Start with standard "<?xml version=\"1.0\" encoding=\"utf-8\" ?>") :

<And input="2">
  <Primitive output="1" />
  <Extended input="1" output="2">
    <Primitive output="1" />
  </Extended>
</And>
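
Regarding the TODO about the missing version line: one way to add it, sketched here under the assumption that a plain XmlDocument is still being used, is XmlDocument.CreateXmlDeclaration. The helper name withDeclaration is made up for this illustration. Once the declaration is the first child of the document, code that reads doc.FirstChild would see the declaration rather than the root element, so the reader should switch to doc.DocumentElement (the ALTERNATE noted in extractFlowBoxDoc).

```fsharp
open System.Xml

/// Put <?xml version="1.0" encoding="utf-8"?> in front of the root element.
let withDeclaration (doc: XmlDocument) : XmlDocument =
    let decl = doc.CreateXmlDeclaration("1.0", "utf-8", null)
    doc.InsertBefore(decl, doc.DocumentElement) |> ignore
    doc

// Small stand-alone document, built the same way box2XML builds its nodes.
let demo = XmlDocument()
let root = demo.CreateElement("And")
root.SetAttribute("input", "2")
demo.AppendChild(root) |> ignore

withDeclaration demo |> ignore
printfn "%s" demo.OuterXml
printfn "%s" demo.DocumentElement.Name   // the root element, not the declaration
```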
By on 3/27/2008 10:35 PM ()