The state of FParsec's documentation is currently quite unsatisfactory. I
have rough drafts of a tutorial and user guide for the next release of
FParsec, but unfortunately other priorities have taken up almost all my time
in recent months. I'm hoping to find some time in the coming
weeks to finish the next release of FParsec, including the new
documentation, but I can't promise anything.
Some comments regarding your questions:

If I have two functional searcher
thing-a-ma-bobs, once they find their targets, how do I tell the system
to keep looking for their next matches?

Usually you use one of the variants of the `many` combinator to collect all matches.
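
As a minimal sketch of that idea, assuming FParsec is referenced and the same two modules are opened as in the other code in this thread: the parser names `item` and `items` are made up for illustration.

```fsharp
open FParsec.Primitives
open FParsec.CharParsers

// Hypothetical single-match parser: the literal "ab" followed by optional whitespace.
let item : Parser<string, unit> = pstring "ab" .>> spaces

// `many` applies `item` repeatedly until it fails without consuming input
// and returns all results as a list.
let items : Parser<string list, unit> = many item

match run items "ab ab  ab" with
| Success(result, _, _) -> printfn "matched %d items: %A" result.Length result
| Failure(msg, _, _) -> printfn "failed: %s" msg
```

For this input the list should contain three matches. Variants such as `many1` require at least one match instead of accepting an empty list.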

Also, when adding these parsers in, the compiler can't infer the types,
so I get error messages for simple "let" statements -- until I hook them
all up later. That means there can't be any combinatorial parser pieces
lying around unused? Sounds kind of weird.

You can avoid the error messages with a type annotation, for example:

type UserState = unit // assuming you don't need a user state
type Parser<'Result> = Parser<'Result, UserState>
let myParser : Parser<_> = pstring "test" // the ": Parser<_>" annotation prevents an error, if the compiler can't infer the type otherwise

And just what _are_ the arguments to these parsers and combinators,
anyway? Sometimes they stand alone, sometimes you have to give them an
integer, sometimes you have to put parentheses around entire groups,
sometimes you don't. I understand that the answer to my question is in
the type signatures, but these are some seriously butt-ugly type
signatures. And from the tutorial and help pages I'm not always sure
which Parser<_,_> is doing what to whom.

Arguments to combinators can be other parsers or configuration parameters. Have you seen the reference documentation in the doc subfolder of the fparsec source tree at [link:bitbucket.org]? Could you give me an example for a "butt-ugly type signature", preferably together with a suggestion how to improve the signature?
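
To make the distinction between parser arguments and configuration parameters concrete, here is a small sketch. It assumes FParsec is referenced; all the `let` names are invented for illustration, and the type annotations are only there to keep each definition compiling on its own.

```fsharp
open FParsec.Primitives
open FParsec.CharParsers

// `letter` takes no arguments: it already is a parser.
let oneLetter : Parser<char, unit> = letter

// `anyString` takes a configuration parameter (a length) and returns a parser.
let threeChars : Parser<string, unit> = anyString 3

// `manyChars` takes another parser as its argument.
let word : Parser<string, unit> = manyChars letter

// `between` takes three parsers. The parentheses are ordinary F# grouping:
// each parenthesized expression is passed as one argument.
let bracketed : Parser<string, unit> =
    between (pstring "[") (pstring "]") word

match run bracketed "[abc]" with
| Success(result, _, _) -> printfn "success: %s" result
| Failure(msg, _, _) -> printfn "failed: %s" msg
```

Parentheses are needed whenever an argument is itself a function application, such as `pstring "["`. A plain name like `word` can be passed without them.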

I've got the simplest problem
imaginable: I have straight text I want to leave alone. Sprinkled
throughout the text I have "commands" which are delimited by #$# around
the Command Name. Basically it's just MadLibs.
All I need to do is leave the text alone, identify the commands, and run
them to find new text to insert in the stream.

One way to approach this problem is to parse the input as a sequence of tokens. A token can be either a text snippet or a command, so you need a parser for a text snippet and a parser for a command. The result type of both parsers could be a union type like "type Token = TextSnippet of string | Command of string". You can then combine both parsers into a parser for a token (e.g. "let token = textSnippet <|> command") and use something like "many token" to parse the complete document.
If you find FParsec with the current documentation too inaccessible, I'd suggest writing a simple recursive-descent parser in F# by hand. Your parsing problem seems easy enough, and writing a simple parser by hand can be quite instructive.
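
For illustration, a hand-written version might look roughly like the sketch below. It uses only the .NET string API and no FParsec. The function name `tokenize` and the decision to raise an exception on an unterminated command are assumptions made for this example, not something prescribed in this thread.

```fsharp
type Token =
    | TextSnippet of string
    | Command of string

/// Splits `input` into text snippets and commands delimited by `delim`.
let tokenize (delim: string) (input: string) : Token list =
    let find (from: int) = input.IndexOf(delim, from, System.StringComparison.Ordinal)
    // `pos` is the index of the next unread char; `acc` holds tokens in reverse order.
    let rec loop (pos: int) (acc: Token list) : Token list =
        if pos >= input.Length then List.rev acc
        else
            let start = find pos
            if start < 0 then
                // No further command: the rest of the input is plain text.
                List.rev (TextSnippet (input.Substring pos) :: acc)
            else
                // Emit the text in front of the command, if there is any.
                let acc =
                    if start > pos then TextSnippet (input.Substring(pos, start - pos)) :: acc
                    else acc
                let nameStart = start + delim.Length
                let stop = find nameStart
                if stop < 0 then
                    failwithf "unterminated command starting at index %d" start
                else
                    let name = input.Substring(nameStart, stop - nameStart)
                    loop (stop + delim.Length) (Command name :: acc)
    loop 0 []

printfn "%A" (tokenize "#$#" "this is one piece #$#dogs#$# This is another ")
```

On the sample input this should produce a text snippet, the command `dogs`, and a second text snippet.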

By on 1/27/2011 3:24 PM ()

So I'll reply in-depth in a bit, just trying to do this first part.
It helped to bring up both projects as part of my solution and build them with my code. Because the type engine doesn't give the variable names assigned to the types, it's difficult to know what the Parser that you are providing as a parameter is supposed to do.
Right now I am trying to read blank text up to my command token, "#$#". So I figure I will copy this line directly from the tutorial and then change the ";;" to be my token, but it doesn't compile:

let r =  (manyChars (notFollowedBy (pstring ";;") ";;" >>. anyChar) .>> pstring ";;")

I am also getting "invalid IL" errors, which are weird, and one of the reasons I brought all the code together in one spot. (This is being done on Mono)
To me the requirements seem quite simple: read all the text up until the command token. Return that as one chunk. Then read the text inside the command token. Return that as another chunk (of type Command). Repeat and Rinse. I'll walk the list later and run the commands to create the new text and then walk it a last time to reassemble the text. It's just MadLibs.
Here is my code so far.

    type FutonAst = 
        TextSnippet of string
        | Command of string
    let commandToken = "#$#"
    let p_textPiece = 
        let r =  (manyChars (notFollowedBy (pstring ";;") ";;" >>. anyChar) .>> pstring ";;")
        r |>> TextSnippet
    let p_command = 
        let r = between ((pstring "#$#") .>> spaces) ((pstring "#$#") .>> spaces) (FParsec.CharParsers.anyString 65536)
        r |>> Command
    let token = p_textPiece <|> p_command // OR combine being used, but doesn't show up in the html
    
    let parseFile = many (token)
    
    let testTxt (txt:string) = 
        let result = FParsec.CharParsers.run parseFile txt
        match result with
            | Success(a,b,c)->printfn "success: %A" a
            | Failure (a,b,c) -> printfn "failed: %s" a
    
    //testTxt "this is one piece #$#dogs#$# This is another "

I apologize if I seem a little stressed. This is a great library and I want to learn it. While I can write a parser probably very easily, it's better to have all these little parser units done for me, it's easier to combine them in these chains, and I'd much rather use a big hunk of code that does everything I need instead of having to roll it all on my own.
This can't be that difficult. It seems I "almost" have it, but just not quite. The |>> operator has failed several times, and it beats me exactly what it does (except turn a string into a discriminated union type), so it's hard for me to figure out what it might be doing wrong. Perhaps in some of my previous code I was confusing char lists with strings, having one parser return a list of char and another return a string. It seems very difficult to have the system just return a string up until my command token without consuming the token, but that's a very easy request. I must be missing something.

By on 1/27/2011 5:45 PM ()

The not-followed-by parser is missing an L; it must be (notFollowedByL (pstring ";;") ";;").

The command parser can't work, because the "anyString N" parser will try to parse a string with exactly N chars.
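
A small sketch of that behaviour, assuming FParsec is referenced; the helper name `succeeds` is invented for this example:

```fsharp
open FParsec.Primitives
open FParsec.CharParsers

// Hypothetical helper: reports whether a string parser succeeds on an input.
let succeeds (p: Parser<string, unit>) (input: string) : bool =
    match run p input with
    | Success _ -> true
    | Failure _ -> false

printfn "%b" (succeeds (anyString 3) "abcdef") // at least 3 chars are available
printfn "%b" (succeeds (anyString 5) "abc")    // fewer than 5 chars are available
```

The first call should succeed and the second should fail, which is why `anyString 65536` cannot match a short command name.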

Below is a working parser (without whitespace stripping).

open FParsec.Primitives
open FParsec.CharParsers

type FutonAst = TextSnippet of string
              | Command of string

// Upper bound on the number of chars a command name may contain.
let maxLength = System.Int32.MaxValue

// Delimiter that opens and closes a command.
let delim = "#$#"

// One or more chars, stopping in front of the next delimiter.
let textSnippet =
    many1Chars (notFollowedByString delim >>. anyChar)
    |>> TextSnippet

// The opening delimiter, then the chars up to the closing delimiter.
let command =
    skipString delim >>. charsTillString delim maxLength
    |>> Command

let token = textSnippet <|> command
let tokens = many token

let testTxt (txt: string) =
    let result = FParsec.CharParsers.run tokens txt
    match result with
        | Success(result, _, _) -> printfn "success: %A" result
        | Failure(msg, _, _) -> printfn "failed: %s" msg

do testTxt "this is one piece #$#dogs#$# This is another "

This is not the most efficient way to parse this grammar, but it does the job.
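
The later steps described earlier in the thread (walk the token list, run each command, reassemble the text) could then be sketched as below. This is plain F# and repeats the `FutonAst` type so that it stands alone. The `commands` table, its entries, and the choice to put unknown commands back verbatim are assumptions made purely for illustration.

```fsharp
type FutonAst =
    | TextSnippet of string
    | Command of string

// Hypothetical command table: command name -> replacement text.
let commands = Map.ofList [ "dogs", "poodles"; "cats", "tabbies" ]

// Replaces each command with its text; unknown commands are put back verbatim.
let expand (delim: string) (tokens: FutonAst list) : string =
    tokens
    |> List.map (function
        | TextSnippet text -> text
        | Command name ->
            match Map.tryFind name commands with
            | Some replacement -> replacement
            | None -> delim + name + delim)
    |> String.concat ""

let sample = [ TextSnippet "this is one piece "; Command "dogs"; TextSnippet " This is another " ]
printfn "%s" (expand "#$#" sample)
```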

By on 1/28/2011 4:59 AM ()

Thanks Stephan. It always helps to have a working example that does what you want in order to learn. Very much appreciated.
Now, for the bad part. I have code that compiles, but I still get the IL error. I'm going to dump the code and the error here for those of you interested in why Mono is barfing.

Perhaps it's my build, or my environment -- I'm new to Linux and thrashed around quite a bit getting everything set up. I may attempt to build this in Windows and just move the DLLs over, as I guess it might be more of a compile-time IL generation problem than a runtime problem. Or I may just write my own parser. Not a happy spot to be in after spending this much time on it.

Here's the code. This is the fsx version. It compiles just fine in an .fs file, and when I enter it by hand into F# Interactive, it goes in fine. (Note that I type-annotated a couple of the parsers to keep the interpreter from barfing.)
But when I run the last line -- the part that actually parses the string -- it crashes and burns. Dump attached after the code.

#r "/home/daniel/Dropbox/source/Futon/FutonAlpha/FParsecCS/bin/Debug/FParsecCS.dll";;
#r "/home/daniel/Dropbox/source/Futon/FutonAlpha/FParsec/bin/Debug/FParsec.dll";;
open FParsec.Primitives;;
open FParsec.CharParsers;;
open FParsec.OperatorPrecedenceParser;;
    type FutonAst = 
        TextSnippet of string
        | Command of string;;
    let maxLength = System.Int32.MaxValue;;
    let delim = "#$#";;
    let textSnippet:Parser<FutonAst,unit> =
        many1Chars (notFollowedByString "#$#" >>. anyChar)
        |>> TextSnippet;;
    let command:Parser<FutonAst, unit> =
        skipString "#$#" >>. charsTillString "#$#" System.Int32.MaxValue
        |>> Command;;
    let token = textSnippet <|> command;;
    let tokens = many token;;
    let testTxt (txt: string) =
        let result = FParsec.CharParsers.run tokens txt
        match result with
            | Success(result, _, _) -> printfn "success: %A" result
            | Failure(msg, _, _) -> printfn "failed: %s" msg;;
do testTxt "this is one piece #$#dogs#$# This is another ";;

Here's the stack trace

System.InvalidProgramException: Invalid IL code in FParsec.CharParsers/many1Chars2@819-2:Invoke (FParsec.State`1): IL_004d: stloc.s   5
  at FParsec.Primitives+op_BarGreaterGreater@151-2[Microsoft.FSharp.Core.Unit,FSI_0005+FutonAst,System.String].Invoke (FParsec.State`1 state) [0x00000] in <filename unknown>:0 
  at FParsec.Primitives+op_LessBarGreater@240-2[FSI_0005+FutonAst,Microsoft.FSharp.Core.Unit].Invoke (FParsec.State`1 state) [0x00000] in <filename unknown>:0 
  at FParsec.Primitives+many@780-14[Microsoft.FSharp.Core.Unit,FSI_0005+FutonAst].Invoke (FParsec.State`1 state) [0x00000] in <filename unknown>:0 
  at FParsec.CharParsers.applyParser[FSharpList`1,Unit] (Microsoft.FSharp.Core.FSharpFunc`2 parser, FParsec.State`1 state) [0x00000] in <filename unknown>:0 
  at FParsec.CharParsers+runParserOnString@80-2[Microsoft.FSharp.Collections.FSharpList`1[FSI_0005+FutonAst],Microsoft.FSharp.Core.Unit].Invoke (Microsoft.FSharp.Core.FSharpFunc`2 parser, FParsec.State`1 state) [0x00000] in <filename unknown>:0 
  at FParsec.Helper.RunParserOnString[ParserResult`2,FSharpFunc`2,Unit] (System.String str, Int32 index, Int32 length, Microsoft.FSharp.Core.FSharpFunc`2 applyParser, Microsoft.FSharp.Core.FSharpFunc`2 parser, Microsoft.FSharp.Core.Unit userState, System.String streamName) [0x00000] in <filename unknown>:0 
  at FParsec.CharParsers.runParserOnString[FSharpList`1,Unit] (Microsoft.FSharp.Core.FSharpFunc`2 parser, Microsoft.FSharp.Core.Unit ustate, System.String streamName, System.String chars) [0x00000] in <filename unknown>:0 
  at FParsec.CharParsers.run[FSharpList`1] (Microsoft.FSharp.Core.FSharpFunc`2 parser, System.String string) [0x00000] in <filename unknown>:0 
  at FSI_0012.testTxt (System.String txt) [0x00000] in <filename unknown>:0 
  at <StartupCode$FSI_0013>.$FSI_0013.main@ () [0x00000] in <filename unknown>:0 
  at (wrapper managed-to-native) System.Reflection.MonoMethod:InternalInvoke (object,object[],System.Exception&)
  at System.Reflection.MonoMethod.Invoke (System.Object obj, BindingFlags invokeAttr, System.Reflection.Binder binder, System.Object[] parameters, System.Globalization.CultureInfo culture) [0x00000] in <filename unknown>:0 
Stopped due to error
By on 1/28/2011 6:17 AM ()

Update: The code runs fine on Windows. But no matter how I try, I can't get it to run on Ubuntu mono 2.6.7 due to IL errors.

Just thought I could save somebody else out there some time. I don't think you can get there from here.

By on 1/28/2011 7:40 AM ()

Hi,

The "invalid IL" error is annoying. Could you check that you're using a recent version of Mono (at least 2.8) and the latest F# version (November 2010 update, either from Codeplex or from MSDN)?
It would be good to get a small repro of this, so that the bug can be fixed.

Laurent.

By on 1/28/2011 3:24 AM ()

I'm running Mono 2.6.7, which is the only build I can find for Ubuntu Lynx. 2.8.2 looks like it's for VMWare, VirtualPC, MacOS, Solaris -- everything but Ubuntu.

By on 1/28/2011 3:46 AM ()