The way you have the class now, you can only safely get an enumerator once: anyone who uses an enumerator is bound to Dispose() it when they're done, at which point this object is dead. So the class is not very useful as it stands, since its legal usage pattern is unexpected.

If you want an IEnumerable that's reusable (i.e. you can call GetEnumerator on it more than once), your best (only?) bet is to buffer. One simple implementation is to use this class as an implementation detail of a function that takes a TextReader and returns an IEnumerable<Token>. The function body would be like

new Tokenizer(reader) |> Seq.cache

where Seq.cache does the buffering.
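
A minimal sketch of such a wrapper function follows. The original Tokenizer class is not shown in this thread, so readTokens below is a hypothetical stand-in that simply splits lines on whitespace, and Token is assumed to be a plain string; only Seq.cache is the piece being illustrated.

```fsharp
open System
open System.IO

// Assumption: a token is just a string here; the real Token type is not shown.
type Token = string

// Hypothetical stand-in for the Tokenizer class: a one-shot sequence that
// pulls lines from the reader on demand and splits them on whitespace.
let readTokens (reader: TextReader) : seq<Token> =
    Seq.unfold (fun () ->
        match reader.ReadLine() with
        | null -> None
        | line -> Some (line, ())) ()
    |> Seq.collect (fun line ->
        line.Split([| ' '; '\t' |], StringSplitOptions.RemoveEmptyEntries))

// The wrapper: callers get a sequence that can be enumerated more than once,
// because Seq.cache buffers whatever has already been read.
let tokenize (reader: TextReader) : seq<Token> =
    readTokens reader |> Seq.cache

let tokens = tokenize (new StringReader "let x = 1")
printfn "%b" (Seq.isEmpty tokens)  // first enumeration, reads only what it needs
printfn "%A" (Seq.toList tokens)   // second enumeration, replays the buffer first
```

With this shape, a check such as Seq.isEmpty no longer consumes the first token for good, since the second enumeration starts from the buffered prefix.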

By on 5/9/2009 5:25 PM ()

The way you have the class now, you can only safely get an enumerator once ...

This is exactly my intention. I only use the sequence once. At least, I only use it once explicitly. It seems, though, that Seq.is_empty creates another enumerator over the same enumerable. If that is the case, it would explain the behavior I am seeing.

If you want an IEnumerable that's reusable (i.e. you can call GetEnumerator on it more than once), your best (only?) bet is to buffer. One simple implementation is to use this class as an implementation detail of a function that takes a TextReader and returns an IEnumerable<Token>. The function body would be like

new Tokenizer(reader) |> Seq.cache

where Seq.cache does the buffering.

But would Seq.cache read the sequence to the end in order to cache it? That would defeat the purpose of the refactoring I am doing. I had this code working with the entire file read into a string and then working off the string. I wanted to change it to avoid reading the entire file into memory only to discard it after I tokenized it.

By on 5/9/2009 10:18 PM ()

This is exactly my intention. I only use the sequence once. At least, I only use it once explicitly. It seems, though, that Seq.is_empty creates another enumerator over the same enumerable. If that is the case, it would explain the behavior I am seeing.

Yes, of course. Seq.is_empty, like every other function in the universe that operates on seqs, calls GetEnumerator(). That's the only method there is on IEnumerable! If you want to know anything about the contents of an IEnumerable, you must call GetEnumerator and then MoveNext/Current.
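
To see this for yourself, here is a small sketch. CountingSeq is a hypothetical wrapper written only for this illustration; it counts how many times GetEnumerator is called. It uses Seq.isEmpty, the current name of the function called Seq.is_empty in this thread.

```fsharp
open System.Collections
open System.Collections.Generic

// Hypothetical wrapper (not part of any library) that counts GetEnumerator calls.
type CountingSeq<'T>(inner: seq<'T>) =
    let mutable calls = 0
    member this.Calls = calls
    interface IEnumerable<'T> with
        member this.GetEnumerator() : IEnumerator<'T> =
            calls <- calls + 1
            inner.GetEnumerator()
    interface IEnumerable with
        member this.GetEnumerator() : IEnumerator =
            calls <- calls + 1
            (inner :> IEnumerable).GetEnumerator()

let counted = CountingSeq [1; 2; 3]
Seq.isEmpty counted |> ignore   // needs an enumerator to look at the first item
Seq.length counted |> ignore    // needs another enumerator to walk the items
printfn "GetEnumerator calls so far: %d" counted.Calls
```

Each Seq function asks for its own fresh enumerator, which is why a class that only supports a single enumeration breaks as soon as two such functions touch it.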

But would Seq.cache read the sequence to the end in order to cache it? That would defeat the purpose of the refactoring I am doing. I had this code working with the entire file read into a string and then working off the string. I wanted to change it to avoid reading the entire file into memory only to discard it after I tokenized it.

Try it. Seq.cache does exactly what you want.

[link:research.microsoft.com]
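
A quick way to try it: the sketch below counts how many items the underlying sequence has actually produced. The source sequence and the produced counter are made up for this illustration and stand in for the tokenizer.

```fsharp
// Counter of how many items the underlying source has been asked to produce.
let produced = ref 0

// Hypothetical stand-in for an expensive, one-shot source such as a tokenizer.
let source =
    seq {
        for i in 1 .. 1000 do
            produced.Value <- produced.Value + 1
            yield i
    }

let cached = Seq.cache source

// Take a few items: the cache pulls from the source only as far as needed.
let firstThree = cached |> Seq.take 3 |> Seq.toList
printfn "%A, produced so far: %d" firstThree produced.Value

// Enumerate again and go a little further: already-seen items come from the
// buffer, and only the new ones are pulled from the source.
let firstFive = cached |> Seq.take 5 |> Seq.toList
printfn "%A, produced so far: %d" firstFive produced.Value
```

The counter stays far below 1000, so the cache is filled on demand rather than up front. Note that the buffer does keep every item consumed so far for as long as the cached sequence is referenced, so it trades memory for the ability to enumerate again.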

By on 5/9/2009 10:49 PM ()

But would Seq.cache read the sequence to the end in order to cache it? That would defeat the purpose of the refactoring I am doing. I had this code working with the entire file read into a string and then working off the string. I wanted to change it to avoid reading the entire file into memory only to discard it after I tokenized it.

Try it. Seq.cache does exactly what you want.

[link:research.microsoft.com]

Ha, I see. For some reason I expected that the Seq.cache functionality was already built into every seq.

By on 5/10/2009 7:10 AM ()