Well, I guess it would be possible to write an HTML
grammar for Grammatica. But the real question is whether
it would be a good fit. The thing with HTML
is that *lots* of the real-world web pages are

So I think that to write a good HTML parser, one really
needs to do it by hand, adding special code
everywhere to recover from common problems and

Also, HTML has a very lenient syntax, allowing new
unknown tags to be used, end tags to be omitted, etc.
So it is very hard to create a correct BNF grammar
that covers all of that and still provides something
more than a pure tokenizer.
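
As a rough illustration of what hand-written recovery can look like, here is
a minimal sketch in Python (not Grammatica code). It uses the standard
library's html.parser.HTMLParser purely as a lenient tokenizer and adds two
recovery rules on top. The names TolerantTreeBuilder, IMPLICIT_CLOSE, VOID
and parse are hypothetical, and the rule tables are deliberately tiny
assumptions, nowhere near what a real-world HTML parser needs.

```python
from html.parser import HTMLParser

# Assumed, deliberately incomplete rule tables for this sketch.
IMPLICIT_CLOSE = {"p", "li"}   # a new <li> closes an <li> that is still open
VOID = {"br", "hr", "img"}     # elements that never take an end tag


class TolerantTreeBuilder(HTMLParser):
    """Builds a tree of (tag, children) tuples from sloppy HTML."""

    def __init__(self):
        super().__init__()
        self.root = ("#root", [])
        self.stack = [self.root]   # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        # Recovery rule 1: the end tag of the previous sibling was omitted.
        if tag in IMPLICIT_CLOSE and self.stack[-1][0] == tag:
            self.stack.pop()
        node = (tag, [])           # unknown tags are accepted like any other
        self.stack[-1][1].append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Recovery rule 2: close everything up to the nearest matching open
        # element; an end tag that matches nothing is silently ignored.
        open_names = [name for name, _ in self.stack[1:]]
        if tag in open_names:
            while self.stack.pop()[0] != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.stack[-1][1].append(text)


def parse(html):
    builder = TolerantTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

Calling parse("<ul><li>one<li>two</ul><foo>x</bar>") gives a tree in which
both list items are siblings under ul, the unknown tag foo is kept, and the
stray </bar> is dropped. Each rule is an ad hoc decision in code rather than
a grammar production, which is the difficulty described above.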

On Thu, 2005-12-15 at 11:33 -0800, John Kleven wrote:
> Hi all,
> Curious if anybody has used Grammatica to create an
> HTML parser?
> Not sure if that's a good fit for Grammatica or not, but
> it seemed like it might be. The existing C# HTML
> parsers out there all seem to leave something (or
> quite a bit) to be desired.
> Any info appreciated!