Getting token regular expressions

Oliver Bock
I have a token like this:

FFORMAT = <<F(\d+)\.(\d+)>>

In my analyzer I would like to refer to the groups without needing to
copy-and-paste the text of the regular expression, because the
expression might change.  I cannot see a way to get the expression back
from my Tokenizer class.  Is it possible?


  Oliver

P.S. Grammatica rocks: so simple, so un-yacc.


_______________________________________________
Grammatica-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/grammatica-users
Re: Getting token regular expressions

Oliver Gramberg

Hi Oliver,

As far as I know, Grammatica has no way to extract regular expression groups.

The Tokenizer class has TokenPattern GetPattern(int id), and TokenPattern has string GetPattern() (or, for .NET, the property string Pattern). Would this help?

You could also:
- define a production, with a token for each part of your regular expression,
- hack the tokenizer to disable whitespace skipping once the first token is recognized,
- enable whitespace skipping again when done.
I use that hack, but for a different purpose. If needed, I can provide a patch.

It would be nice if Grammatica had a "fragment" feature like one or two other parser generators. I think this is on the list of proposed features.

Regards
Oliver



Oliver Gramberg
ABB AG
Forschungszentrum Deutschland
DECRC/I2
Wallstadter Str. 59
D-68526 Ladenburg

Phone: +49 6203/71-6461
Fax: +49 6203/71-6253
E-mail:
oliver.gramberg@...

In reply to: Oliver Bock, 15.09.2009 08:51, "[Grammatica-users] Getting token regular expressions"
Re: Getting token regular expressions

Oliver Bock
Hi,

The C# runtime libraries (I have not looked at the Java ones) do include
GetPattern(), but only in TokenMatcher-derived classes, not in Tokenizer
itself.  (Tokenizer does have GetPatternDescription().)  I like your
idea of hacking the tokenizer, but my problem is not important enough to
justify that, so I will just copy and paste for now.  Thanks anyway!

  Oliver
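
For readers who take the same copy-and-paste route, here is a minimal sketch of how it might be kept manageable. It uses only java.util.regex and makes no Grammatica calls. The class name FFormatGroups, the constant FFORMAT_REGEX and the method groups() are made up for illustration, and the sketch assumes the token's expression behaves the same way under java.util.regex as it does in Grammatica. The copied expression lives in one constant, and a token image that no longer matches raises an error, so a change to the grammar that is not mirrored here shows up immediately instead of silently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FFormatGroups {
    // Assumption: copied by hand from the FFORMAT token in the grammar file.
    // It must be updated whenever the token definition changes.
    static final String FFORMAT_REGEX = "F(\\d+)\\.(\\d+)";

    private static final Pattern FFORMAT = Pattern.compile(FFORMAT_REGEX);

    // Returns the two numeric groups of an FFORMAT token image such as "F10.3".
    // Throws if the image does not match the copied expression, which
    // signals that the copy has drifted from the grammar.
    static int[] groups(String image) {
        Matcher m = FFORMAT.matcher(image);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                "token image does not match the copied FFORMAT expression: " + image);
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2))
        };
    }

    public static void main(String[] args) {
        int[] g = groups("F10.3");
        System.out.println(g[0] + " " + g[1]);
    }
}
```

In an analyzer, the string passed to groups() would be the image of the matched FFORMAT token. The same idea carries over to C# with System.Text.RegularExpressions.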



