Empty file parsing

Empty file parsing

Leszek Doniec
Hello,

Is there any way to parse a file that contains only comments? I use Grammatica 1.4.

Thanks for help,
Leszek Doniec

_______________________________________________
Grammatica-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/grammatica-users

Re: Empty file parsing

Per Cederberg
I assume that the comment tokens are marked with %ignore%.
In that case it is not possible with Grammatica, as every
production must match at least one token.

You could work around this, though, by ignoring parse errors
on some occasions. Perhaps the easiest way would be to
subclass the Tokenizer to set a flag if it returns a token.
In case of a parse error, you can then check the flag to see
if the cause was lack of input. Care must be taken to handle
tokenizer exceptions properly, though.
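
The flag idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: TokenSource, FlaggingTokenSource, SketchParseException and EmptyInputWorkaround are hypothetical stand-ins written for this example, not Grammatica classes, and the toy parser simply accepts one or more "x" tokens. With the real library the wrapper would be a Tokenizer subclass instead.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a tokenizer; NOT the Grammatica Tokenizer API. */
interface TokenSource {
    /** Returns the next non-ignored token, or null at end of input. */
    String next();
}

/** Wrapper that records whether the wrapped source ever returned a token. */
class FlaggingTokenSource implements TokenSource {
    private final TokenSource inner;
    private boolean sawToken = false;

    FlaggingTokenSource(TokenSource inner) {
        this.inner = inner;
    }

    @Override
    public String next() {
        String token = inner.next();
        if (token != null) {
            sawToken = true;
        }
        return token;
    }

    boolean sawToken() {
        return sawToken;
    }
}

/** Hypothetical parse failure, unchecked to keep the sketch short. */
class SketchParseException extends RuntimeException {
    SketchParseException(String message) {
        super(message);
    }
}

final class EmptyInputWorkaround {
    /** Stand-in parser: accepts one or more "x" tokens, rejects anything else. */
    static int parse(TokenSource tokens) {
        int count = 0;
        for (String t = tokens.next(); t != null; t = tokens.next()) {
            if (!t.equals("x")) {
                throw new SketchParseException("unexpected token: " + t);
            }
            count++;
        }
        if (count == 0) {
            throw new SketchParseException("unexpected end of input");
        }
        return count;
    }

    /** Returns the token count, or 0 if the input held no tokens at all. */
    static int parseAllowingEmpty(List<String> nonIgnoredTokens) {
        Iterator<String> it = nonIgnoredTokens.iterator();
        FlaggingTokenSource source =
            new FlaggingTokenSource(() -> it.hasNext() ? it.next() : null);
        try {
            return parse(source);
        } catch (SketchParseException e) {
            if (!source.sawToken()) {
                return 0; // no token was ever produced: comments-only or empty file
            }
            throw e;      // tokens were seen, so this is a real syntax error
        }
    }
}
```

In this sketch a tokenizer failure would need its own exception type so that it is never mistaken for "no input"; otherwise an invalid character before the first token would also leave the flag unset.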

I know this isn't an ideal solution. So I've registered a
feature request on the issue:

https://savannah.nongnu.org/bugs/index.php?func=detailitem&item_id=16783

Cheers,

/Per


Re: Empty file parsing

Yakov Keselman
Perhaps, for this specific case (parsing a file with
some comments only, not an empty file), it would be
easier not to ignore them (remove the %ignore% tag)?
Since comments will never propagate up the tree, the
overhead is likely to be minimal.

= Yakov



http://www.kinderspirit.org/yakovkeselman/

====

Nothing is so firmly believed as that which we least know.
-- Michel de Montaigne






Re: Empty file parsing

Yakov Keselman
Upon further thought, I realized that my suggestion
was not useful -- it would require inserting a COMMENT
token in too many places in the grammar.

= Yakov



Feature request (more meaningful javadoc)

Yakov Keselman
Hello Per,

One feature I'd like to request (unless it has been
requested before) is more meaningful javadoc to be
generated for production methods. Here is a hopefully
explanatory example.

Consider the following production (in your syntax):

PrimeMeridian = PRIMEM_K LEFT_PAREN Name "," Longitude RIGHT_PAREN ;

PRIMEM_K, LEFT_PAREN, ",", and RIGHT_PAREN are all
terminals (atoms). Name and Longitude are
non-terminals. The currently generated stub for
exitPrimeMeridian is shown below.

    /**
     * Called when exiting a parse tree node.
     *
     * @param node           the node being exited
     *
     * @return the node to add to the parse tree, or
     *         null if no parse tree should be created
     *
     * @throws ParseException if the node analysis
     *         discovered errors
     */
    protected Node exitPrimeMeridian(Production node)
        throws ParseException {

        return node;
    }

I think that it will be more useful to have something
like this (note the production included in javadoc and
numbering of its children):

    /**
     * Called when exiting a parse tree node.
     *
     * @param node           the node being exited
     *
     * PrimeMeridian = PRIMEM_K [0]  LEFT_PAREN [1]  Name [2]
     *                 "," [3]  Longitude [4]  RIGHT_PAREN [5] ;
     */
    protected Node exitPrimeMeridian(Production node)
        throws ParseException {

        return node;
    }
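
To illustrate why the numbering in the proposed javadoc would help, here is a rough sketch of index-based child access in an analyzer method. SketchNode and IndexedChildrenSketch are hypothetical stand-ins made up for this example (they are not the Grammatica Node class); only the indices 2 and 4 are taken from the numbered production above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a parse tree node; NOT the Grammatica Node class. */
class SketchNode {
    private final String name;
    private final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name) {
        this.name = name;
    }

    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }

    SketchNode getChildAt(int index) {
        return children.get(index);
    }

    String getName() {
        return name;
    }
}

final class IndexedChildrenSketch {
    // Indices copied from the numbered production:
    // PRIMEM_K [0] LEFT_PAREN [1] Name [2] "," [3] Longitude [4] RIGHT_PAREN [5]
    static final int NAME = 2;
    static final int LONGITUDE = 4;

    /** Picks the two non-terminal children by position. */
    static String summarize(SketchNode primeMeridian) {
        return primeMeridian.getChildAt(NAME).getName()
            + "/" + primeMeridian.getChildAt(LONGITUDE).getName();
    }
}
```

Without the numbering in the javadoc, the constants NAME and LONGITUDE have to be worked out by counting symbols in the grammar file by hand.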


Perhaps even more useful can be something of this
kind:

    protected Node exitPrimeMeridian( Node primem_k,
        Node left_paren, Node name, Node comma,
        Node longitude, Node right_paren ){ ... }


Would the first or the second suggestion be difficult
to implement? Has anyone else requested or thought
about this?

Thanks,

= Yakov



Re: Feature request (more meaningful javadoc)

Per Cederberg
This is a very good suggestion. A variant of it has been
discussed before, although I don't quite remember if it
was on this list or in some private conversation I had.
It is also somewhat implemented in the parser generator
SableCC.

Basically the idea is to generate a type-safe analyzer:

   protected Node exitPrimeMeridian(PrimeMeridianProduction node)
       throws ParseException {

       return node;
   }

Where the class PrimeMeridianProduction contains:

   public Token getPrimemK();
   public Token getLeftParen();
   public NameProduction getName();
   public Token getComma();
   public LongitudeProduction getLongitude();
   public Token getRightParen();

This way, changes in the grammar are more likely to
cause compilation errors in the Analyzer classes, which
is much better than the current run-time errors. It
also lends itself much better to documentation.
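
A minimal sketch of what such a type-safe node might look like is given below. Nothing like this exists in Grammatica, as noted in this thread, so every class here (SketchToken, NameProduction, LongitudeProduction, PrimeMeridianProduction, TypedAnalyzerSketch) is a hypothetical stand-in; only three of the six getters are shown to keep it short.

```java
/** Hypothetical token type holding the matched text. */
final class SketchToken {
    private final String image;

    SketchToken(String image) {
        this.image = image;
    }

    String getImage() {
        return image;
    }
}

/** Hypothetical typed node for the Name production. */
final class NameProduction {
    private final SketchToken text;

    NameProduction(SketchToken text) {
        this.text = text;
    }

    SketchToken getText() {
        return text;
    }
}

/** Hypothetical typed node for the Longitude production. */
final class LongitudeProduction {
    private final SketchToken number;

    LongitudeProduction(SketchToken number) {
        this.number = number;
    }

    SketchToken getNumber() {
        return number;
    }
}

/** Typed node: one getter per element of the production. */
final class PrimeMeridianProduction {
    private final SketchToken primemK;
    private final NameProduction name;
    private final LongitudeProduction longitude;

    PrimeMeridianProduction(SketchToken primemK,
                            NameProduction name,
                            LongitudeProduction longitude) {
        this.primemK = primemK;
        this.name = name;
        this.longitude = longitude;
    }

    public SketchToken getPrimemK() {
        return primemK;
    }

    public NameProduction getName() {
        return name;
    }

    public LongitudeProduction getLongitude() {
        return longitude;
    }
}

final class TypedAnalyzerSketch {
    /** Analyzer-style code that uses typed getters instead of child indices. */
    static String describe(PrimeMeridianProduction node) {
        return node.getName().getText().getImage()
            + " @ " + node.getLongitude().getNumber().getImage();
    }
}
```

If Longitude were removed from the production, getLongitude() would disappear from the generated class and describe() would stop compiling, which is the compile-time feedback described above.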

There has been no work yet in this direction in
Grammatica, so the exact details remain to be worked
out. I always wanted to combine the above with a
simplification to the grammar format, allowing only
two types of productions:

   Choice = FirstAlternative
          | SecondAlternative
          ...
          | LastAlternative ;

   Sequence = First Second ... Last ;

This model fits perfectly with the type-safe nodes,
since all the FirstAlternative, SecondAlternative,
etc. nodes will then inherit from ChoiceProduction,
which is very object oriented. (Not my idea, saw it
in a paper on compiler generators.)
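
The inheritance model described above might be sketched like this. The class names follow the production names in the example grammar with a Production suffix; label(), the two-element Sequence and ChoiceModelSketch are assumptions added purely for illustration.

```java
/** Base type generated for the production: Choice = FirstAlternative | ... ; */
abstract class ChoiceProduction {
    /** Hypothetical method, here only so the alternatives can be told apart. */
    abstract String label();
}

/** One alternative of Choice; it is a ChoiceProduction by inheritance. */
final class FirstAlternativeProduction extends ChoiceProduction {
    @Override
    String label() {
        return "first";
    }
}

/** Another alternative of Choice. */
final class SecondAlternativeProduction extends ChoiceProduction {
    @Override
    String label() {
        return "second";
    }
}

/** Sequence production, assumed here to be: Sequence = Choice Choice ; */
final class SequenceProduction {
    private final ChoiceProduction first;
    private final ChoiceProduction last;

    SequenceProduction(ChoiceProduction first, ChoiceProduction last) {
        this.first = first;
        this.last = last;
    }

    ChoiceProduction getFirst() {
        return first;
    }

    ChoiceProduction getLast() {
        return last;
    }
}

final class ChoiceModelSketch {
    /** Works for any alternative, since each one is a ChoiceProduction. */
    static String summarize(SequenceProduction seq) {
        return seq.getFirst().label() + "," + seq.getLast().label();
    }
}
```

A choice production maps to a base class with one subclass per alternative, and a sequence production maps to a class with one typed getter per element.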

Anyway. These are my thoughts about it. Someone
should probably write it down more formally and
register a future improvement bug about this. Or
even implement some prototype?

Thanks for reading this far.

/Per
