a problem

Ivan Vankov
Hello there,

I have a problem with Grammatica that I cannot resolve. I am trying to parse some Lisp-like source code with a simple grammar, but the generated parser does not seem to work correctly: it fails to parse syntactically correct structures. I tried older versions of Grammatica and found that the grammar works only with version 1.0. Attached are the grammar file, the test source file, and the output I get when I debug the parser with the --parse option.

I would appreciate any comments you have.


Regards,

Ivan Vankov

_______________________________________________
Grammatica-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/grammatica-users

kb.grammar (1022 bytes)
test.lsp (595 bytes)
output.txt (2K)

Re: a problem

Per Cederberg
Hi,

The problem is an ambiguity in the grammar that Grammatica
fails to detect properly. Consider these productions:

FILLER         = INTEGER
                | '(' INTEGER+ ')'
                | TAG
                | '(' TAG+ ')'
                | REFERENCE
                | '(' REFERENCE+ ')'
                ;
REFERENCE      = IDENT
                | '(' IDENT  (NUMBER | ('.'  SSLOTLABEL) ) ')'
                | '(' '(' IDENT  '.'  SSLOTLABEL ')'  NUMBER ')'
                ;


From the output of --debug you can clearly see that there
is a huge overlap between the last two alternatives in the
FILLER production for any input starting with:

   '(' '(' <IDENT>
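
As a rough illustration of that overlap (a sketch only, not taken from the attached grammar file), the two FILLER alternatives can be expanded by hand into token sequences and compared. The names ALT_REFERENCE, ALT_REFERENCE_LIST and shared_prefix are made up for this example, and SSLOTLABEL is treated as a single symbol, which is an assumption since its definition is not shown above.

```python
from itertools import takewhile

# FILLER -> REFERENCE, using the third REFERENCE alternative:
#   '(' '(' IDENT '.' SSLOTLABEL ')' NUMBER ')'
ALT_REFERENCE = ["(", "(", "IDENT", ".", "SSLOTLABEL", ")", "NUMBER", ")"]

# FILLER -> '(' REFERENCE+ ')', with a single REFERENCE of the form
#   '(' IDENT '.' SSLOTLABEL ')'
ALT_REFERENCE_LIST = ["(", "(", "IDENT", ".", "SSLOTLABEL", ")", ")"]


def shared_prefix(a, b):
    """Return the longest common leading run of symbols in a and b."""
    pairs = takewhile(lambda pair: pair[0] == pair[1], zip(a, b))
    return [left for left, _ in pairs]


prefix = shared_prefix(ALT_REFERENCE, ALT_REFERENCE_LIST)
print(len(prefix), prefix)
```

Under these assumptions the two alternatives agree on their first six symbols and only differ at the seventh (NUMBER versus ")"), so a parser that decides between them with a shorter look-ahead cannot pick the right one.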

I think the solution here is to rewrite the grammar to a
less semantic representation. Lisp is very difficult to
parse at the level you are attempting, as there are
no syntactic distinctions between expressions, statements, etc.
They all use the same () delimiters.
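
One way to picture a "less semantic" approach, sketched here in Python rather than as a Grammatica grammar: first read the input as plain nested lists, then decide what each list means in a separate pass. Everything in this sketch (tokenize, read, is_weighted_slot_ref, and the token pattern) is hypothetical and only illustrates the idea; it is not the rewritten grammar from this thread.

```python
import re

# Assumed token shapes: a parenthesis, or any run of characters
# that is neither whitespace nor a parenthesis.
TOKEN_RE = re.compile(r"[()]|[^\s()]+")


def tokenize(text):
    return TOKEN_RE.findall(text)


def read(tokens, pos=0):
    """Read one atom or list starting at pos; return (node, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    if tok != "(":
        return tok, pos + 1  # atom
    items = []
    pos += 1
    while pos < len(tokens) and tokens[pos] != ")":
        item, pos = read(tokens, pos)
        items.append(item)
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return items, pos + 1  # skip the closing ')'


def is_number(atom):
    try:
        float(atom)
    except ValueError:
        return False
    return True


def is_weighted_slot_ref(node):
    """Second pass: does node look like ((IDENT . SLOT) NUMBER)?"""
    return (
        isinstance(node, list) and len(node) == 2
        and isinstance(node[0], list) and len(node[0]) == 3
        and node[0][1] == "."
        and isinstance(node[1], str) and is_number(node[1])
    )


tree, _ = read(tokenize("(((a . :slot1) 0.25) ((b . :slot1) 0.25))"))
print(tree)
print([is_weighted_slot_ref(node) for node in tree])
```

The reader itself never has to choose between "a list of references" and "a reference with a weight", so the look-ahead conflict does not arise; that decision moves into ordinary code that can inspect the whole list.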

/Per

Ivan Vankov wrote:

> ------------------------------------------------------------------------
>
> D:\projects\nbu\DUAL\Grammatica>java -jar lib\grammatica-1.4.jar kb.grammar --parse test.lsp
>
> Parse tree from test.lsp:
> KB(2001)
>   AGENTDEFINITION(2002)
>     LEFT_PAREN(1011): "(", line: 1, col: 1
>     DEFAGENT(1001): "defagent", line: 1, col: 2
>     IDENT(1004): "left-VAR_object_1-VAR_sit_name", line: 1, col: 13
>     IDENT(1004): "instance-agent", line: 1, col: 47
>     GSLOT(2003)
>       GSLOTLABEL(2008)
>         COLUMN(1002): ":", line: 2, col: 3
>         IDENT(1004): "type", line: 2, col: 4
>       FILLER(2005)
>         LEFT_PAREN(1011): "(", line: 2, col: 15
>         TAG(2007)
>           COLUMN(1002): ":", line: 2, col: 16
>           IDENT(1004): "instance", line: 2, col: 17
>         TAG(2007)
>           COLUMN(1002): ":", line: 2, col: 27
>           IDENT(1004): "object", line: 2, col: 28
>         RIGHT_PAREN(1012): ")", line: 2, col: 34
>     GSLOT(2003)
>       GSLOTLABEL(2008)
>         COLUMN(1002): ":", line: 3, col: 3
>         IDENT(1004): "modality", line: 3, col: 4
>       FILLER(2005)
>         TAG(2007)
>           COLUMN(1002): ":", line: 3, col: 15
>           IDENT(1004): "init", line: 3, col: 16
>     GSLOT(2003)
>       GSLOTLABEL(2008)
>         COLUMN(1002): ":", line: 4, col: 3
>         IDENT(1004): "situation", line: 4, col: 4
>       FILLER(2005)
>         REFERENCE(2006)
>           LEFT_PAREN(1011): "(", line: 4, col: 15
>           IDENT(1004): "sit-VAR_sit_name", line: 4, col: 16
>           NUMBER(1007): "0.2", line: 4, col: 33
>           RIGHT_PAREN(1012): ")", line: 4, col: 36
>     GSLOT(2003)
>       GSLOTLABEL(2008)
>         COLUMN(1002): ":", line: 5, col: 3
>         IDENT(1004): "inst-of", line: 5, col: 4
>       FILLER(2005)
>         REFERENCE(2006)
>           IDENT(1004): "VAR_object_1", line: 5, col: 15
>     GSLOT(2003)
>       GSLOTLABEL(2008)
>         COLUMN(1002): ":", line: 6, col: 3
>         IDENT(1004): "c-coref", line: 6, col: 4
> Error: in test.lsp: line 6:
>     unexpected token ".", expected one of ")", <IDENT>, or "("
>
>   :c-coref    (((VAR_bin_spat_rel-1-VAR_sit_name . :slot1) 0.25)
>                                                  ^
> Error: in test.lsp: line 7:
>     unexpected token "0.25" <NUMBER>, expected ")"
>
>                ((color-of-1-VAR_sit_name . :slot1) 0.25)
>                                                    ^
>
> D:\projects\nbu\DUAL\Grammatica>
>
>
> ------------------------------------------------------------------------
>

Re: a problem

Ivan Vankov
Thank you, Per

I rewrote the grammar and now it works fine with version 1.4.


Ivan


2006/5/14, Per Cederberg <[hidden email]>: