Page 1 of 1

Python grammar: EBNF to Flex & Bison

Posted: Wed Jul 15, 2009 2:31 am
by narke
Hello,

I'm trying to write a python interpreter with Flex & Bison.
There is the python grammar available here: http://docs.python.org/dev/py3k/reference/grammar.html.

I take as an example the following specification in EBNF notation from the python grammar:

Code: Select all

file_input: (NEWLINE | stmt)* ENDMARKER
How can I translate this in Flex & Bison?

In a bison file I wrote:

Code: Select all

file_input:
    NEWLINE
  | stmt
  | ENDMARKER
  ;
But this is not pretty exact, as the ENDMARKER is mandatory at the end and here I can't force it.

Do you know how to translate EBNF notation to Flex & Bison?

Thanks in advance!

Re: Python grammar: EBNF to Flex & Bison

Posted: Wed Jul 15, 2009 3:15 am
by Solar
I have absolutely zero experience with writing for Flex & Bison, but I've read one or two standards documents, so I dare a guess. Would this work?

Code: Select all

# file_input is file_input_statements plus ENDMARKER
file_input:
    file_input_statements ENDMARKER

# file_input_statements (plural) is one or more file_input_statement (singular)
file_input_statements:
    file_input_statement | file_input_statements

# file_input_statement is a statement or a newline
file_input_statement:
    statement | NEWLINE
Syntax is probably off, and naming isn't really intuitive, but I hope I got the idea across. (Credits to ISO 9899:1999...)

Re: Python grammar: EBNF to Flex & Bison

Posted: Wed Jul 15, 2009 4:50 am
by stephenj
This isn't the best way to do this, but I'm in a hurry to go to work.

Code: Select all

file_input
   : NEWLINE file_input
   | stmt file_input
   | ENDMARKER
   ;
I know books on compilers suck, but you may want to pick one up.

Re: Python grammar: EBNF to Flex & Bison

Posted: Wed Jul 15, 2009 10:58 am
by narke
As I understand I must find solutions case by case.
This would be a little more difficult than I imaganed but I will try.

Re: Python grammar: EBNF to Flex & Bison

Posted: Wed Jul 15, 2009 12:05 pm
by JamesM
stephenj's solution brings everything into one nonterminal, but the most "honest" translation would probably be

Code: Select all

file_input: file_input_optional ENDMARKER
          ;

file_input_optional: /* Intentionally left blank, for lambda-transition */
                   | NEWLINE file_input_optional
                   | stmt file_input_optional
                   ;
Introducing another nonterminal for the optional Kleene-starred part.