How to solve LL(1) Grammars

Here you will learn how to calculate the first set, follow set and parse table of a grammar.

Already an expert? Start practicing

Vocabulary

  • Nonterminal Symbols: These are symbols which can be replaced by combinations of nonterminal and terminal symbols according to production rules. In this tutorial, these are denoted by the upper case letters A, B, C, and D.
  • Terminal Symbols: These are symbols which cannot be replaced and make up the strings of the language. In this tutorial, these are denoted by the lower case letters w, x, y, and z.
  • Start Symbol: This is a predefined nonterminal symbol from which all strings in the language must be derived from. In other words, if a string cannot be made through a series of productions starting from the Start Symbol, it is not in the language. In the following tutorial, “A” is always going to be the Start Symbol.
  • Nullable Symbols: These are symbols that can generate the empty string ε. Symbol A is nullable if there is a production A -> ε or a production A -> B, for some nullable symbol B.
  • LL(1) Grammars: Subset of grammars that fulfill the condition that parsers only need one nonterminal character of lookahead to parse from a string in the language.
  • First Set of a Symbol S: The set of all terminal symbols that begin the strings derived from S.
  • Follow Set: The set of all terminal symbols which immediately follow all the strings derived from S.
  • Production or Production Rule: A rewrite rule specifying a symbol substitution that can be recursively performed to generate new symbol sequences.
  • Production: A rewrite rule which specifies a symbol substitution to generate new symbol sequences. A nonterminal can have one or more productions. Terminal symbols do not have any productions.

Notation

  • The tutorial may refer to the First Set of A and the Follow Set of A as First(A) and Follow(B) respectively.
  • The symbol ε refers to the empty string, and when used in a production like A -> ε, means that a nonterminal can be removed entirely. By convention, it is not included in first and follow sets.
  • The symbol $ refers to the end of the input. It implicitly follows the start symbol (and possibly other nonterminals). It may appear in follow sets.

References