This is the third part of the semester-long project. More descriptions of the project can be found in the overview slides and the instructions for Part 1 and Part 2
❗️ NOTE: You have to correctly implement Part 1 first before starting this one. Simply copy your code from Part 1 to BigInt.h
.
In this part, you will write code to support expressions including assignments. We will first convert an expression in infix notation to postfix notation. Next, we will build an expression tree to represent the expression. Then, we evaluate the expression tree and output the result, instead of directly calculating the result from postfix representation as in Part 2 since the structure of expression is much more complicated in Part 3.
Below is an informal description of the syntax and evaluation rules for input expressions. The syntax rules is a subset of the Python grammar.
The simplest expressions are numbers and variables.
456
and 1123
.
BigInt
made from the number.a
and bCD
.
BigInt
associated with it called value. This can be done via assignment (to be introduced shortly). Reassignments override a variable’s value.There are two types of arithmetic expressions: additive and subtractive. They are defined recursively as follows:
expr1 + expr2
, where expr1
and expr2
could be a number, a variable, or another arithmetic expression.
expr1
and expr2
are evaluated in order (see Operator precedence). The evaluation result of the whole additive expression is the sum of the evaluation results of expr1
and expr2
.expr1 - expr2
, where expr1
and expr2
could be a number, a variable, or another arithmetic expression.
expr1
and expr2
are evaluated in order (see Operator precedence). The evaluation result of the whole subtractive expression is the absolute difference of the evaluation results of expr1
and expr2
.We also allow creation and assignment for variables.
var1 = expr2
where var1
is a variable and expr2
is a number, a variable, or an arithmetic expression.
expr2
. The expression associates var1
with the evaluation result.Compound expressions are also defined recursively, as follows:
expr1 expr2;
where expr1
could be another compound expression or empty, and expr2
could be a number, a variable, an arithmetic expression, or an assignment expression.
expr1
and expr2
are evaluated in order. The evaluation result of the compound expression is the evaluation result of expr2
.The following table lists the precedence and associativity of supported operators. Operators are listed top to bottom, in descending precedence.
Precedence | Operator | Associativity |
---|---|---|
1 | a+b , a-b |
Left-to-right |
2 | a=b |
Right-to-left |
Where it is desired to override the precedence and associativity conventions, or even simply to emphasize them, parentheses ()
can be used to indicate an alternative order of operations (or to simply reinforce the default order of operations). For example, 2 + (3 - 4)
forces subtraction to precede addition.
Also note that the ;
in a compound expression is NOT an operator.
a = 456 + (1123 - 1); b = a + 1; b;
is a compound expression.
a = 456 + (1123 - 1)
is the result of 456 + (1123 - 1)
(which is 1578
). It also associates variable a
with this result, i.e., 1578
is the value of variable a
.b = a + 1
is evaluated to a + 1
which is 1579
(using Arithmetic expressions) and associates it with variable b
.b
has value 1579
. Therefore, the compound expression as a whole has evaluation result 1579
.🦄 Challenge: Add support to multiplicative expressions.
As in Part 2, a tokenizer is given in Tokenizer.h
.
Since the input expression now supports variable definition and assignments, we consider more kinds of tokens.
In addition to Number
, LeftParen
, RightParen
, Plus
, Minus
, Asterisk
(for multiplication), End
, and Unexpected
, the tokenizer also recognizes Variable
, Equal
, and Semicolon
. You can find the detailed definition of the Token
class in Token.h
.
Given input a = 456 + (1123 - 1); b = a + 1; b;
, the tokenizer generates a token list:
Variable("a") Equal("=") Number("456") Plus("+") LeftParen("(") Number("1123") Minus("-") Number("1") RightParen(")") Semicolon(";") Variable("b") Equal("=") Variable("a") Plus("+") Number("1") Semicolon(";") Variable("b") Semicolon(";")
where every term between the parentheses is the literal string for each token.
As in Part 2, you can access the underlying literal string of a token using lexeme
. For instance, let t
be the token Variable("a")
, t.lexeme
gives the name of variable "a"
as string. For token Number("1123")
, you can create a BigInt
of 1123
from the number by
BigInt number(t.lexeme);
To check the kind of a token t
, you can use the is
and is_one_of
helper functions defined in Token.h
. Below are some examples:
t.is(Token::Kind::Variable);
// returns true if `t` is a `Variable` token and false otherwise
t.is_one_of(Token::Kind::Plus, Token::Kind::Minus, Token::Kind::Number);
// returns true if `t` is a `Plus`, `Minus`, or `Number` token
As in Part 2, you should complete the InfixToPostfixTransformer::infixToPostfix
function in Transformer.cpp
that transforms an infix token list from Tokenization to its corresponding postfix token stack using the reverse Polish notation.
Recall that the syntax of postfix expression is operand1 operand2 operator
where the two operands are specified first, followed by the operator.
When creating the token stack, you should follow the rules in Part 2, that is, pushing operand1
first, then operand2
, and finally operator
to the stack.
❗️ NOTE: ;
is not an operator. When encountering a semicolon, for example something;
, push something
first and then ;
to the stack.
For the example input a = 456 + (1123 - 1); b = a + 1; b;
, the output postfix token stack is:
top
+----------------+
| Semicolon(";") |
+----------------+
| Variable("b") |
+----------------+
| Semicolon(";") |
+----------------+
| Equal("=") |
+----------------+
| Plus("+") |
+----------------+
| Number("1") |
+----------------+
| Variable("a") |
+----------------+
| Variable("b") |
+----------------+
| Semicolon(";") |
+----------------+
| Equal("=") |
+----------------+
| Plus("+") |
+----------------+
| Minus("-") |
+----------------+
| Number("1") |
+----------------+
| Number("1123") |
+----------------+
| Number("456") |
+----------------+
| Variable("a") |
+----------------+
where Semicolon(";")
is the top of the stack and Variable("a")
is the bottom.
🍦 Tips: If you’ve spent weeks and still couldn’t figure out how to do the transformation, you can
Each expression described in Supported expressions can be represented using an expression tree. The definition of ExprTreeNode
can be found in ExprTreeNode.h
. Each node contains a token
, an expr1
pointer (pointing to the node for the first sub-expression), and an expr2
pointer (pointing to the node for the second sub-expression).
You need to implement the buildExprTree
function in Evaluator.cpp
that builds an expression tree from given postfix token stack. The expression tree for the example input a = 456 + (1123 - 1); b = a + 1; b;
can be found in main.cpp
, where the node that follows ├─
is expr1
and the node that follows └─
is expr2
. To create an ExprTreeNode
given a token t
, you can do
ExprTreeNode *n = new ExprTreeNode(t); // `expr1` and `expr2` will be `nullptr` by default
Then you can access the token
, expr1
and expr2
pointers by n->token
, n->expr1
and n->expr2
.
The expression tree for a number contains only one ExprTreeNode
where expr1
and expr2
pointers are both nullptr
. For example, expression 456
has a tree as follows, where the token is displayed inside the node:
+---------------+
| Number("456") |
+---------------+
The expression tree for a variable has only one ExprTreeNode
. For example, expression a
has a tree as follows:
+---------------+
| Variable("a") |
+---------------+
The expression tree for an arithmetic expression (in the form expr1 operator expr2
) contains at least three ExprTreeNode
s, with the root being the node for the operator and its two child nodes being the root of subtrees for the two operands expr1
and expr2
.
For example, expression 456 - 1123
(456 1123 -
in postfix notation) has a tree as follows:
+-------------+
| Plus("+") |
+-------------+
/ \
expr1 expr2
/ \
+---------------+ +----------------+
| Number("456") | | Number("1123") |
+---------------+ +----------------+
As another example, a nested arithmetic expression a + (456 - 1123)
(a 456 1123 - +
in postfix notation) has a tree as follows:
+-------------+
| Plus("+") |
+-------------+
/ \
expr1 expr2
/ \
+---------------+ +--------------+
| Variable("a") | | Minus("-") |
+---------------+ +--------------+
/ \
expr1 expr2
/ \
+---------------+ +----------------+
| Number("456") | | Number("1123") |
+---------------+ +----------------+
The expression tree for an assignment expression (in the form var1 = expr2
) contains at least three ExprTreeNode
s, with the root being the node for the operator =
and its two child nodes being the root of subtrees for the two operands var1
and expr2
.
For example, expression a = 5
(a 5 =
in postfix notation) has a tree like the following:
+--------------+
| Equal("=") |
+--------------+
/ \
expr1 expr2
/ \
+---------------+ +-------------+
| Variable("a") | | Number("5") |
+---------------+ +-------------+
As another example, the tree for expression a = b + 456
(a b 456 + =
in postfix notation) is as follows:
+--------------+
| Equal("=") |
+--------------+
/ \
expr1 expr2
/ \
+---------------+ +--------------+
| Variable("a") | | Plus("+") |
+---------------+ +--------------+
/ \
expr1 expr2
/ \
+---------------+ +---------------+
| Variable("b") | | Number("456") |
+---------------+ +---------------+
The expression tree for a compound expression (in the form expr1 expr2;
) may contain the followingExprTreeNode
s: the root being the node for ;
, one child node being the root of subtree for expr1
if expr1
is not empty, and the other child node being the root of subtree for expr2
. When expr1
is empty, there is only one child node from the root for expr2
.
For example, the tree for expression a = 456; a + 1;
(a 456 = ; a 1 + ;
in postfix notation) is as follows:
+----------------+
| Semicolon(";") |
+----------------+
/ \
expr1 expr2
/ \
+----------------+ +-----------+
| Semicolon(";") | | Plus("+") |
+----------------+ +-----------+
/ | | \
expr1 expr2 expr1 expr2
/ | | \
nullptr +----------------+ +---------------+ +-------------+
| Equal("=") | | Variable("a") | | Number("1") |
+----------------+ +---------------+ +-------------+
/ \
expr1 expr2
/ \
+---------------+ +---------------+
| Variable("a") | | Number("456") |
+---------------+ +---------------+
The nullptr
means that the expr1
of a compound expression is empty.
The construction of the expression trees should follow the syntax rules in Supported expressions. For each Variable
and Number
token, there should be only one ExprTreeNode
being created. For each Plus
and Minus
token, you should create one node for the token, and at least two more nodes for expr1
and expr2
. For an Equal
token, you should create one node for the token, one node for the var1
, and at least one node for expr2
. For a Semicolon
token, you should create one node for the token, at least one node for expr2
, and at least one node for expr1
if it is not empty.
You need to implement the evaluateExprTree
function in Evaluator.cpp
that evaluates the expression tree rooted at root
by traversing through the tree, and returns the resulting BigInt
.
The evaluation rules are described in Supported expressions.
evaluateExprTree
has a parameter map<string, BigInt> &varTbl
that stores the mapping from a literal variable to its BigInt
value.
You need to update or access varTbl
when you see assignments or uses of variables during the traversal of the tree. To update varTbl
with a given key k
and value v
, do varTbl[k] = v
. To get the value of a variable k
, do varTbl[k]
.n
by n->token
and you can access the literal string of the token by n->token.lexeme
. More use cases can be found in Tokenization - Example above.In summary, the whole process of evaluation of a given expression combines the following 4 steps:
Step | Input (i.e., the parameter) | Output (i.e., the return value) |
---|---|---|
Tokenization of input infix expression (given in Tokenizer.h ) |
String "a = 456 + (1123 - 1); b = a + 1; b;" |
Infix token list in Tokenization - Example |
Infix token list to postfix token stack transformation | Output from above step | Postfix token stack in Infix to postfix - Example |
Create expression tree from postfix token stack | Output from above step | Tree in main.cpp |
Evaluate the expression tree and return the result | Output from above step | 1579 as BigInt |
A sample doctest test case is given in main.cpp
.
To run the given test, simply execute in the terminal: make clean test
.
We encourage your to write your own tests as the sample one might be “too simple”. But they will not be used for grading.
After testing the program locally, you should submit Evaluator.cpp
and Transformer.cpp
, as well as BigInt.h
, to the autograding system.
There will be 10 test cases (30 points in total) to evaluate your implementation. For each passing test case, you will get 3 points if there is no memory error and 2 points if there are memory errors.
We will also manually look at your code to evaluate the quality of your code (50 points), including coding style, comments, indentation, and so on.
✨ Extra Credit: There are 5 test cases (expressions) that contain the multiplication operator, each of which is 1 point.