Simple calculator

This page (including the images) was contributed by ynn.

When you implement an interpreter for a domain-specific language (DSL), or any programming language, the process typically involves the following steps:

  1. Lexing: Splitting the input stream (i.e., source code string) into tokens via a lexer.

  2. Parsing: Converting the tokens into an abstract syntax tree (AST) via a parser.

  3. Evaluation: Evaluating the AST to produce the result.

In this example, we implement a simple calculator that evaluates arithmetic expressions such as 1 + 2 * 3 or ((1 + 2) * 3 + 4) * 2 + 4 / 3.

We use logos as the lexer generator and chumsky as the parser generator.

flow chart

1. Try It

Before diving into the implementation details, let's play with it1.

$ cargo run --example calculator '1 + 7 * (3 - 4) / 2'
1

You first need to clone this repository.

Output:

[AST]
Add(
    Int(
        1,
    ),
    Div(
        Mul(
            Int(
                7,
            ),
            Sub(
                Int(
                    3,
                ),
                Int(
                    4,
                ),
            ),
        ),
        Int(
            2,
        ),
    ),
)

[result]
-2

Full Code

use std::env;

use chumsky::prelude::*;
use logos::Logos;

#[derive(Logos, Debug, PartialEq, Eq, Hash, Clone)]
#[logos(skip r"[ \t\n]+")]
#[logos(error = String)]
enum Token {
    #[token("+")]
    Plus,

    #[token("-")]
    Minus,

    #[token("*")]
    Multiply,

    #[token("/")]
    Divide,

    #[token("(")]
    LParen,

    #[token(")")]
    RParen,

    #[regex("[0-9]+", |lex| lex.slice().parse::<isize>().unwrap())]
    Integer(isize),
}

#[derive(Debug)]
enum Expr {
    // Integer literal.
    Int(isize),

    // Unary minus.
    Neg(Box<Expr>),

    // Binary operators.
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
}

impl Expr {
    fn eval(&self) -> isize {
        match self {
            Expr::Int(n) => *n,
            Expr::Neg(rhs) => -rhs.eval(),
            Expr::Add(lhs, rhs) => lhs.eval() + rhs.eval(),
            Expr::Sub(lhs, rhs) => lhs.eval() - rhs.eval(),
            Expr::Mul(lhs, rhs) => lhs.eval() * rhs.eval(),
            Expr::Div(lhs, rhs) => lhs.eval() / rhs.eval(),
        }
    }
}

#[allow(clippy::let_and_return)]
fn parser() -> impl Parser<Token, Expr, Error = Simple<Token>> {
    recursive(|p| {
        let atom = {
            let parenthesized = p
                .clone()
                .delimited_by(just(Token::LParen), just(Token::RParen));

            let integer = select! {
                Token::Integer(n) => Expr::Int(n),
            };

            parenthesized.or(integer)
        };

        let unary = just(Token::Minus)
            .repeated()
            .then(atom)
            .foldr(|_op, rhs| Expr::Neg(Box::new(rhs)));

        let binary_1 = unary
            .clone()
            .then(
                just(Token::Multiply)
                    .or(just(Token::Divide))
                    .then(unary)
                    .repeated(),
            )
            .foldl(|lhs, (op, rhs)| match op {
                Token::Multiply => Expr::Mul(Box::new(lhs), Box::new(rhs)),
                Token::Divide => Expr::Div(Box::new(lhs), Box::new(rhs)),
                _ => unreachable!(),
            });

        let binary_2 = binary_1
            .clone()
            .then(
                just(Token::Plus)
                    .or(just(Token::Minus))
                    .then(binary_1)
                    .repeated(),
            )
            .foldl(|lhs, (op, rhs)| match op {
                Token::Plus => Expr::Add(Box::new(lhs), Box::new(rhs)),
                Token::Minus => Expr::Sub(Box::new(lhs), Box::new(rhs)),
                _ => unreachable!(),
            });

        binary_2
    })
    .then_ignore(end())
}

fn main() {
    //reads the input expression from the command line
    let input = env::args()
        .nth(1)
        .expect("Expected expression argument (e.g. `1 + 7 * (3 - 4) / 5`)");

    //creates a lexer instance from the input
    let lexer = Token::lexer(&input);

    //splits the input into tokens, using the lexer
    let mut tokens = vec![];
    for (token, span) in lexer.spanned() {
        match token {
            Ok(token) => tokens.push(token),
            Err(e) => {
                println!("lexer error at {:?}: {}", span, e);
                return;
            }
        }
    }

    //parses the tokens to construct an AST
    let ast = match parser().parse(tokens) {
        Ok(expr) => {
            println!("[AST]\n{:#?}", expr);
            expr
        }
        Err(e) => {
            println!("parse error: {:#?}", e);
            return;
        }
    };

    //evaluates the AST to get the result
    println!("\n[result]\n{}", ast.eval());
}

2. Lexer

Our calculator supports the following tokens:

  • Integer literals: 0, 1, 15, etc;

  • Unary operator: -;

  • Binary operators: +, -, *, /;

  • Parenthesized expressions: (3 + 5) * 2, ((1 + 2) * 3 + 4) * 2 + 3 / 2, etc.

#[derive(Logos, Debug, PartialEq, Eq, Hash, Clone)]
#[logos(skip r"[ \t\n]+")]
#[logos(error = String)]
enum Token {
    #[token("+")]
    Plus,

    #[token("-")]
    Minus,

    #[token("*")]
    Multiply,

    #[token("/")]
    Divide,

    #[token("(")]
    LParen,

    #[token(")")]
    RParen,

    #[regex("[0-9]+", |lex| lex.slice().parse::<isize>().unwrap())]
    Integer(isize),
}

3. Parser

While it is easy enough to manually implement a parser in this case (e.g., Pratt parsing), let's just use chumsky crate, which is one of the most popular parser generator libraries in Rust.

3.1 AST Definition

First, we define the AST.

#[derive(Debug)]
enum Expr {
    // Integer literal.
    Int(isize),

    // Unary minus.
    Neg(Box<Expr>),

    // Binary operators.
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
}

Note that

  • We name the enum not AST but Expr because an AST is just nested expressions.

  • There is no Parenthesized variant because parentheses only affect the order of operations (i.e., precedence), which is reflected in the AST structure.

  • Box is used as a recursive enum is not allowed in Rust.

3.2 Parser Implementation

Next, we define the parser. The code may look a bit complicated if you are not familiar with parser combinator libraries, but it is actually quite simple. See Chumsky's official tutorial for the details.

fn parser() -> impl Parser<Token, Expr, Error = Simple<Token>> {
    recursive(|p| {
        let atom = {
            let parenthesized = p
                .clone()
                .delimited_by(just(Token::LParen), just(Token::RParen));

            let integer = select! {
                Token::Integer(n) => Expr::Int(n),
            };

            parenthesized.or(integer)
        };

        let unary = just(Token::Minus)
            .repeated()
            .then(atom)
            .foldr(|_op, rhs| Expr::Neg(Box::new(rhs)));

        let binary_1 = unary
            .clone()
            .then(
                just(Token::Multiply)
                    .or(just(Token::Divide))
                    .then(unary)
                    .repeated(),
            )
            .foldl(|lhs, (op, rhs)| match op {
                Token::Multiply => Expr::Mul(Box::new(lhs), Box::new(rhs)),
                Token::Divide => Expr::Div(Box::new(lhs), Box::new(rhs)),
                _ => unreachable!(),
            });

        let binary_2 = binary_1
            .clone()
            .then(
                just(Token::Plus)
                    .or(just(Token::Minus))
                    .then(binary_1)
                    .repeated(),
            )
            .foldl(|lhs, (op, rhs)| match op {
                Token::Plus => Expr::Add(Box::new(lhs), Box::new(rhs)),
                Token::Minus => Expr::Sub(Box::new(lhs), Box::new(rhs)),
                _ => unreachable!(),
            });

        binary_2
    })
    .then_ignore(end())
}

4. Evaluator

Evaluating the AST is straightforward. We just implement it using depth-first search (DFS) such that the mathematical operations are processed in the correct order.

impl Expr {
    fn eval(&self) -> isize {
        match self {
            Expr::Int(n) => *n,
            Expr::Neg(rhs) => -rhs.eval(),
            Expr::Add(lhs, rhs) => lhs.eval() + rhs.eval(),
            Expr::Sub(lhs, rhs) => lhs.eval() - rhs.eval(),
            Expr::Mul(lhs, rhs) => lhs.eval() * rhs.eval(),
            Expr::Div(lhs, rhs) => lhs.eval() / rhs.eval(),
        }
    }
}

Example

Evaluating 1 + 3 * 12 will proceed as below.

how evaluator works

5. main() Function

Finally, we put everything together in the main() function.

fn main() {
    //reads the input expression from the command line
    let input = env::args()
        .nth(1)
        .expect("Expected expression argument (e.g. `1 + 7 * (3 - 4) / 5`)");

    //creates a lexer instance from the input
    let lexer = Token::lexer(&input);

    //splits the input into tokens, using the lexer
    let mut tokens = vec![];
    for (token, span) in lexer.spanned() {
        match token {
            Ok(token) => tokens.push(token),
            Err(e) => {
                println!("lexer error at {:?}: {}", span, e);
                return;
            }
        }
    }

    //parses the tokens to construct an AST
    let ast = match parser().parse(tokens) {
        Ok(expr) => {
            println!("[AST]\n{:#?}", expr);
            expr
        }
        Err(e) => {
            println!("parse error: {:#?}", e);
            return;
        }
    };

    //evaluates the AST to get the result
    println!("\n[result]\n{}", ast.eval());
}

6. Extend the Calculator

Now that you've implemented a basic calculator, try extending its functionality with the following tasks:

  • Handle zero-division gracefully: The current evaluator panics when zero-division occurs. Change the return type of the evaluator from isize to Result<isize, String>, making it possible to return an error message.

  • Add support for the modulo operator (%): Update the lexer, parser, and evaluator to handle expressions like 10 % 3.

  • Add support for built-in functions: Implement built-in functions such as abs(x), pow(x, y) or rand().