Brainfuck interpreter
In most programming languages, commands can be made of multiple program tokens, where a token is simply string slice that has a particular meaning for the language. For example, in Rust, the function signature pub fn main()
could be split by the lexer into tokens pub
, fn
, main
, (
, and )
. Then, the parser combines tokens into meaningful program instructions.
However, there exists programming languages that are so simple, such as Brainfuck, that each token can be mapped to a single instruction. There are actually 8 single-characters tokens:
/// Each [`Op`] variant is a single character.
#[derive(Debug, Logos)]
enum Op {
/// Increment pointer.
#[token(">")]
IncPointer,
/// Decrement pointer.
#[token("<")]
DecPointer,
/// Increment data at pointer.
#[token("+")]
IncData,
/// Decrement data at pointer.
#[token("-")]
DecData,
/// Output data at pointer.
#[token(".")]
OutData,
/// Input (read) to data at pointer.
#[token(",")]
InpData,
/// Conditionally jump to matching `']'`.
#[token("[")]
CondJumpForward,
/// Conditionally jump to matching `'['`.
#[token("]")]
CondJumpBackward,
}
All other characters must be ignored.
Once the tokens are obtained, a Brainfuck interpreter can be easily created using a Finite-state machine. For the sake of simpliciy, we collected all the tokens into one vector called operations
.
Now, creating an interpreter becomes straightforward1:
let mut i: usize = 0;
// True program execution.
loop {
match operations[i] {
Op::IncPointer => pointer += 1,
Op::DecPointer => pointer -= 1,
Op::IncData => data[pointer] = data[pointer].wrapping_add(1),
Op::DecData => data[pointer] = data[pointer].wrapping_sub(1),
Op::OutData => print_byte(data[pointer]),
Op::InpData => data[pointer] = read_byte(),
Op::CondJumpForward => {
if data[pointer] == 0 {
// Skip until matching end.
i = *pairs.get(&i).unwrap();
}
}
Op::CondJumpBackward => {
if data[pointer] != 0 {
// Go back to matching start.
i = *pairs_reverse.get(&i).unwrap();
}
}
}
i += 1;
if i >= len {
break;
}
}
There is a small trick to make it easy. As it can be seen in the full code, we first perform a check that all beginning loops ('['
) have a matching end (']'
). This way, we can create two maps, pairs
and pairs_reverse
, to easily jump back and forth between them.
Finally, we provide you the full code that you should be able to run with2:
cargo run --example brainfuck examples/hello_word.bf
You first need to clone this repository.
use logos::Logos;
use std::collections::HashMap;
use std::env;
use std::fs;
use std::io::{self, Read};
/// Each [`Op`] variant is a single character.
#[derive(Debug, Logos)]
enum Op {
/// Increment pointer.
#[token(">")]
IncPointer,
/// Decrement pointer.
#[token("<")]
DecPointer,
/// Increment data at pointer.
#[token("+")]
IncData,
/// Decrement data at pointer.
#[token("-")]
DecData,
/// Output data at pointer.
#[token(".")]
OutData,
/// Input (read) to data at pointer.
#[token(",")]
InpData,
/// Conditionally jump to matching `']'`.
#[token("[")]
CondJumpForward,
/// Conditionally jump to matching `'['`.
#[token("]")]
CondJumpBackward,
}
/// Print one byte to the terminal.
#[inline(always)]
fn print_byte(byte: u8) {
print!("{}", byte as char);
}
/// Read one byte from the terminal.
#[inline(always)]
fn read_byte() -> u8 {
let mut input = [0u8; 1];
io::stdin()
.read_exact(&mut input)
.expect("An error occurred while reading byte!");
input[0]
}
/// Execute Brainfuck code from a string slice.
pub fn execute(code: &str) {
let operations: Vec<_> = Op::lexer(code).filter_map(|op| op.ok()).collect();
let mut data = [0u8; 30_000]; // Minimum recommended size
let mut pointer: usize = 0;
let len = operations.len();
// We pre-process matching jump commands, and we create
// a mapping between them.
let mut queue = Vec::new();
let mut pairs = HashMap::new();
let mut pairs_reverse = HashMap::new();
for (i, op) in operations.iter().enumerate() {
match op {
Op::CondJumpForward => queue.push(i),
Op::CondJumpBackward => {
if let Some(start) = queue.pop() {
pairs.insert(start, i);
pairs_reverse.insert(i, start);
} else {
panic!(
"Unexpected conditional backward jump at position {}, does not match any '['",
i
);
}
}
_ => (),
}
}
if !queue.is_empty() {
panic!("Unmatched conditional forward jump at positions {:?}, expecting a closing ']' for each of them", queue);
}
let mut i: usize = 0;
// True program execution.
loop {
match operations[i] {
Op::IncPointer => pointer += 1,
Op::DecPointer => pointer -= 1,
Op::IncData => data[pointer] = data[pointer].wrapping_add(1),
Op::DecData => data[pointer] = data[pointer].wrapping_sub(1),
Op::OutData => print_byte(data[pointer]),
Op::InpData => data[pointer] = read_byte(),
Op::CondJumpForward => {
if data[pointer] == 0 {
// Skip until matching end.
i = *pairs.get(&i).unwrap();
}
}
Op::CondJumpBackward => {
if data[pointer] != 0 {
// Go back to matching start.
i = *pairs_reverse.get(&i).unwrap();
}
}
}
i += 1;
if i >= len {
break;
}
}
}
fn main() {
let src = fs::read_to_string(env::args().nth(1).expect("Expected file argument"))
.expect("Failed to read file");
execute(src.as_str());
}