Using Extras
When deriving the Logos
trait, you may want to carry some internal state
between your tokens. That is where Logos::Extras
comes to the rescue.
Each Lexer
has a public field called extras
that can be accessed and
mutated to keep track of and modify some internal state. By default,
this field is set to ()
, but its type can be modified using the derive
attribute #[logos(extras = <some type>)]
on your enum
declaration.
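For instance, extras can hold a simple counter. The sketch below is only illustrative (the BraceDepth type and the Brace enum are made up for this page and are not part of Logos):

use logos::Logos;

/// Illustrative extras type: tracks how deeply braces are nested.
#[derive(Debug, Default)]
struct BraceDepth(usize);

#[derive(Debug, Logos)]
#[logos(extras = BraceDepth)]
enum Brace {
    // Inline closures are valid callbacks; these ones mutate the extras field.
    #[token("{", |lex| lex.extras.0 += 1)]
    Open,

    #[token("}", |lex| lex.extras.0 = lex.extras.0.saturating_sub(1))]
    Close,
}

During or after lexing, the state is available through the lexer's public field, e.g. lex.extras.0.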
For example, one may want to know the location, both line and column indices, of each token. This is especially useful when one needs to report an erroneous token to the user, in a user-friendly manner.
/// Simple tokens to retrieve words and their location.
#[derive(Debug, Logos)]
#[logos(extras = (usize, usize))]
enum Token {
#[regex(r"\n", newline_callback)]
Newline,
#[regex(r"\w+", word_callback)]
Word((usize, usize)),
}
The above definition declares two token variants: Newline
and Word
.
The former is only used to keep track of the line numbering and is skipped
by returning Skip
from its callback function. The latter is
a word carrying its (line, column)
indices.
To make this easy, the lexer carries the following two extras:
- extras.0: the current line number;
- extras.1: the char index at which the current line starts.
We now have to define the two callback functions:
/// Update the line count and the char index.
fn newline_callback(lex: &mut Lexer<Token>) -> Skip {
lex.extras.0 += 1;
lex.extras.1 = lex.span().end;
Skip
}
/// Compute the line and column position for the current word.
fn word_callback(lex: &mut Lexer<Token>) -> (usize, usize) {
let line = lex.extras.0;
let column = lex.span().start - lex.extras.1;
(line, column)
}
Extras can of course be used for more complicated logic, and there is no limit
to what you can store within the public extras
field.
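For instance, nothing prevents the extras from being a struct that accumulates data across the whole input. The Stats, count_word and Word names below are invented for this sketch and are not part of the example above:

use logos::{Lexer, Logos};
use std::collections::HashMap;

/// Illustrative extras struct: counts how often each word occurs.
#[derive(Debug, Default)]
struct Stats {
    counts: HashMap<String, usize>,
}

/// Record one more occurrence of the matched word.
fn count_word(lex: &mut Lexer<Word>) {
    *lex.extras.counts.entry(lex.slice().to_owned()).or_insert(0) += 1;
}

#[derive(Debug, Logos)]
#[logos(extras = Stats)]
enum Word {
    #[regex(r"\w+", count_word)]
    Ident,
}

Once the input has been consumed, lex.extras.counts holds the frequency of every word encountered.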
Finally, we provide the full code, which you should be able to run with1:
cargo run --example extras Cargo.toml
1 You first need to clone the Logos repository.
use logos::{Lexer, Logos, Skip};
use std::env;
use std::fs;
/// Update the line count and the char index.
fn newline_callback(lex: &mut Lexer<Token>) -> Skip {
lex.extras.0 += 1;
lex.extras.1 = lex.span().end;
Skip
}
/// Compute the line and column position for the current word.
fn word_callback(lex: &mut Lexer<Token>) -> (usize, usize) {
let line = lex.extras.0;
let column = lex.span().start - lex.extras.1;
(line, column)
}
/// Simple tokens to retrieve words and their location.
#[derive(Debug, Logos)]
#[logos(extras = (usize, usize))]
enum Token {
#[regex(r"\n", newline_callback)]
Newline,
#[regex(r"\w+", word_callback)]
Word((usize, usize)),
}
fn main() {
let src = fs::read_to_string(env::args().nth(1).expect("Expected file argument"))
.expect("Failed to read file");
let mut lex = Token::lexer(src.as_str());
while let Some(token) = lex.next() {
if let Ok(Token::Word((line, column))) = token {
println!("Word '{}' found at ({}, {})", lex.slice(), line, column);
}
}
}
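As a closing note, Token::lexer requires the extras type to implement Default. If you want to start from a different initial state, for example counting lines from 1 instead of 0, the lexer can be built with an explicit value instead. A minimal sketch, reusing the Token definition above:

// Seed the extras: line numbering starts at 1, column offset at 0.
let mut lex = Token::lexer_with_extras(src.as_str(), (1, 0));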