Embedding Lua in sqleibniz with Rust

I am currently writing an analysis tool for SQL, sqleibniz, which specifically targets the SQLite dialect.

The goal is to perform static analysis on SQL input, including syntax checks and checks that the referenced tables, columns and functions exist. Combining this with an embedded SQLite runtime and the ability to assert conditions in that runtime makes for a really great dev experience for SQL.

Furthermore, I want to show the user high-quality error messages with context and explanations, and offer the ability to mute certain diagnostics.

After completing the static analysis part of the project, I plan on writing an LSP server for SQL, so stay tuned for that.

Lua as scriptable configuration & extending sqleibniz with hooks

I want to get the most out of sqleibniz. For me, this includes making it configurable while providing sensible defaults.

Before the changes laid out in this post, sqleibniz was configured via a leibniz.toml file:

TOML
# this is an example file, consult: https://toml.io/en/ for syntax help and
# src/rules.rs::Config for all available options
[disabled]
    # see sqleibniz --help for all available rules
    rules = [
        # by default, sqleibniz specific errors are disabled:
        "NoContent", # source file is empty
        "NoStatements", # source file contains no statements
        "Unimplemented", # construct is not implemented yet
        "BadSqleibnizInstruction", # source file contains a bad sqleibniz instruction

        # ignoring sqlite specific diagnostics:
        # "UnknownKeyword", # an unknown keyword was encountered
        # "UnterminatedString", # a not closed string was found
        # "UnknownCharacter", # an unknown character was found
        # "InvalidNumericLiteral", # an invalid numeric literal was found
        # "InvalidBlob", # an invalid blob literal was found (either bad hex data or incorrect syntax)
        # "Syntax", # a structure with incorrect syntax was found
        # "Semicolon", # a semicolon is missing
    ]

Tip

A rule refers to a group of diagnostics, as the comments above document, and sqleibniz groups its diagnostics according to these rules. This makes it possible to silence one or several diagnostics at once. As an alternative to the configuration file, sqleibniz accepts the -D (short for disable) CLI flag, followed by the rule to disable (the list of available rules can be found with sqleibniz --help). For instance, disabling all non-SQLite diagnostics:

TEXT
$ sqleibniz \
    -Dno-statements \
    -Dno-content \
    -Dunimplemented \
    -Dbad-sqleibniz-instruction

Sqleibniz prints the rules it currently ignores:

TEXT
$ sqleibniz \
    -Dno-statements \
    -Dno-content \
    -Dunimplemented \
    -Dbad-sqleibniz-instruction
warn: Ignoring the following diagnostics, as specified:
 -> NoStatements
 -> NoContent
 -> Unimplemented
 -> BadSqleibnizInstruction
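The flag values above look like kebab-case spellings of the PascalCase rule names. A minimal sketch of such a mapping, assuming that convention holds for every rule; the function name flag_to_rule_name is hypothetical and not taken from sqleibniz:

```rust
/// Converts a kebab-case flag value such as "no-content" into the
/// PascalCase rule name "NoContent".
/// Assumption: every rule name is the PascalCase form of its flag.
fn flag_to_rule_name(flag: &str) -> String {
    flag.split('-')
        // skip empty segments produced by leading or doubled dashes
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                // uppercase the first character, keep the rest untouched
                Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}
```

With this, flag_to_rule_name("bad-sqleibniz-instruction") yields the name that a string-to-rule conversion could then look up.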

Why switch from TOML to Lua when TOML clearly already covers all the configuration we need? The answer is scripting. I want to enable users to write their own plugins, addons or hooks for whatever use case they might have.

My idea is to provide an array of hooks in Lua, each one with a name, a node type to run the callback for, and a callback that, once run, receives the context of the node. A node is an element of the abstract syntax tree generated by sqleibniz. leibniz.lua contains the configuration from before, extended with two exemplary hooks:

LUA
-- this is an example configuration, consult: https://www.lua.org/manual/5.4/
-- or https://learnxinyminutes.com/docs/lua/ for syntax help and
-- src/rules.rs::Config for all available options
leibniz = {
    disabled_rules = {
        -- ignore sqleibniz specific diagnostics:
        "NoContent",               -- source file is empty
        "NoStatements",            -- source file contains no statements
        "Unimplemented",           -- construct is not implemented yet
        "BadSqleibnizInstruction", -- source file contains a bad sqleibniz instruction

        -- ignore sqlite specific diagnostics:

        -- "UnknownKeyword", -- an unknown keyword was encountered
        -- "UnterminatedString", -- a not closed string was found
        -- "UnknownCharacter", -- an unknown character was found
        -- "InvalidNumericLiteral", -- an invalid numeric literal was found
        -- "InvalidBlob", -- an invalid blob literal was found (either bad hex data or incorrect syntax)
        -- "Syntax", -- a structure with incorrect syntax was found
        -- "Semicolon", -- a semicolon is missing
    },
    -- sqleibniz allows for writing custom rules with lua
    hooks = {
        {
            -- summarises the hooks content
            name = "idents should be lowercase",
            -- instructs sqleibniz which node to execute the `hook` for
            node = "literal",
            -- sqleibniz calls the hook function once it encounters a node name
            -- matching the hook.node content
            --
            -- The `node` argument holds the following fields (the Rust side
            -- exposes the node's content under the key `text`):
            --
            --```
            --    node: {
            --     kind: string,
            --     text: string,
            --     children: node[],
            --    }
            --```
            --
            hook = function(node)
                if node.kind == "ident" then
                    if string.match(node.text, "%u") then
                        -- raising an error passes the diagnostic to sqleibniz,
                        -- thus a pretty message with the name of the hook, the
                        -- node it occurs in and the message passed to error() is
                        -- generated
                        error("All idents should be lowercase")
                    end
                end
            end
        },
        {
            name = "idents shouldn't be longer than 12 characters",
            node = "literal",
            hook = function(node)
                local max_size = 12
                if node.kind == "ident" then
                    -- strictly greater: an ident of exactly 12 characters is fine
                    if string.len(node.text) > max_size then
                        error("idents shouldn't be longer than " .. max_size .. " characters")
                    end
                end
            end
        }
    }
}
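The two checks are small enough to restate in Rust, which makes their boundary cases easy to pin down. This is only a sketch of the rule logic as the hook names describe it, not code from sqleibniz; the function names are hypothetical, and it assumes Lua's %u class matches ASCII uppercase letters (as it does in the default C locale):

```rust
/// Mirrors `string.match(ident, "%u")`: true if any ASCII uppercase
/// letter occurs anywhere in the identifier.
fn has_uppercase(ident: &str) -> bool {
    ident.chars().any(|c| c.is_ascii_uppercase())
}

/// True if `ident` is longer than `max_len` bytes. Like Lua's
/// `string.len`, `str::len` counts bytes, not characters.
fn is_too_long(ident: &str, max_len: usize) -> bool {
    ident.len() > max_len
}
```

An identifier of exactly 12 bytes passes the length check; only 13 bytes and more are reported.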

Since no one uses sqleibniz yet and I have no semantic versioning in place, I do not care about breaking backwards compatibility and just made the change. Small projects ROCK!

Rust to Lua, Lua to Rust

Since the Lua configuration is only useful when it can be accessed inside the Rust application, I created an equivalent data structure containing both the disabled rules and the hooks.

RUST
pub struct Config {
    pub disabled_rules: Vec<Rule>,
    pub hooks: Option<Vec<Hook>>,
}

I use the mlua crate because it has serde support and a lot of examples, even though I no longer use the serde feature.

TOML
mlua = { version = "0.10.2", features = ["lua54", "vendored"] }

The vendored feature means I do not have to care about dependency management for Lua:

vendored: build static Lua(JIT) library from sources during mlua compilation using lua-src or luajit-src crates

mlua uses the FromLua and IntoLua traits to convert Lua values to Rust types and Rust types to Lua values, respectively.

RUST
// from mlua/src/traits.rs

/// Trait for types convertible from [`Value`].
pub trait FromLua: Sized {
    /// Performs the conversion.
    fn from_lua(value: Value, lua: &Lua) -> Result<Self>;
}

mlua implements these traits for all primitive types and some ADTs, while the serde feature enables the serialization and deserialization of structures annotated with serde::Serialize and serde::Deserialize. The only issue I found with this approach is that Lua functions (mlua::Function) cannot be deserialized. Serde does not support these, so I implemented FromLua and IntoLua for my types myself, taking serde out of the equation:

RUST
use mlua::{FromLua, Table};

impl FromLua for Config {
    fn from_lua(value: mlua::Value, lua: &mlua::Lua) -> mlua::Result<Self> {
        let table: Table = lua.unpack(value)?;
        // a missing or invalid `disabled_rules` field falls back to an empty list
        let disabled_rules: Vec<Rule> = table.get("disabled_rules").unwrap_or_default();
        // hooks are optional, any lookup or conversion error results in `None`
        let hooks: Option<Vec<Hook>> = table.get("hooks").ok();
        Ok(Self {
            disabled_rules,
            hooks,
        })
    }
}

Because the Lua context (lua) is passed into the conversion, we can use it to unpack the incoming mlua::Value into the concrete type we want to work with, here a Table, and then read its fields.

Implementing FromLua for Config requires sqleibniz::types::config::Rule and sqleibniz::types::config::Hook to implement FromLua too:

RUST
pub enum Rule {
    NoContent,
    NoStatements,
    Unimplemented,
    UnknownKeyword,
    BadSqleibnizInstruction,
    UnterminatedString,
    UnknownCharacter,
    InvalidNumericLiteral,
    InvalidBlob,
    Syntax,
    Semicolon,
}

impl mlua::FromLua for Rule {
    fn from_lua(value: mlua::Value, lua: &mlua::Lua) -> mlua::Result<Self> {
        let value: String = lua.unpack(value)?;
        Ok(match value.as_str() {
            "NoContent" => Self::NoContent,
            "NoStatements" => Self::NoStatements,
            "Unimplemented" => Self::Unimplemented,
            "UnterminatedString" => Self::UnterminatedString,
            "UnknownCharacter" => Self::UnknownCharacter,
            "InvalidNumericLiteral" => Self::InvalidNumericLiteral,
            "InvalidBlob" => Self::InvalidBlob,
            "Syntax" => Self::Syntax,
            "Semicolon" => Self::Semicolon,
            "BadSqleibnizInstruction" => Self::BadSqleibnizInstruction,
            "UnknownKeyword" => Self::UnknownKeyword,
            _ => {
                return Err(mlua::Error::FromLuaConversionError {
                    from: "string",
                    to: "sqleibniz::rules::Rule".into(),
                    message: Some("Unknown rule name".into()),
                })
            }
        })
    }
}
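The string-to-variant match does not depend on Lua at all, so one option would be to keep it in a std::str::FromStr implementation and let from_lua delegate to it. The following is a sketch of that idea with a shortened Rule enum, not how sqleibniz is structured; the error type and message are assumptions:

```rust
use std::str::FromStr;

/// Shortened stand-in for the Rule enum, enough to show the pattern.
#[derive(Debug, PartialEq)]
enum Rule {
    NoContent,
    NoStatements,
    Syntax,
    Semicolon,
}

impl FromStr for Rule {
    type Err = String;

    /// Rule names are matched case-sensitively, exactly as written in
    /// the configuration file.
    fn from_str(name: &str) -> Result<Self, Self::Err> {
        match name {
            "NoContent" => Ok(Self::NoContent),
            "NoStatements" => Ok(Self::NoStatements),
            "Syntax" => Ok(Self::Syntax),
            "Semicolon" => Ok(Self::Semicolon),
            other => Err(format!("Unknown rule name: {}", other)),
        }
    }
}
```

A from_lua implementation could then call value.parse::<Rule>() and map the error into mlua::Error::FromLuaConversionError, and the same parser would be available to the CLI flag handling.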

The same goes for Hook, but it is a lot shorter:

RUST
use mlua::{Function, Table};

pub struct Hook {
    pub name: String,
    /// node is optional, because omitting it executes the hook for every encountered node
    pub node: Option<String>,
    pub hook: Option<Function>,
}

impl mlua::FromLua for Hook {
    fn from_lua(value: mlua::Value, lua: &mlua::Lua) -> mlua::Result<Self> {
        let table: Table = lua.unpack(value)?;
        // the name is the only required field
        let name = table.get("name")?;
        let node = table.get("node").ok();
        let hook: Option<Function> = table.get("hook").ok();
        Ok(Self { name, node, hook })
    }
}

Calling Lua functions from Rust

Since we now have the ability to convert a lua value to a mlua::Function, we can call said function and provide the context it needs as its argument(s):

RUST
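impl Hook {
    pub fn exec(&self, arg: HookContext) -> mlua::Result<()> {
        if let Some(hook) = &self.hook {
            // the `if let` block has to evaluate to `()`, so the return value
            // of the Lua function is discarded; errors raised via error() in
            // Lua are propagated to the caller
            hook.call(arg)?
        }
        Ok(())
    }
}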
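How the hooks are matched against nodes while walking the tree is not shown in this post. A minimal sketch of one possible dispatch, using only the standard library: plain closures stand in for the Lua functions, the Node type and its name field are assumptions of this sketch, and errors are collected rather than returned:

```rust
/// Stand-in for a syntax tree node. `name` is the node type a hook's
/// `node` field is compared against (an assumption of this sketch).
struct Node {
    name: String,
    text: String,
    children: Vec<Node>,
}

/// Stand-in for a hook: a Rust closure takes the place of the Lua function.
struct Hook {
    name: String,
    /// `None` runs the hook for every encountered node.
    node: Option<String>,
    callback: Box<dyn Fn(&Node) -> Result<(), String>>,
}

/// Runs all applicable hooks on `node`, then recurses into its children.
/// Each failure is recorded as "<hook name>: <message>".
fn run_hooks(hooks: &[Hook], node: &Node, diagnostics: &mut Vec<String>) {
    for hook in hooks {
        let applies = hook
            .node
            .as_deref()
            .map_or(true, |wanted| wanted == node.name.as_str());
        if applies {
            if let Err(message) = (hook.callback)(node) {
                diagnostics.push(format!("{}: {}", hook.name, message));
            }
        }
    }
    for child in &node.children {
        run_hooks(hooks, child, diagnostics);
    }
}
```

Collecting the messages instead of stopping at the first error means one failing hook does not hide the findings of the others.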

The sqleibniz::types::ctx::HookContext represents the context I want every hook to have, specifically:

RUST
// Clone is required, because every hook receives its own copy of the context
#[derive(Clone)]
pub struct HookContext {
    /// [Self::kind] will be the name of the node for most nodes, except nodes
    /// that hold different kinds, such as Literal, which can be an Ident, a
    /// String, a Number, etc.
    pub kind: String,
    /// [Self::content] holds the textual representation of a node's contents if
    /// it is [crates::parser::nodes::Literal].
    pub content: Option<String>,
    pub children: Vec<HookContext>,
}

Since we pass this structure to Hook::exec, and therefore to mlua::Function::call, it has to implement the IntoLua trait:

RUST
use mlua::IntoLua;

impl IntoLua for HookContext {
    fn into_lua(self, lua: &mlua::Lua) -> mlua::Result<mlua::Value> {
        let table = lua.create_table()?;
        table.set("kind", self.kind)?;
        // the content is exposed to lua as `text`, empty if the node has none
        table.set("text", self.content.unwrap_or_default())?;
        // children are converted recursively via this same implementation
        table.set("children", self.children)?;
        lua.pack(table)
    }
}
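To make the shape of the resulting table concrete, the sketch below renders a context as a Lua-style table literal. It is only an illustration with a simplified copy of the struct; the describe function is hypothetical and the string quoting relies on Rust's Debug formatting, which is only close to Lua syntax for simple strings:

```rust
/// Simplified copy of the HookContext fields, for illustration only.
struct HookContext {
    kind: String,
    content: Option<String>,
    children: Vec<HookContext>,
}

/// Renders the table a hook would receive, roughly in Lua table syntax.
/// A missing content becomes the empty string under the key `text`.
fn describe(ctx: &HookContext) -> String {
    let children: Vec<String> = ctx.children.iter().map(describe).collect();
    format!(
        "{{ kind = {:?}, text = {:?}, children = {{{}}} }}",
        ctx.kind,
        ctx.content.as_deref().unwrap_or(""),
        children.join(", ")
    )
}
```

For a single literal node this produces { kind = "literal", text = "this_is_an_ident", children = {} }, so a hook reads the node's content from the text key.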

Putting it all together

Inside the Lua scripting context, we are now able to access all of these fields:

LUA
leibniz = {
    hooks = {
        {
            name = "hook test",
            hook = function(node)
                print(node.kind .. " " .. node.text .. " " .. #node.children)
            end
        }
    }
}

Executing this hook with the HookContext produces the expected result: literal this_is_an_ident 0.

The following shows the full example I use for sqleibniz:

RUST
use std::fs;

fn configuration(lua: &mlua::Lua, file_name: &str) -> Result<Config, String> {
    let conf_str = fs::read_to_string(file_name)
        .map_err(|err| format!("Failed to read configuration file '{}': {}", file_name, err))?;

    // load the lua configuration string, execute it
    lua.load(conf_str)
        .set_name(file_name)
        .exec()
        .map_err(|err| format!("{}: {}", file_name, err))?;
    let globals = lua.globals();

    let raw_conf = globals
        .get::<mlua::Value>("leibniz")
        .map_err(|err| format!("{}: {}", file_name, err))?;
    // if the leibniz table does not exist, mlua does not return an Err, we
    // have to check for this case
    if raw_conf.is_nil() {
        return Err(format!(
            "{}: leibniz table is missing from configuration",
            file_name
        ));
    }

    let conf: Config = lua
        // calls the FromLua implementation for Config
        .unpack(raw_conf)
        .map_err(|err| format!("{}: {}", file_name, err))?;
    Ok(conf)
}

fn main() {
    // `args` holds the parsed cli arguments, parsing is omitted here
    let mut config = Config {
        disabled_rules: vec![],
        hooks: None,
    };

    // lua is defined here because it would be dropped at the end of configuration(), in the
    // future this will probably need to be moved one scope up to live long enough for analysis
    let lua = mlua::Lua::new();
    match configuration(&lua, &args.config) {
        Ok(conf) => config = conf,
        Err(err) => {
            error::warn(&err.to_string());
        }
    }

    if let Some(hooks) = &config.hooks {
        let ctx = types::ctx::HookContext {
            kind: "literal".into(),
            content: Some("this_is_an_ident".into()),
            children: vec![],
        };

        for hook in hooks {
            // errors raised by hooks are ignored for now
            let _ = hook.exec(ctx.clone());
        }
    }
}

If the configuration has invalid syntax or the leibniz table is missing, a warning is emitted and sqleibniz falls back to the default empty configuration:

TEXT
warn: leibniz.lua: syntax error: [string "leibniz.lua"]:6: '}' expected (to close '{' at line 4) near 'bled_rules'
warn: leibniz.lua: leibniz table is missing from configuration
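The fallback itself is independent of Lua and can be sketched with the standard library alone. The Config below is a simplified stand-in (rules and hooks reduced to strings), and config_or_default is a hypothetical helper that records the warning instead of printing it:

```rust
/// Simplified stand-in for the configuration; `Default` yields the empty
/// configuration sqleibniz falls back to.
#[derive(Debug, Default, PartialEq)]
struct Config {
    disabled_rules: Vec<String>,
    hooks: Option<Vec<String>>,
}

/// Returns the loaded configuration, or records a warning and returns
/// the default configuration if loading failed.
fn config_or_default(loaded: Result<Config, String>, warnings: &mut Vec<String>) -> Config {
    match loaded {
        Ok(config) => config,
        Err(err) => {
            warnings.push(format!("warn: {}", err));
            Config::default()
        }
    }
}
```

Deriving Default keeps the fallback value in one place, so adding a field to the configuration does not require touching the error path.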