Sophia Lang Weekly - 01

This week I added support for - in identifiers, reworked the array and object index notation (again, but this time with better reasoning than last week’s), and introduced something I call the known function interface (KFI for short), which exposes Go functions to the sophia runtime. I also have a lot of features and improvements planned, some of which I outline at the end.

Support for ‘-’ in identifiers

I like the possibilities S-expressions bring to a language, such as allowing - in identifiers. Therefore I added support for - in identifiers to the lexer:

LISP
;; mock hash
(fun fnv-1a (_ string)
    (len string))
(println (fnv-1a "Hello World")) ;; 11
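
How a lexer can accept - inside identifiers can be sketched in a few lines of Go. This is a minimal illustration and not the actual sophia lexer: isIdentRune and scanIdent are hypothetical names, and the sketch assumes an identifier must start with a letter or underscore so that a leading - can still be treated as an operator or sign.

```go
package main

import (
	"fmt"
	"unicode"
)

// isIdentRune reports whether r may appear inside an identifier.
// Allowing '-' here is what makes names like fnv-1a possible.
// (Hypothetical helper, not taken from the sophia source.)
func isIdentRune(r rune) bool {
	return unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_' || r == '-'
}

// scanIdent returns the identifier at the start of input, or "" if the
// input does not start with a letter or underscore.
func scanIdent(input string) string {
	for i, r := range input {
		if i == 0 && !(unicode.IsLetter(r) || r == '_') {
			return ""
		}
		if !isIdentRune(r) {
			return input[:i]
		}
	}
	return input
}

func main() {
	fmt.Println(scanIdent("fnv-1a (_ string)")) // fnv-1a
	fmt.Println(scanIdent("-1") == "")          // true: not an identifier
}
```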

Reworked array and object index notation (V2)

Read about my second attempt at getting array and object indexing right, hopefully for the last time :^).

The First Problem

Last week I introduced a pretty nasty bug: the lexer tokenizes (println array.2.0) in a way that is arguably correct, but not what we want. The expression results in the following token stream:

Token type          Raw
TOKEN_LEFT_BRACE    (
TOKEN_IDENT         println
TOKEN_IDENT         array
TOKEN_DOT           .
TOKEN_FLOAT         2.0
TOKEN_RIGHT_BRACE   )

This is absolutely not what we want. Because 2.0 is lexed as a single float, our parser now thinks we want to access the array at position 2, instead of accessing the first element of the array stored at position 2 of the original array.
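
The effect is easy to reproduce with a deliberately naive tokenizer. The following Go sketch is not the sophia lexer; Token and lex are hypothetical, and the only assumption carried over from the post is that a number greedily consumes digits and dots.

```go
package main

import (
	"fmt"
	"unicode"
)

// Token is a hypothetical stand-in for the lexer's token type.
type Token struct {
	Type string
	Raw  string
}

// lex is a naive tokenizer: a number greedily consumes digits and dots,
// which is what turns ".2.0" into TOKEN_DOT followed by TOKEN_FLOAT.
func lex(src string) []Token {
	var out []Token
	rs := []rune(src)
	for i := 0; i < len(rs); {
		r := rs[i]
		switch {
		case r == '(':
			out = append(out, Token{"TOKEN_LEFT_BRACE", "("})
			i++
		case r == ')':
			out = append(out, Token{"TOKEN_RIGHT_BRACE", ")"})
			i++
		case r == '.':
			out = append(out, Token{"TOKEN_DOT", "."})
			i++
		case unicode.IsDigit(r):
			start := i
			for i < len(rs) && (unicode.IsDigit(rs[i]) || rs[i] == '.') {
				i++
			}
			out = append(out, Token{"TOKEN_FLOAT", string(rs[start:i])})
		case unicode.IsLetter(r) || r == '_':
			start := i
			for i < len(rs) && (unicode.IsLetter(rs[i]) || rs[i] == '_') {
				i++
			}
			out = append(out, Token{"TOKEN_IDENT", string(rs[start:i])})
		default:
			i++ // whitespace and anything else this sketch does not know
		}
	}
	return out
}

func main() {
	for _, t := range lex("(println array.2.0)") {
		fmt.Printf("%-18s %s\n", t.Type, t.Raw)
	}
}
```

Running it prints the same six tokens as the table above: the two indices 2 and 0 never show up as separate tokens.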

The Second Problem

The simple solution would be to use a different notation for indexing into arrays. After all, the dot notation works for objects, right?

Sadly it doesn’t - consider the following:

LISP
(let object { name: "xnacly" skill: 0 })
(let keys "name" "skill")
(for (_ i) keys
    ;; how do we dynamically access the keys of the object?
    (println "what am i doing?"))

The catch: we can’t use our previous object index notation to access object fields dynamically, and that’s a feature I really want.

Even if we changed the notation for array indexing, we would still not be able to access objects in a nice way, and we would end up with two notations for a pretty similar operation, as JavaScript does:

JS
let arr = [1, 2, 3, 4];
console.log(arr[0]);
let object = { name: "xnacly", skill: 0 };
console.log(object.name);
// or
console.log(object["name"]);

I do not want two ways to do the same thing, so I thought about how to make indexing as intuitive as possible and came to the conclusion: why not simply discard the object.field notation and only use object["field"] and array[index]? That way the language has a consistent feel for indexing and the lexer issue is solved.

Tip

The cautious reader will have noticed that this syntax change causes some problems for array creation. Don’t worry, I lay out the mitigation in the following chapter.

The solution

I took inspiration from the previously mentioned dynamic JavaScript object index syntax and, of course, took a look at Python, which uses the same syntax but omits the object.field notation, similar to what I want to implement:

PY
obj = {
    "name": "xnacly",
    "skill": 0,
}
print(obj["name"])

So I changed the parser, and now the object and array index notation is awesome and, I’d say, more readable than before:

LISP
(let object {
    workers: #[ ;; new array declaration syntax
        { name: "drone1" efficiency: 0.25 }
        { name: "drone2" efficiency: 0.55 }
    ]
})

;; accessing an object field by a known key:
(println object["workers"])

;; accessing an object field dynamically:
(let field "whatever")
(println object[field]) ;; results in <nil>

;; nested array and object access:
(println object["workers"][1]["name"]) ;; results in "drone2"

I also improved the error messages for indexing again, see 4ed459e.
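
To make the unified notation concrete, here is a rough Go sketch of how a chain such as object["workers"][1]["name"] could be evaluated. It is not the sophia implementation: the index helper is hypothetical, and it assumes that objects are map[string]any, arrays are []any and numbers are float64 at runtime.

```go
package main

import "fmt"

// index applies each key in turn: string keys index objects
// (assumed to be map[string]any), float64 keys index arrays ([]any).
// Hypothetical helper for illustration only.
func index(target any, keys ...any) (any, error) {
	cur := target
	for _, k := range keys {
		switch key := k.(type) {
		case string:
			obj, ok := cur.(map[string]any)
			if !ok {
				return nil, fmt.Errorf("can not index %T with string key %q", cur, key)
			}
			cur = obj[key] // a missing key yields nil
		case float64:
			arr, ok := cur.([]any)
			if !ok {
				return nil, fmt.Errorf("can not index %T with number %v", cur, key)
			}
			i := int(key)
			if i < 0 || i >= len(arr) {
				return nil, fmt.Errorf("index %d out of bounds for array of length %d", i, len(arr))
			}
			cur = arr[i]
		default:
			return nil, fmt.Errorf("unsupported index type %T", k)
		}
	}
	return cur, nil
}

func main() {
	object := map[string]any{
		"workers": []any{
			map[string]any{"name": "drone1", "efficiency": 0.25},
			map[string]any{"name": "drone2", "efficiency": 0.55},
		},
	}
	v, err := index(object, "workers", 1.0, "name")
	fmt.Println(v, err) // drone2 <nil>
	v, err = index(object, "whatever")
	fmt.Println(v, err) // <nil> <nil>
}
```

Because both kinds of access go through the same code path, a mismatch such as indexing an array with a string can be reported in one place.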

Reworked array creation

I reworked array declaration because I need [] for the index syntax introduced above, so I switched from

LISP
(let arr [1 2 3 4])

to prefixing array creation with #:

LISP
(let arr #[1 2 3 4])

Arrays are now treated as primitives and can be included in object definitions:

LISP
(let person {
    name: "xnacly"
    skill: 0.0
    stats: #[
        {
            name: "health"
            value: 0.75
        }
        {
            name: "saturation"
            value: 0.75
        }
    ]
})
(println person["name"] "stats:")
(for (_ stat) person["stats"]
    (println stat["name"] stat["value"]))

The # prefix enables the parser to distinguish between array access and array declaration.

Somewhat of a foreign function interface

pims on Lobsters asked for a very cool feature:

[…]

Not sure if you’re open to feature requests, but exposing host functions, similar to wasm/lua/etc. would be great. One could write most of the logic in Sophia but hook into the extensive go ecosystem when needed.

~ pims, link

As I am always open to suggestions and didn’t even think about exposing functions written in Go into the sophia runtime before, I got to work.

Tip

KFI is a pun on FFI: we know our functions, their signatures and their bodies, and they must be defined in the same binary that embeds the sophia language runtime.

For starters, I rewrote the expr.Call.Eval() method to include a fast path for executing built-ins. Go itself handles argument assignment, stack management, scoping and so on for them, so the interpreter can skip that work:

GO
// core/expr/Call.go
func (c *Call) Eval() any {
	storedFunc, ok := consts.FUNC_TABLE[c.Key]
	if !ok {
		serror.Add(c.Token, "Undefined function", "Function %q not defined", c.Token.Raw)
		serror.Panic()
	}

	def, ok := storedFunc.(*Func)
	if !ok {
		// this branch is hit if a function is not of type *Func, which only
		// happens for built-ins, thus the cast can not fail
		function, _ := storedFunc.(func(token *token.Token, args ...types.Node) any)
		return function(c.Token, c.Params...)
	}
	// [...]
}
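
The same dispatch idea can be tried in isolation. The following is a self-contained sketch under simplifying assumptions, not the sophia code: Node, Num, Func, funcTable and call are hypothetical stand-ins, function names are plain strings instead of allocator keys, and errors are returned instead of going through serror.

```go
package main

import "fmt"

// Node stands in for types.Node in this sketch.
type Node interface{ Eval() any }

// Num is a minimal node that evaluates to a float64.
type Num float64

func (n Num) Eval() any { return float64(n) }

// Func stands in for a user-defined function with an interpreted body.
type Func struct {
	Body func(args []any) any
}

// builtin is the signature of a function written directly in Go.
type builtin = func(name string, args ...Node) any

// funcTable holds both kinds of functions, like FUNC_TABLE in the post.
var funcTable = map[string]any{}

func init() {
	funcTable["add"] = func(name string, args ...Node) any {
		sum := 0.0
		for _, a := range args {
			sum += a.Eval().(float64)
		}
		return sum
	}
	funcTable["double"] = &Func{Body: func(args []any) any {
		return args[0].(float64) * 2
	}}
}

// call looks the function up and takes the fast path for built-ins.
func call(name string, params ...Node) (any, error) {
	stored, ok := funcTable[name]
	if !ok {
		return nil, fmt.Errorf("function %q not defined", name)
	}
	switch f := stored.(type) {
	case builtin:
		// fast path: hand the unevaluated nodes straight to Go
		return f(name, params...), nil
	case *Func:
		// slow path: the interpreter evaluates arguments itself
		args := make([]any, len(params))
		for i, p := range params {
			args[i] = p.Eval()
		}
		return f.Body(args), nil
	default:
		return nil, fmt.Errorf("function %q has unsupported type %T", name, stored)
	}
}

func main() {
	fmt.Println(call("add", Num(1), Num(2))) // 3 <nil>
	fmt.Println(call("double", Num(21)))     // 42 <nil>
}
```

A type switch is used here instead of the two-step type assertion so that an unexpected entry in the table produces an error rather than a nil function call.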

Then I reused my allocator for faster map keys and created the first built-in, replacing the put keyword and the expr.Put structure:

GO
// builtin provides functions that are built into the sophia language but are
// written in pure go, they may interface with the sophia lang via AST
// manipulation and by accepting AST nodes and returning values or nodes.
//
// See docs/Embedding.md for more information.
package builtin

import (
	"os"
	"sophia/core/alloc"
	"sophia/core/consts"
	"sophia/core/serror"
	"sophia/core/shared"
	"sophia/core/token"
	"sophia/core/types"
	"strings"
)

var sharedPrintBuffer = &strings.Builder{}

func init() {
	// [...]
	consts.FUNC_TABLE[alloc.NewFunc("println")] = func(tok *token.Token, args ...types.Node) any {
		sharedPrintBuffer.Reset()
		shared.FormatHelper(sharedPrintBuffer, args, ' ')
		sharedPrintBuffer.WriteRune('\n')
		os.Stdout.WriteString(sharedPrintBuffer.String())
		return nil
	}
}

Usage Examples

Now some examples taken from docs/Embedding.

Linking strings.Split

GO
func init() {
	// [...]
	consts.FUNC_TABLE[alloc.NewFunc("strings-split")] = func(tok *token.Token, args ...types.Node) any {
		if len(args) != 2 {
			serror.Add(tok, "Argument error", "Expected exactly 2 arguments for strings-split built-in")
			serror.Panic()
		}
		v := args[0].Eval()
		str, ok := v.(string)
		if !ok {
			serror.Add(tok, "Error", "Can't split target of type %T, use a string", v)
			serror.Panic()
		}

		v = args[1].Eval()
		sep, ok := v.(string)
		if !ok {
			serror.Add(tok, "Error", "Can't split string with anything other than a string (%T)", v)
			serror.Panic()
		}

		out := strings.Split(str, sep)

		// the sophia lang runtime only treats []any (elements with
		// erased types) as an array, so convert the []string
		r := make([]any, len(out))
		for i, e := range out {
			r[i] = e
		}

		return r
	}
}

This maps the strings.Split function from the Go standard library to the strings-split sophia function. All functions defined with the KFI have access to the token of the call and to all of its arguments, for instance:

LISP
(strings-split "Hello World" " ")
;; tok: strings-split
;; args: "Hello World", " "

The tok parameter points to strings-split; args contains zero or more arguments to the call, here it’s ["Hello World", " "].

typeof

We can do whatever Go and the sophia lang type system allow. For example, you can print an expression’s type without evaluating it:

GO
consts.FUNC_TABLE[alloc.NewFunc("typeof")] = func(tok *token.Token, n ...types.Node) any {
	if len(n) != 1 {
		serror.Add(tok, "Argument error", "Expected exactly 1 argument for typeof built-in")
		serror.Panic()
	}
	return fmt.Sprintf("%T", n[0])
}

And call this function from sophia:

TEXT
$ cat test.phia; echo "------"; sophia test.phia
(println (typeof #[1 "test" test 25.0]))
(println (typeof true))
(println (typeof "test"))
(println (typeof 12))
(println (typeof { key: "value" }))
------
*expr.Array
*expr.Boolean
*expr.String
*expr.Float
*expr.Object
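
The output above shows AST node types rather than runtime value types, because %T is applied to the node itself. The difference can be seen in a small standalone Go sketch; Node, Float, String, nodeType and valueType are hypothetical stand-ins and not part of the sophia code base.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for types.Node.
type Node interface{ Eval() any }

// Float and String are hypothetical stand-ins for expr.Float and expr.String.
type Float struct{ Value float64 }

func (f *Float) Eval() any { return f.Value }

type String struct{ Value string }

func (s *String) Eval() any { return s.Value }

// nodeType formats the AST node itself, so Eval is never called.
func nodeType(n Node) string { return fmt.Sprintf("%T", n) }

// valueType evaluates the node first and formats the resulting Go value.
func valueType(n Node) string { return fmt.Sprintf("%T", n.Eval()) }

func main() {
	nodes := []Node{&Float{Value: 12}, &String{Value: "test"}}
	for _, n := range nodes {
		fmt.Println(nodeType(n), valueType(n))
	}
}
```

Here nodeType yields a pointer type qualified with the package the node lives in (main in this sketch, expr in the post), while valueType yields float64 and string.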

Consequences

  • calling built-in functions is less expensive than calling sophia functions, simply because the Go runtime manages the call stack, scope, etc. => high performance for frequently used operations, very nice
  • removal of the expr.Put structure

Removing the JavaScript target

I removed the JavaScript target because I grew tired of maintaining a secondary backend while not using it for anything. Thus I stripped the Node.CompileJs(b *strings.Builder) function from the Node interface and removed the function from all expressions.

In the same breath I cleaned up the configuration and CLI flags: I removed the -ast and -tokens flags and merged them into the -dbg flag. I also removed the -target flag, simply because it’s not used anymore. All of these were also removed from the core.Config structure.

Planned features

Regarding Modules

Do not worry, I haven’t forgotten about last week’s planned feature. It’s a big feature and I’m still experimenting with how I want to implement it and design its syntax.

Embedding the sophia programming language

I plan to enable embedding the sophia programming language runtime into Go applications with a single call, and to make it configurable so that it can link functions written in Go via the new KFI feature. This will look something like:

GO
package main

import (
	"sophia"
	"sophia/core/serror"
	"sophia/core/token"
	"sophia/core/types"
)

func fib(n float64) float64 {
	bLast := 0.0
	last := 1.0
	for i := 0.0; i < n-1; i++ {
		t := bLast + last
		bLast = last
		last = t
	}
	return last
}

func main() {
	sophia.Embed(sophia.Configuration{
		Kfi: map[string]func(*token.Token, ...types.Node) any{
			"fib": func(tok *token.Token, args ...types.Node) any {
				if len(args) != 1 {
					serror.Add(tok, "Argument error", "Expected exactly 1 argument")
					serror.Panic()
				}
				ev := args[0].Eval()
				in, ok := ev.(float64)
				if !ok {
					serror.Add(args[0].GetToken(), "Type error", "Expected float64 got %T", ev)
					serror.Panic()
				}
				return fib(in)
			},
		},
	})
}