This week I added support for - in identifiers, reworked the array and object
index notation (again, but this time with even better reasoning than last
week's), and introduced something I call the known function interface (KFI for
short) for exposing Go functions to the sophia runtime. I also have a lot of
features and improvements planned, some of which I will outline at the end.
Support for ‘-’ in identifiers
I like the possibilities S-expressions bring to language expression, such as
allowing - in identifiers. Therefore I added support for - in identifiers to
the lexer:
;; mock hash
(fun fnv-1a (_ string)
    (len string))
(println (fnv-1a "Hello World")) ;; 11
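On the lexer side this boils down to which characters may continue an identifier. The following is a minimal sketch and not the actual sophia lexer: the helper names isIdentStart, isIdentPart and scanIdent are hypothetical, and I assume an identifier must still start with a letter or _ so that a leading - keeps its other meanings.

```go
package main

import "fmt"

// isIdentStart reports whether c may begin an identifier (assumption:
// letters and '_' only, so "-1" is never lexed as an identifier).
func isIdentStart(c byte) bool {
	return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// isIdentPart reports whether c may continue an identifier; unlike in most
// C-like languages this includes '-'.
func isIdentPart(c byte) bool {
	return isIdentStart(c) || c == '-' || (c >= '0' && c <= '9')
}

// scanIdent returns the identifier starting at input[pos] and the position
// right after it. It returns "" and pos if no identifier starts there.
func scanIdent(input string, pos int) (string, int) {
	if pos >= len(input) || !isIdentStart(input[pos]) {
		return "", pos
	}
	end := pos + 1
	for end < len(input) && isIdentPart(input[end]) {
		end++
	}
	return input[pos:end], end
}

func main() {
	ident, next := scanIdent("(fun fnv-1a (_ string)", 5)
	fmt.Println(ident, next) // fnv-1a 11
}
```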
Reworked array and object index notation (V2)
This is my second attempt at getting array and object indexing right, hopefully the last one :^).
The First Problem
Last week I introduced a pretty nasty bug, caused by the lexer not being able
to correctly tokenize (println array.2.0) (the result may be correct, but not
in the way we want it to be). The expression results in the following token
stream:
Tokentype | Raw |
---|---|
TOKEN_LEFT_BRACE | ( |
TOKEN_IDENT | println |
TOKEN_IDENT | array |
TOKEN_DOT | . |
TOKEN_FLOAT | 2.0 |
TOKEN_RIGHT_BRACE | ) |
This is absolutely not what we want: our parser now thinks we want to access
the array at position 2.0, instead of accessing the first element (index 0) of
the element at position 2 of the original array.
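To make the cause concrete, here is a small sketch of a number scanner. It is not the sophia lexer; scanNumber and isDigit are hypothetical names, and I assume number literals are scanned greedily, which is the usual approach and matches the token stream above.

```go
package main

import "fmt"

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// scanNumber greedily consumes digits starting at input[pos] and, if a '.'
// followed by a digit comes next, the fractional part as well. It returns
// the raw literal and whether it was lexed as a float.
func scanNumber(input string, pos int) (raw string, isFloat bool) {
	end := pos
	for end < len(input) && isDigit(input[end]) {
		end++
	}
	if end+1 < len(input) && input[end] == '.' && isDigit(input[end+1]) {
		isFloat = true
		end++ // skip the '.'
		for end < len(input) && isDigit(input[end]) {
			end++
		}
	}
	return input[pos:end], isFloat
}

func main() {
	// position 6 is the first character after "array."
	raw, isFloat := scanNumber("array.2.0)", 6)
	fmt.Println(raw, isFloat) // 2.0 true
}
```

Because the scanner has no idea it is in the middle of an index chain, 2.0 is swallowed as one float instead of the two indices 2 and 0.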
The Second Problem
The simple solution would be to use a different notation for indexing into arrays. I mean, it works for objects, right?
Sadly it doesn’t. Consider the following:
(let object { name: "xnacly" skill: 0 })
(let keys "name" "skill")
(for (_ i) keys
    ;; how do we dynamically access the keys of the object?
    (println "what am i doing?"))
The catch: we can’t use our previous object index notation for accessing object fields dynamically, and that’s a feature I really want.
Even if we were to change the notation for array indexing, we would still not be able to access objects in a nice way, and we would have two notations for a pretty similar operation, like JavaScript does:
let arr = [1, 2, 3, 4];
console.log(arr[0]);
let object = { name: "xnacly", skill: 0 };
console.log(object.name);
// or
console.log(object["name"]);
I do not want two ways to do the same thing, thus I thought about ways to make
indexing as intuitive as possible and came to the conclusion: why not simply
discard the object.field notation and only use object["field"] and
array[index]? That way the language has a consistent feel for indexing and we
have solved our lexer issues.
Tip
The cautious reader will have noticed that this syntax change causes some problems regarding array creation. Don’t worry, I will lay out the mitigation in the following chapter.
The solution
I took inspiration from the previously mentioned dynamic JavaScript object index
syntax and of course took a look at Python, which uses the same syntax but
omits the object.field notation, similar to what I want to implement:
obj = {
    "name": "xnacly",
    "skill": 0,
}
print(obj["name"])
So I changed the parser, and now object and array index notation is awesome and, I’d say, more readable than before:
(let object {
    workers: #[ ;; new array declaration syntax
        { name: "drone1" efficiency: 0.25 }
        { name: "drone2" efficiency: 0.55 }
    ]
})

;; accessing an object field by a known key:
(println object["workers"])

;; accessing an object field dynamically:
(let field "whatever")
(println object[field]) ;; results in <nil>

;; nested array and object access:
(println object["workers"][1]["name"]) ;; results in "drone2"
I also again improved the error messages for indexing, see 4ed459e.
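With a single notation, the runtime needs only one index operation that branches on the type of the target. Below is a rough sketch of that idea, not the actual sophia implementation: the function name index is hypothetical, and I assume objects are represented as map[string]any, arrays as []any and numbers as float64.

```go
package main

import "fmt"

// index resolves target[key] for the two indexable kinds. A missing object
// key yields nil, mirroring the <nil> result in the example above.
func index(target, key any) (any, error) {
	switch t := target.(type) {
	case map[string]any:
		k, ok := key.(string)
		if !ok {
			return nil, fmt.Errorf("object index must be a string, got %T", key)
		}
		return t[k], nil
	case []any:
		f, ok := key.(float64)
		if !ok {
			return nil, fmt.Errorf("array index must be a number, got %T", key)
		}
		i := int(f)
		if float64(i) != f || i < 0 || i >= len(t) {
			return nil, fmt.Errorf("array index %v invalid for length %d", f, len(t))
		}
		return t[i], nil
	default:
		return nil, fmt.Errorf("can not index into %T", target)
	}
}

func main() {
	object := map[string]any{
		"workers": []any{
			map[string]any{"name": "drone1", "efficiency": 0.25},
			map[string]any{"name": "drone2", "efficiency": 0.55},
		},
	}
	// equivalent of object["workers"][1]["name"]
	var cur any = object
	for _, key := range []any{"workers", 1.0, "name"} {
		next, err := index(cur, key)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		cur = next
	}
	fmt.Println(cur) // drone2
}
```

A chain of indices is then just repeated application of the same operation, which is what keeps nested access readable.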
Reworked array creation
I reworked array declaration because I need the [] for the previously
introduced syntax feature, thus I switched from
(let arr [1 2 3 4])
to prefixing array creation with #:
(let arr #[1 2 3 4])
Arrays are now treated as primitives and can be included in object definitions:
(let person {
    name: "xnacly"
    skill: 0.0
    stats: #[
        {
            name: "health"
            value: 0.75
        }
        {
            name: "saturation"
            value: 0.75
        }
    ]
})
(println person["name"] "stats:")
(for (_ stat) person["stats"]
    (println stat["name"] stat["value"]))
Prefixing array creation with # enables the parser to distinguish between array access and array declaration.
Somewhat of a foreign function interface
pims on lobster asked for a very cool feature:
[…]
Not sure if you’re open to feature requests, but exposing host functions, similar to wasm/lua/etc. would be great. One could write most of the logic in Sophia but hook into the extensive go ecosystem when needed.
~ pims, link
As I am always open to suggestions and didn’t even think about exposing functions written in Go into the sophia runtime before, I got to work.
Tip
KFI is a pun on FFI: we know our functions, their signatures and their bodies, and they must be defined in the same binary the sophia language runtime is embedded in.
For starters I rewrote the expr.Call.Eval() method to include a fast path for
executing built-ins, simply because Go manages argument assignment, stack
management, scoping, etc., so we can omit that work:
// core/expr/Call.go
func (c *Call) Eval() any {
	storedFunc, ok := consts.FUNC_TABLE[c.Key]
	if !ok {
		serror.Add(c.Token, "Undefined function", "Function %q not defined", c.Token.Raw)
		serror.Panic()
	}

	def, ok := storedFunc.(*Func)
	if !ok {
		// this branch is hit if a function is not of type *Func which only
		// happens for built ins, thus the cast can not fail
		function, _ := storedFunc.(func(token *token.Token, args ...types.Node) any)
		return function(c.Token, c.Params...)
	}
	// [...]
}
Then I reused my allocator for faster map keys and created the first built-in,
replacing the put keyword and the expr.Put structure:
// builtin provides functions that are built into the sophia language but are
// written in pure go, they may interface with the sophia lang via AST
// manipulation and by accepting AST nodes and returning values or nodes.
//
// See docs/Embedding.md for more information.
package builtin

import (
	"os"
	"sophia/core/alloc"
	"sophia/core/consts"
	"sophia/core/serror"
	"sophia/core/shared"
	"sophia/core/token"
	"sophia/core/types"
	"strings"
)

var sharedPrintBuffer = &strings.Builder{}

func init() {
	// [...]
	consts.FUNC_TABLE[alloc.NewFunc("println")] = func(tok *token.Token, args ...types.Node) any {
		sharedPrintBuffer.Reset()
		shared.FormatHelper(sharedPrintBuffer, args, ' ')
		sharedPrintBuffer.WriteRune('\n')
		os.Stdout.WriteString(sharedPrintBuffer.String())
		return nil
	}
}
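The two snippets above depend on the rest of the interpreter, so here is a stripped-down, runnable sketch of the same idea: one table holds both kinds of functions and a type assertion picks the path. Everything in it (Node, Num, Builtin, Func, funcTable, call) is a simplified stand-in I made up for illustration; it uses string keys instead of the allocator and leaves out the token parameter.

```go
package main

import "fmt"

// Node is a stand-in for the AST node interface; only Eval matters here.
type Node interface{ Eval() any }

// Num is a hypothetical number literal node.
type Num float64

func (n Num) Eval() any { return float64(n) }

// Builtin has the shape of a Go function exposed to the runtime.
type Builtin func(args ...Node) any

// Func stands in for a function defined in the scripting language itself.
type Func struct{ Body Node }

// funcTable stores both *Func and Builtin values under one key space.
var funcTable = map[string]any{}

func init() {
	funcTable["sum"] = Builtin(func(args ...Node) any {
		total := 0.0
		for _, a := range args {
			if f, ok := a.Eval().(float64); ok {
				total += f
			}
		}
		return total
	})
	funcTable["answer"] = &Func{Body: Num(42)}
}

// call looks the function up and dispatches on its stored type.
func call(name string, args ...Node) (any, error) {
	stored, ok := funcTable[name]
	if !ok {
		return nil, fmt.Errorf("function %q not defined", name)
	}
	if def, ok := stored.(*Func); ok {
		// slow path: a real interpreter would bind arguments and
		// manage scope here before evaluating the body
		return def.Body.Eval(), nil
	}
	builtin, ok := stored.(Builtin)
	if !ok {
		return nil, fmt.Errorf("function %q has unexpected type %T", name, stored)
	}
	// fast path: Go handles arguments, stack and scope for us
	return builtin(args...), nil
}

func main() {
	fmt.Println(call("sum", Num(1), Num(2))) // 3 <nil>
	fmt.Println(call("answer"))              // 42 <nil>
}
```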
Usage Examples
Now some examples taken from docs/Embedding.
Linking strings.Split
func init() {
	// [...]
	consts.FUNC_TABLE[alloc.NewFunc("strings-split")] = func(tok *token.Token, args ...types.Node) any {
		if len(args) != 2 {
			serror.Add(tok, "Argument error", "Expected exactly 2 argument for strings-split built-in")
			serror.Panic()
		}
		v := args[0].Eval()
		str, ok := v.(string)
		if !ok {
			serror.Add(tok, "Error", "Can't split target of type %T, use a string", v)
			serror.Panic()
		}

		v = args[1].Eval()
		sep, ok := v.(string)
		if !ok {
			serror.Add(tok, "Error", "Can't split string with anything other than a string (%T)", v)
			serror.Panic()
		}

		out := strings.Split(str, sep)

		// sophia lang runtime only sees arrays containing
		// elements whose types were erased as an array.
		r := make([]any, len(out))
		for i, e := range out {
			r[i] = e
		}

		return r
	}
}
This maps the strings.Split function from the Go standard library to the
strings-split sophia function. All functions defined with the KFI have access
to the call's token and all of its arguments, for instance:
(strings-split "Hello World" " ")
;; token: strings-split
;; n: "Hello World", " "
The token parameter points to strings-split. The variadic parameter (args in
the example above, n here) contains 0 or more arguments to the call, here it's
["Hello World", " "].
typeof
We can do whatever Go and the sophia lang type system allow. For example, you can print an expression's type without evaluating it:
consts.FUNC_TABLE[alloc.NewFunc("typeof")] = func(tok *token.Token, n ...types.Node) any {
	if len(n) != 1 {
		serror.Add(tok, "Argument error", "Expected exactly 1 argument for typeof built-in")
		serror.Panic()
	}
	return fmt.Sprintf("%T", n[0])
}
And call this function from sophia:
$ cat test.phia; echo "------"; sophia test.phia
(println (typeof #[1 "test" test 25.0]))
(println (typeof true))
(println (typeof "test"))
(println (typeof 12))
(println (typeof { key: "value" }))
------
*expr.Array
*expr.Boolean
*expr.String
*expr.Float
*expr.Object
Consequences
- calling built-in functions is less expensive than calling sophia functions, simply because the Go runtime manages the call stack, scope, etc. => high performance for frequently used operations, very nice
- removal of the expr.Put structure
Removing the JavaScript target
I removed the JavaScript target because I grew tired of maintaining a secondary
backend while not using it for anything. Thus I stripped the
Node.CompileJs(b *strings.Builder) function from the Node interface and removed
the function from all expressions.
In the same breath I cleaned up the configuration and CLI flags: I removed the
-ast and -tokens flags and merged them under the -dbg flag. I also removed the
-target flag, simply because it's not used anymore. All of these were also
removed from the core.Config structure.
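As an illustration of the cleanup, a single debug switch could be wired up with the standard flag package roughly like this. This is a sketch under assumptions: the Config struct, its Debug field, the parseFlags helper and the help text are hypothetical and not the actual core.Config.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// Config is a hypothetical, trimmed-down stand-in for core.Config.
type Config struct {
	Debug bool // one switch instead of separate -ast and -tokens flags
}

// parseFlags parses args into a Config and returns the remaining
// positional arguments, for example the script path.
func parseFlags(args []string) (Config, []string, error) {
	var cfg Config
	fs := flag.NewFlagSet("sophia", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on unknown flags
	fs.BoolVar(&cfg.Debug, "dbg", false, "enable debug output (assumed to cover tokens and AST)")
	if err := fs.Parse(args); err != nil {
		return cfg, nil, err
	}
	return cfg, fs.Args(), nil
}

func main() {
	cfg, rest, err := parseFlags([]string{"-dbg", "test.phia"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Debug, rest) // true [test.phia]
}
```

With this setup the removed flags are simply rejected as unknown, since they are no longer registered.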
Planned features
Regarding Modules
Do not worry, I haven’t forgotten about last week’s planned feature. It’s a big feature and I’m still experimenting with the way I want to implement it and design its syntax.
Embedding the sophia programming language
I plan to enable embedding the sophia programming language runtime into Go applications with a single call, and to make it configurable to include links to functions written in Go via the new KFI feature. This will look something like:
package main

import (
	"sophia"
	"sophia/core/serror"
	"sophia/core/token"
	"sophia/core/types"
)

func fib(n float64) float64 {
	bLast := 0.0
	last := 1.0
	for i := 0.0; i < n-1; i++ {
		t := bLast + last
		bLast = last
		last = t
	}
	return last
}

func main() {
	sophia.Embed(sophia.Configuration{
		Kfi: map[string]func(*token.Token, ...types.Node) any{
			"fib": func(tok *token.Token, args ...types.Node) any {
				if len(args) != 1 {
					serror.Add(tok, "Argument error", "Expected exactly 1 argument")
					serror.Panic()
				}
				ev := args[0].Eval()
				in, ok := ev.(float64)
				if !ok {
					serror.Add(args[0].GetToken(), "Type error", "Expected float64 got %T", ev)
					serror.Panic()
				}
				return fib(in)
			},
		},
	})
}