1. 75
  1. 22

    This is interesting; I’ve been waiting for a new generation of scripting languages that are designed from the ground up with gradual typing and JIT. (Julia fits this bill, but there’s room for more.)

    Random observations:

    • Uses automatic reference counting with cycle detection, neat.
    • Looks like it generates code with libtcc, which is an interesting choice! Wonder how it’s working out for them? I’ve heard good and bad things about tcc in general. It’s simple and executes super quickly, but people have also derided it for bugs and poor maintenance.
    • Tagged unions/sum types, thank you!
    • “Cyber will allow the host to insert gas mileage checks in user scripts. This allows the host to control how long a script can run.” Aw yiss, this makes it way easier to support script-level sandboxing of untrusted code!
    • “In many dynamic languages, functions and fields are looked up in a hash map. In Cyber, they are indexed in an array by a symbol id which is much faster. This is possible because in Cyber there is a distinction between function values and statically declared functions.” Oooh, cunning. So any function call can be direct or essentially a double pointer, instead of always needing to traverse a hashtable. (A rough sketch of the difference follows this list.)
    • Written in Zig, that’s pretty cool! …though I’m now confused about how libtcc comes into it?
    • It appears to duplicate Lua’s mistake of local variables being harder to declare than globals :-|
    • There’s a null-ish value :-| !!! Though I suppose it’s semi-inevitable in a dynamically typed language, and I’m not sure how exactly it interacts with the type declarations.
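
    To make the symbol-id point concrete, here’s a minimal C sketch of the two dispatch strategies. The layouts and names here are hypothetical (not Cyber’s actual implementation); it just shows why resolving a name to a small integer at compile time beats hashing it on every call:

    #include <stdint.h>
    #include <string.h>

    typedef void (*Method)(void);

    /* Hash-map style, as in many dynamic languages: every call hashes the
       name and probes a per-class table. */
    typedef struct { const char *name; Method fn; } Entry;
    typedef struct { Entry table[64]; } HashClass;

    static Method lookup_by_name(const HashClass *c, const char *name) {
        size_t h = 5381;
        for (const char *p = name; *p; p++) h = h * 33u + (unsigned char)*p;
        for (size_t i = 0; i < 64; i++) {                 /* linear probing */
            const Entry *e = &c->table[(h + i) % 64];
            if (e->name && strcmp(e->name, name) == 0) return e->fn;
        }
        return NULL;
    }

    /* Symbol-id style, as the Cyber docs describe it: the compiler resolves
       the name to a small integer once, so a call site is just an array
       index -- essentially a double pointer dereference. */
    typedef struct { Method methods[64]; } SymClass;

    static Method lookup_by_sym(const SymClass *c, uint32_t sym_id) {
        return c->methods[sym_id];
    }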

    On the whole, quite nice for a 0.1 release!

    1. 8

      Written in Zig, that’s pretty cool! …though I’m now confused about how libtcc comes into it?

      libtcc is used for the JIT. It’s just a C library and Zig has no issues interfacing with C libraries.

      1. 3

        The interaction with and support for C is such a nice feature of Zig.

        1. 2

          As far as I could tell from reading the docs, libtcc is specifically used to JIT compile FFI calls and that’s it. The runtime doesn’t appear to use JIT compilation aside from FFI calls at this time, it’s just a super fast VM.

          1. 1

            You’re exactly right. I misremembered what I was reading regarding the FFI implementation:

            Calling into external functions is JIT compiled which makes them fast. To learn how it works, see FFI Docs.

          2. 1

            Yep, as ifreund says, it’s only used for the FFI bindings. There’s still so much that’s not implemented in the language, so JIT is not a focus.

          3. 4

            Thank you for the kind words! One thing to note is that the documentation is ahead of the implementation. So user modules are nonexistent atm, although you can import builtins like “os” and “math”. Gradual typing is also incomplete; only enough was done for the fib example.

            1. 3

              I would not have thought to call Julia a scripting language! It’s closer to Common Lisp: a heavyweight, dynamically typed language.

              1. 2

                That’s because “scripting language” isn’t really a coherent concept; it’s just a byproduct of Ousterhout’s Fallacy.

              2. 2

                It appears to duplicate Lua’s mistake of local variables being harder to declare than globals :-|

                That’s not really the case, IMO. In Lua the main issue is that globals are cross-module (well, there are no globals in recent versions of Lua, but by default the top-level environment is shared). In Cyber, as I understand it, no local “variables” are shared across modules; only “static” variables are. So by default, variables are at most module-local.

                1. 7

                  Eeeeeh. If I am reading this correctly, if you do x = 1 then it declares a variable in the current scope. If you do x = 1 and there is already a variable x in the lexical scope above it, then it assigns 1 to that x. To create a new x in the current scope, if there is a variable of the same name in an enclosing scope, you have to do let x = 1. This seems like a pretty good footgun.

                  Edit: I made an issue: https://github.com/fubark/cyber/issues/11

                  1. 5

                    That was the one thing that really jumped out at me as well, I know I would mess that up all the time.

                    (And imagine the bugs if originally you didn’t have that variable in the outer scope but then you add it, making some previously pure functions impure.)

                    1. 4

                      This is a good point. I need to think about it some more.

                    2. 2

                      Yes, it does that for scopes, but not across modules.

                      I like that behavior in Lua. It is actually useful to finely control the scope a variable lives in, and there is tooling to avoid issues.

                      In this case I think I would actually prefer it if declaring a variable without a keyword were an error. In Lua, allowing a “declaration” without a keyword is useful for DSLs where the code runs in a specific environment; I don’t see a use case like that in Cyber.

                      1. 2

                        Yeah, making reassignment to an existing local visually indistinguishable from introducing a new local is a really common language design mistake for some reason.

                        Different things should look different.

                    3. 1

                      I’ve heard good and bad things about tcc in general. It’s simple and executes super quickly, but people have also derided it for bugs and poor maintenance.

                      I don’t know why but I suddenly feel excited to read about this. Compiler derision is the best derision. I’m hunting through the mailing lists now >:D

                      I had a good experience with tcc. Something didn’t work, I asked for help, and the devs told me what I was doing wrong (not using extern or _dllexport()).

                      https://lists.nongnu.org/archive/html/tinycc-devel/2017-03/msg00000.html

                      At the time GCC was fine with you skipping ‘extern’ and worked everything out for me. Since then it has changed behaviour and I have to use it.

                      1. 1

                        I wrote a Lua module to wrap libtcc and while it was initially fun, over time I found myself using it less and less. Part of that is that TCC does nearly no optimization, so while it is fast at compiling, the resulting code is not. Also, the development model of TCC is chaotic with no real oversight.

                        That said, using TCC just for FFI might be the best approach.
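
                        For anyone curious, this is roughly what the libtcc in-memory workflow looks like. It’s a minimal sketch (error handling omitted, and the relocate call differs slightly between libtcc versions), not what Cyber actually generates, but it shows why it’s attractive for FFI glue: hand it a small C wrapper as a string and get a callable function pointer back.

                        #include <stdio.h>
                        #include <libtcc.h>

                        /* A host function we want scripts to reach through the FFI. */
                        static int host_add(int a, int b) { return a + b; }

                        int main(void) {
                            /* Glue generated at runtime; a real FFI layer would emit
                               marshalling code per bound signature. */
                            const char *glue =
                                "extern int host_add(int, int);\n"
                                "int call_host_add(int a, int b) { return host_add(a, b); }\n";

                            TCCState *s = tcc_new();
                            tcc_set_output_type(s, TCC_OUTPUT_MEMORY);  /* compile straight to memory */
                            tcc_add_symbol(s, "host_add", host_add);    /* expose the host function */
                            tcc_compile_string(s, glue);
                            tcc_relocate(s, TCC_RELOCATE_AUTO);

                            int (*call)(int, int) =
                                (int (*)(int, int))tcc_get_symbol(s, "call_host_add");
                            printf("%d\n", call(20, 22));               /* prints 42 */

                            tcc_delete(s);
                            return 0;
                        }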

                      2. 3

                        Neat! The benchmark that the homepage highlights is a bit of a silly workload, but it’s cool to see how lightweight fibers are. My go-to scripting language is Janet, and Janet fibers are a little heavy. I was curious how much heavier:

                        $ hyperfine --warmup 10 'zig-out/cyber/cyber test/bench/fiber/fiber.cy' 'janet fiber.janet' 'lua test/bench/fiber/fiber.lua' 'luajit test/bench/fiber/fiber.lua'
                        Benchmark 1: zig-out/cyber/cyber test/bench/fiber/fiber.cy
                          Time (mean ± σ):      31.2 ms ±   2.3 ms    [User: 19.3 ms, System: 11.5 ms]
                          Range (min … max):    27.9 ms …  44.0 ms    93 runs
                        
                        Benchmark 2: janet fiber.janet
                          Time (mean ± σ):     195.9 ms ±  12.1 ms    [User: 157.8 ms, System: 36.3 ms]
                          Range (min … max):   184.2 ms … 219.8 ms    14 runs
                        
                        Benchmark 3: lua test/bench/fiber/fiber.lua
                          Time (mean ± σ):     249.9 ms ±  16.2 ms    [User: 185.4 ms, System: 59.2 ms]
                          Range (min … max):   230.2 ms … 277.5 ms    12 runs
                        
                        Benchmark 4: luajit test/bench/fiber/fiber.lua
                          Time (mean ± σ):      86.4 ms ±   6.1 ms    [User: 58.2 ms, System: 26.6 ms]
                          Range (min … max):    78.6 ms … 105.1 ms    34 runs
                        
                        Summary
                          'zig-out/cyber/cyber test/bench/fiber/fiber.cy' ran
                            2.77 ± 0.28 times faster than 'luajit test/bench/fiber/fiber.lua'
                            6.28 ± 0.61 times faster than 'janet fiber.janet'
                            8.01 ± 0.79 times faster than 'lua test/bench/fiber/fiber.lua'
                        

                        That’s running this very direct translation of the benchmark into Janet:

                        (defn main [&]
                          (var count 0)
                        
                          (defn inc []
                            (+= count 1)
                            (yield)
                            (+= count 1))
                        
                          (def fibers @[])
                          (for _ 0 100000
                            (def f (fiber/new inc))
                            (resume f)
                            (array/push fibers f))
                        
                          (each f fibers
                            (resume f))
                        
                          (print count))
                        

                        I used to write a lot of code in a language with ARC and I think it’s a pretty nice memory model, but I admit I leaned heavily on static analysis to help me with retain cycles. It sounds like Cyber’s memory model is… ARC + a GC?

                        By default, references that outlive the first release op are tracked by the VM. The VM then checks for abandoned reference cycles automatically and frees them. The check can also be explicitly triggered in the user’s script. For embedders, the automatic check can be turned off and triggered manually by the VM host.

                        To my amateur ear this sounds like it’s describing a generational garbage collector? I don’t think I understand this. I guess one difference is that destruction still happens eagerly in the case that a value is explicitly released, instead of waiting for the next GC cycle? But that seems like a small difference. Would like to hear more about this.

                        Very interested to hear more about the gas mileage thing too; that’s the first I’ve heard of this as a first-class concept in a language (prior art I can look at?). My only real experience with embedded languages is Janet, and to interrupt the VM you have to spawn a separate OS thread to pre-empt it, which is very annoying normally and just impossible (as far as I can figure out) if you’re running it in WebAssembly.

                        1. 3

                          I haven’t implemented the GC aspect of it yet. It’s different in the sense that it won’t run on a separate thread, but it will have to build a graph and detect cycles. When does this happen? I think it would perform this action once in a while after a release op, if it detects that it needs more memory. How frequently this happens can also be manually controlled by the user script or runtime. Also, providing weak refs should help lessen the load if you know that there will be a cycle somewhere.
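
                          To illustrate that paragraph (nothing here is implemented; it’s just the generic “RC plus deferred cycle detection” shape, with the actual cycle scan stubbed out and all names made up):

                          #include <stdbool.h>
                          #include <stddef.h>

                          /* Hypothetical object header: a plain refcount plus a flag marking
                             it as a possible cycle root. */
                          typedef struct Obj {
                              size_t rc;
                              bool buffered;
                              struct Obj *child;          /* outgoing references (simplified) */
                          } Obj;

                          static Obj *suspects[1024];     /* objects that might keep a cycle alive */
                          static size_t nsuspects, bytes_live;
                          static size_t gc_threshold = 1u << 20;  /* tunable by script or host */

                          static void free_obj(Obj *o) { (void)o; /* destructor + free (omitted) */ }

                          static void collect_cycles(void) {
                              /* Build the graph from `suspects`, detect and free unreachable
                                 cycles (omitted). Weak refs would simply not count as edges. */
                              nsuspects = 0;
                          }

                          /* The release op: destruction stays eager when the count hits zero;
                             otherwise remember the object and run the cycle check once in a
                             while, when memory pressure builds up. */
                          static void release(Obj *o) {
                              if (--o->rc == 0) {
                                  free_obj(o);
                              } else if (!o->buffered && nsuspects < 1024) {
                                  o->buffered = true;
                                  suspects[nsuspects++] = o;
                                  if (bytes_live > gc_threshold)
                                      collect_cycles();
                              }
                          }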

                          As for the mileage check: I think it will just be interrupt ops placed in function calls and at the beginning of loops. You specify a threshold. It wouldn’t be counting the number of instructions, just hops from one interrupt instruction to the next.

                          1. 2

                            Each “interrupt” operation should be able to know at compile time how many instructions will occur before the next one, more or less. I guess that comes down to knowing the size of the basic block you’re in. When you enter a basic block you know that BB is X instructions long and by definition contains no jumps until the end. So the start of every basic block just checks “am I out of fuel? If not, subtract X from the remaining fuel”. I thiiiiink that should let you account for fuel both pretty efficiently and pretty accurately.
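
                            For example, something like this toy interpreter loop (made-up bytecode, purely illustrative): each basic block starts with a check op carrying the block’s statically known length.

                            #include <stdint.h>
                            #include <stdio.h>

                            /* Toy bytecode: CHECK_FUEL carries the instruction count of the
                               basic block it starts; the other ops are placeholders. */
                            enum { OP_CHECK_FUEL, OP_ADD, OP_JUMP, OP_HALT };

                            typedef struct { uint8_t op; int32_t arg; } Instr;

                            /* Returns 0 on a normal halt, -1 if the script ran out of fuel. */
                            static int run(const Instr *code, int64_t fuel) {
                                size_t pc = 0;
                                for (;;) {
                                    Instr in = code[pc++];
                                    switch (in.op) {
                                    case OP_CHECK_FUEL:          /* emitted once per basic block */
                                        fuel -= in.arg;          /* arg = block length, known at compile time */
                                        if (fuel < 0) return -1; /* out of gas: interrupt the script */
                                        break;
                                    case OP_ADD:  /* ...do work... */ break;
                                    case OP_JUMP: pc = (size_t)in.arg; break;
                                    case OP_HALT: return 0;
                                    }
                                }
                            }

                            int main(void) {
                                /* An infinite loop: a 2-instruction block, then a jump back to its check. */
                                const Instr loop[] = { { OP_CHECK_FUEL, 2 }, { OP_ADD, 0 }, { OP_JUMP, 0 } };
                                printf("%d\n", run(loop, 1000));  /* prints -1 once the fuel is spent */
                                return 0;
                            }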

                            1. 1

                              I think you might be right. To handle branching, interrupts could be placed at jump instructions and those would contain info about how many instructions until the next jump or branching instruction.

                          2. 3

                            Very interested to hear more about the gas mileage thing too; that’s the first I’ve heard of this as a first-class concept in a language (prior art I can look at?). My only real experience with embedded languages is Janet, and … to interrupt the VM … is very annoying normally and just impossible (as far as I can figure out) if you’re running it in WebAssembly.

                            I haven’t looked at Cyber or Janet, and I’m not very familiar with the WebAssembly landscape, but I recall at least one Wasm implementation, Wasmtime, having a built-in concept of “fuel” that is consumed by executing Wasm operations, with execution interrupted if it consumes too much fuel.

                          3. 1

                            What is its packaging and module resolution system (if anybody knows)? I could not find it on the features page (but admittedly, I just looked quickly).

                            1. 1

                              This is really nice! Is the playground built on Wasm? If so, how do I use it in my web pages?

                              1. 1

                                I will most likely add an example to the ‘examples’ directory.

                              2. -1

                                Swag already available: https://cyber.equipment/