1. 43
    1. 3

      Do you think they will ever let wasm access the dom api

      1. 6

        The day WASM can access the DOM directly is the day the last line of JavaScript will ever be written. I kid, but also not totally :-)

        1. 8

          I don’t see how they’re going to solve the GC problem. If you have DOM manipulation by untrusted code sent over the network, then you really want GC.

          And once you add GC to WASM it’s basically like the JVM, and that means it’s better for certain languages than others. It’s already biased toward certain languages (C and Rust, due to the lack of GC), but I think it will be even more so with GC, because GC requires rich types, knowledge of pointers, etc., and right now WASM has a very minimal set of types (i32, i64, f32, f64).

          1. 4

            Could the browser just kill the tab process if it exceeds some memory threshold? I don’t understand why GC is necessary

            1. 3

              Unfortunately that would limit the browser to roughly content-only pages, in which case you don’t need WASM. Think Google Maps (and pages that embed Google Maps), Protonmail, games, etc. And anything that uses a “SPA” architecture, which is for better or worse increasingly common.

              All those are long-lived apps and need GC. GC is a global algorithm, spanning languages. Web browsers use GC for C++ too, when JS (or in theory WASM) hold references to DOM objects: https://trac.webkit.org/wiki/Inspecting%20the%20GC%20heap

              1. 1

                I see, so the concern isn’t with a rogue WASM app causing bad performance in the other browser tabs, it is about being unable to write the performant WASM app at all without GC?

                1. 1

                  If you didn’t have GC, a browser tab could allocate all the memory on your computer, and many would! The GC is necessary to reclaim memory so it can be used by other tabs / programs.

                  It’s very common to allocate in a loop. That’s no problem in Python and JavaScript because the GC will pause in the middle of the loop and take care of it. In C, you usually take care to reuse the allocation, which is what makes the code “more detailed”.
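
                  For example, in C you’d typically reuse one buffer across the loop rather than allocating per iteration (a minimal sketch; POSIX getline() grows and reuses the buffer you pass it):

                  #include <stdio.h>
                  #include <stdlib.h>

                  // Count lines without generating garbage: getline() reuses (and grows)
                  // a single buffer instead of allocating a fresh string per iteration.
                  int count_lines(FILE* f) {
                     char* line = NULL;   // one allocation, reused on every iteration
                     size_t cap = 0;
                     int n = 0;
                     while (getline(&line, &cap, f) != -1) {
                        n++;              // in Python/JS, each iteration would allocate a new string
                     }
                     free(line);          // one explicit deallocation at the end
                     return n;
                  }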

                  I have some first-hand experience with this because I wrote a shell which does not deallocate anything. That’s not good enough! :) It may actually be OK for small scripts, but there are long-running programs in shell too. You can write a loop that reads every line of a file, etc.

                  So I’m writing a garbage collector to fix that: http://www.oilshell.org/blog/2021/03/release-0.8.8.html#the-garbage-collector-works-on-a-variety-of-examples

                  Right now the practice for WASM is to either write in C or Rust and manually deallocate – or ship a GC over the network with the program, which isn’t ideal for a number of reasons.

                  1. 3

                    But you can easily consume all the memory anyway just by making an array and continually growing it, or a linked list, or whatever. So what’s the difference?

                    Like, wouldn’t it be enough for the WASM code to have some way of telling the browser’s GC its refcount for each object it’s holding on to so it doesn’t get GCed out from under it?

                    1. 1

                      That’s done with weak refs for guest objects owned by JS objects, but there’s nothing to handle the other direction afaik.

            2. 3

              I have noticed recent versions of Safari (at least on arm64) do this. First you get a little warning underneath the tab bar saying “This tab is using significant amounts of memory — closing it may improve responsiveness” (paraphrased). It doesn’t actually go ahead and kill it for you for quite some time, but I’ve noticed that e.g. on putting the computer to sleep and waking it up again, a tab so-marked gets reloaded. It is a little annoying, but it doesn’t come up very often to begin with.

          2. 2

            Agreed, it’s a tricky thing, particularly given how including a GC or other runtime niceties in compiled code bloats the downloaded asset for each site. So I also can’t imagine that they intend to do nothing.

            1. 5

              Yeah, I haven’t been following closely, but it seems like the WASM GC+types enhancements are bigger than all of WASM itself to date. (e.g. there are at least 2 complete WASM interpreters that are all of 3K lines of C code; that would no longer be possible)

              It’s possible to do, but it’s not a foregone conclusion that it will happen, or be good!

              I’d also say that manipulating the DOM is inherently dynamic, at least with the way that web apps are architected today. I say that because (1) DOM elements are often generated dynamically and (2) the values are almost all strings (attributes like IDs and classes, elements, contents of elements, CSS selectors, etc.).

              Writing that kind of code in a statically typed language is not likely to make it any better or safer. You’d probably want something other than the DOM in a more static language. I’d also go as far as to say that JS is better than most dynamic languages at these kinds of tasks, simply because it was designed for it and has libraries/DSLs for it. Python or Lua in the browser sounds good until you actually try to rewrite the code …

          3. 1

            Why can’t GC be optional? “You can turn on the GC, but then you have to comply with this additional set of requirements about using rich types, informing the GC of pointers, etc.”

            Edit: this actually seems like it must work, since it is essentially the existing “ship a GC over the network” solution, except that you don’t have to actually pay the bandwidth to ship it over the network because it’s already in the browser. Unless I’m missing something, which I definitely could be!

          4. 1

            Here’s an idea:

            • you can hold references to DOM nodes. Accessing properties requires going through accessor functions that basically null-coalesce

            • if you just have a reference, it can get GC’d out from under you (accessors will guard from the dereference tho)

            • however, the reference can be reference counted. You can increment the reference count, decrement it (see the Python FFI). You can of course memory leak like this. But making it optional means you can also try and be clever.

            • no handling of cyclical issues. You wanna memory leak? Go for it. Otherwise implement GC yourself

            Reference-counted GC doesn’t involve stop-the-world pauses, and since you likely won’t be linking DOM elements together, cycles would be much rarer.
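
            To make the idea concrete, here is a purely hypothetical set of host imports as seen from C compiled to WASM (none of these functions exist in any spec; it’s just a sketch of the shape):

            #include <stddef.h>
            #include <stdint.h>

            typedef uint32_t dom_handle;                  /* opaque id for a DOM node */

            /* hypothetical imports the browser would provide to the wasm module */
            extern dom_handle dom_query_selector(const char* sel);        /* 0 = null */
            extern int  dom_get_text(dom_handle h, char* buf, size_t n);  /* -1 if node was collected */
            extern void dom_ref(dom_handle h);            /* refcount++: pin the node */
            extern void dom_unref(dom_handle h);          /* refcount--: forget it and you leak */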

            1. 1

              WASM already has a feature called “reference types”, which allows WASM programs to contain garbage collected references to DOM objects. Chrome and Firefox added support for this feature last summer. I don’t know all the details, but you can google it.

        2. 2

          I thought you could already do this. What about this package https://github.com/koute/stdweb for accessing the DOM using Rust that is compiled to WASM?

          1. 3

            That basically bridges to JS to do the actual DOM manipulation. I don’t remember the exact details.

      2. 2

        Misread this as “the doom API”

        …I’m fairly sure doom has been ported to wasm, anyhow

      3. 1

        I mean, you already can? You just have to add the bindings yourself. But that makes sense because not every wasm blob will want the same access or in the same way…

    2. 2

      Obligatory reference to The Birth and Death of JavaScript. I’m sure many people have seen this before, but hopefully someone here hasn’t and gets to enjoy it and its uncomfortably prescient predictions for the first time :-)

    3. 2

      Is WASM the new JVM?

      1. 2

        Yes, but better. Though, as with everything nice and new, groups are working hard on making it worse until it is the same.

        1. 1

          Better for what use case? Can you explain your opinion a little more?

          1. 5
            • Better for multi-implementation: there already are several.
            • Better for understanding: simple set of primitives.
            • Better for extending or embedding or sandboxing: no baked-in system or IO interfaces.
            • Better for polyglot: no assumptions about GC or OOP etc
            1. 3

              But there are no threads, no object model for cross-language interop, etc.

              1. 3
                • You can have threads if your environment gives you the syscalls for them. That’s up to your implementation and not the wasm spec, which is one of the advantages.
                • The cross-language story is the same as basically everywhere else: a C API. Richer polyglot APIs that actually work interest me, but the JVM has a baked-in model that is very limiting to what languages can run on a JVM at all… wasm just skips that and operates at a lower level; nothing prevents a subset of languages from agreeing on something to layer in for this.
    4. 1

      Has anyone embedded a WASM engine, whether Wasm3 or otherwise? I’d be interested in hearing experiences.

      1. 4

        I worked on a production “hosted function” system, first in Go with Life, then rewrote it in Rust with Lucet. It was very nice to work with.

      2. 4

        I’ve used wasmtime from Rust a couple times, though not for anything large. I made a language that compiled to wasm and used wasmtime to run its test cases. The interface is Rust-unsafe but pretty easy: load a wasm module, look up a function by name, give it a signature, and then you can just call it like any other function. Never got complicated enough to do things like pass pointers around though, or make host functions accessible to the wasm code.

      3. 3

        Microsoft Flight Simulator (2020) somewhat embeds a Wasm engine. Addons are written in Wasm and compiled to native code using inNative, an LLVM frontend for Wasm, which solves problems with multi-platform support and restrictions on JIT compilation on consoles. The PC builds embed inNative’s JIT, which uses LLVM’s JIT interface, for faster development edit/test cycles.

        1. 3

          That’s sensational! I had been wondering when we were going to improve on Lua for modding.

          I’m also somewhat keen on the idea of sandboxing native libraries that are called from high-level languages; if the performance overhead can be brought suitably low, I would really like to be relatively safe from library segfaults (especially for image processing).

      4. 1

        What do you mean by embedded? Like, in an app or on some hw device?

        1. 4

          Just in a C++ app :) Actually I have an idea to embed WASM in https://www.oilshell.org/ . One use case is to solve some bootstrapping problems with dev tools. For example, if you have a tool written in a native language like a parser generator, then it’s somewhat of a pain for people to either build those, or for the maintainer to distribute binaries for them (especially if they change often).

          So it seems natural to write a shell script and call out to an “embedded” arch-independent binary in those cases. (Though this probably won’t happen for a long time.)

          (BTW the work on wasm3 seems very cool, I looked at the code a bit, and hope to learn more about WASI)

          1. 1

            I think wasm3 is perfect for this scenario. Especially if you realize that wasm “plugins” can be written in a variety of languages. C/C++, Rust, TinyGo, AssemblyScript, Swift…

            1. 1

              Yes the polyglot nature is very natural for shell :) How stable is WASI now?

              Is it easy to compile and run some C code like this with wasm3? Can I just use clang and musl libc, or are there some other tools? Any examples to start from? I have run wasm in the browser but I didn’t compile any C.

              #include <stdlib.h>
              #include <string.h>
              #include <unistd.h>

              int main(int argc, char** argv) {
                 char buf[1024];
                 read(0, buf, sizeof(buf));            // read from stdin
                 write(2, argv[0], strlen(argv[0]));   // write argv[0] to stderr

                 char *p = getenv("PATH");
                 if (p) write(1, p, strlen(p));        // write $PATH to stdout
                 return 0;
              }
              

              So I want to call main directly; I guess I need a wasm stub that calls it?

              I think I want to provide only an argv/ENV/stdin/stdout/stderr interface to simulate a sandboxed C program. I’m not sure I want binary blobs loaded into the shell to be able to read and write arbitrary files. The files should be opened in the shell, like this:

              my-wasm-program.wasm <input.txt >output.txt
              

              This also has some bearing on incremental computation like Make, e.g. knowing the inputs and outputs precisely from shell, rather than having to analyze C code or WASM code.

              1. 1

                This is exactly what you want. You can compile C to WASI easily using wasienv. Also, it’s a matter of runtime configuration whether to allow FS access. Stdin/stdout are open by default, but can also be blocked.

                1. 2

                  Hm so how do I embed it in an application and use the C API? I looked at the README.md, the doc/ folder, and this header:

                  https://github.com/wasm3/wasm3/blob/main/source/wasm3.h

                  I don’t see any C code examples?

                  In contrast the Python binding has an example in the README:

                  https://github.com/wasm3/pywasm3

                  1. 2

                    Good idea. I’ll create some kind of tutorial ;)

                  2. 0

                    I don’t see any C code examples?

                    Check out this: https://github.com/wasm3/wasm3/blob/main/docs/Cookbook.md
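
                    Roughly, the embedding flow looks like this (off the top of my head, untested; names are taken from wasm3.h and may not match the current header exactly, so treat the Cookbook as the reference):

                    #include <stdio.h>
                    #include <stdint.h>
                    #include "wasm3.h"

                    // Load a module from a byte buffer and call an exported
                    // function "add(i32, i32) -> i32".  Error handling abbreviated.
                    int run_add(const uint8_t* wasm, uint32_t len) {
                       IM3Environment env = m3_NewEnvironment();
                       IM3Runtime runtime = m3_NewRuntime(env, 64 * 1024, NULL);  // 64 KiB wasm stack

                       IM3Module module;
                       M3Result err = m3_ParseModule(env, &module, wasm, len);
                       if (!err) err = m3_LoadModule(runtime, module);

                       IM3Function add;
                       if (!err) err = m3_FindFunction(&add, runtime, "add");
                       if (!err) err = m3_CallV(add, 3, 4);            // push the two i32 arguments
                       if (err) { fprintf(stderr, "wasm3: %s\n", err); return 1; }

                       int32_t sum = 0;
                       m3_GetResultsV(add, &sum);                      // fetch the i32 result
                       printf("add(3, 4) = %d\n", sum);

                       m3_FreeRuntime(runtime);
                       m3_FreeEnvironment(env);
                       return 0;
                    }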