1. 2

    Nice work!

    How far away is the interpreter from becoming a usable replacement for Bash? I wanted to introduce a native bash replacement in direnv in order to remove the Bash dependency. The last time I tried, it didn’t handle all of the features that are being used in the stdlib.

    1. 2

      Looking over stdlib, I don’t see anything that shouldn’t run under the latest release of Oil. I’d be interested in a bug report if it doesn’t run.

      I just ran osh -n stdlib.sh and it spits out a big AST, so that’s a good sign.

      1. 2

        Oil is also interesting and on my radar. For the direnv use-case, gosh has an inherent advantage since it’s written in the same language as direnv.

        My hope for Oil is that we will get something that is faster and stricter than Bash. The new language seems interesting but ultimately what I want is more like a stricter and faster version of Bash. And then rebuild all of nixpkgs with it. Due to the nature of Nix and how the derivations are being composed using chunks of strings, Bash is quite ideal as a builder companion. But the feedback that Bash gives is sometimes lacking.

        1. 2

          > what I want is more like a stricter and faster version of Bash

          Ah OK, I didn’t quite get the Go connection, but Oil is definitely a stricter version of bash now, and it’s on track to be faster (maybe 3 to 9 months from now).

          In the last couple releases I really nailed the strictness down. Some examples:

          http://www.oilshell.org/blog/2020/10/big-changes.html#tightened-up-string-literals

          As of early this year, e.g. http://www.oilshell.org/blog/2020/06/release-0.8.pre6.html, I consider the OSH part mostly done. That is, the semantics of the language.

          I would say OSH is the most bash-compatible shell by a mile… If that’s what you want, I would think it’s pretty much your only option other than bash itself. (I haven’t tested shfmt recently, but the Go runtime places some fundamental limitations as mentioned)

          There are a few remaining issues, which are mostly unimplemented stuff that existing scripts haven’t tickled. As for divergences, there’s an IFS compatibility bug I know about, but nobody has hit it in practice.

          https://github.com/oilshell/oil/issues?q=is%3Aissue+is%3Aopen+label%3Acompatibility

          https://github.com/oilshell/oil/issues?q=is%3Aissue+is%3Aopen+label%3Adivergence

      2. 1

        Oh neat, I never noticed direnv was written in Go!

        For most Bash code out there, it should be pretty complete modulo some minor bugs you might uncover that would be easy to fix. Some of the funkier Bash features aren’t fully implemented, but if people do use them in the wild I’m happy to take a look or review PRs.

        The only other thing to keep in mind is that this is pure Go, so there’s no real process forking for subshells. This is fine for the vast majority of vanilla Bash code, but if a Bash script relies on PIDs or procfs, then its behavior will change under the Go interpreter. The upside, beyond dropping Bash as a dependency, is that you have tighter control over the interpreter.

        Edit: just noticed what you said about the stdlib script. If you file an issue I can take a look at that this weekend.

        1. 2

          Thanks for the reply. I will give the latest version a try and see if it works with the stdlib. I am quite excited to see if it works now :)

          1. 2

            For those following along, we’re continuing this discussion here: https://github.com/mvdan/sh/issues/624

      1. 2

        Thanks for your effort, even if I don’t care for shell formatters.

        Recently I thought about using your module in order to parse the shell’s interactive command line contents and adjust relative paths to match a different origin CWD so that my directory navigator’s shell integration works more seamlessly. It seemed laborious but generally doable (even if it would work only for trivial cases, yet that’s what most command lines are).

        1. 1

          Happy to give tips if you’re interested. The only immediate problem I see is that calling Go from C++ might be tricky, because the parser produces the syntax tree structure. I’m not sure what I could do to make the syntax package more accessible from other languages. There’s shfmt -tojson too, which adds some overhead but should work from anywhere.

          1. 3

            It doesn’t need to be the same program, and arguably shouldn’t be: the integration’s chdir needn’t succeed, so I have to launch another instance of something anyway, and who cares what that’s written in. But you’ve made me realise I’ve started showing the external command line internally. Again, nothing popen() can’t solve.

            I’ll remember you if I ever get back to it.

        1. 10

          Embedding timezone data seems like a recipe for your binaries being out of date very very quickly. There are important user-visible changes to the database all the time: https://github.com/eggert/tz/commits/master

          1. 8

            I believe the implementation only uses the bundled tzdata when loading a time location from the system fails. So a Go program running on an up-to-date system should continue to work fine, as the bundled tzdata is just a fallback.
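
            A minimal sketch of what that looks like in practice, assuming Go 1.15 or later. The blank import embeds a copy of the database, and time.LoadLocation only reaches for it after the ZONEINFO variable and the system locations have been tried. The helper name demo and the chosen zone are just for illustration.

            ```go
            package main

            import (
            	"fmt"
            	"time"
            	_ "time/tzdata" // opt-in: embeds a copy of the timezone database in the binary
            )

            // demo converts a fixed UTC instant to Amsterdam local time.
            // LoadLocation prefers the system database and falls back to the
            // embedded copy only when no system copy can be found.
            func demo() (string, error) {
            	loc, err := time.LoadLocation("Europe/Amsterdam")
            	if err != nil {
            		return "", err
            	}
            	t := time.Date(2020, time.July, 1, 12, 0, 0, 0, time.UTC)
            	return t.In(loc).Format(time.RFC3339), nil
            }

            func main() {
            	s, err := demo()
            	if err != nil {
            		fmt.Println("could not load zone:", err)
            		return
            	}
            	fmt.Println(s) // 2020-07-01T14:00:00+02:00
            }
            ```

            Building with -tags timetzdata should have the same effect as the blank import, without touching the source.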

            1. 6

              But it will silently have bad behavior on a system without tzdata rather than failing in a way that will allow the operator to install tzdata through system package management - and get updates. If you’re somewhere with relatively stable timezone rules you might never notice that your users are getting bad timezones until they complain.

              1. 4

                You’ll still get updates through Go: just compile with the latest Go release. Most people do this already since it’s very compatible.

                1. 2

                  But the timezone db changes daily or weekly. And your OS will pull those updates automatically while Go releases are much slower and you’d need to recompile and redeploy your binary.

                  1. 7

                    The tzdata package doesn’t release daily or weekly? 2020a is from a few days ago; 2019c is from September (and there were only 3 releases in 2019).

                    1. 6

                      Remember that this is opt-in, and only a fallback. If you prefer to not risk using out of date information, don’t use the tzdata package? Though then you have to ensure that your users/machines are fully up to date.

                      1. 3

                        If I understand correctly, yes, it’s technically opt-in, but there’s no easy way to opt out if a library dependency opts in, nor a mechanism to discourage libraries from importing it? cf. https://github.com/golang/go/issues/38679#issue-607112207

                        1. 2

                          Your libraries can do plenty of bad things already, though. If anything, embedding tzdata is harmless compared to some of the nasty stuff one could do in a library’s init, like modify std error variables or declare global flags.
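
                          To make the global-flags case concrete, here is a small sketch. The init function stands in for one living in a hypothetical dependency, and the flag name hypothetical-lib-debug is made up for illustration. Merely importing such a package would add the flag to your program’s command line.

                          ```go
                          package main

                          import (
                          	"flag"
                          	"fmt"
                          )

                          // init stands in for the init function of a hypothetical third-party
                          // library. It runs before main and registers a flag on the global
                          // flag.CommandLine set without the importing program asking for it.
                          func init() {
                          	flag.Bool("hypothetical-lib-debug", false, "debug switch registered by a dependency")
                          }

                          // registeredByInit reports whether a flag with the given name exists
                          // in the global flag set.
                          func registeredByInit(name string) bool {
                          	return flag.Lookup(name) != nil
                          }

                          func main() {
                          	fmt.Println(registeredByInit("hypothetical-lib-debug")) // true
                          }
                          ```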

                          I think the answer here, besides good docs, is vetting what dependencies you add to your module (and the amount of extra dependencies they bring in).

                          1. 1

                            That’s true and a fair perspective to take. I don’t care much personally, I was just trying to understand/clarify why some people resent this direction.

              2. 5

                Go prefers the system timezone database but will use the embedded data if it’s not available.
                #38017 explains the use cases that this change resolves.
                This change will mostly affect Unix-like systems without tzdata and Windows systems that don’t have Go installed.

              1. 6

                From the 17-page doc on the linker work:

                > Shift work from the linker to the compiler. The linker is a bottleneck of every build graph, even for the smallest incremental change. Unlike the linker, many instances of the compiler can run in parallel or even be distributed.

                This is exciting! Linkers are mostly terrible and haven’t had significant changes in a long time - the compiler generates a lot of the data the linker could use, then throws it away, and at this point many of the historical responsibilities of the linker have been moved to the loader. I’m happy to see anyone in any language’s toolchain rework the linker.

                1. 2

                  I agree - this work is very, very exciting! I briefly discussed it with Austin Clements last summer at GopherCon, and the document they published shortly after makes a very good summary of all the changes they intend to make.

                  They knew it would take time to fully replace the old linker, but I’m pleasantly surprised that 1.15 already includes the new linker. It still behaves like the old linker in many ways, and there’s still lots to do, but it’s great progress.

                  As much as I like projects like LLVM and how easy it is to implement languages on top of it, I also think that Go is taking full advantage of the fact that it has its own compiler and linker. They can carefully fine-tune both of them to the language, making incremental build times very fast.

                1. 5

                  It seems that they are slowly but nicely improving the language, compiler, and runtime.

                  Slide 49 mentions CPU feature detection. Does Go now have intrinsics such as SIMD intrinsics? Or is this still just to be used to select which function to run that was implemented in assembly?

                  I haven’t really followed the generics discussion. Is there already an accepted proposal and an approximate ETA?

                  1. 6

                    I can’t answer either of your questions in detail, but I’ll try to give some pointers.

                    The compiler does treat some standard library APIs as intrinsics, where possible. For example, here’s how it handles math.FMA on AMD64. You can see how it generates code to check for the feature at run-time.

                    I don’t think the compiler is quite clever enough to do that kind of thing for hand-written code. It will usually work if you use pieces of the standard library that the compiler knows about like math and math/bits, but I don’t think it will magically convert uses of arrays and slices into SIMD instructions today.
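
                    As a rough illustration of the standard library pieces I mean, here is a sketch using math.FMA and math/bits. Whether each call becomes a single instruction depends on the target architecture and the Go version, so treat that part as an assumption. The helper names dot and setBits are mine.

                    ```go
                    package main

                    import (
                    	"fmt"
                    	"math"
                    	"math/bits"
                    )

                    // dot computes a dot product with one fused multiply-add per element.
                    // It assumes a and b have the same length.
                    func dot(a, b []float64) float64 {
                    	var sum float64
                    	for i := range a {
                    		sum = math.FMA(a[i], b[i], sum) // a[i]*b[i] + sum, rounded once
                    	}
                    	return sum
                    }

                    // setBits counts the set bits across all words using bits.OnesCount64.
                    func setBits(words []uint64) int {
                    	n := 0
                    	for _, w := range words {
                    		n += bits.OnesCount64(w)
                    	}
                    	return n
                    }

                    func main() {
                    	fmt.Println(dot([]float64{1, 2, 3}, []float64{4, 5, 6})) // 32
                    	fmt.Println(setBits([]uint64{0xFF, 1}))                  // 9
                    }
                    ```

                    The loop in dot is still scalar: it does one multiply-add per iteration, which is exactly the limitation described above.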

                    As for generics, it does seem like they’re still working on the prototype, but no ETA is guaranteed. I imagine there are other priorities at play, especially right now.

                    1. 2

                      Thank you for the extensive answer!

                      1. 1

                        > I’ll try to give some pointers.

                        Clever.

                    1. 5

                      To build on what @andyc and @ddevault said: the main purpose of the project is to expose libraries in Go, and to build tooling with them. The prime example of this is shfmt, which formats shell code.

                      The module includes an interpreter, but like others have said in this thread, full POSIX compatibility is near impossible. It’s essentially best-effort, for the purpose of being able to interpret 99% of shell code out in the wild with pure Go. This can give the developer tighter control on how arbitrary code is executed, or avoid hoops such as cgo or exec with external dependencies.

                      And of course, if POSIX compatibility is a priority, a well established shell with that goal should be used instead. I assume this is where mrsh would be a better fit.