1. 9

    A BIG oof about this. I really loved actix for what it was doing for the rust community. But I also shook my head at how ignorant some of the creator’s replies seemed to be when people tried their hardest to point out why a specific usage of unsafe wasn’t right. Still, I always kept using actix, because I think it was a nice project in itself and improving it would eventually remove all its UB. This reminds me of systemd for some reason.

    Obviously there was also too much of a shitstorm about actix itself, so I can’t blame the author for complaining about how people started to treat actix.

    I’ll have to transition some projects. Guess I won’t have to do the async/await switch for the actix code anymore.

    1. 3

      Even if the code was unsafe and the author impervious to the remarks, it’s his code and therefore his right. Users of his code are not entitled to anything. If the users don’t like it, they can fork it and do as they please. Expecting anything more is preposterous.

      1. 2

        Are there actual exploits for any of the unsafe code? Or is it the usual Rust cargo-cult overreaction?

        1. 5

          Both. Issues got proven to be exploitable from userland.

          1. 9

            I wouldn’t go that far. It was proven that there’s a usage pattern that would trigger UB if the user used it that way. The pattern in itself is unlikely and probably not present in any application out there.

            So there’s no general path to exploiting actix-based applications.

            It’s basically equivalent to the openssl side of heartbleed: if you use openssl wrong, it is exploitable, but you are still the one using it wrong. Given that the actix-web API didn’t seem to be intended to be used that way by library clients, it’s even less likely.

            Not arguing that it shouldn’t be fixed, but let’s be realistic about the impact.

            1. 4

              It was proven that there’s a usage pattern that would trigger UB if the user used it that way. The pattern in itself is unlikely and probably not present in any application out there.

              From a security engineer’s point of view, doesn’t that constitute “completely broken”?

              1. 6

                No. You are literally flying planes or driving cars based on systems with the potential for such bugs. There’s usually expensive tooling around the “don’t do this, then” (linters and such).

                1. 4

                  User here refers to the developer using the library, not the user interfacing with the resulting web service. Under normal circumstances (that is, not a contrived example code snippet) it’s not broken, but it requires some care to make sure the problematic pattern wasn’t used somewhere, in spite of it being unusual.

                  strcpy isn’t “completely broken” and for a given use of it it’s possible to reason that it is safe (if it is). Still, a security engineer would recommend (and at some point: demand) not to use it to reduce the amount of reasoning required (and remove the code smell). The issue at hand in actix is much less worrisome than strcpy in terms of being able to shoot yourself in the foot.

                  AIUI the author was interested in handling this, but on their own terms. Apparently that wasn’t enough for some and so they cranked up the internet rage engines.

                  1. 3

                    From a security engineer’s perspective, we would talk about Likelihood or Difficulty (both of Discovery and of Exploitation) as well as Impact; you may see other metrics thrown in like Confidentiality, Integrity, and Availability.

                    If exploitation requires the user to use a specific library in a problematic way, that usually constitutes a Likelihood of Very Low/Low or a Difficulty of High; basically, yes, this could be bad (Impact of High), but the Likelihood of a user doing that may be very low. Security people who speak in absolutes are usually trying to sell you something or not very effective at their jobs.

          1. 2

            Some folks would’ve just requested a Forth or a Lisp. Then, they add whatever they want. Full environment. :)

            1. 4

              I wrote a Forth for CTFs once; it was meant to be able to easily compile statically, base64-encode, and then upload to the host. It was a lot of fun.

            1. 9

              I can’t agree with this more. While doing olin/wasmcloud stuff I have resorted to this slightly insane method of debugging:

              • debug printf the pointer of a value you want to look for
              • immediately crash the program so none of the heap gets “contaminated”
              • pray the value didn’t get overwritten somehow (in my subjective and wholly personal experience Rust does this sometimes, it’s kind of annoying)
              • dump the webassembly linear memory/heap to a binary file
              • search for that pointer using xxd(1) and less(1)
              • grab my little-endian cheat sheet
              • hopefully derive meaning from it

              WebAssembly debuggers really need to happen, but until they do we are basically stuck in the stone age of printf debugging, black magick, and tarot cards; at least it’s slightly easier to make a core dump. After a few hours, though, you learn to read little-endian integers without needing to break out graph paper.
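
              In Rust terms, the first couple of bullets plus the little-endian step boil down to something like this toy sketch (made-up value, nothing from the actual olin/wasmcloud code):

              fn main() {
                  let value: u64 = 0xdead_beef_cafe_babe;

                  // Debug-print the pointer so it can be grepped for in the memory dump.
                  eprintln!("value lives at {:p}", &value);

                  // Reading the bytes back out of the dump is just a little-endian
                  // conversion, e.g. the eight bytes "be ba fe ca ef be ad de":
                  let bytes = [0xbe, 0xba, 0xfe, 0xca, 0xef, 0xbe, 0xad, 0xde];
                  assert_eq!(u64::from_le_bytes(bytes), value);

                  // Crash immediately so none of the heap gets "contaminated".
                  std::process::abort();
              }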

              1. 2

                I’ve used instrumentation tools like Wasabi for dynamic testing during client assessments (yes, we have clients working in WASM already). Also, we’ve extended Manticore, our symbolic executor, to support WASM, and we have some symbolic-debugging-like scripts for it internally.

                In general tho, I whole-heartedly agree; we’re seeing more and more of this sort of thing, and the tooling definitely needs to catch up.

                1. 2

                  To be fair, the changes needed on the DWARF/LLVM side are super recent; I believe I saw commits related to this in LLVM as recently as December. The debuggers will catch up, but it takes time for some of this stuff to percolate outwards. I haven’t done any testing myself in the last few months, but I suspect the browsers have experimental support on the way shortly, if not already available. It will take a bit longer for this stuff to make it into, say, lldb, but not that much longer.

                  The current situation does suck for those of us needing to build stuff with WebAssembly right now, but I keep myself sane by knowing that it’s just growing pains; we’ll get the necessary tooling sooner rather than later.

                  1. 1

                    Call me a pessimist, but existing debugging infrastructure might not be good enough for this. We might wanna start with something like AVR debuggers or other things made for Harvard architectures.

                    1. 2

                      Presumably there are different needs. The first need is for front-end / in-browser use cases. For that, foo.wasm.debug could effectively include all the debug symbols, a source map, or whatever, and you use the in-browser debugger. That seems fine.

                      The server side is more problematic, I think, but that’s where your Harvard arch comes in. Provide a manhole cover in your execution environment that can be opened for the purpose of attaching a debugger, and poke and prod with step debuggers, perhaps? You’d still need the debug symbols and such… of course.

                      1. 1

                        JTAG into a wasm env; really, this just needs a handful of exported fns from the env itself. One could run wasm in wasm and implement it yourself in user space.

                  2. 1

                    What if your wasm env could visualize memory in real-time and set breakpoints in any fn call or memory write?

                    1. 1

                      Debuggers have two sides: the symbol side and the target side. Breaking/resuming and reading/writing memory are done on the target side, and you can implement those in your WebAssembly environment. But that only lets you do break 123 or print 456, not break function or print variable. The article is mostly about the symbol side.
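
                      As a toy illustration (names and addresses made up), the symbol side is essentially a table that lets break function be lowered into the break 123 that the target side understands:

                      use std::collections::HashMap;

                      fn main() {
                          // Symbol side: a mapping from names to raw addresses.
                          let symbols: HashMap<&str, u64> =
                              HashMap::from([("function", 123), ("variable", 456)]);

                          // Target side: it only understands raw addresses, so "break function"
                          // has to be lowered to "break 123" before it can do anything.
                          println!("break {}", symbols["function"]);
                          println!("print {}", symbols["variable"]);
                      }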

                      1. 1

                        That would solve a lot of problems, but it would also be a lot of work :)

                        I’m gonna see what I can do though.

                        1. 1

                          Here is a veritable zoo of binary visualizers

                          https://reverseengineering.stackexchange.com/questions/6003/visualizing-elf-binaries

                          I think the path of least friction is to run wasm3 within the browser so that you can have 100% control over execution. It will be slow (~25x), but it provides first-class access to all facets of execution of the code under debug.

                    1. 5

                      Besides what @nickpsecurity said, I think the other thing that has changed is that formal methods & fancier types are slowly creeping in from the edges.

                      • 20 years ago, linear & affine types were neat academic exercises; now we have at least one major programming language with them (see the sketch after this list)
                      • languages like ML, Haskell, &c were side curiosities, now you can often find major projects written in them
                      • tools like symbolic executors, abstract interpreters, “design by contract,” and so on are relatively normal now, not fancy wares of academic high towers
                      • property testing, fuzzing, and other types of random mutation testing aren’t seen as black arts, but rather mundane things that most people can use
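
                      On that first point, the “major programming language” is presumably Rust, where affine types surface as plain move semantics; a minimal sketch:

                      fn consume(s: String) {
                          println!("{s}");
                      }

                      fn main() {
                          let greeting = String::from("hello");
                          consume(greeting);    // ownership of `greeting` moves here
                          // consume(greeting); // a second use would not compile:
                          //                    // "value used here after move"
                      }
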
                      1. 5

                        In security, especially red team, this is called “living off the land,” where an attacker or red team member uses only tools installed on the target machine to do whatever nefarious business. There is also the requisite handwringing over it.

                        I’ve gone so far as to write little shells whenever I see remote code executions, so that I can just use the RCE and not have a real footprint outside of the logs; for example this tiny shell installs just a small beacon on a host that has a file upload vulnerability, but I’ve repurposed tramp and shell to provide the same functionality when only an RCE is present.

                        I honestly prefer it; it’s way easier to clean up, as you must simply tell the client where the RCE was and how to fix it, rather than hoping that they also find all your files. There have definitely been hosts that when I went back to do another assessment, my old beacon was still running, and thus I had an instant Critical even tho the development team had fixed the original issue (as a side note, I approach that topic gingerly; if the original vuln is fixed and someone just missed one of my beacons, I try to get that removed prior to adding it to the report, assuming local infosec resources are on board with that. It means that the devs don’t feel like I’m scoring points and everyone is more secure because the issue was fixed anyway).

                        edit homophones, how do they work? (right => write)

                        1. 4

                          Main Workstation

                          • OS: 64bit Mac OS X 10.13.3c 17D47
                          • Kernel: x86_64 Darwin 17.4.0
                          • Shell: zsh 5.6.2
                          • Resolution: 3440x1440 | 3440x1440 | 1920x1080
                          • CPU: Intel Core i7-7700K @ 4.20GHz
                          • GPU: MSI VGA Graphic Cards RX 580 ARMOR 8G OC
                          • RAM: 64GB
                          • Keyboard: Redragon Kumara with Cherry Reds
                          • Mouse: Corsair Harpoon RGB
                          1. 4

                            Not sure where to look. I get itchy having more than 5 tabs open, let alone that many screens begging for my attention. Kudos to you for being able to handle all that.

                            1. 2

                              I’m with you; most of the time I’m fine hacking on my little MacBook Air, and I generally use things in full screen mode.

                              • Work: MBP 15”
                              • Personal Travel/Bumming Around: MBA (Retina 2019)
                              • Personal Home: MBP 15” (2012)

                              my home machine is the only one with two monitors, and they are:

                              • a 25” that only displays code
                              • the 15” that only displays Slack & Chrome
                            2. 2

                              I’m actually most impressed by the soundproofing setup you have, although I get the impression that it’s meant to kill echoes for videoconferencing rather than mute outside noises, right?

                              1. 1

                                Correct, the setup is meant to kill echoes. Soundproofing to mute outside noise is significantly more difficult and costly, but I eventually want to get there.

                            1. 23

                              FTFY: “A plea to developers everywhere: Write Junior Code”

                              Let’s get bogged down with how much simple code we write.

                              God, I wish every developer would make an effort to write simple code.

                              1. 7

                                I don’t disagree with you at all, but Haskell does have a bit of a spiral problem with these sorts of things; often folks writing even simple Haskell programs end up using very exotic types that are abstruse to more junior devs (or even more senior devs who just haven’t looked at, say, lenses before). I have this tweet about a simple dialect of Haskell saved because I think about this often when interacting with Haskell code.

                                1. 8

                                  Those exotic types describe complexity that is present in other languages as well. However, in other languages, you do not need the type checker’s permission to introduce complexity. Instead, you discover this complexity after the fact by debugging your program.

                                  It is questionable whether the Haskell approach is as wise as it is clever. At least to me, it does not seem very suitable for writing what the original post calls “junior code”. Consider some of Haskell’s main features:

                                  • Purity and precise types:

                                    • Benefit: You can use equational reasoning to understand the complexity in your code.
                                    • Drawback: You cannot ignore the complexity in your code, even when it does not matter to you.
                                  • Lazy evaluation:

                                    • Benefit: It is easy to write programs that manipulate conceptually large data structures, but in the end only need to inspect a tiny part of them.
                                    • Drawback: It is difficult to track the sequence of states resulting from running your program.
                                  • Higher-kinded types:

                                    • Benefit: It is possible to abstract not only over concrete types, such as Int or String, but also over “shapes of data types”, such as List or Tree (leaving the element type unspecified).
                                    • Drawback: Oftentimes, type errors will be an unintelligible mess.

                                  It is ultimately a subjective matter whether these are good tradeoffs.

                                  1. 6

                                    often folks writing even simple Haskell programs end up using very exotic types

                                    … abstruse …

                                    🤔

                                  2. 1

                                    Isn’t a large aspect of Java and C# that they force you to write simple code? Then they get called “blub” languages or whatever. The reality is that you should write for whoever your audience is. Explaining everything such that a six-year-old can understand it requires an inordinate amount of effort, and without picking a target audience this is what your suggestion devolves into.

                                    1. 7

                                      Isn’t a large aspect of Java and C# that they force you to write simple code?

                                      No. C# has had type inference, covariant and contravariant generics, opt-in dynamic typing as distinct from type inference, lambdas, value variables, reference variables, checked and unchecked arithmetic, and G–d knows what else I’m forgetting since at least the late 2000s. Java’s missing some of that (although less and less recently), but adds to it things like implicit runtime code generation, autoboxing, and a bunch of other stuff. Neither language is intrinsically simple.

                                      But that said, I don’t honestly know that they’re honestly much more complicated than most languages, either. They’re more complicated than Go, maybe, but I don’t even know for sure if they’re more complicated than Python. The thing is that Java projects—at least, the “enterprise” ones for which the language has become famous—go crazy with complexity, despite—and often at odds with—the underlying language. There’s nothing preventing Python from doing absolutely crazy things, for example, and people who remember pre-1.0 versions of Django might recall when it used metaclasses and what would now be importlib to make one hell of a lot of magic happen in model classes. But the community rejects that approach. The Java community, on the other hand, is happy to go crazy with XML, factories, and custom class loaders to roam way into the Necronomicon of software development. I tend to regard this as the ecosystem, rather than the language, going to the extreme.

                                      Haskell in practice, to me, feels like what C# or Java code taken to the extreme would look like. And there are indeed libraries like language-ext for C# or Arrow (which is for Kotlin, but same difference) that do go there, with (IMVHO) disastrous results. (Disclaimer: I work heavily on an Arrow-based code base and am productive in it, albeit in my opinion despite that comment.) This is also an ecosystem decision, and one that I think this article is rightfully and correctly railing against.

                                      1. 5

                                        There’s nothing preventing Python from doing absolutely crazy things, for example, and people who remember pre-1.0 versions of Django might recall when it used metaclasses and what would now be importlib to make one hell of a lot of magic happen in model classes. But the community rejects that approach.

                                        I don’t think that’s true at all. The difference is that Python has good abstractions, so if you want to do something complex under the hood, you can still expose a simple interface. In fact, Python programmers would much rather use something with a simple interface and complex internals than the other way around. That’s why they’re using Python!

                                        1. 4

                                          I’m not sure we’re disagreeing, except that I think you’re implying that Java and C# lack the ability to expose something with complex internals and a simple interface. I’m logging off tech for the weekend, but Javalin is a great example of a Java framework that’s on par with Flask in terms of both simplicity and power, and done with 100% vanilla Java. It’s just not popular. And the reason I cited early versions of Django for Python is specifically because the community felt that that tradeoff of a simple interface for complex internals went too far. (If you have not used way-pre-1.0 versions of Django, it did Rails-style implicit imports and implicit metaclasses. We are not talking about current, or even 1.0, Django here.)

                                          In other words, I think you’re making my point that this is about culture and ecosystem, not language in the abstract. Which is also why this article is making a plea about how to write Haskell, and not about abandoning Haskell for e.g. OCaml.

                                          1. 4

                                            Ah right yes I see about the Django thing. I was thinking about how it uses them now. I wasn’t aware it did import magic before, that definitely sounds a bit much!

                                    2. 1

                                      I used to use juxt and comp and partial quite a bit in my Clojure code, but these days I try to avoid them. They’re clever, they’re fun, they’re succinct… but they can also make it harder for the next person who comes along if they’re not already a Clojure hotshot.

                                      1. 6

                                        That’s setting a pretty low bar, isn’t it? Partially applying functions isn’t exactly whizz-bang fancy-pants programming in a Lisp.

                                        1. 2

                                          And yet, there’s usually another way to write it that’s more clear to someone not as familiar with Lisps.

                                          (I’m not saying “never use these”. There are definitely times when it’s more awkward to use something else.)

                                          1. 3

                                            Function composition is the most fundamental functional programming concept as far as modularity is concerned, and partial application is not far behind. They are not specific to Lisps. juxt is slightly more “clever,” but nonetheless provides a ton of utility, is a part of the core library, and should not be shied away from. Talking about avoiding these functions without explicit examples or clear criteria is pointless.

                                            Do you disapprove of any macro usage in your Clojure code? Are transducers out? What about core.async? I’ve seen more “clever” and confusing code written using those features than with any of the functions you’ve listed. For that matter, the worst (all?) Clojure codebases tend to be agglomerations of layer after layer of “simple” map-processing functions which are impossible to grasp in the aggregate and incredibly frustrating to debug. This is evidence of a general lack of coherent system-level thinking, versus any specific features in Clojure being responsible for complex, unmaintainable code.

                                            The guidelines for writing clean, simple, maintainable code are never so straightforward such that they can be stated pithily, to the chagrin of Rich Hickey true-believers everywhere. It’s a combination of figuring out what works for a given team, adopting conventions and architecture well-suited to the domain, and choosing an environment and libraries to integrate with so that you introduce as little friction as possible (and probably more that I’m forgetting, unrelated to the choice of language). But picking and choosing arbitrary functions to eschew will not get you very close to the goal of writing simple code.

                                            1. 2

                                              I think you’re taking this a lot farther than what I actually said.

                                              1. 2

                                                I’m sorry, I was trying to respond systematically to a comment I disagreed with. If you wouldn’t mind: how exactly did I take it too far?

                                                1. 1

                                                  Well, I didn’t say “don’t use these”, I said that I “try to avoid them”. I don’t always succeed in that, and I’m happy to use them where they make sense.

                                                  There’s a continuum between “can’t avoid it” and “totally gratuitous” and I try to push my personal cutoff towards the left, there. When it would make the code harder to read, I don’t avoid them!

                                                  1. 1

                                                    Well, I didn’t say “don’t use these”, I said that I “try to avoid them”. I don’t always succeed in that, and I’m happy to use them where they make sense.

                                                    Why do you try to avoid using them? When does it make sense to use them?

                                    1. 9

                                      Seems that a big part of the justification in the article is to support small stacks for goroutines. Go is not the only language with green threads, but it seems fairly atypical in its bypassing of libc; how do other languages with green threads handle this?

                                      1. 5

                                        I think Go is unusual in that it uses a custom ABI to support the green threads internally, somewhat like Haskell.

                                        In typical cases where you’re not doing something like “Cheney on the MTA” style stack manipulation, you can just use setjmp and longjmp for management of the stack, and let the system figure out restoring. Go does quite a bit of stack manipulation in order to handle this stuff.

                                        1. 5

                                          “Cheney on the MTA” style stack manipulation,

                                          Please, please elaborate on this.

                                          1. 18

                                            sigh I accidentally reloaded the page when I had a bunch of links queued up for you hahaha 😭

                                            However, sorry for not elucidating on that reference! “Cheney on the MTA” is a garbage collector style that uses the stack to allocate objects rather than the heap. To do this, Cheney uses a set of function calls and callbacks, so as to “clear” the stack under certain conditions. The original paper by Henry Baker is super interesting, and yes this is yet another Henry Baker creation (if you’re not familiar, Henry Baker is/was a big Lisp and ML-the-language person who designed a number of interesting things for those languages). You can see this in Baker’s paper:

                                            object foo(env,cont,a1,a2,a3) environment env; object cont,a1,a2,a3;
                                            {int xyzzy; void *sp = &xyzzy; /* Where are we on the stack? */
                                             /* May put other local allocations here. */
                                             ...
                                             if (stack_check(sp)) /* Check allocation limit. */
                                                {closure5_type foo_closure; /* Locally allocate closure with 5 slots. */
                                                 /* Initialize foo_closure with env,cont,a1,a2,a3 and ptr to foo code. */
                                                 ...
                                                 return GC(&foo_closure);} /* Do GC and then execute foo_closure. */
                                             /* Rest of foo code follows. */
                                             ...
                                            }
                                            

                                             That GC call with a closure and such is used to actually clean the stack of garbage. Chicken Scheme use(s|d) it to great effect, but it meant that calling into Chicken code was slightly more complex, because it’s not exactly the same calling convention: there are extra parameters and such that need to be passed in, and many things are actually linked around callbacks. (This is the most hand-wavy explanation of this, without going into details, and if Christian is still on lobste.rs he will likely correct me…)

                                             Now, as related to Go, Go also uses its own calling convention and ABI internally. This has at times clashed with certain optimization techniques, because it’s not exactly register based, and like Haskell or Chicken Scheme, it’s not exactly trivial to call all code from C sources because of this change in ABI. This document here is one of the better references for Golang’s calling conventions, at least for x86/64; it’s not terrible, and there are reasons for it, but it does make things non-trivial; for example, Osiris Lab at NYU just added Golang support to Ghidra, because it is slightly different from what other languages and stacks use.

                                            That was a lot, but I hope that made the reference more clear? Basically, there are systems like Cheney and others that manipulate the C stack via various means to make certain things easier, like Garbage collection. Golang does the same, but for purposes of making goroutines lighter weight.

                                            1. 4

                                              Thank you so much for the explanation!

                                              I thought it was going to be some slang/metaphor for like GC as it would be handled by a certain former US vice president trying to use public transportation or something–like, shooting in the face any allocations that weren’t high enough value or something.

                                              1. 7

                                                 ah, no no; Cheney therein is a reference to C.J. Cheney, best known for Cheney’s Algorithm, a stop-and-copy algorithm from the 70’s. Baker was making a song reference (“Charlie on the MTA”) and alluding to Cheney’s algorithm.

                                                1. 7

                                                  I’m a firm believer in that knowing the history and culture of a subject is essential to understanding it, and I do not know of another website where I could get this kind of insight into computing. Thank you.

                                                  1. 3

                                                    I also think that, esp. in computer science and to a lesser degree in software engineering, we have a tendency to not “survey the literature” regarding what we’re doing. So we end up very often recreating the same modalities as other pieces of software or algorithms without realizing this.

                                                     I’m very big into the history of programming languages, since I’ve read so many papers and books on the topic, and it’s always interesting to see how often we recreate things we already had in the 70’s. A similar comment applies to operating systems, to a lesser degree, since we generally create fewer of them.

                                                    1. 2

                                                      In the large, and the small.

                                                      One of the characteristics of the best programmers I’ve worked with is a reflex to check whether the problem they’re dealing with is a solved one.

                                                      Another is to have the chops to solve the problem if it isn’t :)

                                                      Yet another is to check whether their eventual solution - either their own, or a previously invented wheel - is idiomatic.

                                                      All but one are problems solved with a search engine and a few good books, for most of the problems most teams encounter. And yet I lose count of the number of times people haven’t checked, and it’s been a mess, and the inheritors of the mess have wondered why the original implementers reinvented that particular wheel. As an octagon.

                                      1. 3

                                        I don’t understand this post at all. In real life, what matters is how long it takes to determine if one number is the correct one (the eval function), and if that process leaks any information via timing or energy consumption or similar. What’s the point of all this? If you can control the oracle and re-write it for SIMD, why do you need to guess a number?

                                        1. 11

                                          I tried to address this in the blog post since I expected a question like this, but obviously I didn’t make my point.

                                          So the goal of this wasn’t to offer a new (and expensive) way to find whether an oracle verifies a 64-bit number.

                                          The point was to show that problems that at first seem insurmountable or impractical may be well within reach, and that spending some time looking at available hardware resources and how to formulate the problem can make huge difference over the naive approach.

                                          I can’t quite remember the details, but the original impetus for this came from a real issue. We wanted to know which would be faster: to run some analysis passes or just brute force the answer. This led to a discussion about how hard it is to brute force a 64-bit compare, with the original consensus being “wait until the end of the universe”. Obviously, it doesn’t take that long even for a naive approach, so then I got curious about exactly how insurmountable the “guess a 64-bit number” problem was.
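
                                           For concreteness, the naive approach in question is nothing fancier than a loop like this sketch (the secret is an arbitrary stand-in for whatever the oracle actually checks):

                                           fn main() {
                                               // Stand-in for the unknown value; in the real problem this is a full
                                               // 64-bit secret, so the loop below has up to 2^64 iterations to do.
                                               let secret: u64 = 7;

                                               let mut guess: u64 = 0;
                                               loop {
                                                   // The "oracle" here is just an equality check.
                                                   if guess == secret {
                                                       println!("found {guess}");
                                                       break;
                                                   }
                                                   guess = guess.wrapping_add(1);
                                               }
                                           }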

                                          1. 2

                                            I can’t quite remember the details, but the original impetus for this came from a real issue.

                                            iirc, it was random numbers used as verifiers; we had someone that we were attempting to persuade to use 128bits for something, and this came up.

                                            note Artëm and I work together!

                                            1. 1

                                              Sorry I missed the point, I was just really confused about how/why one would even think to ask the question in particular.

                                              In my defence, the question basically boils down to “how quickly can I search for a number that equals 7”

                                            2. 7

                                              What’s the point of all this?

                                              Rule of cool, man, rule of cool.

                                              1. 2

                                                 As @artem says below, it did actually come from a real-life issue, so whilst cool, it definitely was more of our “I wonder if…” line of thinking.

                                            1. 2

                                              I don’t know much about Nim, but I’m pretty tired of the implication that scripting languages can’t be compiled. Nearly every well-known scripting language has a compiler; stop perpetuating 1990s stereotypes.

                                              1. 5

                                                It’s funny how you say “1990s stereotypes” when PHP didn’t have a compiler until 2010.

                                                1. 7

                                                  While a fair criticism, 2010 was nearly a decade ago. There are certainly many of us who still think of the nineties as merely a decade or so ago.

                                                  1. 3

                                                     One’s perception doesn’t matter: 2010 is 9 years ago (9 and a half if we consider February 2010, the release date of HipHop), and calling it “1990s stereotypes” to bring forward an incorrect and overblown statement is not a valid move.

                                                    1. 2

                                                       Perceptions definitely matter when it comes to language and communication. This is because you must communicate such that it is received according to the listener’s expectations. A speaker who communicates to the listener’s perceptions will be far more effective. “90s stereotypes”, for example, may not have been meant to be taken so literally. Now you can say as a listener, “It didn’t work for me, I took it literally,” and that’s a fair criticism; however, you’re not the only listener, and not every listener will receive it the way you did.

                                                  2. 4

                                                    Ruby was a tree-walking interpreter until 2009 (based on this page).

                                                    Python is said to be compiled but has, and will continue to have, a “Global Interpreter Lock”.

                                                    @technomancy is getting way too overwrought over the imprecise but useful phrase, “compiled language”. Wikipedia: “The term is somewhat vague.” There is a useful separation here. Until we come up with a better name for it, let’s be kind to usage of “compiled language”.

                                                    1. 10

                                                      Python is said to be compiled but has, and will continue to have, a “Global Interpreter Lock”.

                                                      These things are not in opposition to each other. OCaml has a native code compiler, and its runtime system has a GIL (a misnomer: a better name is “global runtime lock”).

                                                      1. 1

                                                        Interesting!

                                                      2. 4

                                                        imprecise but useful phrase

                                                        What do you find this phrase to be useful for?

                                                        You can see in the comments below that the author’s intent was not to describe programs that compile to machine language, but actually to describe programs that can be distributed as single-file executables. (which is also true of Racket, Lua, Forth, and many other languages)

                                                        So it seems that even by declaring “compiled” to mean “compiled to machine language” we haven’t actually achieved clear communication.

                                                        1. 2

                                                          I’m actually rather interested in how you get single file executables with Lua? Is there a tool that makes it easy, or is it something out of the way?

                                                          EDIT: I know how to get single file executables with Love2d, and I vaguely recall that it can be done with Lua outside of Love, but it’s certainly not an automated/well-known thing.

                                                          1. 4

                                                            Many people simply embed Lua in their C programs (that was the original use case it was designed for) but if you’re not doing that you can streamline it using LuaStatic: https://github.com/ers35/luastatic/

                                                          2. 2

                                                            You’re pointing out a second way in which it is imprecise. I’m pointing out that for me – and for a large number of people who don’t know all the things you do – it was useful since it perfectly communicated what the author meant.

                                                            1. 3

                                                              Oh, interesting, so you mean to say that when you read “compiled language” you took it to mean “produces a single-file executable”?

                                                              This is honestly surprising to me because most of the time I see “compiled language” being misused, it’s put in contrast to “interpreted language”, but in fact many interpreted languages have this property of being able to create executables.

                                                              It’s just a tangled mess of confusion.

                                                              1. 5

                                                                Indeed it is. I didn’t mean that I understood “produces a single-file executable.” I meant that he’s pointing towards a divide between two classes of languages, and I understood the rough outline of what languages he was including in both classes.

                                                                Edit: I can’t define “compiled language” and now I see it has nothing to do with compilation. But I know a “compiled language” when I see it. Most of the time :)

                                                                1. 3

                                                                  Perhaps a good way to put it is “degree of runtime support that it requires”. Clearly, normal usage of both Nim and Python requires a runtime system to do things for you (e.g. garbage collection). But Nim’s runtime system does less for you than Python’s runtime system does, and gets away with it mostly because Nim can do many of those things at compile time.

                                                                  1. 4

                                                                     Even if a language retrofitted a compiler 20 years ago, it’s hard to move away from the initial programming UX of an interpreted language. Compilation time is a priority, and the experience is designed to keep people from being aware there’s a compiler at all. With a focus on UX, I think all your examples in this thread have a clear categorization: Perl, Python, Ruby, Racket, Lua, Node and Forth are all interpreted.

                                                                    1. 4

                                                                      I would frame it differently; I’d say if a language implementation has tooling that makes you manually compile your source into a binary as a separate step before you can even run it, that’s simply bad usability.

                                                                       In the 90s, you had to choose between “I can write efficient code in this language” (basically C, Pascal, or maaaaybe Java) and “this language has good-to-decent usability” (nearly everything else), but these days I would like to think that dichotomy is dated and misguided. Modern compiled languages like Rust and Go are clearly far from being “interpreted languages”, but they provide you with a single command to compile and run the code in one fell swoop.

                                                                      1. 4

                                                                        I’m super sympathetic to this framing. Recent languages are really showing me how ill-posed the divide is between categories like static and dynamic, or compiled and interpreted.

                                                                        But it feels a bit off-topic to this thread. When reading what somebody else writes my focus tends to be on understanding what they are trying to say. And this thread dominating the page is a huge distraction IMO.

                                                                        I also quibble with your repeated invocation of “the 90s”. This is a recent advance, like in the 2010s. So I think even your distraction is distractingly phrased :)

                                                        2. 3

                                                           Are you not barking up the wrong tree? I don’t see a line where the author implies anything close to that.

                                                          1. 0

                                                             I was referring to the title of the post; as if “scripting ease in a compiled language” is not something provided by basically every scripting language in existence already.

                                                            1. 3

                                                               Specifically, most scripting languages make it nontrivial to package an executable and move it around the filesystem without a shebang line. On Linux, this isn’t a huge issue, but it’s convenient to not have to deal with it on Windows.

                                                              1. 1

                                                                OK, but that has next to nothing to do with whether there’s a compiler or not. I think what you’re talking about is actually “emits native code” so you should like … say what you mean, instead of the other stuff.

                                                                1. 1

                                                                   Fair enough of a point, I suppose. Many people use “compiled” vs “interpreted” to imply runtime properties, not parsing/compilation properties, even though that isn’t exactly the proper definition.

                                                                  I’ll try to be more precise in the future, but I would like a term for “emits native code” that is less of a mouthful.

                                                          2. 3

                                                            Scripting languages can be compiled, but, out of Python, Ruby, Tcl, Perl and Bash, most of them are by default written in such a way that they require code to follow a certain file structure, and if you write a program that is bigger than a single file, you end up having to lug the files around. I know that Tcl has star deploys, and I think that’s what Python wheels are for. Lua code can be wrapped into a single executable, but it’s something that isn’t baked into the standard Lua toolset.

                                                            1. 3

                                                               I think it would be helpful to a lot of people if you could give examples for Node, Ruby, Python, … Maybe you’re just referring to the wrong usage of the word compiler?

                                                              EDIT: typo

                                                              1. 1

                                                                For python there is Nuitka as far as I know.

                                                                1. -1

                                                                   Node, Ruby, and Python are all typically compiled. With Ruby and Node it first goes to bytecode, and then usually the hotspots are further JIT-compiled to machine code as an optimization pass; I don’t know as much about Python, but it definitely supports compiling to bytecode in the reference implementation.

                                                                  1. 7

                                                                    When talking about “compiled languages”, people typically mean “AOT compiled to machine code”, producing a stand-alone binary file. Python’s official implementation, CPython, interprets bytecode and PyPy has a JIT compiler. V8 (the JS engine in Node) compiles JavaScript to a bytecode and then both interprets and JIT compiles that. Ruby has a similar story. The special thing about Nim is that it has the same ease of use as a “scripting language” but has the benefits of being AOT compiled with a small runtime.

                                                                    1. 1

                                                                      The special thing about Nim is that it has the same ease of use as a “scripting language” but has the benefits of being AOT compiled with a small runtime.

                                                                      The idea that only scripting languages care about ease of use is just plain outdated.

                                                                      In the 1990s it used to be that you could get away with having bad usability if you made up for it with speed, but that is simply not true any more; the bar has been raised across the board for everyone.

                                                                2. 3

                                                                   Some languages like Dart [1] have first-class support for both interpreting and compiling. I don’t think it’s fair to hold up “some random person has a GitHub repo that does this” as being the same thing.

                                                                  1. https://github.com/dart-lang/sdk
                                                                  1. 0

                                                                    That’s the whole point; Dart has a compiler, and any language that doesn’t is very unlikely to be taken seriously.

                                                                    1. 1

                                                                      The point is:

                                                                      Nearly every well-known scripting language has a compiler

                                                                       That may be true, but nearly every well-known scripting language doesn’t have an official compiler.

                                                                      1. -2

                                                                         Also false; Ruby’s official implementation has had a compiler (YARV) since the 1.9 days; Node.js has used the V8 JIT compiler since the beginning (not to mention TraceMonkey and its descendants), and Python has been compiling to .pyc files for longer than I’ve been a programmer.

                                                                        According to this, Lua has had a compiler since the very beginning: https://www.lua.org/history.html I don’t know much about Perl, but this page claims that “Perl has always had a compiler”: https://metacpan.org/pod/perlcompile

                                                                        The only exception I can think of is BASIC, and that’s just because it’s too old of a language for any of its numerous compilers to qualify as official. (edit: though I think Microsoft QuickBasic had a compiler in the 1980s or early 90s)

                                                                        1. 6

                                                                           QuickBasic compiled to native code; QBasic was an interpreter; GWBasic compiled to a tokenized form that just made interpretation easier (keywords like IF were replaced with binary short codes).

                                                                1. 6

                                                                  Another thing that was done early on was dropping cross-platform support. We used LuaVela for projects which ran on x86-64 Linux only and it was difficult for our small team to try to support all the other platforms.

                                                                  It seems to me every LuaJIT fork makes this same choice, which is understandable but kind of sad. RaptorJIT is another x86-64 Linux only fork.

                                                                  1. 2

                                                                    which is understandable but kind of sad.

                                                                    ja, agreed; I do realize that it’s non-trivial to construct JITs, but at least aarch64/ARM would be p useful for many folks.

                                                                    1. 4

                                                                         As I understand it, even upstream LuaJIT does not have a 64-bit ARM port, only a 32-bit ARM port.

                                                                      1. 2

                                                                           ja, which is a shame, since ARM (esp. ARM64) is a p common arch nowadays and will only become more common, even for developer machines.

                                                                        1. 1

                                                                          It absolutely does in 2.1 beta.

                                                                      2. 1

                                                                         https://github.com/siddhesh/LuaJIT is the good fork for portability; it merges 64-bit POWER support and a bunch of various fixes, including my tiny patch :)

                                                                      1. 2

                                                                        Beautiful typesetting, though I’m not 100% convinced on the line number thing.

                                                                         On topic, are AAM and similar static analyses used in practice, or is it just an academic thing? It seems like an interesting field.

                                                                        1. 1

                                                                          seriously, completely agreed about the visual presentation.

                                                                           wrt AAMs and the like, we definitely use them at work a bit, but not in any of the production tools we currently ship; our research tools & ideas definitely head in that direction, tho over time those will be pushed down more towards our production tooling.

                                                                        1. 2

                                                                          I’ve never seen ED. Can you post some shots? I am curious about how it looks. :-)

                                                                          1. 3

                                                                            ED doesn’t really have much of a user interface. Unlike tine, it doesn’t even have a status line.

                                                                            Also, despite the Amiga having a GUI, ED is purely a console application (as it was under TRIPOS), so any window you see is purely the hosting shell or console device. The Amiga had what was called the console device (CON:), and opening a path into the device would open a new console window on the screen. For example

                                                                            ED WINDOW CON:10/10/100/100/MyEdWindow
                                                                            

                                                                            would open a new console window at (10,10) of size 100x100, titled “MyEdWindow” and ED would use that window. You could also use * as the window specification:

                                                                            ED WINDOW *
                                                                            

                                                                            which would cause ED to use the same shell window it was launched from. Another option was AUX:, which would cause ED to use the serial port as its console.

                                                                            The versions of ED shipped with AmigaDOS 2.x and later did allow you to dynamically add menus via the SI (Set Item) and EM (Enable Menus) commands, with the menu selections being bound to ED extended commands. Those menus were drawn by the operating system, but as far as ED was concerned, you were just sending it extended commands.

                                                                            In the versions released with AmigaDOS 2.x and later, ED also had an ARexx port. This meant that you could write ARexx scripts that would manipulate ED sessions (by sending them extended commands as strings). ARexx was really one of the best features of the Amiga environment: you could very easily tie together applications from multiple vendors via complex ARexx scripts and it would just work happily. It was also nice having a de facto standard macro language for any given application (though some programs did also include their own macro language).

                                                                            1. 2

                                                                              I was curious as well, since I never really got time on the Amiga (which was a shame; I’ve looked at things like AROS before, but only in a facile way, same with the Atari ST). I did find the following, tho, which gives a pretty good overview of how it works:

                                                                              Screenshots seem to be mainly of ARexx terminals or MicroEMACS, neither of which is useful for seeing how Amiga ED worked.

                                                                              Incidentally, I’ve always wanted to work more with TRIPOS, which I know heavily influenced AmigaDOS.

                                                                              edit: AmigaLove apparently has a screenshot buried in a forum.

                                                                              1. 2

                                                                                Incidentally, I’ve always wanted to work more with TRIPOS, which I know heavily influenced AmigaDOS.

                                                                                AmigaDOS basically was TRIPOS. The original Amiga operating system wasn’t going to be ready for launch, and so MetaComCo (who wrote ED and a host of other tools) was hired to port TRIPOS to the Amiga, where it formed the DOS portion of the operating system.

                                                                                TRIPOS was written in BCPL, and when programming for the Amiga you had to be aware of the different alignment requirements of the AmigaDOS subsystem (everything in BCPL had to be double-word aligned). IIRC you also had to do some string conversion magic when dealing with AmigaDOS as it used Pascal-style length-prefixed strings but the rest of the system used C-style strings. I could be misremembering though.
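
                                                                                If it helps make the “string conversion magic” concrete, here’s a rough sketch (mine, not SDK code) of the kind of conversion involved, assuming the usual AmigaDOS convention that a BCPL string is a length byte followed by the characters, and that a BCPL pointer is the byte address shifted right by two (which is where the alignment requirement comes from); MKBADDR/BADDR mirror the macros AmigaOS ships, and c_to_bstr is just a made-up helper name:

                                                                                #include <string.h>
                                                                                typedef unsigned long BPTR;               /* BCPL pointer: an address in longwords */
                                                                                #define MKBADDR(p) (((BPTR)(p)) >> 2)     /* byte pointer -> BCPL pointer          */
                                                                                #define BADDR(b)   ((void *)((b) << 2))   /* BCPL pointer -> byte pointer          */
                                                                                /* Copy a C string into a longword-aligned buffer as a BCPL string:
                                                                                 * one length byte, then the characters, no NUL terminator required. */
                                                                                static BPTR c_to_bstr(const char *src, unsigned char *buf)
                                                                                {
                                                                                    size_t len = strlen(src);
                                                                                    if (len > 255)
                                                                                        len = 255;                        /* the length field is a single byte */
                                                                                    buf[0] = (unsigned char)len;
                                                                                    memcpy(buf + 1, src, len);
                                                                                    return MKBADDR(buf);
                                                                                }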

                                                                                By the time of AmigaOS 2.x, AmigaDOS had been rewritten in C but still had the funky alignment requirements for compatibility purposes.

                                                                                If you’re interested in TRIPOS, you can run it today via Martin Richards’s Cintpos system. Note that it does not include ED, which was a separate product from MetaComCo.

                                                                                1. 2

                                                                                  Yes! Cintpos is so cool; I haven’t run it in anger before, but I’ve definitely seen it (we had a system that originated on a TRIPOS-alike way back when I worked in physics publishing).

                                                                                2. 2

                                                                                  edit: AmigaLove apparently has a screenshot buried in a forum.

                                                                                  From TFA: “I never use that piece of junk built in editor.”

                                                                                  :(

                                                                              1. 11

                                                                                I really don’t want to spend my free time tracking down how the latest kernel pulls in additional functionality from systemd that promptly breaks stuff that hasn’t changed in a decade, or how needing an openssl update ends up in a cascading infinitely expanding vortex of doom that desperately wants to be the first software-defined black hole and demonstrates this by requiring all the packages on my system to be upgraded to versions that haven’t been tested with my application.

                                                                                I find it impossible to continue reading after this. Nobody is forced to run Gentoo or Arch Linux on a production server, or whatever the hipster distribution of the day is. There are CentOS and Debian when some years of stability are required, more than any of the BSDs offer.

                                                                                1. 3

                                                                                  Well, the rest also mentions apt-hell with Debian and package upgrading.

                                                                                  Can you elaborate on the last sentence?

                                                                                  1. 10

                                                                                    Well, the rest also mentions apt-hell with Debian and package upgrading.

                                                                                    I read that section now… it seems to imply you are forced to update Debian to the latest version every year, otherwise you don’t get security updates. Does the author even know Debian? apt-hell? Details are missing. I’m sure you can get into all kinds of trouble when you fiddle with (unofficial) repositories and/or try to mix & match packages from different releases. To attempt this in production is kinda silly. Nobody does that, I hope :-P

                                                                                    Can you elaborate on the last sentence?

                                                                                    I’m not aware of any BSD offering 10-year (security) support for a released version; I’m sure OpenBSD does not, for good reason, mind you. It’s not fair to claim, as the poster implies, that updates need to be installed “all the time” and will end up destroying your system or landing you in “apt-hell”. Also, I’m sure BSD updates can go wrong occasionally as well!

                                                                                    I’m happy the author is not maintaining my servers on whatever OS…

                                                                                    1. 18

                                                                                      I read that section now… it seems to imply you are forced to update Debian every year to the latest version otherwise you don’t get security updates.

                                                                                      We have many thousands of Debian hosts, and the cadence of reimaging older ones as they EOL is painful but, IMO, necessary. We’ve just about wrapped up getting rid of Squeeze, but some Wheezy hosts still run some critical shit. Jessie’s EOL is coming soon, and that one is going to hurt and require all hands on deck.

                                                                                      Maybe CVEs still get patched on Wheezy, but I think the pain of upgrading will come sooner or later (if not for security updates, then for performance, stability, features, etc.).

                                                                                      As an ops team it’s better to tackle upgrades head on, than to one day realize how fucked you are, and you’re forced to upgrade but you’ve never had practice at it, and then you’re supremely fucked.

                                                                                      And, yes, every time I discover that systemd is doing a new weird thing, like overwriting pam/limits.d with its own notion of limits, I get a bit of acid reflux, but it’s par for the course now, apparently.

                                                                                      1. 3

                                                                                        This is a great comment! Thanks for a real-world story about Debian ops!

                                                                                        1. 5

                                                                                          I have more stories if you’re interested.

                                                                                          1. 3

                                                                                            yes please. I think it’s extremely interesting to compare with other folks’ experiences.

                                                                                            1. 7

                                                                                              So, here’s one that I’m primarily guilty for.

                                                                                              I wasn’t used to working at a Debian shop, and the existing tooling when I joined was written as Debian packages. That means that to deploy anything (a Go binary like Prometheus, a Python Flask REST server), you’d need to write a Debian package for it, with all the goodness of pbuilder, debhelper, etc.

                                                                                              Now, I didn’t like that - and, while I won’t pretend that I was instrumental in getting rid of it, I preferred to deploy things quicker, without needing to learn the ins and outs of Debian packaging. In fact, the worst manifestation of my hubris is in an open source project, where I actually prefer to create an RPM and then use alien to convert it to a .deb, rather than natively packaging a .deb file (https://github.com/sevagh/goat/blob/master/Dockerfile.build#L27) - that’s how much I’ve maneuvered to avoid learning Debian packaging.

                                                                                              After writing lots of Ansible deployment scripts for code, binaries, Python Flask apps with virtualenvs, etc., I’ve come to appreciate the doomsday warnings of the Debian packaging diehards:

                                                                                              1. dpkg -S lets you find out which package a file belongs to. Without that, there’s a lot of “hey, who does /etc/stupidshit.yml belong to?” all the time. The “fix” of putting {% managed by ansible %} on top is a start, I guess.
                                                                                              2. Debian packages clean up after themselves. You can’t undo an Ansible playbook; you need to write an inverse playbook. Doing apt-get remove horrendous-diarrhea-thing will remove all of the diarrhea.
                                                                                              3. Doing upgrades is much easier. I’ve needed to write lots of duplicated Ansible code to do things like stat: /path/to/binary, command: /path/to/binary --version, register: binary_version, get_url: url/to/new/binary when: {{ binary_version }} < {{ desired_version }}. With a Debian package, you just fucking install it and it does the right thing.

                                                                                              The best of both worlds is to write most packages as Debian packages, and then use Ansible with the apt: module to do upgrades, etc. I think I did more harm than good by going too far down the Ansible path.

                                                                                              1. 1

                                                                                                Yeah, this is exactly my experience. Creating Debian packages correctly is very complicated. Making RPM packages is quite easy, as there’s extensive documentation on packaging software written in various languages, from PHP to Go. On Debian there is basically no documentation, except for packaging software written in C that is no more complicated than hello_world.c. And there are 20 ways of doing everything; I still don’t know the “right” way to build packages in something similar to e.g. mock on CentOS/Fedora. Aptly seems to work somewhat, but I didn’t manage to get it working on Buster yet… and of course it still doesn’t do “scratch” builds in a clean “mock”-style environment. All the “solutions” for Debian I’ve found so far are extremely complicated; no idea where to start…

                                                                                                1. 1

                                                                                                  FreeBSD’s ports system creates packages via pkg(8), which has a really simple format. I have lost many months of my life maintaining Debian packages, and pkg is in most ways superior to .deb. My path to becoming a FreeBSD committer was submitting new and updated packages, and the acceptance rate and help in sorting out my contributions were so much more pleasant than the torturous process I underwent for Debian packages. Obviously everybody’s experience is different, and I’m sure there are those who have been burned by *BSD ports zealots too.

                                                                                                  Anyway, it’s great to see other people who also feel that 50% of sysadmin work could be alleviated by better use of packages & containers. If you’re interested in pkg, https://hackmd.io/@dch/HkwIhv6x7 has the notes from a talk I gave a while back.

                                                                                  2. 1

                                                                                    I’ve been using the same apps on Ubuntu for years. They occasionally do dumb things with the interface, package manager, etc. Not much to manage, though. Mostly seamless, just using icons, search, and the package manager.

                                                                                  1. 2

                                                                                      It’s incredible how much work is put into making JavaScript fast. Makes you wonder what other work could have been done if the scripting language of the internet had been something else. (Both in terms of applications built on browsers and the browsers themselves.)

                                                                                    1. 2

                                                                                      Makes you wonder what other work could have been done if the scripting language of the internet had been something else

                                                                                        I think about this all the time at work; we deal with languages for blockchain execution, and very often the folks who wrote those languages & implementations aren’t well versed in the current technology. Consequently, you often end up with these ad hoc & slow compilers that generate terrible code; for example, solc for the longest time generated an exponentiation opcode for what should have been a constant, which cost folks real money to execute.

                                                                                      sighs into tea cup

                                                                                    1. 2

                                                                                      It’s interesting, because this is similar to how Solidity, the Ethereum language, compiles function dispatch as well.
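
                                                                                          For anyone who hasn’t looked at it: an external call carries a 4-byte selector (the first four bytes of the Keccak-256 hash of the function’s signature), and the compiled entry point compares that selector against each function it knows about to decide where to jump. Here’s a loose C sketch of just that idea, nothing Solidity-specific beyond the two well-known ERC-20 selector values; dispatch, do_transfer, and do_approve are made-up names:

                                                                                          #include <stdint.h>
                                                                                          #include <stdio.h>
                                                                                          #define SEL_TRANSFER 0xa9059cbbu   /* selector of transfer(address,uint256) */
                                                                                          #define SEL_APPROVE  0x095ea7b3u   /* selector of approve(address,uint256)  */
                                                                                          static void do_transfer(void) { puts("transfer body"); }
                                                                                          static void do_approve(void)  { puts("approve body"); }
                                                                                          /* The generated dispatcher is, morally, a comparison chain over the selector. */
                                                                                          static void dispatch(uint32_t selector)
                                                                                          {
                                                                                              switch (selector) {
                                                                                              case SEL_TRANSFER: do_transfer(); break;
                                                                                              case SEL_APPROVE:  do_approve();  break;
                                                                                              default:           puts("revert: unknown selector"); break;
                                                                                              }
                                                                                          }
                                                                                          int main(void)
                                                                                          {
                                                                                              dispatch(SEL_TRANSFER);   /* -> "transfer body"            */
                                                                                              dispatch(0xdeadbeefu);    /* -> "revert: unknown selector" */
                                                                                              return 0;
                                                                                          }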

                                                                                      1. 4

                                                                                        Also, when large XORs are involved, cryptominisat can shine where Z3’s SAT backend chokes (due to the exponentially-sized representation of XOR in CNF).

                                                                                        We’re working on getting CMS as an alternate SAT backend to Z3 to get the best of both worlds.

                                                                                        1. 1

                                                                                          Oh that would be great; we’ve had some interesting challenges with Z3 as well, mostly for very large proofs (someone internally posted the other day that they were using 43GB of memory for Z3 for some symbolic execution…).

                                                                                        1. 4

                                                                                          Important note on this story: since the article was written, RetDec added support for 64-bit decompilation on x86 and ARM, in case you want to use KLEE but don’t have the cash to shell out for Hex-Rays.

                                                                                          1. 1

                                                                                            We also have McSema and Remill to help with lifting, and are working quite a bit on KLEE, esp. KLEE-native, to help there as well. RetDec is pretty nice too; there are some interesting use cases for it.

                                                                                          1. 1

                                                                                            I use them quite often:

                                                                                            • threat modeling: data flows, logical/physical connections, attacker path (“kill chain”)
                                                                                            • program analysis: CFGs, data flow, symbolic execution (graph path & constraints)
                                                                                            • documentation: laying out the states &c of a program, logical connection points, &c.
                                                                                            • pentesting: similar to the threat model, I’ve definitely used them to document the attack path/kill chain (credential stuffing -> unpatched terminal server -> Mimikatz -> privesc -> DA).

                                                                                            I don’t use UML as much, but I know PyTM, a threat modeling framework, uses it (and PlantUML specifically) quite extensively. I like graphviz and DOT because they’re pretty simple to parse and generate, but I have been tempted a few times, given how clean the images PlantUML generates look…
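
                                                                                            As a tiny illustration of the “simple to generate” part, here’s a throwaway sketch (plain C, no libraries assumed, labels taken from the kill chain above) that emits a DOT digraph you can feed straight into dot -Tpng:

                                                                                            #include <stdio.h>
                                                                                            int main(void)
                                                                                            {
                                                                                                /* The example attack path from the pentesting bullet above. */
                                                                                                const char *steps[] = {
                                                                                                    "credential stuffing",
                                                                                                    "unpatched terminal server",
                                                                                                    "Mimikatz",
                                                                                                    "privesc",
                                                                                                    "DA",
                                                                                                };
                                                                                                const int n = sizeof steps / sizeof steps[0];
                                                                                                puts("digraph kill_chain {");
                                                                                                puts("    rankdir=LR;");
                                                                                                puts("    node [shape=box];");
                                                                                                for (int i = 0; i + 1 < n; i++)
                                                                                                    printf("    \"%s\" -> \"%s\";\n", steps[i], steps[i + 1]);
                                                                                                puts("}");
                                                                                                return 0;
                                                                                            }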

                                                                                            1. 5

                                                                                              This is something I’ve been thinking about a lot in my own ML dialect; one of the things I’ve wanted to do is have variant constructors associated with their type name:

                                                                                              type OptionFoo A {
                                                                                                  Some A
                                                                                                  None
                                                                                              }
                                                                                              

                                                                                              in code you’d match against OptionFoo.Some; it actually ends up in the tag of the generated struct/union as well:

                                                                                              enum Tags_OPTIONFOO {
                                                                                                  TAG_OptionFoo_SOME,
                                                                                                  TAG_OptionFoo_NONE,
                                                                                              };
                                                                                              typedef struct OPTIONFOO_t {
                                                                                                  int tag;
                                                                                                  union {
                                                                                                      struct {
                                                                                                          A m_1;
                                                                                                      } SOME_t;
                                                                                                      struct {
                                                                                                      } NONE_t;
                                                                                                  } members;
                                                                                              } OptionFoo;
                                                                                              

                                                                                              and so on. Super interesting stuff.
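
                                                                                              To make that a bit more concrete, here’s a rough sketch of how such a value might be built and consumed once A is instantiated at int; the tag/union layout mirrors the generated code above, while some_int, none_int, and the switch are only my illustration of what a source-level match on OptionFoo.Some / OptionFoo.None could lower to:

                                                                                              #include <stdio.h>
                                                                                              enum Tags_OPTIONFOO { TAG_OptionFoo_SOME, TAG_OptionFoo_NONE };
                                                                                              typedef struct OPTIONFOO_t {
                                                                                                  int tag;
                                                                                                  union {
                                                                                                      struct { int m_1; } SOME_t;   /* payload of OptionFoo.Some */
                                                                                                      /* OptionFoo.None carries no payload, so no member is needed */
                                                                                                  } members;
                                                                                              } OptionFoo;
                                                                                              /* Hypothetical constructor helpers, not the dialect's actual output. */
                                                                                              static OptionFoo some_int(int x)
                                                                                              {
                                                                                                  OptionFoo v = { .tag = TAG_OptionFoo_SOME };
                                                                                                  v.members.SOME_t.m_1 = x;
                                                                                                  return v;
                                                                                              }
                                                                                              static OptionFoo none_int(void)
                                                                                              {
                                                                                                  OptionFoo v = { .tag = TAG_OptionFoo_NONE };
                                                                                                  return v;
                                                                                              }
                                                                                              int main(void)
                                                                                              {
                                                                                                  OptionFoo v = some_int(42);   /* OptionFoo.Some 42 */
                                                                                                  switch (v.tag) {              /* what a match over the variants lowers to */
                                                                                                  case TAG_OptionFoo_SOME:
                                                                                                      printf("Some %d\n", v.members.SOME_t.m_1);
                                                                                                      break;
                                                                                                  case TAG_OptionFoo_NONE:
                                                                                                      printf("None\n");
                                                                                                      break;
                                                                                                  }
                                                                                                  v = none_int();               /* OptionFoo.None */
                                                                                                  return 0;
                                                                                              }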