“the byte stack implementation relies on using pointers to freed storage”
Wait, so they were relying on undefined behavior in the C standard that just happened to work on their target platforms? Geez. This is exactly the sort of stuff one shouldn’t be doing in C.
It seems worse than that. It sounds like this byte stack thingy was removed because of this dangling pointer issue, but then re-added for some reason to get concurrency working.
I don’t know the details so I’ll refrain from judging the matter. But code using pointers like this usually ends up with a CVE number assigned to it. Big red flag.
Using dangling pointers is Not Good, without question, but I don’t think it’s likely much of a security issue in this case simply because Emacs makes no attempt to sandbox elisp code – any exploit you could write using this pointer could almost certainly be written just as easily in straight emacs lisp, which can touch anything on the host it wants with the editor’s privileges.
Pardon my ignorance, but is code the only thing that’s at risk here? I sift through tons of data in Emacs. Could data be used in some way to create an exploit? A nastily crafted email perchance? Because if that’s the case, that seems like a concern.
No, to exploit this would require running elisp.
Ah okay, in that case no worries! :)
Keep in mind that there are a bunch of perfectly reasonable implementation techniques for interpreters that are undefined behavior when written in C. Things like “I’m going to use the bottom four bits of pointers as a tag. If it’s 0, it’s actually a 60-bit integer, 1 is a heap pointer, etc.” The untagging expression *(val*)((uint8_t*)pointer-1) is, I’m fairly sure, undefined, but no C compiler is going to break it, because the job of a C compiler is to be practical, not just to implement a strict reading of the C standard.
So while in this example it sounds like they’re doing something silly that should be fixed, in general strict C standard conformance is a non-goal of something like Emacs.
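For anyone who hasn’t seen the trick, here is a minimal sketch of low-bit pointer tagging in C. Every name in it (value, make_int, make_heap, and so on) is made up for illustration; this is not Emacs’s actual object representation. It goes through uintptr_t rather than pointer arithmetic, which moves most of the problem from undefined to implementation-defined behavior, and it assumes a 16-byte-aligned allocation so that the bottom four bits are free.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tagged word: low 4 bits are the tag, the rest is payload. */
typedef uintptr_t value;

enum { TAG_BITS = 4, TAG_MASK = (1 << TAG_BITS) - 1, TAG_INT = 0, TAG_HEAP = 1 };

static unsigned tag_of(value v) { return (unsigned)(v & TAG_MASK); }

/* Integers sit in the upper bits with tag 0 (60 usable bits on a 64-bit host). */
static value make_int(intptr_t n) { return ((uintptr_t)n << TAG_BITS) | TAG_INT; }

/* The unsigned-to-signed conversion is implementation-defined for negative
   payloads; mainstream two's-complement compilers give the expected result.
   The division is exact because the tag bits of an integer are all zero. */
static intptr_t int_of(value v) { return (intptr_t)v / (1 << TAG_BITS); }

/* Heap pointers must be 16-byte aligned so the tag bits start out clear. */
static value make_heap(void *p) {
    assert(((uintptr_t)p & TAG_MASK) == 0);
    return (uintptr_t)p | TAG_HEAP;
}

/* Mask the tag off before dereferencing. */
static void *heap_of(value v) { return (void *)(v & ~(uintptr_t)TAG_MASK); }

/* Round-trips one integer and one heap cell; returns 1 on success. */
static int demo(void) {
    double *cell = aligned_alloc(16, 16);   /* C11; size is a multiple of 16 */
    if (!cell) return 0;
    *cell = 2.5;
    value a = make_int(-7);
    value b = make_heap(cell);
    int ok = tag_of(a) == TAG_INT && int_of(a) == -7
          && tag_of(b) == TAG_HEAP && *(double *)heap_of(b) == 2.5;
    free(cell);
    return ok;
}
```

Forgetting the mask in heap_of is the “pointing to the wrong part of the data” failure: the pointer is still just a value, but it is off by the tag.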
Alignment isn’t really undefined (though you could probably make the argument that it’s architecture dependent - I’ve only fiddled with alignment on x86). If you control how an initial chunk of malloced memory is aligned, you can guarantee alignment throughout a program. A pointer is just a value then - no undefinedness there - it’s just pointing to the wrong part of the data if the tag isn’t removed.
I’m not an expert on the C standard, but I think the issue is to do with aliasing and misaligned conversions; see e.g. http://stackoverflow.com/a/28895321/499609
Some context here: The Tale Of Concurrency In Emacs
The concurrency branch is hailed as Emacs Lisp’s future, but to me it looks like a terrible blast from the past. It’s the nineties-style Java-way of concurrency: Threads, yield, mutexes, locks. And a long tearful story of races, deadlocks, pain and suffering. It’s been a single big failure in every major programming language, Java first and foremost, how come we believe that it would be a success in Emacs?
IMO this is very unfair; the problems he describes can mostly be attributed to having pre-emptive threads in a context with a lot of global mutable data. Forcing context changes to be cooperative makes it much easier to reason about what’s going to happen with multiple threads going on. It essentially makes “concurrency” opt-in rather than opt-out.
In fact it sounds much more like Lua’s coroutine model than Java’s threading one, and that’s reassuring to me.
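To make the cooperative point concrete, here is a rough sketch in C of a round-robin scheduler where tasks only switch at explicit yield points. It is not how the Emacs branch is implemented; the task struct, bump_step, and run_all are hypothetical names, and “yielding” is modelled simply as returning from the step function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task: each call to step() runs until the task's next voluntary
   yield point and returns 1 if work remains, 0 when the task has finished. */
typedef struct task {
    int (*step)(struct task *self);
    int remaining;   /* steps left for this task */
} task;

static long shared_counter = 0;   /* global mutable state, deliberately unlocked */

static int bump_step(task *self) {
    assert(self->remaining > 0);
    /* Read-modify-write on shared state. No other task can run between these
       two lines, because control only changes hands when this function returns. */
    long seen = shared_counter;
    shared_counter = seen + 1;
    self->remaining--;
    return self->remaining > 0;   /* returning is the "yield" */
}

/* Round-robin over the tasks until every one reports that it is done. */
static void run_all(task *tasks, size_t n) {
    size_t live = n;
    while (live > 0) {
        live = 0;
        for (size_t i = 0; i < n; i++) {
            if (tasks[i].remaining > 0 && tasks[i].step(&tasks[i]))
                live++;
        }
    }
}

/* Two interleaved tasks taking 3 and 5 steps; returns the final counter. */
static long demo_run(void) {
    shared_counter = 0;
    task tasks[2] = { { bump_step, 3 }, { bump_step, 5 } };
    run_all(tasks, 2);
    return shared_counter;
}
```

With pre-emptive threads the same read-modify-write would need a lock or an atomic; here the interleaving is opt-in, so no update can be lost.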
It’s useful to think of this kind of threading as a sort of assembly language for concurrent programs. You can write using it, or you can build higher level primitives like message passing, work stealing task queues, or other more restrictive and safer primitives. But having threads and locks is still extremely valuable.
That would make sense if we were talking about actual concurrency here, but as far as I can tell there is no attempt being made to support having more than one elisp function execute at a time.
So as a mechanism for building concurrency upon it falls flat, but I think that’s a good thing, otherwise the complaints about “this is 90s Java all over again” are suddenly valid. Scoping it to only solving the “I wish Emacs didn’t block so much” problem is less ambitious, but much more achievable.
So, every story I’ve heard about the internals of Emacs characterizes it as a strange horror. I don’t use Emacs and don’t have much of an understanding about it. What is it about Emacs and about the internals that makes improving the situation in general so difficult?
Emacs turned 40 this year, for people who don’t know. I don’t think I appreciated quite how old it was until I saw that.
It’s written in C, from an era where shaving bytes off things was important.
Well, the same as with other old software: it’s there and it works in a predictable fashion. It has an ecosystem.
It’s similar to TeX.
So what sort of Emacs Lisp tricks does this allow? What would the translation of Clojure (pmap inc [1 2 3 4 5]) look like in elisp?
If I’m reading this correctly, these improvements allow for concurrency not parallelism, which would be required for an equivalent of pmap.
It seems to have a GIL, just like Python and Ruby, whereby only one piece of elisp code may be executed at any given time.
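As a rough illustration of what a global lock buys and costs, here is a sketch in C with POSIX threads. It makes no claim about how Emacs actually implements its lock; global_lock, worker, and run_workers are hypothetical names, and the “evaluation” is just a counter increment.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical global interpreter lock: a thread must hold it to "run Lisp". */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

static int inside = 0;       /* threads currently inside the interpreter */
static int max_inside = 0;   /* highest value of `inside` ever observed */
static long evaluated = 0;   /* total number of pretend evaluations */

static void *worker(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&global_lock);
        inside++;
        assert(inside == 1);              /* the lock guarantees exclusivity */
        if (inside > max_inside) max_inside = inside;
        evaluated++;                      /* stand-in for evaluating one form */
        inside--;
        pthread_mutex_unlock(&global_lock);
    }
    return NULL;
}

/* Starts four workers doing 1000 evaluations each; returns how many started. */
static int run_workers(void) {
    enum { N = 4 };
    pthread_t threads[N];
    int per_thread = 1000;
    int started = 0;
    for (int i = 0; i < N; i++) {
        if (pthread_create(&threads[i], NULL, worker, &per_thread) != 0) break;
        started++;
    }
    for (int i = 0; i < started; i++) pthread_join(threads[i], NULL);
    return started;
}
```

The threads interleave, so nothing blocks forever, but at most one is ever inside the interpreter at a time. That is concurrency without parallelism, which is why a pmap built on this would not run any faster than a plain map.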