Reading through a Forth interpreter’s source code is a great exercise. JonesForth is very clear, literate even. All the interpreters I’ve read are very ontogeny-recapitulates-phylogeny, i.e. there is a chronological sequence to them, like the Book of Genesis: “In the beginning was address 0. And CH*CK implemented NEXT, and he saw that it was good…”
Lots of post-Forth links at the Concatenative Languages Wiki. It’s interesting how Forth was basically reinvented (from CS-theoretical foundations) as Joy, and then lots of other languages have come from that merger.
I’m of the opinion that a really-useable Forth needs, at a minimum, stack discipline. Factor is an example — each word specifies its effect on the stack, and the compiler rejects code that doesn’t abide by that. Without that, it’s way, way too easy to mess up the stack and crash badly in ways that are horrible to debug.
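To make the stack-discipline idea concrete, here is a minimal Python sketch of a checker that infers a definition's stack effect from the declared effects of the words it uses, and rejects the definition if the two disagree. The `EFFECTS` table, `infer_effect`, and `check_definition` are names made up for this illustration; it handles only straight-line code and is not how Factor's checker is actually implemented.

```python
# Hypothetical table of declared effects: word -> (values consumed, values produced).
EFFECTS = {
    "dup": (1, 2),
    "drop": (1, 0),
    "swap": (2, 2),
    "+": (2, 1),
    "*": (2, 1),
}


def infer_effect(words, effects=EFFECTS):
    """Infer (inputs, outputs) for a straight-line sequence of words.

    Integer literals push one value. No control flow is handled.
    """
    needed = 0  # values that must already be on the stack before we start
    depth = 0   # current stack depth, assuming we started with `needed` values
    for word in words:
        ins, outs = (0, 1) if isinstance(word, int) else effects[word]
        if depth < ins:
            # The word reaches below what we have produced so far,
            # so the sequence as a whole needs more inputs.
            needed += ins - depth
            depth = ins
        depth += outs - ins
    return needed, depth


def check_definition(name, declared, body, effects):
    """Register `name` only if its body matches the declared effect."""
    inferred = infer_effect(body, effects)
    if inferred != declared:
        raise TypeError(f"{name}: declared {declared}, inferred {inferred}")
    effects[name] = declared


words = dict(EFFECTS)
check_definition("square", (1, 1), ["dup", "*"], words)      # accepted
# check_definition("oops", (1, 1), ["dup", "dup"], words)    # would raise TypeError
```

The point is that a mismatch is caught when the word is defined rather than when the stack underflows at run time.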
The other great innovation (from Joy, I think) is “quotes”, which are nested sequences of words. At a stroke this gives you block structure, much nicer control structures, and anonymous functions and FP. I suspect these are pretty easy to add to a “classic” Forth, though I haven’t tried.
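As a rough illustration of why quotes buy so much at once, here is a toy Python interpreter in which `[ ... ]` parses to a nested list that is pushed on the stack like any other value. The word names `call`, `if`, and `times` are chosen to resemble Joy/Factor-style combinators, but the whole thing is a sketch written for this example, not the implementation of any real language.

```python
def parse(tokens):
    """Turn a flat token list into a program; [ ... ] becomes a nested list."""
    levels = [[]]
    for token in tokens:
        if token == "[":
            levels.append([])
        elif token == "]":
            quote = levels.pop()
            levels[-1].append(quote)
        elif token.lstrip("-").isdigit():
            levels[-1].append(int(token))
        else:
            levels[-1].append(token)
    return levels[0]


def run(program, data=None):
    """Execute a parsed program against a data stack and return the stack."""
    data = [] if data is None else data
    for item in program:
        if isinstance(item, (int, list)):
            data.append(item)                 # numbers and quotes are just pushed
        elif item == "dup":
            data.append(data[-1])
        elif item == "+":
            b, a = data.pop(), data.pop()
            data.append(a + b)
        elif item == "*":
            b, a = data.pop(), data.pop()
            data.append(a * b)
        elif item == "call":                  # anonymous function call
            run(data.pop(), data)
        elif item == "if":                    # ( flag then-quote else-quote -- )
            else_q, then_q, flag = data.pop(), data.pop(), data.pop()
            run(then_q if flag else else_q, data)
        elif item == "times":                 # ( n quote -- )
            quote, count = data.pop(), data.pop()
            for _ in range(count):
                run(quote, data)
        else:
            raise NameError(item)
    return data


print(run(parse("3 [ dup * ] call".split())))
print(run(parse("1 5 [ 2 * ] times".split())))
```

Control structures such as `if` and `times` end up as ordinary words that take quotes as arguments, with no special compile-time machinery.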
(I wrote a little nontraditional Forth core called Tails in 2021. Nontraditional in that it’s written in C++. I was mostly interested in exploring the tail-recursive continuation-passing style of interpreter.)
Chuck Moore had changed his opinion on local variables before that post was written. Local variables are kinda harmful in Forth because they encourage bad stack hygiene, but there are enough useful instances of needing them that even GreenArrays’ own Forth-like stack processor, the F18A, has two dedicated registers for local variables (iirc), accessed with the words A and B.
Blog author here. Had a quick look at the F18A reference, and from the looks of it, A and B are general purpose registers and thus closer to globals than locals. Now that it’s a few years later and I’ve finished and shipped a game in Forth, I can pretty confidently say “local variables are bad” absolutely does NOT mean “do everything on the stack and don’t name anything” - that is, in fact, terrible advice, and you will have a real bad time writing tortuously complicated code that’s impossible to keep straight if that’s how you interpret it. Instead I believe that it means “local variables hide information from other parts of the program, and that’s bad.”
Absolutely there will come a time where stack juggling is way harder to understand than just giving a value a name, but local variables have the property that no other words can reference them. So once you introduce locals, it becomes difficult or impossible to factor out pieces of logic to make the definition smaller - and as Chuck Moore says, Forth is small definitions.
Globals, however, don’t have this problem! They have some different problems - most obviously with re-entrancy, but most functions don’t actually need to be re-entrant. So most of the time you can trivially introduce a global for this purpose. The rest of the time you’d have to take a bit more care to use globals, but then it’s explicit that these special scenarios are considered and handled.
To me, the part of Forth that is the most amazing magic trick is that the entire program is visible; no part is closed off to you or hard to access. Every piece has an unambiguous name, and if you know that name, you can invoke it, you can see inside of it. The smaller the pieces are, the more names you get, and the more access you have.
Naming things is one of the two hardest things in Computer Science (the other being cache invalidation). I’d rather keep a function a bit large rather than try to come up with a name that is effectively used only once. Also, ANS Forth (to be fair, Chuck Moore has denounced it, and any attempt to standardize Forth) does include locals and :NONAME.
Globals aren’t just bad for reentrancy, they’re also bad because they can hide “spooky action at a distance.”
Also, somewhat related: Microsoft had a hard time improving MS-DOS and Windows because of all the peeking and poking into internals that people were warned not to do. Microsoft has an entire blog devoted to just how bad that can be.
Oh if you want to protect the internals of your program then Forth is emphatically not the language for you.
`:NONAME` is useful, I implemented it in my Forth. I also implemented a variant that I spelled `:|` `|;` that worked in the middle of colon definitions. I used that a lot more. Sometimes I even nested `:|` inside `:noname`.
I’m not a Charles Moore purist or anything, but I did find working with my own Forth (not just messing around but actually solving problems and building useful software) to be an incredibly empowering experience, and I was stubborn enough to do it without implementing local variables. It was also frequently a very challenging experience compared to implementing a solution in a more batteries-included environment, and there were many changes to my workflow that would have substantially improved my life but were completely impractical to build. (I ended up building a much more flexible and capable map editor in Honeylisp, for example.) It’s very hard to convey the specific shape of that experience, but I still try sometimes.
They are closer to globals in implementation, but it’s not how they’re supposed to be used. I think it was a TED talk where Chuck said “well it turns out locals are kind of useful, sometimes” and that’s where the A and B registers come from. You don’t need A and B as globals, you can fetch and store from memory for globals - A and B are registers for fast access.
Didn’t see anything like that in the TEDx talk I found, sadly. Would love to see/read whatever you’re thinking of if I missed it!
In my mind, “local variables” implies a named value that cannot be read or written by code outside the current scope. I don’t think that an auto-incrementing address register is really the same thing. (I’m considering adding one to the weird Forth I’m prototyping right now.) I do remember reading him talk about moving away from being dogmatic about stacks, which fits with him adding address registers and such to the F18A. IIRC ColorForth also has an IF that doesn’t consume a value from the stack, relying instead on a persistent zero flag on the CPU from whatever the last operation was. (It also doesn’t have ELSE because he decided that it’s redundant, since you can just return from inside an IF condition, but that’s not really relevant - I just think it’s a wild choice.)
Admittedly, the F18A has a small enough memory that you could probably make the case that even a global variable is kinda local, because the scope of any one program running on one of the CPUs is so constrained.
> I don’t think that an auto-incrementing address register is really the same thing.
This is/was new to the F18A iirc, because it allows you to blitz through memory. Sure, it’s not scoped fancy locals like whichever brand new 1960s programming fad that’s surely going to pass has - but it’s just a place to load and store things short-term and shortly. Sure it’s not “local variables, scoped lexically”, but it’s what Chuck implemented to just store something temporarily that he found useful in a local-variable-ish way. I can’t remember where I heard him say it, apologies.
**`a Macro`**
: Moves value at register 2 to register 0. EDX to EAX.
**`a! Macro`**
: Moves value at register 0 to register 2. EAX to EDX
Chuck isn’t using locals and then descending into words that use locals. The use of that “local” is over by the end of some useful, probably quite-fundamental definition. And note how it differs from the F18A - it’s really just a short-term store for something that’s being useful, not a complex feature of the interpreter.
Oh I think maybe I found it? In his 1x Forth talk, the first time he says “local” he mentions that the A register works something like a local variable. The second time he says it, it’s to say that local variables are harmful and you shouldn’t implement them. So I think we’re both right and pulling from the same talk even, haha.
I do think it’s worth paying attention to exactly what he’s saying when he says locals are bad in that talk. He explicitly says variables are essential, use as many of those as you want. I think defining a new var that’s meant to be used only by one or two words and whose value is meaningless once they’ve finished running is fine Forth style. I was scared of it at first because everyone knows globals are bad and Chuck says locals are bad so obviously I should tough it out and write a bunch of complex stack manipulations? Bounce a bunch of stuff back and forth between the data and return stacks to bubble up the value I need to reuse? I’m not the only person who assumed that! But no, complex stack manipulations suck, you absolutely should never, ever do them. No locals to me just means your Forth compiler shouldn’t do them either. Vars are cheap and simple.
> I find myself wondering a lot what a more accessible Forth might look like; are there more flexible, composable, simple abstractions like the Forth “word” out there? Our current GUI paradigms can’t be irreducible in complexity; is there a radically simpler alternative that empowers individuals? What else could an individual-scale programming language look like, that is not only designed to enable simplicity, but to outright disallow complexity?
I think a “more accessible Forth” could look like a tiny 1960’s style Lisp interpreter that emphasizes extreme implementation simplicity over features that make modern Lisp nicer to use. Although such a language would be primitive compared to modern languages, it would be as malleable as Forth, and way easier to use, due to having local variables, garbage collected dynamic data structures, and memory safety. Like Forth, the implementation could fit into a few kilobytes of memory, because the original 1960 Lisp interpreter would have had to be that small.
Do you mean with Lisp syntax, i.e. S-expressions? Because there are a bunch of tiny Lisps like that, e.g. SectorLisp.
Otherwise I guess you have a language that’s “backwards Lisp” or “Lisp but it’s RPN.” You’d have to change a lot of the standard vocabulary, because on the downside functions don’t know how many arguments they’re given, and on the plus side they can have multiple return values.
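A small Python sketch of that trade-off, using a made-up RPN evaluator: `+` has to have a fixed arity because nothing delimits its arguments, so a variadic sum (here a hypothetical `sum-n`) needs an explicit count on the stack, whereas `/mod` can simply leave two results behind. The word names and the remainder-then-quotient order are assumptions for the example, modelled loosely on Forth's `/MOD`.

```python
def run_rpn(source):
    """Evaluate a whitespace-separated RPN string and return the final stack."""
    stack = []
    for token in source.split():
        if token == "+":
            # Fixed arity: always exactly two operands.
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "/mod":
            # Two return values: remainder, then quotient on top.
            b, a = stack.pop(), stack.pop()
            quotient, remainder = divmod(a, b)
            stack.extend([remainder, quotient])
        elif token == "sum-n":
            # Variadic only because the caller pushes the count explicitly.
            count = stack.pop()
            total = 0
            for _ in range(count):
                total += stack.pop()
            stack.append(total)
        else:
            stack.append(int(token))
    return stack


print(run_rpn("17 5 /mod"))
print(run_rpn("10 1 2 3 3 sum-n"))
```

In Lisp, `(+ 1 2 3)` gets its argument count for free from the parentheses; here the equivalent is `1 2 3 3 sum-n`.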
Yeah I meant Lisp. The OP asked for a “radically simple, individual scale language” with the benefits of Forth but more accessible. So, highly malleable, tiny core and implementation. I think that Lisp syntax is more accessible than Forth syntax (my opinion after trying both). SectorLisp is awesome but it doesn’t match the feature set described in the article. To compete with Forth as described by the OP, we need numbers, FEXPRs, strings, I/O. It needs to be a big enough language to implement its own REPL and solve advent-of-code problems.
I’m pretty sure that can be done in a small number of kilobytes, so that it could run on the OP’s 286 or other 1980’s PC. I don’t know which existing tiny Lisp implementation meets these requirements, but I agree it should be out there.
I used Forth in the late ’80s, using a 68HC11 with a Forth kernel in PROM, downloading my code into EEPROM. I loved that there was no separation between the base code and my code; adding my own control structures was a lot of fun.
Here’s an example from OKAD, which ran in colorForth:
I’m not a Chuck purist, though. I think he thinks differently to most people and it wouldn’t be productive to try and work like him. Even he’s come around to register-based programming when it’s useful.
Ha, adding a static type system would be quite the trick.