The whole ‘brackets are bad, we got rid of ‘em’ thing is kind of confrontational. It is not a compelling way to sell your PL idea. Both languages can exist and be enjoyed separately.
edit: I don’t know why they don’t just say: “We made a really cool OCaml-looking thing, with a great hygienic macro system.”
They’re Racket hackers building on top of Racket. Lispers have a massive historical collective inferiority complex. Many people hate on Lisp, and Lispers believe the parentheses are blinding the haters from seeing the beauty that is within; this leads to lots of attempts at “(the power of) Lisp, without parentheses”.
Racket takes that up to eleven as it has lots of support for building custom languages on their runtime. They even have a Java-lookalike. I think Guile has something similar where it supports JavaScript syntax.
Personally, I think it’s all a giant waste of time.
To be fair, the Racket folks are a “teach first” kind of crowd, so ProfJ is intended not as a general replacement for Java, but as a demonstration of how Java-style classes work, as part of a broader curriculum. I might be getting the history wrong, but TeachScheme, ReachJava was the project.
Yeah, it fits into the teaching thing. But I don’t see what’s wrong with using Java proper when the time comes to use that syntax.
I don’t think there’s anything wrong with Java, but ProfessorJ, I believe, can use the debugger and all of the other things in DrRacket that students got used to.
If you target a class at non-majors, or non-developers, you’re going to hit a lot of tooling hurdles. Hell, even early CS students often don’t have a good handle on installing different tools. The complexity is a barrier to understanding.
Fair enough, that makes sense. But why would you teach non-developers Java of all things? JavaScript, I can sort of understand because it is everywhere.
Well, we’re talking years ago here, but the reality was that Java was declared to be the language that prepped you for the real world by many universities.
Python or JavaScript would certainly be a better choice these days from a practical standpoint. Not sure how universities have evolved in the past 10 years…
I don’t see where the paper says parentheses are bad and must be destroyed.
Check the title.
Fair point. The content of the paper discusses how to make Lisp-style hygienic macros available to non-Lisp surface syntax, and is much less confrontational.
The related work section is very good, although not mentioning Julia seems like an oversight. It has Lisp-like macros, but with conventional Matlab/Python syntax:
https://docs.julialang.org/en/v1/manual/metaprogramming/
Building on “femtolisp” (https://github.com/JeffBezanson/femtolisp) is what enabled it. That is, the authors of Julia designed and implemented their own metalanguage (a Lisp).
Elixir is similar AFAICT, and mentioned many times in the paper, so it’s odd not to mention Julia.
Also, I think numerical programming is a good use of compile-time metaprogramming, although I’m not so familiar with how it’s used in Julia. It seems like it should be used a lot, but I haven’t looked.
I think you’re right that this is an oversight. I don’t recall it being brought up during writing or by the reviewers.
Terrible. Non-Lispy Lisp notations have been tried and forgotten so often, yet they never learn. The very reason macros are so elegant, simple and flexible in Lisp is the parentheses. But I understand that HtDP must be made more popular at any price.
Have you read the paper (or language docs) or is this a knee-jerk reaction? I think they’ve found an interesting middle ground, with homoiconicity but also a readable syntax. As someone who’s always admired Lisp but finds the idea of programming by S-expressions as appealing as poking my eyes out (sorry dude), I’m fairly interested in it.
Well, if you can’t appreciate the advantages that s-expressions provide, then it is probably pointless to convince you otherwise. Compared to macro-systems for languages with traditional syntax, this may look appealing. But if you haven’t used S-expression-based macros seriously before, you will not understand that all that machinery (just note the numerous footnotes pointing out this or that special case) or the markers for operator-“fixity” and precedence are more or less an elaborate attempt to provide something that in an s-expression based syntax can be taken for granted: no special defining forms for declaring the type of macro (expression, definition, binding), no worries about grouping and precedence and so on.
S-expr based syntax makes the grouping obvious, avoiding ambiguity and all mechanisms used to address that. Being whitespace-insensitive is more or less essential for true homoiconicity, as it makes reading/printing/pretty printing forms such an effortless undertaking. But the whole discussion of SSTs vs ASTs is something a true Lisper can just shake his or her head at, turn to the keyboard and write a complex macro with vastly less effort, less knowledge of the underlying macro-expansion engine, less code and with minimal restrictions.
That is the beauty of it all: s-exprs are not syntax, they are meta-syntax (tokenization). The syntax is completely free form, and can be arbitrarily handled by the one implementing the syntactic forms. I must admit that I do like things like Honu, but these are only a veneer (a meta syntax), while still providing the total freedom of s-exprs. With something like shrubbery notation (similar to the less ambitious D-exprs in Dylan) you will struggle to approach the expressivity of s-expr based macros without extensive knowledge of the rules and restrictions, the implicit grouping and the special operators. This shrubbery stuff is meta-syntax and syntax, already doing the grouping for you, even if you don’t want that.
Dude, Lispers have written whole code walkers (SERIES, MOP) in terms of macros, fully general pattern matching packages (Wright, Shinn) or statically resolved generic functions (Thant Tessmann’s generic function system)! I’d love to see the equivalent in Rhombus, really. Even if possible, it surely will be something that I, for fear of my eyes, hopefully will never have a need to see (chances are good).
So if simplicity, cognitive overhead and restrictions that can only be overcome with serious effort are not a worthwhile metric for you, Rhombus may be the right thing. Go for it. It certainly will give you the power of macros with a conventional and easy to understand syntax. And Matthew Flatt knows what he is doing, so I take it this stuff is solid. It’s just so pointless…
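To make the “taken for granted” point concrete, here is a minimal Racket sketch (a hypothetical example, not from the paper): a classic swap macro that needs no fixity markers, precedence rules or grouping declarations, because the s-expression already is the tree:
(define-syntax-rule (swap! a b)
  (let ([tmp a])      ; hygiene keeps tmp from capturing a caller's tmp
    (set! a b)
    (set! b tmp)))

(define x 1)
(define y 2)
(swap! x y)
(list x y) ; => '(2 1)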
no special defining forms for declaring the type of macro (expression, definition, binding), no worries about grouping and precedence and so on.
I don’t think “types of macros” would be needed in a more minimal system. They exist in Rhombus because there are more syntactic positions (above the Shrubbery level) and it simplifies defining macros for those positions. The closest analog in Racket (that I can think off the top of my head) are match expanders.
They exist in Rhombus because there are more syntactic positions (above the Shrubbery level) and it simplifies defining macros for those positions. The closest analog in Racket (that I can think off the top of my head) are match expanders.
Next time a relative asks me what I do for work, I’m going to reply with this.
Indeed - the “shrubbery” notation (naming never was a strong suit of the PLT/Racket folks) being metasyntax, and Rhombus already providing the basic syntactic skeleton, it needs more machinery to somehow give the user a method of declaring transformations in a simple enough manner. I understand the motivation; it may even be a reasonable approach to somehow cater to those who think Python is the acme of programming language syntax. Yet once you accept the turn to s-expressions this is all moot, unnecessary, overly complex and too restricting.
Design patterns are evidence of a problematic or insufficiently powerful language. Factories for example work around the tight coupling that new introduces at the ABI level, as well as its inability to return anything but the constructed type. A function that returns an abstract type or an interface solves both issues.
It’s really great to see that Java is finally realizing this and fixing the language itself.
Is it? I can’t think of a single language in existence without patterns of design that people reach for.
I’ll use Ruby as an example.
In Ruby there is no new keyword built into the language; new is just a normal method call on a normal object that happens to be a Class. It usually returns an instance of that class, but it can be redefined to return anything the programmer wants. In Ruby every object is created via factory methods, and people don’t even seem to notice. Also since it is dynamically duck typed it is not even necessary to formally define abstract types, just the methods the returned object responds to.
Why do we have singletons and all the lifetime management complexity and defensive programming that comes along with them? Why not use static class methods? Because classes cannot implement interfaces, they cannot be passed to methods as values, because they are second class citizens of the language. So it is not enough to put all the functionality inside the class itself which is a natural singleton, programmers actually need to create an actual instance of the class in order to use it in places where objects are expected. Because classes are not objects, programmers are forced to reinvent the parts of the compiler that ensure only a single instance of each class exists in the executable.
Ruby has no such problem. Classes are objects which respond to messages like any other. They are implicit singleton values. They can be passed around as values just fine. Interfaces need not be explicitly defined, only documented: any class can be passed to any method provided it responds to the same messages and does what is expected.
I think if we look deeply enough we can find language level solutions to most if not all of the problems that created the need for such design patterns in the first place.
Huh, I thought this truism came from a quip by Alan Perlis or something like that, but in fact it was articulated in a blog post by Mark Jason Dominus. Anyway, in a follow-up, he clarified:
When I said “[Design] Patterns are signs of weakness in programming languages,” what I meant was something like “Each design pattern is a sign of a weakness in the programming language to which it applies.”
I was first exposed to this idea via Bob Nystrom’s 2010 blog post:
I would rather say that if a useful pattern requires lots of boilerplate to express, that is a sign of weakness. But if you identify a useful pattern and can express it rather directly, or abstract it, then you are still using a design pattern but the language doesn’t get in your way.
MJD referred to a presentation by Peter Norvig, Design Patterns in Dynamic Languages, which identifies three “levels of implementation of a pattern”:
Classic design patterns are in the third category. They only existed as a thing that got written down in books because OO languages didn’t allow the patterns to be abstracted into a library or reduced to small idioms.
I’m going to qualify this with a “yes, but”. The Gang of Four book absolutely exists mainly to work around the shortcomings of Java, and most of the patterns have far simpler and nicer versions in more flexible languages. But there’s some odd niches here and there where they actually end up being useful.
Visitors for example. I will defend to the end the statement that visitors are just lame-ass re-creations of higher-order map, fold etc functions. But there are actually some niche places where they do better than a higher-order function approach, and I ran into one of them recently: traversing a heterogeneous tree-ish data type and operating on only a few of its nodes. In my case, a compiler traversing an AST or high-level IR and doing transformations on it. This is in Rust, which is a language with no lack of higher-order functions or type metaprogramming, but I kept running into places where a recursion scheme approach ended up being incomplete or insufficiently flexible, and I had to write special-case versions of it every time I hit a design case I didn’t foresee. So I caved in to people’s advice and wrote a visitor trait for it, and so far it has worked very well. Is it verbose and boiler-plate-y as hell to write? Yes. Is it actually very clean and simple to use? Also yes.
Turns out the secret is the double-dispatch you get from the interaction of the visitor trait and variant-specific walk functions. I’m 99% sure I could rewrite my existing recursion schemes to do the same thing, but I don’t think it would turn out any nicer than a visitor… Having one object with all the callbacks you need to transform any node type, not just the variants of a particular enum, seems like one of those “if it didn’t exist you’d have to invent it” things.
Smalltalk and C++, surely? Though I know many people who would be upset at me for claiming shortcomings in Smalltalk.
My bad, you’re probably correct. Never actually read it. :P I’ve seen it mostly discussed in terms of Java code in the early 2000’s… probably ‘cause I was learning to program in the early 2000’s and Java was my first language.
It’s ok. People may hate this statement, but Java is a Smalltalk with more types. Which is why a lot of the patterns were still applicable.
Why oh why did they waste the opportunity to name the language Ni?
…srsly, it looks pretty cool. Shrubbery notation addresses the syntactic ugliness that’s kept me away from Lisps. A lot of the macro stuff in the paper goes over my head — I can barely figure out how to use “…” in C++ templates — but it looks very powerful. Definitely going to try out Ni Rhombus when I get a chance.
[update: turns out I made the same joke a year ago in the thread @5d22b linked to. What can I say, I watched too much Monty Python in my impressionable youth.]
Rhombus is still the interim name. IIRC, picking the “official” name is one of the last steps in the process before the language is “done.”
I LOVE IT!
I wonder if their work is also related to WISP (whitespace lisp): https://www.draketo.de/software/wisp and https://srfi.schemers.org/srfi-110/srfi-110.html and https://readable.sourceforge.io/
IIRC all of those read to a general S-expression form. Shrubbery, the surface syntax that Rhombus uses, reads to a more constrained form of S-expressions.
Some rationale and comparison of the design choices made for Shrubbery can be read here: https://docs.racket-lang.org/shrubbery/Design_Considerations.html
Thank you for linking the paper: I saw this on the OOPSLA site yesterday but couldn’t find a link; where was this sourced (so I know where to look in the future)?
I think that link was via some sleuthing someone did on Reddit. Here is a version that is close to what will be published. The only changes should be spelling and grammar:
https://users.cs.utah.edu/plt/publications/oopsla23-faadffggkkmppst.pdf
Cool stuff! I take it the WASM code uses tail-call optimization?
From earlier posts I’ve seen about Lisp on WASM, it sounds like GC will be a significant hurdle. Hoot will have to implement that itself, right?
Hoot depends on both the tail call and GC Wasm extensions. On the GC side, Hoot will emit extra instructions to describe its types according to the Wasm GC spec and then the host VM can do the collecting.
I’m not sure where to put it, but the entire subthread is missing a look at the roadmap, which clearly shows that most runtimes do not have TC or GC available yet. Only Chrome can do it, and only with a special flag.
Yes, that’s the current status today. Firefox is in the process of actively implementing both. The proposals themselves appear to be progressing well through the standards process. Two web VMs are required to be implementing a proposal to progress through the later stages. The features should be default enabled once standardised, assuming no obstacles appear. People involved in the proposals process have estimated they should be available by the end of the year.
I believe other non-web engines are also working on these proposals, though I forget which at the moment.
Hmm, that’s unfortunate.
Unfortunate in what way…?
Scheme makes use of tail calls and GC, and those extensions are on track to be generally available in common Wasm engines this year, so it seems like a reasonable design choice to target them now.
Ideally those things would be handled internally, though, not rely on extra features being added to every WASM engine.
IIRC rolling your own GC in WASM is quite awkward/difficult, especially if you consider object references between containers. There was a post about it a month or so ago (I don’t remember the details; maybe it was the previous Spritely post?)
Also, the major WASM runtimes already contain world-class GCs, and run a GC’d language, so exposing those GCs to WASM seems a good idea for performance and interop.
(But I do get your point about piling on features that other WASM runtimes now need to add! Fortunately GC isn’t hard to implement, if you don’t care about world-class performance. I’ve done it twice this year for my smol_world project.)
Ah hmm… For these particular abilities though, it’s at the very least quite hard or potentially impossible to get the same result without some kind of engine support.
For the case of tail calls, it’s much more natural to express recursive programs (esp. ones in Scheme written to expect tail calls) in this style. Perhaps custom stack management at run-time or other hefty program transformations can be used as a workaround, but the concept of tail calls is fairly straightforward and the host engine complexity appears to be small.
For the case of GC, the proposal is far more complex, so I can understand hesitation in terms of complexity… At the same time, allowing Wasm programs to leverage the existing engine GC does make implementation drastically simpler (for languages expecting GC). Importantly, it also makes it possible to describe cycles between host engine data and Wasm program data, which just wasn’t possible before.
I suppose a generalised version of your concern might be that you don’t want every language feature to become a Wasm extension, and I agree with that general sentiment. In the case of tail calls and GC though, they feel (to me at least) sufficiently useful to a variety of languages, and they let the Wasm engine enable use cases that may be impossible (or very hard) otherwise.
Tail call elimination simplifies things a lot, and there’s very little reason not to have it… how to do it has been known for a long time. GCC even supports it for C in many cases, IIRC.
At any rate, I won’t go into the GC proposal stuff in depth, but here are some good motivators to see that work advance:
It means many more languages being able to become first class citizens in browser-space
It adds certain reference-integrity safety abilities to WASM languages, important and useful for ocap reasons
Usually some kind of efficient GC is already available. In the browser especially. Why not expose it?
It means being able to have a shared heap. This is really useful for garbage collection across programs: JavaScript and other languages can instantiate and share references without needing to duplicate data or deal with very difficult cycle detection and elimination problems.
Oh, so WASM GC is available already? I had the impression it was a ways out.
Wasm GC is experimentally available via flags or being implemented in at least Chrome and Firefox, perhaps some non-browser implementations as well.
It’s believed to be on track to be generally available in stable browsers and engines sometime this year IIRC.
I’m not super up-to-date, but as I understand it, GC will be in consumer browsers by Q4 2023 (according to Andy Wingo.) And is available in development builds currently, so language implementers may want to start targeting it now.
Janet is definitely the most modern looking Lisp I’ve seen.
What about Racket?
“When someone calls a language modern, it tells you next to nothing about the language, but it tells you a fair bit about the person who said it.”
That said, Racket has a few clunky features due to its age. The class system feels very dated, and the fact that most short list operations only work on lists and not general sequence types isn’t great. The latter is somewhat addressed by the “for” family of macros but IIRC it’s something the maintainers would have done differently if they had a do over.
“When someone calls a language modern, it tells you next to nothing about the language, but it tells you a fair bit about the person who said it.”
YES! I think the term “modern” is a thought-terminating cliché. What does it really mean? If you had a “modern” language, wrote a book about it, described it as “modern”, and 20 years passed, what would the term “modern” mean to readers?
It just shuts down conversations because no one wants to argue against it.
Agreed. I’ve dug through too many used bookstores and old libraries full of books with titles like “Modern Pascal Programming For MS-DOS 4.0” to want to use it as a term.
I’m tempted to create a terrible programming language and name it “Modern” just to try to get people to stop saying this.
You could take an amalgamation of bad features from the last 30 years of “modern” languages. It would probably be a great language!
I’m curious how you would change the Racket class system? Besides the Beta features, it’s not too different from Java or Smalltalk.
Omit it entirely. Classes were a mistake.
Most of my time is now spent using Racket in places where I could use a shell script. It’s easier to write a Racket program that invokes other programs, works with their error codes, and redirects their output to the right places. Truly a joy for me, personally, as I do like writing Lisp.
Could you provide a few idiomatic examples of replacements of typical shell-script pipelines featuring grep, seq, sort, etc.?
For the most part, a lot of features in the Racket library mean you do not need sub-processes for those types of jobs.
For grep we have regexp objects, used with regexp-match or regexp-match? to match across strings or to filter.
seq can be mimicked with the range function, combined with iteration forms like for.
sort is done by using the appropriately named Racket function sort, supplying the comparison function and input list.
If you want to invoke sub-processes, the output of a subprocess call can only be sent to a file-stream port like stdout or a plain file. Invoking multiple sub-processes one after another, continuously passing their outputs along, involves a little bit of trickery which might be a bit complex to talk about in a comment, but it is doable (see the sketch after the examples below). The gist is to try to write tasks using the Racket standard library, then use subprocess when you need something not covered by it.
; display all files in pwd
(for-each displayln (map path->string (directory-list)))
; display all files sorted
(for-each displayln
(sort (map path->string (directory-list)) string<?))
; regexp match over a list of sorted files
(for-each displayln
(filter (λ (fname) (regexp-match? #rx".*png" fname))
(sort (map path->string (directory-list)) string<?)))
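As for the sub-process chaining mentioned above, a minimal sketch of the trick (assuming the current ports are file-stream ports, e.g. a terminal): the pipe ports that subprocess creates are themselves file-stream ports, so one process’s output pipe can serve directly as the next one’s stdin, roughly ls | grep rkt:
; spawn ls with a fresh pipe for its stdout
(define-values (ls-proc ls-out ls-in ls-err)
  (subprocess #f #f (current-error-port) (find-executable-path "ls")))
(close-output-port ls-in) ; ls reads nothing from stdin
; hand ls's output pipe to grep as its stdin
(define-values (grep-proc grep-out grep-in grep-err)
  (subprocess (current-output-port) ls-out 'stdout
              (find-executable-path "grep") "rkt"))
(subprocess-wait ls-proc)
(subprocess-wait grep-proc)
(close-input-port ls-out)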
As posted in a sibling message, it’s much easier to use built-in functions than to shell out and call another program. Personally, I find Racket more convenient for writing scripts that need to work in parallel. For example, a script gets the load average from several machines in parallel over ssh:
https://gist.github.com/6c7ab225610bc50a3bb4be35f8e46f18
Would also love to see examples.
Best way I can quickly sum it up is clever use of the function subprocess in Racket.
(define (start-and-run bin . args)
  ; run bin with the current I/O ports; stderr is merged into stdout
  (define-values (s i o e)
    (apply subprocess
           `(,(current-output-port) ,(current-input-port) stdout
             ,(find-executable-path bin)
             ,@args)))
  (subprocess-wait s))

(start-and-run "seq" "1" "10")
This runs the seq command with its output sent to stdout, and it allows for arbitrary commands, so you can run zero-arg sub-processes or pass however many arguments you need/like. current-output-port and current-input-port are parameters that you can adjust using a parameterize block to control the input/output from the outside.
The output port must be a file-stream port; it cannot be an in-memory string port like with call-with-output-string. So output is either going to go straight to stdout, or you can use call-with-output-file and parameterize current-output-port to store the output wherever you please, as sketched below.
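A minimal sketch of that (run-capturing-output is a hypothetical helper, reusing the start-and-run function from above):
; send a command's output to a file by parameterizing
; current-output-port around the subprocess call, then read it back
(define (run-capturing-output path bin . args)
  (call-with-output-file path
    (lambda (out)
      (parameterize ([current-output-port out])
        (apply start-and-run bin args)))
    #:exists 'replace)
  (file->lines path))

; (run-capturing-output "/tmp/seq.txt" "seq" "1" "10") ; => '("1" "2" ... "10")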
I had trouble following all this (you’ve read the Common Lisp spec way more closely than I ever bothered to), but you might be interested in John Shutt’s Kernel language. To avoid unhygienic macros, Kernel basically outlaws quasiquote and unquote and constructs all macros out of list, cons and so on. Which has the same effect as unquoting everything. A hyperstatic system where symbols in macros always expand to their binding at definition time, never to be overridden. Implying among other things that you can never use functions before defining them.
There’s a lot I love about Kernel (it provides a uniform theory integrating functions and macros and intermediate beasts) but the obsession with hygiene is not one of them. I took a lot of inspiration from Kernel in my Lisp with first-class macros, but I went all the way in the other direction and supported only macros with quasiquote and unquote. You can define symbols in any order in Wart, and override any symbols at any time, including things like if and cons. The only things you can’t override are things that look like punctuation. Parens, quote, quasiquote, unquote, unquote-splice, and a special symbol @ for apply analogous to unquote-splice. Wart is even smart enough to support apply on macros, something Kernel couldn’t do – as long as your macros are defined out of quasiquote and unquote. I find this to be a sort of indirect sign that it gets closer to the essence of macros by decoupling them into their component pieces like Kernel did, but without complecting them with concerns of hygiene.
(Bel also doesn’t care about hygienic macros and claims to support fully first-class apply on macros. Though I don’t understand how Bel’s macroexpand works in spite of some effort in that direction.)
Depends on what you’re protecting against. Macros are fundamentally a convenience. As I understand the dialectic around hygienic macros, the goal is always just to add guardrails to the convenient path, not to make the guardrails mandatory. Most such systems deliberately provide escape hatches for things like anaphoric macros. So I don’t think I’ve ever heard someone say hygiene needs to be an ironclad guarantee.
Honestly I agree with the inclusion of escape hatches if they are unlikely to be hit accidentally; I’m just surprised that the Kernel developers also agree, since they took such a severe move as to disallow quasiquote altogether.
So I don’t think I’ve ever heard someone say hygiene needs to be an ironclad guarantee.
I don’t want to put words in peoples’ mouths, but I’m pretty sure this is the stance of most Racket devs.
Racket doesn’t forbid string->symbol either, it just provides it with some type-safe scaffolding called syntax objects. We can definitely agree that makes it more difficult to use. But the ‘loophole’ does continue to exist.
I’m not aware of any macro in Common Lisp that cannot be implemented in Racket (modulo differences in the runtimes like Lisp-1 vs Lisp-2, property lists, etc.) It just gets arbitrarily gnarly.
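For the curious, a minimal sketch of that loophole in action: datum->syntax deliberately stamps an identifier with the macro call’s context, here for the classic anaphoric if:
; `it` is introduced unhygienically so the caller can refer to it
(define-syntax (aif stx)
  (syntax-case stx ()
    [(_ test then else)
     (with-syntax ([it (datum->syntax stx 'it)])
       #'(let ([it test])
           (if it then else)))]))

(aif (assoc 'b '((a 1) (b 2)))
     (cadr it)  ; `it` names the result of the test
     'missing)  ; => 2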
Thanks for the clarification. I have attempted several times to understand Racket macros but never really succeeded because it’s just so much more complicated compared to the systems I’m familiar with.
Yeah, I’m totally with you. They make it so hard that macros are used a lot less in the Scheme world. If you’re looking to understand macros, I’d recommend a Lisp that’s not a Scheme. I cut my teeth on them using Arc Lisp, which was a great experience even though Arc is a pretty thin veneer over Racket.
Nowadays when I need a Racket macro I just show up in #racket and say “boy, this sure is easy to write using defmacro, too bad hygienic macros are so confusing” and someone will be like “they’re not confusing! all you have to do is $BLACK_MAGIC” and then boom; I have the macro I need.
Kernel does not avoid unhygienic macros, whereas Scheme R6RS syntax-case makes it more difficult (but still possible) to write unhygienic macros. It is possible to write unhygienic code with Kernel, such as defining define-macro, without using or needing quasiquote et al.
Kernel basically outlaws quasiquote and unquote
Kernel does not outlaw the quasiquote and unquote semantics. There is $quote, and unquote is merely (eval symbol env), whereas quasiquote is just a reader trick inside Scheme (also see [0]).
and constructs all macros out of list, cons and so on.
Yes and no.
Scheme macros, and even CL macros, are meant as a) a hook into the compiler to speed things up, e.g. compose or Clojure’s ->, or b) a way to change the prefix-based evaluation strategy to build so-called Domain Specific Languages, such as records, e.g. SRFI-9.
Kernel eliminates the need to think “is this a macro or is this a procedure”: instead everything is an operative, and it is up to the interpreter or compiler to figure out what can be compiled (ahead-of-time) or not. That is slightly more general than “everything is a macro”, at least because an operative has access to the dynamic scope.
Based on your comment description, Wart is re-inventing Kernel or something like that (without formal description unlike John Shutt).
Page 67 of the Kernel Report says macros don’t need apply because they don’t evaluate their arguments. I think that’s wrong because macros can evaluate their arguments when unquoted. Indeed, most macro args are evaluated eventually, using unquote. In the caller’s environment. Most of the value of macros lies in selectively turning off eval for just the odd arg. And macros are most of the use of fexprs, as far as I’ve been able to glean.
Kernel eliminates the need to think “this a macro or is this procedure”
Yes, that’s the goal. But it doesn’t happen for apply. I kept running into situations where I had to think about whether the variable was a macro. Often, within the body of a higher-order function/macro, I just didn’t know. So the apply restriction spread through my codebase until I figured this out.
I spent some time trying to find a clean example where I use @ on macros in Wart. Unfortunately this capability is baked into Wart so deeply (and Wart is so slow, suffering from the combinatorial explosion of every fexpr-based Lisp) that it’s hard to explain. But Wart provides the capability to cleanly extend even fundamental operations like if and def and mac, and all these use the higher-order functions on macros deep inside their implementations.
Based on your comment description, Wart is re-inventing Kernel or something like that (without formal description unlike John Shutt).
I would like to think I reimplemented the core idea of Kernel ($vau) while decoupling it from considerations of hygiene. And fixed apply in the process. Because my solution to apply can’t work in hygienic Kernel.
I’m not making any claim of novelty here. I was very much inspired by the Kernel dissertation. But I found the rest of its language spec.. warty :D
Promoting solely unhygienic macros is, as far as I understand, similar to promoting “formal proofs of code are useless”, or something similar about ACID or any other kind of guarantees a software might provide.
Both Scheme and Kernel offer the ability to bypass the default hygienic behavior; hence they promote, first, a path of least surprise (and fewer hard-to-find bugs), while still allowing the second path (where you’ll probably shoot yourself in the foot at some point).
At least for me, the value of Lisp is in its late-bound nature during the prototyping phase. So usability is the top priority. Compromising usability with more complicated macro syntax (resulting in far fewer people defining macros, as happens in the Scheme world) for better properties in mature programs seems a poor trade-off. And yes, I don’t use formal methods while prototyping either.
The only drawback of hygienic macros that I know about is that they are more difficult to implement than define-macro, but then again I do not know everything about macros.
We’ll have to agree to disagree about syntax-rules. Just elsewhere on this thread there’s someone describing their various attempts to unsuccessfully use macros in Scheme. I have had the same experience. It’s not just the syntax of syntax-rules. Scheme is pervasively designed (like Kernel) with hygiene in mind. It makes for a very rigid language, with things like the phase separation rules, that is the antithesis of the sort of “sketching” I like to use Lisp for.
Currently it isn’t possible. It would require implementing the base widgets (rendering and input events.) Part of an implementation could be simplified by using the existing racket/draw library which sits on top of cairo.
Eh, there are some problems with xargs, but this isn’t a good critique. First off it proposes a “solution” that doesn’t even handle spaces in filenames (much less, say, newlines):
rm $(ls | grep foo)
I prefer this as a practical solution (that handles every char except newlines in filenames):
ls | grep foo | xargs -d $'\n' -- rm
You can also pipe find . -print0 to xargs -0 if you want to handle newlines (untrusted data).
(Although then you have the problem that there’s no grep -0, which is why Oil has QSN. grep still works on QSN, and QSN can represent every string, even those with NULs!)
One nice thing about xargs is that you can preview the commands by adding ‘echo’ on the front:
ls | grep foo | xargs -d $'\n' -- echo rm
That will help get the tokenization right, so you don’t feed the wrong thing into the commands!
I never use xargs -L, and I sometimes use xargs -I {} for simple invocations. But even better than that is using xargs with the $0 Dispatch pattern, which I still need to properly write about.
Basically instead of the mini language of -I {}, just use shell by recursively invoking shell functions. I use this all the time, e.g. all over Oil and elsewhere.
do_one() {
# It's more flexible to use a function with $1 instead of -I {}
echo "Do something with $1"
echo mv $1 /tmp
}
do_all() {
# call the do_one function for each item. Also add -P to make it parallel
cat tasks.txt | grep foo | xargs -n 1 -d $'\n' -- $0 do_one
}
"$@" # dispatch on $0; or use 'runproc' in Oil
Now run with
myscript.sh do_all, or
myscript.sh do_one to test out the “work” function (very handy! you need to make this work first)
This separates the problem nicely – make it work on one thing, and then figure out which things to run it on. When you combine them, they WILL work, unlike the “sed into bash” solution.
Reading up on what xargs -L does, I have avoided it because it’s a custom mini-language. It says that trailing blanks cause line continuations. Those sort of rules are silly to me.
I also avoid -I {} because it’s a custom mini-language.
IMO it’s better to just use the shell, and one of these three invocations:
xargs – when you know your input is “words” like myhost otherhost
xargs -d $'\n' – when you want lines
xargs -0 – when you want to handle untrusted data (e.g. someone putting a newline in a filename)
Those 3 can be combined with -n 1 or -n 42, and they will do the desired grouping. I’ve never needed anything more than that.
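A toy example of that grouping (hypothetical input, GNU xargs):
$ printf 'a b c d e\n' | xargs -n 2 echo group:
group: a b
group: c d
group: e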
So yes xargs is weird, but I don’t agree with the author’s suggestions. sed piped into bash means that you’re manipulating bash code with sed, which is almost impossible to do correctly.
Instead I suggest combining xargs and shell, because xargs works with arguments and not strings. You can make that correct and reason about what it doesn’t handle (newlines, etc.)
It can be much faster (depending on the use case). If you’re trying to rm 100,000 files, you can start one process instead of 100,000 processes! (the max number of args to a process on Linux is something like 131K as far as I remember).
It’s basically
rm one two three
vs.
rm one
rm two
rm three
Here’s a comparison showing that find -exec is slower:
Oh yes, it does! I don’t tend to use it, since I use xargs for a bunch of other stuff too, but that will also work. Looks like busybox supports it too, in addition to GNU (I would guess it’s in POSIX).
the max number of args to a process on Linux is something like 131K as far as I remember
Time for the other really, really useful feature of xargs. ;)
$ echo | xargs --show-limits
Your environment variables take up 2222 bytes
POSIX upper limit on argument length (this system): 2092882
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2090660
Size of command buffer we are actually using: 131072
Maximum parallelism (--max-procs must be no greater): 2147483647
It’s not a limit on the number of arguments, it’s a limit on the total size of environment variables + command-line arguments (+ some other data, see getauxval(3) on a Linux machine for details). Apparently Linux defaults to a quarter of the available stack allocated for new processes, but it also has a hard limit of 128KiB on the size of each individual argument (MAX_ARG_STRLEN). There’s also MAX_ARG_STRINGS which limits the number of arguments, but it’s set to 2³¹-1, so you’ll hit the ~2MiB limit first.
Needless to say, a lot of these numbers are much smaller on other POSIX systems, like BSDs or macOS.
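If you just want the headline number on a given machine, getconf reports it too (a typical Linux value shown; it varies with system and stack limits):
$ getconf ARG_MAX   # max bytes for argv + environment combined
2097152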
find . -exec blah will fork a process for each file, while find . | xargs blah will fork a process per X files (where X is the system-wide argument length limit). The latter could run quite a bit faster. I will typically do find . -name '*.h' | xargs grep SOME_OBSCURE_DEFINE and depending upon the repo, that might only expand to one grep.
As @jonahx mentions, there is an option for that in find too:
-exec utility [argument ...] {} +
Same as -exec, except that ``{}'' is replaced with as many pathnames as possible for each invocation of utility. This behaviour is similar to that of xargs(1).
I didn’t know about the ‘+’ option to find, but I also use xargs with a custom script that scans for source files in a directory (not in sh or bash as I personally find shell scripting abhorrent).
That is the real beauty of xargs. I didn’t know about using + with find, and while that’s quite useful, remembering it means I need to remember something that only works with find. In contrast, xargs works with anything that can supply a newline-delimited list of filenames as input.
Conceptually, I think of xargs primarily as a wrapper that enables tools that don’t support stdin to support stdin. Is this a good way to think about it?
Yes I’d think of it as an “adapter” between text streams (stdin) and argv arrays. Both of those are essential parts of shell and you need ways to move back and forth. To move the other way you can simply use echo (or write -- @ARGV in Oil).
Another way I think of it is to replace xargs with the word “each” mentally, as in Ruby, Rust, and some common JS idioms.
You’re basically separating iteration from the logic of what to do on each thing. It’s a special case of a loop.
In a loop, the current iteration can depend on the previous iteration, and sometimes you need that. But in xargs, every iteration is independent, which is good because you can add xargs -P to automatically parallelize it! You can’t do that with a regular loop.
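For example, a hypothetical sketch assuming a urls.txt with one URL per line:
# fetch URLs 4 at a time, one per process; -P 0 means "as many as possible"
cat urls.txt | xargs -d $'\n' -n 1 -P 4 -- curl -O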
I would like Oil to grow an each builtin that is a cleaned up xargs, following the guidelines I enumerated.
I’ve been wondering if it should be named each and every?
each – like xargs -n 1, and find -exec foo \; – call a process on each argument
every – like xargs, and find -exec foo + – call the minimal number of processes, but exhaust all arguments
So something like
proc myproc { echo $1 } # passed one arg
find . | each -- myproc # call a proc/shell function on each file, newlines are the default
proc otherproc { echo @ARGV } # passed many args
find . | every -- otherproc # call the minimal number of processes
If anyone has feedback I’m interested. Or wants to implement it :)
Probably should add this to the blog post: Why use xargs instead of a loop?
It’s easier to preview what you’re doing by sticking echo on the beginning of the command. You’re decomposing the logic of which things to iterate on, and what work to do.
When the work is independent, you can parallelize with xargs -P
You can filter the work with grep. Instead of find | xargs, do find | grep | xargs. This composes very nicely
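Putting those together, a hypothetical run combining all three points, with echo as the preview step:
find . -type f | grep '\.tmp$' | xargs -d $'\n' -- echo rm   # preview the commands
find . -type f | grep '\.tmp$' | xargs -d $'\n' -- rm        # then run for real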
The whole ‘brackets are bad we got rid of em’ thing kind of confrontational. it is not a compelling way to sell your PL idea. both languages can exist and be enjoyed separately.
edit: I don’t know why they don’t just say: “We made a really cool ocaml looking thing, with a great hygenic macro system”
They’re Racket hackers building on top of Racket. Lispers have a massive historical collective inferiority complex. Many people hate on Lisp and Lispers believe the parentheses are blinding the haters from seeing the beauty that is within, this leads to lots of attempts at “(the power of) Lisp, without parentheses”.
Racket takes that up to eleven as it has lots of support for building custom languages on their runtime. They even have a Java-lookalike. I think Guile has something similar where it supports JavaScript syntax.
Personally, I think it’s all a giant waste of time.
To be fair, the Racket folks are are a “teach first” kind of crowd, so ProfJ is intended not as a general replacement for Java, but as a demonstration of how Java style classes work, as part of a broader curriculum. I might be getting the history wrong, but Teach Scheme Reach Java was the project.
Yeah, it fits into the teaching thing. But I don’t see what’s wrong with using Java proper when the time comes to use that syntax.
I don’t think there’s anything wrong with Java, but ProfessorJ, I believe, can use the debugger, and all of the other things in Dr Racket that students got used to.
If you target a class at non-majors, or non-developers, you’re going to hit a lot of tooling hurdles. Hell, even early CS students don’t often have a command on installation of different tools. The complexity is a barrier to understanding.
Fair enough, that makes sense. But why would you teach non-developers Java of all things? JavaScript, I can sort of understand because it is everywhere.
Well, we’re talking years ago here, but the reality was that Java was declared to be the language that prepped you for the real world by many universities.
Python or JavaScript would certainly be a better choice these days from a practical standpoint. Not sure how universities have evolved in the past 10 years…
I don’t see where the paper says parentheses are bad and must be destroyed.
Check the title
Fair point. The content of the paper discusses how to make lisp style hygienic macros available to non-lisp surface syntax, and is much less confrontational.
The related work section is very good, although not mentioning Julia seems like an oversight. It has Lisp-like macros, but with conventional Matlab/Python syntax:
https://docs.julialang.org/en/v1/manual/metaprogramming/
Building on “femtolisp” (https://github.com/JeffBezanson/femtolisp) is what enabled it. That is, the authors of Julia designed and implemented their own metalanguage (a Lisp).
Elixir is similar AFAICT, and mentioned many times in the paper, so it’s odd not to mention Julia.
Also, I think numerical programming is a good use of compile-time metaprogramming, although I’m not so familiar with how its usage in Julia. It seems like it should be used a lot, but I haven’t looked
I think you’re right that this is an oversight. I don’t recall it being brought up during writing or by the reviewers.
Terrible. Non-Lispy Lisp notations have been tried and forgotten so often, yet they never learn. The very reason macros are so elegant, simple and flexible in Lisp is the parentheses. But I understand that HtDP must be made more popular at any price.
Have you read the paper (or language docs) or is this a knee-jerk reaction? I think they’ve found an interesting middle ground, with homoiconicity but also a readable syntax. As someone who’s always admired Lisp but finds the idea of programming by S-expressions as appealing as poking my eyes out (sorry dude), I’m fairly interested in it.
Well, if you can’t appreciate the advantages that s-expressions provide, then it is probably pointless to convince you otherwise. Compared to macro-systems for languages with traditional syntax, this may look appealing. But if you haven’t used S-expression-based macros seriously before, you will not understand that all that machinery (just note the numerous footnotes pointing out this or that special case) or the markers for operator-“fixity” and precedence are more or less an elaborate attempt to provide something that in an s-expression based syntax can be taken for granted: no special defining forms for declaring the type of macro (expression, definition, binding), no worries about grouping and precedence and so on.
S-expr based syntax makes the grouping obvious, avoiding ambiguity and all mechanisms used to address that. Being whitespace-insensitive is more or less essential for true homoiconicity, as it makes reading/printing/pretty printing forms such an effortless undertaking. But the whole discussion of SSTs vs ASTs is something a true Lisper can just shake his or her head at, turn to the keyboard and write a complex macro with vastly less effort, less knowledge of the underlying macro-expansion engine, less code and with minimal restrictions.
That is the beauty of it all: s-exprs are not syntax, they are meta-syntax (tokenization). The syntax is completely free form, and can be arbitrarily handled by the one implementing the syntactic forms. I must admit that I do like things like Honu, but these are only a veneer (a meta syntax), while still providing the total freedom of s-exprs. WIth something like shrubbery notation (similar to the less ambitious D-exprs in Dylan) you will struggle to approach the expressivity of s-expr based macros without extensive knowledge of the rules and restrictions, the implicit grouping and the special operators. This shrubbery stuff is meta-syntax and syntax, already doing the grouping for you, even if you don’t want that.
Dude, Lispers have written whole code walkers (SERIES, MOP) in terms of macros, fully general pattern matching packages (Wright, Shinn) or statically resolved generic functions (Thant Tessmann’s generic function system)! I’d love to see the equivalent in Rhombus, really. Even if possible, it surely will be something that I, for fear of my eyes, hopefully will never have a need to see (chances are good).
So if simplicity, cognitive overhead and restrictions that can only be overcome with serious effort are not a worthwhile metric for you, Rhombus may be the right thing. Go for it. It certainly will give you the power of macros with a conventional and easy to understand syntax. And Matthew Flatt knows what he is doing, so I take it this stuff is solid. It’s just so pointless…
I don’t think “types of macros” would be needed in a more minimal system. They exist in Rhombus because there are more syntactic positions (above the Shrubbery level) and it simplifies defining macros for those positions. The closest analog in Racket (that I can think off the top of my head) are match expanders.
Next time a relative asks me what I do for work, I’m going to reply with this.
Indeed - the “shrubbery” notation (naming never was a strong side of the PLT/Racket folks) being metasyntax and Rhombus already providing the basic syntactic skeleton, it needs more machinery to somehow give the user a method of declaring transformations in a simple enough manner. I understand the motivation, it may even be a reaonable approach to somehow cater to those that think Python is the acme of programming language syntax. Yet once you accept the turn to s-expressions this is all moot, unnecessary, overly complex and too restricting.
Design patterns are evidence of a problematic or insufficiently powerful language. Factories for example work around the tight coupling that
newintroduces at the ABI level, as well as its inability to return anything but the constructed type. A function that returns an abstract type or an interface solves both issues.It’s really great to see that Java is finally realizing this and fixing the language itself.
Is it? I can’t think of a single language in existence without patterns of design that people reach for.
I’ll use Ruby as an example.
In Ruby there is no
newkeyword built into the language,newis just a normal method call on a normal object that happens to be aClass. It usually returns an instance of that class, but it can be redefined to return anything the programmer wants. In Ruby every object is created via factory methods, and people don’t even seem to notice. Also since it is dynamically duck typed it is not even necessary to formally define abstract types, just the methods the returned object responds to.Why do we have singletons and all the lifetime management complexity and defensive programming that comes along with them? Why not use static class methods? Because classes cannot implement interfaces, they cannot be passed to methods as values, because they are second class citizens of the language. So it is not enough to put all the functionality inside the class itself which is a natural singleton, programmers actually need to create an actual instance of the class in order to use it in places where objects are expected. Because classes are not objects, programmers are forced to reinvent the parts of the compiler that ensure only a single instance of each class exists in the executable.
Ruby has no such problem. Classes are objects which respond to messages like any other. They are implicit singleton values. They can be passed around as values just fine. Interfaces need not be explicitly defined, only documented: any class can be passed to any method provided it responds to the same messages and does what is expected.
I think if we look deeply enough we can find language level solutions to most if not all of the problems that created the need for such design patterns in the first place.
Huh, I thought this truism came from a quip by Alan Perlis or something like that, but in fact it was articulated in a blog post by Mark Jason Dominus. Anyway, in a follow-up, he clarified:
I was first exposed to this idea via Bob Nystrom’s 2010 blog post:
I would rather say that if a useful pattern requires lots of boilerplate to express, that is a sign of weakness. But if you identify a useful pattern and can express it rather directly, or abstract it, then you are still using a design pattern but the language doesn’t get in your way.
MJD referred to a presentation by Peter Norvig, Design Patterns in Dynamic Languages which identifies three “levels of implementation of a pattern”:
Classic design patterns are in the third category. They only existed as a thing that got written down in books because OO languages didn’t allow the patterns to be abstracted into a library or reduced to small idioms.
I’m going to qualify this with a “yes, but”. The Gang of Four book absolutely exists mainly to work around the shortcomings of Java, and most of the patterns have far simpler and nicer versions in more flexible languages. But there’s some odd niches here and there where they actually end up being useful.
Visitors for example. I will defend to the end the statement that visitors are just lame-ass re-creations of higher-order
map,foldetc functions. But there’s actually some niche places where they do better than a higher-order function approach, and I ran into one of them recently: traversing a heterogenous tree-ish data type and operating on only a few of its nodes. In my case, a compiler traversing an AST or high-level IR and doing transformations on it. This is in Rust, which is a language with no lack of higher-order functions or type metaprogramming, but I kept running into places where a recursion scheme approach ended up being incomplete or insufficiently flexible and I had to write special-case versions of it every time I had design case I didn’t foresee. So I caved in to people’s advice and wrote a visitor trait for it, and so far it has worked very well. Is it verbose and boiler-plate-y as hell to write? Yes. Is it actually very clean and simple to use? Also yes.Turns out the secret is the double-dispatch you get from the interaction of the visitor trait and variant-specific
walkfunctions. I’m 99% sure I could rewrite my existing recursion schemes to do the same thing, but I don’t think it would turn out any nicer than a visitor… Having one object with all the callbacks you need to transform any node type, not just the variants of a particular enum, seems like one of those “if it didn’t exist you’d have to invent it” things.Smalltalk and C++, surely? Though I know many people who would be upset at me for claiming shortcomings in Smalltalk.
My bad, you’re probably correct. Never actually read it. :P I’ve seen it mostly discussed in terms of Java code in the early 2000’s… probably ‘cause I was learning to program in the early 2000’s and Java was my first language.
It’s ok. People may hate this statement, but Java is a Smalltalk with more types. Which is why a lot of the patterns were still applicable.
Why oh why did they waste the opportunity to name the language Ni?
…srsly, it looks pretty cool. Shrubbery notation addresses the syntactic ugliness that’s kept me away from Lisps. A lot of the macro stuff in the paper goes over my head — I can barely figure out how to use “…” in C++ templates — but it looks very powerful. Definitely going to try out
NiRhombus when I get a chance.[update: turns out I made the same joke a year ago in the thread @5d22b linked to. What can I say, I watched too much Monty Python in my impressionable youth.]
Rhombus is still the interim name. IIRC, in the process picking the “official” name is one of the last steps before the language is “done.”
I LOVE IT!
I wonder if their work is also related to WISP (whitespace lisp) https://www.draketo.de/software/wisp and https://srfi.schemers.org/srfi-110/srfi-110.html and https://readable.sourceforge.io/
IIRC all of those read to a general S-expression form. Shrubbery, the surface syntax that Rhombus uses, reads to a more constrained form of S-expressions.
Some rationale and comparison of the design choices made for Shrubbery can be read here https://docs.racket-lang.org/shrubbery/Design_Considerations.html
Thank you for linking the paper: I saw this on the OOPSLA site yesterday but couldn’t find a link; where was this sourced (so i know where to look in the future)?
I think that link was via some sleuthing someone did on Reddit. Here is a version that is close to what will be published. The only changes should be spelling and grammar.
https://users.cs.utah.edu/plt/publications/oopsla23-faadffggkkmppst.pdf
Cool stuff! I take it the WASM code uses tail-call optimization?
From earlier posts I’ve seen about Lisp on WASM, it sounds like GC will be a significant hurdle. Hoot will have to implement that itself, right?
Hoot depends on both the tail call and GC Wasm extensions. On the GC side, Hoot will emit extra instructions to describe its types according to the Wasm GC spec and then the host VM can do the collecting.
I’m not sure where to put it, but the entire subthread is missing a look at the roadmap, which clearly shows that most runtimes do not have TC or GC available yet. Only Chrome can do it, and only with a special flag.
Yes, that’s the current status today. Firefox is in the process of actively implementing both. The proposals themselves appear to be progressing well through the standards process. Two web VMs are required to be implementing a proposal to progress through the later stages. The features should be default enabled once standardised, assuming no obstacles appear. People involved in the proposals process have estimated they should be available by the end of the year.
I believe other non-web engines are also working on these proposals, though I forget which at the moment.
Hmm, that’s unfortunate
Unfortunate in what way…?
Scheme makes use of tail calls and GC, and those extensions are on track to be generally available in common Wasm engines this year, so it seems like a reasonable design choice to target them now.
Ideally those things would be handled internally, though, not rely on extra features being added to every WASM engine.
IIRC rolling your own GC in WASM is quite awkward/difficult, especially if you consider object references between containers. There was a post about it a month or so ago (i don’t remember the details; maybe it was the previous Spritely post?)
Also, the major WASM runtimes already contain world-class GCs, and run a GC’d language, so exposing those GCs to WASM seems a good idea for performance and interop.
(But I do get your point about piling on features that other WASM runtimes now need to add! Fortunately GC isn’t hard to implement, if you don’t care about world-class performance. I’ve done it twice this year for my smol_world project.)
Ah hmm… For these particular abilities though, it’s at the very least quite hard or potentially impossible to get the same result without some kind of engine support.
For the case of tail calls, it’s much more natural to express recursive programs (esp. ones in Scheme written to expect tail calls) in this style. Perhaps custom stack management at run-time or other hefty program transformations can be used as a workaround, but the concept of tail calls is fairly straightforward and the host engine complexity appears to be small.
For the case of GC, the proposal is far more complex, so I can understand hesitation in terms of complexity… At the same time, allowing Wasm programs to leverage the existing engine GC does make implementation drastically simpler (for languages expecting GC). It importantly also makes it possible to describe cycles between host engines data and Wasm program data which just wasn’t possible before.
I suppose a generalised version of your concern might be that you don’t want every language feature to become a Wasm extension, and I agree with that general sentiment. In the case of tail calls and GC though, they feel (to me at least) sufficiently useful to a variety of languages and allow the Wasm engine to enable a use cases that may be impossible (or very hard) otherwise.
Tail call elimination simplifies things a lot, and there’s very little reason not to have it… how to do it has been known for a long time. GCC even supports it for C in many cases, IIRC.
At any rate, I won’t go into the GC proposal stuff in depth, but here are some good motivators to see that work advance:
Oh, so WASM GC is available already? I had the impression it was a ways out.
Wasm GC is experimentally available via flags or being implemented in at least Chrome and Firefox, perhaps some non-browser implementations as well.
It’s believed to be on track to be generally available in stable browsers and engines sometime this year IIRC.
I’m not super up-to-date, but as I understand it, GC will be in consumer browsers by Q4 2023 (according to Andy Wingo.) And is available in development builds currently, so language implementers may want to start targeting it now.
Janet is definitely the most modern looking Lisp I’ve seen.
What about Racket?
“When someone calls a language modern, it tells you next to nothing about the language, but it tells you a fair bit about the person who said it.”
That said, Racket has a few clunky features due to its age. The class system feels very dated, and the fact that most short list operations only work on lists and not general sequence types isn’t great. The latter is somewhat addressed by the “for” family of macros but IIRC it’s something the maintainers would have done differently if they had a do over.
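For example, a quick sketch of the contrast:

    ;; map and friends accept only lists...
    (map add1 '(1 2 3))                   ; => '(2 3 4)
    ;; ...while the for forms consume any sequence:
    (for/list ([x (in-vector #(1 2 3))])  ; vectors, strings, ports, ...
      (add1 x))                           ; => '(2 3 4)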
YES! I think the term “modern” is a thought-terminating cliché. What does it really mean? If you had a “modern” language, wrote a book describing it as “modern”, and 20 years passed, what would the term “modern” mean to readers?
It just shuts down conversations, because no one wants to argue against it.
Agreed. I’ve dug through too many used bookstores and old libraries full of books with titles like “Modern Pascal Programming For MS-DOS 4.0” to want to use it as a term.
I’m tempted to create a terrible programming language and name it “Modern” just to try to get people to stop saying this.
You could take an amalgamation of bad features from the last 30 years of “modern” languages. It would probably be a great language!
I’m curious how you would change the Racket class system? Besides the Beta features, it’s not too different from Java or Smalltalk.
Omit it entirely. Classes were a mistake.
Most of my time is now spent using Racket in places where I could use a shell script. It’s easier to write a Racket program that invokes other programs and work with their error codes and re-direct their output to the right places. Truly a joy for me, personally, as I do like writing Lisp.
Could you provide a few idiomatic examples of replacements for typical shell-script pipelines featuring grep, sed, sort, etc.?
For the most part, the Racket standard library can do those types of jobs without spawning sub-processes.
For grep, we have regexp objects, which work with regexp-match or regexp-match? to match across strings or to filter. seq can be mimicked by using the range function, combined with iteration forms like for. sort is done by the appropriately named Racket function sort, supplying the comparison function and input list.
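A few sketches of what that looks like (illustrative; "log.txt" and the pattern are made up):

    #lang racket
    ;; grep error log.txt
    (call-with-input-file "log.txt"
      (lambda (in)
        (for ([line (in-lines in)]
              #:when (regexp-match? #rx"error" line))
          (displayln line))))
    ;; seq 1 10
    (for ([i (in-range 1 11)])
      (displayln i))
    ;; sort -n
    (sort '(3 1 2) <)  ; => '(1 2 3)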
If you want to invoke programs as sub-processes, then the output of a subprocess call can only be sent to a file stream like stdout or a plain file. Invoking multiple sub-processes one after another and continuously passing their outputs to one another involves a little bit of trickery, which might be a bit complex to talk about in a comment, but it is do-able. The gist is to write tasks using the Racket standard library where possible, then use subprocess when you need something not covered by it.

As posted in a sibling message, it’s much easier to use built-in functions than to shell out and call another program. Personally, I find Racket more convenient for writing scripts that need to work in parallel. For example, a script that gets the load average from several machines in parallel over ssh:
https://gist.github.com/6c7ab225610bc50a3bb4be35f8e46f18
Would also love to see examples.
Best way I can quickly sum it up is clever use of the function subprocess in Racket. It can send the output of the seq command to stdout, and it allows for arbitrary commands, so you can do zero-arg sub-processes or however many arguments you need/like. current-output-port and current-input-port are parameters that you can adjust in a parameterize block to control the input/output from the outside.
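A minimal sketch (my reconstruction, assuming a terminal, where the current ports are file-stream ports):

    #lang racket
    ;; Run `seq 1 5`, sending its output to our stdout.
    ;; subprocess accepts only file-stream ports (or #f) for redirection.
    (define-values (sp out in err)
      (subprocess (current-output-port) #f (current-error-port)
                  (find-executable-path "seq") "1" "5"))
    (subprocess-wait sp)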
The output port must be set to a file stream; it cannot be set to an output string as with call-with-output-string. So output either goes straight to stdout, or you can use with-output-to-file to rebind the current-output-port parameter and store the output wherever you please.

Adding or setting key=value in a file idempotently: there is a utility for that: setconf
There is also augeas.
I had trouble following all this (you’ve read the Common Lisp spec way more closely than I ever bothered to), but you might be interested in John Shutt’s Kernel language. To avoid unhygienic macros, Kernel basically outlaws quasiquote and unquote and constructs all macros out of list, cons and so on, which has the same effect as unquoting everything. It’s a hyperstatic system where symbols in macros always expand to their binding at definition time, never to be overridden, implying among other things that you can never use functions before defining them.

There’s a lot I love about Kernel (it provides a uniform theory integrating functions, macros and intermediate beasts), but the obsession with hygiene is not one of them. I took a lot of inspiration from Kernel in my Lisp with first-class macros, but I went all the way in the other direction and supported only macros with quasiquote and unquote. You can define symbols in any order in Wart, and override any symbols at any time, including things like if and cons. The only things you can’t override are things that look like punctuation: parens, quote, quasiquote, unquote, unquote-splice, and a special symbol @ for apply, analogous to unquote-splice. Wart is even smart enough to support apply on macros, something Kernel couldn’t do – as long as your macros are defined out of quasiquote and unquote. I find this to be an indirect sign that it gets closer to the essence of macros, by decoupling them into their component pieces like Kernel did, but without complecting them with concerns of hygiene.

(Bel also doesn’t care about hygienic macros and claims to support fully first-class apply on macros. Though I don’t understand how Bel’s macroexpand works, in spite of some effort in that direction.)

It’s easy to write unhygienic macros without quasiquote. Does Kernel also outlaw constructing symbols?
No, looks like page 165 of the Kernel spec does provide string->symbol.

Doesn’t that seem like a big loophole that would make it easy to be unhygienic?
Depends on what you’re protecting against. Macros are fundamentally a convenience. As I understand the dialectic around hygienic macros, the goal is always just to add guardrails to the convenient path, not to make the guardrails mandatory. Most such systems deliberately provide escape hatches for things like anaphoric macros. So I don’t think I’ve ever heard someone say hygiene needs to be an ironclad guarantee.
Honestly I agree with the inclusion of escape hatches if they are unlikely to be hit accidentally; I’m just surprised that the Kernel developers also agree, since they took such a severe move as to disallow quasiquote altogether.
I don’t want to put words in peoples’ mouths, but I’m pretty sure this is the stance of most Racket devs.
Not true, because Scheme’s syntax-rules explicitly provides an escape hatch for literals, which can be used to violate hygiene in a deliberate manner. Racket implements syntax-rules.

On the other hand, you’re absolutely right that they don’t make it easy. I have no idea what to make of anaphoric macros like this one from the anaphoric package.

Racket doesn’t forbid string->symbol either; it just provides it with some type-safe scaffolding called syntax objects. We can definitely agree that makes it more difficult to use. But the ‘loophole’ does continue to exist.

I’m not aware of any macro in Common Lisp that cannot be implemented in Racket (modulo differences in the runtimes like Lisp-1 vs Lisp-2, property lists, etc.). It just gets arbitrarily gnarly.
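To make the escape hatch concrete, here’s a sketch of deliberate hygiene-breaking in Racket via datum->syntax (aif and it are illustrative names, not taken from the anaphoric package):

    #lang racket
    (require (for-syntax racket/base))
    ;; aif binds `it` at the macro's use site on purpose,
    ;; giving it the caller's lexical context.
    (define-syntax (aif stx)
      (syntax-case stx ()
        [(_ test then else)
         (with-syntax ([it (datum->syntax stx 'it)])
           #'(let ([it test])
               (if it then else)))]))

    (aif (assoc 'b '((a 1) (b 2)))
         (cadr it)    ; => 2
         'not-found)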
Thanks for the clarification. I have attempted several times to understand Racket macros but never really succeeded because it’s just so much more complicated compared to the systems I’m familiar with.
Yeah, I’m totally with you. They make it so hard that macros are used a lot less in the Scheme world. If you’re looking to understand macros, I’d recommend a Lisp that’s not a Scheme. I cut my teeth on them using Arc Lisp, which was a great experience even though Arc is a pretty thin veneer over Racket.
Have you read Fear of Macros? Also there is Macros and Languages in Racket which takes a more exercise based approach.
At least twice.
Nowadays when I need a Racket macro I just show up in #racket and say “boy, this sure is easy to write using defmacro, too bad hygienic macros are so confusing” and someone will be like “they’re not confusing! all you have to do is $BLACK_MAGIC” and then boom, I have the macro I need.
Kernel does not avoid unhygienic macros, whereas Scheme R6RS syntax-case makes it more difficult to write unhygienic macros, but still possible. It is possible to write unhygienic code with Kernel, such as defining define-macro without using, or needing, quasiquote et al.

Kernel does not outlaw the quasiquote and unquote semantics. There is $quote, and unquote is merely (eval symbol env), whereas quasiquote is just a reader trick inside Scheme (also see [0]).

Yes and no. Scheme macros, and even CL macros, are meant as a) a hook into the compiler to speed things up, e.g. compose, or Clojure’s =>, or b) a way to change the prefix-based evaluation strategy to build so-called Domain Specific Languages, such as records, e.g. SRFI-9.

Kernel eliminates the need to think “is this a macro or is this a procedure”: instead everything is an operative, and it is up to the interpreter or compiler to figure out what can be compiled (ahead-of-time) or not. That is slightly more general than “everything is a macro”, at least because an operative has access to the dynamic scope.
Based on your comment description, Wart is re-inventing Kernel or something like that (though without a formal description, unlike John Shutt’s work).
re apply for macros: read page 67 at https://ftp.cs.wpi.edu/pub/techreports/pdf/05-07.pdf
[0] https://github.com/cisco/ChezScheme/blob/main/s/syntax.ss#L7644
Page 67 of the Kernel Report says macros don’t need apply because they don’t evaluate their arguments. I think that’s wrong, because macros can evaluate their arguments when unquoted. Indeed, most macro args are evaluated eventually, using unquote, in the caller’s environment. Most of the value of macros lies in selectively turning off eval for just the odd arg. And macros are most of the use of fexprs, as far as I’ve been able to glean.

Yes, that’s the goal. But it doesn’t happen for apply. I kept running into situations where I had to think about whether the variable was a macro. Often, within the body of a higher-order function/macro, I just didn’t know. So the apply restriction spread through my codebase until I figured this out.

I spent some time trying to find a clean example where I use @ on macros in Wart. Unfortunately this capability is baked into Wart so deeply (and Wart is so slow, suffering from the combinatorial explosion of every fexpr-based Lisp) that it’s hard to explain. But Wart provides the capability to cleanly extend even fundamental operations like if and def and mac, and all of these use the higher-order functions on macros deep inside their implementations.

For example, here’s a definition where I override the pre-existing with macro to add new behavior when it’s called with (with table ...): https://github.com/akkartik/wart/blob/main/054table.wart#L54

The backtick syntax it uses there is defined in https://github.com/akkartik/wart/blob/main/047generic.wart, which defines advanced forms for defining functions and macros.

That file overrides this basic definition of mac: https://github.com/akkartik/wart/blob/main/040.wart#L30

Which is defined in terms of mac!: https://github.com/akkartik/wart/blob/main/040.wart#L1

When I remove apply for macros, this definition no longer runs, for reasons I can’t easily describe.

As a simpler example that doesn’t use apply for macros, here’s where I extend the primitive two-branch if to support multiple branches: https://github.com/akkartik/wart/blob/main/045check.wart#L1

I would like to think I reimplemented the core idea of Kernel ($vau) while decoupling it from considerations of hygiene. And fixed apply in the process. Because my solution to apply can’t work in hygienic Kernel.

I’m not making any claim of novelty here. I was very much inspired by the Kernel dissertation. But I found the rest of its language spec.. warty :D
Promoting solely unhygienic macros is similar, as far as I understand, to promoting “formal proofs of code are useless”, or something similar about ACID or any other kind of guarantee a piece of software might provide.
Both Scheme and Kernel offer the ability to bypass the default hygienic behavior, and hence promote, first, a path of least surprise (and fewer hard-to-find bugs), while still allowing the second path (where you’ll probably shoot yourself in the foot at some point).
At least for me, the value of Lisp is in its late-bound nature during the prototyping phase, so usability is the top priority. Compromising usability with more complicated macro syntax (resulting in far fewer people defining macros, as happens in the Scheme world) in exchange for better properties for mature programs seems a poor trade-off. And yes, I don’t use formal methods while prototyping either.
Syntax rules are not much more complicated to use than define-macro, ref: https://www.gnu.org/software/guile/manual/html_node/Syntax-Rules.html
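For comparison, the same trivial macro both ways (a sketch; unless* is an illustrative name, and define-macro is the Guile-style form):

    ;; defmacro style: raw s-expressions, quasiquote, no hygiene
    (define-macro (unless* test . body)
      `(if ,test #f (begin ,@body)))

    ;; syntax-rules style: pattern + template, hygienic by default
    (define-syntax unless*
      (syntax-rules ()
        ((_ test body ...)
         (if test #f (begin body ...)))))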
The only drawback of hygienic macros that I know about is that they are more difficult to implement than define-macro; but then again, I don’t know everything about macros.
ref: https://gitlab.com/nieper/unsyntax/
We’ll have to agree to disagree about syntax-rules. Just elsewhere on this thread there’s someone describing their various unsuccessful attempts to use macros in Scheme. I have had the same experience. It’s not just the syntax of syntax-rules: Scheme is pervasively designed (like Kernel) with hygiene in mind. It makes for a very rigid language, with things like the phase separation rules, that is the antithesis of the sort of “sketching” I like to use Lisp for.

This is probably really out of date now, but there is an implementation of JavaScript in Racket (https://docs.racket-lang.org/javascript/index.html), written by Dave Herman.
Thanks! Added!
In a similar vein, check out JSCert, JS-2-GIL, and KJS. I believe Gillian is the only actively developed semantics….
Amazing! I was getting so few replies with research implementations. Thank you!
I’m genuinely interested in whether that GUI can be used with a framebuffer “backend” on Linux for embedded devices, using only DRM.
Currently it isn’t possible. It would require implementing the base widgets (rendering and input events). Part of an implementation could be simplified by using the existing racket/draw library, which sits on top of Cairo.

Eh, there are some problems with xargs, but this isn’t a good critique. First off, it proposes a “solution” that doesn’t even handle spaces in filenames (much less newlines):
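    ls | sed 's/^/rm /' | bash    # illustrative stand-in for the article's approach; breaks on filenames with spaces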
I prefer this as a practical solution (that handles every char except newlines in filenames):
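    ls | xargs -d $'\n' -- rm    # illustrative; -d $'\n' splits on newlines only, so spaces are fine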
You can also pipe find . -print0 to xargs -0 if you want to handle newlines (untrusted data).
(Although then you have the problem that there’s no grep -0, which is why Oil has QSN. grep still works on QSN, and QSN can represent every string, even those with NULs!)

One nice thing about xargs is that you can preview the commands by adding ‘echo’ on the front:
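    find . -name '*.pyc' | xargs echo rm    # prints the rm commands instead of running them (illustrative)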
That will help get the tokenization right, so you don’t feed the wrong thing into the commands!
I never use xargs -L, and I sometimes use xargs -I {} for simple invocations. But even better than that is using xargs with the $0 dispatch pattern, which I still need to properly write about. Basically, instead of the mini-language of -I {}, just use shell by recursively invoking shell functions. I use this all the time, e.g. all over Oil and elsewhere.
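A sketch of the pattern (myscript.sh and the do_one/do_all names are illustrative):

    #!/bin/bash
    # myscript.sh -- the $0 dispatch pattern
    do_one() {
      echo "work on $1"    # the 'work' function
    }
    do_all() {
      find . -type f | xargs -n 1 -- "$0" do_one
    }
    "$@"    # dispatch: run the function named by the first argument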
Now run with myscript.sh do_all, or myscript.sh do_one to test out the “work” function (very handy! you need to make this work first).

This separates the problem nicely – make it work on one thing, and then figure out which things to run it on. When you combine them, they WILL work, unlike the “sed into bash” solution.
Reading up on what xargs -L does, I have avoided it because it’s a custom mini-language. It says that trailing blanks cause line continuations. That sort of rule is silly to me.
I also avoid -I {} because it’s a custom mini-language. IMO it’s better to just use the shell, and one of these three invocations:

- xargs – when you want words
- xargs -d $'\n' – when you want lines
- xargs -0 – when you want to handle untrusted data (e.g. someone putting a newline in a filename)

Those 3 can be combined with -n 1 or -n 42, and they will do the desired grouping. I’ve never needed anything more than that.

So yes, xargs is weird, but I don’t agree with the author’s suggestions.
sed piped into bash means that you’re manipulating bash code with sed, which is almost impossible to do correctly.

Instead I suggest combining xargs and shell, because xargs works with arguments and not strings. You can make that correct, and reason about what it doesn’t handle (newlines, etc.)
(OK I guess this is a start of a blog post, I also gave a 5 minute presentation 3 years ago about this: http://www.oilshell.org/share/05-24-pres.html)
I use find . -exec very often for running a command on lots of files. Why would you choose to pipe into xargs instead?

It can be much faster (depending on the use case). If you’re trying to rm 100,000 files, you can start one process instead of 100,000 processes! (The max number of args to a process on Linux is something like 131K, as far as I remember.) It’s basically:
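    find . -exec rm {} \;    # one rm process per file (illustrative)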
vs.
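    find . | xargs rm    # one rm per large batch of files (illustrative)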
Here’s a comparison showing that find -exec is slower: https://www.reddit.com/r/ProgrammingLanguages/comments/frhplj/some_syntax_ideas_for_a_shell_please_provide/fm07izj/
Another reference: https://old.reddit.com/r/commandline/comments/45xxv1/why_find_stat_is_much_slower_than_ls/
Good question, I will add this to the hypothetical blog post! :)
@andyc Wouldn’t the find + (rather than ;) option solve this problem too?

Oh yes, it does! I don’t tend to use it, since I use xargs for a bunch of other stuff too, but that will also work. Looks like busybox supports it too, in addition to GNU findutils (I would guess it’s in POSIX).
Time for the other really, really useful feature of xargs. ;)
It’s not a limit on the number of arguments, it’s a limit on the total size of environment variables + command-line arguments (+ some other data; see getauxval(3) on a Linux machine for details). Apparently Linux defaults to a quarter of the stack available to new processes, but it also has a hard limit of 128 KiB on the size of each individual argument (MAX_ARG_STRLEN). There’s also MAX_ARG_STRINGS, which limits the number of arguments, but it’s set to 2³¹-1, so you’ll hit the ~2 MiB limit first.

Needless to say, a lot of these numbers are much smaller on other POSIX systems, like the BSDs or macOS.
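You can check the effective limit on a given machine with getconf:

    getconf ARG_MAX    # e.g. 2097152 (2 MiB) on a typical Linux setup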
find . -exec blah will fork a process for each file, while find . | xargs blah will fork a process per X files (where X is derived from the system-wide argument length limit). The latter can run quite a bit faster. I will typically do find . -name '*.h' | xargs grep SOME_OBSCURE_DEFINE, and depending upon the repo, that might expand to only one grep.

As @jonahx mentions, there is an option for that in find too:
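    find . -name '*.h' -exec grep SOME_OBSCURE_DEFINE {} +    # illustrative; + batches args like xargs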
I didn’t know about the ‘+’ option to find, but I also use xargs with a custom script that scans for source files in a directory (not in sh or bash, as I personally find shell scripting abhorrent).

That is the real beauty of xargs. I didn’t know about using + with find, and while that’s quite useful, remembering it means remembering something that only works with find. In contrast, xargs works with anything that can supply a newline-delimited list of filenames as input.
Yes, this. Even though the original post complains about too many features in xargs, find is truly the worst, with a million options.

This comment was a great article in itself.
Conceptually, I think of xargs primarily as a wrapper that enables tools that don’t support stdin to support stdin. Is this a good way to think about it?
Yes, I’d think of it as an “adapter” between text streams (stdin) and argv arrays. Both of those are essential parts of shell, and you need ways to move back and forth. To move the other way you can simply use echo (or write -- @ARGV in Oil).

Another way I think of it is to mentally replace xargs with the word “each”, as in Ruby, Rust, and some common JS idioms.

You’re basically separating iteration from the logic of what to do on each thing. It’s a special case of a loop.
In a loop, the current iteration can depend on the previous iteration, and sometimes you need that. But in xargs, every iteration is independent, which is good because you can add xargs -P to automatically parallelize it! You can’t do that with a regular loop.

I would like Oil to grow an each builtin that is a cleaned-up xargs, following the guidelines I enumerated. I’ve been wondering if it should be named each and every?

- each – like xargs -n 1, and find -exec foo \; – call a process on each argument
- every – like xargs, and find -exec foo + – call the minimal number of processes, but exhaust all arguments

So something like:
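    find . -name '*.py' | each -- wc -l     # hypothetical: one wc process per file
    find . -name '*.py' | every -- wc -l    # hypothetical: as few wc processes as possible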
If anyone has feedback I’m interested. Or wants to implement it :)
Probably should add this to the hypothetical blog post. Why use xargs instead of a loop?

- You can preview the commands by sticking echo on the beginning of the command. You’re decomposing the logic of which things to iterate on, and what work to do.
- You can parallelize with xargs -P.
- You can filter the arguments with grep. Instead of find | xargs, do find | grep | xargs. This composes very nicely.

Cool. A bit like the old MH mail client system.