1. 2

    Reposting here since I accidentally commented on the 2018 submission… (Good news, the code worked this time!)

    Trying to consider what language to do it in…

    I mostly use Python at work, so it might be good to do it in that to continue to improve in it. I also really enjoy JavaScript, and I used to use Ruby a lot and kind of miss it.

    Or could use this as a way to try to learn something totally new. I’ve been hearing a lot about APL recently, and it’s pretty fascinating.

    Trying to do it in a language I already know and golfing the solutions as hard as I can would also be very fun though.

    1. 4

      APL is fun to learn but if prior AoC puzzles are any indicator, many of them don’t align well with the strengths of APL. That’s not to say I won’t be trying it again this year though! : )

    1. 1

      I’ve been searching for a single-file implementation of ML in Standard ML that I once came across several years ago. I think it was written in the form of an email message or memo between developers of MLton or SML/NJ. Maybe it was Matthew Fluet or Andrew Appel? Has anyone else seen this?

      1. 2

        Appel writes a compiler for Tiger (an ML-derivative) in his Modern Compiler Implementation books, maybe that’s what you’re looking for?

        1. 1

          No, I’m familiar with Appel’s books, but I’m remembering a piece of electronic ephemera: literally just a page of text, covering only a fragment of the ML that Tiger supports.

      1. 4

        Hillel, I wonder if you have seen this project which approaches a similar problem using the tools of Category Theory: https://www.categoricaldata.net/

        1. 3

          I have :)

        1. 3

          Throwing a birthday party for my wife. We are going to try grilling Impossible burger sliders rather than traditional hamburgers.

          1. 4

            There’s a section in the Knuth essay around pages 18-19 where he gives some thought to “n and a half times” iterations which is what I see in your example:

            http://www.cs.sjsu.edu/~mak/CS185C/KnuthStructuredProgrammingGoTo.pdf

            I’m not really aware of a structure in your languages that makes it very simple without some repetition.

            I would probably use one of these patterns

             S; while not B do T; S; done
            
             while true do S; break if B; T; done
            

            where S is your data and T is the separator.
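
            A minimal Rust sketch of the first pattern, assuming S means "emit one item" and T means "emit the separator". The name `join_with` is made up for illustration and is not a standard library function.

            ```rust
            /// Joins `items` with `sep` using the "S; while not B do T; S; done" shape.
            /// `join_with` is a hypothetical helper, not part of std.
            fn join_with(items: &[&str], sep: &str) -> String {
                let mut out = String::new();
                let mut iter = items.iter();

                // S: emit the first item, if there is one.
                if let Some(first) = iter.next() {
                    out.push_str(first);
                }
                // While items remain (not B): emit T, then S.
                for item in iter {
                    out.push_str(sep);
                    out.push_str(item);
                }
                out
            }

            fn main() {
                println!("{}", join_with(&["a", "b", "c"], ", "));
            }
            ```

            In everyday Rust, `items.join(", ")` on a slice of strings does the same job; the sketch only spells out the loop shape.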

            1. 45

              Rust

              TL;DR: It’s a very impressive language with a clear vision and priorities, but the user interface needs a lot of work. Also, the collection library is much better than Scala’s.

              • Generics with <>. It’s 2017 by now, we know it’s a bad idea. One of the reasons why the language suffers from abominations like the “turbofish” operator ::<>.

              • Strings don’t offer indexing, because it doesn’t make sense for UTF-8. Correct! But Strings offer slicing … WAT?

              • Misuse of [] for indexed access. Having both () and [] doing roughly the same thing, especially since [] can be used to do arbitrary things, doesn’t make sense. Pick one, use the other for generics.

              • Inconsistent naming. str and String, Path and PathBuf etc.

              • :: vs. . is kind of unnecessary.

              • Mandatory semicola, but with some exceptions in arbitrary places: struct Foo; vs. struct Foo {}

              • Arbitrary abbreviations all over the place. It’s 2017, your computer won’t run out of memory just because your compiler’s symbol table stores Buffer instead of Buf.

              • Can someone decide on a casing rule for types, please, instead of mixing lowercase and uppercase names? Some types being “primitive” is an incredibly poor excuse.

              • Also, having both CamelCase and methods_with_underscores?

              • Library stutter: std::option::Option, std::result::Result, std::default::Default

              • iter(), iter_mut(), into_iter() … decide prefix or postfix style and stick with it.

              • Coercions do too many things. For instance, they are the default way to convert i32 to i64, instead of just using methods.

              • Also, converting numbers is still broken. For instance, f32 to i32 might result in either an undefined value or undefined behavior. (Forgotten which one it is.)

              • Bitcasting integers to floats is unsafe, because the bits could be a signaling NaN, causing the CPU to raise an FP exception if not disabled.

              • Forward and backward annotations: #[foo] struct Foo {} vs struct Foo { #![foo] }. Also /// for normal documentation, //! for module level documentation. Documentation already uses Markdown, so maybe just let people drop a markdown file in the module dir? That would make documentation much more accessible when browsing through GitHub repositories.

              • Also, documentation can cause compiler errors … that’s especially fun if you just commented a piece of code for testing/prototyping.

              • Type alias misuse: In e.g. io crate: type Result<T> = Result<T, io::Error> … just call it IoResult.

              • Macros are not very good. They are over-used due to the fact that Rust lacks varargs and abused due to the fact that they require special syntax at call-site (some_macro!()).

              • Pattern matching in macros is also weird. x binds some match to a name in “normal” pattern matching, but matches on a literal “x” in “macro pattern matching”.

              • println! and format! are very disappointing given that they use macros.

              • Compiler errors … ugh. So many things. Pet peeve: “Compilation failed due to 2 errors” … 87 compiler errors printed before that.
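
              To make the indexing/slicing point concrete, here is a small sketch. The helper name `slice_bytes` is made up and only wraps `str::get`. Integer indexing is rejected at compile time, while range slicing compiles and panics if a boundary lands inside a character.

              ```rust
              /// Hypothetical helper: byte-range slicing that returns None instead of panicking.
              fn slice_bytes(s: &str, start: usize, end: usize) -> Option<&str> {
                  s.get(start..end)
              }

              fn main() {
                  let s = "héllo";
                  // `s[1]` would not compile: `str` cannot be indexed by an integer.
                  // `&s[1..2]` compiles but panics, because `é` occupies bytes 1..3.
                  println!("{:?}", slice_bytes(s, 0, 1)); // Some("h")
                  println!("{:?}", slice_bytes(s, 1, 2)); // None
                  println!("{:?}", slice_bytes(s, 1, 3)); // Some("é")
              }
              ```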

              1. 8
                • Library stutter: std::option::Option, std::result::Result, std::default::Default
                • Type alias misuse: In e.g. io crate: type Result<T> = Result<T, io::Error> … just call it IoResult.

                How ya gonna square that circle?

                1. 2

                  I think std::io::IoResult would be fine – it would solve the issue of having vastly different Results flying around, while not having single-use namespaces that are only used by one type.

                  1. 2

                    The general pattern is to import io instead. When doing this, IoResult would be jarring.

                    use std::io;

                    // With the module imported, the return type reads as io::Result.
                    fn my_fun() -> io::Result<()> {
                        Ok(())
                    }
                    
                2. 14

                  It’s 2017,

                  I have some news for you, @soc.

                  1. 3

                    Haha, good catch. Now you see how old this list is. :-)

                    The only thing I got to delete since then was “get rid of extern crate”.

                  2. 3

                    What’s your preferred alternative to generics with <>?

                    1. 6

                      [], as it was in Rust before it was changed for “familiarity”.

                      Unlike <>, [] has a track record of not being horribly broken in every language that tried to use it.

                      1. 5

                        How is <> broken?

                        1. 16

                          It complicates parsing due to shift and comparison operators.

                          1. 2

                            Ah, yeah, that makes sense.

                          2. 19

                            Pretty much no language has ever managed to parse <> without making the language worse. The flaws are inherent in its design, as a compiler author you can only pick where you place the badness; either:

                            • Add additional syntax to disambiguate (like ::<> vs. <> in Rust).
                            • Have weird syntax to disambiguate (like instance.<Foo>method(arg1, arg2) in Java).
                            • Read a potentially unlimited amount of tokens during parsing, then go back and fix the parse tree (like in C#).
                            • etc.

                            In comparison, here are the issues with using [] for generics:

                            • None.

                            For newly created languages (unlike C++, which had to shoehorn templates/generics into the existing C syntax) it’s a completely unnecessary, self-inflicted wound to use <> for generics.

                            More words here: Why is [] better than <> for generic types?
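
                            A short Rust sketch of the first bullet, for anyone who has not run into it. `parse_both` is a made-up name; the two `parse` calls are ordinary std.

                            ```rust
                            /// Parses the same string twice: once with the turbofish, once with a type annotation.
                            fn parse_both(s: &str) -> (i32, i32) {
                                // Expression position: `s.parse<i32>()` would read as the comparisons
                                // `(s.parse < i32) > ()`, so the `::<>` form is required.
                                let a = s.parse::<i32>().unwrap();
                                // Type position: no ambiguity, plain `<>` (or inference) is enough.
                                let b: i32 = s.parse().unwrap();
                                (a, b)
                            }

                            fn main() {
                                let v = Vec::<u8>::with_capacity(4); // turbofish on a path
                                println!("{:?} {}", parse_both("42"), v.capacity() >= 4);
                            }
                            ```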

                            1. 2

                              Those are good reasons to not use <>, but as a Haskeller I personally find either style somewhat noisy. I’d rather just write something like Option Int. Parentheses can be used for grouping if needed, just like with ordinary expressions.

                              1. 2

                                Haskell feels like it is in the same category as D, they both just kicked the can a tiny bit further down the road:

                                Both need (), except for a limited special-case.

                                1. -1

                                  I don’t see how Haskell kicked the can down the road. The unit type is useful in any language. Rust has a unit type () just like Haskell. Scala has it too.

                                  I’m not sure what “special case” you are referring to.

                                  1. 2

                                    The fact that you still need () for grouping types in generics as soon as you leave the realm of toy examples – just as it is in D.

                                    (Not sure what the comment on the unit type is about…)

                                    1. 4

                                      Ah, I understand what you’re saying now. But that’s already true for expressions at the value level in most programming languages, so personally I find it cleaner to use the same grouping mechanism for types (which are also a form of expressions). This is especially applicable in dependently typed languages where terms and types are actually part of the same language and can be freely mixed.

                                      However, I can also appreciate your argument for languages with a clear syntactic distinction between value expressions and type expressions.

                        2. 1

                          D’s use of !() works pretty well. It emphasizes that compile-time parameters aren’t all that crazy different than ordinary runtime parameters.

                          1. 1

                            I prefer my type arguments to be cleanly separated from value arguments (languages that fuse them excepted).

                            I find D’s approach slightly ugly, especially the special-cases added to it.

                            1. 1

                              I prefer my type arguments to be cleanly separated from value arguments

                              Well, in D they aren’t type vs value arguments, since you can pass values (and symbol aliases) as compile-time arguments as well. That’s part of why I like it using such similar syntax, since it isn’t as restricted as typical type generics.

                              I find D’s approach slightly ugly, especially the special-cases added to it.

                              The one special case is that you can omit the parentheses for a single-token CT argument list. I actually thought I’d hate it when it was first proposed, and I voted against it… but now that it is there and I have used it, I actually like it a lot.

                              Sure does lead to a lot of first-timer questions on the help forums though… it certainly isn’t like any other language I know of.

                        3. 2

                          Also, converting numbers is still broken. For instance, f32 to i32 might result in either an undefined value or undefined behavior. (Forgotten which one it is.)

                          Yeah, I kind of feel the same way. Even with try_from() dealing with number conversions is a pain in Rust.
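
                          For reference, a sketch of how the conversions behave on a reasonably recent toolchain. I am assuming Rust 1.45 or later, where float-to-int `as` casts saturate instead of being undefined. `narrow` and `float_to_int` are made-up helper names.

                          ```rust
                          use std::convert::TryFrom;

                          /// Narrowing integer conversion: fails instead of wrapping.
                          fn narrow(x: i64) -> Option<i32> {
                              i32::try_from(x).ok()
                          }

                          /// Float-to-int `as` cast: truncates toward zero, saturates at the bounds,
                          /// and maps NaN to 0 (behaviour defined since Rust 1.45).
                          fn float_to_int(x: f32) -> i32 {
                              x as i32
                          }

                          fn main() {
                              println!("{}", i64::from(7_i32)); // lossless widening via From
                              println!("{:?}", narrow(i64::MAX));
                              println!("{}", float_to_int(1e20));
                              println!("{}", float_to_int(f32::NAN));
                          }
                          ```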

                          1. 1

                            You saved me a lot of typing. 100% agree.

                            1. 1

                              Thanks! I’d love to know the reason why someone else voted it down as “troll” – not because I’m salty, but because I’m genuinely interested.

                            2. 1

                              2 pains I have with Rust right now:

                              • I would like to be able to connect to a database (Teradata specifically)
                              • I want to launch a subprocess with other than the default 3 stdio descriptors (e.g. exec $CMD $FD<>$PIPE in sh)
                              1. 2

                                I know it’s technically unsafe and that might preclude it from your use, but does CommandExt::pre_exec not fit your bill?

                                1. 2

                                  That could work. I’m still new to Rust so I haven’t fully explored the stdlib.

                              2. 1

                                Ahahaha, this is a great list. I’m curious about a couple things though, since you’ve obviously put a lot of thought into it…

                                Bitcasting integers to floats is unsafe, because the bits could be a signaling NaN, causing the CPU to raise an FP exception if not disabled.

                                The docs for f32::from_bits() and such talk about precisely this, but considering the misdesign of signaling NaNs, I really don’t see how it could possibly be made better. Any ideas?

                                …They are over-used due to the fact that Rust lacks varargs…

                                What little experience I have with programming language design makes me feel like varargs are a hard problem to deal with in a type-safe language, at least if you want to allow different types for the args (instead of, say, forcing them all to be what Rust would call &dyn Display or something). Do you know of any language which does it Right?
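
                                To show what I mean by the same-typed fallback, here is a rough sketch where the "varargs" are just a slice of trait objects. `join_all` is a made-up name, and this is only one possible shape, not how `format!` works internally.

                                ```rust
                                use std::fmt::Display;

                                /// Hypothetical helper: "varargs" modelled as a slice of trait objects.
                                fn join_all(args: &[&dyn Display]) -> String {
                                    args.iter()
                                        .map(|a| a.to_string())
                                        .collect::<Vec<_>>()
                                        .join(" ")
                                }

                                fn main() {
                                    // Each argument is coerced to `&dyn Display`; the concrete types are erased.
                                    println!("{}", join_all(&[&1, &"two", &3.5]));
                                }
                                ```

                                The call site still needs the `&[...]` wrapper and a `&` per argument, which is part of why macros end up being used instead.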

                                1. 1

                                  The docs for f32::from_bits() and such talk about precisely this, but considering the misdesign of signaling NaNs, I really don’t see how it could possibly be made better. Any ideas?

                                  Rust could have disabled the trapping of signaling NaN’s on start up, but I think Rust fell into the same design mistake of C:

                                  Scared of making the use-case of the 0.01% (people who want signaling NaN’s to trap) harder to achieve, they made life worse for the 99.99%.
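
                                  For readers who have not looked at the bit patterns, a small sketch. It assumes the common IEEE 754-2008 encoding, where a clear top mantissa bit marks a NaN as signaling. `is_signaling_nan_bits` is a made-up helper. In current std, `f32::from_bits` is an ordinary safe function.

                                  ```rust
                                  /// Classifies a bit pattern, assuming the common encoding where a clear
                                  /// top mantissa bit (bit 22) marks a NaN as signaling.
                                  fn is_signaling_nan_bits(bits: u32) -> bool {
                                      let exponent_all_ones = (bits & 0x7F80_0000) == 0x7F80_0000;
                                      let mantissa = bits & 0x007F_FFFF;
                                      let quiet_bit = (bits & 0x0040_0000) != 0;
                                      exponent_all_ones && mantissa != 0 && !quiet_bit
                                  }

                                  fn main() {
                                      let bits = 0x7FA0_0000_u32;
                                      let x = f32::from_bits(bits); // no `unsafe` needed
                                      println!("{} {}", x.is_nan(), is_signaling_nan_bits(bits));
                                  }
                                  ```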

                                  varargs are a hard problem to deal …

                                  Agreed, it’s safe to say that language designers hate them. :-)

                                  … at least if you want to allow different types for the args

                                  I think this is only partially the reason. You can still have only same-typed varargs at runtime, but allow recovering the individual types of the arguments in macro calls – which is exactly the case for format! and friends.

                                  Do you know of any language which does it Right?

                                  I think in the case of format strings, focusing on varargs is the wrong approach. If you imagine what you want an ideal API to look like, you probably want to interpolate things directly inside the string, never having to go through the indirection of some vararg method.

                                  Instead of having the formatting parameters in one place, and the to-be-interpolated values in a different one, like in …

                                  let carl = "Carl";
                                  let num = 1.234567;
                                  format!("{}'s number is {:.*}, rounded a bit", carl, 2, num)
                                  // -> "Carl's number is 1.23, rounded a bit"
                                  

                                  … wouldn’t it be much nicer to write (this is Scala):

                                  val carl = "Carl"
                                  val num = 1.234567
                                  f"$carl's num is $num%.2f, rounded a bit"
                                  // -> "Carl's num is 1.23, rounded a bit"
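
                                  For comparison, newer Rust toolchains (1.58 onward, as far as I know) accept identifiers directly inside the braces, which gets part of the way there. Only plain identifiers are captured, not arbitrary expressions. `describe` is a made-up function name.

                                  ```rust
                                  /// Formats with captured identifiers instead of positional arguments.
                                  fn describe(carl: &str, num: f64) -> String {
                                      // `carl` and `num` are taken from the enclosing scope by name.
                                      format!("{carl}'s number is {num:.2}, rounded a bit")
                                  }

                                  fn main() {
                                      println!("{}", describe("Carl", 1.234567));
                                  }
                                  ```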
                                  
                                  1. 1

                                    Julia has nice string interpolation too. I honestly don’t understand why more programming languages don’t have it. Does everyone just forget how useful it is in bash when they come to design their language?

                              1. 21

                                I worked with a smart data contractor who loved and was good at SQL and who quit over trying to work with a large aggregation query in SQLAlchemy.

                                Maybe don’t be dogmatic and try not to fight with your tools. SQLAlchemy can execute literal SQL just fine if that’s what you feel you need to get the job done.

                                testing on sqlite when you target postgres is risky

                                This is convenient but not really the point of being vendor-agnostic. I’ve used SQLAlchemy a few times to swap vendors, not from testing to production, but from one production instance to another. No problem.

                                It makes any type-checking or linting impossible.

                                Are you telling me you have type-checking and linting for SQL as literal strings?

                                Migration is a hard problem.

                                Migrations with Alembic are pleasantly easy.

                                1. 8

                                  Are you telling me you have type-checking and linting for SQL as literal strings?

                                  If you look at the author’s tool (linked in the article) it seems like he’s going the direction of using the SQL as the source of truth, but not just as strings – the tooling is deriving migrations and object models from that, rather than the other way around. It seems like it could be a viable way of doing things.

                                  1. 4

                                    I looked at it and it may be viable at some point in the future, but it’s woefully underpowered in its current state. Checking off all the items on the list of what doesn’t work still won’t reach parity with the current function of Alembic.

                                    Edit: tying migrations to git history seems fraught, given “anything that messes with the git history (like a rebase) is deeply confusing to this tool and will result in bad migrations.”

                                    1. 3

                                      Agreed, it’s far from complete or ready for real use. But the fundamental model seems to hold some promise…particularly for shops where there is already a lot of SQL expertise and the whole “hide the database bits from the programmer” approach of many more standard ORMs doesn’t make as much sense.

                                      Edit: 100% agreed on the git part…it felt kind of icky even before that comment, and doesn’t sound like it’s going to handle the reality of how repos evolve.

                                      1. 3

                                        It would be better to keep the schema history in the source code. I’m not a database expert by any means, but I have the feeling this must have been done already…

                                        1. 3

                                          Alembic does this. Revisions to the data object model are in git history, but the migrations are represented as a DAG of scripts all available at the HEAD of the repo.
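
                                          A toy model of the idea, not Alembic's actual code: each script carries its own id and the id it builds on (loosely modeled on the `revision` / `down_revision` fields in Alembic scripts), so the upgrade order can be recovered from the files at HEAD alone. The `Revision` type and `upgrade_path` function are hypothetical, and the sketch handles a single linear chain only, with no branches or cycle detection.

                                          ```rust
                                          use std::collections::HashMap;

                                          /// Hypothetical model of a migration script: an id plus the id it builds on.
                                          struct Revision {
                                              id: &'static str,
                                              down: Option<&'static str>,
                                          }

                                          /// Walks from `head` back to the base and returns ids in upgrade order.
                                          fn upgrade_path(revs: &[Revision], head: &str) -> Vec<&'static str> {
                                              let by_id: HashMap<&str, &Revision> = revs.iter().map(|r| (r.id, r)).collect();
                                              let mut path = Vec::new();
                                              let mut cur = by_id.get(head).copied();
                                              while let Some(rev) = cur {
                                                  path.push(rev.id);
                                                  cur = rev.down.and_then(|d| by_id.get(d).copied());
                                              }
                                              path.reverse();
                                              path
                                          }

                                          fn main() {
                                              let revs = [
                                                  Revision { id: "c3", down: Some("b2") },
                                                  Revision { id: "a1", down: None },
                                                  Revision { id: "b2", down: Some("a1") },
                                              ];
                                              println!("{:?}", upgrade_path(&revs, "c3"));
                                          }
                                          ```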

                                        2. 2

                                          SQL Server has support for automatic schema migration using DACPACs, which generate a migration script based on definitions from source code (e.g. a git repo) and comparing that to a given database. There’s a tool called skeema that does a similar thing for MySQL, although it is not nearly as full-featured.

                                          Since schema drift is a thing, I think this approach makes more sense than generating migrations from git history alone.

                                          1. 2

                                            Only problem now is that you’re stuck using SQL Server =^.^=

                                        3. 2

                                          It says it’s a beta. If your repo is dedicated to SQL and you don’t allow for history changes then it seems okay? Not very practical once a security key ends up in that repo and needs wiped, but here we are :D

                                          I am curious how they intend to support reverse migrations, though.

                                      2. 2

                                        I’ve used SQLAlchemy a few times to swap vendors, not from testing to production, but from one production instance to another. No problem. [emphasis mine]

                                        Wow. That surprises me. We sometimes can’t even change versions within the same vendor without reading the SQLAlchemy docs closely and thinking ahead a little bit.

                                        Migrations with Alembic are pleasantly easy

                                        Unless your app uses multiple databases… Maybe it is better now, but man was it unpleasant last time I tried to do it – sqlalchemy itself wasn’t /too/ bad, but the intersection of it and alembic and a declarative (not reflected) schema was painful.

                                      1. 45

                                        Things like this are why I don’t trust Martin’s opinions. He didn’t say a single bad thing about Clojure, he didn’t have any nuance, he doesn’t respect the other viewpoints. He does the same thing with TDD and clean code, where it’s impossible for them to be the wrong tool. He’s been programming for five decades and still thinks in silver bullets.

                                        For the record, his Clojure example is even shorter in J. It’s just *: i. 25.

                                        1. 10

                                          His example is shorter even in Clojure:

                                          (map sqr (range 25))

                                          but that misses the point. Both Clojure examples are easy to explain and understand, whereas in J it is not obvious what *: and i. stand for, or how they should be changed should we want to compute something different. But even that is not the point.

                                          The point is that Uncle Bob is writing about his own experience with the language that he finds fascinating. He writes about his experience and he gets to choose how he writes it. If anyone disagrees (plenty of people do, I suppose) they are very well entitled to write about their experience themselves.

                                          1. 19

                                            I don’t want to sound like an asshole, but what exactly is his experience besides teaching and writing books? ’Cause we see so many people advocating for a specific language/technology without any substantial real-world experience.

                                            1. 2

                                              As professional advocates go, he’s well known and (at least by me) well regarded.

                                              A professional advocate advocating for something is a signal too… and a lot of the things he was advocating 25 years ago are still relevant today.

                                              http://web.archive.org/web/20000310234010/http://objectmentor.com/base.asp?id=42

                                              1. 10

                                                A professional advocate advocating for something is a signal too

                                                Yes, it’s called Appeal to Authority.

                                                I’m also not convinced he’s much of an authority. I’d say he’s a zealot. His tirades against types are tired. His odes to discipline are masturbatory. His analogies… well… This is the same guy who said C++ is a “man’s language” and that you need big balls to write it.

                                                1. 4

                                                  His analogies… well… This is the same guy who said C++ is a “man’s language” and that you need big balls to write it.

                                                  This is called an ad hominem. If you’re going to be a stickler about logical fallacies I’m surprised that you can’t even make it a few sentences without contradicting yourself. Are they important or not?

                                                  A professional advocate advocating for something is a signal too

                                                  This is called inductive reasoning. Given some evidence, such as a well-regarded professional advocating for some tool, we can try to generalize that evidence, and decide the tool has a good chance of being useful. You’ve surely heard of Bayesian probability; signals exist and they’re noisy and often incorrect but minding them is necessary if you want to make any sense of the world around you.

                                                  Yes, it’s called Appeal to Authority.

                                                  Logical fallacies only really apply when you’re working in the world of what’s called deductive reasoning. Starting from some premises which are assumed to be true, and moving forward using only techniques which are known to be sound, we can reach conclusions which are definitely true (again, assuming the premises). In this context, the one of deductive reasoning, appeal to authority is distinctly unsound and yet quite common, so it’s been given a nice name and we try to avoid it.

                                                  Tying it all together, the parent is saying something like “here’s some evidence”, and you’re interjecting with “evidence isn’t proof”. Great, everybody already knew that wasn’t proof, all that we’ve really learned from your comment is that you’re kind of rude.

                                                  1. 4

                                                    Fallacies can apply to inductive arguments too, but you are right in that there’s an important distinction between the two types and how they differ. I would say that the comment you’re replying to is referring to the idea of informal fallacies in the more non-academic context. The Stanford encyclopedia has a good in-depth page about the term.

                                                    Also, not all fallacies are equal, appeal to authority may be seen as worse than ad hominem these days.

                                                    1. 5

                                                      This thread started with, “Things like this are why I don’t trust Martin’s opinions.” Uncle Bob’s star power (or notoriety), and whether that qualifies as social proof or condemnation, is the point of the discussion, not a distraction.

                                            2. 11

                                              The point is that Uncle Bob is writing about his own experience with the language that he finds fascinating. He writes about his experience and he gets to choose how he writes it.

                                              I wouldn’t be complaining if he was just sharing a language he liked. The problem is he’s pushing clojure as the best language for (almost) everything. Every language has tradeoffs. We need to know those to make an informed decision. Not only is he not telling us the tradeoffs, he’s saying there aren’t any! He’s either naïve or disingenuous, so why should we trust his pitch?

                                              1. 1

                                                The problem is he’s pushing clojure as the best language for (almost) everything.

                                                That’s not what he said though. The closest he came to that is:

                                                Building large systems in Clojure is just simpler and easier than in any other language I’ve used.

                                                Note the qualification: ‘… than any other language I’ve used’. This implies there may well be languages which are easier for building large systems. He just hasn’t used them.

                                                Not only is he not telling us the tradeoffs, he’s saying there aren’t any!

                                                He repeated, three times for emphasis, that it doesn’t have static typing. And that it doesn’t give you C-level performance.

                                                1. 7

                                                  Note the qualification: ‘… than any other language I’ve used’. This implies there may well be languages which are easier for building large systems. He just hasn’t used them.

                                                  We need to consider the connotations and broader context here. He frames the post with

                                                  I’ve programmed systems in many different languages; from assembler to Java. I’ve written programs in binary machine language. I’ve written applications in Fortran, COBOL, PL/1, C, Pascal, C++, Java, Lua, Smalltalk, Logo, and dozens of other languages. […] Over the last 5 decades, I’ve used a LOT of different languages.

                                                  He doesn’t directly say it, but he’s really strongly implying that he’s seen enough languages to make a universal judgement. So “than any other language I’ve used” has to be seen in that context.

                                                  Nor does he allow special cases. Things like

                                                  But what about Javascript? ClojureScript compiles right down to Javascript and runs in the browser just fine.

                                                  Strongly connoting that “I’m writing frontend code for the web” is not a good enough reason to avoid Clojure, and he brushes off the lack of “C-level performance” with

                                                  But isn’t it slow? … 99.9% of the software we write nowadays has no need of nanosecond performance.

                                                  If Clojure is not the best choice for only 0.1% of software, or even 5% of software, that’s pretty darn close to “best language for (almost) everything.”

                                                  He repeated, three times for emphasis, that it doesn’t have static typing.

                                                  He repeats it as if the reader is hung up on that objection, and not listening to him in dismissing it. Note the increasing number of exclamations he uses each time. And he ends with

                                                  OK, I get it. You like static typing. Fine. You use a nice statically typed language, and I’ll use Clojure. And I’ll be in Scotland before ye.

                                                  Combined with his other posts (see “The Dark Path”), he doesn’t see static typing as a drawback. We can infer it as a drawback, but he thinks we’d be totally wrong in doing so.

                                              2. 3

                                                You have to explain both examples for them to make sense. What does map do? How do you change sqr out for a different function? If you learn the purpose of the snippet, or the semantics of each of the individual elements, you can understand either the J or Clojure example just as well as the other (if your understanding of both languages is equal).

                                                Also the meat of the article is trying to convince the reader to use Clojure (by explaining the syntax and semantics, comparing its syntax to two of the big 5 languages, and rebutting a bunch of strawman arguments - nothing particularly in-depth). I don’t see a balance of pros and cons that would be in a true account of an experience learning and using the language, including more than just a bullet point on the ecosystem, tooling, optimisation, community, etc.

                                                1. 3

                                                  I am sure that any programmer that has any experience in any language would guess that you change sqr out for a different function by typing the name of that other function. For example, you compute exp instead of sqr by, well, typing “exp” instead of “sqr”.

                                                  The same with map. Of course that someone has to know what particular function does to be able to use it effectively. The thing with Clojure (and other Lisps) is that it is enough to know that. You don’t need special case syntax rules. Any expression that has pretty much complex semantics is easy to write following a few basic rules.

                                                  1. 5

                                                    I understand the benefits of the uniformity of Lisp, but my point was just that you can’t really say that (map sqr (range 25)) is any more or less understandable than *: i. 25 if you know the purpose of the expressions and the semantics of their constituent parts. And given that knowledge, you can reasonably make substitutions like exp for sqr or ^: for *: (though I would end up consulting a manual for the exact spelling).

                                                    Further experimentation would require more knowledge of either language. For instance, why if isn’t a function in Clojure, or why lists don’t have delimiters in J. It’s all apples and oranges at this superficial level.

                                                2. 1

                                                  My version of Clojure doesn’t define sqr—is that built in?

                                                  That aside, I don’t find either version very easy to explain to someone who isn’t already experienced with functional programming. What does “map” mean? How does it make sense that it takes a function as an argument? These seem obvious once you’ve internalized them, but aren’t easy to understand from scratch at all.

                                                  If I were reviewing this code, I would suggest they write (for [x (range 25)] (* x x))
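
                                                  For comparison, the same two styles exist in Python. A minimal sketch, where sqr is a hypothetical helper (it is not built into Python either):

```python
def sqr(x):
    """Hypothetical helper: square a number."""
    return x * x


# Higher-order style, analogous to (map sqr (range 25)).
squares_map = list(map(sqr, range(25)))

# Comprehension style, analogous to (for [x (range 25)] (* x x)).
squares_for = [x * x for x in range(25)]

print(squares_map == squares_for)  # True
print(squares_for[:5])             # [0, 1, 4, 9, 16]
```

                                                  The comprehension spells out the element variable, which is why it tends to be easier to explain to someone who has not met higher-order functions yet.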

                                                  1. 4

                                                    Of course that one has to understand the semantics of what they’re doing. But, in Clojure, and Lisps it is enough to understand the semantics, while in most other languages, one has to additionally master many syntax rules for special cases.

                                                    1. 5

                                                      Clojure has quite a lot of special syntax compared to many Lisps. For example, data type literals and other reader macros like literal lambdas, def forms, let forms, if forms, and other syntax macros like -> are all built in. Each of these has its own special rules for syntax and semantics.

                                                      1. 2

                                                        We’re on the same page I think, except that I think knowledge of semantics should be enough to understand any language. If you see a verb and a noun in close proximity, you’d be able to make a good guess as to what’s happening regardless of the glyphs representing their relationship on screen.

                                                        1. 5

                                                          If you want a language that emphasizes semantics over syntax, then APL is the language for you! There are just a few things to understand about syntax, in order of importance.

                                                          1. Array literals
                                                            • Numeric arrays are just numbers separated by spaces. Negative numbers are prefixed with ¯. Some dialects have special-case syntax for complex or rational numbers: 42 3.14 1J¯4
                                                            • Character arrays are just text delimited by '' quotes. Doubling the quote inside an array escapes it: 'is' or 'isn''t'
                                                          2. Array indexing with [] brackets: 'cafe'[3 2 1 4] ←→ 'face' (Many APLers have a disdain for this form because it has some inconsistency with the rest of the language.)
                                                          3. Function definitions
                                                            • Inline anonymous “direct” functions delimited by {} braces.
                                                            • Traditional named functions defined by the ∇ form.
                                                          4. Statement sequencing with ⋄ (mainly useful for jamming more code into a single line)

                                                          From there, the grammatical rules are simple and natural in the form of verb noun or noun verb noun or verb adverb noun etc. Probably the most difficult thing to learn and remember is that there is no operator precedence and evaluation reduces from right-to-left.

                                                          When I’m programming in APL, I rarely think about the syntax. When I’m programming in Clojure, syntax is often a concern. Should I use map or for? Should I nest these function calls or use ->?

                                                          1. 2

                                                            When I’m programming in Clojure, syntax is often a concern. Should I use map or for? Should I nest these function calls or use ->?

                                                            None of those are syntax. map is a function and the rest are macros. They’re all inside the existing Clojure syntax.

                                                            1. 1

                                                              Macros can be used to define syntactic constructs which would require primitives or built-in support in other languages. [my emphasis]

                                                              https://clojure.org/reference/macros

                                                              1. 2

                                                                True enough. However, at least in Clojure, macros are pretty deliberately limited so as not to allow drastically changing the look-and-feel of the language. So I’m pretty sure every macro you’ll come across (except I guess reader macros) will have the same base syntax, (a b ...).

                                                1. 3

                                                  Company: Apple
                                                  Company site: apple.com
                                                  Position(s): Data Science Engineer
                                                  Location: Austin TX

                                                  Description:

                                                  We are a software engineering team working in close partnership with data scientists. We build tools and systems to protect Apple and our customers from fraud, waste, and abuse. We are one of very few organizations that has access to and influence over nearly every aspect of Apple’s businesses, from manufacturing & supply, direct & channel sales, warranty & customer support, and services.

                                                  We are a rapidly-growing team looking to hire talented programmers at all skill levels. We use a variety of technologies, predominately written in Python, Java, and Clojure, interacting with massive-scale data on Teradata and Hadoop.

                                                  Our team is mainly based in Austin, but we are open to hiring at other Apple sites such as Cupertino, Singapore, and Shanghai.

                                                  https://jobs.apple.com/en-us/details/200080596/data-science-engineer?team=MLAI

                                                  Contact: Apply online at the link above and email me [my username] [at] apple.com

                                                  1. 4

                                                    The only good config file format is a SQLite database

                                                    1. 7

                                                      With some YAML in it!

                                                        1. 1

                                                          How do you feel about a tree of text files, each corresponding to a config key?

                                                          How many keys are in your config file? 100? 1000? 10,000?

                                                          Even 10,000 text files in a tree of subdirectories is not that unmanageable.

                                                          You can store them in a repo, and be able to immediately see what’s changed without even doing a diff.

                                                          Also one of the most accessible formats as far as tooling.

                                                          Easy to write GetConfig() and SetConfig() for, and performs well with basic caching (static hash in these functions.)

                                                          Did I miss anything?

                                                          Oh yeah, defaults. I have an imperfect solution to defaults that I use in my current project. There is a default/ directory, which contains all the default settings in the same format.

                                                          Example: default/interface/voting/enable_checkboxes

                                                          The first time GetConfig() is called on this value, if it is not present in config/interface/voting/enable_checkboxes, the value from default/ is copied over.

                                                          This also allows me to have a test which checks for orphaned default settings (if they’re present in default/ but not in config/ after test script.)
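
                                                          A minimal Python sketch of that scheme, not the code from my project. The config/ and default/ directory names come from the description above; the snake_case function names, the cache layout, and the KeyError on a missing default are assumptions made for illustration:

```python
import shutil
from pathlib import Path

CONFIG_ROOT = Path("config")    # live settings, one file per key
DEFAULT_ROOT = Path("default")  # shipped defaults, same layout

_cache = {}  # simple in-process cache: (root, key) -> value


def get_config(key, config_root=CONFIG_ROOT, default_root=DEFAULT_ROOT):
    """Return the value for a key such as 'interface/voting/enable_checkboxes'."""
    cache_key = (str(config_root), key)
    if cache_key in _cache:
        return _cache[cache_key]
    path = config_root / key
    if not path.exists():
        # First access: copy the default over so config/ records what was used.
        default_path = default_root / key
        if not default_path.exists():
            raise KeyError(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(default_path, path)
    value = path.read_text().strip()
    _cache[cache_key] = value
    return value


def set_config(key, value, config_root=CONFIG_ROOT):
    """Write the value to the key's file and refresh the cache."""
    path = config_root / key
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f"{value}\n")
    _cache[(str(config_root), key)] = str(value)
```

                                                          The orphan check then reduces to walking default/ after a test run and reporting any key that has no counterpart under config/.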

                                                            1. 2

                                                              Hey I first thought you were kidding but that’s pretty much how /proc works on Linux!

                                                              1. 2

                                                                DJB config? at least qmail does something similar

                                                            1. 2

                                                              J has a very ‘mathy’ coding style. Where by mathy I mean:

                                                              • no loops
                                                              • no if statement
                                                              • no allocations
                                                              • no ‘technical initializations’

                                                              in the main body of the algorithms.

                                                              By comparison, this is a Python with pandas implementation of k-means (albeit slightly more complex): https://github.com/jackmaney/k-means-plus-plus-pandas/blob/master/k_means_plus_plus/cluster.py

                                                              My feeling (without substantive experience, though) is that, with the explosion of linear algebra code due to machine learning, the APL family of languages was best suited to express what’s written in books with LaTeX.

                                                              But likely a number of historical, commercial, and usability/accessibility reasons made them much less accessible/attractive.

                                                              Also the current trend toward compile-time expressions, monads hiding data type intricacies and their interaction with external functions, and lambda expressions – combined, are giving us the ‘mathiness’ of what’s already available in J.

                                                              1. 2

                                                                I sometimes use APL as a modeling language for what I will ultimately implement in Python with numpy or pandas. It helps me to frame the problem in terms of data, representation, and notation rather than functions and classes.

                                                                Many of the APL primitives have equivalents in numpy routines, so once I have a satisfactory APL solution, porting it to Python is pretty simple. There are a few gaps though, like the power operator used in k-means. The numpy API is so large it can be hard to even know whether there’s an equivalent to some APL primitive.

                                                                This isn’t an exorbitant cost though. The missing APL function can usually be written in just a couple lines of Python code.
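
                                                                As a rough sketch of what I mean, here is one way to write a stand-in for the power operator. The name power and the convention for the stopping argument are my own assumptions, not a numpy API; the example uses scalars from the math module, but the same function works on arrays if you pass a suitable comparison such as numpy's array_equal:

```python
import math


def power(f, stop):
    """Return a function that applies f repeatedly.

    If stop is an int, f is applied that many times. Otherwise stop(new, old)
    is called after each step and iteration halts when it returns true.
    """
    def run(x):
        if isinstance(stop, int):
            for _ in range(stop):
                x = f(x)
            return x
        while True:
            new = f(x)
            if stop(new, x):
                return new
            x = new
    return run


doubled = power(lambda v: 2 * v, 3)(1)            # apply three times: 8
fixed_point = power(math.cos, math.isclose)(1.0)  # iterate cos until it settles
print(doubled, round(fixed_point, 4))
```

                                                                For k-means, f would be one assign-and-recompute step on the centroids and stop would test whether they moved.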

                                                                1. 1

                                                                  The downside to using RANK/DENSE_RANK/ROW_NUMBER is that they don’t work with group by. As your example below demonstrates, you need to use more than one query and join the results. Another downside, which is probably more significant, is that rank etc. requires a sort. Using max(array[]) keeps a “running max” in memory and doesn’t need any type of sort.

                                                                  1. 1

                                                                    I don’t understand what you mean by saying they don’t work with group by. Window functions have PARTITION BY syntax for evaluating over groups.

                                                                    You are definitely right that performance could be improved on huge datasets by using your array trick. I would still prefer the window functions unless this query is a prominent bottleneck.
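
                                                                    To make the performance difference concrete outside of SQL, here is a small Python sketch of the two strategies. The Row schema and sample data are made up, and I am assuming the array trick amounts to taking the max of (score, payload) pairs:

```python
from collections import namedtuple

Row = namedtuple("Row", "group score name")  # hypothetical sample schema

rows = [
    Row("a", 3, "x"), Row("a", 7, "y"),
    Row("b", 5, "z"), Row("b", 2, "w"),
]


def top_by_rank(rows):
    """Window-function style: sort within each group, keep the rank-1 row."""
    best = {}
    for row in sorted(rows, key=lambda r: (r.group, -r.score)):
        best.setdefault(row.group, row.name)  # first row seen per group is rank 1
    return best


def top_by_running_max(rows):
    """Array-trick style: one pass, keeping the largest (score, name) pair."""
    best = {}
    for row in rows:
        candidate = (row.score, row.name)
        if row.group not in best or candidate > best[row.group]:
            best[row.group] = candidate
    return {group: name for group, (_, name) in best.items()}


print(top_by_rank(rows) == top_by_running_max(rows))  # True for this data
```

                                                                    The first version pays for a sort of the whole input; the second only holds one pair per group in memory. The two can disagree when scores tie, since they break ties differently.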

                                                                    1. 1

                                                                      Window functions cannot be used with group by. What you did was to combine the window function with DISTINCT ON, so you essentially calculate the aggregates for the entire set, and then take only the first row.

                                                                      This is different from how a group by works, in the sense that group by “reduces” rows to groups while computing the aggregates. Window functions operate on the entire set (without DISTINCT ON you would get duplicate rows).

                                                                      1. 1

                                                                        I’m sorry, now I’m really confused. I did not write DISTINCT ON.

                                                                        1. 1

                                                                          It is a unique way of phrasing it, but if I were to guess, he’s saying: “What you [must have] did [at some point] was to combine the window function with DISTINCT ON [which while similar, has important differences]”

                                                                  1. 3

                                                                    The correct answer is undeniably 4 2

                                                                    https://tryapl.org/?a=8%F72%282+2%29&run
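
                                                                    For anyone who doesn’t read APL, here is roughly what the interpreter does with 8÷2(2+2), spelled out in Python as an illustration:

```python
# In APL, juxtaposing 2 and (2+2) forms the two-element vector 2 4;
# there is no implied multiplication.
vector = [2, 2 + 2]

# Division then applies element-wise, with the scalar 8 extended across the vector.
result = [8 / x for x in vector]

print(result)  # [4.0, 2.0]
```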

                                                                    1. 6

                                                                      https://i.imgur.com/KWAlJMw.png

                                                                      I’m not sure what I was expecting, but APL seems to have taken the principle of least surprise and inverted it.

                                                                      1. 2

                                                                        Ahahahaha thank you for that

                                                                        1. 1

                                                                          Very few programming languages allow multiplication by juxtaposition. I can only think of Maple and Julia which do. Maybe Mathematica too? Not familiar enough with it.

                                                                          Juxtaposition is probably the most common notation for multiplication after you leave high school.

                                                                      1. 3

                                                                        This is just a gorgeous HTML-ification of a classic paper/talk. I wish it had a little note for folks like me who just clicked into it. I’ll have to set some time aside to read this. Great work. Also it would be interesting to see where/how this connects to J, which also I just now found out about.

                                                                        1. 4

                                                                          J is probably most directly influenced by another Iverson paper, “A Dictionary of APL” [1] as described in “An Implementation of J.” [2]

                                                                          1: https://www.jsoftware.com/papers/APLDictionary.htm

                                                                          2: https://sblom.github.io/openj-core/ioj.htm

                                                                        1. 10

                                                                          I once had a thought that it would be interesting to see a treatment of category theory in APL. Then I went back and read Iverson, and he pretty much had it already. That guy was so far ahead, the rest of computing still hasn’t caught up.

                                                                          1. 29

                                                                            The frustration of the author in their Lobste.rs bio is palpable:

                                                                            If you take a look at my submissions, you’ll see most of my work is ignored and my book recommendations have been most popular, as of the time of writing this; I believe this is because, no different than Hacker News, this website is mostly populated by people with only surface-level interest in computing, if that; meanwhile, a book recommendation lets everyone have their opinion and they give it points because they easily understand it.

                                                                            While frustrated, I intend to keep this account until it completely ceases to serve me, at which time I’ll do my best to delete everything ever made under this account.

                                                                            Marketing yourself is difficult, but having disdain for your audience because they will not recognize your obvious intelligence is a trap. Congrats on discovering that a clever project name will get you some self-gratifying attention, but for long term success have some humility and respect for your fellow man.

                                                                            1. 10

                                                                              People may be expecting to see some moderator action here…

                                                                              I honestly don’t know what to make of the article. I don’t find the jokes in it funny, nor do I find them appropriate for this forum (“homo” is not simply a silly word; it causes real harm). I don’t encourage personal attacks in all but the most exceptional circumstances, but I did find that your comment provided helpful context, and you were pretty restrained. Ultimately, I think I’m glad that you commented as you did. For the sake of civility, I want to encourage you not to get drawn into back-and-forth about this; I think your top-level comment stands well on its own.

                                                                              1. 1

                                                                                I don’t find the jokes in it funny, nor do I find them appropriate for this forum (“homo” is not simply a silly word; it causes real harm)

                                                                                Isn’t this referring to homoiconicity? The author states his language has this property. “Homo” has uses beyond the offensive type you’re referring to, it’s a latin prefix - homogeneous, homophone, homoiconicity.

                                                                                I’m not trying to say it’s inoffensive to everyone - I just don’t draw the conclusion that the author is using it in this way.

                                                                                1. 3

                                                                                  I felt the entire joke of the first paragraph was that it’s using a bunch of funny-sounding words that are often considered inappropriate, and claiming they’re being used solely for their technical meanings.

                                                                              2. 4

                                                                                How about engaging with the content rather than bringing in the author’s off-topic profile text? Congrats on fulfilling their expectations.

                                                                                1. 5

                                                                                  Thanks, but I’m going to let my comments stand. The tone and purpose of the article I think is best explained by his bio.

                                                                                  This article is not a serious attempt at coding. If you think I’m off-topic, fine. But software is more than just code. I’ll take clarity and respect over cleverness and contempt every time.

                                                                              1. 2

                                                                                I would love to read the implementation but I think there’s some encoding problem. None of the APL symbols are rendered properly for me. For example I see ⍝ where I would expect ⍝.

                                                                                1. 1

                                                                                  The link is written in this way:

                                                                                  <a charset="UTF-8" href="masturbation.apl">implementation</a>
                                                                                  

                                                                                  This didn’t correct it, however. Your browser probably gives you the option to change the character encoding of a document manually, but I’ll change the link to behave properly if it’s a matter of changing this tag.

                                                                                  1. 4

                                                                                    Your server isn’t sending a content type or encoding header with the page itself. The charset attribute on the anchor isn’t supported by any browser. I don’t know of a way to change the encoding client-side in mobile Safari, but you are right it can be changed in most desktop browsers.

                                                                                    As @spc476 said, the best way to correct it is to configure Apache to deliver files with .apl extension with a Content-type: text/plain; charset=utf-8 header.
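
                                                                                    If it helps to see why the header matters, this Python snippet reproduces the effect. It assumes the client falls back to a single-byte encoding (Latin-1 here; browsers differ in which fallback they pick):

```python
text = "⍝"  # APL comment glyph (lamp), U+235D

raw = text.encode("utf-8")     # the bytes the server actually sends
wrong = raw.decode("latin-1")  # what a client shows if it assumes Latin-1
right = raw.decode("utf-8")    # what it shows once the charset is declared

# One glyph becomes three bytes, so the wrong decoding yields three characters.
print(len(raw), repr(wrong), right)
```

                                                                                    The file on disk is fine either way; only the declared charset changes how the same bytes are displayed.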

                                                                                    1. 3

                                                                                      Another way to fix it with Apache is to add an AddDefaultCharset directive to the configuration.

                                                                                      1. 1

                                                                                        I wonder why UTF-8 is not the standard default encoding for HTTP.

                                                                                        1. 10

                                                                                          because HTTP predates UTF-8 and wide adoption of HTTP predates wide adoption of UTF-8

                                                                                  1. 4

                                                                                    This has been my main criticism of 12 factor as well. So many instances where something has gone wrong because of ENV variables where a simple config file would’ve fixed everything. But they have their place and sometimes they are the correct solution. Still not a fan.

                                                                                    1. 3

                                                                                      Another issue with using env vars is that any part of the program can use them. It doesn’t force the developer to make the configuration schema explicit.

                                                                                      I have been in situations where different parts of the program were loading different environment variables, because they were designed by different people. It becomes a mess quite quickly.

                                                                                      1. 5

                                                                                        I have been in situations where different parts of the program were loading different environment variables, because they were designed by different people. It becomes a mess quite quickly.

                                                                                        Ick. Only the entrypoint of the program (func main or equivalent) has the right and responsibility to take information from the environment and provide it to components that need them. Corollary: if a component needs a bit of config, it should take it in the form of an explicit constructor or initialization parameter, never by implicitly reaching into the runtime environment or the global namespace.
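
                                                                                        A small Python sketch of that rule. Mailer, MailerConfig, and the SMTP_HOST / SMTP_PORT variable names are hypothetical, chosen only to show the shape:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class MailerConfig:
    """Hypothetical component config; every field is explicit."""
    smtp_host: str
    smtp_port: int


class Mailer:
    def __init__(self, config: MailerConfig):
        self.config = config  # no os.environ lookups inside the component

    def describe(self) -> str:
        return f"{self.config.smtp_host}:{self.config.smtp_port}"


def load_config(environ) -> MailerConfig:
    """The only place that knows the environment variable names."""
    return MailerConfig(
        smtp_host=environ.get("SMTP_HOST", "localhost"),
        smtp_port=int(environ.get("SMTP_PORT", "25")),
    )


def main() -> None:
    # The entrypoint reads the environment once and hands values down.
    mailer = Mailer(load_config(os.environ))
    print(mailer.describe())


if __name__ == "__main__":
    main()
```

                                                                                        Because load_config takes the mapping as a parameter, tests can pass a plain dict instead of touching the real environment.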

                                                                                        1. 1

                                                                                          I like your point of making “configuration schema explicit” but I don’t know that any popular config file format actually does that. I have an idea that an application should always read its configuration from a database. An in-memory SQLite db would be sufficient for many purposes.

                                                                                          The “config file” is just a dump of a database from a known state. Its format is portable, editable, and standard, and any arbitrary data schema can be encoded in the relational model.

                                                                                          At startup time, the application initializes the database from this dump “config file” and also loads commandline and environment parameters into the database. From that point on, all components obtain configuration by SQL query.
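
                                                                                          Here is a rough sketch of the idea using Python’s sqlite3 module. The setting table, the APP_ prefix for environment overrides, and the function names are all assumptions for illustration:

```python
import sqlite3

# Hypothetical "config file": a plain SQL dump of a known-good state.
CONFIG_DUMP = """
CREATE TABLE setting (key TEXT PRIMARY KEY, value TEXT NOT NULL);
INSERT INTO setting VALUES ('log_level', 'info');
INSERT INTO setting VALUES ('workers', '4');
"""


def load_config(dump, environ, prefix="APP_"):
    """Build an in-memory database from the dump, then apply env overrides."""
    db = sqlite3.connect(":memory:")
    db.executescript(dump)
    for name, value in environ.items():
        if name.startswith(prefix):
            key = name[len(prefix):].lower()
            db.execute(
                "INSERT OR REPLACE INTO setting (key, value) VALUES (?, ?)",
                (key, value),
            )
    db.commit()
    return db


def get_setting(db, key):
    """Components obtain configuration with an ordinary SQL query."""
    row = db.execute("SELECT value FROM setting WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None
```

                                                                                          A real schema could add typed columns, CHECK constraints, or foreign keys, which is where the “explicit schema” benefit would come from.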

                                                                                          1. 2

                                                                                            I like your point of making “configuration schema explicit” but I don’t know that any popular config file format actually does that.

                                                                                            Commandline flags as the only (or primary) way to get configuration from the environment into the program has this side effect: yourprogram -h authoritatively describes the configuration surface area.

                                                                                            Self-promotion: https://github.com/peterbourgon/ff
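
                                                                                            To illustrate the general idea (with Python’s standard argparse, not the library linked above), a sketch with made-up flags:

```python
import argparse


def build_parser():
    """Declare every configuration knob in one place."""
    parser = argparse.ArgumentParser(prog="yourprogram")
    parser.add_argument("--listen", default="localhost:8080",
                        help="address to listen on")
    parser.add_argument("--workers", type=int, default=4,
                        help="number of worker processes")
    parser.add_argument("--debug", action="store_true",
                        help="enable verbose logging")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Components receive plain values; none of them parse flags themselves.
    return args


if __name__ == "__main__":
    print(main())
```

                                                                                            Running it with -h prints all three flags with their help text, so the help output doubles as the schema.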

                                                                                      1. 2

                                                                                        I have come to believe that secrets should always be passed by reference (usually a path in the filesystem), not by value. This holds true for configuration files as well. If you are able to enforce that consistently, suddenly it becomes a non-issue to log environment variables or dump the config file for inspection. Which makes a whole set of other activities like debugging much easier.
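
                                                                                        A minimal Python sketch of the pattern. The db_password_file key and the helper names are hypothetical:

```python
import json
from pathlib import Path


def read_secret(path):
    """Dereference a secret at the point of use; the value never lives in config."""
    return Path(path).read_text().strip()


def dump_config(config):
    """Safe to log: the config holds only references (paths), never secret values."""
    return json.dumps(config, sort_keys=True)


# Hypothetical configuration: the password is named by path, not by value.
config = {
    "db_host": "db.internal",
    "db_password_file": "/run/secrets/db_password",
}
print(dump_config(config))
```

                                                                                        Only the code that opens the database connection calls read_secret, so everything else can be logged or dumped freely.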

                                                                                        1. 5

                                                                                          I have come to believe that secrets should always be passed by reference (usually a path in the filesystem), not by value.

                                                                                          I like passing them as a file descriptor, because it really truly is a capability: unforgeable yet shareable.

                                                                                          1. 1

                                                                                            That’s a good idea. Are you able to apply this in the container world or did you create your own special scheduler?

                                                                                            In Kubernetes the canonical way is to mount the secrets on disk, which makes them vulnerable to file-traversal attacks if there are any.

                                                                                            1. 1

                                                                                              I haven’t done it with containers, only with processes. It should be possible to inject into a container, but I don’t know how well the tooling supports this. Probably not well — POSIX file descriptors are criminally underknown.

                                                                                            2. 1

                                                                                              I’m guessing you mean to use something like file descriptor redirection in a shell command, e.g.:

                                                                                              python my_script_needs_secrets.py 3</path/to/secret
                                                                                              

                                                                                              Then inside the process:

                                                                                              import os
                                                                                              secret = os.fdopen(3).read()
                                                                                              

                                                                                              This is a great approach for security, but how does it scale with multiple secrets? Do you use a separate descriptor for each one, or cat them all into the same descriptor? How do you organize your app to know which descriptor contains the secret data?

                                                                                              1. 1

                                                                                                When I’ve used the technique, I’ve just used a different descriptor for each, but one could send a bunch of secrets down one descriptor in some format if one wished.

                                                                                                The mapping of descriptor to schema is part of the documentation, typically a README (this is all for internal software, often just for my own use).
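
A rough sketch of the one-descriptor-per-secret layout, assuming the name-to-descriptor mapping is simply agreed on out of band (for example in a README). The names `read_secrets` and `demo`, and the secret values, are made up for illustration; the demo simulates the launcher with pipes rather than shell redirection.

```python
import os


def read_secrets(fd_map):
    """Read one secret per descriptor, given a {name: descriptor} mapping."""
    secrets = {}
    for name, fd in fd_map.items():
        with os.fdopen(fd, "r") as stream:  # closes the descriptor when done
            secrets[name] = stream.read().rstrip("\n")
    return secrets


def demo():
    """Stand in for a launcher: one pipe per secret, write end closed after writing."""
    fd_map = {}
    for name, value in {"db_password": "hunter2", "api_token": "abc123"}.items():
        read_fd, write_fd = os.pipe()
        os.write(write_fd, value.encode("utf-8") + b"\n")
        os.close(write_fd)
        fd_map[name] = read_fd
    return read_secrets(fd_map)


if __name__ == "__main__":
    print(sorted(demo()))
```

In a real deployment the mapping would be fixed numbers such as 3 and 4, supplied by the shell or a supervisor.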

                                                                                          1. 3

                                                                                            If you needed a signed JSON object to retain JSON structure, why wouldn’t you add a valid JSON envelope with the token and the original payload as attributes like so:

                                                                                            {
                                                                                                "header": {
                                                                                                    "alg": "HS256",
                                                                                                    "typ": "JWT"
                                                                                                },
                                                                                                "token": "XXXXXXX",
                                                                                                "payload": {
                                                                                                    "sub": "1234567890",
                                                                                                    "name": "John Doe",
                                                                                                    "iat": 1516239022
                                                                                                }
                                                                                            }
                                                                                            
                                                                                            1. 8

                                                                                              Imagine that the sender of the JSON document is Node and the ECMAScript JSON API, and the recipient of the document is using Rust and Serde.

                                                                                              Most cryptographic algorithms, including hashing functions, operate on bytes. So to take the hash of that payload, you need to decode the entire JSON document, pull the payload object out of memory, re-encode it as a JSON document, and run the hashing algorithm on that. When you do this in Node, you’ll wind up hashing the ASCII bytes of {"sub":"1234567890","name":"John Doe","iat":1516239022}, getting 3032e801ce56c762a1485e5dc2971da67ffff81af5cc7dac49d13f5bfbe95ba6. Also, because of the way objects are represented in Node, seemingly innocuous changes to the code can result in the keys being in a different order when you initially build an object, although Node does preserve key order when it decodes and re-encodes a JSON document. (Node also does not provide any good APIs for manipulating the order of keys in objects, as far as I know, because ECMAScript actually says that the order is unspecified.)

                                                                                              Serde, on the other hand, does not preserve order when you decode a JSON document. There are basically two common ways to decode a JSON object: you can decode it into a HashMap, which literally randomizes the order, or you can decode it into a struct, which, if you re-encode it, will be encoded in the order in which the struct’s fields are declared. So, given this code:

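                                                                                              use serde::{Deserialize, Serialize};

                                                                                              // Mirrors the "header" object in the JSON envelope above.
                                                                                              #[derive(Deserialize, Serialize)]
                                                                                              struct MessageHeader {
                                                                                                  alg: String,
                                                                                                  typ: String,
                                                                                              }
                                                                                              #[derive(Deserialize, Serialize)]
                                                                                              struct FullMessage {
                                                                                                  header: MessageHeader,
                                                                                                  token: String,
                                                                                                  payload: MessagePayload,
                                                                                              }
                                                                                              // Fields are declared alphabetically, so re-encoding emits iat, name, sub.
                                                                                              #[derive(Deserialize, Serialize)]
                                                                                              struct MessagePayload {
                                                                                                  iat: u64,
                                                                                                  name: String,
                                                                                                  sub: String,
                                                                                              }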
                                                                                              

                                                                                              If you decode into a FullMessage, and then re-encode the MessagePayload, you will wind up hashing {"iat":1516239022,"name":"John Doe","sub":"1234567890"}, which hashes to 907b71ecd7dbc6cb902905e053fe990ed5957aa5217150b2355c36583fcf9519. It will, thus, report that the payload was tampered with, even though both versions of the payload are equivalent for your purposes.

                                                                                              Because the JSON specifications say that order is not important in an object, both behaviors are spec-compliant.
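
The same effect can be reproduced in a few lines of Python, as a sketch rather than a model of either library. Python dicts keep insertion order, so the two dicts below stand in for the Node and Serde encodings; `sha256_of_json` is a hypothetical helper.

```python
import hashlib
import json


def sha256_of_json(obj, sort_keys=False):
    """Hash the compact JSON encoding of obj; key order follows the dict unless sort_keys is set."""
    encoded = json.dumps(obj, separators=(",", ":"), sort_keys=sort_keys)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


# Same keys and values, different key order (the two orders from the comment above).
node_order = {"sub": "1234567890", "name": "John Doe", "iat": 1516239022}
serde_order = {"iat": 1516239022, "name": "John Doe", "sub": "1234567890"}

if __name__ == "__main__":
    print(node_order == serde_order)  # equal as objects
    print(sha256_of_json(node_order) == sha256_of_json(serde_order))  # different bytes, different hash
    print(sha256_of_json(node_order, True) == sha256_of_json(serde_order, True))  # same bytes once sorted
```

Sorting keys on both sides makes the digests agree here, but that is only an illustration: it is not a full canonicalization scheme and both parties would have to agree on every encoding detail.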

                                                                                              1. 3

                                                                                                Gotcha. I don’t use Node or Rust, but I can understand how different JSON libraries could make this a problem. What if the payload were serialized as a string?

                                                                                                {
                                                                                                    "header": {
                                                                                                        "alg": "HS256",
                                                                                                        "typ": "JWT"
                                                                                                    },
                                                                                                    "token": "XXXXXXX",
                                                                                                    "payload": "{\"iat\": 1516239022, \"sub\": \"1234567890\", \"name\": \"John Doe\"}"
                                                                                                }
                                                                                                

                                                                                                In this form, the token is computed and verified on the given bytes of the serialized payload, so differences in parsers should not matter.
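
A minimal sketch of that idea, assuming an HMAC-SHA256 token over the UTF-8 bytes of the payload string. The hex-encoded token, the key handling, and the function names are illustrative assumptions, not the real JWT wire format.

```python
import hashlib
import hmac
import json


def sign_envelope(payload: dict, key: bytes) -> str:
    """Serialize the payload once, sign those exact characters, wrap both in an envelope."""
    payload_text = json.dumps(payload)
    token = hmac.new(key, payload_text.encode("utf-8"), hashlib.sha256).hexdigest()
    envelope = {
        "header": {"alg": "HS256", "typ": "JWT"},
        "token": token,
        "payload": payload_text,  # a string, not a nested object
    }
    return json.dumps(envelope)


def verify_envelope(envelope_text: str, key: bytes) -> dict:
    """Check the token against the payload string as received, then parse the payload."""
    envelope = json.loads(envelope_text)
    payload_text = envelope["payload"]
    expected = hmac.new(key, payload_text.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["token"]):
        raise ValueError("signature mismatch")
    return json.loads(payload_text)


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical shared secret
    text = sign_envelope({"sub": "1234567890", "name": "John Doe", "iat": 1516239022}, key)
    print(verify_envelope(text, key)["name"])
```

Because the verifier never re-encodes the payload, an intermediary can reorder or reindent the outer envelope without invalidating the token.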

                                                                                                1. 1

                                                                                                  That would totally work. It’s basically the same as the OP’s recommendation (serialize your JSON, then concatenate it with the signature) except you’re using a much more complicated way of “concatenating” them.

                                                                                                  1. 3

                                                                                                    Right, but the result is still valid JSON, which was the problem they raised with “just concatenation.”

                                                                                                    1. 3

                                                                                                      Technically, yes. Unfortunately, whatever signature-unaware middleware you’re using won’t be able to get at the JSON keys and values within the payload part. Most people deploy such middleware specifically because they want to be able to filter or route based on the contents of the message, and you lose that.
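
To make the trade-off concrete, here is a small sketch of what a signature-unaware filter sees. `payload_field` is a hypothetical stand-in for such middleware; it only looks at JSON structure and never parses strings a second time.

```python
import json


def payload_field(envelope_text, field):
    """Return payload[field] if the payload is a JSON object, else None."""
    envelope = json.loads(envelope_text)
    payload = envelope.get("payload")
    if not isinstance(payload, dict):
        return None  # a serialized payload is just an opaque string at this layer
    return payload.get(field)


# The same payload, once as a nested object and once as a serialized string.
nested = json.dumps({"token": "XXXXXXX", "payload": {"sub": "1234567890"}})
stringified = json.dumps({"token": "XXXXXXX", "payload": json.dumps({"sub": "1234567890"})})

if __name__ == "__main__":
    print(payload_field(nested, "sub"))
    print(payload_field(stringified, "sub"))
```

Routing on the serialized form would require the middleware to know that the field holds JSON and to parse it again.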