Big-O time complexity really doesn’t paint a full picture of how data structures and algorithms perform in production software.
Details like cache lines, data locality, branch prediction, and boxed terms can produce very surprising benchmarks, where implementations with worse Big-O complexity perform faster than implementations with better Big-O complexity.
The HAMT-based maps in Erlang/Elixir were definitely welcomed when they were introduced, but plenty of software was written in Erlang and run in production for years, without O(lg n) or O(1) random access into sets of KV pairs.
Which of the data structures do you think are “slow”?
List performance is as expected for singly-linked lists, tuples as for fixed-sized arrays, etc.
It’s like you shouldn’t use ‘slow’ software to write performance sensitive things like globally distributed chat software… Or something.
I should be clear, I love Elixir. I’m not disparaging it. I was surprised by the time complexity of some things is all – and I’m aware it doesn’t really translate to much IRL.
It is not unusual to have a sorting algorithm that switches to different sorting algorithms depending on properties of the input… Python does it (Timsort falls back to insertion sort for short runs).
In practice, algorithms with super-linear complexity can behave very well on small inputs.
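To illustrate the idea with a toy Erlang sketch (this is not Python's actual Timsort, which is considerably more sophisticated): switch to a simple O(n²) insertion sort below a size threshold, where its low constant factors tend to win.

```erlang
%% Toy hybrid sort: insertion sort has O(n^2) worst case but small
%% constants, so it can win on short lists; defer to the stdlib
%% merge sort otherwise. The threshold of 32 is arbitrary.
-module(hybrid).
-export([sort/1]).

sort(L) when length(L) =< 32 -> insertion(L);
sort(L)                      -> lists:sort(L).

insertion(L) -> lists:foldl(fun insert/2, [], L).

insert(X, [H | T]) when X > H -> [H | insert(X, T)];
insert(X, Sorted)             -> [X | Sorted].
```

Real hybrid sorts pick the threshold (and detect pre-sorted runs) empirically rather than guessing.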
This looks really cool. Learning about Partisan and Lasp led me to read about membership protocols like HyParView and broadcasting algorithms like Bimodal broadcast, PlumTree and Thicket. I’m also fascinated by fault injection techniques like LDFI, and now Filibuster.
I’m excited to read further :)
Maybe I’m a bit crazy but I feel like if you’re programming Elixir with message passing in mind, then you’re doing something wrong. 99% of your code should be purely functional and you should not be thinking about message passing (which is fundamentally stateful/effectful). Sure, it’s great to keep in mind that it’s happening behind the scenes, and that grokking the mechanism can give you confidence about the robustness of your code, but I do not architect the bulk of my code considering message passing, except if I’m digging really deep and writing e.g. a low-level tcp protocol implementation (which I almost never am).
Is it different in the erlang community?
Elixir and Erlang do not have a “pure FP” philosophy. Evaluation is eager, typing is dynamic, and side effects are basically never pushed into a monad in practice. Some of the core libraries even proudly expose global/cluster-wide state.
The parts of FP that made it in (first-class functions, immutable data structures, certainly some other aspects I am missing) are there because they are useful for building systems that are resilient, debuggable, repairable, upgradable, and evolvable with ~zero downtime.
That is the primary goal of Erlang, and Elixir inherits this legacy. It’s a goal, not a philosophy, so you may find competing ideas next to each other, because they’ve been shown to work well together in this context.
Also, effects in FP languages can be, and usually are, modeled using process calculi, which are exactly what Erlang offers!
That being said, Erlang also has side effects apart from message passing.
I never claimed it is pure FP. The VM is nice in that it gives you places to breach FP purity in exactly the right spots, and makes it very hard or ugly to breach purity where doing so is dangerous.
My mistake, I thought you were surprised at message passing from a pure-FP point of view.
Another reason to think of message passing, and more broadly genservers/processes, in particular is that they can become bottlenecks if used carelessly. People talk about genservers as a building block of concurrency, which isn’t false, but from another point of view they are Erlang’s units of single-threaded computation. They only process one message at a time, and this is a feature if you know how/when to use it, but a drawback at other times. Effective Elixir or Erlang development must keep in mind the overall pattern of message passing largely to avoid this issue (and, in highly performance-sensitive cases, to avoid the overhead of message passing itself).
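A minimal sketch (not from the thread) of why that is: a gen_server handles its mailbox strictly one message at a time, so a single process like this serializes every caller that goes through it.

```erlang
%% Minimal counter (a made-up example): gen_server processes its
%% mailbox one message at a time, so every incr/0 across the whole
%% node funnels through this single process, in order.
-module(counter).
-behaviour(gen_server).
-export([start_link/0, incr/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).

incr() ->
    %% A synchronous call: the caller blocks until this server,
    %% having worked through everything ahead of us, replies.
    gen_server:call(?MODULE, incr).

init(N) -> {ok, N}.

handle_call(incr, _From, N) -> {reply, N + 1, N + 1}.

handle_cast(_Msg, N) -> {noreply, N}.
```

That single-file line of callers is exactly what makes a carelessly placed genserver a bottleneck, and exactly what makes it a correct rate limiter or resource pool when that is what you want.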
I still can’t find a good word for it! https://twitter.com/gamache/status/1390326847662137355
I can only offer my own experience (5 years of Erlang programming professionally).
Message passing, like with FP, is a tool that can be (mis)used. Some of the best uses of multiple processes or message passing that I see in code are:
I’ll generally come across problems that are solved by processes/message passing when writing libraries. When writing application code that uses those libraries, it’s usually far less common.
my advice is typically:
I like the bounded concurrency one. Should probably add it to my list. Are you creating a rate limiter or resource pool? Then use a genserver.
There is nothing wrong with using gen_server in a library. The thing is that in most cases it is not you who should start and manage that process; leave it up to the user. The “problem” is that there are 3 “kinds” of projects in the Erlang world: systemd integration, Telemetry Poller, etc. In each of these parts there are different needs, and message passing will be used more or less, depending on the needs.
150% this. Dave Thomas got this trichotomy right in his empex talk, it’s just infuriating to me that his choice of “names” for these things is unnecessarily confusing.
Sorry, I mistyped… It should be: “Are you writing a library? If not, you should probably not be writing a genserver.”
In my experience (having done Elixir for 8 years), folks that don’t understand/think about messages and genservers in Elixir are at a severe disadvantage when debugging and grokking why their code is behaving some way.
It’s a lot like people who in the previous generation of webdevs learned Rails but not Ruby…which is fitting, since those are the folks driving adoption of Phoenix.
(There are also certainly people who reach for a genserver for everything, and that’s a whole additional annoying thing. Also the people who use Tasks when they really, really shouldn’t.)
Processes are the core feature that provides both error isolation and fault tolerance. As you build more robust systems in Elixir, the number of times you use processes increases.
Often it is correct to abstract this away behind a module or API, but you’re still passing messages.
I’ve not used Elixir, but it sounds from your description as if it has some abstractions that give you Erlang (technically Beam, I guess) processes but abstract away the message-passing details? The last time I wrote a non-trivial amount of Erlang code was about 15 years ago, but at least back then Erlang didn’t have anything like this. If you wanted concurrency, you used processes and message passing. If you didn’t want concurrency, there were several functional programming languages with much faster sequential execution than Erlang, so you’d probably chosen the wrong tool.
This combines two things I really enjoy; APLs and image dithering. They’re a perfect match for each other!
Sure, and that’s mentioned in the article. That said, matching ‘not the empty list’ also means possibly matching things that aren’t lists. Fortunately, there’s also the is_list/1 BIF, which is allowed in guards. Once learned, though, the robot-butt is, in my opinion, easier to read than L when is_list(L).
The robot-butt is fun, and it showcases pattern matching, one of the things that make programming Erlang so much fun.
I think it’s a game of semantics vs implementation details. I find [] for the empty list to be more correct, but I guess in /most/ functional languages it is actually [Head | Tail] where Tail can be empty… Maybe the idiomatic way of [] vs [_|_] is different per language…
[] is definitely the most correct way to match an empty list. [Head | Tail], as well as [_ | _], will never match an empty list in Erlang; they will only ever match (possibly improper) lists of 1 or more elements.
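Putting the Erlang cases side by side (a throwaway sketch):

```erlang
%% [] matches only the empty list; [_ | _] (the “robot-butt”) matches
%% any cons cell, i.e. a (possibly improper) list of one or more elements.
-module(heads).
-export([classify/1]).

classify([])                    -> empty;
classify([_ | _])               -> non_empty;
classify(X) when not is_list(X) -> not_a_list.
```

heads:classify([1]) gives non_empty, while heads:classify(<<>>) gives not_a_list. A bare L when is_list(L) guard, by contrast, accepts both [] and non-empty lists.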
I guess I phrased it weirdly, my bad. Of course it was meant to be “an empty list for Tail”, which would make the complete thing a list of size 1.
This talk is nice to watch and listen to. It obviously inspires and convinces people.
Sigh
Therefore, it makes me sad that it contains bits like the following as a basis for why you need concurrent programming:
The “concurrency because of real world” argument

I don’t want to understand this other world. That other world is a very strange world [meaning sequential programming]. The real world is parallel. […]
At about https://youtu.be/cNICGEwmXLU?t=224
If you wanted to simulate a group of humans, maybe. But for an ordinary business app? If you take that as a basis, why don’t you also say: “Well, humans also run on bio-chemical processes, why don’t we use that for computing? Let’s replicate brains!”
Programming is ultimately about idealized models which are not reality. One of the most important things about programming is to model
Concurrency is incredibly hard for humans to understand.
Highly Available Data is Hard

Further down, Joe basically says that highly available data is really, really hard and you should normally not code something like a consensus protocol on your own. I totally agree. Joe dives into the topic of highly-available (HA) data quite a bit, just to drive this point home. I wish he had spent more time on things that would actually help people write better software, but hey. At least we agree.
The Architecture Erlang Should Be Benchmarked Against

But if you combine the facts that
I arrive at a quite different architecture, at least for typical business software:
If you say Erlang isn’t made for these kinds of apps, that’s OK. I’d like to see that clearly spelled out. If someone can tell me why Erlang is better than that, or what is better than this for the common case, I am seriously interested.
Unreliable message passing

In my mind, sequential programming combined with a highly capable and available database is going to be so much simpler than having a programming model with unreliable message sending. The few projects I have been in that used actors tended to just implicitly assume that the messages would indeed arrive, because they mostly did. I’m not saying it is impossible to write this in a better way, but even intelligent people will normally fail to do so.
If you want to deal with unreliable message passing, you have a few options:
Then let’s go over “let it crash”. I like the idea in principle but he doesn’t even touch the problem of crash loops etc. If you restart things, things do not magically start working.
Somehow, the talks always stop at that point and don’t go into details.
If restarting things helps, that also means that things are not reproducible, which usually means you’ll have a bad debugging and testing experience. If you chunk your app into small pieces/actors, then at least you have very restricted state to reason about, which is nice. But that is probably hardly useful without considering the state of the other actors.
There’s a lot to unpack here. I think the most important thing for me to say is that the actor model is not only about concurrency or scalability; it can (and does) help with fault tolerance and the maintainability of code.
Programming is ultimately about idealized models which are not reality. One of the most important things about programming is…
I thought programming was primarily about getting computers to perform the desired calculations. Everything above coding directly in machine code is to make programming easier for humans, but they are a means to the end, not the end itself.
Concurrency is incredibly hard for humans to understand.
Agreed, but it’s a programming luxury to get to pretend that concurrency doesn’t exist. We’ve approached a point of diminishing returns on single-core clock speed, and it’s more cost effective to work with multiple cores, multiple CPUs, multiple computers/servers, multiple datacenters, etc.
What’s nice about the actor model is that it lets you keep thinking about single-threaded execution a lot of the time; instead of thinking about mutexes, semaphores and critical sections, in Erlang, you’re dealing mainly with isolated process state and with the transmission of immutable data between processes.
- Use lots of stateless processes written in pretty much any programming language.
- Use a highly available data layer for most of the concurrency.
- Limit your own code that deals with real concurrency to the bare minimum.
This model seems to take for granted that an HTTP server, a relational DBMS, or an operating system, is written to execute code concurrently. While you correctly point out that some people might say “Erlang is not for these kinds of apps”, others might say “Erlang is for making the substrate on top of which these apps run”.
With respect to point #1, your state has to live somewhere. Of course, that state can exist in a HA database, but there’s a cost associated with such an architecture that might not be appropriate in all situations. Joe talks a lot about the merits of isolated state in the linked talk, which is a very powerful idea that is reflected in languages like Rust as well as in Erlang.
Point #2 is widely practiced among Erlang programmers. It’s perfectly possible for an Erlang application to speak to a traditional RDBMS, or there are solutions that are written in Erlang, such as mnesia, or Riak, or CouchDB.
Point #3 is also widely practiced among Erlang programmers. Many of the people writing web apps that run on the BEAM are not always directly dealing with the task of concurrently handling requests or connections; they’re writing callback functions to perform the appropriate business logic, and then these callbacks are executed concurrently by a library such as cowboy.
sequential programming combined with a highly capable and available database is going to be so much simpler than having a programming model with unreliable message sending
Again, there’s no arguing which of these is simpler, but the simplicity you’re talking about is a luxury. What happens when your single-threaded program has more work to do than a single CPU core can handle? What happens when your single-threaded program has to communicate over an unreliable network with a program running on another computer?
Then let’s go over “let it crash”. I like the idea in principle but he doesn’t even touch the problem of crash loops etc. If you restart things, things do not magically start working.
A shockingly large amount of the time, however, restarting things does work. Transient failures are all over the place. What’s nice about “let it crash” is that it can help eliminate a huge amount of defensive programming to handle these transient failures. Unable to connect to a server? The result of some request is not what you’re expecting? Instead of handling an exception and retrying some number of times, Erlang programmers have the choice to let the process crash, and delegate the responsibility of trying again to a supervisor process. This helps keep business logic code clean of fault-handling code.
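As a sketch of where the retry responsibility ends up (the module and child names here are made up):

```erlang
%% Sketch: the worker module (fetcher) is hypothetical. 'transient'
%% means the child is restarted only after an abnormal exit, and the
%% intensity/period pair bounds crash loops: more than 5 restarts in
%% 10 seconds and the supervisor itself gives up and crashes upward.
-module(retry_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Child = #{id      => fetcher,
              start   => {fetcher, start_link, []},
              restart => transient},
    {ok, {SupFlags, [Child]}}.
```

The worker just does its job and crashes on failure; the retry policy lives entirely in this declaration, not in the business logic.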
If restarting things helps, that also means that things are not reproducible, which usually means you’ll have a bad debugging and testing experience
Over long periods of execution, you’re inevitably going to encounter situations that are not reproducible. There’s a point of diminishing returns when it comes to testing, and it’s usually a long way away from simulating the kinds of conditions your app might experience over months or years of uptime.
If you chunked your app into small pieces/actors, then at least you have very restricted state to reason about which is nice. But that is probably hardly useful without considering the state of the other actors.
My experience using Erlang daily for the last 5 years is that I might have to consider the state of 2-3 actors when debugging an Erlang application, but I have almost never had to consider processes outside of the application I’m debugging, let alone the state of all the running actors. For the most part, decomposing applications into communicating actors helps greatly with writing and debugging concurrent code, rather than hindering it.
TL;DR: Yes, concurrent programming is hard, but it’s a luxury to pretend like concurrency doesn’t exist. When it comes to writing concurrent code, I find that Erlang’s actor model helps more than it hinders. Erlang’s actor model helps with more than just concurrency; it helps with fault tolerance and with code maintenance, to an almost greater degree than it helps when writing concurrent code.
I was a bit sad to see the JIT being written in C++, with lots of the other code in C. I guess it’s easier to justify if Erlang already depended on C++, which I’m not sure about. I would love to hear about the rationale for this decision.
I hope it’s easy to turn off for use in C-only environments.
I hope it’s easy to turn off for use in C-only environments.
What C only environments exist that can run BEAM? Honest question, coming from someone who has ported BEAM to bizarre platforms where even there you can count on the existence of a C++ compiler and runtime.
Looks like it’s because they use asmjit, which is written in C++. The post also mentions that the JIT only works on x86_64 and AArch64; other platforms use the interpreter.
Sure, but is there a rationale for why asmjit is better than a pure C one? I know luajit has a JIT engine one can use.
edit: link to the luajit engine http://luajit.org/dynasm.html
It’s a library that they don’t have to write.
Also, isn’t luajit development stalled out/not keeping pace with standard lua? My understanding is that the original author doesn’t have much interest in pushing the project forward, which might be indicative of how much effort is involved.
(Please correct me if I’m off base. I don’t know very much about lua or luajit.)
I’m just saying there are C JIT libraries out there already, so I wonder what the justification for C++ is. Maybe they evaluated them and rejected them:
Note I was talking about the luajit dynasm project.
Note I was talking about the luajit dynasm project.
See, that’s something that I was not aware of. Thank you for sharing!
And while I’m at it, apologies for being so curt in my previous comment. I reread it and it comes off very short. Not my intention.
I’m not an expert on software licenses, but my understanding is that while GPLv3 projects can include Apache 2.0 projects, the reverse is not true. That seems to eliminate Lightning and libjit from consideration.
LuaJIT was designed around the Lua language. It wouldn’t be suitable for Erlang without a substantial amount of work, possibly even a rewrite.
I should clarify my point: it would be difficult for Erlang to build and maintain something comparable to LuaJIT.
I don’t think it makes sense to use luajit. I may not know a lot about lua, but I know it’s very different from erlang.
I was referring to the jit engine http://luajit.org/dynasm.html , not sure why that would not be suitable.
We also considered using dynasm, but found the tooling that comes with asmjit to be better.
https://github.com/erlang/otp/pull/2745#issuecomment-691482132
I want to return to Tcl. The most I did with Tcl was back in 2007, within some ns2 simulations. Exercism now has a Tcl track.
I briefly flirted with learning Tcl a few months ago, but I struggled to find a concrete project to work on in order to motivate my learning.
Thanks for linking to the Exercism track! I’ll try that :)
Exercism is really nice and the mentors super supportive.
By the way, here is a small, yet concrete project in Tcl: the first version of Redis
You may want to expand on this and make your own caching thing.
I read that gist today. I really enjoyed it. I think it really demonstrates how well Tcl is suited to making prototypes like that.
Some cool lines I really enjoyed:

fileevent $fd readable [list readrequest $fd]

The cmd_$cmd incantations, along with ::cmdtable, make for nice and clear separation of the request handling code, as well as specifying properties of each of the commands. Thank you for sharing!
Nice introduction to an underrated tool for languages that run on the BEAM :)
Erlang’s dbg module is extremely handy for debugging in the REPL during development. However, care should be taken when running dbg in production, especially if you have SLOs for low latency. Libraries like redbug and recon are handy because they can shut themselves off after a certain number of function calls or messages. They also provide a gentler syntax for matching than Erlang’s ‘match specifications’.
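For example, a trace that shuts itself off (shell sketches; recon must be available on the node):

```erlang
%% recon: trace calls to lists:seq (any arity), but stop automatically
%% after 10 trace messages -- the safety valve redbug/recon provide.
recon_trace:calls({lists, seq, '_'}, 10).

%% Roughly the same with dbg. Note there is no built-in shutoff,
%% which is what makes dbg risky on a busy production node.
dbg:tracer().
dbg:p(all, call).
dbg:tp(lists, seq, []).
```

With dbg you must remember to call dbg:stop_clear() yourself; forgetting that on a loaded node can flood the tracer process.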
Another fabulous tool for debugging Erlang, particularly the interactions between various OTP processes, is the sys module. With sys, you can inspect the state of a gen_* process, and you can get printouts of messages passed between these processes. You can also start a GUI for inspecting the state of a running Erlang system with observer.
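A few shell sketches, assuming Pid is bound to some running gen_* process:

```erlang
sys:get_state(Pid).         %% the server's current state term
sys:trace(Pid, true).       %% print system events (messages in/out) to stdout
sys:statistics(Pid, true).  %% start collecting statistics...
sys:statistics(Pid, get).   %% ...and read them back
```

These work on any process implemented with OTP behaviours, with no instrumentation added to the code itself.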
I have fallen in love with the Fraidycat browser extension. I really appreciate how it doesn’t show me an ever-growing count of published posts I haven’t read.
I use Tiddlywiki for personal stuff, but I’ve been experimenting with org-roam for work stuff. So far I prefer Tiddlywiki.
When exploring ideas, I write out long-form sentences and paragraphs. If my mind is racing and my hands can’t keep up, I’ll resort to a ‘stream of consciousness’ type of writing; a string of words making no grammatical sense. I find both of these techniques helpful because if I get interrupted, I can reread what I’ve written and usually have an easy time continuing where I left off.
If I’m trying to write code on paper, I try to do something that resembles Wirth’s Program Development by Stepwise Refinement. I write in the language I intend to code with, usually with a bit of shorthand borrowed from mathematics (LaTeX’s \in and \exists come to mind). Writing things out with pen and paper loosely following the methodology in the above paper helps me better spot when I’m making design decisions that should be made explicit.
Ultimately, the key reason I enjoy programming with pen and paper is because I love to write with fountain pens :)
This looks pretty cool! I definitely see the resemblance to Erlang’s model of lightweight processes. That said, I couldn’t find the primitives upon which Erlang’s supervisors are built: links and monitors.
The idea of uni- or bi-directional links between processes seems to be more flexible and powerful than representing supervisor trees directly in code. Two examples come to mind:
Process A waiting for a response from process B, but process B crashed after receiving the message from A. This could be solved by having process A create a uni-directional link (‘monitor’ in Erlang) to B before sending the message.
Process A is responsible for maintaining state about a set of other processes managed by one or more supervisors. With links in Erlang, this is easy, and with trap_exit, process A can see how these processes exit, and update its state accordingly.
With supervision trees, you gain the ability to manage how some groups of processes are managed. With links and monitors, you gain the building blocks on which Erlang’s supervision trees are built, and much more.
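For the first case, the monitor-before-send pattern looks roughly like this (a hand-rolled sketch):

```erlang
%% A monitors B before sending, so A learns about B's death even if B
%% crashes after receiving the request but before replying. This is
%% essentially what gen_server:call/3 does under the hood.
call_with_monitor(B, Request, Timeout) ->
    Ref = erlang:monitor(process, B),
    B ! {self(), Ref, Request},
    receive
        {Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, B, Reason} ->
            {error, Reason}
    after Timeout ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.
```

The [flush] option on demonitor discards any 'DOWN' message already in the mailbox, so stale notifications can't leak into later receives.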
First of all, I would like to thank you because this is one of the most constructive comments I have ever received over the course of 6 months.
I am going to reflect that mechanism over to Bastion. The cases you’ve mentioned had been bugging me for a while. I added your comment as an issue on the project’s issue board.
Corrected link for Thicket.
The best teacher I had in graduate school spent the whole semester destroying any beliefs we had about computing. He was a real iconoclast. He happened to be a genius, so we took it. At the end of the course, we were free because we didn’t believe in anything. We had to learn everything, but then he destroyed it. He wanted us to understand what had been done, but he didn’t want us to believe in it.
Anyone know who this is? I’d be really interested in reading anything this person has written.
From Alan’s next answer: Bob Barton
Architect of the B5000: a safe, hardware-software architecture with OS in high-level language in 1961.
People who’re participating, what languages are you using? I’m starting to learn Typescript, so I’m thinking of using that for practical reasons, though I’d probably rather do OCaml or something a bit more esoteric.
Last year I used 6 languages, and this year my goal is to use 12! Carp, Fennel, and Reason are some new-to-me languages I hope to touch on. I will likely do most of the problems in clojure because I enjoy manipulating data-structures in it so much, but rarely feel like it’s the practical choice for my projects.
If you’re fine with not being able to complete a task within 24h, then AoC is a great way to learn a new language.
I’m learning Clojure and liking it so far, so AoC has perfect timing this year to be full of parentheses.
I did 2015 and 2017 in Racket, and had a pretty good time. I tried 2016 in Erlang, and really struggled.
This year, I’m going to try it with Pony.
I use Perl 5. Life + day job doesn’t give much time for actual programming so I’m really happy to just try to solve problems in a language I know well.
Plus Perl makes parsing the problem input a breeze.
As much as I am tempted to try ReasonML, I wrote a modest amount of Elixir back in 1.2 (three years ago, and stable is now 1.7), and have an upcoming need for it again. Will probably do that to brush up on any changes.
In an attempt to make it challenging, I am going to try to pick up Emacs while doing so. I have heard that the lineage inherited from Erlang provides a decent ecosystem in Emacs for other BEAM languages. Does anyone know if that perception is correct?
Cannot edit my post anymore, but it looks like not only is that the case, there are multiple Elixir-specific packages as well (elixir-mode and alchemist).
I think this year I’ll forego going for leaderboard, and use the opportunity to brush up on my Elixir skills. I haven’t had a good chance to touch it for quite a while, so it should be a good refresher. I really enjoy thinking in and solving problems with the toolset available in Elixir.
Jose Valim is going to stream his own solutions to each challenge one day afterward (in Elixir of course):
http://blog.plataformatec.com.br/2018/11/lets-learn-elixir-together-with-advent-of-code/
Crystal. Figured I should brush up on making actual programs instead of one-off scripts and thus style and “architecture” will be my focus.
Pretty ballsy that a project will try to reclaim that name from the horror that is Crystal Reports. Kudos!
Curious to hear from some Erlang/Elixir devs here. As I understand it, “rebooting” is at the heart of the OTP philosophy. Generally speaking, does it work well? What are some examples of when it doesn’t?
In my experience, it works very, very well. Restarting is the most efficient way to deal with transient failures, and Erlang’s use of supervision trees means that the piece being restarted tends to have the smallest granularity to be useful, while preserving the state in the rest of the system.
I feel that Erlang/Elixir’s “Let it Crash” philosophy has less to do with restarting during transient failures than many people might think. It has more to do with turning errors and failures into tools, and being forced to really think about failure scenarios when developing. Fred Hebert explains this idea in a really excellent talk/transcription: https://ferd.ca/the-zen-of-erlang.html
“Let it crash” does not really work when dealing with data/state that cannot be thrown away or reconstructed. For example, when I need to integrate with some service whose API does not allow for idempotent requests, I often find myself having to be more careful about crashes, and leaning more heavily on other error handling mechanisms. Another commenter mentioned poison messages, but I haven’t really struggled with those, because practicing “Let it crash” will make bugs around poison messages very loud and straightforward to fix.
Letting things crash+restart is not a panacea; it’s one strategy among several. Erlang also supports exceptions, and there’s a very common pattern of returning ‘Either-shaped’ terms from functions that can fail: {ok, Result} | {error, Term}. It’s less about rebooting the whole thing and more about having a supervisor that can start the process back in a known state. It works well in many cases… provided you don’t let yourself infinitely reprocess poison messages.
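A sketch of both styles, using a stdlib call (the parse/1 and default_config/0 helpers are made up):

```erlang
%% Handle the error where you can do something sensible about it:
load_config(Path) ->
    case file:read_file(Path) of
        {ok, Bin}       -> parse(Bin);        %% parse/1 is hypothetical
        {error, enoent} -> default_config()   %% a missing file is expected
    end.

%% ...or the "let it crash" form: anything other than {ok, _} raises a
%% badmatch, and the supervisor decides whether to restart the process.
load_config_strict(Path) ->
    {ok, Bin} = file:read_file(Path),
    parse(Bin).
```

The choice between the two is exactly the choice the comment describes: handle what is expected, crash on what is not.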