It seems everyone has their own take on what a “better Jupyter” looks like. First Observable (https://observablehq.com/), now this.
Let me paraphrase the problems with Jupyter notebooks that these projects are trying to solve:
The on-disk format is neither diff-friendly nor code-review friendly. Without launching Jupyter, it is hard to make sense of what these JSON blobs do;
There is no built-in reproducibility for a given notebook. You can enforce it at the code-review level, or use pre-commit hooks that clean up the output, run the notebook in order, etc., but none of that comes out of the box. You can always evaluate cell 10 before cell 8, and do the reverse next time. Over time, only you know the right order in which to run your notebook;
Jupyter is not really meant to be collaborative: you definitely cannot have two people working on one notebook synchronously. Even asynchronously, you need piles of out-of-band protocols to make sure the resulting notebook is in a sane state.
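To make the "pre-commit hook that cleans up the output" idea concrete, here is a minimal sketch in Python. It assumes the nbformat 4 layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); the function name `strip_outputs` and the sample notebook are made up for illustration, and existing tools do this more thoroughly.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 style notebook dict without outputs."""
    # Round-trip through JSON as a cheap deep copy; notebooks are plain JSON.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []             # drop rendered results and images
            cell["execution_count"] = None   # drop the "In [10]" counters
    return cleaned


# Hypothetical two-cell notebook, just enough structure for the sketch.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 10,
            "source": ["print(1 + 1)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

stripped = strip_outputs(example)
# Sorted keys and fixed indentation keep the on-disk diff stable.
print(json.dumps(stripped, indent=1, sort_keys=True))
```

This only helps with diff noise. It does nothing about execution order, which is the harder half of the reproducibility problem.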
The solutions to these problems seem obvious, but:
A plain-text format cannot embed graphs and images in the notebook. So you either need an external data source, or have to encode base64 blobs, or have to make sure that whoever opens it has the environment and can re-evaluate everything instantly on open;
Many languages cannot extract evaluation order from the source directly (Observable went to great lengths to build a reactive language before building their notebook), and specifying the order by hand is another chore that nobody really cares about at write time.
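As a rough illustration of what "extracting order from the source" involves, here is a naive sketch for Python cells: collect the names each cell assigns and reads, then topologically sort. The cell ids and helper names are hypothetical, and this is not how Observable or any notebook mentioned here actually does it. It ignores scoping, mutation, attribute access and redefinition, which is exactly why it is hard to do for real.

```python
import ast
from graphlib import TopologicalSorter


def names_defined_and_used(source: str) -> tuple[set[str], set[str]]:
    """Collect top-level-ish names a cell binds and names it reads (naive)."""
    defined, used = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defined.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defined.add((alias.asname or alias.name).split(".")[0])
    return defined, used


def infer_order(cells: dict[str, str]) -> list[str]:
    """Order cell ids so that every cell runs after the cells it reads from."""
    info = {cid: names_defined_and_used(src) for cid, src in cells.items()}
    owner = {}
    for cid, (defined, _) in info.items():
        for name in defined:
            # Assumption: the first cell that binds a name owns it.
            owner.setdefault(name, cid)
    graph = {
        cid: {owner[n] for n in used if n in owner and owner[n] != cid}
        for cid, (_, used) in info.items()
    }
    # Raises graphlib.CycleError if cells depend on each other in a loop.
    return list(TopologicalSorter(graph).static_order())


# Cells deliberately listed out of order.
cells = {
    "c_report": "print(total)",
    "c_load": "data = [1, 2, 3]",
    "c_total": "total = sum(data)",
}
print(infer_order(cells))  # ['c_load', 'c_total', 'c_report']
```

A cell like `x = x + 1` already breaks the "first binder owns the name" assumption, so a reactive notebook either has to forbid such code or change the language semantics.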
After working closely with some language kernels in Jupyter, I have grown much more interested in alternatives. It is a big ecosystem now, but many extensions outside of what the core team provides are not well maintained. Some of the more interesting features (to me), such as much better type-ahead and code analysis (close to what VSCode can offer), have seen no movement for years: https://github.com/jupyter/jupyter_client/issues/51
One last thing: many extensions in the Jupyter ecosystem are “scratch my own itch” kinds of things. I have no objection to that. It is just that if I am going to introduce these extensions to everyone on my team, I would rather maintain them “in-tree” in our code repo than in each person’s OS directories. Because they are “scratch my own itch” projects, I expect those of us who use them to fix whatever bugs come up, and that is light-years easier with “in-tree” extensions. However, JupyterLab (last time I checked) doesn’t make this easy.
I have worked on and off helping people deploy Jupyter notebooks for a while now. I completely agree with your takes, and have secretly hoped for a Jupyter alternative in Elixir.
Elixir’s new Mix.install will really help with your last point. Jose has been very resistant to allowing global dependencies in Elixir because of pain points in the Python and Ruby ecosystems, so you can be sure this was designed very carefully, and I imagine people will push to public or private Hex repos to get out-of-tree extensions.
I think the real killer feature of the live notebook is the last mode (not demoed in the video): connecting to a running node. You can connect the notebook to a running Elixir or Erlang node and execute functions in that node. So I imagine a system where you can create an analytics notebook, hook it up to prod, run a quick ETL, and prepare a report for management.
Or, let’s say you need to patch some data in a prod database. You can wrap your notebook in a SQL transaction, play with the changes and verify that they look good, roll back the transaction if they don’t, and when you are happy with your script, commit the script and save the notebook for record-keeping purposes.
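The transaction idea is not specific to Elixir, so here is a rough Python sketch of the pattern using the standard-library `sqlite3` module as a stand-in for a real prod database. The `apply_patch` helper, the `accounts` table and the check are all hypothetical, and nothing here reflects how the notebook itself would talk to a running node.

```python
import sqlite3


def apply_patch(conn, patch_sql, check):
    """Run patch_sql in a transaction; commit only if check(conn) passes."""
    conn.execute("BEGIN")
    try:
        conn.execute(patch_sql)
        ok = check(conn)  # inspect the patched data before deciding
    except Exception:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT" if ok else "ROLLBACK")
    return ok


def no_negative_balances(conn):
    rows = conn.execute(
        "SELECT COUNT(*) FROM accounts WHERE balance < 0"
    ).fetchall()
    return rows[0][0] == 0


# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN / COMMIT / ROLLBACK statements above are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, -5)])

# A careless patch: the check fails, so the change is rolled back.
bad = apply_patch(
    conn, "UPDATE accounts SET balance = balance - 500 WHERE id = 1",
    no_negative_balances,
)
# The intended patch: the check passes, so the change is committed.
good = apply_patch(
    conn, "UPDATE accounts SET balance = 0 WHERE balance < 0",
    no_negative_balances,
)
final_rows = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(bad, good, final_rows)
```

In a notebook you would run the patch and the check in separate cells and decide by eye, but the commit-or-roll-back shape is the same.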
Or, because concurrency is fantastic, run a machine learning training on a node, and midway through training, interact with the model at that point in training…
Wasn’t there someone else doing a live notebook in elixir? Did they join forces?
Yeah, you’re probably thinking of this one: https://github.com/jonklein/niex . But it looks like it hasn’t been updated in a couple of months. And it seems the one posted here is unrelated.
This one is actually quite interesting to me given the support of Dashbit, Jose Valim and co, and the concrete use case with Nx. For example, it has a great way to install dependencies, based on something in the Elixir 1.12 release candidate, which is an advantage of coming from the language creator.
When the other one was posted, I thought it was neat, but I wondered: why not just have an Elixir kernel for Jupyter? But the use case here is compelling, and the idea of simultaneous, collaborative coding is quite interesting.
edit: never mind, the release announcement (https://dashbit.co/blog/announcing-livebook) says “contributions from jon klein”, so it does seem like they joined forces.
I thought it would be neat if they reverse engineered the Jupyter protocol and let other languages plug in.