After having discovered these primitives in Elixir, it’s weird to go back to other languages where they are not a thing.
These days, I work in a complex Python codebase that spins up a dozen threads for various things. Each of these threads has its own “when to stop” logic, making it hard to handle Ctrl-C gracefully. Communication is done through method calls that edit shared memory, meaning that errors sometimes happen when locking is insufficient (a list gets edited while another thread is looping over it). And most importantly, the lack of supervision makes it hard to understand what is running, and even harder to do something when a thread dies because of an error.
I often find myself googling for some form of these Erlang primitives in Python, without success so far.
I wrote a lot of web scrapers in Python. Tedious and error-prone stuff, dealing with threads and failure.
Then I wrote one scraper in Elixir and it amazed me: controlling a pool of clients, handling connection failures, resizing the pool, balancing work between workers, running async tasks.
Have you tried writing Python code structured-concurrency-style using e.g. trio with httpx? It brings back a lot of sanity into async Python.
I’ve been evaluating bazel lately, because I’d like to untangle the mess that is the current build system in my team (which is a bunch of Jenkins jobs triggering other jobs and expecting them to write to a predefined path).
My intuition is that I’d like something similar to nix flakes: defining the builds of the artifacts for each repo, and then having a way to depend on other repos’ artifacts, with some lockfile-like version management.
My intuition also tells me I don’t want to rewrite the entire build process to be inside of nix (which would also make my application require nix at runtime).
I get the feeling from this that bazel is not a great fit, because I’ll have the same problem of rewriting everything, and because it seems very oriented toward monorepos; using it in a multi-repo setup seems to go against the grain.
Would it? IME you can tarball a nix closure and unpack it into a chroot and the programs run just fine without any of the nix command line tools.
Mmmm, true, what I said doesn’t hold in general.
I’ll have to try it out to figure out exactly why (or whether) it doesn’t work. The current process creates a Python virtualenv and installs dependencies in it, and might also depend on some installed Ubuntu packages. The only way to make it produce an artifact without replacing half of this with nix packages is to run nix in impure mode (but that might be a solution).
I was assuming you completely nix-ify the whole build, so your output is a nix derivation. Then taking the closure of that should work, assuming it refers to all its dependencies directly and never e.g. via $PATH.
I find it a bit underwhelming that this article is from 10 months ago, but that flakes are still not available in a main release of nix. This leads to a situation where one part of the community is invested in it, and uses it, while another considers it unusable for now, and continues using other tools for pinning and importing.
Very good observation. Precisely because I have to explicitly enable flakes, I’m actively avoiding them. Sticking to niv until flakes are official, maybe.
I had the same opinion as you, but then I took the time to look at the material that’s available, like the nice NixOS wiki page maintained by Mic92 and the still-unstable manual of the Nix that is to come. That, together with the fact that collaborators can start using Nix by directly installing the flake-enabled version, convinced me to start using it. I think that as a community we should contribute at least by using it and giving feedback. What would become of projects like Linux or Debian if all their users only installed the “stable” releases?
It looks like the flakes branch was merged into master back in July. Do I still need to use a special flakes-enabled version of Nix, or is the feature included by default in recent releases?
It is included by default in recent unstable releases, i.e. the nixUnstable attribute in nixpkgs.
On my NixOS unstable, updated a week ago, the stock Nix release is still 2.3.10, not 2.4.
Wonderful! I was looking for something like this a few weeks ago, because I was sad that the nix-shell setup (or the stack script setup for that matter) doesn’t do any caching. I’ll definitely be using this.
As a side note, when I looked into this, I started wondering how stack script manages to find the dependencies from the imports. Well, it has a big mapping list generated for each snapshot. I’m not sure how feasible that would be for nix, so I kind of dropped the topic, but I still think that it might be the ultimate step in ease of use of scripts.
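As a rough illustration of that mapping (a sketch only; the resolver and the aeson dependency here are just example choices): a stack script names a snapshot, and with no --package flags stack deduces the packages from the imports via the snapshot’s module-to-package list.
  #!/usr/bin/env stack
  -- stack script --resolver lts-18.18
  -- no --package flags: stack deduces the aeson dependency from the
  -- Data.Aeson import, using the snapshot's module-to-package mapping
  import Data.Aeson (encode)

  main :: IO ()
  main = print (encode [1 :: Int, 2, 3])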
Remember that using upx can be a horrible idea for programs that have more than one instance running at a time. Normally the kernel can share memory pages between different executions of the same program; upx defeats that.
upx also doesn’t fix that the binary is (likely) not very cache-efficient
I didn’t know about that specifically, so I’ll add a note about it. I kind of discounted dynamic linking as soon as I realized modern cloud deployment practices would mean only one executable per instance/VM, and I didn’t consider much by way of sharing instances after that.
Thanks for the feedback!
On that topic, I wonder if you can reproduce the gains of UPX with some standard (non-self-extracting) compression. If that’s the case, then you might also gain the same benefits from on-the-wire compression when transferring the executable.
Theoretically, the OS’s executable format could support compressed pages, so that the kernel can expand them in a way that’s shared across all processes.
Does anybody have any insight into how this workflow interacts with GitHub? I feel like the next step I’d like is sending each of my patches to a different pull request (independent ones, if possible).
I’m actually writing some tooling to make that possible, but the workflow proposed by stgit feels so close to what I want that I’m wondering whether I’m reimplementing something that already exists.
I think what you would do with stgit is work on a master/devel branch plus a stack of your patches. When you are ready to open a PR, you run “stg publish my-pr-branch; git push github my-pr-branch” to make a branch you can use with the GitHub UI to open a pull request. To update your PR branch after feedback, I think you need to republish.
Unfortunately for that workflow, stg publish doesn’t support force pushing directly to a remote branch, which I think would be quite cool. Instead I think you need to rerun stg publish + git push.
Ok, I just sent a pull request to stgit that makes the github workflow work better.
https://github.com/ctmarinas/stgit/pull/29
With this patch you can do something like stgit publish --overwrite some-pr-branch && git push -f some-pr-branch and it will get your current patches into a GitHub PR. Without my patch it creates a new commit each time you update things.
You will need to manipulate the stack of patches before each publish command to hide patches you don’t wish to publish; I am pretty sure something like stg pop dont-want && stg publish will work.
This looks extremely interesting, especially considering I’m currently working on an implementation of a variant of nix for working with batch computation. It would be interesting to see what it looks like to define the derivations in Expresso.
The presenter mentioned the possibility of boilerplate being massively reduced by removing the need for many type definitions; almost everything can be inferred from usage, with the compiler screaming at inconsistencies.
Time for some nitpicking! You actually just need a Semigroup: you have no use for mempty, and it’s pointless to pad the list with mempty values, since mappend a mempty = a by the monoid laws.
Your comment reminded me of Data.These: since we don’t pad with mempty values, there’s a notion of “the zip of two lists will return partial values at some point”. And that led me to Data.Align, which has the exact function we are looking for:
salign :: (Align f, Semigroup a) => f a -> f a -> f a
http://hackage.haskell.org/package/these-0.7.4/docs/Data-Align.html#v:salign
(that weird notion was align :: f a -> f b -> f (These a b))
Yeah this is exactly it. Good eye!
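To make that concrete, here is a small sketch of what the two functions do on lists (written against these-0.7.4 as linked above; newer versions shuffle these modules around):
  import Data.Align (align, salign)
  import Data.These (These (..))

  main :: IO ()
  main = do
    -- align keeps the tail of the longer side, tagging where each value came from:
    -- [These 1 'a', These 2 'b', This 3]
    print (align [1, 2, 3 :: Int] "ab")
    -- salign combines overlapping positions with (<>) and keeps the leftovers,
    -- so no mempty padding is needed: [[1,10],[2,20],[3]]
    print (salign [[1], [2], [3 :: Int]] [[10], [20]])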
It’s funny, because I think I poked the universe in a way that resulted in salign going into Data.Align: a year or so ago, someone mentioned malign in a reddit r/haskell thread, and I pointed out that malign only needed Semigroup, and one of the participants in the thread opened an issue requesting a malign alternative with the Semigroup constraint.
Now I feel like a Semigroup evangelist :)