Similar, but different solution: thebe is a pretty old project that used remote jupyter kernels for running computation in cells on static sites; it was updated with jupyterlite/pyodide and can now run with a browser-backed jupyter kernel.

Demo: https://executablebooks.github.io/thebe/lite.html
WGSL feels like a cross-over of Rust and GLSL to me.

As a language designer and occasional graphics programmer, I find WGSL incredibly frustrating because they could’ve created a truly modern and ergonomic language but instead just decided to take GLSL, make it more verbose, and sprinkle some Rust keywords on top. WGSL feels nothing like Rust, nor any modern language really. It annoys me that they had to NIH their own solution instead of just using SPIR-V…
The frustrations start with silly syntax decisions such as using <> for generics, which makes parsing much harder than it needs to be. There’s also the fact that they decided to take the variable declaration syntax of Rust, which has the nice property of having declarations align vertically, and decided to slap attributes before them. So in reality most declarations (apart from lets in functions) don’t align and you get none of the benefits. Here’s an actual excerpt of some WGSL code I’ve written for my app:
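Something along these lines (a minimal sketch with invented names, not the actual excerpt):

```wgsl
// Three var<uniform> declarations and three plain vars: the names
// start at two different columns, pushed right by the attributes.
@group(0) @binding(0) var<uniform> transform: mat4x4<f32>;
@group(0) @binding(1) var<uniform> light_position: vec3<f32>;
@group(0) @binding(2) var<uniform> time: f32;
@group(0) @binding(3) var base_texture: texture_2d<f32>;
@group(0) @binding(4) var base_sampler: sampler;
@group(0) @binding(5) var shadow_map: texture_depth_2d;
```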
Note how the entire @group(n) @binding(n) ceremony offsets all variable names very far to the right, which makes them harder to spot, and then var<uniform> is the cherry on top that makes three of the six variable names land at a different column. Nice.

There are also structs, which will take you back to the days of C++ before C++20, because there are no designated initializers. In struct constructors, you initialize fields in the order they’re declared. No hints as to which field does what.
Well either that, or…
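Presumably a comparison along these lines (a hypothetical sketch; the Light struct is invented):

```wgsl
struct Light {
    position: vec3<f32>,
    intensity: f32,
    radius: f32,
}

fn make_lights() {
    // Positional construction: no hint as to which argument is which field.
    let a = Light(vec3<f32>(0.0, 4.0, 0.0), 2.5, 10.0);

    // ...or you annotate every positional argument with a comment, by hand.
    let b = Light(
        /* position  */ vec3<f32>(0.0, 4.0, 0.0),
        /* intensity */ 2.5,
        /* radius    */ 10.0
    );
}
```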
(To add salt to the wound, may I remind you that designated initializers are a thing from C99 that only landed in C++20 and also got nerfed for some reason, as the order of the fields’ initializers has to be the same as the order they’re declared in the class. But it’s still better than WGSL!)
Bad cosmetic decisions are hardly the only thing that frustrates me. I’m honestly a lot more baffled at how the language ignores all the design advancements of functional languages designed in the past 10 years and instead just decides to be a weird mixture of GLSL and Go, being so statement-oriented as to not even have an if expression or ternary ?: operator. So despite the language encouraging immutable variables by having the let keyword be immutable by default, and having you opt into mutability by var, you’ll still need to use var for cases where you e.g. want to multiply a variable by some coefficient conditionally.
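A minimal sketch of that situation (invented names):

```wgsl
fn shade(base: f32, highlighted: bool) -> f32 {
    // No if-expression and no ?:, so a one-off conditional scale
    // forces an otherwise-immutable binding to become a var.
    var brightness = base;
    if (highlighted) {
        brightness = brightness * 1.5;
    }
    return brightness;
}
```

For what it’s worth, WGSL does have a select(f, t, cond) builtin that covers the simplest of these cases, but it’s an ordinary function call, not a general if expression.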
Also for some reason it’s impossible to index a const array by a non-const index? Or maybe I’m doing something wrong (or maybe that part of Naga is incomplete?)

Also, Naga’s error messages are incredibly hard to read and not user-friendly at all, which makes the development experience a whole lot more frustrating than it needs to be. I understand it’s in 0.x state, but c’mon… Writing human-readable error messages really isn’t that hard, it just takes a bit of time.
Also, this language has pointers and references for some reason. A shader programming language. Because tuples and structural typing are too modern, so let’s write our code in out parameter style instead, which is famously very ergonomic and not verbose at all if you only need one of the return values.
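For concreteness, a hypothetical sketch of that out-parameter style (invented types and names):

```wgsl
struct Hit {
    t: f32,
    normal: vec3<f32>,
}

// Two results means threading a pointer through,
// even if the caller only wants the bool.
fn intersect_ground(origin: vec3<f32>, dir: vec3<f32>, hit: ptr<function, Hit>) -> bool {
    if (dir.y == 0.0) {
        return false;
    }
    let t = -origin.y / dir.y;
    if (t < 0.0) {
        return false;
    }
    (*hit).t = t;
    (*hit).normal = vec3<f32>(0.0, 1.0, 0.0);
    return true;
}
```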
None of these things make WGSL totally unusable of course, just worse than it could be.

And apologies for being so snarky… I’m not usually like that. I’m just disappointed at how much of a wasted opportunity WGSL seems to be. Given that it’s still in draft state I’m hoping they will make the language more modern over time, but it’ll probably take many years and possibly some language extensions to truly make it happen.
TL;DR WGSL, at least in its current state, is a pain. Wish they had just used SPIR-V.
From my understanding, Apple said no to SPIR-V, which is why we have WGSL.
I did some research before writing my comment, and apart from the Apple vs Khronos thing there were also some more tech-focused discussions I found:
Unfortunately I didn’t have time to finish my research and find a more proper conclusion I could include in my comment, hence I omitted that topic altogether and threw a loose little “wish they had used SPIR-V” at the end without elaborating.
I do wonder what the Apple vs Khronos tension is all about though, if anyone could shed some info I’d be happy to read :)
This post was useful for me to understand better how WebGPU (and GPU APIs in general) came to be: https://cohost.org/mcc/post/1406157-i-want-to-talk-about-webgpu
Previous submission: https://lobste.rs/s/q4ment/i_want_talk_about_webgpu
The Apple vs Khronos dispute involves IP. Apple stopped updating OpenGL at version 4.1 in macOS, so I’d look to see what changed in the Khronos Intellectual Property Framework after 2010. Or what patents Apple filed in 2011, when OpenGL 4.2 was announced. The current IP framework would appear to restrict members’ ability to assert IP rights and demand royalties. That could be it.
Highly recommend checking cargo-zigbuild, I’ve been using it indirectly thru maturin for making Python wheels from an Apple Silicon Mac to x86_64 Linux and it just works. It is also way faster than QEMU or the experimental Rosetta2/vz support in colima.
(Is it silly to build a Python wheel containing a Rust binary with C++ deps cross-compiled with Zig? Yes. Is it amazing that it all works? Also yes. =P)
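For anyone curious, the basic cargo-zigbuild invocation looks like this (target triple chosen as an example; assumes Zig is installed):

```sh
# Add the Rust target, install the cargo subcommand, then cross-compile:
rustup target add x86_64-unknown-linux-gnu
cargo install cargo-zigbuild
cargo zigbuild --target x86_64-unknown-linux-gnu
```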
Thank you so much! I should try to integrate this in my workflow ;) I am particularly interested in the speed gain. Compiling the project takes several minutes!
Artists are very good at creating edge cases: https://dustri.org/b/horrible-edge-cases-to-consider-when-dealing-with-music.html
Nice! Automating and over-engineering your thesis process is pretty much a rite of passage =P
(Mine is at https://github.com/luizirber/phd, it was shortly after I started with Nix, so it is based around conda instead. I ended up adding github actions to automate releases for generating the final PDF, and also depositing new versions to Zenodo for archival)
Guess PhD students are older and less bound by this 6-month deadline :P I surely didn’t do it for my (non-PhD) thesis. It was 2009/10 and it was a pretty standard LaTeX build that I had running on either Ubuntu or Windows. The in-progress state not living long enough to even think about a distro upgrade helped.
It turned out that overengineering my thesis document was the only thing in my PhD that made me employable.
My top Rust request: when I make a small typo and cause a syntax error, don’t spit out 200 error messages about functions not existing. Just because I forgot a comma doesn’t mean that the function doesn’t exist. Just tell me about the syntax error and let me fix it. If there are still semantic errors, we can get to them later.
I would give $200 to have someone implement a “stop on first error” flag.
This is #27189 and I accept the challenge. I have a good understanding of Rust’s diagnostics subsystem, and it seems doable.
You might like to try bacon, which I’ve used from time to time. It’s a little more than you asked for with the filesystem-watching, but the relevant point is that it continuously shows you the top of the error list as you work through the fixes.
cargo-limit might be worth checking too
I tried that but it had a lot of problems. It had a hard time picking the “first” error that was the trigger and caused tests to fail for reasons that I didn’t manage to fully figure out. Something with how stdin and stdout are managed?
Interesting overview of the Nix approaches, so thanks for sharing. My curiosity has been piqued by all the NixOS posts I’ve been seeing even though I don’t run it myself. Any reports from folks running a bunch of devices like the one from this post on their home network? What are you using them to do?
My home router/firewall/dhcp/ipsec server (atom x86-64) and file/media/proxy/cache/print server (celeron x86-64) are OpenBSD, so when I bought a Beaglebone Black (ARMv7) to toy with, it was fun to go down a rabbit hole pretending I was @tedu (his 2014 post) to learn about diskless(8) and pxeboot(8) and how to netboot via uboot. This ended up being pure experimentation since the actual parallelized work I do at home is on a single beefy Linux workstation (hard requirement on Nvidia GPU for now) and I’m not a professional sysadmin. The BBB sits disconnected in a drawer, but the setup lives on as the mere handful of config line changes required to set up tftpd(8) on the file server and point dhcpd(8) to it from the router, so I gained a more complete understanding of those as a neat side effect of experimenting. At some point in the next couple years I’m going to want to play with a RISC-V SoC, but that’s going to mean looking at Linux again unless I magically become competent to write my own drivers.
I just converted my last non-NixOS machine yesterday, so I’ll share my experience =]
I currently have 5 machines running NixOS and deployed using NixOps (to a network called ekumen):

I set up the workstation and chromebox as remote builders for all systems, just as @steinuil did in the post. I’m using the rpi for running Jellyfin (music) and Nextcloud (for sharing calendars and files with my spouse), and setting up the chromebox to be an IPFS node for sharing research data. The laptop and workstation are using home-manager for syncing my dev environment configurations, but I do most of the dev/data analysis in the workstation (which has gigabit connections to the internet), and while the laptop is often more than enough for dev, my home connection is way too slow for anything network-intensive (so, it serves as a glorified SSH client =P)
They are all wired together using zerotier, and services running in the machines are bound to the zerotier interface, which ends up creating a pretty nice distributed LAN.
I don’t have my configs in public (booo!), because I’ve not been too good at keeping secrets out of the configs. But @cadey’s posts are a treasure trove of good ideas, and I also enjoyed this post and accompanying repo as sources of inspiration.
I don’t really see the value nixops provides over nixos-rebuild which can work over ssh.
That’s a fair point. Part of using nixops was about exploring how to use it later for other kinds of deployment (clouds), and it is a bit excessive for my use case (especially since I use nixops to deploy locally in the laptop =P).
A lot of my nix experience so far is seeing multiple implementations of similar concepts, but I also feel like I can refactor and try other approaches without borking my systems (too much).
On the Pi from the post I run:
It sounds like you would benefit from a workflow runtime. I really like snakemake because it is Make-like but allows dropping in Python code to fill in what’s missing.
This link 404’s for me. Here’s a mirror I found: https://www.microsoft.com/en-us/research/uploads/prod/2020/04/build-systems-jfp.pdf
Hm this paper is now 55 pages? The one from 2018 was 29 pages.
https://www.microsoft.com/en-us/research/uploads/prod/2018/03/build-systems.pdf
It’s kind of weird that they slightly modified the title and added more content, and the 3 authors are the same.
I’d be curious for a summary of the “diff” …
There is a summary by the end of section 1:

> This paper is an extended version of an earlier conference paper (Mokhov et al., 2018). The key changes compared to the earlier version are: (i) we added further clarifications and examples to §3, in particular, §3.8 is entirely new; (ii) §4 and §5 are based on the material from the conference paper but have been substantially expanded to include further details and examples, as well as completely new material such as §5.2.2; (iii) §7 is completely new; (iv) §8.1 and §§8.6–8.9 are almost entirely new, and §8.3 has been revised. The new material focuses on our experience and various important practical considerations, hence justifying the “and Practice” part of the paper title.
I wonder if it really makes sense to use home-manager for single-user systems? Or rather, I haven’t tried this, I think it actually doesn’t make sense, and I wonder if folks can confirm/reject my suspicions.

My understanding is that home-manager achieves two somewhat orthogonal goals:

First, it manages user packages (as opposed to system-wide packages).

Second, it manages dotfiles.

Using home manager, you can install packages without sudo, and they will be available only for your user and won’t be available for root, for example. I think this makes a ton of sense in multi-user systems, where you don’t want to give everyone sudo access. But for a personal laptop, it seems like there’s little material difference between adding a package to the global configuration.nix vs adding it to home-manager’s per-user analogue?

For dotfiles, the need to switch configs to update them seems like it adds a lot of friction. I don’t see why this is better than just storing the dotfiles themselves in a git repo and symlinking them (or just making ~ a git repository with the git clone my-dotfiles-repo --separate-git-dir ~/config-repo trick).

Am I overlooking some of the benefits of home-manager?
Post author here! I find it conceptually nicer to be able to shove most of my config in my ‘per-user’ configuration, as opposed to a bunch of separate dotfiles that I then have to manage the symlinking for myself.
The friction is definitely a downside, but most of my config I update infrequently enough that it doesn’t matter, and a lot of programs will have a “use this as the config file” option; my mechanism for tweaking kitty is to grab the config file path by looking at ps, copy it into /tmp/kitty.conf, run kitty -c /tmp/kitty.conf until I’m happy, then copy my changes back into my config.

I do agree that doing per-user installation isn’t super useful on a single-user system. This is why I have /etc/nixos owned by my non-root user, so I don’t have to sudo every time I want to edit it (though I do still have to sudo to rebuild the changes).
I would like to thank you for the post, I’m totally copying the autorandr and awesome setup (and a bunch of other things too =])
What is neat with managing dotfiles with Nix is being able to reference derivations. You need to run autorandr as part of a script? Just say ${pkgs.autorandr}/bin/autorandr and it won’t clutter your path.

> My understanding is that home-manager achieves two somewhat orthogonal goals:

As @vbernat said, they are not completely orthogonal, since you can refer to files in (package) output paths in configuration files, which is really nice.
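A minimal sketch of that pattern (the script name and contents are invented; home.file is the stock home-manager option):

```nix
# A home-manager module that writes a user script calling autorandr
# via its Nix store path, so autorandr never needs to be on $PATH.
{ pkgs, ... }:
{
  home.file.".local/bin/fix-displays" = {
    executable = true;
    text = ''
      #!/bin/sh
      exec ${pkgs.autorandr}/bin/autorandr --change
    '';
  };
}
```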
> I think this makes a ton of sense in multi-user systems, where you don’t want to give everyone sudo access. But for a personal laptop, it seems like there’s little material difference between adding a package to the global configuration.nix vs adding it to home-manager’s per-user analogue?

For me, the large benefit is that I can use home-manager on both NixOS and non-NixOS systems. E.g., in my previous job I used several Ubuntu compute nodes. By using home-manager, I could have exactly the same user environment as on my NixOS machines.
home-manager has a complexity cost so that needs to be weighed in for sure.
Typically the vim config can get complicated because it has to be adjusted to work across various environments. Since home-manager provides both the package and the config, this is much simplified. I remember having to tune vim constantly before.
Want to call attention to this as not just a well-written article, but a well-written example of how we should be comparing languages. Extensive knowledge of both, comparisons, complex examples, benchmarks, discussions about community, this is a template I want to use going forward.
I highly recommend reading Jonathan’s other posts, as well as signing up for his newsletter: https://www.dursi.ca/newsletter.html

So many good links and discussion!
First time submitting a story! The software I work on recently moved from C++ to Rust (in a Python extension), and this post goes thru a small PR moving some Python code into the Rust layer and the changes that needed to be made (including the FFI layer).
Thank you for writing and submitting this. It is well written: easy to follow and fairly succinct.
I wonder if you could get a reasonable compromise between cpu and memory consumption by growing the list till it hits (say) 64 elements then flushing it? That way you’re amortizing the per-call ffi overhead by buffering, but your buffer doesn’t grow very large.
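A minimal sketch of that buffering idea in Python (add_batch is a hypothetical batched FFI entry point, standing in for whatever the real binding exposes):

```python
BUF_SIZE = 64  # flush threshold; keeps the buffer small

def add_all(ffi_obj, items):
    """Feed items across the FFI boundary in batches of up to BUF_SIZE."""
    buf = []
    for item in items:
        buf.append(item)
        if len(buf) == BUF_SIZE:
            ffi_obj.add_batch(buf)  # hypothetical batched FFI call
            buf.clear()
    if buf:  # flush the remainder
        ffi_obj.add_batch(buf)
```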
I hadn’t thought of that, that’s a great idea!
I’ll try it and report back, thanks!
I implemented your suggestion, and the memory consumption is negligible now: https://github.com/dib-lab/sourmash/pull/840
Runtime is slightly larger (77s vs 73s), but as I noted in the PR it’s the memory that will likely be the bottleneck in the future (as we grow to millions of signatures, instead of ~100k we have now).
Thanks again!
Sweet! Glad to hear it. <3
This is really cool! I opened a PR in conda-forge to make it available for conda users: https://github.com/conda-forge/staged-recipes/pull/10473
Thanks!
Maybe you want conda and the packages on conda-forge?
Yes, conda looks good. The fact that it is advertised for Python and R suggests some genericity. It is hard to find documentation about anything not Python though.
an example from bioconda (derived from conda-forge), using Rust: https://github.com/bioconda/bioconda-recipes/blob/33c024c852e1292d4548407a3b3c2884f24acc93/recipes/rust-bio-tools/meta.yaml
I like this idea; there seems to be a lack of a more data-science-related tag.
I submitted https://lobste.rs/s/u4m0lr/managing_messes_computational a few days ago, and it would also benefit from the scicomp tag. I ended up putting python and practices, but neither is a good descriptor of the content.

But it is somewhat confusing to have scicomp and compsci as tags, since they are quite similar…

I’m open to alternatives. Maybe datascience or datasci, but I feel that it might limit it too much to AI and statistics, which could be grouped under ai, and that would leave out the computational modeling. Also, science is a tag often used for scientific computing as of now, but it’s more generally understood as scientific results and discovery, and there are many things of interest to the domain of scientific computing that are of little interest to practicing scientists, such as what goes on under the hood.

I guess data engineering would be left out if we don’t go with datascience, but perhaps a data or bigdata tag could better suit them.

As of right now, I am sticking with scicomp because human language is not commutative, and we have the opportunity to create a niche for discussion in a growing field that could also benefit lobste.rs as a whole.

I’ve been thinking of submitting a datascience tag proposal for a while now. I think the two are distinct enough that we’re better off with both.
I think data is good. bigdata and scicomp can both be considered sub-fields of a wider data field, and there is crossover between them. Individually those tags are quite specific, maybe even niche?

The book is very good, with lots of great examples walking through how the algorithms work.
This is a great write-up on the situation!
And… 90% of bioinformatics is converting from one file format to another, and because most formats are text-based people are tempted to write a parser for the 1000th time (or cook up a shell one-liner).
I misread that as “(or cock up a shell one-liner)”, which I suspect is distressingly often not wrong.
Do not confuse with https://github.com/casey/just/…
This is a really nice example of using cmake to integrate Rust code into C and C++ codebases!
Some papers already leaked: https://lobste.rs/s/ydabmd/kingme_attack
Seems like tom7 has 4 chess-related papers this year: http://tom7.org/chess/ (and he always has great submissions to SIGBOVIK =])
And it will be livestreamed on twitch! (source)