How would replacing bash with nushell play with bootstrapping of nix and nixpkgs? When comparing guix and nix, guix did quite a good job on that topic and there is really a minimal set of packages to build everything from scratch. I’m wondering if bringing Rust in, just to build nushell, just to build stdenv based on it, would make bootstrapping nearly impossible.
100% agree; this article completely sidesteps this central question. Bash is semi-trivial to bootstrap!
Nixpkgs does not bootstrap rustc, we’re currently using a binary distribution: https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/compilers/rust/rustc.nix#L28
Adopting this as a default stdenv would require pushing this massive bindist to the Nixpkgs bootstrap seed. That seed is already massive compared to what Guix has; I don’t think we want to degrade this further.
Rust is definitely source-bootstrappable; as a matter of fact, Guix manages to do it, so there’s no reason we can’t do the same. The bootstrap chain is pretty long though. On top of what we already bootstrap (gcc, openssl, etc.), we’d need to bootstrap llvm, mrustc, then rust 1.54 -> 55 -> 56 -> 57 -> 58 -> 60 -> 61 -> 62 -> 63 -> 64 -> 65.
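And that chain only gets longer: a quick back-of-the-envelope (assuming the six-week release cadence holds and every step needs the previous release) shows how fast a version-by-version bootstrap grows:

```python
# Back-of-the-envelope: one Rust release every 6 weeks, each building the
# next, so a version-by-version bootstrap chain grows by ~8-9 compilers
# per year of releases it has to cover.
weeks_per_year = 52
cadence_weeks = 6
years = 10

chain_length = round(years * weeks_per_year / cadence_weeks)
print(chain_length)  # ~87 compiler builds to cover a decade of releases
```

In practice mrustc lets you skip ahead to a mid-chain release (1.54 above), which is what keeps the current chain merely long rather than absurd.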
So yeah, pushing this to stdenv would considerably degrade the bootstrap story in any case.
From my perspective, Bash is a local optimum, I personally wouldn’t change it, it’s certainly a good balance between a language that is easy to bootstrap and a good-enough expressiveness to express builds.
If we really want to move to something more modern, Oil could be a more serious contender, they seem to take bootstrapping seriously. There’s a drafted RFC wrt. Oil adoption.
[Edit]: The question is sidestepped, but I don’t think the author expects this to replace stdenv; at least it’s not mentioned in the article. Don’t take this comment as an overwhelmingly negative “this project is worthless”. Cool hack!
This made me realize that Rust is fundamentally non-bootstrappable. It’s definitely going to produce a release every six weeks for quite a number of years, and Rust’s release N needs release N-1 to build, so the bootstrap chain, by design, grows quickly and linearly with time. So it seems that, in the limit, it is really a choice between:
There is mrustc, which is written in C++ and allows you to bootstrap Rust. There is also a GCC Rust implementation in the works that will allow bootstrapping.
Is there a reference interpreter, perhaps? I imagine that that can’t be a complete solution since LLVM is a dependency, but it would allow pure-Rust toolchains to periodically adjust their bootstraps, so to speak.
Yes the new Oil C++ tarball is 375 kilobytes / 90K lines of compressed, readable C++ source :)
https://www.oilshell.org/release/0.14.2/
The resulting binary is about 1.3 MB now. I seem to recall that the nushell binary is something like 10 MB or 50 MB, which is typical for Rust binaries. rustc is probably much larger.
There was some debate about whether Oil’s code gen needs to be bootstrapped. That is possible, but lots of people didn’t seem to realize that the bash build in Nix is not.
It includes yacc’s generated output, which is less readable than most of Oil’s generated code.
For sure it would make bootstrapping much harder on things like OpenBSD. Good luck if you are on an arch that doesn’t have Rust or LLVM. That said, I don’t think this would replace stdenv, for sure not any time soon!
Also the article does mention exactly what you are pointing out:
it’s nowhere near the range of systems that can run Bash.
My question is orthogonal to this, and maybe I should have specified what I mean by bootstrapping. It’s “how many things do I have to build first, before I can have a working nixpkgs and build the things users ask for”. So if we assume that nushell runs wherever bash runs, how much more effort is it to build nushell (and Rust and LLVM) than bash? I would guess an order of magnitude more, thus really complicating the initial setup of nixpkgs (or at least getting it installed without any caches).
This sounds a bit inflammatory. For the benefits of us beginners, why do you consider that those tools are the right thing and golangci-lint should be avoided?
One example: golangci-lint includes a linter called ireturn, which warns when you return interface types from functions. It references Rob Pike’s opinion on the proverb “Accept Interfaces, Return Struct/Concrete Types”.
To which he responded
I’m not a great fan of that one. It’s subtle and tricky to explain compactly. And although popular, it’s not often easy to apply well. Plus: Why structs? There are many other concrete types in Go […] It’s very hard to be clear about where it applies, and it often doesn’t.
It’s a bit comical that this manufactured lint rule appeals to authority by linking to a Rob Pike conversation … which ultimately disagrees with the proverb that influenced the lint rule.
Linter rules are a bunch of opinions that aren’t agreed upon, while native go tools are for better or worse the “law of the land”. Nobody will ask you to love the law, but you’ll abide by it as long as it remains unchanged.
That has nothing to do with avoiding golangci-lint though. That they have a lint available doesn’t mean you have to use that lint, as can be seen by TFA disabling a bunch of them.
And that something is a compiler error doesn’t mean everyone agrees with it, just that whoever wanted it got it through before the language was published.
Furthermore, the existence of go vet completely breaks your quip by being both “a bunch of opinions that aren’t agreed upon” (because it’s a linter) and “the law of the land” (because it’s a native Go tool).
I’ll gladly concede that go vet is a linter, and I won’t bother arguing my ill-defined concept of “land law” with respect to its checks :)
However, every compiler error is indisputably the law of the land. You can disagree with compiler errors all day long, but without changing the compiler, any argument against them is moot. There is a high barrier to changing compiler errors; in that sense they are very much law.
As I said:
you’ll abide by it as long as it remains unchanged.
there are many issues with it, however the primary problem is that it bundles an old fork of staticcheck and disables many of their default lints. it also includes a bunch of linters that don’t really do anything.
here is one of several instances I know of from the staticcheck maintainer on twitter and I don’t even pay that much attention, I’m sure if you searched golangci-lint on the staticcheck issue tracker you’d find more
I’m not the maintainer, but I feel sympathy for him. That being said, golangci-lint isn’t doing anything wrong or illegal; it’s open source software, and you’re free to use it even in ways the original author doesn’t intend or approve of. You aren’t necessarily entitled to support, though.
Sounds like a weak way to promote one’s own tool. Just saying.
Not a very charitable take.
I don’t know anything about the specifics of that debate, and I haven’t (yet?) noticed problems using staticcheck with golangci-lint. But when I started using golangci-lint in December of last year, I immediately noticed that the results for revive were different if I used it under golangci-lint than if I used it alone. (Briefly, revive under golangci-lint was silent where there should have been warnings.) The problem persisted, and I ended up with this note in a git commit.
At the moment, the revive and golangci configurations are separate, and
they have some overlap. They don't seem to play nicely together, and
I don't want to spend too much time debugging that now. But I should
come back to this later.
As you can imagine, I never came back to it. The only way that I can get things to work correctly is (still) that I run revive alone and then I run golangci-lint for the rest of the linters I use. That’s not the end of the world, but it does suggest to me that the staticcheck maintainer is not making things up. golangci-lint seems to alter how (some?) tools operate under it. That may make sense—or even be necessary for some reason—but it can also be a pain. I like golangci-lint a lot, but it causes some problems too.
Trying to get the construct library in Python to parse a particularly nasty exchange protocol and play nicely with sockets.
I actually made the opposite move a few days ago. I switched over to Nix on my work laptop so I could access fresh packages with a faster install process, but after the recent performance improvements in the Homebrew 4.0.0 release, I ended up switching back. Don’t get me wrong, I find the Nix language truly delightful, I think flakes are brilliant, and I don’t mind the new Nix CLI, but I rarely find myself reaching for any of Nix’s special features. Even when I inevitably end up needing an ancient version of some random package for a work project, I usually end up using an environment manager like Conda/Mamba or a container manager like Docker/Podman so that I can more easily share my configuration with my Nix-averse colleagues. Moreover, I’m not a big fan of how often Nix installs unique versions of the same dependency for every new package I install, and I’ve found that the Homebrew equivalents of most packages tend to be kept better up-to-date and, more importantly, ship with better defaults (e.g., compilation flags, environment variables, and so forth). Similarly, I tried getting into Home Manager, but I failed to grok the advantage over Git-versioned dotfiles.
For context, at home and on personal servers, I satisfy my desire for fresh packages and speedy installations with pacman, either via Arch Linux or JuNest, and I scarcely require ancient software given how I manage dependencies in my personal projects (i.e., I’m happy to rewrite code to accommodate breaking changes).
I love Nix, and I want to need it, but I just don’t see how it improves my workflow.
Similarly, I tried getting into Home Manager, but I failed to grok the advantage over Git-versioned dotfiles.
+1 (and I do use NixOS). I can see a fundamental benefit of Home Manager for multi-user setups, as it allows declarative management of per-user environments. But on a typical single-user desktop system, why not just stuff packages into the system environment?
A non-fundamental benefit of HM is that a bunch of people use it, so it probably has convenient configuration modules for all kinds of software, which is a benefit even if you don’t care about user packages per se.
The reason I use home-manager is to have the same setup among all the machines I work on. It’s really great that my settings and favorite tools follow me.
I do the same, but without home manager: my dotfiles live in ~, and I store them in the same repo as flake.nix.
I don’t think I’d use NixOS as a daily driver (not least because my work uses a proprietary VPN client that doesn’t work on Linux), but I do run it on my home servers and it’s really great at that. I can install and uninstall services I want to try and feel confident that I’m not leaving config gunk lying around. I think I’d use a Nix profile on Ubuntu, Mint, or Arch if I wanted to get more into it.
I switched to Emacs from Vim a couple of years ago. To ease the transition I chose spacemacs because it offered evil-mode and had a lot of pretty good settings. However, it’s dog slow, even on a relatively beefy Macbook Pro. I even switched to using the latest version with native-comp, but no dice. Sometimes it even completely hangs when I edit python or go files. Have other people encountered the same issues? How do you fix them?
I’ve heard similar complaints from Spacemacs users; Spacemacs brings in so many things that it makes it hard to tell where a specific problem is coming from. Luckily it’s easy to use evil-mode on its own.
Have you tried Doom Emacs? I had similar issues you describe with Spacemacs, which caused me to switch to vanilla vim (yikes). A friend introduced me to Doom and it’s been pleasant. It feels batteries-included like Spacemacs, it has good evil support, and its rough edges haven’t been as rough as Spacemacs’s. My config files are pleasantly short and I believe that the legitimate configs (e.g. keybinds) outweigh the hacks (e.g. lsp parse failure workarounds).
My biggest issue so far is with lsp-mode choking on some inputs, but I kind of think it’s my fault.
My personal laptop is a 6-year-old Thinkpad running Linux, and Spacemacs is not at all slow. But six months ago I started a new job and they gave me a Macbook Pro M1 for work. I’ve been mostly impressed with the MacBook…superb screen, amazing battery life, and generally quite fast… except emacs seemed almost painfully slow, especially with Spacemacs. Installing emacs 28 and compiling all elisp to native code does improve things quite a bit, though.
I just have to ask the “is it plugged in” question: Have you checked you’re running an M1 build and not x86 via rosetta? (Some things get really slow when run via rosetta.)
How good are you with Emacs Lisp? Emacs has a built-in profiler – actually, there were two at one point! – which might at least help you figure out if there’s a specific culprit there, or if this is… just how fast the whole thing goes.
Folks in the Spacemacs community might also be able to help. I’m not familiar with it and I don’t use large configuration frameworks, but I get the feeling they have a very active community behind them.
Not to the same extent but I agree that Emacs is quite slow… for a text editor. Since it does so much more and is a Lisp interpreter underneath I guess that’s understandable, though still annoying.
One of my banks does this - with just numbers!
Numbers only
Minimum 7 numbers
Maximum 20 numbers
Can’t have same number three times in a row (e.g. 111)
Can’t have four ascending or descending numbers (e.g. 1234 or 4321)
Can’t have the same number appear more than five times.
Can’t have pairs next to each other if the second pair is one number higher (e.g. 1122)
Can’t be the same as your previous eight access codes
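For fun, the rules above can be sketched as a validator. This is my own illustration, not the bank’s actual code; the function name and structure are made up:

```python
import re

def valid_code(code: str, history=()) -> bool:
    """Check a candidate access code against the bank's rules above."""
    if not code.isdigit() or not 7 <= len(code) <= 20:
        return False
    if re.search(r"(\d)\1\1", code):               # same digit 3x in a row
        return False
    for i in range(len(code) - 3):                 # 4 ascending/descending
        diffs = {int(code[i + j + 1]) - int(code[i + j]) for j in range(3)}
        if diffs == {1} or diffs == {-1}:
            return False
    if any(code.count(d) > 5 for d in set(code)):  # digit used > 5 times
        return False
    for i in range(len(code) - 3):                 # pair then pair one higher
        a, b = code[i:i + 2], code[i + 2:i + 4]
        if a[0] == a[1] and b[0] == b[1] and int(b[0]) == int(a[0]) + 1:
            return False
    return code not in history[-8:]                # not in previous 8 codes
```

For example, `valid_code("1234567")` fails the ascending-run rule and `valid_code("1122334")` fails the pair rule, while `valid_code("7755331")` passes everything.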
Wow! How long do you think it will be before they start adding requirements like
The resulting number must contain at least three distinct odd prime factors
or
Must not contain your phone number partly or fully
Or contain your birthday, or any dates at all. In fact, it shouldn’t be divisible by 2 because most numbers are, and we want to make your password secure.
After the number of services hits 20ish, cute names stop being cute, at least for me. The reason is that I can’t immediately recall what each of the names stands for, which makes it hard to follow any discussion.
However, there is one benefit of using cute names: if you work in a secretive business, cute names allow you to discuss the services with colleagues after work without worrying too much about spilling company secrets. Or at least so I’m told.
One dividing line might be if the names at your company are “pets” or “cattle.” When you have less than a dozen computers/projects, you can give them all cutesy names. When you have more than that, you need descriptive names to keep sane.
For some reason, I got to read this submission and https://lobste.rs/s/vyb9rm/stack_graphs_name_resolution_at_scale in succession. In the stack graphs submission, GitHub produces stack graphs for every push to the repository. To speed things up, they rely on the fact that if a git push doesn’t contain some file, that file’s content hasn’t changed.
That got me thinking: could cargo-semver-checks use the same trick of leveraging git to update indices only for files that change? That should bring even more speedup, as less work needs to be done.
That submission was great, I originally saw the Strange Loop talk and loved it! And yes, the same trick could work here as well.
In practice, though, we have to regenerate the rustdoc JSON data because cargo-semver-checks has 40+ checks implemented (with more added constantly), most of which need more data than stack graphs could resolve (e.g. the implemented traits for a type, which requires type-checking due to auto-traits). This involves a trip through rustc, which is then free to change all the item identifiers in the new rustdoc JSON file, which in turn invalidates the index. I’ve had some conversations with the rustdoc maintainers about identifier stability, and together we decided that’s a “not now but maybe in the future” work item.
But recall there are two JSON files: one from the older “baseline” version and one from the newer “current / about-to-publish” version. The “current” JSON file has to be regenerated, but the “baseline” is almost always referring to a release that’s already on crates.io! That means we get to cache the baseline JSON file instead of rebuilding it, which saves a huge amount of time: an avoided build, plus better build caching for the other build since nothing got overwritten. This will ship in the next cargo-semver-checks version in the next few days, as soon as we’ve finished testing the alpha with our early-adopter users.
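The caching idea itself is simple to sketch. This is a hypothetical illustration (names and layout are mine, not the actual cargo-semver-checks code): a published release on crates.io is immutable, so its rustdoc JSON can be keyed by (crate, version) and generated at most once:

```python
import json
from pathlib import Path

def baseline_json(cache_dir: Path, crate: str, version: str, generate) -> dict:
    """Return the cached rustdoc JSON for a published release, building it once."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{crate}-{version}.json"
    if path.exists():
        # A published release never changes, so a cache hit is always safe.
        return json.loads(path.read_text())
    data = generate(crate, version)  # stand-in for the expensive rustdoc build
    path.write_text(json.dumps(data))
    return data
```

The second call for the same (crate, version) pair returns straight from disk without invoking the generator again.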
Could we then also cache the indexes for the baseline? Probably! Right now, it just isn’t worth it:
So even if we switched to a fancier persisted index (or even turned cargo-semver-checks into a daemon that keeps the index always warm in RAM), we’d win maybe 1s total, but pay for it dearly in extra complexity.
The bigger win we haven’t grabbed yet is parallelizing with rayon, as I hinted at the beginning of the post. We have 40+ queries on the same read-only dataset, completely independent from one another. That’s just a “parallel for loop,” and rayon will easily get us another O(# of cores) speedup.
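The shape of that win is easy to sketch. This Python stand-in (the real code would use rayon’s `par_iter` in Rust; the dataset and checks here are made up) just runs independent checks concurrently over one shared read-only dataset:

```python
from concurrent.futures import ThreadPoolExecutor

dataset = range(1, 101)         # stand-in for the parsed rustdoc JSON
checks = [sum, min, max, len]   # stand-ins for the 40+ independent lints

# Each check only reads the dataset and never touches the others,
# so they can all run concurrently with no synchronization.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda check: check(dataset), checks))

print(results)  # [5050, 1, 100, 100]
```

Because the checks share nothing but read-only input, this is embarrassingly parallel: the ideal speedup scales with the number of cores.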
At that point, we’re talking about a checking-only time of <1s for even the biggest crates, and we start optimizing things like “rustdoc JSON itself” and “JSON deserialization” :)
But I do need to finish the new Trustfall API first!
This is kind of a silly argument, because it assumes that the dynamism of the language precludes type-safety. That is simply not the case, it means that type-checks happen at runtime rather than during compilation.
Moving type-checks to compilation makes them more robust, but at the cost of the power you gain from dynamic language features. It’s a tradeoff, not a straightforward benefit. There are things you simply cannot do in a statically-typed language.
The first issue I faced was understanding the code. It took a sweet amount of time to figure out the kind of objects, things certain functions were receiving and what they were doing with them. The code did have some unit tests, but the coverage was poor. So I had to guess, make changes and test the code at many places.
The real problem here is that you allowed poor coding practices to ship. If you care enough about correctness to switch languages, then you care enough to add systems that enforce this. It’s a false dichotomy.
Years of coding in Go gave me this comfortable feeling: if it compiles, it works.
No the fuck it doesn’t are you kidding me. You are fully able to write incorrect Go code. You still need to write tests in Go.
I would even argue that “if it compiles, it works” is more realistic in Python + mypy than it is in Go:
This is of course assuming that everyone is very disciplined about using types and not abusing Any, which may be a strong assumption.
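A tiny example of the kind of error mypy catches before runtime (the function and values here are made up for illustration):

```python
# With annotations, `mypy` flags the bad call below at check time;
# plain Python would only fail at runtime, deep inside sum().
def total_cents(prices: list[float]) -> int:
    return round(sum(prices) * 100)

print(total_cents([1.25, 2.50]))  # 375
# total_cents("1.25")  # mypy: incompatible type "str"; expected "list[float]"
```

The catch, as noted above, is that nothing forces you to run mypy; the interpreter will happily execute the commented-out call and crash later.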
Passing the compiler’s type check is not sufficient to prove your program is doing what it is supposed to do.
There is one important distinction that I see here, and that burned me several times. If mypy fails, I can ignore it and still run that Python code, only to possibly fail later. If it fails in Go, there won’t be anything for me to run.
Nevertheless, I agree with you that there are some classes of errors that could be caught and prevented with mypy.
Kinda funny, it’s the opposite for me: when trying things out the flexibility of bypassing type checks helps, and CI ensures type checking is never ignored in prod.
I find the Go compiler especially annoying when experimenting because it won’t even let me run code with unused variables and imports, which really don’t matter when experimenting.
The core argument for compile time type safety is that sooner or later you always allow poor coding practices to ship. The larger codebase and the longer lived it is the more likely that there are dangers lurking in there for the next refactor that you won’t discover until you hit production. Compile time checks can’t eliminate them totally but they can reduce their frequency considerably. For some people and projects that’s a really valuable attribute.
The argument being made here is that Go’s types are preferable to Python because developers can’t be trusted to test their code. They are literally suggesting that types are a replacement for test coverage.
This is delusional. Types reduce the amount of testing needed, but they will never replace it. If your developers are this unprofessional, I guarantee they will find a way to fuck up in Go as well.
The argument being made here is that Go’s types are preferable to Python because developers can’t be trusted to test their code. They are literally suggesting that types are a replacement for test coverage.
No, I am not. Could you tell me how did you get that impression? I’d be happy to add a correction
Unit testing is the way to go. However, this is not always possible, for instance when enforcing rules within a team is difficult or when inheriting a legacy project. But does enforcing strict rules work in practice? You could set up a CI/CD pipeline enforcing 100% code coverage, but that will affect team productivity and surely piss people off.
So your argument is that Python is a bad choice because
This is a people problem, not a technical one. Using a static type system will not solve this lack of professionalism.
Meanwhile, elsewhere in this thread, people are literally arguing that 100% path coverage is necessary and that static types somehow offer the ability to do that.
I am attempting to give them the benefit of the doubt; if they intended to say that type-safety is equivalent to 100% code coverage, then that would be even more nonsensical.
You’re pretty obviously not giving him the benefit of the doubt. The “benefit of the doubt” reading is “getting and enforcing 100% code coverage is harder than using static types in conjunction with less-than-100% code coverage.”
Nobody anywhere is suggesting that 100% code coverage is reasonable. This is a dumb strawman about testing. I’m pretty sure that you know this.
When somebody writes a blog post that tells me the things I have spent the better part of a decade-plus career doing is impossible, I am gonna call bullshit.
I have literally gotten in arguments with people who definitely think that 100% code coverage is both reasonable and necessary. Perhaps it’s not actually a strawman but something they have encountered in their own career?
that the dynamism of the language precludes type-safety. That is simply not the case, it means that type-checks happen at runtime rather than during compilation.
Type checks that happen at runtime are not useful. In fact, they shouldn’t even be called types at that point. Types are for determining if static code terms are valid or not. And to even say this, it sounds like you didn’t read the article:
The first issue I faced was understanding the code. It took a sweet amount of time to figure out the kind of objects, things certain functions were receiving and what they were doing with them.
Dynamic types don’t help with that. And that’s not a “silly” argument. Do you realize that’s insulting?
The fact that you have a personal definition of types that requires static checks does not mean that all of computer science agrees with you.
If you are working with a dynamic language, dynamic type checks are not only useful they are absolutely necessary.
That is simply not the case, it means that type-checks happen at runtime rather than during compilation.
The question is when a check gives you the most value. You can say an exception is a unit test that runs in production, but I prefer to run my tests as early as possible, with reasonable effort. For the class of invariants that can conveniently be expressed and checked using a type system, that’s what I prefer. For those that can be caught by unit tests, that’s the best. For those that require integration tests… etc. All the way to load tests using realistic fake traffic, if that’s needed.
if it compiles, it works.
All depends on what you mean by “works”. Of course, in any moderately complex system, there are an infinite number of intended behaviors that can’t be expressed through a type system (or unit tests). The closest thing I’m aware of is Elm, where the combination of a good (and rather simple) type system and very limited runtime capabilities make it so that if it compiles, it’s extremely rare that there are runtime exceptions. It brings me a lot of joy to be able to skip out on the part of the development cycle where I finish editing some code and wait to see if the program crashes or not. It’s great to just skip ahead to seeing if the higher-order behaviors are as I expected or not. It really does save me time and, most importantly, helps me keep mental focus and enthusiasm.
For the class of invariants that can conveniently be expressed and checked using a type system, that’s what I prefer. For those that can be caught by unit tests, that’s the best.
This is perfectly reasonable, but I will point out that I follow this approach in dynamic languages as well. Dynamic languages are capable of exactly the same constraints (in fact, they can be even more sophisticated), but you have to go about it differently.
Type-checking at compile time comes at a cost: the robustness of the checks means you are giving up flexibility to do certain things. This may be a worthwhile tradeoff for your situation, but it should be made intentionally.
I want to push back against this notion that you need static types to achieve this level of correctness. You do not. Don’t blame your tools for unprofessionalism.
the robustness of the checks means you are giving up flexibility to do certain things
Could you show me some effect of a program that is possible to produce with dynamically type-checked tooling but not statically typed tooling?
Don’t blame your tools for unprofessionalism.
I’d argue that the industrial benefits of static typing are so obvious and accessible at this point in time that it’s unprofessional to operate under the assumption that “comprehensive” testing can replace the many roles of static tooling.
Here’s one example: I practice repl-driven development on a daily basis. IDE type hints can’t achieve this level of real-time feedback loop.
https://www.idris-lang.org/ (in action, ditto)
Many other examples of statically typed languages with REPL’s exist but this one is my favorite as it’s almost designed around progressive interactions with the REPL.
Moving type-checks to compilation makes them more robust, but at the cost of the power you gain from dynamic language features …
But Python’s type hints are extremely flexible and give you lots of power to take advantage of all the great dynamic language features of Python.
They are also not an all-or-nothing kind of thing. If you have code that needs to be super dynamic and creative, and type hints get in the way, then you can just as easily decide not to use them there.
(Or - in many cases you can usually create an “outer” layer of typed API while the inside is all the Python magic you want)
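A quick sketch of that outer-layer pattern (the function and field names are illustrative): the public entry point carries precise annotations, while the body stays fully dynamic:

```python
from typing import Any

def summarize(records: list[dict[str, Any]]) -> dict[str, int]:
    """Typed outer layer: callers see precise input and output types."""
    # Inside: fully dynamic -- count whatever keys show up, no schema needed.
    counts: dict[str, int] = {}
    for rec in records:
        for key in rec:
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Callers (and mypy) get a firm contract at the boundary, while the implementation keeps all of Python’s duck-typed flexibility.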
I believe both our arguments are perfectly compatible here. I don’t really consider Python’s type-hints to be in the same category as Go’s type checking, I am attempting to speak more broadly about dynamic vs static type systems and not get bogged down in a specific implementation.
We can see similar issues in Europe, too. For example, Serbian and Bulgarian Cyrillic have some subtle differences, and readability depends on the font used. Most fonts optimize for, or only have, the Russian/Bulgarian variants, but not the Serbian/Macedonian ones. For example, b, p, and t are written differently, while other letters are pretty similar.
Never heard of the Parquet data format before but it looks pretty interesting. Anyone ever used it in practice?
I use Parquet at work all the time, though largely through Delta tables, so most of the complexity is abstracted away. The main selling points talked about online are the row-columnar layout (row group boundaries can be tuned to e.g. store images), page compression, and dictionary encodings. But what I found really useful after using it for a while is the excellent support for nested schemas and arrays, implemented by so-called “definition” and “repetition” levels. Dremel, on which Parquet is based, was built specifically as an “interactive ad-hoc query system for analysis of read-only nested data”. At first everyone on the team wanted to normalize the tables to avoid those complex types, but we quickly found that ingesting heavily nested JSON without flattening improved both the ingestion experience and the reader experience, without bringing down the performance of either. There are other similar formats, like ORC, but Parquet is actively developed (they’re adding first-class Bloom filters right now!), and Apache Spark is well integrated with it.
I use it daily. But to me, the format seems a bit chaotic, since different readers may support different feature sets. For example, when I tried writing a parquet file with categorical values and delta encodings, with addition of zstd compression, that file was only readable by my go program, not with pandas nor Dremio nor Clickhouse :(
Edit: this is a really great article outlining some of the features in parquet: https://wesmckinney.com/blog/python-parquet-multithreading/
I have, used it with Spark and AWS Athena, it works quite well.
Its main selling points (compared to something like compressed JSON):
It’s column based. That has a lot of advantages, but to me the main one is that readers can read only a subset of columns and it will cost about the same as if the file contained only these. It also enables some efficient compression of numeric columns.
Files are split into blocks, and the start of each block is stored in the file footer. This means that if a file is split into 10 blocks, 10 workers can read it concurrently without needing any synchronization.
Predicate pushdown. Each block contains some metadata about the values of columns in the block (e.g., min/max). This enables readers to skip blocks that will not matter to them. For example, if you have an age column and are looking for rows with age > 10, and max is 9 in the metadata, you skip the block.
Self-describing: the schema is stored in the file, so the file is self-sufficient.
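Predicate pushdown in particular is worth a toy illustration. This is a pure-Python sketch of the idea, not the real Parquet reader: each block carries min/max stats, and a filter like `age > 10` lets the reader skip blocks that cannot possibly match:

```python
# Each "block" mimics a Parquet row group: min/max column stats plus rows.
blocks = [
    {"min": 1,  "max": 9,  "rows": [3, 7, 9]},     # all ages <= 9
    {"min": 12, "max": 30, "rows": [12, 25, 30]},  # ages 12..30
]

def scan_age_gt(blocks, threshold):
    """Return ages > threshold, skipping blocks whose stats rule them out."""
    hits = []
    for blk in blocks:
        if blk["max"] <= threshold:
            continue  # no row in this block can match: skip without reading it
        hits.extend(r for r in blk["rows"] if r > threshold)
    return hits

print(scan_age_gt(blocks, 10))  # [12, 25, 30] -- first block never read
```

The win is that skipping happens from footer metadata alone, before the block’s data pages are ever decompressed.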
It’s my go-to for data warehousing.
I wouldn’t want to be implementing a reader though!
Anyone ever used it in practice?
I know you’re probably asking for real details, but it’s also worth saying it’s the de facto standard “big data” file format. Basically all modern “big data” systems support importing parquet first class, or directly use parquet as their preferred storage format (usually in an external blob store like S3).
This is a weird way to measure a language. Look at GitHub and how productive a single developer can be with Go. Some programs are editors, computer games or command line utilities. Not every program needs the type of defensive programming the article promotes.
Inherent complexity does not go away if you close your eyes.
This is true. But not all projects have this particular type of inherent complexity.
Successful projects, quick compile times, an ecosystem that lets you add a dependency that works, not having broken code just because three years have passed; these are also nice things.
One should not close one’s eyes to the quick development cycles and low cognitive overhead experienced Go developers can achieve, either.
This is a weird way to measure a language. Look at GitHub and how productive a single developer can be with Go.
You could apply this argument to literally any language.
No, you couldn’t, and there is no comparison. Look at active repos in Common Lisp, Scheme, Forth, Smalltalk, Ada, you know, the languages everybody talks about nicely. Then look at Go. If there are two types of languages, the ones everybody complains about and the ones nobody uses, Go is certainly in the first group.
Go is very popular, and getting more popular, and there is no denying it: https://madnight.github.io/githut/#/pull_requests/2022/3. Tons of people are learning go in their free time and using it to get stuff done, because it is easy to learn and easy to get stuff done with.
Now you’re arguing by popularity, which is a fallacious argument and a very different thing from “you can get stuff done with it”. Clearly you can get stuff done with the other languages too.
Github and the popularity of the language in the platform proves that many single developers can and are getting stuff done with it. Which is a proof that you don’t get with other, unpopular languages. You might claim that they are just as good, but you are not getting proof of it from github.
Popularity is not a good measure of anything. Tobacco smoking is popular. Eating Tide-Pods used to be popular.
Tobacco smoking and eating tide pods do not produce anything (well, except damage to the person doing it). I think corasara is implying that projects on github demonstrate that people get stuff done with Go. To put it differently, TikTok and Insta measure popularity. GitHub showcases code that actually does something (well, most of the time). And the fact that there are many repositories on GitHub implies that many people do get stuff done with Go. Now, there might be an issue that users of other languages do not advertise their work on GitHub…
My point is that the people that get stuff done with Go could be just as well getting stuff done with other languages, but most of them haven’t really tried many other languages besides the ones that are obviously worse for productivity, such as C.
I think this is wrong. I think developers that have tried many different programming languages are the ones that most appreciate the things that Go gets right.
but most of them haven’t really tried many other languages besides the ones that are obviously worse for productivity, such as C.
This belief reveals a great deal about you and I’m not sure reflects anything true about Go programmers.
Wow, look at the replies you’re getting. These people are coming out of the woodwork to defend their questionable career choices. One could base a whole career on Go criticism.
This is ridiculous. You are comparing a decade of individuals spending their rarest resource, their free time, to a claim that a 4chan meme was real and to chemical addiction.
Get over it, some people use go because it’s the best choice for them.
This is not your schools debate club.
I agree with some of this post, and some parts not so much. But this one always irritates me:
Do not make production changes on Fridays
If you can’t deploy on a Friday, you need to fix your deployment strategy. By removing Friday from when you can deploy, you’re wasting 1/5 of your available days.
Note: deploy != Release. Use flags, canaries etc.
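A minimal sketch of the deploy-vs-release split using a flag (all the names here are invented): the new code ships dark on Friday, and the actual release is just flipping the flag, whenever you're comfortable.

```python
# Deployed Friday with the flag off; "released" later by flipping the flag.
FLAGS = {"new_checkout": False}

def legacy_checkout_flow(cart):
    return {"total": sum(cart), "flow": "legacy"}

def new_checkout_flow(cart):
    return {"total": sum(cart), "flow": "new"}

def checkout(cart):
    # Both code paths are deployed; the flag decides which one runs.
    if FLAGS["new_checkout"]:
        return new_checkout_flow(cart)
    return legacy_checkout_flow(cart)

print(checkout([5, 10]))  # {'total': 15, 'flow': 'legacy'}
```

In a real setup the flag would come from a config service so you can flip it (or roll it out to a canary percentage) without another deploy.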
I strongly disagree with you. Incidents are fairly strongly correlated with changes. Relying on on-call engineers who have less context than whoever deployed, and extending the time taken to bring people in, extends the time it takes to resolve an incident.
You can absolutely develop on Fridays, if you insist on working on Fridays. But maybe it’s a better time to chill out or have meetings, as everyone’s decision making will be worse for being tired from the week.
Sure, but incidents can be reduced with smaller changes deployed (and released separately, where possible.)
Our general guidance is “be thoughtful/don’t be rude” e.g. don’t click “merge” and then shut your laptop lid.
Sounds a lot like “don’t make mistakes”. Which is harder on a Friday, not least due to tiredness, but also because you might have plans, or you might have managers who are desperate to report in meetings that things got deployed.
But really, what is this extra day of deploying (or releasing) buying you?
Less stress on Monday! If lots of developers are waiting for Monday to deploy, that makes it all the more risky. It also ends up encouraging people to batch their changes (“I have 3 things done, but as it’s Friday I’ll put them all out together on Monday”), and again that increases risk.
If deployment on Friday is a risk, what makes you think Thursday is so much safer?
I wish I could source this quote, because it really sums up deployment for me:
Deployments should be boring, they are the heartbeat of your organisation
It also ends up encouraging people to batch their changes (“I have 3 things done, but as it’s Friday I’ll put them all out together on Monday”), and again that increases risk.
Really? I’d consider this less risky because the changes live together in test.
If deployment on Friday is a risk, what makes you think Thursday is so much safer?
As to saving Monday stress - I’d rather increase the risk of incidents on Monday and Tuesday to reduce the chances of incidents at the weekend.
Really? I’d consider this less risky because the changes live together in test.
With a few caveats, bundling changes together increases risk due to how those changes could intersect; if they are completely isolated from each other, then sure, the risk is about the same as deploying separately, but if they are more complex changes, or touch similar areas (or related ones, e.g. a library update and a UI change which somewhere depends on said library, transitively), then the risk compounds: it’s multiplicative rather than additive.
The other issue with deploying multiple changes together is that if something goes wrong, you now have a bigger change set to investigate, which can slow down your time to repair, or if you have to roll back you are now undoing unrelated changes too.
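A back-of-the-envelope for why batching compounds risk: if each independent change fails with probability p, a batch of n changes fails with probability 1 - (1 - p)^n. The 5% per-change rate below is purely illustrative.

```python
# Probability that at least one change in a batch of n causes a problem,
# assuming independent changes each failing with probability p.
p = 0.05  # assumed per-change failure rate, not a real measurement

for n in (1, 3, 5):
    batch_failure = 1 - (1 - p) ** n
    print(n, round(batch_failure, 3))
# prints: 1 0.05 / 3 0.143 / 5 0.226
```

And that's before accounting for interactions between the changes, which push the real number higher than the independence assumption suggests.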
I do definitely want to avoid weekend work; having been in this particular project where we have ~50 developers in one repo deploying 10-30 times per day, the number of times we’ve had an outage that needed someone to work a weekend has been maybe 3? I think one of those was just switching off the flag and coming back to it on Monday too. The site in question is no Facebook for traffic levels, but it is significant.
That said, I don’t buy your freshness argument in the slightest. If your people are not fresh enough to deploy on Friday (responsibly), then they are not fresh enough to debug on Friday something they deployed on Thursday, so you shouldn’t deploy on Thursday either.
It really seems to me like your mindset is along the lines of “all change is risky, we should minimize that risk by minimizing the number of times we make changes”. I think this can be a reasonable strategy.
I would argue, similarly to @pondidum, that batching changes up like that doesn’t minimize risk, it at best concentrates when the new problems may arise.
Speaking just from my own experience, having a big batch deployment process where you bring in a bunch of changes (frequently from different people) and deploy them all at once ends up causing more problems than letting anybody deploy whenever they’d like. The biggest reason is that the person doing the deploy frequently doesn’t have the context on the changes they’re herding out. Speaking for myself, it may just be one weekend later, but remembering the nitty-gritty details of what I did on Friday and how it may interact with production is really hard on Monday; now add in changes made by other people that you may not have dealt with at all!
To me, if the goal is minimizing off-hours work, then you want small, bite-sized change sets, deployed by the person who actually wrote that code, so they can debug it immediately if any problems show up. In that case, the minimizing risk rule shouldn’t be “don’t deploy on Friday” and instead should be “don’t deploy in the hour before you sign off for the day”.
And I know this is a bit of an appeal to authority, but I am saying this as someone who’s been working as a devops/infrastructure person on a highly complex platform for six years where we do let developers hit the deploy button whenever, and the only after hours issues we’ve had to deal with have been amazon breaking and not our apps breaking.
Seems like your assumption is that there is no environment where you can stage all your changes and integration test them. My assumption is that you should have that to the extent possible. That way all your changes can play together and you can see how they interact. Certainly if you’re releasing multiple times a day, you lose the ability to immediately know what broke production.
I personally would change it from “Do not make production changes on Fridays” to “Do not make risky production changes late in the work day”.
I’d agree to that! Fits well with my general guidance on deploying: “be thoughtful/don’t be rude” e.g. don’t click “merge” and then shut your laptop lid.
While I broadly agree that you should be able to deploy at any time, I think it comes down to how well managed your risks are. For one, at my work it’s fairly common to keep half an eye on a service for a bit (e.g. five minutes to half an hour) after you’ve deployed, so leaving work right after a deploy would be considered bad form. That said, the likelihood of provoking a failure will depend on throughput, amongst other things.
On top of that, deploying a new version of your application itself is (hopefully) a less invasive change (infrastructurally) than deploying new supporting infrastructure. So you might say you don’t want to do anything with terraform on Friday afternoons, since if that goes wrong, you want to give yourself a decent window for recovery.
All the other comments mention that Friday is bad because engineers are tired and there is a higher chance of errors. Another way to look at why deploys on Fridays are bad is because of the people who are using that code. If deployed instances bring any change (and they do), that will disrupt someone’s workflow, on that very same Friday or during an important weekend, risking wasting their time.
It’s great to see it’s catching up with https://devenv.sh :)
Poetry would take an inordinately long time to resolve the required dependencies to install a package. Perhaps it was a one-off type of thing? Unfortunately not. It got to the point where I actively avoided using poetry to install dependencies and resorted to adding the dependency in pyproject.toml by hand, installing it locally using pip, and exporting the requirements as I did before
Computing a solution to the dependency graph (i.e. generating the lock file) is an NP-hard problem. It gets much slower with every package that you add. In a large project, it will take a long time, especially since poetry is written in Python. The slowness is usually not a big deal, because you only have to compute a new lock file when you want to add/remove/update dependencies.
It sounds like the author is computing a new lock file every time their build script is run. If so, that’s a misuse of poetry. You should generate it once, check it into the repo, and then you can run poetry install to fetch the packages after that.
tl;dr: I think the author has unrealistic expectations for a Python package manager and is probably misusing poetry. IMO poetry is the best Python package manager and we should be encouraging its use, not spreading FUD about it.
If I recall correctly, the slowness of Python dependency resolution comes from the fact that setup.py has to be executed to get the list of dependencies. Repeating that invocation for all candidate versions really quickly adds up.
Computing a solution to the dependency graph (i.e. generating the lock file) is an NP-hard problem. It gets much slower with every package that you add.
While NP-hard problems are in principle extremely slow, in practice we’re really good at quickly solving even very large problems, as long as they’re not pathological. “Very large” in this case is millions of clauses — something well beyond the size of the typical dependency graph
In a large project, it will take a long time, especially since poetry is written in Python.
On the other hand, if Poetry wasn’t offloading the problem to a dedicated SAT solver, that’s a pretty big design failure.
The slowness is due in part to the fact that calculating the full dependency set requires downloading and inspecting the packages. Even for ones that use static metadata declaration, so that the setup.py script doesn’t exist/doesn’t have to be executed, the static package metadata is still inside the package itself. There’s work ongoing to expose more package metadata directly through the API of the Python Package Index, which would allow fetching the dependency list for a package without needing to obtain a copy of the full package, but it’s not 100% there yet.
The same issue also affects pip and its dependency solver, and any other tools which try to calculate dependency trees for Python packages.
On the other hand, if Poetry wasn’t offloading the problem to a dedicated SAT solver, that’s a pretty big design failure.
I wouldn’t say so: based on experience with Cargo, I think reasonable design choices are:
Most of the complexity in Cargo’s implementation seems to come from domain modeling (many different kinds of dependencies) and the desire to provide good error messages. Performance is also important, but I wouldn’t say it’s an overriding concern.
Maybe for Python, with its more historically baroque dependency graphs and lower performance, reaching for an off-the-shelf solver makes more sense. But then, I’d assume that for Python packaging tools, being “pure Python” is a hard requirement (you don’t want installing the installing tool to be harder than actually installing stuff!), and I’d assume good SAT solvers are not pure Python?
In my defense, I was most definitely not computing poetry.lock every time. I generated it once, committed it, ran the relevant commands (e.g. ‘add’), and observed obscene amounts of slowness that I didn’t feel were justified. Maybe it was to be expected given the inherent difficulty of dependency resolution and I should have adjusted my expectations accordingly, but when I was able to switch to the pip-tools workflow I no longer had to contend with such annoyances, given the already low amount of value Poetry was providing.
If Poetry works for you, then I am genuinely glad for you as I am a strong believer in having tools around that remove unnecessary friction from our lives. Poetry just didn’t fit the bill for me and made me be more wary of introducing such tools (and any dependencies, to be honest) in the future.
Computing a solution to the dependency graph (i.e. generating the lock file) is an NP-hard problem. It gets much slower with every package that you add.
It’s not necessarily NP-hard. If you only allowed one-sided constraints (e.g., bounding semver ranges from below, and allowing duplicate major versions), a greedy algorithm would work. This probably doesn’t work for Python, which has to work with existing conventions instead of designing something easily solvable.
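A toy sketch of that greedy idea, assuming only lower-bound (“at least version X”) constraints; the package names and versions below are made up:

```python
# Each requirement says "package at or above this version". With only
# lower bounds, taking the max lower bound per package satisfies every
# constraint in a single pass, with no backtracking or search.
requirements = [
    ("serde", (1, 0)),
    ("serde", (1, 3)),   # another dependent wants a newer serde
    ("rand", (0, 8)),
]

def resolve_greedy(reqs):
    chosen = {}
    for name, lower_bound in reqs:
        # max() over lower bounds satisfies all ">=" constraints at once.
        chosen[name] = max(chosen.get(name, lower_bound), lower_bound)
    return chosen

print(resolve_greedy(requirements))  # {'serde': (1, 3), 'rand': (0, 8)}
```

The NP-hardness only bites once you allow upper bounds, exclusions, and single-version-per-package rules, because then picking one version can invalidate choices made elsewhere and you need search.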
Types are OK; making assumptions about similarly named types from different crates is not. This whole article is a huge endorsement for reading the documentation of the components we use before coding something.
I can’t unfortunately, it doesn’t seem to have an “edit” button. I believe it needs moderator access or something.
Part one is here: https://johnnysswlab.com/decreasing-the-number-of-memory-accesses-1-2/ but I found part two to be more interesting.