1. 18

    This resembles my experience in adding types to existing projects: you almost always find a couple of real bugs. The other thing is that typechecking speeds up development: mypy is usually quicker to run than the test suite, so you waste less time before finding out you’ve made a silly mistake.
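    Not from the thread, but a minimal illustration of the kind of real bug this turns up: forgetting that a lookup can return None. The example is invented, and the exact mypy wording may differ by version:

```python
from typing import Optional

def find_user(users: dict[int, str], uid: int) -> Optional[str]:
    """Return the user's name, or None if the id is unknown."""
    return users.get(uid)

def greeting(users: dict[int, str], uid: int) -> str:
    name = find_user(users, uid)
    # Without this None check, mypy reports something like:
    #   error: Unsupported operand types for + ("str" and "None")
    # long before the test suite gets a chance to crash.
    if name is None:
        return "Hello, stranger"
    return "Hello, " + name
```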

    1. 4

      I wholeheartedly agree; however, the type errors can be dizzying for programmers who aren’t software engineers. I work with data scientists & product managers who contribute Python code, and adding mypy types had some negative effects on their ability to contribute. Overall, I think we came out ahead; I’m thankful for mypy. I’d love to see better error messages.

      1. 5

        Yeah, this is somewhere where I think most type checkers/compilers leave a ton of value on the table – tracking down a bug caught by a type error is usually much easier than one caught by a test suite (or in prod…), because it points you to the source of the error rather than the eventual consequences of not catching it. But then many type checkers do a poor job of explaining the error, which undermines this. Elm deserves mention for doing a particularly good job here.

        1. 3

          I would rather teach data scientists who use Python about how to use type annotations than forego using them in Python programs just in case a data scientist needs to touch that code.
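          The day-to-day subset really is small. A sketch of the kind of annotations that cover most data-science code (both functions are invented for illustration):

```python
def zscore(x: float, mean: float, std: float) -> float:
    """How many standard deviations x sits from the mean."""
    return (x - mean) / std

def counts(labels: list[str]) -> dict[str, int]:
    """Tally how often each label appears."""
    tally: dict[str, int] = {}
    for label in labels:
        tally[label] = tally.get(label, 0) + 1
    return tally

print(zscore(12.0, 10.0, 2.0))   # 1.0
print(counts(["a", "b", "a"]))   # {'a': 2, 'b': 1}
```

          Built-in generics like list[str] and dict[str, int] need Python 3.9+; on older versions it’s typing.List and friends.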

          1. 2

            I work on pytype, and we do try to improve the error messages where we can (e.g. here’s a recent commit improving “primitive types ‘str’ and ‘int’ aren’t comparable” to “primitive types ‘x: str’ and ‘10: int’ aren’t comparable”). However, when you’re down in the weeds of developing a type checker it can often be hard to notice that an error message is not readily comprehensible or helpful. I would encourage you to file a bug with mypy whenever you find an error message hard to read.

        1. 2

          This is cool, and Redirector is a very cool project! Are there any lists of common redirections for, eg, twitter and other sites that are very slow? I use quite an old machine to browse the web

          1. 2

            Thanks! Redirector isn’t mine FYI, but I agree it’s nice. @ploum pointed me toward Privacy Redirect which has similar sites listed. That might work well for you 🙂

          1. 10

            All this because Mozilla leadership still haven’t set up Firefox to take community funding directly, and instead want to use people’s donations on their irrelevant projects.

            1. 3

              As I understand Mozilla’s legal structure, you cannot at present give money to Firefox at all.

              Donations given to the foundation cannot be passed to the corporation. The irrelevant projects you mention (and there are a lot of them) come out of the Firefox profits so are eating the seed corn directly. I seem to recall off-hand that a lot of the donation money goes on grants to external organisations.

              1. 2

                And how many people would actually give Firefox money directly?

                1. 6

                  I’d give them $1/mo for sure. Maybe more, depending on what they did with it.

                  1. 5

                    maybe if you could specifically give money to fund the useful parts like FTP and RSS support, and ALSA

                    1. 3

                      I’ve donated as much as $75/mo to neovim. I don’t donate as much nowadays but if I could donate to a specific dev working on furthering my interests in firefox, I would.

                      I wonder if something like Igalia’s open prioritization would work for Firefox itself.

                      1. 2

                        We won’t know until they try. But for some points of reference: bcachefs, which is still an out-of-tree alpha-level project, gets $2k/month; WhatsApp in 2013-14, charging a dollar a year (easily avoidable), was decently profitable; Wikipedia gets lots of donations annually even though they don’t really need it; and neovim gets probably $50k/yr between various funding methods, and neovim is relatively obscure. You can still ask for money on the internet and get a decent sum. With a user base the size of Firefox’s, they could definitely give it a go.

                    1. 2

                      You have neglected to disable FLoC with Permissions-Policy: interest-cohort=(). Like you I’m disappointed with Google ignoring parts of the robots.txt standard that they clearly understand the meaning of and am unpersuaded by their reasoning but for me the FLoC system is much more obnoxious.

                      I can also tell you from personal experience that disabling FLoC harms your search appearance considerably. My own website was downranked hard after I added that header.
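                      In case it helps anyone else: if you can’t set the header at the web-server level, most frameworks make it a one-liner. A minimal WSGI sketch (the middleware and demo app are made-up names, not from any particular framework; only the header itself is standard):

```python
def add_floc_optout(app):
    """WSGI middleware appending the FLoC opt-out header to every response."""
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            headers = list(headers) + [
                ("Permissions-Policy", "interest-cohort=()")
            ]
            return start_response(status, headers, exc_info)
        return app(environ, start_with_header)
    return wrapped

def demo_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

app = add_floc_optout(demo_app)
```

                      At the server level the equivalent is a single add_header (nginx) or Header set (Apache) line.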

                      1. 1

                        Writing a new blog post for my website. We’ll see if I get this one out the door this weekend (unlikely, based on past performance). It’s on topic here, so I hope to submit it when I’ve finished.

                        1. 15

                          Packaging is usually people’s #1 problem with the Python ecosystem. I do empathise with people struggling, but often when I read the specifics of the problems people have, they seem to be doing what I would consider “ambitious” things, like mixing package managers (conda and pip, in one example in TFA), trying to do data science on a musl libc Docker container, or trying to take their app to prod without packaging it up.

                          In Python’s case I think the packaging system is solid overall and does work, though there are bear traps that unfortunately aren’t labelled well. That said, I think poetry does provide the right UX “affordances” and I think it should be the default choice for many.

                          1. 14

                            My latest problem with Python packaging was with setting up JupyterHub at $JOB. Here is what happened:

                            • building a notebook docker image FROM jupyter/scipy-notebook
                            • RUN conda update
                            • environment completely broken
                            • change conda update to only update the specific package we need a bug fix for
                            • environment completely broken
                            • ADD fix_a_specific_bug_in_that_specific_package_myself.diff /tmp/patch.diff
                            • WORKDIR /opt/conda
                            • RUN patch -p3 < /tmp/patch.diff

                            I did nothing ambitious, unless updating packages is “ambitious” in Python. The entire packaging ecosystem seems like a huge dumpster fire. Why are there so many different package managers? I can only imagine because the authors of each package manager thought all the other ones suck too much to bother using, which seems completely true to me.

                            Ruby just has gem. Rust just has cargo. They both just work. Python has really dropped the ball here.

                            1. 7

                              Have you tried just using pip? pip install jupyterlab just works for me.

                              To be honest, I see a lot of people complain about conda, which makes me suspect it doesn’t work well. I have gotten a lot of mileage out of pip and python -m venv (and even then, if you’re in Docker you don’t need to mess with virtual envs). Though maybe you’re on Windows?

                              1. 2

                                The official Docker image we based ours on uses Conda, and mixing Conda and pip is asking for even more pain. And I was under the impression Conda has packages that pip doesn’t, but maybe that’s not true anymore.

                              2. 4

                                At $WORK, for the past decade-plus our policy has been: whatever our OS (typically Ubuntu LTS) packages for dependencies, and nothing else. If we need more than that, the only option is to import the package into the tree as a direct dependency, and we now OWN that code. This is the only sane way to do it, as the Python packaging stuff after many decades is still mostly broken and not worth trying to fight with.

                                We’ve since started moving to playing with Nix and Docker containers, which mostly solve the problem a different way, but it’s still 90% saner than whatever python packaging people keep spouting. Note, the technical issues are basically 100% solved, it’s all community buy in and miserable work, nothing anyone wants to do, which is why we are stuck with Python packaging continually being a near complete disaster.

                                Maybe the PSF will decide it’s a real problem eventually, hire a few community organizers full time for a decade and a developer (maybe 2, but come on, it’s not really a technical problem anymore) and solve it for real. I’m not holding my breath, nor am I volunteering for the job.

                                1. 2

                                  My solution has been the opposite: do not rely on brew or apt for anything. Their release strategies (like “oh we’ll just upgrade python from 3.8 to 3.9 for you”) just don’t work, and cause pain. This has solved so much for me (and means I’m not breaking the system Python when doing “weird” stuff). Python.org has installers, after all!

                                  Granted, I’m not on Windows, but everything just works for me and I think it’s partly because package maintainers use the system in a similar way.

                                  I kinda think Python needs a “don’t use anything from dist-packages and fully-qualify the Python version everywhere” mode, to prevent OS updates from breaking everything else.
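                                  You can at least detect the dangerous case from inside the interpreter: a venv rewrites sys.prefix while sys.base_prefix keeps pointing at the base install. A hedged sketch (is_isolated is a made-up helper):

```python
import sys

def is_isolated() -> bool:
    """True when running inside a virtual environment: venv/virtualenv
    point sys.prefix at the environment while sys.base_prefix keeps
    the base interpreter's location."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

if not is_isolated():
    print("warning: running against the system interpreter")
```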

                                  1. 1

                                    There is definitely work involved to “port” our software to the next LTS branch, but it’s usually not miserable work. Plus we only have to do it every few years. The good thing is, by the time a Python package makes it into a .deb in the -stable branch, the code is also pretty stable, so upgrades and bugfixes are usually pretty painless.

                                2. 1

                                  Yes I’m afraid I would consider blithely updating lots of software to the latest version very ambitious. Only a loon would update a number of different pieces of software at once with the serious expectation that everything would continue working, surely you can’t mean that?

                                    I can’t speak to other package managers in general, except to contradict you on rubygems. Last time I was running Ruby in prod we had to increase the VM size just to run rubygems in order to run Puppet. Our rubygems runs were unreliable (transient failures à la npm) and this was considered common. Perhaps this is an out-of-date view; it was some years ago.

                                  1. 15

                                    That’s insane. apt-get upgrade works fine. gem update works fine. cargo update works fine. Each updates the packages while maintaining dependency version invariants declared by those packages. Conda seemed to be trying to do that, but couldn’t do it correctly, and then trashed the system. This is not a failure mode I consider acceptable.

                                    Sure, sometimes upgrading to newer software causes issues with backwards compatibility, hence stable and LTS release channels. But that’s not what happened here. After conda update completed, it was impossible to do anything else with conda at all. It was utterly broken, incapable of proceeding in any direction, no conda commands could successfully change anything. And the packages were all in completely inconsistent states. Stuff didn’t stop working because the newer packages had bugs or incompatibilities, they stopped working because package A required B version 3+, but version 2 was still installed because conda self-destructed before it finished installing B version 3. Has that ever happened to you with gem?

                                    If updating my packages in the Python ecosystem makes me a loon, what is normal? Does anyone care about security patches? Bug fixes?

                                    I can’t comment on gem‘s memory usage, I’ve never run it on such a small machine that I’ve had that problem. But I haven’t had transient failures for any reason other than networking issues. I have used gem to update all my --user installed gems many times. Conda literally self-destructed the first time I tried. Literally no issue with gem comes remotely close to the absolute absurdity of Python packaging.

                                    1. 4

                                      apt-get upgrade does work pretty reliably (but not always, as you imply) because it is a closed universe, as described in TFA, in which the totality of packages is tested together by a single party - more or less, anyway. Cargo I have not used seriously.

                                      Upgrading packages does not make you a loon, but blindly updating everything and expecting it to work arguably does - at least not without the expectation of debugging stuff, though I wouldn’t expect to have to debug the package manager. It sounds like you also ran into a conda bug on your first (and apparently sole?) experience with Python packaging. I can’t help you there, except to say that it is not the indicative experience you are extrapolating it to be.

                                      I don’t want to get deep into a rubygems tit-for-tat, except to repeat that it has been a huge problem for me professionally, and yes, including broken environments. Rubygems/bundler performance and reliability were among the reasons that team abandoned Puppet. Ansible used only the system python on the target machine, a design I’m sure was motivated by problems having to bootstrap ruby and bundler for puppet and chef. I’m sure there were other ways to surmount that, but this was a JVM team and eyes were rolling each time DevOps was blocked on stuff arising from Ruby packaging.

                                      1. 2

                                        your first (and apparently sole?) experience with python packaging

                                        Latest experience. Previous experiences have not been so egregiously bad, but I’ve always found the tooling lacking.

                                        having to bootstrap ruby and bundler for puppet

                                        That’s the issue. As a JVM team, would you run mvn install on production machines, or copy the jars? PuppetLabs provides deb and rpm packages for a reason.

                                        Regardless, even though gem has problems, I can still run gem update on an environment and resolve any problems. What about Python? Apparently that’s so inadvisable with Conda that I was a loony for trying. It’s certainly not better with pip, which doesn’t even try to provide an update subcommand. There’s pip install -U, which doesn’t seem to provide any way to ensure you actually end up with a coherent version set after upgrading. Conda, though it blew up spectacularly, at least tried.

                                        Seriously, how are you supposed to keep your app dependencies updated in Python?

                                        1. 2

                                          how are you supposed to keep your app dependencies updated in Python?

                                          In the old days, there were scripts that told you what was outdated and even generated a new requirements.txt. Eventually pip gained the ability to list outdated deps by itself: https://superuser.com/questions/259474/find-outdated-updatable-pip-packages#588422

                                          State of the art — pyproject.toml projects — your manager (e.g. pdm or poetry or whatever else because There’s One Way To Do It™) just provides all the things you want.
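                                          The JSON output pip offers for this (`pip list --outdated --format=json`) is easy to post-process; a sketch, where the parser itself is invented and the field names are the ones pip currently emits:

```python
import json

def parse_outdated(report: str) -> list[tuple[str, str, str]]:
    """Turn `pip list --outdated --format=json` output into
    (name, installed, latest) tuples."""
    return [
        (row["name"], row["version"], row["latest_version"])
        for row in json.loads(report)
    ]

sample = '[{"name": "requests", "version": "2.25.0", "latest_version": "2.26.0"}]'
print(parse_outdated(sample))  # [('requests', '2.25.0', '2.26.0')]
```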

                                    2. 8

                                      Yes I’m afraid I would consider blithely updating lots of software to the latest version very ambitious. Only a loon would update a number of different pieces of software at once with the serious expectation that everything would continue working, surely you can’t mean that?

                                      You have been Stockholm syndromed. Yes, in most languages, you just blithely do upgrades and only run into problems when there are major version bumps in the frameworks you are using and you want to move up.

                                  2. 5

                                    I’ve been in the Python ecosystem now for about six months after not touching it for 15+ years. Poetry is a great tool but has its warts. I worked around a bug today where two sources with similar URLs can cause one to get dropped when Poetry exports its data to a requirements.txt, something we need to do to install dependencies inside of the Docker container in which we’re running our app.

                                    As Poetry matures, it’ll be great.

                                    1. 2

                                      Poetry is already three years old.

                                      1. 2

                                        A young pup!

                                    2. 2

                                      I agree with your post, but this is a stretch:

                                      In Pythons case I think the packaging system is solid overall and does work

                                      You define your dependencies, but they will update their own dependencies down the tree, to versions incompatible with each other or with your code. This problem is left unsolved by the official packaging systems.

                                      But in all frankness… this habit of pulling in dependencies for something that would otherwise take 5 minutes to implement, and being OK with having dozens of moving targets as dependencies, is something that just can’t be solved. I’d rather avoid it: use fewer, well-defined dependencies.

                                      1. 7

                                        If you ship an app, this can be solved by “pip freeze” which saves the deep dependencies as well. If you ship a library, limit the dependency versions.

                                        This is not really a python problem - you have to do the same thing in all languages. In the similar group of languages Ruby and JS will suffer the same issue.
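                                        For the freeze route, the output is just name==version lines, which makes it easy to diff or audit; a hedged sketch (parse_pins is made up, and a real tool would also handle VCS URLs and environment markers):

```python
def parse_pins(freeze_output: str) -> dict[str, str]:
    """Parse `pip freeze`-style `name==version` lines into a
    {name: version} map, skipping comments and non-exact pins."""
    pins: dict[str, str] = {}
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins

frozen = "requests==2.26.0\n# a comment\n-e git+https://example.com/repo#egg=pkg\n"
print(parse_pins(frozen))  # {'requests': '2.26.0'}
```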

                                        1. 2

                                          Pip freeze helps, but you’re still screwed if you rely on some C wheel (you probably do) or Python breaks something between versions (3.7 making async a keyword in particular was brutal for the breakage it caused).

                                          1. 1

                                            I’m not sure what you mean - wheels are precompiled, so get frozen like everything else. What do you think breaks in that situation?

                                            1. 2

                                              If I knew, it wouldn’t be broken, would it? All I can tell you is that wheels routinely work on one machine but not another or stop working when something on a machine gets upgraded, whether deliberately or accidentally. Why? I have no idea.

                                    1. 8

                                      Looks cool, my quick feedback is that your 02-12-06 time format is confusing.

                                      1. 6

                                        Agreed, please use the ISO 8601 (YYYY-MM-DD) date format. 🤓

                                        1. 4

                                          Agreed, what is this horror

                                          Got 212,432 snapshots, from 96-10-23 18:55:02 to 21-05-30 14:37:26.

                                          1. 1

                                            Oh, that didn’t cross my mind. I’ll probably add back those two digits, then!

                                          1. 10

                                            I can’t believe a company like SalesForce doesn’t eye licenses like a hawk. That’s really weird, and poor form.

                                            1. 19

                                              It doesn’t surprise me at all. The only places I have worked that cared about the licences of dependencies were banks. Everywhere else, using a library has been entirely at the programmer’s discretion, and the programmer usually does not care.

                                              This is how OpenWRT was born.

                                              1. 3

                                                Maybe it’s a “software industry” thing? All three telecommunications businesses I’ve worked for have been very stringent about licensing and choosing the right licenses for our code and imported libraries etc.

                                                1. 7

                                                  I think that mindset of:

                                                  It is available on the internet, so it must be free to use

                                                  Is quite popular outside of the software industry as well. Unfortunately people are quite hard to educate about intellectual property.

                                                  1. 2

                                                    More of a company size thing. At HP unusual license approvals had to come from the legal dept. And that’s for a project which has MIT, Apache2 and a few others pre-approved. I’m sure there were other projects which needed confirmation of everything.

                                                  2. 1

                                                    Google cares very much about licenses.

                                                  3. 10

                                                    I once told a room of OSS policy wonks that my Big Tech Co had no one in charge of licensing or knowing what we use or checking for compliance. They were flabbergasted as though this were not the norm. I have worked at many sizes of company, was always the norm. You want a dependency you add it to Gemfile or whatever and push, the end.

                                                    1. 3

                                                      In my experience unless an engineer makes the company lawyers aware of the risk here they won’t even know to think about it. I make a point of raising it everywhere I work and institute some amount of review on my teams for licensing. But it’s not even on the radar of most company lawyers.

                                                      1. 1

                                                        I worked at a company that had a policy, but there was no formal enforcement mechanism. Devs were supposed to check, but this didn’t happen consistently. As a practical matter, though, it really wasn’t a problem. Just before I left the lawyers started asking questions and I actually built the initial version of an enforcement system. As it turned out, basically all of our dependencies were Apache, BSD, or MIT licensed (IIRC).

                                                      2. 2

                                                        Keep in mind licensing isn’t the only part though.

                                                        However, adding monkey patching to Go is not a reasonable goal while respecting anything close to “best practices.”

                                                        I think if you start out with something non-reasonable, such as working against the very programming language you use, why would you get into thinking about its license?

                                                        If a company like Linksys didn’t care about licensing why would a company like SalesForce?

                                                      1. 3

                                                        There is a hard core of emacs users who have a huge number of customisations and whose life revolves around emacs to a considerable extent. I don’t mind them, but I do think it’s unhealthy for the wider emacs community to think of them as the main users of emacs.

                                                        I am personally in some other niche. I can’t get into org-mode as I prefer putting notes on paper and don’t really see any need to keep notes beyond a week or so. I do use the calculator a bit but not for anything advanced, just because the RPN calculator is built in and good for simple sums. I certainly don’t use email or RSS from emacs and probably would never as networking has always been a weak point of emacs and I think of any non-local IO as a risk. I do have some customisations but over recent years I feel like I remove more than I add.

                                                        The parts of emacs I do use a lot:

                                                        • occur/clickable greps
                                                        • compile mode
                                                          • clickable tracebacks
                                                          • clickable warnings
                                                        • some Python jump-to-def thing
                                                        • ibuffer
                                                        • dired
                                                        • keyboard macros (and saving them)
                                                        • bookmarks
                                                        • hippie-expand, which is so dumb but feels very smart
                                                        • a save and session level backup thing
                                                        • magit
                                                        • whatever mode exists for a given language/filetype

                                                        I have used other editors over the years when required. I’ve used Intellij to write Java/Scala and used Visual Studio for about a year at work when a client required it. Visual Studio is a lot like emacs - I think a ripoff of it actually - except that instead of M-x you hit C-q, IIRC.

                                                        Consequently I do encourage people to start using emacs and I do help them get started. I don’t think it is so hard, but there are a lot of idiosyncratic keybindings which aren’t common outside of emacs (C-g to cancel is a classic emacsism). I tend to discourage new users from doing lots of configuration.

                                                        1. 1

                                                          Yeah, I’m a pretty serious emacs user but I don’t get into the emacsOS stuff. If I want an IRC client or a browser or a news reader, I’ll use a dedicated program.

                                                          But for writing text? Haven’t found anything better. I love that evil-mode is a thing that works as well as it does, I love being able to customize it and script it and M-:.

                                                        1. 2

                                                          It’ll be interesting to see just how big the VMS on X86 market actually is.

                                                          I rather wonder if it’ll mostly just be installations running atop aging DEC Alpha hardware that’ll move over.

                                                          1. 2

                                                            I wasn’t even aware OpenVMS is available for x86_64. Though some legacy critical software designed for VMS “just works” after several decades, and there is less incentive to port it to newer architectures (ref. DEC Alpha). If I recall correctly, Indian Railways had their reservation system software running on VAX/VMS, and I won’t be surprised if they still continue to do so.

                                                            1. 2

                                                               It has been in the works for a long time now; this is them finally getting ready to deploy it. It’s still not production quality yet apparently, and they have been at it… a decade at least… it seems that long anyway! :)

                                                              1. 3

                                                                I don’t understand how one pays for such a long term engineering project. I mean if this was ones hobby, sure, put a decade into it. But for a commercial product? How does that make any sense? How is there a market for that after such a long time?

                                                                1. 3

                                                                  They’ve continued to offer support on the older hardware during the transition. Remember VMS has been alive since the 1970’s and offers some pretty amazing features, like active-active failover and shared load. They were doing stuff like Erlang’s BEAM is capable of, but for entire Operating Systems, not just applications and have been delivering it in production for decades.

                                                                  So there is a very conservative user base that’s been using VMS for decades and decades. It’s very reliable tech at this point. This would be the 3rd or 4th architecture port for VMS.

                                                                  They have had demos of it running on X86 hardware for at least a year now.

                                                                  Our financial/back office system was on VMS when the Alpha hardware got killed and they announced the X86 port. We decided to just re-write the application using PyQT and Postgres. It’s been mostly painless for us; our uptime isn’t far below what VMS gave us, and it’s adequate for our needs. We are at 99.7% uptime (24/7, including maintenance) even though we only have an 8-5 uptime promise (which we mostly meet). With VMS, 99.9…% is totally achievable with much less effort than we’ve put into our systems.

                                                                  It’s a different type of computing, one the x86/Linux/Mac/Windows world is still mostly unable to match in capability. The IBM mainframe systems have maintained software compatibility since the late 1960s and are still running just fine today.

                                                                  1. 1

                                                                    They’ve continued to offer support on the older hardware during the transition. Remember VMS has been alive since the 1970’s and offers some pretty amazing features, like active-active failover and shared load. They were doing stuff like Erlang’s BEAM is capable of, but for entire Operating Systems, not just applications and have been delivering it in production for decades.

                                                                    I don’t doubt the quality of the system and its design, but there are so few users of it, one has to wonder how this all works out. No company will run something out of nostalgia. Having a single vendor for some niche is probably seen more as a risk than a benefit.

                                                                    So there is a very conservative user base that’s been using VMS for decades and decades. It’s very reliable tech at this point.

                                                                    People retire, move on, etc. The person who introduced VMS at some company in 1985 or even 1995 is very likely no longer there. There must be some extremely critical stuff making substantial amounts of money, otherwise people would have moved on. That has nothing to do with the quality of the system, and more with long-term thinking. Again, I am not arguing that any newer system is better, just saying what would make sense to me.

                                                                    Our financial/back office system was on VMS when the Alpha hardware got killed and they announced the X86 port. We decided to just re-write the application using PyQT and Postgres. It’s been mostly painless for us; our uptime is way less than VMS, but it’s adequate for our needs. We are at 99.7% uptime (24/7, including maintenance) even though we only have an 8-5 uptime promise (which we mostly meet). With VMS, 99.9…% is totally achievable with much less effort than we’ve put into our systems.

                                                                    This I find a little funny, in the positive sense. On the one hand we hear how super-crazy amazing VMS is, yet using PyQT and Postgres is just as good at delivering on the given SLA. It is probably also infinitely easier to find programmers for PyQT & Postgres than for VMS.

                                                                    1. 2

                                                                      I don’t doubt the quality of the system and its design, but there are so few users of it, one has to wonder how this all works out. No company will run something out of nostalgia. Having a single vendor for some niche is probably seen more as a risk than a benefit.

                                                                      Don’t mistake “I don’t hear much about VMS” for “There are few users of VMS”. It has a surprisingly robust community. Is it as big as Windows or Mac or Linux? Not by a long shot but there’s both interest AND money there for the folks who choose to occupy this niche.

                                                                      1. 1

                                                                        There must be some extremely critical stuff making substantial amounts of money, otherwise people would have moved on.

                                                                        Well, yeah?

                                                                        job^’s ERP system was written in COBOL against a System/360. There have been ~3 attempts to “modernize”, each of which failed at various steps – some after lots of investment. This is a tale as old as time.

                                                                        It’s remarkably difficult to move half a century of business rules and processes to a new environment. It’s even harder when that environment works well enough now and powers 9+ figures of revenue.

                                                                        job^’s ERP system now runs on a leased System Z and is fully managed – turns out there is a very healthy ecosystem of MSPs that will run your mainframe, modernize and build new logic into your legacy processes, etc.

                                                                        1. 1

                                                                        Sure, but that sounds like an IBM mainframe, not VMS. How does the success of the one speak for the other?

                                                                          1. 2

                                                                            You’re right to be confused; I managed to quite inarticulately lead with the example and not focus on the key point I was trying to make. Long day :)

                                                                            In my experience, these legacy platforms live not because they offer something uniquely compelling. Instead, they survive because they’ve accreted often implicit knowledge/rules/business processes over a very long time. And that makes moving away from them more dangerous than sticking with them, as long as you have people you can pay to maintain them and a source of replacement parts – or enough spares in a closet.

                                                                        2. 1

                                                                          This I find a little funny, in the positive sense. On the one hand we hear how super-crazy amazing VMS is, yet using PyQT and Postgres is just as good at delivering on the given SLA. It is probably also infinitely easier to find programmers for PyQT & Postgres than for VMS.

                                                                          Agreed, a little funny! But we didn’t need anything VMS was really offering; I mean, we took advantage of it since we had it, but our SLA is basically 8-5 business hours. Even Windows 95 can manage that most of the time :)

                                                                          As for finding programmers, agreed, ones with Python experience are easier to find, but in my estimation/experience, if you have a programmer who can’t learn a language and become productive quickly, they probably aren’t worth hiring, so language experience is mostly a pointless metric. That said, you generally want at least one person with a good level of experience in whatever language you use, so you can avoid the big language pitfalls newbies might run into not knowing any better, and get pointed to the saner libraries for solving problem Y with less research required.

                                                                          We mostly ported because users wanted a “GUI”. Not that a GUI necessarily helps for most accounting/back office work; users still mostly want row/column output, and reports yesterday that they dream up tomorrow. All easily done with a CLI/TUI.

                                                                          1. 1

                                                                            As for finding programmers, agreed, ones with Python experience are easier to find, but in my estimation/experience, if you have a programmer who can’t learn a language and become productive quickly, they probably aren’t worth hiring, so language experience is mostly a pointless metric. That said, you generally want at least one person with a good level of experience in whatever language you use, so you can avoid the big language pitfalls newbies might run into not knowing any better, and get pointed to the saner libraries for solving problem Y with less research required.

                                                                            Sure you can learn any language, but finding people who want to learn a completely different OS that is not much used anywhere else is probably not that easy. If you told me I had to leave all my Unix knowledge behind and work on some obscure system, I would not be interested. You will probably find people that want to do that, but the pool of people will def. be a lot smaller. I can learn Rust for a new job or Typescript or whatever, but if you make all my knowledge about the systems irrelevant, I will feel very uncomfortable and look elsewhere.

                                                                            1. 3

                                                                              We used Python on VMS also :)

                                                                              Mostly a side note, VMS isn’t THAT different from *nix on a day to day level. I came from UNIX, did VMS for a while and never had any big trouble, and then went right back to FreeBSD & Linux hardly skipping a beat.

                                                                              It helps that VMS documentation is generally fine, but it is a for-pay product that takes some pretty big $$‘s to buy, so it’s not really surprising that its documentation existed and was decent. Unfortunately, today’s world regularly punts on the documentation problem, and in some cases even punts on the customer support problem.

                                                                      2. 2

                                                                        I think in part it isn’t purely commercial, but also borne out of belief in VMS and a desire to preserve it.

                                                                  2. 2

                                                                    I assume the Alpha sites have mostly moved to Itanium.

                                                                    1. 1

                                                                      It’s been a few years since I last heard from someone with a large VMS installation, but less than a decade ago the folks I talked to were still running a mix of hardware, including VAX. Reliability is generally more important than performance to the folks still using VMS, and some of the old VAX hardware is incredibly reliable (and was incredibly expensive when new); they only replace it with Alpha/Itanium stuff when it actually fails. Compaq was still selling VAX hardware until around the early 2000s. I think the last VAX systems were in roughly the same performance ballpark as a Pentium or Pentium II, so slower than a software emulation of one on a cheap mobile phone today (though some had quite powerful vector units for the era).

                                                                      1. 1

                                                                        I’ve heard the opposite; there’s a lot of customers that stuck with Alpha because the Itanium boxes were too much for them.

                                                                        From what I’ve heard, the people on real steel tend to be 1. midwestern auto parts shops that adopted VMS back when it was accessible to smaller businesses, and got stuck on it since 2. Fortune 500s. The old technical userbase mostly migrated.

                                                                      2. 1

                                                                        From what I’ve heard, most people who still use VMS applications run them under emulation:

                                                                        https://www.stromasys.com/solutions/charon-axp/

                                                                      1. 5

                                                                        This is a great overview and rationale for systemd’s logging and binary format.

                                                                        Inspired by git, in the journal all entries are cryptographically hashed along with the hash of the previous entry in the file. This results in a chain of entries, where each entry authenticates all previous ones. If the top-most hash is regularly saved to a secure write-once location, the full chain is authenticated by it. Manipulations by the attacker can hence easily be detected.

                                                                        I had no idea that this was the case.
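The scheme described in the quote is essentially a hash chain. Here is a minimal sketch of the idea in Python; this is only an illustration of the concept, not systemd’s actual on-disk format, and the entry fields are made up:

```python
import hashlib

def append_entry(log, message):
    """Append an entry whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    entry_hash = hashlib.sha256((prev_hash + message).encode()).hexdigest()
    log.append({"message": message, "hash": entry_hash})

def verify(log):
    """Recompute the chain; a tampered entry fails its own hash check."""
    prev_hash = "0" * 64
    for entry in log:
        expected = hashlib.sha256((prev_hash + entry["message"]).encode()).hexdigest()
        if expected != entry["hash"]:
            return False
        prev_hash = expected
    return True

log = []
for msg in ["service started", "user login", "service stopped"]:
    append_entry(log, msg)

assert verify(log)           # untouched chain validates
log[1]["message"] = "haxed"  # tamper with a middle entry...
assert not verify(log)       # ...and verification fails
```

Because each hash covers the previous one, altering or deleting a middle entry invalidates the chain from that point on, which is why saving just the top-most hash to a write-once location authenticates everything before it.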

                                                                        1. 3

                                                                          I am not too clear on it, but I think Poettering calls this “log sealing” in some of the early written material on systemd. I haven’t used this feature in systemd, but I have used it in other (proprietary) logging systems elsewhere, and I can fully imagine that it was something Red Hat’s clients commonly asked for.

                                                                          For me, the parts I like about the systemd journal are the indexed search/filtering, the automatic rolling and max sizing (logs can’t fill your disk), and the interleaving of logs from different services. It’s like a mini-ELK (and probably, with sufficient config, could replace a lot of ELK).
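As a taste of that filtering and size capping, here are some typical invocations; flag names are from journalctl(1), and the unit name is just an example:

```shell
# Messages from one unit within a time window (uses the journal's indexes)
journalctl -u nginx.service --since "1 hour ago"

# Only error-or-worse messages from the current boot
journalctl -p err -b

# Enforce a disk-usage cap retroactively (rotation normally handles this)
journalctl --vacuum-size=500M

# Structured output for piping into other tooling
journalctl -o json-pretty -u nginx.service

# Sealing-related: generate Forward Secure Sealing keys, verify files later
journalctl --setup-keys
journalctl --verify
```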

                                                                          Dare I say it given how unpopular systemd is everywhere, but I love systemd. Clearly someone thoughtful looked through all of the disparate, badly interacting parts of unix and just sorted heaps and heaps of it. Unit files, as a sysadmin, are way, way better than the old shell based init scripts, for example.

                                                                          1. 3

                                                                            This looks like the same model used by auditdistd on FreeBSD. It’s really useful for audit logs to know that an attacker who compromises the machine can’t tamper with audit logs undetectably (they can delete them, they can’t delete their activity in the middle, and with the distribution if they erase some entries at the end then the chain on the remote end will mismatch and point you directly to the entries that they deleted).

                                                                          1. 3

                                                                            I am working on a side-project to help “clinicians” (i.e. doctors and nurses in local surgeries) diagnose patients according to the various healthcare guidelines. It’s very much a proof-of-concept at the moment, but our first PoC got a positive response, so we’re going to try to incorporate the feedback and improve it a bit. It’s a refreshing change to work on such a tiny project after working on lots of large sprawling codebases recently.

                                                                            Apart from that I’m hoping to write a first draft of my next blog post, which is to do with outages and hopefully can be a bit funnier than my last blog post ended up being.

                                                                            1. 3

                                                                              I am perhaps the very last person to adopt these newfangled Rust-based versions of basic shell utilities… however, I recently was forced to switch from GNU rgrep to rg by a large and unruly codebase I now work on. Wish I’d switched ages ago. rg is a lot quicker (largely, I think, because it carefully avoids digging through irrelevant stuff like node modules).

                                                                              What other ones should I look at?

                                                                              However… I only really call this stuff from emacs, so fzf is probably not much use to me, as the emacs file finder is fine as it is. For context, I hate change, and it would require a minor miracle for me to use a piece of software not packaged in Debian.

                                                                              1. 4

                                                                                What other ones should I look at?

                                                                                bat is quite great, though its fanciness can get in the way (I mean specifically the line wrapping being done both by bat and by the terminal if you resize it).

                                                                                fd is great for simple file search (think: find(1)) though I’m not perfectly happy with its featureset.

                                                                                exa is a fancy ls(1) replacement. No comment here; it does exactly what it says on the tin.

                                                                                However…I only really call this stuff from emacs, so fzf is probably not much use to be as the emacs file finder is fine as it is.

                                                                                fzf is what you’d call a completing-read or “completion UI” in the Emacs world. fzf only filters what it’s given, defaulting to calling find(1) internally I believe. Inside Emacs you have the likes of Selectrum for that already.
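For a quick taste of the tools above, typical invocations look something like this (sketches against their current CLIs; the file names and patterns are made up):

```shell
# rg: recursive search, restricted to Python files
rg -t py "def main"

# fd: find files named like "test" with a .py extension
fd -e py test

# bat: cat with syntax highlighting; --style=plain drops the decorations
bat --style=plain main.py

# exa: long listing with a git status column
exa -la --git
```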

                                                                                1. 1

                                                                                  exa looks wonderful. Reminds me of tweaks I did on Emacs dired to achieve similar styling https://xenodium.com/showhide-emacs-dired-details-in-style

                                                                                2. 2

                                                                                  fzf is packaged in Debian as of Buster (10), and has a backport for Stretch (9).

                                                                                  1. 1

                                                                                    rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc. https://github.com/phiresky/ripgrep-all

                                                                                  1. 44

                                                                                    There is very little in this I can agree with, except the last part.


                                                                                    Re: “Leadership under attack”:

                                                                                    Neither RMS nor ESR has ever had any significant involvement in Linux, so why bring them up in the context of the Linux kernel? Seems odd.

                                                                                    Anyway:

                                                                                    • RMS has been highly controversial for a long time, as he mentions himself. He probably turned off more people from Free Software than attracted. He was always a highly problematic figure.

                                                                                    • Eric S. Raymond hasn’t been meaningfully involved in the OSI for a very long time, and when he came back it was for little more than a weirdly misplaced rant about “Vulgar Marxists”. Add to this the context that ESR has significantly crazified in the last 10-15 years and is now advocating for literal terrorism … yeah (site seems offline, archive, and as with most of his posts, the craziest stuff from ESR is in the comments he posts). The reason few people have noticed is probably that ESR was long ago relegated to the crazy internet kooks corner by most, and few people have been paying attention to him.

                                                                                      Besides, the OSI seems to do little more than write blogspam and discuss some licensing issues on a mailing list. It’s certainly not significantly involved in the development of Linux as far as I can tell.

                                                                                    • I’m not aware of any serious efforts to “oust” Linus. The only sources I can find are from techrights.org, and that’s basically the InfoWars of OSS. By and large, people are happy with him as he’s been doing a pretty good job for the last 30 years.

                                                                                    Lunduke has been harping on about these things for ages, and always just pretends that the context of all of this doesn’t matter and that just “ousted == bad == LINUX WILL DIE!!!!” I find it a complete non sequitur.


                                                                                    Re: “Linux companies”

                                                                                    • Who cares that IBM “killed off CentOS” (there’s also a bit more nuance to that IMO, but let’s leave that to the side)? There were immediately a bunch of replacements available. Doesn’t sound like “dying” to me.

                                                                                    • SUSE’s realignment to more “cloud stuff” seems to fit a general trend. Microsoft is doing the same, for example: where Windows was once the core product, now e.g. Azure is increasingly seen as its “core product”. For better or worse, the OS in itself has become less important and more “abstracted”.

                                                                                    • Linux Journal is back, which he conveniently leaves out. LWN is still alive and strong. A single publication running into a spot of trouble strikes me as very little evidence of anything. I don’t see the “onslaught” he’s talking about.


                                                                                    Re: “Linux complexity”

                                                                                    Well, the world of computing is more complex than it was in 1992 🤷 It doesn’t seem to me that Linux has it worse than any other mainstream general purpose OS.

                                                                                    I think his point about maintenance and security are overly simplistic as well; Linux isn’t a monolithic entity where you run every line of code that gets committed to the kernel; it’s probably more useful to see Linux as a sort of “monorepo”.


                                                                                    Re: “Linux events”

                                                                                    He is complaining that in-person events were cancelled throughout 2020 and 2021 and that this means “our community is dying” and “in hospice with every known disease on the planet”.

                                                                                    I kept waiting for him to mention the pandemic.

                                                                                    He doesn’t. He simply asserts that there are fewer events and that they’re not coming back.

                                                                                    lol?


                                                                                    Re: “Fuchsia”

                                                                                    Yeah, this might replace Linux. We’ll see.

                                                                                    This is pretty much the only argument that makes any sense: something better will come along and it will displace Linux. It may not even be Fuchsia but something else. I have some ideas in which way things will probably move, but they’re probably wrong. We’ll see what happens.

                                                                                    Will Linux still be around in 25 years? I don’t think it’s as clear-cut as “operating systems in the past have come and gone”. Overall, the world of computing is a lot less in its infancy than it was in the 70s and 80s, so it makes sense that systems last longer. There is also the issue that there is a lot more software, so compatibility and inertia are more important than ever. You see this with programming languages as well, which seem to have a much longer average longevity than they had in the past.

                                                                                    It may not even be a bad thing if something were to come along and incorporate all the lessons from the last 25 years. Apple didn’t do too badly with OS X, right? But MacOS Classic was kind of horrible, and Linux seems “good enough” for a lot of things. It’s not uncommon for “good enough” to block “better”, but again, we’ll have to see what happens.

                                                                                    1. 12

                                                                                      Regarding the first point, I definitely remember people trying to oust Linus because his behaviour was too brash.

                                                                                      He went on a small break and came back largely because of those attacks.

                                                                                      But he’s back now so it largely doesn’t matter, and I think people have stopped attacking him though I’m not certain of that.

                                                                                      1. 16

                                                                                        “Oust” is a strong word. I don’t believe I saw anyone credibly suggest that Linus step down from leading kernel development altogether. At most I saw claims that Linus’ interactions in mailing lists etc. was unprofessional, reflected poorly on the Linux project in particular, and could potentially exclude people from wishing to contribute.

                                                                                        As you said, Linus took these viewpoints to heart and the complaints have died down.

                                                                                        1. 3

                                                                                          At most I saw claims that Linus’ interactions in mailing lists etc. was unprofessional, reflected poorly on the Linux project in particular, and could potentially exclude people from wishing to contribute.

                                                                                          Those tend to be the opening moves of the ousting playbook.

                                                                                          1. 2

                                                                                            I try to give people the benefit of the doubt. If they state they’re trying to change Linus’ behavior out of concern for him personally, his legacy, and the health of the kernel development process, I’d accept that, absent any proof of nefarious intent.

                                                                                        2. 7

                                                                                          Nobody was trying to oust Linus, just trying to get him to understand that “management by perkele” wasn’t working anymore. Take a read over what he himself posted on LKML on the issue: https://lkml.org/lkml/2018/9/16/167

                                                                                          1. 6

                                                                                            FWIW, I think that break did wonders for Linus and was a moment of significant personal growth. The stuff he writes nowadays still has an edge to it, but in a much more mature way. I was recently reading some old Linus rants on the LKML, and I actually cringed each time he wrote that someone “should be retroactively aborted”. I think it’s helpful to think about what happened back then as more of an intervention than an ousting.

                                                                                            1. 6

                                                                                              It should also be added that Linus has said things to the effect of “Yeah, I don’t really like this angry temperamental side of my personality either; I wish it was different and I tried to change it and failed, guess it’s just how I am” years earlier already. This wasn’t some sort of magic epiphany moment but just one (large) step in a long process, and neither was it forced upon him by SJW beta cuck feminazi dangerhair Marxists trying to “cancel” him, or some such.

                                                                                              And while I don’t want to excuse any of his more, ehm, angry behaviour, I also feel that he’s been portrayed a bit unfairly. Some people seem to have the impression that he is (or was) some sort of angry madman ranting and raving at everyone, because every ridiculous outburst got media attention with a picture of him giving nvidia the finger, because 🍿 Again, not excusing this, but it is a very one-sided and incomplete picture.

                                                                                              1. 1

                                                                                                Change is often slow. If over a period of 10 years you manage to reduce non-constructive inflammatory phrasings from being present in 10% of your communications to only 0.1% of communications (while probably also reducing edge cases and improving the general quality), then the outside world will still only get transgressions pointed out to them and observe no change. Even those closer to the fire may draw the same conclusion through confirmation bias.

                                                                                          2. 9

                                                                                            You’ve spent a lot of time and effort responding to something that I think (despite the speaker’s claims otherwise) is basically a clickbait troll. I think that’s laudable, but I definitely wouldn’t be spending my own time this way! It’s a deliberately inflammatory headline and the presentation as a whole is mostly performative incredulity.

                                                                                            One thing I would disagree with you about:

                                                                                            [RMS] probably turned off more people from Free Software than attracted

                                                                                            I don’t dispute that RMS has variously been unhelpful and has certainly alienated people but I think that still his net attraction to free software has been huge. For me, I first became interested in becoming a computer programmer at all because of his political essays. I don’t think that people who were convinced by his political ideas would change their mind about those ideas even if they later came to dislike him.

                                                                                            1. 2

                                                                                              You’ve spent a lot of time and effort responding to something that I think (despite the speaker’s claims otherwise) is basically a clickbait troll.

                                                                                              Perhaps; but I’ve seen Lunduke’s stuff around often enough to warrant writing something down, and it wasn’t that time-consuming :-)

                                                                                              Re: RMS. I don’t want to handwave away your comments, but I’m also a little bit weary of talking about him, so I’ll defer to my post from a few months ago, adding to it that being “convinced by his political ideas” is not an on/off switch, and that he mostly turned off people who were broadly sympathetic, but not on all details or who wanted some more nuance, and who (strongly) dislike his hard-line no-compromise stance. This was certainly the case for me, and for quite a few people I know.

                                                                                              1. 1

                                                                                                Thanks for linking me to that. Very comprehensive and I think you have me convinced. Having X11 under the GPL would have been an enormous thing for FOSS and I suppose it hadn’t occurred to me that FOSS might have been bigger had Stallman been more charismatic.

                                                                                              2. 1

                                                                                                I try to separate RMS the person (whom I have never met) with the ideals he (and the FSF) espouses.

                                                                                                Many people who are attracted to those ideals can be frustrated that RMS’ personality and communication choices can hinder the wider dissemination of them.

                                                                                                1. 3

                                                                                                  I don’t even find it particularly hard (and this is not to say that I have taken a dislike to RMS). There is no shortage of people who I’ve gotten ideas from who I don’t much like.

                                                                                              3. 5

                                                                                                There’s also a few “inaccuracies” in the complexity part. Completely omitting the fact that most of the new code comes with drivers. Not sure if the complaint there is “complex hardware is a problem for Linux survival” or … ?

                                                                                                There’s also the “million lines just in systemd bootloader”. I did not run cloc, but if there’s a million lines just in https://github.com/systemd/systemd/tree/main/src/boot I’ll eat my hat. (Edit: 8.5k lines including comments/whitespace)

                                                                                                1. 2

                                                                                                  There’s also a few “inaccuracies” in the complexity part. Completely omitting the fact that most of the new code comes with drivers.

                                                                                                  Yeah, that’s my feeling as well, but I didn’t feel like doing an examination of the Linux source and what exactly is in those “2 million lines code”; my comment was already long enough 😅 Would be interesting though! Maybe I’ll do it later.

                                                                                                  As for systemd bootloader, I think he may have been referring to the entire EFI process, but I’d have to go back to see what he said exactly.

                                                                                                  1. 1

                                                                                                    I did a cloc on Linux. It’s big, with 20,670,238 LoC in total[1].

                                                                                                    14,157,243 of those 20MLoC are drivers. Another 993,777 are filesystems. 1,729,484 architecture support code.

                                                                                                    Ext2 + Ext4 + common filesystem code is 109,636 LoC. The ARM64 support code is just 9,040 LoC. The size of a reasonable Linux system (ARM64, Ext4) would be 3,908,410 LoC[2].

                                                                                                    Around 4MLoC seems extremely reasonable for the core of a kernel like Linux, in my opinion. And of course there are gonna be drivers and filesystems and additional platforms supported, and each of those things will add a whole bunch of code, but all of that code will be neatly sectioned off in its own area and just isn’t the kind of code which creates a lot of maintainability issues long term.

                                                                                                    I didn’t bother to get numbers for exactly where code has gone in recent years, but given that the entire core of a reasonable ARM64 kernel is just 4MLoC, those 2 million LoC per year are definitely going into either drivers, additional CPU architectures, or additional filesystems.

                                                                                                    [1]: I’ve counted lines labelled as C/C++/headers/Assembly. There is some more code in perl scripts, makefiles, shell scripts, RST files for documentation, etc. But I think it’s fair to say that C, headers and assembly are the meat of the code that actually gets compiled into a Linux kernel. If anything, I over-counted by including a few hundred kLoC of tools and sample code and such.
                                                                                                    [2]: I arrived at the 4MLoC number by doing (Linux LoC - drivers LoC - architecture LoC - filesystems LoC + ARM64 architecture LoC + Ext2 LoC + Ext4 LoC + filesystem common code LoC).
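
                                                                                                    The footnote [2] arithmetic can be sanity-checked using only the cloc figures quoted above (numbers taken from the comment, not re-measured):

```python
# Re-running footnote [2]'s arithmetic with the cloc totals quoted above.
total = 20_670_238        # whole Linux tree
drivers = 14_157_243
filesystems = 993_777
arch = 1_729_484          # all architecture support code
arm64 = 9_040             # ARM64 support only
ext_fs = 109_636          # Ext2 + Ext4 + common filesystem code

core = total - drivers - arch - filesystems + arm64 + ext_fs
print(core)  # 3908410
```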

                                                                                                  2. 2

                                                                                                    Completely omitting the fact that most of the new code comes with drivers.

                                                                                                    Yeah, obviously Intel is going to own the SGX code, and it will have little impact on linux development generally.

                                                                                                  3. 2

                                                                                                    He simply asserts that there are fewer events and that they’re not coming back.

                                                                                                    This is the only point I tentatively agree with—factually at least; I don’t necessarily agree that it means the community is dying. My feeling is that the number of in-person events has been steadily decreasing in the last decade. I think one of the biggest blows was the 2008 recession though. It may be just my feeling and/or a regional thing, because I don’t have good data.

                                                                                                    Are there any good catalogs of FOSS events?

                                                                                                    1. 2

                                                                                                      Probably; but this seems to be across the board: couchsurfing and meetup.com are also not what they used to be for example. And then there were the forum meetups that I attended, a concept which seems pretty much dead too (just as forums are).

                                                                                                  1. 12

                                                                                                    I suppose I have two home truths about analytics. Firstly that the vast majority of analytics data is not looked at. In my professional experience many just attach it and never look at it, though the data continues to flow and (importantly for the numerous SaaS analytics companies) they continue to pay for it.

                                                                                                    The other comment is that of those who do look at it, the majority don’t use it very well. There is a weird sense that people have that analytics helps you to micro-optimise tiny stuff on your site (the “128 shades of the colour blue” story from Google). Perhaps this is true if you are in the top 5 companies worldwide by market capitalisation.

                                                                                                    For most normal companies though, your analytics usually tell you that you are doing something basic wrong. For example you have an 8 step signup form with 90% drop-off from people who originally did want to do business with you. Instead of discussing and dealing with problems such as that you get a lot of talk about multi-armed bandits and weekly updates about noise metrics. This is perhaps similar to how many companies with fewer than 20 devs use k8s - people prefer to act as though they are part of an enormous company.

                                                                                                    Javascript analytics obviously gives up some privacy. I can’t say that I am really ok with it. What I think privacy campaigners either don’t know or underplay is how little is gotten out of that surrendered privacy.

                                                                                                    1. 1

                                                                                                      You might be right and the people for whom I work will never look at the analytics data.

                                                                                                      However, until this day, they do. Not only for crashes and that kind of stuff, so JIRA has tickets for us developers and our salary is justified, but also to see whether users are using the application properly.

                                                                                                      I can’t really say much about my work, but that data is used. Maybe not now, maybe it needs a bit of time to be effective, and it’s still too early to say for sure, but it is used.

                                                                                                      Analytics can be done in two ways: in JavaScript (Firebase, Matomo), in the backend, or both. On Android you embed that data with a manager or something and call analytics.trackScreen(...) wherever you need it. If the developers don’t let you opt out, you’re being tracked each time you use the app. In the browser you can block that with uBlock Origin or something like that and you’re almost not tracked…

                                                                                                    1. 1

                                                                                                      This is bemusing me somewhat. I’m planning an in-place OS cloud VPS upgrade, don’t need zero-downtime, so I’m thinking I’ll stop it, take a snapshot, run the upgrade, and if it doesn’t work out how I plan, I can restore the snapshot. Seems a lot simpler, am I missing something obvious?

                                                                                                      1. 3

                                                                                                        Sounds like a good plan to me

                                                                                                        1. 1

                                                                                                          Sounds a lot like my plan! I just made a whole bunch of mistakes.

                                                                                                          1. 3

                                                                                                            Oh, you should hear some of the mistakes I’m too embarrassed to tell you about …

                                                                                                            1. 1

                                                                                                              Ah, but no, you wiped your VPS. You had no way to restore it with a single click like you would with a snapshot.

                                                                                                              A suggestion - you have all these fine things running as docker containers, why not try just bringing them up on your laptop? You don’t even need the docker daemon, just something like podman. Do that once a month or something to see if your backups are still good?

                                                                                                          1. 2

                                                                                                            I’m sorry you had a hard time with this. Did you not keep the old machine around while you were setting up the new one? Whenever I upgrade the OS on my VPS I do that. I also tend to tarball the whole disk rather than try to select files, and then I keep the tarball around for a few months afterwards in case I had forgotten something.

                                                                                                            1. 3

                                                                                                              I was thinking the same, but it seems that OP is trying to change the OS in place. I imagine it is a vps with longer term commitment, not a hetzner cloud instance.

                                                                                                              1. 2

                                                                                                                I think it’s a Hetzner dedicated server, with monthly rates and a setup fee.

                                                                                                                The following reads sorta like nitpicks, I know. I’m thinking aloud.

                                                                                                                I would have considered rsyncing tarballs elsewhere; tarsnap can be slow when you have latency to an East Coast AWS region from, say, Germany. GPG-encrypted tarballs to S3 or to Hetzner storage sound ok.

                                                                                                                It’s nice that it’s all in Docker. My last Alpine experience was unpleasant; any space savings were countered by lack of operational context and image caching anyway.

                                                                                                                If I were worried about downtime I’d maybe do a prototype / trial migration to a VPS with hourly or daily rates.

                                                                                                                A benefit of sticking to tarsnap is that it would show any gaps in backup coverage, such as the .env file here. Unfortunately, doing this in place only surfaces those gaps when it’s already too late.

                                                                                                                The self-hosted Bitwarden blip feels … very close to a very serious outage, too.

                                                                                                                Thanks for the honest recap!

                                                                                                                1. 1

                                                                                                                  I ran them all with docker-compose from a Debian VPS.

                                                                                                                  Doesn’t sound like it was a dedicated server.

                                                                                                            1. 2

                                                                                                              It’s worth looking at using a forward proxy (eg trafficserver, which is widely packaged in distros) to limit external connections and do other things. As this is something they optimise specifically they are usually faster and lower overhead at doing it and they can even do SSL MITM with some configuration.

                                                                                                              It’s also worth looking at uvloop - I would use that in preference to the built in loop under all circumstances.
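
                                                                                                              Swapping in uvloop is close to a one-liner; a minimal sketch (falling back to the stdlib loop if uvloop isn’t installed, and with a placeholder coroutine standing in for real work):

```python
import asyncio

# uvloop is a drop-in replacement for asyncio's default event loop;
# fall back to the stdlib loop if it isn't installed.
try:
    import uvloop
    uvloop.install()
except ImportError:
    pass

async def fetch_all(urls):
    # placeholder standing in for real network calls
    await asyncio.sleep(0)
    return [f"fetched {u}" for u in urls]

results = asyncio.run(fetch_all(["https://example.com"]))
print(results[0])
```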

                                                                                                              1. 4

                                                                                                Good choice of name and domain name for the website. They are used like the plague, and the negative connotation is most welcome these days, when people do things the way they are supposed to rather than for a specific reason.

                                                                                                                Why do people use floats? Honest question. I don’t know any reason for using floats in any situation.

                                                                                                                1. 18

                                                                                                                  Numeric processing with high dynamic range is simpler with floating-point numbers than fixed-point numbers. In particular, they have the ability to temporarily exceed range limitations with a fair amount of headroom and only a modest loss of precision.
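
                                                                                                                  A toy illustration of that headroom (the numbers are arbitrary): a float64 intermediate can overshoot the working range and come back with only rounding-level error, where a fixed-point representation of the same word size would have overflowed immediately.

```python
x = 1.5e30
y = (x * 1e8) / 1e8   # intermediate value 1.5e38 is no problem for float64

# only a modest rounding error, if any, after the round trip
assert abs(y - x) / x < 1e-15
```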

                                                                                                                  1. 2

                                                                                                                    I agree this is the kind of thing they are appropriate for. A rather specific use case.

                                                                                                                    1. 16

                                                                                                      I’m not sure that “any science, physics or simulation anywhere, ever” is a very specific use case. Just not one that overlaps much with current hip new computing tech.

                                                                                                                      1. 14

                                                                                                                        High dynamic range = most graphics, so it’s not actually very specific

                                                                                                                    2. 10

                                                                                                                      They require less memory and are adequate for some kinds of programming where higher precision isn’t necessary. For instance, https://gioui.org uses them for pixel offsets because fractional pixels don’t matter beyond a point. 0.00000000001 pixels isn’t usually worth worrying about in an application’s layout.

                                                                                                                      I also think that there are some processors on which float32 operations are faster than float64, but I don’t think that’s true of conventional x86_64 processors.

                                                                                                                      1. 3

                                                                                                                        I also think that there are some processors on which float32 operations are faster than float64, but I don’t think that’s true of conventional x86_64 processors.

                                                                                                                        It’s true that there are lots of cases where you won’t see a difference at all because you’re limited by something else (e.g. the cost and latency of arithmetic can be hidden by memory latency sometimes), but I would not state this with confidence.

                                                                                                                        When you’re cache or memory bandwidth limited, you can fit twice as many float32 numbers into each cache line.

                                                                                                                        Vector operations on float32s typically have twice the throughput. All the vector operations in SSE and SSE2 for example come in versions that work on float32 or float64 numbers packed into 128 bit registers. The 32 bit versions operate on twice as many numbers with the same or better latency and clocks-per-instruction (according to Intel’s documentation, at least).

                                                                                                                        A few operations (such as division) have slightly worse latency noted in Intel’s docs for float64 versions.
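
                                                                                                                        The cache/bandwidth point is easy to see with NumPy (assuming NumPy is available; this is a sketch of the size difference, not a benchmark):

```python
import numpy as np

a64 = np.ones(1_000, dtype=np.float64)
a32 = a64.astype(np.float32)

# same element count, half the bytes: twice as many values per cache line
assert a32.nbytes * 2 == a64.nbytes
```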

                                                                                                                        1. 2

                                                                                                          In order to have an insignificant error like the example you give, you are using up more memory, not less.

                                                                                                          Having deltas orders of magnitude smaller than the precision you need is an argument against floats, not for them. There is nothing positive in brute-forcing the maximal error down by throwing useless bytes at it.

                                                                                                          They do have high precision around the range people use them in. What they don’t have, and I suppose this is what people mean by precision, is exactness: in most programming languages they are constructed from decimal notation, yet most common round decimal numbers are not representable in such types. And that is why I don’t understand why they are so ubiquitous.
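
                                                                                                          For what it’s worth, the exactness point is easy to demonstrate:

```python
from decimal import Decimal

# 0.1, 0.2 and 0.3 have no exact binary floating-point representation...
assert 0.1 + 0.2 != 0.3
# ...while a decimal type keeps round decimal numbers exact
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```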

                                                                                                                          1. 8

                                                                                                                            I don’t think most floats are created to represent decimal numbers. Some are, like when representing currency or in a calculator, but most floats are representing light or sound levels, other sensor readings, internal weights in neural networks, etc.

                                                                                                                            I’m guessing you may work in a domain like finance where decimal numbers seem ubiquitous, but you’re not considering the wider use cases.

                                                                                                                            1. 3

                                                                                                              Yes, I do work in domains where decimal numbers are ubiquitous, and floats are the plague there. I see them even used for representing natural numbers “in case we want to use a smaller unit”, and other such nonsense.

                                                                                                              Even when used to store sensor readings (like image or sound), the only valid reason to use them is if dividing your scale exponentially serves you better than linearly, which I would argue is the case perhaps half the time or less.

                                                                                                                        2. 9

                                                                                                                          In machine learning, it’s common to optimize your parameters for space, since in those cases you typically don’t care about the precision loss compared to doubles and it lets you halve your parameter size, but you don’t want to use fixed point because your parameter range can be large. There are some approaches that involve 8-bit or 16-bit fixed point, but it’s not a universal thing at all.

                                                                                                                          In general, though, a lot of times they’re just Good Enough, and they save you from having to think about scaling constants or writing your own multiplication algorithms due to hardware support.

                                                                                                                          1. 7

                                                                                                                            Are you talking about the C float type, i.e. 32-bit IEEE floating-point, or all floating point types? If the latter, what commonly available data type should people use instead? Last I checked, few languages offer fixed-point types.

                                                                                                            32-bit float is often used internally in audio code (for example Apple’s CoreAudio) because it has as much precision as a 24-bit integer but (a) gives you a lot more dynamic range at low volume, and (b) doesn’t turn into garbage if a calculation overflows. (I don’t know if you’ve ever heard garbage played as PCM audio, but it’s the kind of harsh noise that can literally damage speakers or people’s hearing, or at least really startle the shit out of someone.)

                                                                                                            A general reason for using floats is because a general purpose system — like the JavaScript language, or the SQLite database — doesn’t know the details of every possible use case, so providing FP math means it’s good enough for most use cases, and people with specialized needs can layer their own custom types, like BCD or fixed-point, on top of strings or integers.
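
                                                                                                            The “as much precision as a 24-bit integer” claim can be checked directly: float32 has a 24-bit significand, so every 24-bit PCM sample value round-trips exactly, and 2**24 + 1 is the first integer that doesn’t.

```python
import struct

def roundtrip32(n):
    # pack as IEEE 754 binary32 and unpack again
    return struct.unpack("f", struct.pack("f", float(n)))[0]

assert roundtrip32(2**24) == 2**24          # exact: fits in the 24-bit significand
assert roundtrip32(2**24 + 1) != 2**24 + 1  # first integer that doesn't fit
```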

                                                                                                                            1. 5

                                                                                                                              JavaScript is a typical case where floating point is a bad default. Typical use cases for numerics are user-facing values such as prices, not 3D graphics.

                                                                                                                              1. 2

                                                                                                                                I haven’t heard anyone say what should be used instead. Are you saying JavaScript should have provided a BCD DecimalNumber type instead of floating point? How would people doing any sort of numerics in JS have felt about this? Doing trigonometry or logarithms in BCD must be fun.

                                                                                                                            2. 5

                                                                                                                              I’ve gone through a personal rollercoaster in my relationship with IEEE floating-point, and my current sense is that:

                                                                                                                              a) I’d love to have computers support a better representation like Unums or Posits or something else.

                                                                                                                              b) What we have available in mainstream hardware is fairly decent and certainly worth using while it’s the only option. Overflow and underflow in floating-point isn’t that different from overflow in integers, and a whole lot less likely to be encountered by most of us given the far larger domain of floating-point numbers.

                                                                                                                              c) The real problem lies in high-level languages that hide processor flags from programmers. C and C++ have some bolted-on support with intrinsics that nobody remembers to use. Rust, for all its advances in other areas, sadly hasn’t improved things here. Debug mode is a ghastly atavism, and having release builds silently wrap is a step back from gcc’s (also sub-optimal) -ftrapv and -fwrapv flags.

                                                                                                                              1. 8

                                                                                                                                Haha as the implementor of unums and posits, I’d say unums are too much of a pain in the ass. Posits might have better performance, though if you need error analysis, it might be strictly worse. Posits had a fighting chance with the ML stuff going on but I think that ship has sailed.

                                                                                                                As for ignored processor flags: I think Zig is making an effort to make those sorts of intrinsics easily accessible as special-case functions in the language, and hopefully they take on a strategy of making polyfilling easy for platforms that have partial support.

                                                                                                                                1. 3

                                                                                                                  I use floats for GPU based computer graphics. I’ve read “Beating Floating Point at its Own Game: Posit Arithmetic”, and posits sound amazing: better numerical properties and more throughput for a given amount of silicon. But I’ve not used them, and I will never use them unless they are adopted by GPU manufacturers. Which I guess won’t happen unless some new company disrupts the existing GPU ecosystem with superior new GPU tech based on posits. Something like Apple with the M1, but more analogous to SpaceX with the Falcon and Starship. I don’t see any reason for the large entrenched incumbents to gamble on new float technology that is incompatible with existing graphics standards.

                                                                                                                                  1. 4

                                                                                                                                    Yeap. Sorry it didn’t work out. We tried though (I even have some verilog models for posit circuits).

                                                                                                                                2. 3

                                                                                                                                  Swift’s default integer arithmetic operators panic on overflow. (There are alternate ones that ignore overflow, for performance hot spots.)

                                                                                                                                  1. 1

                                                                                                                                    Or when you actually need that behaviour, such as in hashing functions. But you don’t want your customer ids to actually wrap around silently.

                                                                                                                                3. 3

                                                                                                                                  Why do people use floats? Honest question. I don’t know any reason for using floats in any situation.

                                                                                                                                  They’re used to represent real numbers. It’s easy and convenient to have types like float that natively represent real numbers. It’s also nice to have statically allocated, roughly word-sized representation (as opposed to arbitrary precision).

                                                                                                                                  1. 2

                                                                                                                                    Why? What makes them more suited than integers for representing real numbers?

                                                                                                                                    1. 1

                                                                                                                                      Fractions, sqrt, etc. Fixed-point arithmetic drops a huge range of precision at either the high or the low end, and is also slower for many operations.

                                                                                                                                      1. 1

                                                                                                                                        I don’t understand what you mean. Integers have uniform precision throughout the scale. Choose the base unit as you see fit for the precision you want, and that is what you get.

                                                                                                                                        It always “drops the same range of precision”. If you need the precision of a float around zero, then set your base unit to that and there you have it: that’s your maximum error. Unlike with floats.

                                                                                                                                        When are integers slower, and why? You always have to at least perform the same operation on the mantissa of your floats?

                                                                                                                                        1. 5

                                                                                                                                          The problem with fixed point is that you have to choose one range of precision, otherwise you’re just inventing what is likely to be a suboptimal software version of floating point. While there are (were?) cases where fixed point is acceptable, in general floating point can do better, and is faster.

                                                                                                                                          The reason fixed point is slower boils down to the lack of hardware support for fixed point, but there are a few other reasons - efficiently and accurately computing a number of real functions often requires converting fixed point to some variant of floating point anyway.

                                                                                                                                          In general integer operations are faster for basic arithmetic (and I really mean the basics: +, -, *); complex functions are typically made “fast” in fixed-point arithmetic by having lookup tables that approximate the results, because fixed-point arithmetic is typically used in places where accuracy is less important.

                                                                                                                                          Multiplication, addition, and subtraction of floating point are only marginally slower than integer arithmetic, and once you add in the shifts required for fixed-point arithmetic, floating point actually outperforms it.

                                                                                                                                          1. 1

                                                                                                                                            I have no idea what you mean by “lack of hardware support”. Manipulating integers is literally everything a processor does at a low level.

                                                                                                                                            What are you referring to?

                                                                                                                                            1. 1

                                                                                                                                              It’s not a matter of just doing integer operations, because as you say everything is fundamentally integers in a CPU. The question is how many integer operations you have to do.

                                                                                                                                              If you’re doing fixed-point arithmetic you have to do almost everything floating-point logic requires, only without hardware support. Fixed point arithmetic isn’t simply integer arithmetic; it’s integer arithmetic plus large-integer work, plus shifts. And there isn’t hardware support, because if you’re adding hardware you may as well do floating point, which is more generally useful.

                                                                                                                                              1. 1

                                                                                                                                                Not to be stubborn, but I am still not getting your point.

                                                                                                                                                The question is how many integer operations you have to do.

                                                                                                                                                Less than half as many as if you use floats, obviously. Whatever operations your CPU does for integers, it needs to do for the mantissa of your floats, plus handle the exponents, plus move stuff out of the way and back in place.

                                                                                                                                                Fixed point arithmetic isn’t simply integer arithmetic

                                                                                                                                                I am not sure what you think I am suggesting, but to be clear it is: reduce all your variables to integers and do only integer arithmetic. It is, in the end, everything a processor is capable of doing. Integer arithmetic. Everything builds on it.

                                                                                                                                                I think the confusion here is the notion of “point”. A computer is capable of representing a finite number of states. A point is useful for us humans to make things more readable. But for a computer, a number is always an element in a finite set. You suggest I need to mess around with fixed-point arithmetic because I reject floats. But what I mean is: unless you hit scale limitations, there is no reason for using anything other than integers.

                                                                                                                                                If the confusion is how the result is presented to the user… that is a non-problem. Just format your number to whatever is most human-readable.

                                                                                                                                                1. 2

                                                                                                                                                  Not to be stubborn, but I am still not getting your point.

                                                                                                                                                  no worries

                                                                                                                                                  Ok, the first problem here is that you can’t reduce everything to integer arithmetic: if I am doing anything that requires fractional values I need to adopt either fixed-point or floating-point arithmetic. Fixed point is inherently too inflexible to be worth creating a hardware back end for in a general-purpose CPU, so it has to be done in software, which gives you multiple instructions for each operation. If you are comparing fixed point to floating point in software, fixed point generally wins, but the reality is that floating point is in hardware, so the number of instructions you are dispatching (which for basic arithmetic is the bottleneck) is lower, and floating point wins.

                                                                                                                                                  In this case point has nothing to do with what the human visible representation is. The point means how many fractional bits are available. It doesn’t matter what your representation is, floating vs fixed, the way you perform arithmetic is dependent on that decision. Fixed point arithmetic simplifies some of this logic which is why in software implementations it can beat floating point, but it does that by sacrificing range and precision.

                                                                                                                                                  To help clarify things, let’s use a concrete example: how are you proposing 1.5 gets represented, and how do you perform 1.5 * 0.5 and represent the result? I need to understand what you are proposing :D
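For reference, here is how one common fixed-point convention, Q16.16, would handle this question. This is a sketch of a conventional format, not necessarily what either commenter has in mind:

```python
# Q16.16 fixed point: 16 integer bits, 16 fractional bits.
FRAC_BITS = 16
ONE = 1 << FRAC_BITS          # 65536 represents 1.0

def to_fixed(x: float) -> int:
    return round(x * ONE)

def fx_mul(a: int, b: int) -> int:
    # The raw product carries 32 fractional bits, so it must be
    # shifted back down - the extra step plain integer multiply lacks.
    return (a * b) >> FRAC_BITS

a = to_fixed(1.5)             # 98304
b = to_fixed(0.5)             # 32768
assert fx_mul(a, b) == to_fixed(0.75)
```

The shift in `fx_mul` is exactly the software overhead discussed above: the multiply itself is one integer instruction, but the renormalisation is extra work that floating-point hardware does for free.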

                                                                                                                                                  1. 1

                                                                                                                                                    I think the claim that precision and range are sacrificed doesn’t really hold. There is no silver bullet. The range of floats is larger because it has less precision as you get closer to the limits. Arguably, it has more precision where it is most useful, but this can be very deceiving. Include a large number in your computation and the end result might have less precision than most people would think. They look at the decimal representation with a zillion decimal places and assume a great deal of precision. But you might have polluted your result with a huge error and it won’t show. This doesn’t happen with ints. You reach range limitations faster, of course… but this isn’t very common with 64-bit ints.

                                                                                                                                                    But your final question perfectly illustrates the problem. As a programmer, you need to decide what should happen ahead of time. If you mean those values as exact values then you pretty much need a CAS to handle fractions, roots and so on. Which obviously has no use for floats. If you mean approximate values, you need to be explicit and be in charge of the precision you intend. 1.5 * 0.5 is 0.7 or 0.8, depending on how you round. It doesn’t make sense to include more decimal places if you are not doing exact calculus.

                                                                                                                                                    We learn this in school, and my pocket TI calculator does this. If you set precision to automatic and insert 1/3, the result is zero. But if you insert 1/3.0, the result is 0.3. Why would you want more decimal places if the number cannot possibly be stored with its exact value and is derived from numbers with less precision?

                                                                                                                                                    If you write 1.000 kg, it doesn’t mean the same as 1 kg. The first implies precision to the gram, and the easiest thing when writing a computer program is to just reduce to grams and proceed with integer arithmetic.
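A tiny Python sketch of the grams idea, contrasted with the binary-float behaviour it avoids:

```python
# Summing 0.1 kg ten times: binary floats accumulate representation
# error, while the same masses reduced to integer grams stay exact.
float_total_kg = sum([0.1] * 10)    # 0.1 kg as a binary float
gram_total = sum([100] * 10)        # the same masses, in grams

assert float_total_kg != 1.0        # off by a small binary-rounding error
assert gram_total == 1000           # exactly 1 kg
```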

                                                                                                                                                    1. 3

                                                                                                                                                      the claim that precision and range are sacrificed doesn’t really hold

                                                                                                                                                      This is well studied. For example, I’ve seen the results of computational fluid dynamics simulations: taking f128 to be “ground truth”, f64 gets far closer to the correct answer than any fixed64 representation.

                                                                                                                                          2. 3

                                                                                                                                            Consider something like 1 / x², where x >> 1. You have to calculate x², which will be a very large number, and then take the reciprocal, which will be a very small number. You can’t pick a single fixed-point format to cover both, and there’s no opportunity in that one calculation to switch between two formats.

                                                                                                                                            Situations like that are common in many scientific applications, where intermediate stages of computation are much bigger or smaller than both your inputs and final output.
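A quick Python sketch of the 1 / x² case, quantising the result to an assumed Q.16 fixed-point base unit:

```python
# With x >> 1, the intermediate x**2 is huge and the result is tiny;
# no single fixed-point format holds both ends of the calculation.
x = 10.0 ** 6
square = x * x              # 1e12 - needs ~40 integer bits
result = 1.0 / square       # 1e-12 - needs ~40 fractional bits

FRAC_BITS = 16              # a hypothetical Q.16 base unit
fixed_result = int(result * (1 << FRAC_BITS))

assert fixed_result == 0    # the fixed-point value underflows to nothing
assert result == 1e-12      # the float keeps it
```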

                                                                                                                                            1. 1

                                                                                                                                              That is when one would use floats, yes. But let’s be clear: they are common in some scientific applications, specifically chemistry. The max value of a 32-bit integer is plentiful for most usages.

                                                                                                                                              64-bit processors have been the standard for over a decade. Even the situations you mention hardly need a range larger than a 64-bit integer.

                                                                                                                                              1. 2

                                                                                                                                                That is when one would use floats, yes. But let’s be clear: they are common in some scientific applications, specifically chemistry. The max value of a 32-bit integer is plentiful for most usages.

                                                                                                                                                I can’t think of a scientific field which wouldn’t prefer floats to 32-bit integers. What happens when you need to find a definite integral, or RK4 a PDE, or take the determinant of a matrix?

                                                                                                                                                64-bit processors have been the standard for over a decade. Even the situations you mention hardly need a range larger than a 64-bit integer.

                                                                                                                                                If we’ve got 64 bits, then why not use a double?

                                                                                                                                                1. 1

                                                                                                                                                  Regarding your first paragraph: I don’t think you are getting that I am suggesting to adjust the base unit to whatever precision delta you intend. Otherwise I don’t understand your question. Could you be clear about what exactly happens if you use floats that wouldn’t happen otherwise? They are both data types made of a discrete set representing points on the real number axis. What limitations exactly are you suggesting integers have, other than their range?

                                                                                                                                                  As for your second paragraph, isn’t it the other way around? Isn’t the point of floats to overcome integer range and precision limits and strike a balance between both? Why would you need to do that if you don’t have such limitations anymore? Floats were used all the time on 8-bit processors, even for things for which you would use integers, because of range limitations. We don’t need to do that on our 32- and 64-bit processors.

                                                                                                                                                  I think there is this wrong idea that ints are meant to be used for natural numbers and such only. Which is of course a misconception.

                                                                                                                                                  1. 1

                                                                                                                                                    Regarding your first paragraph: I don’t think you are getting that I am suggesting to adjust the base unit to whatever precision delta you intend. Otherwise I don’t understand your question. Could you be clear about what exactly happens if you use floats that wouldn’t happen otherwise? They are both data types made of a discrete set representing points on the real number axis. What limitations exactly are you suggesting integers have, other than their range?

                                                                                                                                                    My point is that all three of those things involve working with both very large and very small numbers simultaneously. You can’t “just set the precision delta”. Or if you can, you’d have to provide a working demonstration, because I believe it’s much harder than you’re claiming it is.

                                                                                                                                                    Also, lots of science involves multiplying very small by very large numbers directly, such as with gravitational force.

                                                                                                                                                    As for your second paragraph, isn’t it the other way around? Isn’t the point of floats to overcome integer range and precision limits and strike a balance between both? Why would you need to do that if you don’t have such limitations anymore? Floats were used all the time on 8-bit processors, even for things for which you would use integers, because of range limitations. We don’t need to do that on our 32- and 64-bit processors.

                                                                                                                                                    I think we use them for lots of reasons, and one is that you don’t need to pick a basis in advance of computation, like you do with fixed width.
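The gravitational-force case mentioned above, sketched in Python with standard constant values:

```python
# Newtonian gravity: a tiny constant times huge masses, divided by a
# large distance squared - intermediates span ~50 orders of magnitude,
# so no single fixed base unit covers both G and m_earth * m_moon.
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
m_earth = 5.972e24   # kg
m_moon = 7.342e22    # kg
r = 3.844e8          # mean Earth-Moon distance, m

force = G * m_earth * m_moon / r**2   # roughly 2e20 newtons
```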

                                                                                                                                      2. 1

                                                                                                                                        Floating-point numbers can only represent (binary) fractions, but many real numbers need to be represented by computations which emit digits.

                                                                                                                                      3. 3

                                                                                                                                        One of the most important reasons is that floats invariably have literal syntax, whereas “proper” decimals usually do not.

                                                                                                                                        1. 1

                                                                                                                                          How so?

                                                                                                                                          1. 1

                                                                                                                                            eg in Python

                                                                                                                                            # literal reals in python are IEEE floats
                                                                                                                                            >>> 0.2 + 0.1
                                                                                                                                            0.30000000000000004
                                                                                                                                            

                                                                                                                                            vs

                                                                                                                                            # Decimal does exact arbitrary-precision decimal arithmetic - ie proper numbers
                                                                                                                                            >>> from decimal import Decimal
                                                                                                                                            >>> Decimal("0.2") + Decimal("0.1")
                                                                                                                                            Decimal('0.3')
                                                                                                                                            

                                                                                                                                            Extra syntax and extra library (even though it’s in the stdlib!) is a huge barrier. I have seen a number of real world systems be written to use floats - and suffer constant minor bugs - simply because it was easier.

                                                                                                                                            Once or twice I have ripped out floats for decimals. It’s not too hard but you do need a typechecker to keep things straight.

                                                                                                                                        2. 2

                                                                                                                                          Precision degrades much more gracefully with floating-point operations (which round to approximate values or saturate to 0 or inf) than with integer or fixed-point operations (which truncate or overflow).

                                                                                                                                          If you have to do work with real numbers then floats are usually best of those three options.
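A small Python illustration of that graceful degradation; the integer wrap is simulated with a mask, since Python’s own ints are arbitrary precision:

```python
# Float overflow saturates to inf - wrong, but loudly and detectably.
big = 1e308
assert big * 10 == float("inf")

# A 32-bit integer instead wraps around silently (simulated here):
wrapped = (0x7FFFFFFF + 1) & 0xFFFFFFFF
assert wrapped == 0x80000000    # as a signed int32 this is -2147483648

# Tiny float results round gradually toward zero rather than truncating:
assert 1e-300 * 1e-10 > 0.0     # subnormal, but still nonzero
```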