Threads for jhl

  1. 3

    The first person who successfully compiles must get an award!

    1. 7

      I think Foone is trying to make it happen: https://twitter.com/Foone/status/1511808848729804803

      1. 1

        Of course he is

        1. 3

          They are the reason it was open sourced.

          I wonder where they get the time to spend on all of these reverse engineering efforts?

          1. 3

            They have a Patreon, I just chipped in today: https://www.patreon.com/foone/

            1. 2

              I adore and support them… but it doesn’t look like enough money to support a life? Especially a life with a serious retro tech buying habit!

              1. 1

                Maybe they have other income streams? Hopefully, anyway

    1. 3

      I used this hack in a production embedded system which I’ve been maintaining for the past 7 years…

      It was simpler and more performant than switching to an RTOS, but there’s definitely a bit of a footgun when writing new code - if you define new variables that have to survive a yield, you’d better remember to declare them static!
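
      For anyone who hasn’t run into the trick before, here’s a minimal sketch of one common form of it (the switch-on-__LINE__ coroutine variant; the CR_* macro names are my own shorthand, not anything from the article), showing exactly why locals have to be static:

          #include <stdio.h>

          /* Coroutine macros in the style of the trick under discussion
             (names are mine, not the original article's). */
          #define CR_BEGIN(state)    switch (state) { case 0:
          #define CR_YIELD(state, v) do { state = __LINE__; return (v); case __LINE__:; } while (0)
          #define CR_END(state)      } state = 0; return -1

          /* Returns 0, 1, 2, ... - one value per call, resuming where it left off. */
          static int counter(void)
          {
              static int state = 0; /* resume point - must persist between calls */
              static int i;         /* MUST be static: a plain local would not
                                       survive the return hidden inside CR_YIELD */

              CR_BEGIN(state);
              for (i = 0; ; i++)
                  CR_YIELD(state, i);
              CR_END(state);
          }

          int main(void)
          {
              for (int n = 0; n < 5; n++)
                  printf("%d ", counter()); /* prints: 0 1 2 3 4 */
              printf("\n");
              return 0;
          }

      Each call runs until the next CR_YIELD, returns the value, and jumps back to the matching case label on the next call; since the function genuinely returns at every yield, only static (or otherwise non-stack) data survives across it.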

      1. 1

        As I recall, Contiki used this as well, with precisely the same footgun. They called their model protothreads.

      1. 2

        As someone who does a lot of embedded, electronics, construction, and cooking… I love these things!

        My own go-to is the fairly hidden gem that is Qalculate’s CLI: https://qalculate.github.io/manual/qalc.html It doesn’t have a whole programming language; a matter of taste I guess.

        1. 6

          I’m not sure how I feel about the “Five Whys” example. The classic problem with root cause analysis is that people stop looking too early. Demonstrating this, the author stops at step 3 with “there is a bug in the parser”… when surely step 4 is “we are using a regex to parse a query string”, a rootier cause.

          1. 1

            That’s fair. I would note that five whys is not always intended to be used exactly five times - but to be used until you reach a root cause. The root cause in this case, IMO, is not that a regex is used, but that the regex is incorrect.

          1. 1

            This seems pretty unnecessary (you could use the mute button in the conferencing app, or use the same touch controller thingy as a HID device plugged into the computer that would mute/unmute the mic’s capture device, which I think even changes the color of the light on it), but it’s nicely done nonetheless.

            1. 2

              I’d argue that a hardware mute button beats both of those options for usability, especially given how much some of us use it.

              For the first - what if you’re working in another app?

              For the second, you’d need a helper app running on the PC to manage the mute status, which adds complexity and a point of failure. It’s been a while since I last looked at the USB audio class spec, but I’m pretty confident mute status doesn’t go out on the wire, so you’d need another indicator on your button. (Not a bad thing, IMO.)

              Personally I went out of my way to add a hardware mute button to my desk mic, with a nice big red tally light to help me keep track of the mute status. It’s nice to have something that just works and always will.

              1. 1

                For stuff like this, dedicated buttons beat kludged-together software every time, for me. Yep.

                1. 1

                  For the second, you’d need a helper app running on the PC to manage the mute status, which adds complexity and a point of failure.

                  Sure, but not too much of one. It should be possible to make it stupid enough that it never breaks, and set it to always run if the controller is plugged in.

                  It’s been a while since I last looked at the USB audio class spec, but I’m pretty confident mute status doesn’t go out on the wire, so you’d need another indicator on your button.

                  It is (or it can be, anyhow); USB Audio calls out mute and volume as supported kinds of “controls”. Of course you don’t need to have any controls for a USB mic, you can let everything be done in software, but the Yeti does expose mixer controls, and I believe (I don’t have access to mine right now, so I’m going on memory) that the mute button on the device itself just sets the gain control to -∞dB and then restores it, and that doing the same thing from the host side will make the light on the mute button turn red.

                  I’m not arguing too seriously here, I’m just saying… you could probably get by without opening the thing up, and given my hardware skills, it’s the way I would have gone.

                  1. 2

                    It should be possible to make it stupid enough that it never breaks

                    Call me a stupid hardware engineer, but in my experience getting stuff like this to be completely bulletproof (eg over suspend/resume etc) can be non-trivial!

                    USB Audio calls out mute and volume

                    Thanks for teaching me something new today (:

                2. 1

                  Yep.

                  I have the same mic. I have a dedicated button on my keyboard bound to mute all mics via pulseaudio. The Blue notices this… somehow and changes the light on the hardware button. I was surprised the first time I noticed.

                  It also changes an indicator on my status bar so I know whether it’s hot or muted without looking away from my screen. It mutes every mic attached to the computer, so just in case something has bound to the webcam mic by accident I’m still safe.

                1. 9

                  Seems like we need a deposit regime to ensure all designs and code are available, and probably at least one third-party maintenance partner that can repair the devices.

                  1. 13

                    I’d prefer a system where the national health service (everyone has those, right?) requests tenders, selects vendors, pays for upkeep/training for a number of years, and gets access to all source code / manufacturing details etc.

                    It might not be VC money, but done right it can keep a company profitable for years.

                    (I don’t know if health services already do this, e.g. for ear implants.)

                    This article is pretty depressing, both because of the social failure (not market failure) and because the tech is, despite all this, still so limited. My wife and stepchildren have retinitis pigmentosa, and I had naively imagined tech would at least alleviate their condition.

                    1. 6

                      Imagine technology chosen for being the lowest cost, or for having greased the right palms. Sounds like I’d prefer the open source approach.

                      1. 1

                        Agreed on the palm-greasing part, but lowest cost in itself is not the issue - the problem is when quality is sacrificed in pursuit of lower costs.

                        There have to be mechanisms in place to ensure that quality (across many aspects) is held above a standard, while still allowing vendors and manufacturers to drive costs down. Low cost is a long-term good thing for people with lower wealth, because even assuming large subsidies, there is only so much we can allocate to particular problems.

                        This is important in every industry, but the quality aspect has to be much more heavily weighted in the medical device world because of the drastic consequences of allowing companies to ignore their long-term negative externalities, as seen in the article.

                        It’s a tough issue because it has to be balanced with incentivising smart people to solve these problems in the first place. I like the discussion here around escrow of code, but it seems like we need more than that. Escrow the hardware design files as well, and require proof that the company has done due diligence in protecting their supply chain from single points of failure. Multiple sources for components. Escrow the documents detailing repair procedures, including repair part sources and test equipment. Etc.

                      2. 4

                        Interesting that you (correctly, in my opinion) identify this as not a market failure.

                        I’m of a different opinion to you as to how best to handle this sort of situation. I’m not a socialist; rather, I think that voluntary solutions to problems like these are better in a number of ways. This particular problem could be solved by either the use of appropriately licensed free software and hardware, or an escrow system in the case that the devices are withdrawn from the market for any reason (thereby mitigating planned obsolescence, too).

                        However I am sick of the term “market failure” being used to describe a functioning market providing prices that people don’t like, so I was pleased to see someone of (I presume) quite different political stripes calling it out.

                        1. 7

                          A state/nation is probably the only entity that can correctly price what this kind of technology is worth. In principle, a blind person could be expected to spend a lot of money to get even a small amount of sight back, but in practice, most blind people are economically disadvantaged, because their handicap precludes them from amassing enough capital to actually pay for the solution.

                          1. 4

                            A state/nation is probably the only entity that can correctly price what this kind of technology is worth.

                            But then you’re right back to considering this a market failure - only instead of proposing price regulation, you’re proposing the establishment of a monopsony to use taxpayer money to drive the prices up to where you personally, not the market, think they belong. Presumably to create incentives for the research?

                            but in practice, most blind people are economically disadvantaged, because their handicap precludes them from amassing enough capital to actually pay for the solution.

                            Yes. Probably a VC-funded for-profit isn’t the answer here. Perhaps a charitably-funded non-profit? (Edited to add: this would also address the obsolescence issues, if done properly.)

                            1. 2

                              you’re proposing the establishment of a monopsony to use taxpayer money to drive the prices up to where you personally, not the market, think they belong

                              This is not what I propose at all.

                              This is drifting off-topic, so I won’t continue this thread. ∎

                        2. 3

                          Mostly a good idea but

                          pays for upkeep/training for a number of years

                          This is still a cliff someone could fall off. Individualist solutions create lots of cliffs for everyone who doesn’t have infinite money, but collectivist solutions also pose similar problems. Full access to the materials required to repair your own implants alleviates this at least a little - you still need money and/or skills, but at least you’re not relying on an organization (governmental or private) to provide support forever.

                          1. 3

                            A country knows through statistics how many citizens have RP. It can also calculate the raw savings of allowing people with RP more years of productive work if they get this kind of therapy. This can be translated into a commitment to pay a company a certain amount of money for a technical solution, letting that company set up tooling and training. As the technology advances, and more patients are helped, the technology can be accepted in more countries, opening up new markets and helping more people.

                            Granting a patent will ensure the technology is available after the state-granted monopoly has expired.

                            1. 3

                              Measuring the benefit is really difficult though - it can take a long time and a lot of samples, and it’s difficult to predict how an early stage technology’s benefits would develop - the target can be moving pretty quickly. Even getting VC funding for something that’s working fantastically can be tricky, especially if you run into non-medical issues (regulatory, product engineering etc).

                              I’d dearly love to see state-sponsored systems to bootstrap these things but it seems like working out what will succeed is a nigh intractable problem. I’ve been building implantable neuromodulators for over a decade and I’ve seen so many devices that just seem to be splashing around in the dark. On reflection, all the impossible promises about high-res retinal prostheses are a good example!

                              Adding more electrodes doesn’t work for very long. The electrodes are relatively far from the nerves they stimulate so the current spreads a lot. And you’re limited in how many electrodes you can activate at once due to the current density - so that you don’t cook the patient.

                              A state agency trying to review this without the wisdom of having built these things is difficult to set up - even the existing safety regulators have trouble keeping up (no doubt a lot of that is due to under-resourcing, though).

                              1. 3

                                Thanks for expanding! I don’t know much about this tech space.

                                Measuring the benefit is really difficult though

                                Absolutely. But I think it’s more achievable if you have a large sample size / patient pool, guaranteed funding, and requirements for follow-up etc. I also think that if you’re designing something that will be used for the rest of someone’s life, you should have different priorities than just looking at the next quarter. Trusts seem to shoulder some of this responsibility in many countries.

                                Adding more electrodes doesn’t work for very long. The electrodes are relatively far from the nerves they stimulate so the current spreads a lot. And you’re limited in how many electrodes you can activate at once due to the current density - so that you don’t cook the patient.

                                Ah, the old software/hardware interface. I curse it every time I have to use a printer ;)

                                Seriously though, it is a bit depressing that the stuff that’s breezily foretold in SF isn’t achievable in the real world due to heat dissipation.

                                To sum up, we’re in the early days of the kind of medical devices that require peripherals like cameras etc. The patients in the linked article presumably knew they were part of a pilot group (although information asymmetry comes into play really quickly). I think the company made a good-faith effort to get a product to succeed, but the financial support simply wasn’t there.

                                The French effort seems to target age-related macular degeneration which is a better bet - a much bigger market (all those boomers who will demand tech solutions to their aging problems) and hopefully RP patients can “ride the coattails”.

                      1. 3

                        This sshd got started inside the “doubly niced” environment

                        As for why “the processes didn’t notice and then undo the nice/ionice values”, think about it. Everyone assumes they’re going to get started at the usual baseline/default values. Nobody ever expects that they might get started down in the gutter and have to ratchet themselves back out of it. Why would they even think about that?

                        These days, this should stand out as a red flag – all these little scripts should be idempotent.

                        You shouldn’t write scripts where, if you Ctrl-C them and then re-run them, you get these “doubling” effects.

                        Otherwise if the machine goes down in the middle, or you Ctrl-C, you are left with something that’s very expensive to clean up correctly. Writing idempotent scripts avoids that – and that’s something that’s possible with shell but not necessarily easy.

                        As far as I can tell, idempotence captures all the benefits of being “declarative”. The script should specify the final state, not just a bunch of steps that start from some presumed state – which may or may not be the one you’re in!


                        I guess there is not a lot of good documentation about this, but here is one resource I found: https://arslan.io/2019/07/03/how-to-write-idempotent-bash-scripts/

                        Here’s another one: https://github.com/metaist/idempotent-bash

                        1. 9

                          I believe the “doubly niced” refers to “both ionice and nice”. There wasn’t any single thing being done twice by accident. The issue is with processes inheriting the settings due to Unix semantics.

                          1. 4

                            The problem is the API - it increments the nice value rather than setting it. From the man page:

                            The nice() function shall add the value of incr to the nice value of the calling process.

                            So the nice value did end up bigger than desired.
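
                            A tiny sketch (mine, not from the article) of the difference - and of why setpriority() is the idempotent way to express “this process should run at nice 19”, which ties back to the idempotence point upthread:

                                #include <stdio.h>
                                #include <unistd.h>        /* nice() */
                                #include <sys/resource.h>  /* getpriority(), setpriority() */

                                int main(void)
                                {
                                    /* nice() is relative: re-running the same "renice myself"
                                       step (say, after a Ctrl-C and restart) keeps adding. */
                                    nice(5);
                                    nice(5);
                                    printf("after nice(5) twice:       %d\n",
                                           getpriority(PRIO_PROCESS, 0)); /* 10, not 5 */

                                    /* setpriority() is absolute, so repeating it is harmless. */
                                    setpriority(PRIO_PROCESS, 0, 19);
                                    setpriority(PRIO_PROCESS, 0, 19);
                                    printf("after setpriority(..., 19): %d\n",
                                           getpriority(PRIO_PROCESS, 0)); /* 19 */
                                    return 0;
                                }

                            (Error handling is omitted for brevity; real code needs to check errno around getpriority(), since -1 is also a legitimate nice value.)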

                            1. 3

                              That is an interesting quirk of nice()/renice, but in this case I believe they explicitly stated they originally set the nice value to 19, which is the maximum.

                              1. 2

                                Thanks, you’re right! Took me a second reading…

                            2. 1

                              Ah yeah you could be right …

                              But still the second quote talks about something related to idempotence. It talks about assuming you’re in a certain state, and then running a script, but you weren’t actually in that state. Idempotence addresses that problem. It basically means you will be in the same finishing state no matter what the starting state is. The state will be “fixed” rather than “mutated”.

                              1. 3

                                Hmm, I still don’t think this is the case. The state being dealt with is entirely implicit, and the script in question doesn’t do anything with nice values at all, and yet still should be concerned about them.

                          1. 1

                            DO NOT LEAVE A DECPACK IN THE RK05. It will be damaged and likely damage the drive as well. Someone will have to open the case and check on the heads. Don’t do it!

                            Pretty sure you can just put a new battery in an RL.

                            1. 1

                              What kind of batteries do they use, anyway?

                              1. 3

                                According to the illustrated parts breakdown (pg 18 item 38) the battery pack is a 4-cell battery, p/n 12-10641-00. And according to this 27-year-old post, those are 4 NiCads in series. So replaceable, but not quite off the shelf?

                                The most amusing thing I learned from all this is actually the little elapsed-time indicator on the back of the device, used to tell the user when the 1500-hour maintenance interval is up. It’s a column of mercury with a little index bubble in it, and the index slowly worms its way along the mercury column. At the 1500 hour mark, you pull out the indicator and flip it around, and the index starts heading in the other direction.

                                They work by electrolysis - the index is actually a droplet of electrolyte; when you pass a current through it, the mercury on one side of the bubble starts dissolving into the electrolyte, shortening that column; and more mercury plates out as metal at the other side, making the opposing column longer. It’s ingeniously simple and apparently you could buy these things as recently as 2002.

                                Compare and contrast with the safer solution that succeeded it, the electromechanical counter, which is much bulkier, and much more complicated.

                                1. 1

                                  It’s a column of mercury with a little index bubble in it…

                                  …that is really cool. I wonder if we could do this with something other than mercury? Worst case, gallium with a small heater in it…

                                2. 1

                                  IIRC, it’s a standard kind of cell - AA or AAA?

                              1. 3

                                I had the weirdest experience with one of these.

                                My sister was staying with us from out of town, and was working on her Macbook in another room (connected to our wifi of course). I pressed the play button on our “TV” remote… and it auto-started playing music ON HER LAPTOP IN THE OTHER ROOM.

                                Obviously we had never configured her computer to talk to the remote! It was some kind of UPnP+DLNA thing, gone horribly wrong.

                                We threw it out shortly after.

                                1. 1

                                  For a while there Macbooks had an IR remote receiver - you could buy a dinky little Apple remote for media control (or maybe they even came with some laptops?). And from memory they used one of the standard remote codes, so…

                                  1. 1

                                    It was only 2 years ago, with a modern-ish Macbook… somehow it was auto-discovering the mumbles DLNA port…

                                1. 2
                                  • et-see - initially used to call it ee-tee-see till i heard someone pronounce it et-see
                                  • lib - like lib[erty]
                                  • char - like char[acter], called it char[coal] till someone pointed out it was wrong (it wasn’t wrong, but the person was not comfortable with the idea that there can be different pronunciations)
                                  • f-es-see-k
                                  • skema

                                  i taught myself to code and had different pronunciations in my head which changed as i was exposed to other people who also wrote code and used linux.

                                  1. 2

                                    It is said that people who pronounce things “wrong” should be respected because they learned it from a book (or now the Internet) rather than being taught in person :)

                                  1. 3

                                    I’m Australian, and we pronounce everything wrong, so my apologies; but:

                                    1. ettuk (’ɛttək)
                                    2. lib (liberated)
                                    3. char (like burning, not like character)
                                    4. fsk (unvoiced f; ski)
                                    5. ski ma / skeema; the plural is “heaps schemas mate”

                                    As a confusing bonus, we pronounce route as “root” (to a place, or on a network). However, both the woodworking and networking routers are called “rout er”, in the American fashion. This might be because the verb “to root” means something quite different here, and most people don’t do that in public.

                                    1. 10

                                      This page is really painful to read: it’s quite aggressive towards the author of xz. The tone is really needlessly nasty. There are only elements against xz/lzma2, nothing in favor; it’s just criticism whose conclusion is “use lzip [my software]”.

                                      Numbers are presented in whichever way makes them look biggest: a “0.015% (i.e. nothing) to 3%” efficiency difference is then turned into “max compression ratio can only be 6875:1 rather than 7089:1”, but that’s over 1TB of zeroes and only 3% relative to the compressed data, which amounts to a 4*10^-6 difference on the uncompressed data! (And if you’re compressing that kind of thing, you might want to look at lrzip.)
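
                                      Working the arithmetic through (assuming the 1 TB figure is the uncompressed input): 1 TB at 7089:1 comes out to roughly 141 MB, and at 6875:1 to roughly 145 MB. The ~4 MB gap is indeed about 3% of the compressed output, but only about 4*10^-6 of the original terabyte.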

                                      The author fails to understand that xz’s success has several causes besides compression ratio and the file format. It’s a huge improvement over gzip and bzip2 for packages. The documentation is really good and helps you get better results both with compression ratio and speed (see “man xz”). It is ported pretty much everywhere (that includes OS/2 and VMS iirc). It is stable. And so on.

                                      As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption. Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy. But redundancy is what you eliminate when you compress. What is used for archiving of audio and video? Simple formats with low compression at best. The thing with lzip is that while its file format might be better suited for archiving, lzip itself as a whole still isn’t suited for archiving. And that’s ok.

                                      Now, I just wish the author would get less angry. That’s one of the ways to a better life. Going from project to project and telling them they really should abandon xz in favor of lzip for their source code releases is only a sign of frustration and a painful life.

                                      1. 6

                                        The author fails to understand that xz’s success has several causes besides compression ratio and the file format.

                                        But the author doesn’t even talk about that? All he has to say about adoption is that it happened without any analysis of the format.

                                        Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy. But redundancy is what you eliminate when you compress.

                                        This sounds like “you can’t be team archiving if you are team compression, they have opposite redundancy stat”. It’s not an argument, or at least not a sensical one. Compression makes individual copies more fragile; at the same time, compression helps you store more individual copies of the same data in the same space. So is compression better or worse for archiving? Sorry, I’m asking a silly question. The kind of question I should be asking is along the lines of “what is the total redundancy in the archiving system?” and “which piece of data in the archiving system is the weakest link in terms of redundancy?”

                                        Which, coincidentally, is exactly the sort of question under which this article is examining the xz format…

                                        What is used for archiving of audio and video? Simple formats with low compression at best.

                                        That’s a red herring. A/V archiving achieves only low compression because it eschews lossy compression and the data typically doesn’t lend itself well to lossless compression. Nevertheless it absolutely does use lossless compression (e.g. FLAC is typically ~50% smaller than WAV because of that). This is just more “team compression vs team archiving”-type reasoning.

                                        The thing with lzip is that while its file format might be better suited for archiving, lzip itself as a whole still isn’t suited for archiving.

                                        Can you actually explain why, rather than just asserting so? If lzip has deficiencies in areas xz does well in, could you step up and criticise what would have to improve to make it a contender? As it is, you seem to just be dismissing this criticism of the xz format – which as a universal stance would result in neither xz nor lzip improving on any of their flaws (in whatever areas those flaws may be in).

                                        As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption.

                                        Juxtaposing this with your “author fails to understand” statement is interesting. Should I then say that you fail to understand what the author is even talking about?

                                        This page is really painful to read: it’s quite aggressive towards the author of xz.

                                        I saw only a single mention of a specific author. All the substantive statements are about the format, and all of the judgements given are justified by statements of fact. The very end of the conclusion speaks about inexperience in both authors and adopters, and it’s certainly correct about me as an adopter of xz.

                                        There are only elements against xz/lzma2, nothing in favor; it’s just criticism which conclusion is “use lzip [my software]”.

                                        Yes. The authors of xz are barely mentioned. They are certainly not decried nor vilified, if anything they are excused. It’s just criticism. That’s all it is. Why should that be objectionable? I’ve been using xz; I’m potentially affected by the flaws in its design, which I was not aware of, and wouldn’t have thought to investigate – I’m one of the unthinking adopters the author of the page mentions. So I’m glad he took the time to write up his criticism.

                                        Is valid criticism only permissible if one goes out of one’s way to find something proportionately positive to pad the criticism with, in order to make it “fair and balanced”?

                                        Frankly, as the recipient of such cushioned criticism I would feel patronised. Insulting me is one thing and telling me I screwed up is another. I can tell them apart just fine, so if you just leave the insults at home, there’s no need to compliment me for unrelated things in order to tell me what I screwed up – and I sure as heck want to know.

                                        1. 2

                                          The author fails to understand that xz’s success has several causes besides compression ratio and the file format.

                                          But the author doesn’t even talk about that? All he has to say about adoption is that it happened without any analysis of the format.

                                          Indeed, this is more a comment about what appears to be bitterness from the author. This isn’t part of the linked page (although the tone of the article is probably a consequence).

                                          Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy. But redundancy is what you eliminate when you compress.

                                          This sounds like “you can’t be team archiving if you are team compression, they have opposite redundancy stat”. It’s not an argument, or at least not a sensical one. Compression makes individual copies more fragile; at the same time, compression helps you store more individual copies of the same data in the same space. So is compression better or worse for archiving? Sorry, I’m asking a silly question. The kind of question I should be asking is along the lines of “what is the total redundancy in the archiving system?” and “which piece of data in the archiving system is the weakest link in terms of redundancy?”

                                          Agreed. I’m mostly copying the argument from the lzip author. That being said, one issue with compression is that corruption of compressed data is amplified, with no chance of reconstructing the data, even by hand. Intuitively I would expect the best approach for archiving to be compression followed by adding “better” (i.e. more even) redundancy and error recovery (within the storage budget). Now, if your data has some specific properties, the best approach might be different, especially if you’re more interested in some parts (for instance, with a progressive image you might value the less detailed parts more, because losing the more detailed ones only costs you image resolution).

                                          Which, coincidentally, is exactly the sort of question under which this article is examining the xz format…

                                          What is used for archiving of audio and video? Simple formats with low compression at best.

                                          That’s a red herring. A/V archiving achieves only low compression because it eschews lossy compression and the data typically doesn’t lend itself well to lossless compression. Nevertheless it absolutely does use lossless compression (e.g. FLAC is typically ~50% smaller than WAV because of that). This is just more “team compression vs team archiving”-type reasoning.

                                          If you look at what archivists themselves say, FLAC isn’t one of the preferred formats. It is acceptable, but the preferred one still seems to be WAV/PCM.

                                          Sources:

                                          The thing with lzip is that while its file format might be better suited for archiving, lzip itself as a whole still isn’t suited for archiving.

                                          Can you actually explain why, rather than just asserting so? If lzip has deficiencies in areas xz does well in, could you step up and criticise what would have to improve to make it a contender? As it is, you seem to just be dismissing this criticism of the xz format – which as a universal stance would result in neither xz nor lzip improving on any of their flaws (in whatever areas those flaws may be in).

                                          I had intended the leading sentences to explain that. The reasoning is simply that compression is mostly at odds with long-term preservation by itself. As discussed above, proper redundancy and error recovery can probably turn that into a good match, but then the qualities of the compression format itself don’t matter that much, since the “protection” is done at another layer that is dedicated to that and also provides recovery.

                                          As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption.

                                          Juxtaposing this with your “author fails to understand” statement is interesting. Should I then say that you fail to understand what the author is even talking about?

                                          You’re obviously free to do so if you wish to. :)

                                          This page is really painful to read: it’s quite aggressive towards the author of xz.

                                          I saw only a single mention of a specific author. All the substantive statements are about the format, and all of the judgements given are justified by statements of fact. The very end of the conclusion speaks about inexperience in both authors and adopters, and it’s certainly correct about me as an adopter of xz.

                                          Being full of facts doesn’t make the article objective. It’s easy not to mention some things, and while the main author of xz/liblzma could technically answer, he doesn’t really wish to do so (especially since it would cause a very high mental load). That being said, I’ll take the liberty of quoting from IRC, where I basically only lurk nowadays (nicks replaced by “Alice” and “Bob”). This is a recent discussion; there were more detailed ones earlier, but I’m not just picking the most recent one.

                                          Bob : Alice the lzip html pages says that lzip compresses a bit better than xz. Can you tell me the technical differences that would explain that difference in size ?

                                          Bob : Alice do you have ideas on how improving the size with xz ?

                                          Alice : Bob: I think it used to be the opposite at least with some files since .lz doesn’t support changing certain settings. E.g. plain text (like source code tarballs) are slightly better with xz –lzma2=pb=0 than with plain xz. It’s not a big difference though.

                                          Alice : Bob: Technically .lz has LZMA and .xz has LZMA2. LZMA2 is just LZMA with chunking which adds a slight amount of overhead in a typical situation while being a bit better with incompressible data.

                                          Alice : Bob: With tiny files .xz headers are a little bloatier than .lz.

                                          Alice : Bob: In practice, unless one cares about differences of a few bytes in either direction, the compression ratios are the same as long as the encoders are comparable (I don’t know if they are nowadays).

                                          Alice : Bob: With xz there are extra filters for some files types, mostly executables. E.g. x86 executables become about 5 % smaller with the x86 BCJ filter. One can apply it to binary tarballs too but for certain known reasons it sometimes can make things worse in such cases. It could be fixed with a more intelligent filtering method.

                                          Alice : Bob: There are ideas about other filters but me getting those done in the next 2-3 years seem really low.

                                          Alice : So one has to compare what exist now, of course.

                                          Bob : Alice btw, fyi, i have tried one of the exemples where the lzip guy says that xz throws an error while it shouldn’t

                                          Bob : but it is working fine, actually

                                          Alice : Heh

                                          Two main points here: the chunking and the view that the differences are very small, and the fact that one of the complaints seems wrong.

                                          If I look for “chunk” in the article, the only thing that comes up is the following:

                                          But LZMA2 is a container format that divides LZMA data into chunks in an unsafe way. In practice, for compressible data, LZMA2 is just LZMA with 0.015%-3% more overhead. The maximum compression ratio of LZMA is about 7089:1, but LZMA2 is limited to 6875:1 approximately (measured with 1 TB of data).

                                          Indeed, the sentence “In practice, for compressible data, LZMA2 is just LZMA with 0.015%-3% more overhead.” is probably absolutely true. But there is no mention of what happens for incompressible data. I can’t tell whether that omission was deliberate or not, but it makes this paragraph quite misleading.

                                          Note that xz/liblzma’s author acknowledges some of the points of lzip’s author, but not the majority of them.

                                          There are only elements against xz/lzma2, nothing in favor; it’s just criticism which conclusion is “use lzip [my software]”.

                                          Yes. The authors of xz are barely mentioned. They are certainly not decried nor vilified, if anything they are excused. It’s just criticism. That’s all it is. Why should that be objectionable? I’ve been using xz; I’m potentially affected by the flaws in its design, which I was not aware of, and wouldn’t have thought to investigate – I’m one of the unthinking adopters the author of the page mentions. So I’m glad he took the time to write up his criticism.

                                          Is valid criticism only permissible if one goes out of one’s way to find something proportionately positive to pad the criticism with, in order to make it “fair and balanced”?

                                          I concur that writing criticism is a good thing, but the article is not really objective and probably doesn’t try to be. In an ideal world there would be a page with rebuttals from other people. In the real world, that would probably start a flamewar, and the xz/liblzma author does not wish to get involved in that.

                                          I’ve just looked up the author name + lzip and first result is: https://gcc.gnu.org/ml/gcc/2017-06/msg00044.html “Re: Steering committee, please, consider using lzip instead of xz”.

                                          Another scary element is that neither “man lzip” nor “info lzip” mentions “xz”. They mention gzip and bzip2 but not xz (“Lzip is better than gzip and bzip2 from a data recovery perspective.”). Considering the length of this article, not seeing a single mention of xz makes me think the lzip author does not have a peaceful relationship with xz.

                                          You might think that the preference of lzip in https://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html would be a good indication but the author of that manual is also lzip’s author!

                                          And now scrolling down my search results, I see https://lists.debian.org/debian-devel/2015/07/msg00634.html “Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide”, and the messages there again make me think he doesn’t have a peaceful relationship with xz.

                                          I don’t like criticizing authors, but with this one-sided article, with its surprising omissions and incorrect elements (no idea if that’s because things changed at some point), I think more context (and an author’s personality and history are context) helps decide how much you trust the whole article.

                                          Frankly, as the recipient of such cushioned criticism I would feel patronised. Insulting me is one thing and telling me I screwed up is another. I can tell them apart just fine, so if you just leave the insults at home, there’s no need to compliment me for unrelated things in order to tell me what I screwed up – and I sure as heck want to know.

                                          Yes, it’s cushioned because, as I said above, I don’t like criticizing authors and I’m uncomfortable doing it. I try to avoid it, but sometimes that can’t be separated from a topic or article, so I still ended up doing it at least a bit (you can see that I did it as little as possible in my previous message). With that being said, I don’t think the author needs to be told all of this, or at least I don’t want to start such a discussion with an author who seems able to go on for years (and tbh, I’m not sure that’s healthy for him).

                                          edit: fixed formatting of the IRC quote

                                        2. 3

                                          As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption. Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy.

                                          This is not true at all. [Edit: Most of the widely used professional backup and recovery software that was specifically designed for long-term archiving also included compression as an integral part of the package, and advertised its ability to work in a robust manner.]

                                          BRU for UNIX, for example, does compression, and is designed for archiving and backup. This tool is from 1985 and is still maintained today.

                                          Afio is specifically designed for archiving and backup. It also supports redundant fault-tolerant compression. This tool is also from 1985 and is still maintained today.

                                          [Edit: LONE-TAR is another backup product I remember using from the mid 1980s, was originally produced by Cactus Software. It’s still supported and maintained today. It provided a fault-tolerant compression mode, so it would be able to restore (most) data even if there was damage to the archive.]

                                          As to all your other complaints, it seems you are attacking the document’s “aggressive tone” and you mention that you find it painful (or offensive) to read, but you haven’t actually refuted any of the technical claims that the author of the article makes.

                                          1. 1

                                            Sorry, I had compression software in mind when I wrote that sentence. I meant that I had never seen a compression tool that made resistance to corruption such an important feature.

                                            Thanks for the links! I’m not that surprised that there are some pieces of software that already exist and fit in that niche (I would have had to build a startup otherwise!). I’m quite curious about their tradeoff choices (space vs. recovery capabilities), but since two of them are proprietary, I’m not sure that information is available, unfortunately.

                                            As to all your other complaints, it seems you are attacking the documents “aggressive tone” and you mention that you find it painful (or offensive) to read, but you haven’t actually refuted any of the technical claims that author of the article makes.

                                            Indeed. Part of that is because comments are probably not a good place for it, since the article itself is very long. The other part is because xz’s author does not wish to get into that debate, and I don’t want to pull him in by publishing his answers on IRC on that topic. It’s not a great situation and I don’t really know what to do, so I end up hesitating. Not perfect either. I think I mostly just hope to get people to question the numbers and facts on that page a bit, and to not forget everything else that goes into making a file format useful in practice - the absence of a rebuttal doesn’t mean the article is true, spot-on, unbiased and so on.

                                          2. 2

                                            I agree about the tone of the article, but I’m not sure that archiving and compression run counter to each other.

                                            I’ve spent a lot of time digging around for old software, in particular to get old hardware running, but also to access old data. Already we are having to dig up software from 20+ years ago for these things.

                                            In another 20 years, when people need to do the same job, it will be more complicated: if you need to run one package, you may find yourself needing tens or worse of transitive dependencies. If you’re looking in some ancient Linux distribution mirror on some forgotten server, what are the chances that all the bits are still 100% perfect? And certainly nobody’s going to mirror all these in some uncompressed format ;-)

                                            This is one case where being able to recover corrupted files is important. It’s also helpful to be able to do best-effort recovery on these; in any given distro archive you can live with corruption in some proportion of the resulting payload bytes - think of all the documentation files you can live without - but if a bit error causes the entire rest of the stream to be abandoned then you’re stuffed.

                                            I’d argue that archival is something we already practice in everyday release of software. The way people tend to name release files as packagename-version.suffix is a good example: it makes the filename unique and hence very easy to search for in the future. And here, picking one format over another where it has better robustness for future retrievers seems pretty low-cost. It’s not like adding parity data or something that increases sizes.

                                            1. 2

                                              Agreed. :)

                                              Makes me think of archive.org and softwareheritage.org (which already has pretty good stuff if I’ve understood correctly).

                                          1. 5

                                            There’s a bit of a winding road from new architecture to something you hold in your hands, as the author notes at the end in talking about affordable dev boards. What would you do with such a board? Probably not much you couldn’t do quicker, easier and cheaper with an ARM or other mainstream board, for now. So for most people it won’t be worth much until and unless there are useful and affordable commodity systems.

                                            But if you are a hardware builder, there is a good reason to care: most architectures aren’t free. Right now I’m looking at designing a new ASIC, and it’ll need a CPU. I’m prototyping in an FPGA. If I wanted to use an ARM core, I’d have to go and pay ARM a bunch of money, even if I were to write my own CPU core - because it uses their ISA. But if I use one of the free ISAs, like SPARC or RISC-V, then I can get going right now: there are free cores which I can use all the way from simulation to silicon. So for now I’m working with RISC-V.

                                            ARM have realised this hazard is real; they’ve made a couple of cores free for use with Xilinx FPGAs and I’m guessing that’s in part to lower the bar for the prototyping crowd like me. Licensing an ARM core isn’t that expensive (in the scheme of chip development), so we’ll see what everyone’s holding in 5-10 years’ time.

                                            1. 6

                                              I spent last week laying out PCBs for an experimental neuromodulation device, which was frustrating, because the PCB layout tool I use is an Australian monstrosity written in Delphi which is stonkingly expensive, and it’s chock full of inexplicable design choices and weird bugs… (I’m an Australian monstrosity too, and chock full of weird bugs thanks to my toddler, but at least I’m written in C.)

                                              The neuromod device is particularly interesting because it’s driven by an FPGA which I’m writing in Python via Migen. Having worked with VHDL/Verilog over the past decade it’s a really refreshing way to design logic. Migen has its faults - chief amongst which is lack of documentation - but it’s a very good tool. Being able to structure systems in a high-level language, and automate crappy jobs like memory map assignment, is fantastic.

                                              Now I’m getting my hobby project moving again, which is a plug-in cartridge for a Sega Saturn that hijacks its CD-controller brain and streams “CD content” from an SD card instead of a disc. The core hackery has been working great for years but now I’m trying to turn it into something people can pick up and use. I’m slogging through manufacturing and production test systems, which feel like 10× more work. I’m glad I don’t do manufacturing or test for a living! Some interesting problems but they’re just not up my alley.

                                              1. 7

                                                I use this technique in an embedded product. It’s really helpful to be able to structure my code in that way; in this case it’s I/O related, and in a microcontroller with no OS to provide threading or anything.

                                                But it’s a subsystem I only touch once a year or so, and the cognitive load of maintaining it is high, eg. remembering to make sure all your variables are marked static. If there were a way to check those things with the compiler it’d be safer, but… then it wouldn’t be C would it ;-)

                                                1. 1

                                                  I spent many hours on the Discworld MUD, though not in the last decade. I’m not overly familiar with the books from which it draws its inspiration, but I enjoyed the writing, the pretty substantial scope of the world and the friendly people I encountered there.

                                                  1. 2

                                                    I love making things. Two major outlets.

                                                    Cooking. I’ve always enjoyed this even if I’m not a great cook. My background is super anglo but my partner’s family is Malaysian Chinese, which has widened my cooking range considerably.

                                                    Tech. This one is a bit vague because it’s an ever-shifting pile of sand. I make little things around the house to solve local problems; I grew up with a soldering iron in hand and added a 3d printer a couple of years ago. I’ve gone as far as turning things into products every now and then, eg. the Drag’n’Derp cartridge for Game Boy and the Saturn Satiator for the Sega Saturn. Lately I’ve been making keyboards (for typing) and as soon as I’m happy with one I’ll stop and do something else.

                                                    Reverse engineering. Hardware in particular, retro tech in particular. I count this as making, because it’s almost always in service of making something (eg. the Saturn Satiator is entirely dependent on some extensive reversing).

                                                    When I think about it my hobbies look a lot like my day job… except for the cooking, sadly!

                                                    1. 2

                                                      I too am a Keepass/syncthing user. I use kpcli on desktops, KeePassDroid on Android, and I forget what my partner uses on Mac for our shared passwords. Perhaps not the lowest-friction setup but it’s completely under my control and I have high confidence in it.