Threads for trn

    1. 2

      (sorry for an HN cross-post, but this might also be helpful, to see the different options of the various Nvi forks [no Nvi2 here … yet])

      To give a quick overview of ‘:set’-able options (:set all output) of the various editors that derive from ‘nvi’…

      n-t-roff vi (01/14/2017) aka Heirloom Traditional ex/vi:

      noautoindent            nomodelines                     noshowmode
      autoprint               nonumber                        noslowopen
      noautowrite             open                            nosourceany
      nobeautify              nooptimize                      tabstop=8
      directory=/var/tmp      paragraphs=IPLPPPQPP LIpplpipbp taglength=0
      noedcompatible          prompt                          tags=tags /usr/lib/tags
      noerrorbells            noreadonly                      term=screen-256color
      noexrc                  redraw                          noterse
      flash                   remap                           timeout
      hardtabs=8              report=5                        ttytype=screen-256color
      noignorecase            scroll=17                       warn
      nolisp                  sections=NHSHH HUnhsh           window=35
      nolist                  shell=/usr/bin/zsh              wrapscan
      magic                   shiftwidth=8                    wrapmargin=0
      mesg                    noshowmatch                     nowriteany
      

      OpenVi 7.0.13-dev (02/20/2022):

      noaltwerase     noedcompatible  noimctrl        nooctal         nosecure        nottywerase
      noautoindent    escapetime=2    keytime=6       open            shiftwidth=8    noverbose
      autoprint       noerrorbells    noleftright     path=""         noshowmatch     novisibletab
      noautowrite     noexpandtab     lines=36        print=""        noshowmode      warn
      backup=""       noexrc          nolist          prompt          sidescroll=16   window=35
      nobeautify      noextended      lock            noreadonly      tabstop=8       nowindowname
      nobserase       filec=" "       magic           remap           taglength=0     wraplen=0
      cdpath=":"      noflash         matchtime=7     report=5        tags="tags"     wrapmargin=0
      cedit=""        hardtabs=0      mesg            noruler         noterse         wrapscan
      columns=108     noiclower       noprint=""      scroll=17       notildeop       nowriteany
      nocomment       noignorecase    nonumber        nosearchincr    timeout
      imkey="/?aioAIO"
      paragraphs="IPLPPPQPP LIpplpipbpBlBdPpLpIt"
      recdir="/var/tmp/vi.recover"
      sections="NHSHH HUnhshShSs"
      shell="/bin/zsh"
      shellmeta="~{[*?$`'"\"
      term="screen-256color"
      

      OpenBSD vi (7.0-current):

      noaltwerase     escapetime=1    noleftright     path=""         noshowmatch     warn
      noautoindent    noerrorbells    lines=36        print=""        noshowmode      window=35
      autoprint       noexpandtab     nolist          prompt          sidescroll=16   nowindowname
      noautowrite     noexrc          lock            noreadonly      tabstop=8       wraplen=0
      backup=""       noextended      magic           remap           taglength=0     wrapmargin=0
      nobeautify      filec=" "       matchtime=7     report=5        tags="tags"     wrapscan
      cdpath=":"      noflash         mesg            noruler         noterse         nowriteany
      cedit=""        hardtabs=0      noprint=""      scroll=17       notildeop
      columns=108     noiclower       nonumber        nosearchincr    timeout
      nocomment       noignorecase    nooctal         nosecure        nottywerase
      noedcompatible  keytime=6       open            shiftwidth=8    noverbose
      paragraphs="IPLPPPQPP LIpplpipbpBlBdPpLpIt"
      recdir="/tmp/vi.recover"
      sections="NHSHH HUnhshShSs"
      shell="/usr/local/bin/zsh"
      shellmeta="~{[*?$`'"\"
      term="screen-256color"
      

      nvi-1.81.6-45-g864873d3 (2022-02-21) aka Nvi1:

      noaltwerase     escapetime=1    nolisp          optimize        shiftwidth=8    nottywerase
      noautoindent    noerrorbells    nolist          path=""         noshowmatch     noverbose
      autoprint       noexrc          lock            print=""        noshowmode      warn
      noautowrite     noextended      magic           prompt          sidescroll=16   window=35
      backup=""       filec=""        matchtime=7     noreadonly      noslowopen      nowindowname
      nobeautify      flash           mesg            noredraw        nosourceany     wraplen=0
      cdpath=":"      hardtabs=0      nomodeline      remap           tabstop=8       wrapmargin=0
      cedit=""        noiclower       msgcat="./"     report=5        taglength=0     wrapscan
      columns=108     noignorecase    noprint=""      noruler         tags="tags"     nowriteany
      nocombined      keytime=6       nonumber        scroll=17       noterse
      nocomment       noleftright     nooctal         nosearchincr    notildeop
      noedcompatible  lines=36        open            nosecure        timeout
      directory="/tmp"
      fileencoding="UTF-8"
      inputencoding="UTF-8"
      paragraphs="IPLPPPQPP LIpplpipbp"
      recdir="/var/tmp/vi.recover"
      sections="NHSHH HUnhsh"
      shell="/bin/zsh"
      shellmeta="~{[*?$`'"\"
      term="screen-256color"
      
      1. 2

        Of particular note, the lack of expandtab in Nvi and Traditional vi may be more of a showstopper than lack of multibyte glyph rendering for many.

        1. 1

          the lack of expandtab in Nvi and Traditional vi may be more of a showstopper than lack of multibyte glyph rendering for many

          You’re probably right about the numbers: more programmers would care about tabs and spaces. But I’m a weird case. I program as a hobby, but my job is teaching Latin and ancient Greek. I need my multibyte characters. Thanks for these details.

          1. 2

            I can’t promise a time-frame, because doing it means doing it right, so I can get the changes up-streamed.

            I don’t want OpenVi to diverge greatly from OpenBSD’s vi, because the maintenance burden would be too great, with too high a risk of introducing subtle bugs.

            Edit: I’ve added some individual features, and likely will continue to do so conservatively, but these are all essentially stand-alone and don’t require invasive changes to other parts of the source tree, which multibyte will require.

            Edit 2: One of these new stand-alone features is the visibletab (or vt) option, which is extremely handy for editing Makefiles, so the usage of tabs is visible and the alignment respects your tabstop value. Feel free to try it out, and compare it to using the standard :set list mode.

            OpenVi’s :set vt is the equivalent of using :set list and :set listchars=tab:~~ in Vim or NeoVim.

            1. 2

              I can’t promise a time-frame, because doing it means doing it right, so I can get the changes up-streamed.

              I completely understand, and (obviously) I’m one random person who needs UTF-8 constantly.

              In any case, OpenVi looks very interesting, and I will definitely take it for a spin. Thanks for your work on it.

      1. 10

        This looks interesting. On a related note, if you want something closer to vi, but the lack of UTF-8 and bidirectional text in OpenVi is a problem, take a look at neatvi.

        1. 9

          I really enjoy everything the author of neatvi has written (neatroff especially). I reached out to him once via email basically to tell him I was a fan. He was very gracious.

          1. 3

            He was very gracious.

            Agreed. When neatvi was relatively new, I submitted a small number of patches and had several questions. He was always helpful.

          2. 5

            For UTF-8, another option is nvi. OpenBSD’s vi is actually an old version of nvi that lacks UTF-8 support.

            1. 3

              Don’t confuse OpenVi/OpenBSD-vi, nvi1, and nvi2. These are all different programs that share the same heritage.

              OpenVi is derived from OpenBSD vi, which derives from nvi version 1.79, released in 1996. There have been 25+ years of independent development as part of the OpenBSD base system, and it has diverged greatly in that time, with development going in its own direction.

              Nvi1, currently on version 1.8x, is maintained at https://repo.or.cz/nvi.git - I believe the latest version of this editor does have multibyte support, but this is not the OpenVi/OpenBSD version of the editor.

              Nvi2 shares the same heritage as well, but is also quite far removed from 1996 code. It is actively maintained at https://github.com/lichray/nvi2 and also includes multibyte support.

              (If I remember correctly) the multibyte support in both Nvi1 and Nvi2 derives from nvi-m17n, developed as part of the KAME project by the late itojun - http://www.itojun.org/itojun.html … the last update to nvi-m17n was about 3 years ago, and is available at https://cgit.freebsd.org/ports/tree/editors/nvi-m17n/files

              Currently, comparing builds optimized for size with link-time garbage collection (GCC 11.2 on an x86_64 glibc Linux system) gives a good idea of the changes over time and the different directions these editors have taken. OpenVi is also simplified in structure and does not have the three levels of abstraction of Nvi 1.8x - there is no library interface layer.

              For OpenVi, the compiled binary is 278K, and for Nvi1 (nvi-1.81.6-45-g864873d3) the compiled output totals 528K (36K for vi, 492K for libvi).

              OpenVi has a single standard configuration with no dependencies beyond curses.

              Nvi1 has many options beyond trace/debug (“widechar” “gtk” “motif” “threads” “perl” “tcl” “db3/4” “internal-re”) - so at least 255 different build variations are possible.

              (I’ve not yet built Nvi2 myself on Linux, so I can’t provide a truly fair comparison yet, but I will, and I’ll summarize the data in an FAQ section of the README.)

              1. 2

                (Note that I was using the defaults here. I’m sure it’s possible to trim down Nvi 1.8x further, but I’m comparing the default compilations, optimized for size (GCC, -Os, -fdata-sections, -ffunction-sections, link-time GC enabled). Nvi 1.8x is a much more complicated program, with a different feature set and different supported options.)

                1. 1

                  Well, I allowed myself to omit the fact that OpenBSD’s vi has seen some independent development past nvi 1.79, which is true. A “(based on)” should be inserted before “an old version” in my original comment. But I appreciate the thorough summary of nvi versions!

                2. 2

                  Nope, the vi in OpenBSD is nvi - you’re confusing it with nvi2. Both are in active development: nvi and nvi2.

                  1. 2

                    It should be noted that DragonFly BSD has imported nvi2, but with some modifications as well.

                    It’s unfortunate there is so much confusion surrounding the various nvi-based editors, mostly due to them all being so similarly named.

                    Part of why I chose to call this project OpenVi is that the name was - surprisingly - available, and it does not directly imply that OpenVi is exactly Nvi1/2 or OpenBSD’s vi.

                    (In particular, all bugs in OpenVi should be considered my fault.)

                3. 2

                  I can confirm that Neatvi is an excellent project, but I’m a bit more interested in Nextvi - https://github.com/kyx0r/nextvi - the RTL/bidi support in Neatvi is a huge strength and is done very cleanly when compared to other vi-likes.

                1. 2

                  And finally for those interested in UTF-8, cleanly doing UTF-8 support (the “right” way) is planned but is certainly non-trivial … for example, see:

                  https://www.openbsd.org/papers/eurobsdcon2016-utf8.pdf and http://www.usenix.org/events/usenix99/full_papers/hagino/hagino.ps

                  Nvi2 (and Nvi1) are excellent editors, but I wouldn’t want to copy the Nvi2 implementation directly. This has been talked about elsewhere**

                  ** https://misc.openbsd.narkive.com/9NHoQv8L/nvi-and-unicode#post4

                  1. 4

                    Coincidentally, just yesterday I was looking through the code of a few vi implementations, because I’m writing a toy implementation based on the libvim idea of having a core that is a function of (editor_state, input) -> editor_state and decoupling it from the rendering. I hadn’t seen this implementation nor neatvi, so thank you!
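
                      As a rough illustration of that idea (my own minimal sketch in Python, not libvim’s actual API or the code of any editor mentioned here), the core can be a single pure function from a state and a keypress to a new state, with rendering left entirely to the caller:

                      from dataclasses import dataclass, replace

                      @dataclass(frozen=True)
                      class EditorState:
                          buffer: str = ""
                          cursor: int = 0
                          mode: str = "normal"      # "normal" or "insert"

                      def step(state: EditorState, key: str) -> EditorState:
                          """Apply one keypress and return the resulting state; the old state is untouched."""
                          if state.mode == "normal":
                              if key == "i":
                                  return replace(state, mode="insert")
                              if key == "l" and state.cursor < len(state.buffer):
                                  return replace(state, cursor=state.cursor + 1)
                              if key == "h" and state.cursor > 0:
                                  return replace(state, cursor=state.cursor - 1)
                              return state
                          if key == "\x1b":                     # ESC leaves insert mode
                              return replace(state, mode="normal")
                          new_buf = state.buffer[:state.cursor] + key + state.buffer[state.cursor:]
                          return replace(state, buffer=new_buf, cursor=state.cursor + 1)

                      # A renderer or test harness simply folds inputs over the core:
                      state = EditorState()
                      for key in "ihello" + "\x1b":
                          state = step(state, key)
                      assert state.buffer == "hello" and state.mode == "normal"

                      Because the core never touches the terminal, the same function can sit behind a curses front end, a GUI, or a plain test suite.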

                    1. 2

                        Nvi1 (https://repo.or.cz/nvi.git) provides just the kind of shared library interface I think you’re looking for, which is used by the different front ends:

                      36K    vi
                      24K    vi-ipc
                      108K   vi-motif
                      492K   libvi.so.0.0.0
                      660K   total
                      

                      OpenVi does not - it’s a single monolithic binary:

                      278K   bin/vi
                      
                    1. 1

                      So far I’ve found Vivaldi to be a great alternative to Chrome.

                      I hope that uBlockOrigin will support it (and other forks) even if Google rejects them from the “Chrome Store”.

                      1. 5

                        Beautiful work. This kind of hobby project is so pure. I wonder if the dev is going for full POSIX compliance.

                        Semi-related quote:

                        Computer science would have progressed much further and faster if all of the time and effort that has been spent maintaining and nurturing Unix had been spent on a sounder operating system. We hope that one day Unix will be relinquished to the history books and museums of computer science as an interesting, albeit costly, footnote.

                            I love UNIX, maybe because I’m used to it, but I keep wondering what an OS that builds on what made *nix great, without the *nix grievances, would look like.

                        1. 7

                              I love UNIX, maybe because I’m used to it, but I keep wondering what an OS that builds on what made *nix great, without the *nix grievances, would look like.

                          This is exactly the intent of Plan 9; your mileage may vary on if it succeeds at its goals.

                          There are also non-Unix ways of thinking, but these were (IMHO, falsely) discredited by the sheer market and cultural powers of Unix.

                          1. 4

                            This is exactly the intent of Plan 9; your mileage may vary on if it succeeds at its goals.

                            I’ve tried it and it’s not for me. Plus the community is weird and unwelcoming.

                            There are also non-Unix ways of thinking, but these were (IMHO, falsely) discredited by the sheer market and cultural powers of Unix.

                            What are you referring to exactly? I’m interested in your thoughts about this.

                            1. 6

                              I have a few non-UNIX ways of thinking in this list that you might find interesting.

                              1. 3

                                There is Jehanne, which started as a Plan 9 fork — you might find it interesting. The website for the project has some good write-ups as well.

                                1. 3

                                  What a sad state of affairs. I read the pieces on the Harvey OS side of the story, that’s enough to demotivate you completely.

                                  1. 3

                                    This is an interesting read too and hopefully I’m not breaking any rules by linking it.

                                2. 3

                                  This may give you some idea.

                                  https://web.mit.edu/~simsong/www/ugh.pdf

                                  1. 2

                                    I’ve read it :) the quote above is from this book.

                                3. 4

                                  In my preferred alternate reality, VMS discredited UNIX.

                                  I’d still rather use VMS than UNIX today for almost any non-trivial production task.

                                  “One of the questions that comes up all the time is: How enthusiastic is our support for UNIX?

                                  “Unix was written on our machines and for our machines many years ago. Today, much of UNIX being done is done on our machines. Ten percent of our VAXs are going for UNIX use. UNIX is a simple language, easy to understand, easy to get started with. It’s great for students, great for somewhat casual users, and it’s great for interchanging programs between different machines. And so, because of its popularity in these markets, we support it. We have good UNIX on VAX and good UNIX on PDP-11s.

                                  “It is our belief, however, that serious professional users will run out of things they can do with UNIX. They’ll want a real system and will end up doing VMS when they get to be serious about programming.

                                  “With UNIX, if you’re looking for something, you can easily and quickly check that small manual and find out that it’s not there.

                                  With VMS, no matter what you look for – it’s literally a five-foot shelf of documentation – if you look long enough it’s there. That’s the difference – the beauty of UNIX is it’s simple; and the beauty of VMS is that it’s all there.”

                                  • Ken Olsen
                                  1. 1

                                    Unix was free/cheap, VMS was expensive.

                                    Also, what’s wrong with Windows (NT)? ;)

                                    1. 2

                                      Worse really is better, when it’s free.

                                      1. 1

                                        “Also, what’s wrong with Windows (NT)? ;)”

                                        It was the successor to VMS designed by the same people tweaking and improving on the same internals. So, Ken Olsen’s arguments naturally apply to it, too. ;)

                                    2. 2

                                      O3ONE was one attempt by one amateur enthusiast to build a hobbyist VMS-like rather than UNIX-like system.

                                      The project pages are up but it hasn’t been updated since 2004.

                                      It might be a fun thing to fork and work on.

                                      Edit: VMS became OpenVMS when they added support for POSIX system calls to their kernel and enhanced portability with “open systems”. There was the FreeVMS project, now defunct, which was working on a clone the opposite way, by adding VMS system calls and features to a POSIX/UNIX-like (Linux) kernel, and building on those to clone the standard VMS libraries and system services.

                                      They did have a decent DCL and SMG$ and, if I recall correctly, a working BLISS compiler.

                                    3. 4

                                              HelenOS is a very interesting portable multicore multiserver microkernel research operating system that isn’t a UNIX clone and is not built to be compatible with existing systems, though it does provide enough compatibility to support porting most C11/C++14 applications and libraries.

                                              Its development is driven by various academic research projects rather than a vision or roadmap, so it will likely never be a production system, but it’s interesting nonetheless to look at a modern non-UNIX design.

                                    1. 3

                                      Nice article!

                                      I still use my G4 Mac Mini today, on a daily basis. I mostly use it as an sshd, web, and git hosting server. I do software development on it as well.

                                      Like the article, the DVD drive went out or doesn’t boot home-burned CD-R discs on my machine. I used to run OpenBSD/macppc but the system compiler (gcc 4.2.1?) did not support C++11. NetBSD had a newer compiler but did not support booting from a USB flash drive and NFS booting seems pretty complicated. I finally gave up and just installed Debian Jessie (which does offer USB bootable media for their powerpc port) and got a C++11 compiler. Debian Jessie is the last stable version to support powerpc and has approximately 1 year of life. Since end of life only goes until June 2020 I have been looking for alternatives.

                                      A couple of weeks ago I made the jump to Debian Sid using the notes here: http://powerpcliberation.blogspot.com/2018/07/debian-ppc-status-update.html

                                      1. 1

                                                  The OpenBSD system compiler is now clang where it’s supported, and where it isn’t, GCC will never advance beyond 4.2.1 due to licensing. It’s standard procedure to install GCC via ports, which is made available as egcc.

                                        On platforms without Clang, installing new ports on OpenBSD with GCC is like a three-stage rocket. The base system builds a compiler based on the last non-GPLv3 gcc compiler, which builds egcc (which is 4.9.x, but might have recently moved to 8.x), which is then used to compile modern software.

                                        In short, OpenBSD (and BSD in general) is working to remove GCC completely where possible, as they consider it unusable, and will never include any GCC compiler code newer than 4.2.1 due to the license change.

                                        1. 1

                                                    I have a diskless FW800 G4 PowerBook which I boot from a USB flash drive; NetBSD is happy to boot from it. The only thing is that it doesn’t automatically enumerate the root disk when it’s on USB, so you get asked what the root disk is (sd0 if you haven’t got any other USB disks connected).

                                        1. 23

                                          I think I’m just going soft–this writeup is something that years ago I would’ve been jealous to have written!

                                          That said…I kind of hope the author of V doesn’t get too discouraged and they address this stuff and keep working on their language.

                                          1. 14

                                                        I think we all want the language to succeed. The difference between “I can cure cancer” and “I am working on a cure for cancer” is huge. Now that the site has changed, I am more OK with it.

                                            1. 11

                                              I actually want V to succeed. Having a cross platform GUI development tool is part of what we need badly to end the web app insanity. Just not like this exact implementation.

                                              1. 13

                                                There’s plenty of cross platform GUI things though? Qt/QML is probably the most complete one, with fast OpenGL rendering, accessibility, internationalization, and countless other things a UI toolkit has to have. Google is reinventing that with Flutter thanks to NIH syndrome (or licensing issues, or whatever).

                                                Developing a full, solid UI toolkit is not an easy problem and it’s been done a hundred times already. So I don’t think there’s a technical solution to the “I want people to use non-web cross-platform GUI tools” problem. What needs to be done is more promotion, more resources, documentation, tutorials, support for the existing tools.

                                                (and I don’t like the Web Hate Bandwagon either. the web platform is awesome :P)

                                                1. 11

                                                  Absolutely. There are dozens (if not hundreds) of cross-platform GUI libraries, and five or six are at the intersection of well-maintained, frequently used, and featureful.

                                                  All of them, as far as I can tell, are ugly & awkward to use, but none of them are as ugly or awkward to use as web tech.

                                                  That said, almost all these libraries have their own universe & their own idioms that typically don’t match any of the idioms of languages they bind to (and in some cases, such as with GTK and QT, they have totally distinct type systems and build systems). Writing code in Tkinter is too much like writing code in Wish and not enough like writing code in Python. There’s a certain amount of intellectual work to be done to learn a new system like this (and these systems are typically grown organically, so even if you’re using a properly-designed host language, you’re locked into memorizing and dealing with ugly corners of fossilized design by your GUI toolkit), and I think this is what keeps so many people locked into webtech: they have, at great cost to their sanity and the length of their lives, memorized the most obvious ugly corners of CSS, HTML, JavaScript, browser behavior, and perhaps 3 creaking leviathans of web frameworks, and learning to navigate the treacherous twists of QT and the names and habits of the strange beasts in its depths seems rightly daunting.

                                                  1. 5

                                                                The problem is that V promised a cross-platform GUI with tiny space requirements, which sounds great to me.

                                                                pyside2 is something like 80MiB and pyqt is something like 60MiB, which are absolutely massive compared to the ~400KiB of V.

                                                                But sadly it turned out to be vaporware. Oh well. It would be nice to have that (a graphics library that’s easy, cross-platform, well documented and not massive) one day.

                                                    1. 3

                                                      Anything that promises to be tiny is either cheating (big stuff is still coming somewhere else) or is a toy :)

                                                      Honestly the obsession with tininess is just a big waste of energy.

                                                      1. 3

                                                        I always thought the heavy software taking up more CPU and RAM was a waste of energy. ;)

                                                      2. 3

                                                                    There’s always Tk, if you don’t care about alpha support or 24-bit colors. It seems to ship with Python (as Tkinter).

                                                        1. 1

                                                          tk support needs to be compiled in and isn’t available by default on all distros.

                                                          1. 2

                                                            True. It’s as close to built-in as you can get with python, though. (Like, curses also ships with python, but if you’re building from scratch & you don’t have curses installed, the python bindings will not provide it.) Some binary distros will let you not install X, & will split off TK & tkinter to support Xless systems.

                                                            That said, last I looked, Tk ships with the binary windows and Mac OS classic versions of python.

                                                            (IIRC, TK also runs on less-common systems like riscos, though I’ve never seen it do so. I’ve also heard tell of a curses frontend to TK, for folks who don’t have a bitmapped display at all, but I don’t know how much it supports or how well it’s been maintained. Basically, it’s way more cross-platform than most of the stuff that gets called cross-platform.)

                                                        2. -3

                                                          V UI is in fact not vaporware, it’s right here: https://github.com/vlang/ui

                                                          You also confused statements about what is being planned with what has already been implemented.

                                                      3. 10

                                                        It seems to me much of this already exists with Rebol.

                                                        1. 7

                                                          Rebol was the first thing that popped into my head when I was reading about V. It is also quite shit for doing anything (I’ve tried for something like a year) and it taught me that one-person projects done by brilliant coders are not good enough, even if they claim superior features. Community counts.

                                                        2. 3

                                                          I am working on curing cancer. Hit my Patreon!

                                                          1. 3

                                                                        An honest crackpot is OK. If you do terrible cancer research but you actually are doing something, all good. Open source helps people see what is going on regardless; then they can make a properly informed decision to throw money away. Before, it was more like “I definitely cured cancer, coming soon, donate now.”

                                                            So do some cancer laboratory tours with investors and go ahead :) I hope you succeed lol.

                                                            1. 4

                                                              Regarding my own work, I was kidding about actually doing cancer research (though I have a minor in molecular and cell biology in my Masters from UC Berkeley). I am, however, quite serious about solving the high performance cross-platform GUI problem.

                                                              1. 4

                                                                Good luck :), definitely something people struggle with.

                                                      1. 4

                                                          The style of writing and the types of claims remind me of the GWAN web server.

                                                        1. 2

                                                          I remember that site from a decade or so ago! It actually seems to have been cleaned up to be comparatively respectable now. Still some pretty strong claims, but the original site had a big rant alleging that Microsoft was behind an anti-GWAN campaign that involved collusion with anti-virus makers and Wikipedia to blacklist GWAN, due to MS’s “jihad against efficiency”.

                                                        1. 3

                                                              The fpart tool, which this uses — and its included fpsync utility — deserves mention.

                                                              Using fpsync can massively speed up rsync transfers consisting of many tiny files (such as maildirs), even on slower systems.

                                                          1. 6

                                                            This reminds me of the classic “What Colour are your bits?” which was written about copyright but has a principle applicable here: the law does not work in the way programmers often assume it does, mixing bits and pieces of programmer-understanding with bits and pieces of lawyer-understanding is risky, and as a general rule, clever tricks programmers come up with to try to get around what (they think) the law does are unlikely to succeed.

                                                            1. 1

                                                              Thanks for posting this - it was a good read, which I was either never exposed to, or had completely forgotten.

                                                              Also, the third comment is from Terry A. Davis! R.I.P.

                                                            1. 1

                                                              In my personal opinion I believe this matter is simple:

                                                              The digital world is not unique or novel.

                                                                  Without a warrant, nobody has a right to look at the contents of your phone or computer or any digital archive, the same as any of your papers kept out of plain sight.

                                                                  With a warrant, the authorities are allowed to access your material within the confines of the warrant. You cannot hide the evidence in a container, refuse to give up access to the container, and remain in compliance with the law. The digital domain is not some special place, just as a crook hiding the evidence in a combination safe should not gain some kind of immunity from the police gathering evidence for the prosecution.

                                                              However, I ran into this interesting article: http://blogs.denverpost.com/crime/2012/01/05/why-criminals-should-always-use-combination-safes/3343/

                                                              This article traces a line of argument that keeps cropping up that basically says a person can not be compelled to give up “the contents of his mind” in order to obey a court order.

                                                              As you can imagine the effect of this is kind of ridiculous as the headline of the article indicates. Smart crooks should be locking up the records of their crimes in combination safes rather than something with a key, which can be found and used.

                                                                  On one hand, I support this as the kind of conservative thinking we should ask for from the Supreme Court, since if the authorities are allowed to force us to give up a password, how can we ensure the slippery slope is not ridden and they are allowed to force us to give up any other contents of our mind? For example, this is no different than a serial killer refusing to reveal the locations of the bodies.

                                                              I have no love for the serial killer, but we have decided on due process as the way to increase the longevity of our civilization.

                                                              This leads me to the awkward position of supporting backdoors into devices accessible by the government or funding for the government to be able to crack these digital safes without harming their contents.

                                                              Which is the lesser evil? Let hordes of criminals off the hook because they know how to set the password on their phone or give this power to the government?

                                                              I actually don’t know.

                                                              1. 3

                                                                This leads me to the awkward position of supporting backdoors into devices accessible by the government or funding for the government to be able to crack these digital safes without harming their contents.

                                                                I’d argue this is the wrong answer.

                                                                The societal cost of our freedoms may mean that crime will always exist and that criminals will have the opportunity to escape justice - only in a society completely devoid of both freedom and privacy can crime ever be completely eliminated and all transgressions provably punished.

                                                                Not being able to eliminate crime (or offensive speech, or harassment, etc.) is a very small price to pay when the alternative is eliminating privacy and freedom.

                                                                Edit: What if you build your own device, or implement your own cryptography, without the backdoors? Should the basics of cryptography or DIY computer system building be born secret information? Should just talking about computers or crypto be enough to land you in prison?

                                                                1. 1

                                                                  Eliminate is such an absolute word. When people use absolute words like this I begin to suspect they are not interested in understanding the complexities of a position.

                                                                  There are rarely absolutes in any system. All systems that I know of are based on a balance between rights and responsibilities, powers of the individual and powers of the state.

                                                                  I strongly suspect the correct answer here is to give the state greater powers in breaking cryptography (or research into the same) including greater penalties for refusing to divulge passwords.

                                                                  1. 3

                                                                        Certainly, the word is absolute, but it was chosen deliberately, to take the idea of reducing crime to its logical end - elimination.

                                                                    To quote an article from the National Institute of Justice, emphasis mine:

                                                                    Soon after his inauguration, [President] Johnson acknowledged the need for a Federal response to crime and public safety. In a March 1965 address to Congress — the first by a president on the issue of crime — Johnson called for legislation to create an Office of Law Enforcement Assistance. He also established the President’s Commission on Law Enforcement and Administration of Justice, charging the members to draw up “the blueprints that we need … to banish crime.”

                                                                    The task — breathtaking in scope — reflected not only the “can do” attitude of Johnson’s Great Society, but also a growing confidence in the ability of science and technology to solve problems. The Nation was already improving public health, harnessing atomic energy, and putting a man on the moon. Why not unleash that same creative power to eliminate crime?

                                                                        The very explicit goal was not just to reduce, but to prevent crime. Obviously, a primary goal of law enforcement is the elimination of crime - but the reality, if we wish to maintain our freedoms, is that this is impossible.

                                                                    The product of Johnson’s commission was something indeed balanced. “The Challenge Of Crime In A Free Society: A Report By The President’s Commission On Law Enforcement And Administration Of Justice” explicitly opens with a statement speaking of ways to reduce (not eliminate) crime, and goes on to make many observations that are still relevant today (emphasis again mine):

                                                                    America’s form of government, its laws and its Constitution, all express the desire to maintain the maximum degree of individual liberty consistent with maintenance of social order. The process of striking this balance is complex and delicate. … Presumably, deterrence would best be served by placing a policeman on every corner. Street crimes would be reduced because of the potential criminal’s fear of immediate apprehension. Even indoor crimes, such as burglary, might be lessened by the increased likelihood of detection through a massive police presence. But few Americans would tolerate living under police scrutiny that intense. … In a democratic society privacy of communication is essential if citizens are to think and act creatively and constructively. Fear or suspicion that one’s speech is being monitored by a stranger, even without the reality of such activity, can have a seriously inhibiting effect upon the willingness to voice critical and constructive ideas. When dissent from the popular view is discouraged, intellectual controversy is smothered, the process for testing new concepts and ideas is hindered and desirable change is slowed. External restraints, of which electronic surveillance is but one possibility, are thus repugnant to citizens of such a society.

                                                                    The report very clearly recognizes that creating an oppressive police state, in which all activity is scrutinized, would not be accepted by society and would not be conducive to maintaining social order - indeed, that it would lead to disorder and lawlessness.

                                                                        I’m not sure when I first read this historical 1960s document, but it’s a great read, and it even has sections regarding wiretapping and electronic surveillance. In fact, when you read it, one of the overall takeaways is that you cannot have effective reductions in crime without having the respect and cooperation of the communities being policed.

                                                                        I think this is an important conclusion that is often forgotten today. In my opinion, even with all the discontent you see in the news media, actual dissent seems to be at a very low level compared to the 1960s. I feel that most people would rather blog and express online outrage than put themselves at risk to enact meaningful change.

                                                                    Today, we are often conditioned by the media away from dissenting views and steered to accept the current popular view. We aren’t just told what is happening, but how we should feel about it. Unpopular views and contrary speech are often viciously attacked and silenced by majority mob rule, both online and off.

                                                                    Many citizens and politicians alike are seeking absolute solutions, and they are actively working to condition us to accept less freedom and privacy to make it possible to implement what would be otherwise unacceptable solutions.

                                                                        While I do recognize that I’m somewhat of a “freedom extremist”, on the opposite side of the argument on more than a few occasions, and I accept that a balance is essential, I (and the government) also know the reality is that if there is widespread rebellion against certain measures, if “We The People” as a society simply will not tolerate some curtailments on our freedoms, then these measures simply cannot be implemented. I feel, very passionately, that it’s a responsibility of programmers and technologists to foster that discontent, to fuel the fires of rebellion, in those outside of our field.

                                                                        While I argue that, in almost all cases, crime, even increased crime, is an acceptable trade-off compared to further curtailment of personal liberties, I am legitimately interested in hearing thoughts from the other side of the debate - even if I’m not going to be convinced easily!

                                                                2. 2

                                                                  This article traces a line of argument that keeps cropping up that basically says a person can not be compelled to give up “the contents of his mind” in order to obey a court order.

                                                                  Interestingly, and a bit of a tangent, but I never knew this until today… the opposite also appears to be true - the government can stop you from revealing the contents of your mind, if they believe those contents constitute a threat to national security.

                                                                  https://fas.org/blogs/secrecy/2010/10/invention_secrecy_2010/ and https://bloom.bg/1Odhrtz (the latter seems randomly selectively paywalled).

                                                                  I can’t say I agree with this any more than I agree with mandated backdoors or compulsory key disclosure.

                                                                1. 10

                                                                  I was a practicing lawyer in Australia, so I come to this with some knowledge, but I am not a US constitutional lawyer.

                                                                  The privilege against self-incrimination is phrased in the fifth amendment as follows:

                                                                  nor shall [a person] be compelled in any criminal case to be a witness against himself

                                                                  The privilege is against giving testimony. To this extent, the content of your password is immaterial. If your password contains an admission of a crime, that admission cannot be used against you in court (and under the fruit of the poisonous tree doctrine, nor can any evidence acquired solely because of that admission). If you are immune from prosecution relying on a statement, you can be compelled to answer questions (see Kastigar v US). (You cannot be compelled to say a particular thing, as a matter of first amendment law, but you can be compelled to answer questions.)

                                                                  As such, I think that the content of the passphrase – whether confessional or not – would not affect whether the 5th amendment privilege can protect you from being required to give over a password.

                                                                        The open legal question here is whether the act of giving your password is itself testimonial in nature, perhaps analogous to admitting ownership of contraband. If it is, the fifth amendment applies. If it is not, and it is instead analogous to handing over a physical key, then the fifth amendment does not apply, and you can be compelled to provide your password.

                                                                  This obviously isn’t legal advice; I might be wrong about all of this.

                                                                  1. 0

                                                                    The privilege is against giving testimony. To this extent, the content of your password is immaterial. If your password contains an admission of a crime, that admission cannot be used against you in court (and under the fruit of the poisonous tree doctrine, nor can any evidence acquired solely because of that admission).

                                                                    This still raises some interesting possibilities.

                                                                          What if, for example, the content of the confessional passphrase is a confession to the crime for which you are being investigated and interrogated?

                                                                    I could see this creating a scenario where disclosing your passphrase would make you immune from prosecution, for example, if you were able to negotiate that you would not be prosecuted for crimes revealed by the confession in your confessional passphrase, and if it turns out that the confessional passphrase is, in fact, a confession to the crime for which you are being prosecuted.

                                                                    Also, following that reasoning, could you actually be actively prevented from disclosing your passphrase, if the prosecution and investigators had knowledge (or reasonable suspicion) that your passphrase was such a confession?

                                                                    Creating a situation where confessing to the crime, such as might be required for the purposes of a plea agreement, might grant immunity from prosecution is an interesting thought experiment (or a good future episode of Law & Order).

                                                                          Edit: I find it a stretch to imagine a statement such as “I am guilty of the crime of XXXXX committed on June 7 2019” could be argued to be non-testimonial in nature, forcing it instead to be considered analogous only to a physical key, but then again, I’m not a judge, prosecutor, or constitutional attorney.

                                                                    1. 8

                                                                      I could see this creating a scenario where disclosing your passphrase would make you immune from prosecution, for example, if you were able to negotiate that you would not be prosecuted for crimes revealed by the confession in your confessional passphrase, and if it turns out that the confessional passphrase is, in fact, a confession to the crime for which you are being prosecuted.

                                                                      Yeah that’s not how this works. Fruit of the poisonous tree only applies to the specific fruit of the passphrase disclosure. If you’re already under investigation, the confession in your passphrase can safely be ignored. The fruit of the poisonous tree Wikipedia article has some good discussion of exceptions, notably including parallel construction, inevitable discovery and independent sources.

                                                                      Parallel construction is sometimes done in a cleanroom environment to avoid fruit of the poisonous tree. It is very easy for this to be the case for password disclosure too. A technician, who is not otherwise involved in the investigation, could be the only law enforcement officer who ever sees your password. They can type it in where required, and therefore the content of the password is never known to the investigators. This type of process is entirely acceptable to courts.

                                                                            I find it a stretch to imagine a statement such as “I am guilty of the crime of XXXXX committed on June 7 2019” could be argued to be non-testimonial in nature, forcing it instead to be considered analogous only to a physical key, but then again, I’m not a judge, prosecutor, or constitutional attorney.

                                                                            It may be helpful to think about it in programming terms. The same sequence of bytes may be executable in one context and data in another. It is their context, not their content, which determines this. Some bytes are never executable (some text is never testimonial), but executable bytes in a non-executable context are still not executable (similarly, testimonial text in a non-testimonial context is still not testimonial).

                                                                            The debate is about whether the entering of a password is a testimonial context, whereas this post is about whether the text is testimonial.

                                                                      1. 1

                                                                        A technician, who is not otherwise involved in the investigation, could be the only law enforcement officer who ever sees your password. They can type it in where required, and therefore the content of the password is never known to the investigators. This type of process is entirely acceptable to courts.

                                                                              I wonder, if that practice was not followed, could a defendant make a legal argument that the investigators, having to essentially read and work with a phrase, over and over, that amounts to no less than a confession (at least in the eyes of a layperson), would be tainted or biased by the exposure to the phrase? If a confessional passphrase was leaked to the media and widely reported, could that be a basis to claim that a potential juror (or jury) has been tainted, and move to dismiss the juror (or declare a mistrial)?

                                                                        [Edit: Also, in general, does intentionally complicating a potential investigation tend to work in favor of or against a defendant?]

                                                                        [Edit: What I’m getting at would be: Is there anything to be gained from such a passphrase scheme? Even better, are there any disadvantages that I might not be aware of?]

                                                                              It may be helpful to think about it in programming terms. The same sequence of bytes may be executable in one context and data in another. It is their context, not their content, which determines this. Some bytes are never executable (some text is never testimonial), but executable bytes in a non-executable context are still not executable (similarly, testimonial text in a non-testimonial context is still not testimonial).

                                                                        While I’m tempted snarkily to argue “but code is data and data is code!” and point at Lisp, this makes perfect sense to me on a logical level, and I really appreciate your effort to present these concepts as clearly as you have.

                                                                        Thank you!

                                                                        1. 3

                                                                                I wonder, if that practice was not followed, could a defendant make a legal argument that the investigators, having to essentially read and work with a phrase, over and over, that amounts to no less than a confession (at least in the eyes of a layperson), would be tainted or biased by the exposure to the phrase?

                                                                          I think I’m not well-placed to answer this, as someone who has not done any criminal law work in the US.

                                                                          My suspicion is that you would need something a lot stronger than this to throw out the prosecution. I think the case of US v Ceccolini would be instructive here, though it doesn’t directly answer the question. A helpful quote from p 273-274:

                                                                          The constitutional question under the Fourth Amendment was phrased in Wong Sun v. United States, 371 U. S. 471 (1963), as whether “the connection *274 between the lawless conduct of the police and the discovery of the challenged evidence has ‘become so attenuated as to dissipate the taint.’”

                                                                          That’s kind of the key here: an assessment of the degree of the “taint” of illegality (compelled confession) against how significant that evidence was to the investigation.

                                                                          It’s worth noting too that even if you’re compelled to disclose your password, there’s nothing compelling you to assert the truth of the statement. My password can be “two plus two equals five”; that doesn’t mean I believe that to be true.

                                                                          in general, does intentionally complicating a potential investigation tend to work in favor of or against a defendant?

                                                                          In general, against. At the high end, it is obstruction of justice. At the low end, a judge will not look kindly on it. Judges are humans, and don’t blindly follow an algorithm. Any attempt to be “too clever” will likely fail, and put the judge offside. It’s very difficult for laypeople (and most lawyers!) to identify the line between a good technical argument and an argument which is “too clever” in the sense I’ve used it above. Being able to identify it is one of the skills that makes great litigators great.

                                                                          While I’m tempted snarkily to argue “but code is data and data is code!” and point at Lisp …

                                                                          Ah and that’s kind of my point! Data is only code when you evaluate it; and code is always data when it’s not evaluated. Context is key.

                                                                          I really appreciate your effort to present these concepts as clearly as you have.

                                                                          It’s my pleasure! Part of the reason I became a lawyer was that slashdot piqued my interest in these sorts of questions, and I wanted to be able to think deeply and correctly about them. So I spent some enjoyable years studying them, and learning to think in the ways of the law.

                                                                  1. 2

                                                                    I’d love to hear a lawyer’s take on this.

                                                                    My idea would be something simpler. If I were to visit a country that is known for demanding to know all my personal passwords I’d maybe:

                                                                    • assume everything is in a password manager
                                                                    • don’t take any devices with me
                                                                    • don’t put myself in a position to be able to login anywhere, because a) ssh keys or b) password I don’t know because password manager
                                                                    • get person A (not travelling with me) to encrypt my password manager file with a password I don’t know
                                                                    • visit the country

                                                                    Now I simply, and truthfully, am not able to give out any passwords (I don’t know them), and I also can’t hand over my password manager file (I don’t have it online or with me). Then, when I land, I try to contact person A to help me unlock my password file, probably by having them send it to a newly created email address.

                                                                    Yes, this is kind of paranoid and would probably get me thrown into a holding cell until I relent, so I just choose to not visit those countries for now :P

                                                                    Actually the only difference to my current MO is:

                                                                    • I do know some of my passwords because they are not 64 random chars
                                                                    • I do know the password to my password manager
                                                                    • I have the file on my phone/laptop
                                                                    1. 2

                                                                      The idea really is not for those visiting another country, but for those of us who live in South Africa, the United States, France, India, the UK and Ireland, for example (or other jurisdictions with password/key disclosure laws, as well as those without such laws but also without settled legal precedent). In places like Belgium, perhaps most disturbingly, they can’t compel password/key disclosure or decryption from suspects, but they can from witnesses and uninvolved parties!

                                                                      When I last travelled overseas, I zeroed out the hard disk on the netbook I took with me, restored my (encrypted) back-up over the ‘net, and then uploaded a new back-up and zeroed the disk again before returning. (Thanks to the DBAN people for making the process easy.) That also protects you from data loss in case of hardware seizure.
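
                                                                      (In command terms, the wiping-and-restoring part looks roughly like the following - a sketch only: DBAN is a bootable wiper, so the dd line merely stands in for it, and /dev/sdX, the backup URL and the gpg/tar tooling are placeholders for whatever your own backup setup uses.)

                                                                      # before travelling: wipe the internal disk (stand-in for booting DBAN)
                                                                      dd if=/dev/zero of=/dev/sdX bs=1M status=progress
                                                                      # after arrival, on a fresh minimal install: fetch and restore the encrypted back-up
                                                                      curl -o backup.tar.gpg https://example.org/backup.tar.gpg
                                                                      gpg --decrypt backup.tar.gpg | tar -xf -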

                                                                      This did, on my return trip, cost me hours of time and a missed flight, as I sat detained by security while they tried to get me to explain why I was travelling with a “highly suspect” inoperable laptop. Not wanting to miss a connecting flight, I told them “Fine, keep it.” — wrong move. The laptop was now even more suspect. Since that incident, every time I fly, even domestically, I receive an SSSS ticket without fail, which apparently means I’m now on the TSA/DHS Secondary Security Screening Selection watchlist.

                                                                    1. 10

                                                                      This page is really painful to read: it’s quite aggressive towards the author of xz. The tone is really needlessly nasty. There are only elements against xz/lzma2, nothing in favor; it’s just criticism whose conclusion is “use lzip [my software]”.

                                                                      Numbers are presented in the way that makes them look biggest: a “0.015% (i.e. nothing) to 3%” efficiency difference is then restated as “max compression ratio can only be 6875:1 rather than 7089:1”, but that’s over 1 TB of zeroes and only 3% relative to the compressed data, which amounts to a 4*10^-6 difference on the uncompressed data! (And if you’re compressing that kind of thing, you might want to look at lrzip.)
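
                                                                      (For reference, the back-of-the-envelope arithmetic behind that point, assuming 1 TB means 10^12 bytes - my assumption, not the article’s:)

                                                                      echo $(( 10**12 / 7089 ))                    # ≈ 141 MB compressed at 7089:1 (LZMA)
                                                                      echo $(( 10**12 / 6875 ))                    # ≈ 145 MB compressed at 6875:1 (LZMA2)
                                                                      echo $(( 10**12 / 6875 - 10**12 / 7089 ))    # ≈ 4.4 MB extra, i.e. ~4*10^-6 of the input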

                                                                      The author fails to understand that xz’s success has several causes besides compression ratio and the file format. It’s a huge improvement over gzip and bzip2 for packages. The documentation is really good and helps you get better results both with compression ratio and speed (see “man xz”). It is ported pretty much everywhere (that includes OS/2 and VMS iirc). It is stable. And so on.

                                                                      As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption. Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy. But redundancy is what you eliminate when you compress. What is used for archiving of audio and video? Simple formats with low compression at best. The thing with lzip is that while its file format might be better suited for archiving, lzip itself as a whole still isn’t suited for archiving. And that’s ok.

                                                                      Now, I just wish the author would get less angry. That’s one of the ways to a better life. Going from project to project telling them they really should abandon xz in favor of lzip for their source code releases only speaks of frustration and a painful life.

                                                                      1. 6

                                                                        The author fails to understand that xz’s success has several causes besides compression ratio and the file format.

                                                                        But the author doesn’t even talk about that? All he has to say about adoption is that it happened without any analysis of the format.

                                                                        Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy. But redundancy is what you eliminate when you compress.

                                                                        This sounds like “you can’t be team archiving if you are team compression, they have opposite redundancy stat”. It’s not an argument, or at least not a sensical one. Compression makes individual copies more fragile; at the same time, compression helps you store more individual copies of the same data in the same space. So is compression better or worse for archiving? Sorry, I’m asking a silly question. The kind of question I should be asking is along the lines of “what is the total redundancy in the archiving system?” and “which piece of data in the archiving system is the weakest link in terms of redundancy?”

                                                                        Which, coincidentally, is exactly the sort of question under which this article is examining the xz format…

                                                                        What is used for archiving of audio and video? Simple formats with low compression at best.

                                                                        That’s a red herring. A/V archiving achieves only low compression because it eschews lossy compression and the data typically doesn’t lend itself well to lossless compression. Nevertheless it absolutely does use lossless compression (e.g. FLAC is typically ~50% smaller than WAV because of that). This is just more “team compression vs team archiving”-type reasoning.
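
                                                                        (If you want to see that ratio for yourself, a quick sketch - assuming the flac command-line tool is installed; input.wav is a placeholder for any uncompressed PCM file:)

                                                                        flac --best -o input.flac input.wav    # lossless; typically roughly half the size of the WAV
                                                                        flac --test input.flac                 # verify the encoded file decodes back correctly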

                                                                        The thing with lzip is that while its file format might be better suited for archiving, lzip itself as a whole still isn’t suited for archiving.

                                                                        Can you actually explain why, rather than just asserting so? If lzip has deficiencies in areas xz does well in, could you step up and criticise what would have to improve to make it a contender? As it is, you seem to just be dismissing this criticism of the xz format – which as a universal stance would result in neither xz nor lzip improving on any of their flaws (in whatever areas those flaws may be in).

                                                                        As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption.

                                                                        Juxtaposing this with your “author fails to understand” statement is interesting. Should I then say that you fail to understand what the author is even talking about?

                                                                        This page is really painful to read: it’s quite aggressive towards the author of xz.

                                                                        I saw only a single mention of a specific author. All the substantive statements are about the format, and all of the judgements given are justified by statements of fact. The very end of the conclusion speaks about inexperience in both authors and adopters, and it’s certainly correct about me as an adopter of xz.

                                                                        There are only elements against xz/lzma2, nothing in favor; it’s just criticism whose conclusion is “use lzip [my software]”.

                                                                        Yes. The authors of xz are barely mentioned. They are certainly not decried nor vilified, if anything they are excused. It’s just criticism. That’s all it is. Why should that be objectionable? I’ve been using xz; I’m potentially affected by the flaws in its design, which I was not aware of, and wouldn’t have thought to investigate – I’m one of the unthinking adopters the author of the page mentions. So I’m glad he took the time to write up his criticism.

                                                                        Is valid criticism only permissible if one goes out of one’s way to find something proportionately positive to pad the criticism with, in order to make it “fair and balanced”?

                                                                        Frankly, as the recipient of such cushioned criticism I would feel patronised. Insulting me is one thing and telling me I screwed up is another. I can tell them apart just fine, so if you just leave the insults at home, there’s no need to compliment me for unrelated things in order to tell me what I screwed up – and I sure as heck want to know.

                                                                        1. 2

                                                                          The author fails to understand that xz’s success has several causes besides compression ratio and the file format.

                                                                          But the author doesn’t even talk about that? All he has to say about adoption is that it happened without any analysis of the format.

                                                                          Indeed, this is more a comment about what appears to be bitterness from the author. It isn’t about the linked page itself (although the tone of the article is probably a consequence of that bitterness).

                                                                          Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy. But redundancy is what you eliminate when you compress.

                                                                          This sounds like “you can’t be team archiving if you are team compression, they have opposite redundancy stat”. It’s not an argument, or at least not a sensical one. Compression makes individual copies more fragile; at the same time, compression helps you store more individual copies of the same data in the same space. So is compression better or worse for archiving? Sorry, I’m asking a silly question. The kind of question I should be asking is along the lines of “what is the total redundancy in the archiving system?” and “which piece of data in the archiving system is the weakest link in terms of redundancy?”

                                                                          Agreed. I’m mostly copying the argument from the lzip author. That being said, one issue with compression is that corruption in compressed data is amplified, with no chance of reconstructing the data, even by hand. Intuitively I would expect the best approach for archiving to be compression followed by adding “better” (i.e. more even) redundancy and error recovery (within the storage budget). Now, if your data has some specific properties, the best approach might be different, especially if you’re more interested in some parts (for instance, with a progressive image you might value the coarser parts more, because losing the finer ones only costs you image resolution).
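
                                                                          Concretely, what I have in mind is something like this - a minimal sketch, assuming GNU tar, lzip and par2 are available; the file names and the 10% redundancy figure are placeholders:

                                                                          # compress first, then add explicit, evenly spread recovery data on top
                                                                          tar -cf - mydata/ | lzip -9 > mydata.tar.lz
                                                                          par2 create -r10 mydata.tar.lz           # writes mydata.tar.lz*.par2 recovery files
                                                                          # later, if the compressed archive gets corrupted:
                                                                          par2 repair mydata.tar.lz.par2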

                                                                          Which, coincidentally, is exactly the sort of question under which this article is examining the xz format…

                                                                          What is used for archiving of audio and video? Simple formats with low compression at best.

                                                                          That’s a red herring. A/V archiving achieves only low compression because it eschews lossy compression and the data typically doesn’t lend itself well to lossless compression. Nevertheless it absolutely does use lossless compression (e.g. FLAC is typically ~50% smaller than WAV because of that). This is just more “team compression vs team archiving”-type reasoning.

                                                                          If you look at what archivists recommend, FLAC isn’t the preferred format. It is acceptable, but the preferred one still seems to be WAV/PCM.

                                                                          Sources:

                                                                          The thing with lzip is that while its file format might be better suited for archiving, lzip itself as a whole still isn’t suited for archiving.

                                                                          Can you actually explain why, rather than just asserting so? If lzip has deficiencies in areas xz does well in, could you step up and criticise what would have to improve to make it a contender? As it is, you seem to just be dismissing this criticism of the xz format – which as a universal stance would result in neither xz nor lzip improving on any of their flaws (in whatever areas those flaws may be in).

                                                                          I had intended the leading sentences to explain that. The reasoning is simply that compression by itself is mostly at odds with long-term preservation. As discussed above, proper redundancy and error recovery can probably turn that into a good match, but then the qualities of the compression format itself don’t matter that much, since the “protection” is done at another layer that is dedicated to it and also provides recovery.

                                                                          As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption.

                                                                          Juxtaposing this with your “author fails to understand” statement is interesting. Should I then say that you fail to understand what the author is even talking about?

                                                                          You’re obviously free to do so if you wish to. :)

                                                                          This page is really painful to read: it’s quite aggressive towards the author of xz.

                                                                          I saw only a single mention of a specific author. All the substantive statements are about the format, and all of the judgements given are justified by statements of fact. The very end of the conclusion speaks about inexperience in both authors and adopters, and it’s certainly correct about me as an adopter of xz.

                                                                          Being full of facts doesn’t make the article objective. It’s easy to leave some things unmentioned, and while the main author of xz/liblzma could technically answer, he doesn’t really wish to do so (especially since it would impose a very high mental load). That being said, I’ll take the liberty of quoting from IRC, where I basically only lurk nowadays (nicks replaced by “Alice” and “Bob”). This is a recent discussion; there were more detailed ones earlier, but I’m not only going by the most recent one.

                                                                          Bob : Alice the lzip html pages says that lzip compresses a bit better than xz. Can you tell me the technical differences that would explain that difference in size ?

                                                                          Bob : Alice do you have ideas on how improving the size with xz ?

                                                                          Alice : Bob: I think it used to be the opposite at least with some files since .lz doesn’t support changing certain settings. E.g. plain text (like source code tarballs) are slightly better with xz –lzma2=pb=0 than with plain xz. It’s not a big difference though.

                                                                          Alice : Bob: Technically .lz has LZMA and .xz has LZMA2. LZMA2 is just LZMA with chunking which adds a slight amount of overhead in a typical situation while being a bit better with incompressible data.

                                                                          Alice : Bob: With tiny files .xz headers are a little bloatier than .lz.

                                                                          Alice : Bob: In practice, unless one cares about differences of a few bytes in either direction, the compression ratios are the same as long as the encoders are comparable (I don’t know if they are nowadays).

                                                                          Alice : Bob: With xz there are extra filters for some files types, mostly executables. E.g. x86 executables become about 5 % smaller with the x86 BCJ filter. One can apply it to binary tarballs too but for certain known reasons it sometimes can make things worse in such cases. It could be fixed with a more intelligent filtering method.

                                                                          Alice : Bob: There are ideas about other filters but me getting those done in the next 2-3 years seem really low.

                                                                          Alice : So one has to compare what exist now, of course.

                                                                          Bob : Alice btw, fyi, i have tried one of the exemples where the lzip guy says that xz throws an error while it shouldn’t

                                                                          Bob : but it is working fine, actually

                                                                          Alice : Heh

                                                                          Two main points here: the chunking and the view that the differences are very small; and the fact that one of the complaints seems wrong.

                                                                          If I look for “chunk” in the article, the only thing that comes up is the following:

                                                                          But LZMA2 is a container format that divides LZMA data into chunks in an unsafe way. In practice, for compressible data, LZMA2 is just LZMA with 0.015%-3% more overhead. The maximum compression ratio of LZMA is about 7089:1, but LZMA2 is limited to 6875:1 approximately (measured with 1 TB of data).

                                                                          Indeed, the sentence “In practice, for compressible data, LZMA2 is just LZMA with 0.015%-3% more overhead.” is probably absolutely true. But there is no mention of what happens with incompressible data. I can’t tell whether that omission was deliberate or not, but it makes this paragraph quite misleading.
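
                                                                          For anyone who would rather measure than argue, comparing the two on their own data is easy enough - a sketch, assuming xz and lzip are installed; the pb=0 tweak is the one from the IRC quote above, and source.tar is a placeholder:

                                                                          xz -9 -k source.tar                                    # -> source.tar.xz
                                                                          xz -k --lzma2=preset=9,pb=0 -S .pb0.xz source.tar      # -> source.tar.pb0.xz
                                                                          lzip -9 -k source.tar                                  # -> source.tar.lz
                                                                          ls -l source.tar.xz source.tar.pb0.xz source.tar.lz    # compare the sizes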

                                                                          Note that xz/liblzma’s author acknowledges some of the points of lzip’s author, but not the majority of them.

                                                                          There are only elements against xz/lzma2, nothing in favor; it’s just criticism whose conclusion is “use lzip [my software]”.

                                                                          Yes. The authors of xz are barely mentioned. They are certainly not decried nor vilified, if anything they are excused. It’s just criticism. That’s all it is. Why should that be objectionable? I’ve been using xz; I’m potentially affected by the flaws in its design, which I was not aware of, and wouldn’t have thought to investigate – I’m one of the unthinking adopters the author of the page mentions. So I’m glad he took the time to write up his criticism.

                                                                          Is valid criticism only permissible if one goes out of one’s way to find something proportionately positive to pad the criticism with, in order to make it “fair and balanced”?

                                                                          I concur that writing criticism is a good thing, but the article is not really objective and probably doesn’t try to be. In an ideal world there would be a page with rebuttals from other people. In the real world, that would probably start a flamewar, and the xz/liblzma author does not wish to get involved in that.

                                                                          I’ve just looked up the author name + lzip and first result is: https://gcc.gnu.org/ml/gcc/2017-06/msg00044.html “Re: Steering committee, please, consider using lzip instead of xz”.

                                                                          Another scary element is that neither “man lzip” nor “info lzip” mentions “xz”. They mention gzip and bzip2 but not xz (“Lzip is better than gzip and bzip2 from a data recovery perspective.”). Considering the length of this article, not seeing a single mention of xz makes me think the lzip author does not have a peaceful relationship with xz.

                                                                          You might think that the preference of lzip in https://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html would be a good indication but the author of that manual is also lzip’s author!

                                                                          And now, scrolling down my search results, I see https://lists.debian.org/debian-devel/2015/07/msg00634.html “Re: Adding support for LZIP to dpkg, using that instead of xz, archive wide”, and the messages there again make me think he doesn’t have a peaceful relationship with xz.

                                                                          I don’t like criticizing authors, but with this one-sided article, with its surprising omissions and incorrect elements (no idea whether that’s because things changed at some point), I think more context (and an author’s personality and history are context) helps decide how much to trust the whole article.

                                                                          Frankly, as the recipient of such cushioned criticism I would feel patronised. Insulting me is one thing and telling me I screwed up is another. I can tell them apart just fine, so if you just leave the insults at home, there’s no need to compliment me for unrelated things in order to tell me what I screwed up – and I sure as heck want to know.

                                                                          Yes, it’s cushioned because, as I said above, I don’t like criticizing authors and I’m uncomfortable doing it. I try to avoid it, but sometimes it can’t be separated from the topic or the article, so I still ended up doing it at least a bit (you can see that I did it as little as possible in my previous message). With that being said, I don’t think the author needs to be told all of this, or at least I don’t want to start such a discussion with an author who seems able to go on for years (and tbh, I’m not sure that’s healthy for him).

                                                                          edit: fixed formatting of the IRC quote

                                                                        2. 3

                                                                          As a side-note, this is the only place where I’ve seen compression formats being used for archiving and expecting handling of potential corruption. Compression goes against archiving. If you’re doing archiving, you’ll be using something that provides redundancy.

                                                                          This is not true at all. [Edit: Most of the widely used professional backup and recovery software that was specifically designed for long-term archiving also included compression as an integral part of the package, and advertised its ability to work in a robust manner.]

                                                                          BRU for UNIX, for example, does compression, and is designed for archiving and backup. This tool is from 1985 and is still maintained today.

                                                                          Afio is specifically designed for archiving and backup. It also supports redundant fault-tolerant compression. This tool is also from 1985 and is still maintained today.

                                                                          [Edit: LONE-TAR is another backup product I remember using from the mid 1980s, originally produced by Cactus Software. It’s still supported and maintained today. It provided a fault-tolerant compression mode, so it could restore (most) data even if there was damage to the archive.]

                                                                          As to all your other complaints, it seems you are attacking the document’s “aggressive tone” and you mention that you find it painful (or offensive) to read, but you haven’t actually refuted any of the technical claims that the author of the article makes.

                                                                          1. 1

                                                                            Sorry, I had compression software in mind when I wrote that sentence. I meant that I had never seen a compression tool that made resistance to corruption such an important feature.

                                                                            Thanks for the links! I’m not that surprised that some pieces of software already exist and fit that niche (I would have had to build a startup otherwise!). I’m quite curious about their trade-off choices (space vs. recovery capabilities), but since two of them are proprietary, I’m not sure there’s a way to find out, unfortunately.

                                                                            As to all your other complaints, it seems you are attacking the document’s “aggressive tone” and you mention that you find it painful (or offensive) to read, but you haven’t actually refuted any of the technical claims that the author of the article makes.

                                                                            Indeed. Part of that is because comments are probably not a good place for it, given how long the article is. The other part is because xz’s author does not wish to get into that debate, and I don’t want to pull him in by publishing his IRC answers on the topic. It’s not a great situation and I don’t really know what to do, so I end up hesitating, which isn’t perfect either. I mostly just hope to get people to question the numbers and facts on that page a bit, to not forget everything else that goes into making a file format useful in practice, and to remember that the absence of a rebuttal doesn’t mean the article is true, spot-on, unbiased and so on.

                                                                          2. 2

                                                                            I agree about the tone of the article, but I’m not sure that archiving and compression run counter to each other.

                                                                            I’ve spent a lot of time digging around for old software, in particular to get old hardware running, but also to access old data. Already we are having to dig up software from 20+ years ago for these things.

                                                                            In another 20 years, when people need to do the same job, it will be more complicated: if you need to run one package, you may find yourself needing tens or worse of transitive dependencies. If you’re looking in some ancient Linux distribution mirror on some forgotten server, what are the chances that all the bits are still 100% perfect? And certainly nobody’s going to mirror all these in some uncompressed format ;-)

                                                                            This is one case where being able to recover corrupted files is important. It’s also helpful to be able to do best-effort recovery on these; in any given distro archive you can live with corruption in some proportion of the resulting payload bytes - think of all the documentation files you can live without - but if a bit error causes the entire rest of the stream to be abandoned then you’re stuffed.

                                                                            I’d argue that archival is something we already practice in everyday release of software. The way people tend to name release files as packagename-version.suffix is a good example: it makes the filename unique and hence very easy to search for in the future. And here, picking one format over another where it has better robustness for future retrievers seems pretty low-cost. It’s not like adding parity data or something that increases sizes.

                                                                            1. 2

                                                                              Agreed. :)

                                                                              Makes me think of archive.org and softwareheritage.org (which already has pretty good stuff if I’ve understood correctly).

                                                                          1. 2

                                                                            I always enjoy Mickens’ writing - This World of Ours is a favorite. While I agree with him that heterogeneous distributed systems cannot be made inherently reliable, I’d argue they certainly can be made more reliable than previous systems for particular failure modes, so not all research into the subject is a complete waste. He’s certainly not wrong about the way many of the papers read, however.

                                                                            Also, “Byzantine Fault” would make an excellent name for a metal band.

                                                                            1. 3

                                                                              Despite the thorough explanation of why XZ should not be used for long-term archiving (and why the format seems to be badly specified), the article fails to mention any alternative. What should we use instead?

                                                                              Update

                                                                              From the front page:

                                                                              This article describes the reasons why you should switch to lzip if you are using xz for anything other than compressing short-lived executables.

                                                                              1. 4

                                                                                Presumably lzip, going by the website - http://lzip.nongnu.org

                                                                                I cannot find it right now, but there was a dissection of this article with a list of the problems it has.

                                                                                1. 3

                                                                                  I’d appreciate reading that if you can locate it. Also, the article was updated very recently (2019-05-17), perhaps in response to previous feedback?

                                                                                  I am aware of some previous discussion and concerns that the critique might be “politically motivated”, but I’ve not seen any convincing counter arguments made that actually debunk the technical claims.

                                                                                  Interesting - some projects (such as wget) are now using .lz for distribution, while others (such as emacs) are using .xz.

                                                                                  1. 1

                                                                                    It would be hard to find, as this pops up on HN and Reddit every quarter, so it’s hard even to google.

                                                                                2. 2

                                                                                I’d personally recommend lzip’d POSIX.1-2001 archives (pax, or GNU tar `--format=pax'), or afio, which has fault-tolerant compression, and makes an excellent archive format.
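
                                                                                For example, something along these lines - a sketch, assuming GNU tar and lzip; the directory and file names are placeholders:

                                                                                # create a POSIX.1-2001 (pax) archive and compress it with lzip
                                                                                tar --format=pax -cf - project/ | lzip -9 > project.tar.lz
                                                                                # later: integrity check, then extraction
                                                                                lzip -t project.tar.lz
                                                                                lzip -dc project.tar.lz | tar -xf -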

                                                                                1. 1

                                                                                  I remember reading about zsh’s startup and prompt latencies being considerably worse than bash but can’t recall where (either Github Issues comments or a Dan Luu-like post.) In any case:

                                                                                  echo "Changing shell to brew's bash. . ."
                                                                                  brew install bash
                                                                                  echo "$(brew --prefix)/bin/bash" | sudo tee -a /etc/shells > /dev/null
                                                                                  chsh -s $(brew --prefix)/bin/bash
                                                                                  
                                                                                  1. 3

                                                                                    Funny, I thought the performance issues were the other way around.

                                                                                    I certainly notice that bash’s completion is considerably slower than zsh.

                                                                                    1. 2

                                                                                      I’ve been a zsh user since the early 1990s, maybe 1992 or so, and never really used bash, as I was a ksh user previously, so perhaps my observations should be taken with a grain of salt - but I think the problem here is that a lot of people turn on oh-my-zsh (or similar), enable every plugin and option, and then wonder why the shell takes forever to start up.

                                                                                      I do use oh-my-zsh, and for zsh 5.3.1 on a Raspberry Pi 3B, the average startup time (over 10 startups) is 0.714s, compared to 0.204s for bash (uncustomized beyond the Debian defaults), 0.176s for tcsh, and 0.008s for mksh. Uncustomized zsh actually launches slightly faster than uncustomized bash.
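
                                                                                      (For the curious, I measured it roughly like this - a sketch; the run count and the -i flag for an interactive shell are my choices:)

                                                                                      # rough interactive-startup timing; swap in bash/tcsh/mksh to compare
                                                                                      for i in {1..10}; do time zsh -i -c exit; done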

                                                                                      Also, once the shell is running, I find zsh performance (in areas like globbing and completion) to be quite speedy, but I’ve never benchmarked against bash.

                                                                                      Edit: After upgrading to zsh 5.7.1 (from 5.3.1), startup time is about 215ms faster on average - roughly 500ms now.

                                                                                      Profiling shows the speed improvement comes from faster completion initialization.

                                                                                      1. 1

                                                                                        My 85-line .zshrc, with some fancy-ish prompt stuff, some functions and aliases, but no framework like OMZ, has an average start time of 0.058s. On the same system, bash in its default configuration (as provided by the FreeBSD package) is about 0.008s.

                                                                                        I’ve never understood the need for frameworks for my shell. I can implement the same functionality with much less overhead.

                                                                                        1. 1

                                                                                          The framework isn’t very heavy unless you enable all sorts of things you likely won’t use.

                                                                                          Using the zsh profiler (add zmodload zsh/zprof as the first line of your .zshrc, and zprof as the last), I can see that ~50% of my startup overhead is related to completion, and another 30% is related to zsh-syntax-highlighting (which can be slow in usage, but not annoyingly so).
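
                                                                                          In .zshrc terms, that profiling setup is just the following (a sketch of what the parenthetical above describes):

                                                                                          # first line of ~/.zshrc
                                                                                          zmodload zsh/zprof
                                                                                          # ... the rest of the configuration ...
                                                                                          # last line of ~/.zshrc: print the per-function startup time report
                                                                                          zprof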

                                                                                          I don’t see a need to try to optimize the remaining 20% by profiling parts of oh-my-zsh or trying to avoid it. (I do use some of its plugins and then unset or reset a few of the aliases, however, so there is some low-hanging fruit in that 20%.)

                                                                                          As this is a comparatively slow CPU, whereas your numbers are probably from a modern desktop, you’d be even less likely to notice the overhead from the framework. It’s not all that heavy.

                                                                                      2. 1

                                                                                        To be honest, I don’t know much about bash completion because I don’t use it. Standard tab for directory autocomplete with the following Readline settings is all I need.

                                                                                        set show-all-if-ambiguous on
                                                                                        set completion-ignore-case on
                                                                                        set mark-symlinked-directories on
                                                                                        set colored-stats on
                                                                                        set completion-prefix-display-length 3
                                                                                        
                                                                                      3. 1

                                                                                        Ha! I just had to do this last week when something broke my completions and I realized it was easier to install and use bash from Homebrew than it was to find a workaround for my actual problem. My overengineered solution is here.

                                                                                        1. 1

                                                                                          I do something similar in my dotfiles setup script.

                                                                                      1. 0

                                                                                        Sell me zsh. A few friends use it but they’ve not been able to clearly convey its advantages over ye olde bash that is just… everywhere.

                                                                                        1. 4

                                                                                          Auto-completion works even if you don’t type the string from the beginning. E.g., say you have a series of folders

                                                                                          /backup_images

                                                                                          /backup_files

                                                                                          /backup_videos

                                                                                          you can do an ls, start typing videos, and it will tab-complete to the right folder. That was the one feature that signaled to me that switching was a good decision.
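
                                                                                          (If you want to try that behaviour without a framework, something like this in ~/.zshrc gets close - a sketch; the matcher-list patterns are a common recipe and an assumption on my part, not whatever your framework actually sets:)

                                                                                          # enable the completion system
                                                                                          autoload -Uz compinit && compinit
                                                                                          # case-insensitive, partial-word and substring matching
                                                                                          zstyle ':completion:*' matcher-list 'm:{a-zA-Z}={A-Za-z}' 'r:|[._-]=* r:|=*' 'l:|=* r:|=*'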

                                                                                          1. 2

                                                                                            If bash is good enough for you then you probably have no reason to switch.

                                                                                            In the past I used zsh for some of its fancier features: I could do more expressive expansions to make my prompt pretty, or do clever directory chomping; I found its completion much faster than bash’s, but that’s only useful if you use flag or subcommand completion; menu completion can be useful. I think the history settings in zsh were (or are) more featureful than bash’s, but I’m not entirely sure what bash’s current features are like.

                                                                                            I’ve been using bash just fine for the last ~10 years on personal systems but I do have zsh on some servers so that I can do more clever things with prompting.

                                                                                            1. 2

                                                                                              The main reason I used zsh was to correctly handle files with special chars in their names ([ \t\n], for example). It also has “real” lists and associative arrays. Mainly, it was often better for writing scripts. Later I also found that print in zsh is often better than echo, for example print -P "%Bsomething bold%b". There are also things like ${fic:t} instead of basename $fic and ${fic:s/x/_/} instead of echo $fic | sed 's/x/_/', plus a lot of small niceties.
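
                                                                                              A tiny illustration of those niceties (a sketch; fic is just the example variable name from above):

                                                                                              #!/usr/bin/env zsh
                                                                                              fic=/tmp/my_file.txt
                                                                                              print -P "%Bprocessing%b $fic"     # -P expands prompt escapes; %B/%b toggle bold
                                                                                              print ${fic:t}                     # tail of the path: my_file.txt (like basename)
                                                                                              print ${fic:s/my/your/}            # substitution: /tmp/your_file.txt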

                                                                                              I no longer use zsh as my main shell, I switched to fish. Still I always preferred zsh over bash. But it was a long time ago, perhaps bash is better now.

                                                                                              I use fish for basic usage (completion is great), but when I script I generally use zsh.

                                                                                              1. 1

                                                                                                I’d start with the zsh-lovers document.