Threads for phoebos

  1. 2

    In a different take on the title, I recently contacted a company to ask what hash algorithm they used to “store” some user data. After a lot of back-and-forth, I was told that

    We’re so sorry to say that this information is confidential. We can not reveal it due to the rules of our secret [sic] business.

    1. 6

      Well, of course! If you contacted an NSA subcontractor which is developing a new highly-secure quantum hashing algorithm to hash government top-secret data, what did you expect?

      Oh, it was just some random company? Oh well…

      Now seriously, I totally understand you, heh… I’ve had and heard of so many of these types of exchanges!

      That reminds me of something that really happened to me just today (believe it or not!):

      I was told by my lawyer (don’t ask) that since you couldn’t digitally sign every page of a PDF document separately (I don’t even know what that means), she was going to split the document into separate one-page PDFs, and she asked me to digitally sign each of them. Here’s how the telephone conversation went:

      • Me: but you know, if you separate all the pages into separate documents and I sign them separately, then you could just replace some pages by other pages signed by me from another document and fool the judge into thinking I signed them as a whole.

      • Lawyer: No, but I wouldn’t do that, I will only send these pages!

      • Me: Yes but I mean, even if you do that, when the judge receives the pages he will not be able to tell whether you sent all the pages of the document or whether some pages are conveniently missing. Are they at least numbered as in “1 of 10”, “2 of 10”, etc?

      • Lawyer: Well, no, we don’t number them like that in our legal documents, we only put the first number! But I will send them all! The judge asked us to do it this way!

      • Me: Uhm, yes, but you should really tell the judge this is a bad idea! You should always digitally sign the entire document as a whole, as that’s what the digital signatures are designed to do, otherwise the digital signatures can be misinterpreted to mean something they don’t.

      • Lawyer: OK, I will speak with the other lawyer and then I’ll send you an email.

      • Lawyer’s email: Here’s the document. I sent you the document as separate pages and also as a whole. Please sign them all. When you digitally sign the PDF that contains the document as a whole, please digitally sign only the last page!

      …………… Oh for the love of God!!!

      1. 8

        Back in 2007 (I think), I had the contract for my first book. I was emailed the PDF and was asked to digitally sign it. This was very exciting, I’d never been asked to do that before! So I dutifully created a key signed by my CACert identity, hashed the document, and sent them back a copy of the hash signed by my certificate and a copy of the certificate. They had no idea what to do with this. It turns out, they wanted me to scribble a signature on the PDF and send it back. Now, the fun thing about PDF is that the format is designed for non-destructive editing: the signature is added as a separate object and is rendered on the top. Given this document, it’s trivial to extract that object and replace any of the layers underneath. Total security value of this dance: zero.

        More recently, I’ve been asked to sign something for LLVM’s relicensing using DocuSign. This required me to type my name into a text field. At no point in the process did it do anything that verified my identity. If I knew someone’s email address (public, in the git logs) then I could easily sign on their behalf (I didn’t need to be able to receive emails there, I just needed to enter it).

        Legally, in English common-law countries, the requirement for a contract to be legally binding is that a ‘meeting of minds’ has occurred: i.e. both parties have a reasonable chance of understanding what they’ve agreed to (they don’t have to exercise this chance - if I give you a contract and you sign it without reading it then that’s your fault, if I give you a contract and don’t let you read it until you’ve signed then that’s not binding) and have indicated agreement. Last time I checked, there was absolutely nothing in statute law in the UK or USA about what constitutes a valid signature, but there was a lot of precedent that a hand-written signature (especially a witnessed one) is a strong indication of a meeting of minds. I would love to see what happens when someone challenges DocuSign in court: we may find that a lot of ‘signed’ documents are not, in fact, legally binding.

        1. 2

          The US and UK both have statutes specifically to the effect that typing your name counts as a “digital signature”.

          Of course this is much more easily forged and repudiated. In practice this makes them suitable only for what American lawyers call contracts of adhesion. In general those limit the liability of one party (usually a company), impose an obligation to pay fees on the other party, and often impose an arbitration clause. If identity of the consumer matters to the company, it can be verified separately from the signature.

          Accordingly, the main remedy of the company is to cease supplying services, and they only really care about identity when there is a significant debt to be repaid, i.e. the main problem is identity theft. I haven’t followed what actually happens in identity theft cases, both in and out of court. In general the party suing on the contract has the burden of proving that the other party entered into the contract. So, interesting litigation on proof of identity around digital signatures is probably not looming, but if it does come around the background will be wildly and foreseeably inappropriate use of DocuSign-type signing.

          1. 2

            I forgot to mention that just before that part of our conversation, we had clearly established that by “digitally sign”, we meant to sign using the (cryptographic) client certificate issued by the government authority, which is used by everyone in the country who wishes to authenticate themselves to all of the government’s websites that need authentication (e.g. to pay taxes, to change your legal tax residence, to get your birth certificate, etc).

            In fact, a few days before, I had already signed that document by doing what you mentioned: adding a scribble to the PDF, which I had hand-drawn with my finger on my smartphone. This was rejected by the judge, although I’d really be interested in knowing why, because I don’t think it was due to mismatched signatures (otherwise they either could’ve just asked me to repeat my signature, knowing perfectly well that I’m me and not someone else trying to forge my signature, or if they suspected I’m not me, they wouldn’t offer to send me the document by snail mail to an address of my choosing). My lawyer clearly said that alternatively, we could send the judge an original copy of the document with hand-drawn signatures by all parties (although this would be very inconvenient because I’m currently in a different country).

            I understand what you mean by what is considered a legally binding contract, and I’m sure there are relatively similar laws in the countries I’ve lived in (although perhaps the specifics can be different).

            But what is also interesting to me is that hand-drawn signatures don’t actually seem to mean anything in almost any context except if you interpret it as a symbolic act, or a mere formality. Why? Because nobody ever checks or rejects my signature, even though very often, I sign in very different ways. Nobody ever checks whether the signature matches my government ID, including lawyers and government clerks! Actually, that’s a lie: I’ve had my signature rejected a couple of times many years ago, in very absurd situations where they were forcing me to repeat drawing my signature several times until my new signature matched my old signature, in which I was actually allowed to look at my old signature to try and copy it as perfectly as possible (which is what I was trying to consciously do, unsuccessfully according to them). They were basically forcing me to forge my own signature.

            There’s also the issue that my hand-drawn signature, including a copy of my government ID, has been sent to hundreds of parties over the years, including many private individuals and private companies, which they could just digitally copy or manually recreate to forge my signature if they wanted to (although of course, that’s illegal and they’d suffer severe penalties if caught in some other way).

            Finally, because of the absurd/weird requests they made, it’s obvious to me that neither my lawyer nor the judge understand what guarantees a cryptographic signature gives you and I think, almost surely because of the similar name, that they just think it’s similar to a hand-drawn signature which you should do page by page. I mean, I don’t blame them, as they’re not technical people and these things can be hard to understand (even for many, if not most technical people).

            But I sure think the government should have stricter standards regarding how digital signatures are used and accepted, given that although they don’t 100% guarantee it was you who signed the document, they can (and usually do) give you stronger guarantees than hand-drawn signatures, while making fewer assumptions and simultaneously being more convenient (especially if you’re physically far away).

            Edit: Just a few weeks ago, someone also requested me to do something interesting: instead of adding a hand-drawn scribble of my signature to the PDF document in black (where the document’s text was in black), they told me the scribble needed to be in blue (?!).

        1. 7

          cat /usr/share/dict/words | grep -e '^a..$' | wc -l

          As everyone knows, this is the worst demonstration of pipes:

          grep -c '^a..$' /usr/share/dict/words

          1. 12

            I think this is actually a good demonstration of pipes, because it shows that even if you don’t know the -c flag you can kludge together the same thing with pipes. That makes them a lot more accessible!
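A minimal sketch of the two spellings side by side, for anyone who wants to try them. It uses a throwaway word list rather than /usr/share/dict/words (which is not installed everywhere); the function names and the sample words are made up for illustration.

```shell
# Build a tiny word list so the example does not depend on /usr/share/dict/words.
words_file=$(mktemp)
printf '%s\n' and ant apple art be cat > "$words_file"

# Pipe version: each stage does one job (read, filter, count).
# tr strips the padding some wc implementations add around the number.
count_with_pipe() {
    cat "$1" | grep -e '^a..$' | wc -l | tr -d ' '
}

# Single-command version: grep counts the matching lines itself.
count_with_flag() {
    grep -c '^a..$' "$1"
}

echo "pipe: $(count_with_pipe "$words_file")"
echo "flag: $(count_with_flag "$words_file")"
```

Both functions count the three-letter words starting with "a", so they should print the same number; the pipe version just gets there without needing to know about -c.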

            1. 5

              grep -c '^a..$' /usr/share/dict/words

              I’d say this is an even worse demonstration of pipes 😉

              If you happen to have a nice introductory demonstration of pipes that you’d be willing to share, I’d really appreciate it!

              1. 4

                Ok, but all of this is sh, not emacs, right?

                1. 5

                  It is about Emacs letting you compose the functions provided by the utilities, providing ways to trigger them (M-x, or even a key combination when used often), and presenting the result.

                  1. 4

                    As an Emacs user, if the feature is well integrated and useful, I’m not too fussy about its implementation.

                    The feature, though using a tiny shell script and command line utility, is surfaced via Emacs M-x. It knows whether to apply functionality to current image buffer or file selection in dired. When I use it, I interact with Emacs.

                    1. 3

                      Or really, perl (exiftool).

                    1. 10

                      I’m glad I mostly use distro packages rather than language “package managers”, containers & static linking.

                      If this is a client-side vuln we’ll also have to worry about the plethora of mobile apps who ship openssl, often unwittingly.

                      1. 9

                        I’m prepared! (I’ve typed out sudo apt update && sudo apt upgrade and have my finger hovering over the enter button.)

                        1. 7

                          Yup. I’m sure we’ll see a flurry of follow on advisories for the 80 billion packages that thought shipping a bundled version of openssl was a good idea.

                          1. 8

                            “but it’s easier to statically link / use a vendored shared library / etc”.

                            Deferring the tough problems until you’ve got 0-days in your production systems rather than dealing with complexity up front isn’t an especially great idea…

                            1. 1

                              I get the points about static linking, but in reality it’s not that difficult to prompt a rebuild of those packages that statically link openssl.

                              1. 10

                                Assuming you have a list of packages that statically link OpenSSL :)

                                1. 4

                                  Anything in rust that uses rust-openssl sometimes statically links OpenSSL…

                                  1. 1

                                    Thankfully rust’s openssl crate is pretty well maintained and pretty commonly used, so I expect we’ll see an update to that as well on Tuesday.

                                    1. 3

                                      How do I update all of the things that transitively depend on it? Across all my machines and all of the containers running on them?

                                      1. 1

                                        That I don’t know exactly, but cargo should know what versions get used in each build, and checking that against an eventual rustsec advisory shouldn’t be too hard.

                                        1. 4

                                          Well, it turns out that cargo (and rustup) statically link OpenSSL, so depending on the vulnerability, you could hit an RCE when cargo goes to fetch the rustsec advisories. (Like if it’s exploitable in common TLS client usage and someone poisons your DNS to tell your cargo to talk to their server.)

                                      2. 2

                                        Amusingly, rust OpenSSL bindings are still on the 1.x version: 3.0 has proven to be problematic for other reasons as well (build depends on less widely available Perl modules, some perf regressions).

                                    2. 3

                                      Pray also that no one decided to make one off patches to rename functions or change argument variables

                                      1. 1

                                        That’s basic package metadata which most package managers use.

                                        1. 4

                                          Ah, I more mean ad-hoc hand-compiled packages. Sorry I wasn’t more specific.

                                          1. 1

                                            Really regretting not maintaining a list

                                          2. 2

                                            Does this package statically link a vendored openssl 3.0? https://crates.io/crates/kv-assets What basic package metadata would indicate that?

                                            1. 1

                                              Many package managers require a list of dependencies, including compile-time-only deps. I’m not familiar with “crates” but I think rust programs have a Cargo.toml or Cargo.lock listing dependencies? Or does rust allow implicit deps?

                                              1. 2

                                                Cargo.toml lists immediate dependencies. Cargo.lock lists transitive dependencies.
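A rough sketch of what "look in Cargo.lock" could mean in practice. The lockfile contents, the version strings and the function name below are invented for the example. As far as I know, an openssl-src entry tends to show up when the vendored build of OpenSSL is enabled, so treat its presence as a hint rather than proof of static linking.

```shell
# Hypothetical lockfile contents, written to a temp file for the demo.
lockfile=$(mktemp)
cat > "$lockfile" <<'EOF'
[[package]]
name = "example-app"
version = "0.1.0"

[[package]]
name = "openssl-src"
version = "0.0.0-example"

[[package]]
name = "openssl-sys"
version = "0.0.0-example"
EOF

# Print "name version" for every locked package whose name starts with "openssl".
# Splitting on double quotes puts the key in $1 and the quoted value in $2.
list_openssl_packages() {
    awk -F'"' '
        $1 ~ /^name = /    { name = $2 }
        $1 ~ /^version = / { if (name ~ /^openssl/) print name, $2 }
    ' "$1"
}

list_openssl_packages "$lockfile"
```

This only answers the question for one lockfile you already have on disk; it says nothing about binaries that were built elsewhere and shipped without their lockfile.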

                                    1. 14

                                      Slightly strange that a lot of this is about portability, but still prefers bash to POSIX sh. Even #!/usr/bin/bash scripts could be improved by writing them using as few bashisms as possible, such as using [ or test rather than [[ - all of which are bash builtins.

                                      The TRACE thing seems unnecessarily more complicated than running sh -x script.

                                      1. 15

                                        Even #!/usr/bin/bash scripts could be improved by writing them using as few bashisms as possible, such as using [ or test rather than [[ - all of which are bash builtins.

                                        I disagree—at least about your specific example. If you write shell scripts that specifically point to bash, then why not take advantages of the differences between [ and [[? (For example, word splitting: https://wiki.bash-hackers.org/syntax/ccmd/conditional_expression#word_splitting.)

                                        Note: this is a different question than “Should portable scripts prefer bash to POSIX shell?” My point is that scripts with bash in the shebang should absolutely use bash features.
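For anyone who has not run into the word-splitting difference, here is a small sketch of it. The variable names are made up, and the [[ part is only attempted when a bash binary is found on PATH, since [[ is not POSIX sh syntax.

```shell
empty=""

# Unquoted: $empty expands to nothing, so this runs `[ -n ]`, a one-argument
# test that only checks that the string "-n" is non-empty. It succeeds.
if [ -n $empty ]; then unquoted="non-empty"; else unquoted="empty"; fi

# Quoted: `[` receives an empty-string argument and reports it correctly.
if [ -n "$empty" ]; then quoted="non-empty"; else quoted="empty"; fi

echo "[ -n \$empty ]     -> $unquoted"
echo "[ -n \"\$empty\" ]   -> $quoted"

# [[ is bash syntax, so only try it when bash is available.
if command -v bash >/dev/null 2>&1; then
    bash -c 'empty=""; if [[ -n $empty ]]; then r="non-empty"; else r="empty"; fi
             echo "[[ -n \$empty ]]   -> $r"'
fi
```

The unquoted [ case is the surprising one: it claims the empty variable is non-empty. Inside [[ the expansion is not word-split, so the missing quotes do no harm there.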

                                        1. 5

                                          I get your point. My thoughts were that when I come across a bash script, I can still run it as a normal shell script like sh script and the fewer bashisms it contains, the easier it is for me to use (I don’t have bash installed).

                                          1. 2

                                            If you write shell scripts that specifically point to bash

                                            But that is the diametrical opposite of portability.

                                            1. 4

                                              But that is the diametrical opposite of portability.

                                              Right, but as I said, “Note: this is a different question than ‘Should portable scripts prefer bash to POSIX shell?’ My point is that scripts with bash in the shebang should absolutely use bash features.”

                                              The person I was replying to said “Even #!/usr/bin/bash scripts could be improved by writing them using as few bashisms as possible.” I disagree with that, and I tried to make it clear that I wasn’t talking about portability per se.

                                              You can also see lollipopman’s comment for further arguments in favor of [[ once you are already using bash.

                                          2. 6

                                            “Portability” these days means “works on OS X and Linux”. Might as well get improvements from bash in that case.

                                            1. 4

                                              Which version of bash? Don’t know if they still do, but macOS used to ship ancient bash releases because of GPLv3

                                              1. 3

                                                “Portability” these days means “works on OS X and Linux”.

                                                To whom?

                                                1. 3

                                                  What percentage of people who would execute a script of the type described by this article are not Linux or macOS users, do you think?

                                                  1. 2

                                                    I mean, I was being snarky, but that seems to be the case; other systems are routinely ignored, for better or worse.

                                                  2. 2

                                                    I’d argue portability these days means that it runs on many different systems.

                                                    1. 1

                                                      I wouldn’t call anything in bash an “improvement”…

                                                    2. 3

                                                      I find this idea that something “improves” by using a less capable shell nonsensical. What exactly improves, except that you adhere to some standard nobody cares about? This purism buys you nothing. If you know that the target systems have bash, then use all its features.

                                                      1. 3

                                                        The first problem is that POSIX shell is an absolutely terrible scripting language, whereas bash is a merely awful one. If you reach the level of complexity where POSIX shell makes it hard to maintain then you are almost certainly at the point where you should ditch any kind of shell and use a programming language.

                                                        The second problem is the monoculture. For example, the Shellshock vulnerability was so bad because everyone used bash for the scripts that were run from dhcpd and so a malicious user on the same broadcast domain could compromise any system that sent a DHCP request. This would have been far harder to exploit at scale if these scripts used /bin/sh and different systems used different implementations (mainstream systems use at least five different implementations of their default POSIX shell).

                                                        The third problem is performance. Bash is definitely not the fastest shell. FreeBSD uses the statically linked version of their POSIX shell for a bunch of things because there’s a noticeable speed up from things that run a load of short-lived shell scripts. I believe this was also part of the motivation for Ubuntu to use dash as /bin/sh instead of bash. If you require a specific implementation with many non-standard quirks then you can’t move to a different implementation.

                                                        And I say all of this as someone who uses bash as their interactive shell.

                                                    1. 14

                                                      Use bash. Using zsh or fish or any other, will make it hard for others to understand / collaborate. Among all shells, bash strikes a good balance between portability and DX.

                                                      I disagree. If you care about portability and, for example, resource-constrained environments, use POSIX shell. If you don’t, use something you are familiar with that does the job. Of course that might very well be bash. With zsh, fish and others popping up as the main shells people use, I think it makes sense to use a standard for portability (then go for POSIX, which is made for portability) rather than just assume that bash will be the thing that’s installed or most easily available. And I think the trend is certainly going away from bash, though very, very slowly. Feel free to use it though, just like you use any other scripting language. It’s just that if you think about portability, why not actually go for it?

                                                      Just make the first line be #!/usr/bin/env bash, even if you don’t give executable permission to the script file.

                                                      I completely agree on this one. It provides both portability and flexibility, for example when a situation arises where you, for whatever reason, need to run it with a different bash. I also think it’s a good indicator of whether the author of a script knows what they are doing. There might be exceptions though, similar to other full path situations (so scripting languages), but there should be a good reason. If you don’t have one, or when in doubt, use #!/usr/bin/env for bash, python, ruby, etc.

                                                      Use the .sh (or .bash) extension for your file. It may be fancy to not have an extension for your script, but unless your case explicitly depends on it, you’re probably just trying to do clever stuff. Clever stuff are hard to understand.

                                                      I disagree, because that would theoretically mean you’d have to append .sh to a big chunk of what is in /bin/ and /usr/bin. For example /usr/bin/firefox.sh. Also renaming all ./configure to ./configure.sh. Why does the caller need to know what language a piece of software is programmed in?

                                                      Use set -o X at the start of your script.

                                                      I wished these would become part of POSIX.

                                                      Use [[ ]] for conditions in if / while statements, instead of [ ] or test.

                                                      I disagree. Don’t reduce portability, especially when you might not need what you gain from it.

                                                      1. 7

                                                        I disagree. If you care about portability and, for example, resource-constrained environments, use POSIX shell.

                                                        Completely agreed. bash is the default shell on GNU/Linux systems, but not on *BSD, macOS, or most embedded Linux systems. Worse, on macOS, it is installed, but only the last GPLv2 version, so newer bashisms won’t work. All of the other shells are available as extra packages on most of these systems, so if you’re comfortable writing a shell script that requires someone to install an optional package then zsh is no worse than bash.

                                                        1. 1

                                                          Dash was preferred under the hood by many Debian-based systems. It’s POSIX-compatible but not Bash-compatible.

                                                          1. 1

                                                            As I recall, they install dash as sh, but still have bash installed by default. On BSD systems, bash is an optional install and a lot of folks who want a non-default shell will go with something else.

                                                        2. 2

                                                          Use set -o X at the start of your script.

                                                          I wished these would become part of POSIX.

                                                          They are.

                                                          Although, I prefer set -eux etc.

                                                          1. 1

                                                            This was originally on pipefail. Edited the point to mean I agree on the options without pasting each. Oops.

                                                            pipefail feels like something that should be in POSIX, but isn’t currently.
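A small sketch of what pipefail changes, assuming bash’s implementation of the option. The function names are made up, and the bash half is skipped when no bash is found on PATH.

```shell
# Exit status of `false | true` in a plain sh: only the last command counts.
default_status() {
    status=0
    sh -c 'false | true' || status=$?
    echo "$status"
}

# Same pipeline under bash with pipefail: the failing `false` is reported.
pipefail_status() {
    status=0
    bash -c 'set -o pipefail; false | true' || status=$?
    echo "$status"
}

echo "without pipefail: $(default_status)"
if command -v bash >/dev/null 2>&1; then
    echo "with pipefail:    $(pipefail_status)"
fi
```

Without the option, a failure early in a pipeline is invisible to set -e, which is why it usually gets listed alongside -e and -u.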

                                                        1. 2

                                                          This program is a very simple and easy-to-read implementation of clipboard tools on wayland, which is like this blog post but with actual code: https://git.sr.ht/~noocsharp/wayclip

                                                          1. 9

                                                            BTW I only use egrep and fgrep !!!

                                                            • egrep means [0-9]+ works like in Perl/Python/JS/PCRE/every language, not [0-9]\+ like GNU grep.
                                                              • egrep syntax is consistent with bash [[ $x =~ $pat ]], awk and sed --regexp-extended (GNU extension)
                                                              • These are POSIX extended regular expressions. Awk uses them too.
                                                            • fgrep is useful when I want to search for source code containing operators, without worrying about escaping

                                                            This style makes regular expressions and grep a lot easier to remember! I want to remember 2 regex syntaxes (shell and every language), not 3 (shell, awk, every language) !

                                                            This change should be reverted; there is no point to needless breakage

                                                            Again you wouldn’t remove grep --long-flags because it’s not POSIX

                                                            1. 6

                                                              Yes, egrep is a shorthand for grep -E and fgrep for grep -F. You haven’t lost anything. You can even make aliases or script wrappers if you want to use that syntax. But the point of this decision is that if you’re writing a script which is meant to be portable, you should use grep’s flags.
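A minimal sketch of the wrapper idea, written as shell functions you could drop into a profile or a script. Defining them under the old names is just one option; the sample inputs are made up.

```shell
# Wrapper functions that keep the old names but call the portable flags.
egrep() { grep -E "$@"; }
fgrep() { grep -F "$@"; }

# ERE: + means "one or more" without a backslash.
printf '%s\n' 'abc' 'abc123' | egrep '[0-9]+'

# Fixed string: a.c is matched literally, so "abc" is not a hit.
printf '%s\n' 'a.c' 'abc' | fgrep 'a.c'
```

In a script meant to be shared, calling grep -E and grep -F directly avoids depending on whether the wrappers (or the old binaries) exist on the other machine.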

                                                              1. 10

                                                                I understand that, but what I don’t understand is why anyone, especially GNU grep maintainers, would think this would lead to portable shell scripts

                                                                Again, why not remove grep --extended-regexp too? You’re supposed to use grep -E, idiot

                                                            1. 3

                                                              Ah. Took me far too long to realise this is comments in the feedback sense rather than the <!-- --> sense.

                                                              1. 2

                                                                And for the three folks in Finland who administer multi-user Linux instances and rely on privileged ports for their mainframe-era security properties

                                                                Why single out Finland there?

                                                                1. 5

                                                                  Total guess: reference to Finland being the birthplace of IRC, which is just about the last remaining thing in the wild that uses ident, which sits on TCP port 113?

                                                                  1. 18

                                                                    Finland is the birthplace of Linus Torvalds.

                                                                    1. 3

                                                                      Total guess: reference to Finland being the birthplace of IRC, which is just about the last remaining thing in the wild that uses ident, which sits on TCP port 113?

                                                                      More fun facts: Almost no IRC servers use the RFC-defined port of 194 - it’s almost always 6667 or 6697.

                                                                      1. 2

                                                                        I’ve never heard of an IRC server using 194, nor seen it mentioned as an example in any IRC daemon’s config file templates.

                                                                        Edit to add: I feel like IRC has more… RFC documents which don’t bear much resemblance to reality written about it than most protocols.

                                                                    2. 3

                                                                      torilla tavataan

                                                                    1. 4

                                                                      Just a nitpick, but:

                                                                      For example, you might use a regular expression like [a-z.]@yourcompany.com to validate if something is a valid company email address

                                                                      is a rather strange regex (usernames can be any character, but only one).

                                                                      1. 6

                                                                        Late joiners will have to pick something from a higher unicode plane.

                                                                        1. 2

                                                                          even nittier pick - it’s a single lowercase a to z plus full-stop, isn’t it?

                                                                          1. 1

                                                                            No, the . matches any character.

                                                                            EDIT: actually according to regexr’s explainer it’s a literal full stop… wtf? why does the context change the behavior of that?

                                                                            1. 2

                                                                              It’s a character group. The only character I know off the top of my head that means something inside one is ^, and that only at the beginning, to mean “match the inverse of this group”.

                                                                              1. 1

                                                                                no, inside [] characters are literal and define sets. Neither . (any) nor ?, + or * (cardinality) make sense here, so they are literals.
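
A quick way to check this is to run the quoted pattern as-is. The sketch below uses Python's `re` module (my choice, the thread names no language) and made-up test addresses. It only shows that `.` is a literal inside `[]` and a wildcard outside it.

```python
import re

# The pattern exactly as quoted in the thread. Note the unescaped "." in the
# domain part, which sits outside the brackets and so matches any character.
PATTERN = re.compile(r"[a-z.]@yourcompany.com")

def matches(address):
    """Return True if the whole address matches the quoted pattern."""
    return PATTERN.fullmatch(address) is not None

# All addresses below are made up for illustration.
print(matches("a@yourcompany.com"))   # True: one lowercase letter
print(matches(".@yourcompany.com"))   # True: "." is a literal inside []
print(matches("X@yourcompany.com"))   # False: "." inside [] is not a wildcard
print(matches("ab@yourcompany.com"))  # False: the class matches one character only
print(matches("a@yourcompanyXcom"))   # True: "." outside [] matches any character
```

Adding `+` after the class and escaping the dot in the domain would be the usual repair, though what the author actually changed is not shown in this thread.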

                                                                            2. 1

                                                                              Fixed that! Thank you :)

                                                                            1. 1

                                                                              just bash and systemd

                                                                              Those are the biggest “just” dependencies to just get a wallpaper. It also uses curl and GLib’s gsettings. That’s probably millions of lines of code… to set a wallpaper.

                                                                              But if it works for you, that’s fine.

                                                                              1. 12

                                                                                On many systems those come already-installed. It’s not like they’re adding much.

                                                                              1. 2

                                                                                I’ve been trying to think if this is doable in POSIX make, as underpowered as that is… sadly my make-fu is not strong enough. I don’t think it has those good functions in it.

                                                                                1. 2

                                                                                  Much more interesting than FizzBuzz: https://nullprogram.com/blog/2016/04/30/

                                                                                  1. 1

                                                                                    omg incredible. so it can be done!

                                                                                1. 4

                                                                                  This is great - the level is much better than the many shallow tutorials which only explain the quirks of prefix notation and what cons is etc. One typo:

                                                                                  REPL> (for-each (lambda (str)
                                                                                                    (display
                                                                                                     (string-append "I just love "
                                                                                                                    (string-upcase str)
                                                                                                                    "!!!\n")))
                                                                                                  '("strawberries" "bananas" "grapes"))
                                                                                  ; prints:
                                                                                  ;   I just love ICE CREAM!!!
                                                                                  ;   I just love FUDGE!!!
                                                                                  ;   I just love COKIES!!!
                                                                                  
                                                                                  1. 3

                                                                                    Also in section 12:

                                                                                    We can also make a lambda and apply it. Let’s make one that can perform square roots:

                                                                                    actually calculates a square.

                                                                                    1. 2

                                                                                      I still find myself wanting a tutorial on the messy, everyday usage like reading, writing, and parsing strings from/to ports, but I enjoyed that!

                                                                                      1. 3

                                                                                        I still find myself wanting a tutorial on the messy, everyday usage like reading, writing, and parsing strings from/to ports, but I enjoyed that!

                                                                                        I wonder if there would be any interest in a Real World CHICKEN Scheme book about practical programming, in the spirit of Real World Haskell et al. I have an urge to write one.

                                                                                        1. 3

                                                                                          Please do! There’s been a long standing TODO of writing such a book.

                                                                                          1. 2

                                                                                            I would definitely read this book. This is the stage of my Scheme understanding, and I would greatly benefit from a real-world book.

                                                                                          2. 1

                                                                                            I had considered a section on ports, but wasn’t really sure if it would make things too large. Heck, I considered following up the metacircular evaluator part with one implementing a simple Scheme read (from ports) in a similarly small amount of code, but thought that’s probably getting to be too much and would get in the way of this feeling like a document someone can complete in a short amount of time. What do you think?

                                                                                            1. 1

                                                                                              Do you know of any existing articles/tutorials which fill that gap? If yes, you could link to them; if no, you could always write a separate article. I think you’ve got the length about perfect as it is.

                                                                                          3. 1

                                                                                            Great catches! I’ve updated the documents with the fixes. Thank you so much!

                                                                                        1. 44

                                                                                          Tabs have the problem that you can’t determine the width of a line, which makes auto-formatted code look weird when viewed with a different width. And configuring them everywhere (editor, terminal emulator, various web services) to display as a reasonable number of spaces is tedious and often impossible.

                                                                                          1. 24

                                                                                            I agree with you, tabs introduce issues up and down the pipeline. On GitHub alone:

                                                                                            • diffing
                                                                                            • are settings per person or per repo
                                                                                            • yaml or python, where whitespace is significant
                                                                                            • what if you want things to line up, like comments, or a series of statements?
                                                                                            • combinations of the above interacting

                                                                                            If you’re turning this into, say, epub or pdf, would you expect readers and viewer apps to be able to adjust this?

                                                                                            I fixed up some old code this week, in a book; tabs were mostly 8 spaces, but, well, varied chapter by chapter. Instead of leaving ambiguity, mystery, puzzling, and headaches for future editors and readers to trip over, I made them spaces instead.

                                                                                            1. 8

                                                                                              I don’t get the point about yaml and python. You indent with tabs, one tab per level, that’s it. What problems do you see?

                                                                                              1. 4

                                                                                                In the Python REPL, tabs look ugly. The first one is 4 columns (because 4 columns are taken up by the “>>> “ prompt), the rest are 8 columns. So you end up with this:

                                                                                                >>> for n in range(20):
                                                                                                ...     if n % 2 == 1:
                                                                                                ...             print(n*n)
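
The column arithmetic behind that can be reproduced with `str.expandtabs`. This is a sketch assuming the terminal uses 8-column tab stops and that the continuation prompt is the four characters `... `. The helper name is made up.

```python
PROMPT = "... "  # continuation prompt, 4 columns wide
TAB_STOP = 8     # assumption: the terminal uses 8-column tab stops

def indent_after_prompt(line):
    """Columns of whitespace shown between the prompt and the code."""
    shown = (PROMPT + line).expandtabs(TAB_STOP)
    body = shown[len(PROMPT):]
    return len(body) - len(body.lstrip(" "))

print(indent_after_prompt("\tif n % 2 == 1:"))  # 4: the first tab only fills up to column 8
print(indent_after_prompt("\t\tprint(n*n)"))    # 12: the second tab adds a full 8 columns
```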
                                                                                                
                                                                                                1. 9

                                                                                                  When I’m in the Python REPL, I only ever use a single space. Saves so much mashing the spacebar and maintaining readability is never an issue as I’m likely just doing some debugging:

                                                                                                  >>> for n in range(20):
                                                                                                  ...  if n % 2 == 1:
                                                                                                  ...   print(n*n)
                                                                                                  
                                                                                                  1. 3

                                                                                                    True, but this shows that tabs don’t work well everywhere. Spaces do.

                                                                                                    1. 1

                                                                                                      Unless you use a proportional font.

                                                                                                      1. 2

                                                                                                        Even with a proportional font, all spaces have the same width.

                                                                                                2. 3
                                                                                                  def a():
                                                                                                  	x
                                                                                                          y
                                                                                                  

                                                                                                  The two lines look the same, but they’re not to the python interpreter, even though you could use just spaces or just tabs.

                                                                                                  1. 17

                                                                                                    Don’t mix tabs and spaces for indentation, especially not for a language where indentation matters. Your code snippet does not work in Python 3:

                                                                                                    TabError: inconsistent use of tabs and spaces in indentation
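
To see the error without saving a file, the snippet can be fed to the built-in `compile()`. This is a sketch under the assumption that the first body line is indented with one tab and the second with eight spaces, which is how the lines above appear. The `check` helper is made up.

```python
# One tab on the "x" line, eight spaces on the "y" line (assumed layout).
MIXED = "def a():\n\tx\n        y\n"
TABS_ONLY = "def a():\n\tx\n\ty\n"
SPACES_ONLY = "def a():\n        x\n        y\n"

def check(source):
    """Compile without running; report a TabError if Python 3 raises one."""
    try:
        compile(source, "<example>", "exec")
    except TabError as exc:
        return f"TabError: {exc.msg}"
    return "ok"

print(check(MIXED))        # a TabError message
print(check(TABS_ONLY))    # ok
print(check(SPACES_ONLY))  # ok
```

Either style compiles on its own; only the mix is rejected.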

                                                                                                    1. 1

                                                                                                      That was my point.

                                                                                                      1. 3

                                                                                                        Your point is don’t mix tabs and spaces? Nobody proposed that. The comment you responded to literally states:

                                                                                                        You indent with tabs, one tab per level, that’s it.

                                                                                                        Or is your point don’t use tabs because if you mix in spaces it doesn’t work?
                                                                                                        Then my answer is don’t use spaces, because if you mix in tabs it doesn’t work.

                                                                                                3. 8

                                                                                                  what if you want things to line up, like comments, or a series of statements?

                                                                                                  https://nickgravgaard.com/elastic-tabstops/

                                                                                                  1. 2

                                                                                                    I appreciate that this is still surfaced, and absolutely adore it. I’d have been swayed by “tabs for indenting, spaces for alignment, for the sake of accessibility” if not for Lisp, which typically includes indents of a regular tab-width off of an existing (arbitrary) alignment, such that indentation levels don’t consistently align with multiples of any regular tab-stops (e.g. the spaces preceding indentation level 3 might vary from block to block depending on the context, and could even be at an odd offset). Elastic tab-stops seem like the only approach that could cater to this quirk, though I haven’t tried the demo with the context in mind.

                                                                                                    I also lament the lack of traction in implementations for Emacs, though it’s heartwarming to see those implementations that are featured. Widespread editor support may be the least of the hurdles to adoption, which feels like a third-party candidate in a two-party system. Does .editorconfig include elastics as an option? I’m not sure exactly how much work adding that option would entail, but that might be a great way to contribute to the preservation of this idea without the skills necessary to actually implement support in an editor.

                                                                                                  2. 9

                                                                                                    what if you want things to line up

                                                                                                    Easy. Don’t.

                                                                                                    If you want to talk about diffing issues, look at the diffs in around half the Haskell community: a new, longer value requires a whole block to shift, plus either a bunch of manual manipulation or running a tool to parse and reformat your code, just because you felt like things had to line up.

                                                                                                    1. 3

                                                                                                      what if you want things to line up, like comments, or a series of statements?

                                                                                                      Then you put spaces after your tabs. https://intellindent.info/seriously/

                                                                                                    2. 2

                                                                                                      I use tabs and autoformatters. I don’t think my code looks weird with any width between 2 and 8. What kind of weirdness do you refer to? About configuring, most developers have a dotfiles repo and manicure their setup there, why would setting a tabwidth there be more tedious than what most people do already anyway?

                                                                                                      1. 5

                                                                                                        Let’s say that you have the maximum line length set to 24 columns (just to make the example clear). You write code like this:

                                                                                                        if True:
                                                                                                            print("abcdefg")
                                                                                                            if False:
                                                                                                                print("xyz")
                                                                                                        

                                                                                                        With the tab width set to 4 columns, your autoformatter will leave all lines alone. However, if someone has the tab width set to 8, the fourth line will overreach the limit. If they’re also using the same formatter, it will break up the fourth line. Then you’ll wonder why it’s broken up, even though it’s the same length as the second line, which wasn’t broken up. And your formatter might join it up again, which will create endless conflicts.
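
The numbers in that example can be checked with `str.expandtabs`. This sketch assumes each indentation level in the snippet is a single tab character (it renders as spaces above) and uses a made-up helper in place of a real formatter's length check.

```python
LIMIT = 24  # maximum line length from the example

# The snippet from the example, with one tab per indentation level (assumed).
LINES = [
    'if True:',
    '\tprint("abcdefg")',
    '\tif False:',
    '\t\tprint("xyz")',
]

def too_long(lines, tab_width, limit=LIMIT):
    """Return 1-based numbers of lines wider than `limit` at this tab width."""
    return [
        number
        for number, line in enumerate(lines, start=1)
        if len(line.expandtabs(tab_width)) > limit
    ]

print(too_long(LINES, tab_width=4))  # []: lines 2 and 4 are both 20 columns
print(too_long(LINES, tab_width=8))  # [4]: line 2 is 24 columns, line 4 is 28
```

The same bytes pass or fail the limit depending only on the viewer's tab width, which is where the two formatter runs start to disagree.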

                                                                                                        1. 4

                                                                                                          Optimal line reading length is around 60 chars per line, not 60 characters including all leading whitespace. Setting bounds based on character from column 0 is arbitrary, and the only goal should be not too many characters per line starting at the first non-whitespace character (and even this is within reason because let’s be real, long strings like URLs never fit).

                                                                                                          1. 3

                                                                                                            Setting bounds based on character from column 0 is arbitrary

                                                                                                            Not if you print the code in mediums of limited width. A4 paper, PDF, and web pages viewed from a phone come to mind. For many of those a hard limit of 80 columns from the start is a pretty good start.

                                                                                                            1. 1

                                                                                                              That is a fairer point as I was referring to looking at code in an editor–reason being that we’ve been discussing mediums where users can easily adjust the tab-width which is more on topic than static mediums. Web pages are the weird one where it should technically be just as easy to configure the width, but browsers have made it obnoxious or impossible to set our preferred width instead of 8 (I commented about it in the Prettier thread as people seem so riled up about it looking bad on GitHub instead of seeing the bigger picture that GitHub isn’t where all source code lives).

                                                                                                              1. 5

                                                                                                                Note that my favourite editor is the left half of my 13-inch laptop screen…

                                                                                                          2. 1

                                                                                                            I never really understood the need for a maximum length when writing software. Sure it makes sense to consider maximum line length when writing for a book or a PDF, but then it’s not about programming but about typesetting; you also don’t care about the programming font unless you’re writing to publish.

                                                                                                            If you really want to set a maximum line length, I’d recommend to have a maximum line length excluding the indentation, so that when you have to indent a block deeper or shallower, you don’t need to recalculate where the code breaks.

                                                                                                            But really don’t use a formatter to force both upper and lower limits to line lengths; sometimes it makes sense to use long lines and sometimes it makes sense to use short lines.

                                                                                                            1. 5

                                                                                                              Maximum line length makes sense because code is read more often than it’s written. In terms of readability, you’re probably right about maximum line length excluding indentation. But on the other hand, one of the benefits of maximum line length is being able to put multiple text buffers side-by-side on a normal monitor. Perhaps the very smart thing would be a maximum of 60 chars, excluding indentation, with a max of 110 chars including indentation. Of course, you have to treat tabs as a fixed, known width to do that.
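
A rough sketch of that two-limit rule, assuming the 60 and 110 figures from the comment above and a tab width fixed at 4 (my assumption; the comment only says it must be known). The function name and return format are made up.

```python
def check_line(line, tab_width=4, max_content=60, max_total=110):
    """Return which limits a line breaks: 'content', 'total', both, or none.

    'content' counts characters from the first non-whitespace character;
    'total' counts from column 0 with tabs expanded to `tab_width`.
    """
    expanded = line.expandtabs(tab_width).rstrip()
    content = expanded.lstrip(" ")
    problems = []
    if len(content) > max_content:
        problems.append("content")
    if len(expanded) > max_total:
        problems.append("total")
    return problems

deep = "\t" * 10 + "x" * 50
print(check_line(deep))               # []: 50 chars of content, 90 columns total
print(check_line(deep, tab_width=8))  # ['total']: 130 columns once tabs are 8 wide
print(check_line("x" * 61))           # ['content']: too long even with no indent
```

The second call shows why the total limit only means something once everyone agrees on a tab width.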

                                                                                                              1. 3

                                                                                                                I never really understood the need for a maximum length when writing software.

                                                                                                                There are a bunch of editing tasks for which I want to view 2 or 3 different pieces of code side by side. I can only fit so many editors side by side at a font size that’s still readable.

                                                                                                                • looking at caller and callee
                                                                                                                • 3 way diff views
                                                                                                                • old and new versions of the same code
                                                                                                                1. 3

                                                                                                                  Personally, I hate manually breaking up lines when they get too long to read, so that’s what an autoformatter is for. Obviously the maximum readable length differs, but to do it automatically, one has to pick some arbitrary limit.

                                                                                                                  1. 1

                                                                                                                    Sure, but there’s a difference between breaking lines when they get too long, and putting them together again when they are too short.

                                                                                                                    When I use black to format Python code, it always annoys me that I cannot make lines shorter than the hard limit. I don’t really care that I can’t make them longer than some arbitrary limit. Sure, the limit is configurable, but it’s per-file, not per-line.

                                                                                                                    If the problem you have is “where should I split this 120-character one-liner that’s indented with 10 tabs”, then tabs aren’t your problem.