Threads for nanny

    1. 3

      The empty set is a subset of every set, ergo an empty string is a substring of every string. So for any given string an empty string exists at every valid index.

      I think this is pretty foundational first order logic. I’d be pretty interested to hear any good faith argument against it.

      Once you accept the above then the question is just one of semantics: are arrays in your language 0-indexed or 1-indexed? Does your function return the index of the first occurrence of the substring, the last occurrence, all occurrences, etc.?

      1. 2

        The empty set is a subset of every set, ergo an empty string is a substring of every string. So for any given string an empty string exists at every valid index.

        You’re conflating different concepts here. Indices refer to elements, not subsets. So in strings, indices refer to characters, not substrings. There are no characters in the empty string, so we can’t use it to find an index of a character in another string.

        If we translate your second sentence back to the language of sets:

        So for any given [set] [the empty set] exists at every [reference to an element]

        It doesn’t make sense anymore. Yes, the empty set exists for every element, but the contents of the empty set don’t (because it has no contents).

        See also the famous quote:

        Nothing is better than eternal happiness. A ham sandwich is better than nothing. Therefore, a ham sandwich is better than eternal happiness

        1. 2

          It doesn’t make sense anymore. Yes, the empty set exists for every element, but the contents of the empty set don’t (because it has no contents).

          You’re just moving the words around, but it doesn’t change the meaning. A set is a collection of objects, a string is a set of characters, a string is a set, an empty string is an empty set. The empty set is a subset of every set so an empty string is a substring of every string.

          See also the famous quote:

          Nothing is better than eternal happiness. A ham sandwich is better than nothing. Therefore, a ham sandwich is better than eternal happiness

          The quote may be famous but it mixes up “nothing” with an empty set. A box that contains every 4 sided triangle is still a box. If you rewrite it to be “The set of things which are better than eternal happiness is empty. A ham sandwich is better than nothing. Therefore nothing is not in the empty set.” You get a logically consistent statement; because by definition an empty set has no subsets except for itself. The original quote abuses the linguistic quirk of “nothing” being used as a sort of colloquial contronym. The only way that sentence could be used as an argument in good faith is if you believe that person saying it means “the set of things better than a ham sandwich is empty.” Someone might believe that to be true…. (I don’t eat ham, so not me ;) ) and if they did believe it to be true then… well I wish them all the ham sandwiches.

          If I ask you “What would you take in trade for your life?” and you answer “nothing”, you aren’t saying that you would literally give me your life for nothing in exchange, rather you are saying the set of things which you would take in exchange for your life is empty.

          1. 3

            A set is a collection of objects, a string is a set of characters, a string is a set

            Surely a string is a sequence of characters, not a set of characters. If it’s a set of characters, then we would have “no” = “nooooooo”.
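
            A quick way to see the difference, sketched in Python. Using `set(...)` on a string here is only an illustration of "a string viewed as a set of characters", and the variable names are made up for the example:

```python
short, long_word = "no", "nooooooo"

# Viewed as sets of characters, the two words collapse to the same thing.
same_as_sets = set(short) == set(long_word)

# Viewed as sequences, order and repetition matter, so they differ.
same_as_sequences = list(short) == list(long_word)

print(same_as_sets, same_as_sequences)
```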

            1. 1

              Exactly, it’s trivial to prove that a string is not a set in the mathematical sense, which I think immediately invalidates all of the ideas around subsets etc. when talking about indexing like this.

            2. 1

              The quote may be famous but it mixes up “nothing” with an empty set.

              That was exactly my point. It’s similar to what you’re doing.

              The empty set is a subset of every set so an empty string is a substring of every string.

              Yes, but you cannot pick out an element that the empty substring refers to, so it has no index. Indices refer to elements, not substrings.

              1. 1

                Every index refers to a substring of the string, each of those substrings contains the empty set.

                1. 1

                  As I said, “Once you accept the above then the question is just one of semantics.” If the behavior isn’t obvious it should be documented and if it IS obvious it should be extra documented :)

                  So where my expectations and yours don’t align is that the examples in the article take a string as an argument. If a string is taken as an argument and the function doesn’t raise some sort of exception when it is not precisely one element long, I would expect that the returned integer is not the index of the match, but rather the lowest index of the substring which contains a match.

                  Not coincidentally, that matches the behavior and the documentation of the str.find method in Python:

                  find(...)
                      S.find(sub[, start[, end]]) -> int
                  
                      Return the lowest index in S where substring sub is found,
                      such that sub is contained within S[start:end].  Optional
                      arguments start and end are interpreted as in slice notation.
                  
                      Return -1 on failure.
                  

                  the indexOf method in JavaScript

                  The indexOf() method of String values searches this string and returns the index of the first occurrence of the specified substring. It takes an optional starting position and returns the first occurrence of the specified substring at an index greater than or equal to the specified number.
                  

                  the indexOf method in .NET

                  Reports the zero-based index of the first occurrence of a specified Unicode character or string within this instance. The method returns -1 if the character or string is not found in this instance.
                  <.... elided ...>
                  Returns
                  Int32
                  The zero-based index position of value from the start of the current instance if that string is found, or -1 if it is not. If value is Empty, the return value is startIndex.
                  

                  So, I’d say that all 3 of the examples I happened to reach for agreed with me on both counts. i.e. returning 0 is both logical and documented.
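
                  For the Python case this is easy to check directly. A small sketch using only built-in `str` methods, with variable names chosen just for the example:

```python
haystack = "xyz"

first = haystack.find("")    # lowest index where "" is found
last = haystack.rfind("")    # highest index where "" is found
count = haystack.count("")   # number of places "" is found

# A start equal to len(haystack) is still accepted; one past that is not.
from_end = haystack.find("", len(haystack))
past_end = haystack.find("", len(haystack) + 1)

print(first, last, count, from_end, past_end)  # prints: 0 3 4 3 -1
```

                  So CPython reports the empty string at every position from 0 to len(haystack), and -1 only once the start position is past the end.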

              2. 1

                Indices refer to elements

                Not necessarily: an index beyond the end of a string is still an index, but it does not refer to an element.

                For example, you can’t pick a character like str[len(str)] but you can take a slice like substr(str, len(str), 0). So len(str) is a valid index in the string that does not refer to a character in the string.
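
                The same point in Python, as a rough sketch. The `substr` call above is pseudocode, so this uses Python slice syntax instead, and the variable names are illustrative:

```python
s = "xyz"
end = len(s)

# Indexing asks for a character, and there is none at position len(s).
try:
    s[end]
    char_lookup_ok = True
except IndexError:
    char_lookup_ok = False

# Slicing asks for a span, and an empty span does exist at position len(s).
empty_tail = s[end:end]

print(char_lookup_ok, repr(empty_tail))
```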

              3. 1

                The empty set is a subset of every set, so an empty cup is a subset of every cup.

                The empty set is a subset of every set, so an empty gesture is a subset of every gesture.

                These occurrences of ‘empty’ more obviously don’t all mean the same thing, but neither does the occurrence of “empty” in “empty string”.

                1. 1

                  Neither a cup nor a gesture is made up of sub-cups or sub-gestures.

                  Aside from all the other ways I’ve presented my argument in this thread and all of the literal examples of methods/functions across multiple languages I don’t know that I could say anything else to convince someone at this point who remains unconvinced.

                2. 1

                  Reminds me of this old blog post of mine: https://blog.carlana.net/post/2020/python-square-of-opposition/

                  Same idea but the set theory version.

                  1. 1

                    You are talking about cardinality in a question about ordinals. Sets are unordered.

                    If the question had been ‘is the empty string a substring of the empty string’ then I would agree with your analysis,

                    1. 1

                      Not all sets are unordered nor are they composed of unique values, an ordered multiset contains all 3 of cardinal, ordinal and multiple information. All subsets of an ordered set are ordered.

                      1. 1

                        I think you are using a different set theory to me and you probably need to specify which one if you’re making claims about the behaviour of sets.

                        1. 1

                          I’m pretty far outside of my own math training at this point of the discussion, ordered sets and multisets are both used in topology. They have properties that allow for the application of traditional set theory but also preserve/convey the additional information of order and multiplicity.

                          The important part of this is just that basic set theory holds true on these types of sets as well and the call signature and semantics (such as those I pasted elsewhere in this thread) map directly to the expected behaviors of such sets. I.e. indexOf is a method on a String which takes a String (“sub”) and returns the index of the first occurrence of the substring within the string.

                          I’m having trouble keeping track of who and how many people are participating in this thread, but… a common refrain I’m seeing is that an index refers to an “element”, but what I’m asserting is that a string is a set and the functions return indices which are offsets describing substrings which are strings which are sets.

                          I /think/ C++ implements String.find with multiple signatures including one that takes char* as an argument which is probably the closest thing to the “non-set” view of this discussion since the closest equivalent to an “empty string” passed to that particular method would be a null pointer which will result in a segfault.

                          So, the “elementalists” in this thread can rest knowing that their version of what is appropriate in the world is sometimes accounted for unambiguously; just not in any of the functions/methods in the linked article.

                          edit: since written comms often lack the appropriate level of tone/mood hinting. I’d just like to say I’m enjoying this conversation and hope that none of the respondents in this thread are reading any animosity or rudeness in my responses

                  2. 16

                    Thinking about indices gets easier once you start thinking of indices as the spaces between list elements.

                    Say you have a fence with 20 vertical slats that are each 1 inch wide, with little gaps between them. Then your fence is 20 inches long (it’s a small fence, don’t judge), and it has 21 gaps between and around the slats. If you lay down a tape measure, the first gap is at 0 inches, the second gap is at 1 inch, the last gap is at 20 inches. Those are unambiguously at integer coordinates. But think of the fence slats. The first slat spans from 0in to 1in: is that slat number 0 or slat number 1? It’s ambiguous!

                    All the messiness of indexing comes from the ambiguity of assigning integers to elements. Whenever you can avoid that, and only assign integers to the gaps between elements, the messiness goes away. The substring “y” in “xyz” ranges from gap 1 to gap 2 inclusive. The first occurrence of “” in “xyz” ranges from gap 0 to 0. The last occurrence of “” in “xyz” ranges from gap 3 to gap 3.

                    Of course normal humans don’t want 0 to be an index, but you could start counting the gaps from 1 instead. So “xyz” has 4 gaps numbered 1,2,3,4. And the first occurrence of “” ranges from gap 1 to gap 1 and the last ranges from gap 4 to gap 4.
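
                    Python slices happen to work this way already, so the gap picture can be sketched with them. `between_gaps` is a hypothetical helper written only for this illustration, with gaps counted from 0:

```python
def between_gaps(text: str, start_gap: int, end_gap: int) -> str:
    """Return the part of text lying between two gap positions (0-based)."""
    return text[start_gap:end_gap]

word = "xyz"
gaps = list(range(len(word) + 1))        # 3 characters have 4 gaps: 0, 1, 2, 3

y = between_gaps(word, 1, 2)             # "y" lies between gap 1 and gap 2
first_empty = between_gaps(word, 0, 0)   # the first "" spans gap 0 to gap 0
last_empty = between_gaps(word, 3, 3)    # the last "" spans gap 3 to gap 3

print(gaps, repr(y), repr(first_empty), repr(last_empty))
```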

                    1. 8

                      Thinking about indices gets easier once you start thinking of indices as the spaces between list elements.

                      Or, an index is a distance from the start.

                      How far is “y” from the start of “xyz”? 1.

                      When you are looking for “”, can you find it 0 from the start? yes. Can you find it len from the start? yes.

                      1. 5

                        I think this is a good insight because it shows that 0-based “indexes” are really offsets from the start (A[o] is the element o steps from the beginning of A) and 1-based “indexes” are the natural numberings of the elements (A[n] is the nth element of A).
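
                        A tiny sketch of the two readings in Python. Both `at_offset` and `nth` are hypothetical helpers named for this example only:

```python
def at_offset(items, o):
    """0-based reading: the element o steps from the beginning."""
    return items[o]

def nth(items, n):
    """1-based reading: the nth element, counting from 1."""
    if n < 1:
        raise IndexError("ordinals start at 1")
    return items[n - 1]

letters = "xyz"
print(at_offset(letters, 1), nth(letters, 2))  # both pick out the same "y"
```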

                        I wonder why programmers started using “index” and “indexing” for these operations and concepts, because they don’t match the normal English language definitions of those terms.

                        Offset, address, nth, position, location, key, etc. all seem like better choices to me.

                        1. 1

                          I wonder why programmers started using “index” and “indexing” for these operations and concepts, because they don’t match the normal English language definitions of those terms.

                          Probably because it’s used that way in math. Some fields of math even did their numbering from 0 rather than 1, just like we usually do on computers.

                          1. 1

                            This short article on the etymology of “index” might be interesting to you or others.

                            https://www.liverpooluniversitypress.co.uk/doi/pdf/10.3828/indexer.1983.13.3.2

                      2. 2

                        I often wonder how much the fact that English conflates cardinals and ordinals (there are suffixes for ordinals, but often cardinals are used as ordinals) causes this kind of confusion. Do any native speakers of languages with a stricter distinction between the two encounter this kind of confusion?

                        1. 1

                          I only speak English online, and have only come across this discussion online, for one anecdatum.

                        2. 1

                          All the messiness of indexing comes from the ambiguity of assigning integers to elements. Whenever you can avoid that, and only assign integers to the gaps between elements, the messiness goes away. The substring “y” in “xyz” ranges from gap 1 to gap 2 inclusive.

                          So indexOf should return a range? Not just one integer?

                          The first occurrence of “” in “xyz” ranges from gap 0 to 0. The last occurrence of “” in “xyz” ranges from gap 3 to gap 3.

                          I don’t see how the messiness is avoided with ranges.

                          1. How many times does “” appear in “xyz”? 4? One for each “gap”?

                          2. If we think of “indices as the spaces between list elements” (emphasis yours and mine), then there are only 2 “indices” in the string “xyz” (one between x and y, and one between y and z).

                          3. Okay, so maybe you meant that “indices” should refer to the “spaces” before or after a list element. Why do you think there is a “space” or “gap” before the x or after the z in “xyz” in the first place?

                          4. If we’re using 0-indexing here, what’s located at indices 3, 4, and 5?

                          5. If you want “index” to mean “range”, what do we call the numbers we’re assigning to the “gaps”? If we’re indexing the “gaps”, then we still have the messiness of indices (we’ve just indexed the gaps and not the elements).

                          IMO:

                          Indices should refer to elements. If we want to find the index of “brown” (string A) in the string “the quick brown fox” (string B), what we really mean is that we want an index N of an element in string B such that stringA[0] is equal to stringB[N], and stringA[1] is equal to stringB[N+1], and so on, all the way to stringA[len(stringA) - 1] being equal to stringB[N + len(stringA) - 1].

                          While the empty string is in fact a subset of all other strings, trying to find an index of it inside another string should be undefined because it has no elements. There is no “element at index 0” for the empty string. The index of “” in “abc”, “abc” in “”, and “” in “” should all be undefined.

                          1. 5
                            1. Yes, “” appears in “xyz” 4 times.
                            2. You shouldn’t index only the spaces between list elements, because then “x” has no indices, which is too few. I’m saying you should index the spaces between and around list elements, so that “x” has two indices. As you realize in the next numbered point.
                            3. There’s a space/gap before the “x” and after the “z” in “xyz” because we define there to be one? I’m confused at the question. In the fence analogy they are very natural places to talk about: it’s the start and end of the fence!
                            4. Index 3 is the end of the string “xyz”, just after the “z”. Indices 4 and 5 are out of bounds.
                            5. No, I don’t want “index” to mean “range”; those are obviously different. A range is two indices.

                            When you index gaps, the property that find has is that find(needle, haystack) = i implies that haystack from gap i to gap i + len(needle) inclusive is equal to needle.

                            Even under your definition, you can talk about the index of the empty string. Saying that the empty string is a substring of “xyz” at index 0 is correct per your definition, because for all n from 0 to 0 exclusive, “xyz”[n] = “”[n]. It’s like the fact that the sum of no numbers is zero, the product of no numbers is 1, the conjunction of no statements is true, and the disjunction of no statements is false. These all sound strange at first, but you’ll find that any time you want to use them (often in base cases of a proof or an algorithm), these definitions are correct and using them will never lead you astray.
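
                            To make that concrete, here is a naive sketch of the element-by-element definition in Python. `matches_at` and `index_of` are hypothetical helpers written for this comment, not any library's implementation:

```python
import math

def matches_at(haystack: str, needle: str, i: int) -> bool:
    """True if haystack[i + n] == needle[n] for every n in range(len(needle))."""
    if i < 0 or i + len(needle) > len(haystack):
        return False
    # For an empty needle the range is empty, so all() is vacuously True.
    return all(haystack[i + n] == needle[n] for n in range(len(needle)))

def index_of(haystack: str, needle: str) -> int:
    """Lowest position at which needle matches, or -1 if there is none."""
    for i in range(len(haystack) + 1):
        if matches_at(haystack, needle, i):
            return i
    return -1

# The same "empty case" conventions are built into Python itself.
empty_identities = (sum([]), math.prod([]), all([]), any([]))

print(index_of("the quick brown fox", "brown"), index_of("xyz", ""))
print(empty_identities)
```

                            Nothing in `matches_at` special-cases the empty needle; the result at index 0 falls out of the empty loop.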

                            1. 4

                              A+ comment! I especially like the density of your last observation; many logically true things do not seem intuitively correct until you fill out the logic around them.

                            2. 1

                              So indexOf should return a range? Not just one integer?

                              It does not need to, since the needle yields the span. So it returns the start of the range.

                          2. 3

                            But if we have 2 * 0 = 0, the inverse of that would be 0 / 0 = 2, which is just not true.

                            Is that “just not true” because 0 / 0 = 1?

                            Is that “just not true” because X * 0 = Y relates an infinite number of Xs to a single Y, so when inverted that would relate a single Y to an infinite number of Xs (which doesn’t match how functions work by definition)?

                            Or is it something else entirely?

                            1. 6

                              I think the elementary school understanding of division is enough to explain this. If you want to eat a pizza of size 2, how many pieces of size 0 do you need to put together to make that pizza of size 2? Mu. It doesn’t matter how many size 0 pizza slices you have, even an infinite amount, it still won’t add up to a pizza of size 2.

                              1. 1

                                This is the best answer. People go about proving things by resorting to premises that could themselves be questioned. The OP talks about “general consensus”, but it’s not general consensus at all. It’s math; it’s what is correct and what makes sense. People forget maths is just establishing formal notation and naming for what is already there. It just won’t add up to a pizza the size of 2. That is a fact of the universe; it’s not an analogy, it’s the case itself on display. No one decided anything, it just is.

                                1. 3

                                  I don’t want to over-privilege elementary school reasoning. For example, with computer programming, we use discrete math all the time, which isn’t really an elementary school concept. And elementary school notions of infinity and limits are fuzzy at best. But in this case, it seems pretty clear that if division is the name for the operation that one does with pizzas, then dividing by zero has to be undefined. Dividing by an infinitesimal? Elementary school intuitions on that might be right or wrong depending on how well thought through they are. But dividing by zero? There’s a pretty simple intuition here and it is the “correct” answer based on how we define division mathematically.

                              2. 2

                                You can’t even divide 0 by 0 and get an answer.

                                As I understand it, division by 0 can literally be anything, which is bad because you could start proving that 2 = 1, which is obviously false.
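
                                A brute-force sketch of that in Python, over an arbitrary range of integers. The range and the `try_divide` helper are assumptions made for the illustration:

```python
candidates = range(-1000, 1001)

# Every candidate x satisfies x * 0 == 0, so "0 / 0" has no single answer.
solutions_for_zero = [x for x in candidates if x * 0 == 0]

# No candidate x satisfies x * 0 == 2, so "2 / 0" has no answer at all.
solutions_for_two = [x for x in candidates if x * 0 == 2]

def try_divide(a, b):
    """Return a / b, or None where Python refuses to define the result."""
    try:
        return a / b
    except ZeroDivisionError:
        return None

print(len(solutions_for_zero), solutions_for_two)
print(try_divide(6, 3), try_divide(2, 0), try_divide(0, 0))
```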

                                1. 2

                                  I think it’s something else entirely. Saying it’s “just not true” seems a bit circular. You can easily ask “well why is it just not true?”

                                  The “real” reason is that given the axioms of everyday arithmetic, division by zero is just meaningless nonsense. It has no semantic meaning. It’s like the phrase “colorless green ideas sleep furiously”. When using the (common) definitions of those words, the statement doesn’t express anything coherent.

                                  “Well why is ‘colorless green ideas sleep furiously’ meaningless nonsense?” Because our definitions make it so. If we use a more poetic interpretation, maybe the statement means “dull new thoughts lie waiting to spring to life”. But with the face-value definitions it’s just not coherent. Other less controversial examples might be the phrases “married bachelor” or “square circle”. “1 divided by 0” is nonsense in the way that “I am a married bachelor” is nonsense.

                                  1. 1

                                    Other less controversial examples might be the phrases “married bachelor” or “square circle”.

                                    The problem with language is that deliberately breaking the language is a feature of the language. It’s commonly used in sarcasm, where you say something obviously nonsensical to emphasize its nonsensicality.

                                    For instance, imagine the following conversation:

                                    A: I’m a bachelor.

                                    B: And are you married?

                                    A: I’m a married bachelor.

                                    B:

                                    This is because language is a tool whose purpose is to convey meaning to other speakers. “colorless green ideas” is nonsensical precisely because it

                                    1. is of nonsensical mental image (i.e. when you parse it with your mind, the mental picture of “colorless green” contradicts itself and you fail to draw a mental image), and

                                    2. is being neutrally expressed, and therefore can’t be a deliberate flouting of rules.

                                    In the right context, the phrase could be meaningful as sarcasm. But in all the contexts you’ve seen it, it isn’t. It’s just meaningless.

                                2. 1

                                  Wow those benchmarks are very disappointing. I’m dealing with some Python performance problems at work and I was really looking forward to the supposed performance gains.

                                  1. 4

                                    I bet there are all sorts of strange unexpected consequences of pdfs not being able to represent circles. I love those kinds of strange stories that trace complex consequences back to simple causes.

                                    Having said that, physical reality itself can’t really represent a circle properly. I always think it’s funny that people know exactly what a circle is and can argue about whether something is a ‘true’ or ‘perfect’ circle without ever having seen one in reality. It’s that sort of thing that made Plato convinced there was a world of pure forms that we remember.

                                    1. 3

                                      idk, a good enough lathe can make a circle that is pretty damn close to perfection, certainly smaller than unaided human perception. And you can measure that error consistently. With the really good ones you need to do things like account for how much the pressure of the sandpaper applied will cause your steel or whatever to flex. You can certainly make parts that are airtight, with enough work. It might not be a circle, but there’s no way you’d tell the difference between it and a true circle without lab equipment.

                                      Sure, down at the molecular level it becomes a raster of atoms, but there you could also start considering whether a proton or electron is a perfect circle, so. We are always capable of imagining things that don’t exist.

                                      1. 1

                                        This is the only real circle: x^2 + y^2 = 1

                                      2. 17

                                        Eh, when I’m composing longer text like this, I tend to write in Markdown, with a line per sentence. It’s easier to rearrange sentences, and when collaborating, git works much better, and the diffs are much easier to read.
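
                                        A rough way to see the effect on diffs, sketched with Python's `difflib` rather than git itself. The sample sentences and the `touched_chars` helper are made up for the example:

```python
import difflib

def touched_chars(before: str, after: str) -> int:
    """Total length of the lines a line-based diff marks as removed or added."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return sum(len(line) - 2 for line in diff if line.startswith(("- ", "+ ")))

sentences = [
    "It is easier to rearrange sentences.",
    "Git works much better.",
    "The diffs are much easier to read.",
]
edited = [sentences[0], "Git merges work much better.", sentences[2]]

# Same one-sentence edit, stored as one long line vs. one line per sentence.
paragraph_cost = touched_chars(" ".join(sentences), " ".join(edited))
per_sentence_cost = touched_chars("\n".join(sentences), "\n".join(edited))

print(paragraph_cost, per_sentence_cost)
```

                                        With one line per sentence, only the edited sentence shows up in the diff; with one line per paragraph, the whole paragraph does.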

                                        1. 14

                                          I do the same, it’s what is called Semantic Line Breaks which effectively makes the argument in the OP moot.

                                          1. 8

                                            It’s a really good idea, but I have to wonder if the time you save rearranging sentences actually makes up for the extra time spent explaining semantic line breaks to contributors.

                                            In the end I always read these things (Semantic line breaks and the OP) and think like “yeah, I should do that!” and then I end up forgetting a few days later and going back to one space.

                                            1. 2

                                              It’s difficult, especially if you don’t have a linter that can point it out to people and enforce it.

                                              You don’t necessarily need to go through and change all old text. You can just migrate text to use sembr whenever you edit it.

                                              1. 1

                                                that’s why having the website’s nice 😄

                                                1. 1

                                                  I don’t save much time rearranging sentences (though I do save some) but I save a lot of time from not having merge conflicts when two people edit different sentences in the same paragraph. Telling contributors ‘one line per sentence, it will make merging easier’ doesn’t take long.

                                                2. 5

                                                  SemBr appealed to me at first, but I disliked it when I tried using it. When you have short clauses, the source ends up looking weirdly like poetry (or like it’s trying to be a poem).

                                                  1. 1

                                                    I tend to not be that strict with sembr and tend to mostly put in a line break when I input a period or sometimes a comma, if I’ve written a long sentence.

                                                  2. 4

                                                    I only use SemBr with line-based markup formats, which these days is only mandoc. When writing markdown I try to make it look as much as possible like I intended the source to be the version that people read - after all that was the point of markdown, and it fits my habits from learning usenet netiquette at an impressionable age.

                                                    1. 4

                                                      I use that too when I’m writing manual pages — the roff(7) syntax assumes that intraline periods are for abbreviations so it won’t split the line at them. It’s a very convenient convention, I just find it distracting to read my own text with the linebreaks.

                                                      Anything else, I two space and get the benefits of greater readability and vim/emacs/fmt/etc compatibility.

                                                      1. 2

                                                        I don’t care if document authors use semantic line breaks “behind the scenes,” but when they escape into the rendered version—i.e., when the user-visible paragraphs have line breaks in the middle of them—they are distractingly unprofessional-looking.

                                                        This tends to be especially common on Stack Exchange sites, for some reason; maybe their Markdown parser renders intra-paragraph line breaks as real line breaks.

                                                        1. 2

                                                          Funnily enough, many markdown generators consider two spaces at the end of a line as a line break. I typed enter once before this sentence, but it’s not in a new paragraph. It obeyed semantic line breaks. But now I’ll type two spaces after this colon:
                                                          This sentence also had only one newline before it, but the two spaces at the end of the line before it added a line break.

                                                          1. 3

                                                            Yeah, I think that’s a fairly widely-supported Markdown thing. Trailing whitespace is usually considered sloppy, if not an outright mistake; this is the only context I know of where trailing whitespace is actually semantically meaningful.

                                                          2. 1

                                                            GitHub’s markdown renderer does this too. I seem to remember hacking my markdown exporter to spit out paragraphs as a single line to get around it.

                                                            1. 2

                                                              Hmm? I think that’s an issue on comments in PRs, but not for Markdown files in repos.

                                                              I’ve collaborated on a couple of books in Github repos, and the one with line-per-sentence / semantic line breaks looked as expected, while the one with paragraph-per-line had frequent merge conflicts.

                                                              1. 1

                                                                Oh, maybe. Comments and also the PR summary itself. (Which I tend to go to town with.) Weird that they should use different renders for Markdown files vs Markdown text in comments etc, huh?

                                                          3. 1

                                                            Is it though? AFAIU SemBr is not just about one line per sentence, it’s that plus adding line breaks where it makes sense.

                                                            1. 1

                                                              Glad to know there’s several of us weirdos. Though I always broke down my lines to be 80 characters, a sentence spanning multiple if need be. Not all editors will navigate a wrapped line very effectively.

                                                            2. 6

                                                              Git doesn’t work better or worse with semantic line breaks as such, but its built-in diff and merge algorithms are indeed line-based. The reason I’m making this distinction is that you can use different diff and merge algorithms per file type, which opens up things like using word-based diffing tools (e.g. wdiff or wiggle) for Markdown/org/whatever files while keeping line-based diffs elsewhere.

                                                              1. 2

                                                                I knew about wdiff, but not wiggle! That’s … potentially useful? It’s hard to get a sense of it from the repo.

                                                                1. 1

                                                                  Behind-the-scenes, wiggle works effectively by exploding all spaces to line breaks, doing a diff on that, and then stitching it back together for editing/approval/etc. This does occasionally break in fascinating ways, but it’s served me well for…what, gotta be near a decade at this point.
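The "split on spaces, diff, stitch back together" idea described above can be sketched in a few lines. This is only an illustration of the general technique using Python's standard `difflib`, not wiggle's actual implementation; the function name `word_diff` and the `[-…-]` / `{+…+}` markers (borrowed from wdiff's default output style) are choices made for this example.

```python
import difflib


def word_diff(old: str, new: str) -> str:
    """Diff two texts word by word and stitch the result into one line."""
    # Step 1: "explode" the text so that each word is its own unit.
    old_words = old.split()
    new_words = new.split()

    # Step 2: run an ordinary sequence diff over the words.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    # Step 3: stitch everything back together, marking the changes.
    out = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            out.extend(old_words[i1:i2])
            continue
        if i1 != i2:  # words removed from the old text
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j1 != j2:  # words added in the new text
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


print(word_diff("the quick brown fox", "the slow brown fox"))
```

Because the diff unit is a word rather than a line, reflowing a paragraph does not show up as a change; the trade-off is that the original whitespace and line breaks are lost in this simplified version.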

                                                              2. 3

                                                                When I learned LaTeX, this was recommended because it made CVS work better. I kept using it with Markdown and AsciiDoc, and Subversion and git. It significantly reduces editing conflicts, means I don’t need to reflow text, and makes moving sentences around easier.

                                                                When I got the draft of my first book back from the copy editor, entirely full of annotations telling me to move text around, I was very grateful to whichever LaTeX guide told me to do this. I’d made my life much easier.

                                                                I’ve since insisted on it in other papers and in docs for my open source projects. Usually people grumble, think it’s weird, then find it’s saved them a pile of work and start doing it elsewhere.

                                                                1. 1

                                                                  This works well. Brian Kernighan recommended it on page 10 of UNIX for Beginners in 1974 :)

                                                                2. 38

                                                                  This is the silliest thing to argue about ever and I can’t even imag–

                                                                  oh right French puts a space before sentence-ending punctuation and it bugs me so much grr argh

                                                                  ahem, as I was saying, I can’t believe anyone would take the time to worry about this. :-P

                                                                  1. 11

                                                                    In English we put tabs after punctuation so everything lines up & readers can adjust the width for what works for them.

                                                                    1. 5

                                                                      Only in countries that use metric punctuation though. In America, we use the imperial punctuation space which equates to about 0.254 metric tabs worth of spaces.

                                                                      1. 2

                                                                        And yet somehow the British are still using Imperial tabs defined by the width of the current king/queen’s pinky finger.

                                                                    2. 9

                                                                      This is even worse, we French people put space before sentence-ending punctuations (like ?!:), except for dots where space is only after the dot. Despite being French, I think this is weird and prefer the English way :)

                                                                      1. 6

                                                                        The rule I was taught is the space comes before and after a punctuation mark if and only if it is non-connex: ?!:;«» but not ,.()/

                                                                        1. 1

                                                                          What does non-connex mean?

                                                                          1. 1

                                                                            Sorry I meant disconnected/non-connected. It means it’s in two parts that don’t touch: https://en.wikipedia.org/wiki/Connectedness

                                                                            1. 1

                                                                              Thanks! I’ve been taught the same rule, except for the “disconnected” concept :)

                                                                        2. 2

                                                                          And that is why we have &nbsp; :)

                                                                      2. 3

                                                                        I use https://earl-grey.halt.wtf/ in VSCode/VSCodium and the foreground and backgrounds from it in my terminal. I find it quite nice. It’s a little low contrast for my aging eyes, but a heavier weight font helps a ton (JetBrains Mono Medium these days).

                                                                        1. 2

                                                                          That’s a great theme! I used Solarized Light but I will try that one out.

                                                                          For fonts, I do like JetBrains Mono a lot. But let me (unironically) suggest Comic Mono. Comic Sans is well-known to be easy to read, so much so that many people with dyslexia swear by it.

                                                                          1. 2

                                                                            I use the plex font, but either way, it never occurred to me to use a heavier weight font! Thank you so much for this comment!

                                                                          2. 1

                                                                            I use one that hasn’t yet been mentioned: Blokada 5. It has an app available on Android and iOS. It blocks via a local VPN. It’s free, open source, doesn’t require root/jailbreak, and blocks ads in all apps.

                                                                            https://blokada.org/

                                                                            1. 2

                                                                              Interestingly I was experiencing crashes all day until I updated, so I wonder if I was affected. The odd part is I am using a phone with Google’s Tensor CPU.

                                                                              1. 5

                                                                                The Tensor SoCs seem to be built off of Samsung’s Exynos, so even though there are differences (and probably growing over time), they’ll share some CPU quirks.

                                                                                1. 1

                                                                                  Yeah I read about that earlier today! I guess this really confirms things :)

                                                                                  1. 1

                                                                                    But the chip that seems to have issues in this bug report uses a custom design CPU from Samsung, while the Tensor SoCs only use the same bog standard ARM designed cores that everyone uses.

                                                                                  2. 4

                                                                                    What day? If it was yesterday (June 12), and you’re running Nightly, it was more likely https://bugzilla.mozilla.org/show_bug.cgi?id=1837869

                                                                                    1. 1

                                                                                      I had crashes all day yesterday on my Pixel 6 Pro running Nightly. The app would never open at all even after reinstall. I had to switch to beta.

                                                                                      1. 1

                                                                                        You’re probably right! It may just be coincidental timing. Thanks for sharing the link.

                                                                                      2. 1

                                                                                        The mobile tensor unit is an accelerator, not a CPU.

                                                                                        1. 7

                                                                                          No, Google named their whole semi-custom-Samsung SoC line “Tensor”

                                                                                      3. 29

                                                                                        Why Twitter didn’t go down … yet

                                                                                        I was hoping for some insights into the failure modes and timelines to expect from losing so many staff.

                                                                                        This thread https://twitter.com/atax1a/status/1594880931042824192 has some interesting peeks into some of the infrastructure underneath Mesos / Aurora.

                                                                                        1. 12

                                                                                          I also liked this thread a lot: https://twitter.com/mosquitocapital/status/1593541177965678592

                                                                                          And yesterday it was possible to post entire movies (in few-minute snippets) on Twitter, because the copyright enforcement systems were broken.

                                                                                          1. 5

                                                                                            That tweet got deleted. At this point it’s probably better to archive them and post links of that.

                                                                                            1. 11

                                                                                              It wasn’t deleted - there’s an ongoing problem over the last few days where the first tweet of a thread doesn’t load on the thread view page. The original text of the linked tweet is this:

                                                                                              I’ve seen a lot of people asking “why does everyone think Twitter is doomed?”

                                                                                              As an SRE and sysadmin with 10+ years of industry experience, I wanted to write up a few scenarios that are real threats to the integrity of the bird site over the coming weeks.

                                                                                              1. 12

                                                                                                It wasn’t deleted - there’s an ongoing problem over the last few days where the first tweet of a thread doesn’t load on the thread view page.

                                                                                                It’s been a problem over the last few weeks at least. Just refresh the page a few times and you should eventually see the tweet. Rather than the whole site going down at once, I expect these kinds of weird problems will start to appear and degrade Twitter slowly over time. Major props to their former infrastructure engineers/SREs for making the site resilient to the layoffs/firings though!

                                                                                                1. 2

                                                                                                  Not only to the infra/SREs but also to the backend engineers. Much of the built-in fault-tolerance of the stack was created by them.

                                                                                                1. 1

                                                                                                  hm, most likely someone would have a mastodon bridge following these accounts RT-ing :-)

                                                                                                2. 2

                                                                                                  FWIW, I just tried to get my Twitter archive downloaded and I never received an SMS from the SMS verifier. I switched to verify by email and it went instantly. I also still haven’t received the archive itself. God knows how long that queue is…

                                                                                                  1. 2

                                                                                                    I think it took about 2 or 3 days for my archive to arrive last week.

                                                                                                3. 2

                                                                                                  oh, so they still run mesos? thought everyone had by now switched to k8s…

                                                                                                  1. 13

                                                                                                    I used to help run a fairly decent sized Mesos cluster – I think at our pre-AWS peak we were around 90-130 physical nodes.

                                                                                                    It was great! It was the definition of infrastructure that “just ticked along”. So it got neglected, and people forgot about how to properly manage it. It just kept on keeping on with minimal to almost no oversight for many months while we got distracted with “business priorities”, and we all kinda forgot it was a thing.

                                                                                                    Then one day one of our aggregator switches flaked out and all of a sudden our nice cluster ended up partitioned … two, or three ways? It’s been years, so the details are fuzzy, but I do remember

                                                                                                    • some stuff that was running still ran – but if you had dependencies on the other end of the partition there were lots of systems failing health checks & trying to get replacements to spin up
                                                                                                    • Zookeeper couldn’t establish a quorum and refused to elect a new leader so Mesos master went unavailable, meaning you didn’t get to schedule new jobs
                                                                                                    • a whole bunch of business critical batch processes wouldn’t start
                                                                                                    • we all ran around like madmen trying to figure out who knew enough about this cluster to fix it
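The ZooKeeper symptom in the list above follows from majority quorum: a side of a partition can only elect a leader if it can reach a strict majority of the voting members. A minimal sketch of that arithmetic is below. The ensemble sizes are hypothetical, since the comment does not say how many ZooKeeper nodes were involved, and the function names are invented for illustration.

```python
def has_quorum(reachable: int, ensemble_size: int) -> bool:
    """A group has quorum only with a strict majority of the ensemble."""
    return reachable > ensemble_size // 2


def partitions_with_quorum(partition_sizes: list[int]) -> list[bool]:
    """For each side of a network partition, report whether it keeps quorum."""
    ensemble_size = sum(partition_sizes)
    return [has_quorum(size, ensemble_size) for size in partition_sizes]


# Hypothetical 5-node ensemble split two ways: one side keeps quorum.
print(partitions_with_quorum([3, 2]))

# The same ensemble split three ways: no side has a majority,
# so no leader can be elected anywhere.
print(partitions_with_quorum([2, 2, 1]))
```

A three-way split like the second case would be consistent with what the comment describes: nothing that depends on a ZooKeeper leader can make progress until the partition heals.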

                                                                                                    It was a very painful lesson. As someone on one of these twitter threads posted, “asking ‘why hasn’t Twitter gone down yet?’ is like shooting the pilot and then saying they weren’t needed because the plane hasn’t crashed yet”.

                                                                                                    1. 8

                                                                                                      Twitter is well beyond the scale where k8s is a plausible option.

                                                                                                      1. 2

                                                                                                        I wonder what is the largest company that primarily runs on k8s. The biggest I can think of is Target.

                                                                                                        1. 3

                                                                                                          There’s no limit to the size of company that can run on kube if you can run things across multiple clusters. The problem comes if you routinely have clusters get big rather than staying small.

                                                                                                            1. 1

                                                                                                              Oh, I didn’t realize that was their main platform.

                                                                                                              1. 2

                                                                                                                I was thinking about that too, but I’m guessing that CFA has a fraction of the traffic of Target (especially this time of year). Love those sandwiches though…

                                                                                                          1. 2

                                                                                                            Had they done so, I bet they’d already be down :D

                                                                                                            1. 1

                                                                                                              I work at a shop with about 1k containers being managed by mesos and it is a breath of fresh air after having been forced to use k8s. There is so much less cognitive overhead to diagnosing operational issues. That said, I think any mesos ecosystem will be only as good as the tooling written around it. Setting up load balancing, for instance . . . just as easy to get wrong as right.

                                                                                                          2. 3

                                                                                                            What’s the deal with Nim? It seems like it’s in a similar category to golang, but more Pythonish than Java-ish.

                                                                                                            1. 9

                                                                                                              What’s the deal with Nim? It seems like it’s in a similar category to golang, but more Pythonish than Java-ish.

                                                                                                              Go’s primary use case was for writing server-side web applications at Google. Nim is more of a systems programming language. Someone was even trying to clone the xv6 kernel in it at one point, but I’m pretty sure that effort is abandoned.

                                                                                                              FWIW, it can also compile to JavaScript. Their forum, forum.nim-lang.org, is a single-page application written in Nim. That says a lot about how flexible it is.

                                                                                                              NB: I don’t program in Nim and don’t have any stake in it, but it looks cool from an outsider’s point of view.

                                                                                                              1. 1

                                                                                                                Go also has (had?) a JS target. People got excited about it for a while, but the juice wasn’t worth the squeeze as far as I can tell, and it seems like it’s mostly been forgotten about, especially since wasm support landed.

                                                                                                                1. 1

                                                                                                                  Go also has (had?) a JS target. People got excited about it for a while,

                                                                                                                  This stuff is way far away from my areas of interest, being a fervent hater of all things modern web and SPA. But their SPA did seem responsive when I last looked. And hey, from a purely technical point of view, it is clever and interesting in a way that “someone did a thing with Framework of the Week” is not. Since the Nim project is eating its own dogfood, it is also an incentive not to let that backend bitrot. It’s a testimonial for the breadth of the language. I never thought I’d be saying “Oh cool, someone made an SPA”.

                                                                                                                  1. 1

                                                                                                                    I think there are two. GopherJS and WASM targets.

                                                                                                                2. 7

                                                                                                                  More or less. I mentally bin it in with Java and C# and Go and D in the bucket of “reasonably fast application languages”. I consider it quite competently designed and made; it has some ability to do low-level programming but it is mainly used with a GC. I don’t use it much because I’m usually either doing lower-level stuff in Rust or gluing stuff together with Python/Lua… but I wish I had more reasons to use it.

                                                                                                                  The dark secret is its syntax is a lot closer to Pascal than Python. Shhhh, don’t tell anyone.

                                                                                                                  1. 3

                                                                                                                    I personally often find Nim more convenient as a glue language than Python.

                                                                                                                  2. 5

                                                                                                                    That’s basically it, sure. It’s in that Go, C++, D, Rust configuration space. Implementation/output-wise between D and C++, textually/input wise somewhere around Python with a dash of macros between Rust and Lisp (I don’t know any D to comment on its macros so sorry if I left it out).

                                                                                                                    It compiles to C, it has a runtime including threads and garbage collection but it’s optional (so lighter than Go’s or Java’s). It has a very Python-y syntax but I’ve seen the author of this post’s code and they use quite a lot of the macro system and it feels pretty natural so it’s hard to pin it too hard in that Python bucket but certainly at a glance it looks that way.

                                                                                                                    It compiles to C (or JavaScript, or wasm, or some other targets) and supports most of the native datatypes like pointers that make FFI somewhat straightforward, so it’s pretty easy to plug in to most ecosystems.

                                                                                                                    It’s a small cult following for sure, there aren’t Nim folk coming into every thread telling you to rewrite your stuff in Nim

                                                                                                                      1. 2

                                                                                                                        Between templates, template mixins, value/alias template parameters, string mixins, CTFE, etc., metaprogramming in D is effectively as powerful as macros though right? I think a lot of people would consider string mixins + CTFE macros.

                                                                                                                        1. 4

                                                                                                                          Okay, yes. D has deliberately bad macros. :)

                                                                                                                          If you limit yourself to D’s built-in syntax, you can do some macro-like things. But you can’t destructure things and you can’t mess with custom syntax outside of strings.

                                                                                                                    1. 4

                                                                                                                      More like an early Rust with a wilder syntax, GC (originally, now giving way to a more static memory management system[0]) and lesser focus on thread safety.

                                                                                                                      [0] https://nim-lang.org/blog/2020/10/15/introduction-to-arc-orc-in-nim.html

                                                                                                                      1. 4

                                                                                                                        one of the challenges with newer (or in general less-mainstream) programming languages is that they do not have client libraries for major open source databases, stream processors, UI toolkits, and various other middleware.

                                                                                                                        For example, I cannot find Zig, Racket, or Nim client libraries for Kafka Streams. Probably the same is true to some degree for Qt, and maybe for other major databases as well.

                                                                                                                        I am not saying that this is a language issue, it is just these things make enterprise adoption in the meantime more difficult for low-budget/see-if-it-works kinds of efforts.

                                                                                                                        1. 5

                                                                                                                          At least in Zig it’s first-class supported to just use the C library for it.

                                                                                                                          1. 2

                                                                                                                            Zig is another language that gets pretty rave reviews from people enjoying exploring it. I’ve yet to encounter it (or Nim, or Crystal) in real life, and I’ve barely ever seen Rust, but it’s cool that people are continuing to innovate and explore in the language area. I’m ready to let C++, Python, Java, and JavaScript slowly retire into the sunset, even though they’re all incredible languages with extremely rich ecosystems.

                                                                                                                          2. 3

                                                                                                                            Yeah, we’re pretty spoiled at this point by the plethora of options in well-established languages. 🤷‍♂️

                                                                                                                            But it’s pretty cool to read the excitement in this blog article on Nim. Programming should at least occasionally be fun, and it sounds like Nim helped him find that fun again. That’s something that I can appreciate.

                                                                                                                          3. 2

                                                                                                                            You answered your own question. Why the question mark?

                                                                                                                            It is very pythonish and has the benefits of other languages you mentioned.

                                                                                                                            As mentioned by others, it lacks a killer library/application.

                                                                                                                            1. 3

                                                                                                                              Go and Nim are both descendants of Pascal.

                                                                                                                              Go is optimized for disposable code monkeys at an advertising company. Nim is a hackers’ Pascal.

                                                                                                                            2. 25

                                                                                                                              Firefox’s killer feature for me is its history and bookmarks sync/search. If I start typing in Firefox’s omnibar, it automatically searches my entire browsing history and bookmarks. I get the same ability on my phone, it shares history.

                                                                                                                              Chrome does not do that, even though Google knows all my history, and it makes for an awful UX.

                                                                                                                              I can also run my own Firefox sync server, retaining control over my data.

                                                                                                                              1. 7

                                                                                                                                You can even make it so that it only searches through open tabs or only bookmarks and so on by using some modifiers

                                                                                                                                1. 3

                                                                                                                                  Thanks, that’s very helpful! And thanks for working on Firefox!

                                                                                                                              2. 1

                                                                                                                                Somebody on reddit recently made me aware of gdu, which is just ncdu but faster.

                                                                                                                                1. 27

                                                                                                                                  That looks really useful.

                                                                                                                                  This FAQ from the bottom of the README needs to go at the top:

                                                                                                                                  Why shouldn’t I just use jq?

                                                                                                                                  jq is awesome, and a lot more powerful than gron, but with that power comes complexity. gron aims to make it easier to use the tools you already know, like grep and sed.

                                                                                                                                  gron’s primary purpose is to make it easy to find the path to a value in a deeply nested JSON blob when you don’t already know the structure; much of jq’s power is unlocked only once you know that structure.
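
To make the "find the path to a value" idea concrete, here is a minimal Python sketch of that kind of flattening. It is not gron's implementation: the `flatten` and `flatten_text` names are made up for illustration, and it assumes every key is a plain identifier, so it skips the quoting a real tool would need.

```python
import json


def flatten(value, path="json"):
    """Yield one 'path = value;' line per node, in document order."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            # Assumption: keys are simple identifiers that need no quoting.
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars are printed as JSON literals.
        yield f"{path} = {json.dumps(value)};"


def flatten_text(text):
    """Parse a JSON document and return its flattened lines as a list."""
    return list(flatten(json.loads(text)))
```

Because every line carries its full path, grepping the output for a value shows where that value lives in the document.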

                                                                                                                                  1. 6

                                                                                                                                    Agreed, this was my first question. And I’ve personally felt the pain of jq’s complexity, so this is a good argument.

                                                                                                                                    1. 6

                                                                                                                                      This tool helps me out a lot with complex jq queries: https://github.com/fiatjaf/jiq

                                                                                                                                      1. 3

                                                                                                                                        FWIW, here’s a version of that that works for jq or any other filter: https://github.com/kbd/setup/blob/master/HOME/bin/fzr It’s basically just a big call to fzf, using its preview window for display.

                                                                                                                                        So, <file fzr jq gives you an interactive jq repl, but it can also work for non-jq things.

                                                                                                                                        1. 3

                                                                                                                                          That’s awesome, thanks!

                                                                                                                                          1. 1

                                                                                                                                            This is the third time I have seen this workflow pattern, and I also wrote a similar hacky, experimental tool in the past. It looks like something people want.

                                                                                                                                            There are tools that allow this but I think someone will implement a dedicated tool to do this. Sort of an interactive xargs so to speak:

                                                                                                                                            Accept a command as a string with a placeholder and on each keystroke re-run the command replacing the placeholder with the contents of the prompt and display the output.
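
A rough Python sketch of the core of that idea, leaving out the keystroke handling and display entirely. The `run_with_query` name and the `{}` placeholder are assumptions for illustration, and it assumes a POSIX shell is available.

```python
import shlex
import subprocess


def run_with_query(template, query, placeholder="{}"):
    """Substitute the shell-quoted query for the placeholder and run the command once."""
    # Quote the query so spaces and shell metacharacters are passed as one argument.
    command = template.replace(placeholder, shlex.quote(query))
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout
```

An interactive front end would call `run_with_query` again after every change to the prompt and redraw the returned text.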

                                                                                                                                            1. 1

                                                                                                                                              Accept a command as a string with a placeholder and on each keystroke re-run the command replacing the placeholder with the contents of the prompt and display the output.

                                                                                                                                              Yes this is exactly what the linked tool does. fzf provides all the UI needed and it works well, give it a try.

                                                                                                                                              1. 1

                                                                                                                                                Cool! I looked at it closer, I thought it was jq specific. But it’s just using fzf as a UI. Nicely done.

                                                                                                                                            2. 1

                                                                                                                                              Wow, thanks! I really need to learn fzf better

                                                                                                                                      2. 4

                                                                                                                                        This is awesome news. I’ve been using it as my daily driver since the end of 2019 and had no complaints. I don’t have any formal comparisons, but informally have found that Emacs just feels smoother and more responsive with it. Not hugely so – it’s not a 2x. But it’s enough to feel subtly better. Stability-wise, it’s also seemed to be about as good as mainline Emacs (i.e., very).

                                                                                                                                        The one real tricky aspect to getting it working was that I ended up first having to make and install a custom build of GCC that included libgccjit before I could build Emacs with this branch. I expect that distros will make it easier to install once this lands, however.

                                                                                                                                        1. 2

                                                                                                                                          FWIW, libgccjit is already in Debian testing.

                                                                                                                                          Personally, I haven’t noticed any performance difference (or any difference, really) with the ‘native-comp’ branch, but I like that they’re modernizing and updating the internals.

                                                                                                                                          1. 4

                                                                                                                                            Mostly the same here. I guess how much of a difference it makes depends on how you use Emacs, and I just don’t have that many computationally intensive tasks. The main bottlenecks are probably still IO with the disk or the network.

                                                                                                                                          2. 1

                                                                                                                                            I’ve been using it as my daily driver since the end of 2019 and had no complaints.

                                                                                                                                            Same, except I could complain about libgccjit’s compile time.. Normal operations certainly feel a lot snappier and it makes Emacs a lot more pleasant to use. But it doesn’t fix things like Emacs choking on long lines, big xml/json files, etc. so don’t expect that.

                                                                                                                                            1. 2

                                                                                                                                              libgccjit’s compile time.

                                                                                                                                              I think this is because it’s not really being used as a JIT in the sense that runtimes like v8 or the JVM mean by a JIT (opportunistic compilation of hot functions or code paths, often done in the background). If I understand it correctly, libgccjit is just a nice way to call the full GCC compiler in an incremental fashion, and have it compile to memory instead of disk. And the way Emacs native-comp is using it is to compile entire modules as they’re loaded. So it’s more like on-demand calling out to a full AOT compiler. Still nice, but a different way of integrating a native-code compiler, despite JIT being in the name.

                                                                                                                                          3. 10

                                                                                                                                            fzy - not so bloated and written in C.

                                                                                                                                            1. 4

                                                                                                                                              Thanks for the pointer, it looks lean. Pardon my ignorance, but I’m not sure what “written in C” means as an advantage over fzf here - as a shell-level user of fzf which seems plenty fast enough for me, what does this imply for me?

                                                                                                                                              That said, icy makes a good point in reply to this, too, relating to the Unix philosophy.

                                                                                                                                              1. 2

                                                                                                                                                For one the binary is about 100x smaller..

                                                                                                                                                1. 10

                                                                                                                                                  Not to be rude, but who cares? fzf is 2.3mb on my system, that’s basically nothing. In fzy’s readme under “Sorting”, fzf gives the same results as fzy, except for file over filter (admittedly yes, fzy is better on that one specific case). fzy claims to be faster but there are no benchmarks. fzf is already near instantaneous. fzf also has extended search syntax for exact match/inverse/prefix/suffix. What you call “bloat” others call “features”.

                                                                                                                                              2. 3

                                                                                                                                                Yeah there are many like this. skim, selecta are others. I found fzf to provide very responsive UI with extremely large input. Like, even if search takes a few seconds the UI still processes keystrokes. I can’t say the same about every simplified version out there. That said I haven’t tried fzy.

                                                                                                                                                1. 3

                                                                                                                                                  +1 for fzy! It’s just enough. fzf does one too many things.

                                                                                                                                                  1. 2

                                                                                                                                                    thanks for sharing this, I’ve effectively replaced fzf!

                                                                                                                                                  2. 11

                                                                                                                                                    Pretty cool, thanks! I didn’t know ripgrep could do replacements.

                                                                                                                                                    In-place editing can be done with the sponge utility from moreutils:

                                                                                                                                                    rg --passthru 'blue' -r 'red' ip.txt | sponge ip.txt
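
For anyone wondering why the pipe needs sponge at all: a plain `> ip.txt` redirect would truncate the file before rg has read it. A minimal Python sketch of the "soak up everything first, then write" behaviour follows. The function signature is my own, and the real utility handles more cases than this.

```python
def sponge(stream, path):
    """Read the whole stream first, then overwrite path with what was read."""
    data = stream.read()  # soak up all input before touching the target file
    with open(path, "w") as out:  # the target is only truncated at this point
        out.write(data)
```

Since the read finishes before the write starts, the input can safely come from the same file that is being replaced.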
                                                                                                                                                    
                                                                                                                                                    1. 1

                                                                                                                                                      You’re welcome.

                                                                                                                                                      sponge is a nice suggestion, I’ll add it to the post, thanks.

                                                                                                                                                    2. 15

                                                                                                                                                      No history searching is supercharged without fzf: https://github.com/junegunn/fzf

                                                                                                                                                      1. 2

                                                                                                                                                        How do you deal with the crappy fuzzy matching of fzf? like https://github.com/junegunn/fzf/issues/1823

                                                                                                                                                        1. 1

                                                                                                                                                          I haven’t had any problems with it, I wouldn’t call it crappy either. What completion system (in any software) do you know of that solves the issue you described there? I don’t personally know of any autocomplete that works on editing distance instead of something like /f.*o.*o/.
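
To spell out the /f.*o.*o/ model mentioned above, here is a small Python sketch of subsequence matching. It only illustrates the regex idea from this comment and makes no claim about how fzf implements matching or ranking; the function names are made up.

```python
import re


def subsequence_pattern(query):
    """Build a pattern with the query characters in order and anything in between."""
    return re.compile(".*".join(re.escape(ch) for ch in query))


def fuzzy_filter(query, candidates):
    """Keep the candidates that contain the query as a subsequence."""
    pattern = subsequence_pattern(query)
    return [c for c in candidates if pattern.search(c)]
```

A transposed query such as `ofo` does not match `foo` under this model, which is the gap an edit-distance matcher would cover.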

                                                                                                                                                          1. 1

                                                                                                                                                            Hm, ok. I didn’t investigate further. I still use fzf for history search just not for cd anymore.

                                                                                                                                                        2. 98

                                                                                                                                                          I don’t write tests to avoid bugs. I write tests to avoid regressions.

                                                                                                                                                          1. 14

                                                                                                                                                            Exactly. I dislike writing tests, but I dislike fixing regressions even more.

                                                                                                                                                            1. 7

                                                                                                                                                              And I’d go even further:

                                                                                                                                                              I write tests and use typed languages to avoid regressions, especially when refactoring.

                                                                                                                                                              A test that just fails when I refactor the internal workings of some subcomponents, is not a helpful test – it just slows me down. 99% of my tests are on the level of treating a service or part of a service as a black box. For a web service this is:

                                                                                                                                                              test input (request) -> [black box] -> mocked database/services
                                                                                                                                                              

                                                                                                                                                              Where black box is my main code.

                                                                                                                                                              For NodeJS the combo express/supertest is awesome for the front bit. I wish more web frameworks in Rust etc also had this. I.e. providing ways to “fake run” requests through without having to faff around with server/sockets (and still be confident it does what it should).
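
A sketch of the same shape in Python, using only the standard library, in case the diagram is too abstract. The handler, the request dict and the `insert_user` method are all hypothetical stand-ins, not any framework's API.

```python
from unittest.mock import Mock


def handle_create_user(request, db):
    """Hypothetical handler: this is the black box under test."""
    name = request.get("name", "").strip()
    if not name:
        return {"status": 400, "body": {"error": "name required"}}
    user_id = db.insert_user(name)
    return {"status": 201, "body": {"id": user_id, "name": name}}


def test_create_user():
    db = Mock()  # mocked database on the far side of the box
    db.insert_user.return_value = 7
    response = handle_create_user({"name": " ann "}, db)
    assert response == {"status": 201, "body": {"id": 7, "name": "ann"}}
    db.insert_user.assert_called_once_with("ann")


def test_rejects_blank_name():
    db = Mock()
    response = handle_create_user({"name": "  "}, db)
    assert response["status"] == 400
    db.insert_user.assert_not_called()
```

The tests only look at the response and at the calls made to the mock, so the inside of the handler can be refactored without touching them.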

                                                                                                                                                              1. 5

                                                                                                                                                                Now the impish question: what is the correct decision if the test is more annoying to write than the regression is to observe and fix?

                                                                                                                                                                1. 3

                                                                                                                                                                  Indeed!

                                                                                                                                                                  (I research ways[1] to avoid that. But of course they don’t apply when you’ve already chosen a stack and framework for development. In my day job we just make hard decisions about priority and ROI and fall back sometimes to code comments, documents or oral story-telling.)

                                                                                                                                                                  [1] https://github.com/akkartik/mu1#readme (first section)

                                                                                                                                                                  1. 2

                                                                                                                                                                    Every project is different, but ideally you can invest time in the testing infrastructure such that writing a new test is no longer annoying. I.e., maybe you can write re-usable helper functions and get to the point where a new test means adding an assertion or copy / pasting an existing test and modifying it a bit. The tools used (test harness, mocking library, etc) also play a huge role in whether tests are annoying or not, spending time ensuring you’re using the right ones (and learning how to properly use them) is another way to invest in testing.

                                                                                                                                                                    The level of effort you should spend on testing infrastructure depends on the scope, scale and longevity of your project. There are definitely domains that will be a pain to test pretty much no matter what.

                                                                                                                                                                    1. 2

                                                                                                                                                                      In my experience such testing frameworks tend to add to the problem, rather than solve it. Most testing frameworks I’ve seen are complex and can be tricky to work with and get things right. Especially when a test is broken it can be a pain to deal with.

                                                                                                                                                                      Tests are hard because you essentially need to keep two functions in your head: the actual code, and the testing code. If you come back to a test after 3 years you don’t really know if the test is broken or the code is broken. It can be a real PITA if you’re using some super-clever DSL testing framework.

                                                                                                                                                                      People trying to be “too clever” in code can lead to hard to maintain code, people trying to be “too clever” in tests often leads to hard to maintain tests.

                                                                                                                                                                      Especially in tests I try to avoid needless abstractions and be as “dumb” as possible. I would rather copy/paste the same code 4 times (possibly with some slight modifications) than write a helper function for it. It’s just such a pain to backtrack when things inevitably break.

                                                                                                                                                                      It really doesn’t need to be this hard IMHO; you can fix much of it by letting go of the True Unit Tests™ fixation.

                                                                                                                                                                      1. 2

                                                                                                                                                                        I don’t disagree, and I wasn’t trying to suggest using a “clever” testing framework will somehow make your tests less painful. Fwiw I even suggested the copy / paste method in my OP and use it all the time myself :p. My main point was using the right tool / methods for the job.

                                                                                                                                                                        I will say that the right tool for the job is often the one that is the most well known for the language and domain you’re working in. Inventing a bespoke test harness and trying to force it on the 20 other developers who are already intimately familiar with the “clever” framework isn’t going to help.

                                                                                                                                                                        1. 2

                                                                                                                                                                          Fair enough :-)

                                                                                                                                                                          I will say that the right tool for the job is often the one that is the most well known for the language and domain you’re working in. Inventing a bespoke test harness and trying to force it on the 20 other developers who are already intimately familiar with the “clever” framework isn’t going to help.

                                                                                                                                                                          I kind of agree because there’s good value in standard tooling, but on the other hand I’ve seen rspec (the “standard tool” for Ruby/Rails testing) create more problems than solve IMHO.

                                                                                                                                                                  2. 4

                                                                                                                                                                    When fixing testable bugs you often need that “simplest possible test case” anyway, so you can identify the bug and satisfy yourself that you fixed it. A testing framework should be so effortless that you’d want to use it as the scaffold for executing that test case as you craft the fix. From there you should only be an assert() or two away from a shippable test case.

                                                                                                                                                                    (While the sort of code I write rarely lends itself to traditional test cases, when I do, the challenge I find is avoiding my habit of writing code defensively. I have to remind myself that I should write the most brittle test case I can, and decide how robust it needs to be if and when it ever triggers a false positive.)
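
As an illustration of being "an assert() or two away", here is a minimal Python sketch. Both the `slugify` function and the bug report it refers to are hypothetical.

```python
def slugify(title):
    """Hypothetical function under repair: lower-case, hyphen-separated words."""
    # split() with no argument collapses runs of whitespace, which is the fix.
    return "-".join(title.lower().split())


def test_slugify_collapses_repeated_spaces():
    # Simplest reproduction of the hypothetical report:
    # "Hello  World" used to come out as "hello--world".
    assert slugify("Hello  World") == "hello-world"
```

The reproduction written while chasing the bug becomes the regression test as it stands.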

                                                                                                                                                                    1. 3

                                                                                                                                                                      +1

                                                                                                                                                                      This here, at the start of the second paragraph, is the greatest misconception about tests:

                                                                                                                                                                      In order to be effective, a test needs to exist for some condition not handled by the code.

                                                                                                                                                                      A lot of folks from the static typing and formal methods crowd treat tests as a poor man’s way of proving correctness or something… This is totally not what they’re for.

                                                                                                                                                                      1. 1

                                                                                                                                                                        umm…..aren’t regressions bugs?

                                                                                                                                                                        1. 9

                                                                                                                                                                          Yes, regressions are a class of bug. The unwritten inference akkartik made when saying “I don’t write tests to avoid bugs” is that it is refers specifically to writing tests to pre-empt new bugs before they can be shipped.

                                                                                                                                                                          Such defensive use of tests is great if you’re writing code for aircraft engines or financial transactions; whereas if you’re writing a christmas tree light controller as a hobby it might be seen as somewhat obsessive compulsive.

                                                                                                                                                                          1. 0

                                                                                                                                                                            I-I don’t understand. Tests are there to catch bugs. Why does it matter particularly at what specific point in time the bugs are caught?

                                                                                                                                                                            1. 9

                                                                                                                                                                              Why does it matter particularly at what specific point in time the bugs are caught?

                                                                                                                                                                              Because human nature.

                                                                                                                                                                              Often times a client experiencing a bug for the first time is quite lenient and forgiving of the situation. When it’s fixed and then the exact same thing later happens again, the political and financial consequences of that are often much, much worse. People are intensely frustrated by regressions.

                                                                                                                                                                              Sure, if we exhaustively tested everything up front, they might never have experienced the bug in the first place, but given the very limited time and budgets on which many business and enterprise projects operate, prioritizing letting the odd new bug slip through in favor of avoiding regressions often makes a hell of a lot of sense.

                                                                                                                                                                              1. 5

                                                                                                                                                                                Not sure if you are trolling …

                                                                                                                                                                                Out of 1000 bugs a codebase may have, users will never see or experience 950 of them.

                                                                                                                                                                                The 50 bugs the user hits though – you really want to make sure to write tests for them, because – based on the fact that the user hit the bug – if it breaks again, the user will immediately know.

                                                                                                                                                                                That’s why regression tests give you a really good cost/benefit ratio.
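
                                                                                                                                                                                As a minimal sketch of what such a test can look like: the function `parse_quantity`, the input `" 1,000 "`, and the bug report itself are all made up for illustration, not taken from any real project. The idea is just to pin the exact input the user hit, so the same failure can't quietly come back.

```python
def parse_quantity(text: str) -> int:
    """Hypothetical function: parse a quantity typed in by a user."""
    # Assumed fix for the made-up bug report: plain int() raised ValueError
    # on input containing a thousands separator, so strip separators first.
    cleaned = text.strip().replace(",", "")
    return int(cleaned)


def test_quantity_with_thousands_separator():
    # Regression test: reuses the exact input from the (hypothetical) report.
    assert parse_quantity(" 1,000 ") == 1000


def test_plain_quantity_still_works():
    # Guards the ordinary case so the fix does not break it.
    assert parse_quantity("42") == 42
```

                                                                                                                                                                                The test costs a couple of lines on top of the fix, since the failing input is already known from the report.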

                                                                                                                                                                                1. 3

                                                                                                                                                                                  A bug caught by a test before the bad code even lands is much easier to deal with than a bug that is caught after it has already been shipped to millions of users. In general the further along in the CI pipeline it gets caught, the more of a hassle it becomes.

                                                                                                                                                                                  1. 3

                                                                                                                                                                                    The specific point in time matters because the risk-reward payoff calculus is wildly different. Avoiding coding errors (“new bugs”) by writing tests takes a lot of effort and generally only ever catches the bugs which you can predict, which can often be a small minority of actual bugs shipped. Whereas avoiding regressions (“old bugs”) by writing tests takes little to no incremental effort.

                                                                                                                                                                                    People’s opinion of test writing is usually determined by the kind of code they write. Some types of programming are not suited to any kind of automated tests. Some types of programming are all but impossible to do if you’re not writing comprehensive tests for absolutely everything.

                                                                                                                                                                                    1. 2

                                                                                                                                                                                      The whole class of regression tests was omitted from the original article which is why it’s relevant to bring them up here.

                                                                                                                                                                                      1. 2

                                                                                                                                                                                        The article says “look back after a bug is found”. That sounds like they mean bugs caught in later stages (like beta testing, or in production).

                                                                                                                                                                                        If you define bugs as faults that made it to production, then faults caught by automated tests can’t be bugs, because they wouldn’t have made it to production. It’s just semantics, automated tests catch certain problems early no matter how you call them.

                                                                                                                                                                                        1. 1

                                                                                                                                                                                          I’m of the same opinion. It means that the reason why we’re writing tests is not to catch bugs in general, but specifically to catch regression bugs. With this mindset, all other catching of bugs is incidental.
