Threads for kevincox

    1. 3

      Wouldn’t it be trivial for apple and google to implement e2ee for notifications?

      Generate an asymmetric keypair for notifications per device, or even per app. The pair could also be shared via keychain or whatever for convenience. Google/apple can then distribute the pub keys to apps that are granted notification permissions. Then apps can encrypt their notifications with the pub key. They can also sign it, so google/apple servers can’t mitm.
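
      A minimal sketch of that scheme, assuming PyNaCl for the crypto (the keys, payload, and distribution step are all made up for illustration):

      from nacl.public import PrivateKey, SealedBox
      from nacl.signing import SigningKey

      # Device: generate a per-app keypair; the public key would be handed to the
      # push service and distributed to apps with notification permission.
      device_key = PrivateKey.generate()

      # App server: sign the notification, then encrypt it to the device key, so the
      # push relay can neither read nor forge the payload.
      app_signing_key = SigningKey.generate()
      payload = b'{"title": "New message", "body": "hello"}'
      ciphertext = SealedBox(device_key.public_key).encrypt(app_signing_key.sign(payload))

      # Device: decrypt, then verify against the app's known verify key.
      signed = SealedBox(device_key).decrypt(ciphertext)
      assert app_signing_key.verify_key.verify(signed) == payload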

      1. 4

        This is how the Web Push API works. It is a pretty nice protocol. The browser (or whatever other tool implements it) gives you a URL and a key. Then the pusher hits the endpoint with an encrypted blob.

        IIUC Web Push can also be carried over the Android and iOS push notification systems (has that shipped yet?), so it seems like this could be implemented.

        It seems like the worst case would be that you don’t get a perfect preview for the notification. But even if the key lives with the OS rather than the target application, that should be good enough for most cases.
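
        For reference, the application-server side of that flow is small. A rough sketch assuming the pywebpush library; the endpoint URL and keys are placeholders for what the browser/OS hands out when the page subscribes:

        from pywebpush import webpush, WebPushException

        # Placeholder subscription: the browser (or OS push service) generates the
        # endpoint and keys and gives them to the site when it subscribes.
        subscription_info = {
            "endpoint": "https://push.example.net/send/abc123",
            "keys": {"p256dh": "<client public key>", "auth": "<client auth secret>"},
        }

        try:
            webpush(
                subscription_info=subscription_info,
                data='{"title": "Hi", "body": "readable only by the subscribing browser"}',
                vapid_private_key="vapid_private_key.pem",   # identifies the sending server
                vapid_claims={"sub": "mailto:admin@example.com"},
            )
        except WebPushException as err:
            print("push failed:", err)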

      2. 3

        How does this prevent mitm? If Google/Apple both distribute the public keys and route the notifications, they can (be compelled to) generate their own keypair in-between. It would be detectable (though not by the apps themselves, since they’d never see these keys), but that’s about it.

    2. 5

      There is lots of interesting context in this blog post by Beeper: https://blog.beeper.com/p/how-beeper-mini-works and this blog post by the person who reversed the protocol: https://jjtech.dev/reverse-engineering/imessage-explained/.

      The most concerning bit is:

      When making an IDS registration request, a binary blob called “validation data” is required. This is essentially Apple’s verification mechanism to make sure that non-Apple devices cannot use iMessage.

      Note: The binary that generates this “validation data” is highly obfuscated. pypush sidesteps this issue by using a custom mach-o loader and the Unicorn Engine to emulate an obfuscated binary. pypush also bundles device properties such as the serial number in a file called data.plist, which it feeds to the emulated binary.

      So it sounds like:

      1. This will be fairly easily banned by blocking serial numbers that are used too frequently.
      2. This relies on tricking Apple into believing that the session is “authentic”, as in rooted in Apple hardware.

      I wonder how Beeper intends to build a reliable product on this. Are they just going to throw a fit and apologize if Apple shuts it down?

      1. 3

        There is no company here, but they might find themselves acquired. I do not like Apple’s rent extraction games, but I’m not sure if I like the carriers more. It’s just a bad situation all the way around.

        1. 1

          What do you mean specifically in the case of Apple and iMessages/Messages.app? Is your claim that their choice to keep iMessage only on Apple hardware does not add value to the consumer? That argument would need to be justified, as it isn’t self-evident. Also, a strong argument would need to discuss the costs and benefits {(a) to Apple and (b) to customers} of supporting iMessages on other hardware, would it not? What is the calculus?

    3. 1

      Note: Adding the lang attribute is particularly important for assistive technology users. For instance, screen readers alter their voice and pronunciation based on the language attribute.

      I’m sorry, it is the year 2023. It is trivial to identify the language of a paragraph of text, and, if you fail and just use the default voice, any screen reader user will be either a) as confused as I would be, reading a language I clearly don’t understand, or b) able to determine that they are getting German with a bad Spanish accent, assuming they speak both languages. Please, please, please, accessibility “experts”, stop asking literally millions of people to do work on every one of their pieces of content, when the work can be done trivially, automatically.

      1. 6

        These are heuristics, and not always correct. Especially for shorter phrases it is very possible that the text is valid in multiple languages. It is of course good that these heuristics exist, but it seems best to also provide more concrete info.

        The ideal situation is probably both. Treat the HTML tags as a strong signal, but if there is lots of text and your heuristics are fairly certain that the tag is wrong, consider overriding it; if it is short text or you aren’t sure, go with what the tag says.

        Makes me wonder if there is a way to indicate “I don’t know” for part of the text. For example if I am embedding a user-submitted movie title that may be another language. I could say that most of this site is in English, but I don’t know what language that title is, take your best guess.

        1. 5

          Makes me wonder if there is a way to indicate “I don’t know” for part of the text.

          From https://www.loc.gov/standards/iso639-2/faq.html#25:

          1. How does one indicate undetermined languages using the ISO 639 language codes?

            In some situations, it may be necessary to indicate that the identity of the language used in an information object has not been determined. If the situation is that it is undetermined because there is no language content, the following identifier is provided by ISO 639-2:

            zxx (No linguistic content; Not applicable)

            If there is language content, but the specific language cannot be determined a special identifier is provided by ISO 639-2:

            und (Undetermined)

          Also in fun ISO language codes: You can add -fonipa to a language code to indicate IPA transcription:

          From my resume:

          <h1 lang="tr">
          	<span class="p-name">Deniz Akşimşek</span>
          	<i lang="tr-fonipa">/deniz akʃimʃec/</i>
          </h1>
          
      2. 3

        It is trivial to identify the language of a paragraph of text

        It’s an AGI-hard problem…

        Consider my cousin Ada. The only way a screen reader (or person) can read that sentence correctly without a <span lang=tr> is by knowing who she is.

        What is possible, though far from trivial, is to apply a massive list of heuristics, which is sometimes the best option available, e.g. for user-generated content. However, when people who do have the technical knowledge to take care of these things don’t, responsible authors who mark their languages will then have to work around them.

        1. 1

          But never, in all of human history, has a letter, or book, or magazine article ever noted your cousin’s name language in obscure markup. That’s not how humans communicate, and we shouldn’t start now.

      3. 2

        I write lang="xy" attributes. I, for one, certainly would prefer that the relatively small number of HTML authors take the small amount of care to write lang="xy" attributes, so that user agents can simply read those nine bytes, than that the much larger number of users spend the processing power to run the heuristics to identify the language (and maybe fail to guess correctly). Consider users over authors. Maybe, if one considers only screen readers, the effect shrinks away, but there are other user agents that care what language the text on the Web is in, ones as common as Google Chrome, which identifies the language so that it can offer to Google-Translate it.

        1. 2

          I, for one, certainly would prefer that the relatively small number of HTML authors take the small amount of care to write lang=“xy” attributes, so that user agents can simply read those nine bytes, than that the much larger number of users spend the processing power to run the heuristics to identify the language (and maybe fail to guess correctly).

          This is the fundamental disconnect. You are not making this ask of the “relatively small number of HTML authors”. You are making this ask of literally every single person who tweets, posts to facebook or reddit, or sends an email. This is an ask of, essentially, every person who has ever used a computer. The content creator is the only person who knows the language they are using.

    4. 1

      Nice! I knew the last two, but I don’t think I’ve ever heard of the first three before. I’ll remember this translate property. I think that’s an interesting one.

      1. 2

        I didn’t know about it either. But do any browsers do anything with it? It would be really cool if they gave you some sort of indication “this page is also available in English, [View]”. But I’ve never seen something like that. It is always hunting around for an in-page “English”, or “En” or English in the source language, or a UK flag, or a US flag.

        1. 1

          I assumed that it skips the contents when you click the “Translate…” button on mobile Chrome (and whatever other browsers have this feature).

          1. 2

            Sorry, I should have been more specific. I was talking about <link href="https://example.com/de" rel="alternate" hreflang="de" /> where there is a first-party translation available. Not triggering machine-translation.

            1. 1

              Oh, yeah. I have never actually made a multi-language website. So, I have no idea!

    5. 17

      It’s remarkably easy to get invited to Lobste.rs. What I like about the invite-only system is that you have to contact a person for an invite; that’s how I got mine. This simple extra effort is so much better than all the AI tools to filter out bots. I’m sure bad humans do slip in, and now with ChatBots, bad computers impersonating good humans will slip in, but I still think the system is better than something that lets the FSB sign up a hundred thousand accounts and bring down lobste.rs.

      1. 12

        What I like about the invite system is that it provides some basic accountability. If spammers are getting invites we can look at the tree and find the right spot to prune. It provides some basic protection from spam and sock puppets.

        I do think invites should be fairly easy to get. I think I would extend an invite to most people who asked me. If I don’t know them personally I would just want to see some comment history on another site to make sure that they are likely a human and interested in what is on-topic here.

        It would be interesting to have some sort of indication of “how sure” you should be for invites. Right now the about page says you should invite “people [you] believe will contribute positively”, which doesn’t say much about the confidence margin you should have. I wonder if some message like “the mods are doing ok, don’t worry about being too stingy with invites” vs “the mods are busy right now, be conservative” would be useful, and it could occasionally be updated. But I’m probably just way overthinking this.

        1. 4

          As an addendum to your comment on accountability, I feel that accountability also extends in both directions. I was given an invite by a stranger and knowing that they might pay a price if I’m too big of a jerk is a good reminder to give every comment a second editing pass and make sure that I’m contributing instead of ranting.

      2. 8

        I just hopped on IRC - as the about page also suggests

    6. 23

      There’s some discussion there of the self-promotion rule, and I wonder if ratelimiting how much of your own stuff you can post could help limit self-promotion while helping avoid interesting posters being banned – like you only get one “authored by” a week and are expected to always label your authored posts. Maybe you get to bank a couple ‘authored by’s up if you have a flurry of writing (or maybe if events or just Lobsters briefly getting really into a topic makes your posts more relevant).

      You could also attack the problem from the other side: you don’t want to see a bunch of the same stuff on the homepage, so maybe under some circumstances the ranking will push an item down if it’d be the second on the homepage with the same submitter or source (where source could be domain or well-known-domain/user, like HN does).

      Favoring dissimilar content also keeps the homepage varied when it’s more than one user (say, Bob and his friends post and vote for Bob’s work) without having to judge whether that was motivated by genuine interest or a desire to promote Bob’s stuff.

      1. 25

        As a serial poster, I stopped posting my own stuff to lobste.rs a while back. I just don’t see the need to, except in some exceptional/special circumstances (like I did for my Xesite v4 post). Most of the stuff I write is intended for more general audiences, but I enjoy reading the comments on lobste.rs more than anything else. There’s a much higher signal-to-noise ratio here. It’s nice.

        1. 21

          I think it boils down to honesty. When I post my own blog post on lobste.rs, it is because I genuinely think it will interest the lobste.rs crowd.

          When in doubt, I wait for someone else to post it (but this is possible only because I have readers active on this website. If you read this, thank you a lot!)

          1. 12

            I think this is key. If 90% of your submissions are your own stuff but they get tons of appreciation, who really cares? I would expect this to be a small minority of people, but I don’t think it is a problem. But if you submit every one of your blog posts and they get maybe 2 votes apiece, then it doesn’t really matter how many other links you submit in between; maybe you should be more selective about which of your posts you submit.

      2. 11

        Seems it’s less bans and more people voluntarily leaving after being warned.

      3. 10

        It’s something I don’t understand. I’ve posted 11 links including 5 to my own blog and 1 to my own project. That’s clearly self-promotion.

        Yet, I had 4 links removed by moderators. None of them were linking to my own content and, all 4 times, it felt quite harsh as it was “on-topic” but “not-enough”.

        It takes time to learn the rules at lobste.rs. But, in my experience, this is not related to self-promotion but mostly to the strict “on-topic” rule.

      4. 7

        I do not recall any instance of someone being banned for respecting the reasonable rules for self-promotion, and thus the community losing a valuable voice.

        1. 37

          A look at the moderation log shows that self-promotion is the primary reason why most users are banned.

          I think the primary problem is that there isn’t a clear guideline for people to follow. The percentages of contributions which lobste.rs users think it is okay to be self-promoting range from 10% to 49%, including a bunch of values in between. Other users think that any percentage is fine as long as your contributions are getting votes.

          I know lobste.rs relies on “community norms”, but I don’t think that those norms are clear here. I think some of the people who had bad experiences here would still be active here if there was a clear list of rules to follow - like HN’s guidelines page. While I realize that some people will submit spam regardless (as Upton Sinclair said, “It is difficult to get a man to understand something, when his salary depends upon his not understanding it”), clear rules (not subject to interpretation by moderators) could decrease the number of good-faith bad submissions from new users.

          1. 31

            Self-promo is also the primary reason why mods PM users, and the vast majority of users reassess site norms and follow them with nothing public appearing in the modlog along the way. We PM because posting a public comment is automatically a punishment by public shaming, but the second-order effect is that it’s not obvious that anything ever happened.

            1. 6

              Can you reflect on your sense of how the Venn diagram for self-promotion-related PMs and bans looks? Maybe also as intersected by any mitigating factors you tend to consider?

              (What fraction of people banned for this get a PM first? Does the fraction change if we focus just on users that made a meaningful effort to participate beyond self-promotion?)

              1. 27

                Most get at least one warning first. The ones that haven’t gotten warned first about self-promo are doing something real deliberately spammy like setting up a voting ring, creating sockpuppets, or disregarding the big bright red message that warns they’re about to earn a ban. Lobsters looks a lot like HN and Reddit so it’s not surprising that people carry those sites’ norms here. Ideally we want to nudge people into learning how our community works to contribute productively to it, but sometimes we see people who knowingly break rules to benefit themselves.

                Mods do scale our response to their other site activity. People who are voting + commenting normally on other people’s stories get a softer message. People who have zero site activity beyond posting their company blog with a strong marketing call-to-action to buy their thing will get a sterner message and significantly less leeway. This also informs domain bans. If you look at those messages, some are “yeah this is spam and we don’t want to ever reward you” and a lot are more like “hey, you’re ignoring PMs or not correcting your behavior, so here’s an escalated response” when someone submits topical, quality stuff but is treating the site as a write-only promo tool instead of a community to be part of.

                1. 10

                  I’ve been a daily lurker for years, and I’m kind of afraid of posting here because of the moderation, so I typically just don’t.

                  When you look at my profile, does it look like I’m a participant in the community? Or would I look like a “write-only” account?

                  1. 4

                    I’d try to find a few more links authored by other people before submitting more stuff authored by you, but you’re also commenting, and that balances it out. Comments are site contributions (although commenting exclusively on your own submissions can be self-promotion too). I’m not a mod, just my opinion.

                    1. 4

                      Appreciate it! I know that my profile kind of looks like a spammer unless you look at my comment history. In fact, two of the three links I’ve submitted were obscure things that I had written and was proud of.

                      That was kind of the point of my question: as a passive participant in this community, would I get banned the next time I post something I was proud of doing?

                      1. 5

                        Probably not because you’re commenting otherwise. Note also how @pushcx mentions upthread that you’d likely receive a warning before a ban.

                        I have some stuff authored by myself and my employer on my profile but I have plenty of other stuff. That’s the kind of contribution pattern the rule about avoiding self promotion is trying to promote.

                    2. 4

                      Plus from a mod perspective, they can see things like voting, so that counts as engaging with the community.

            2. 3

              I’m curious how closely “number of upvoted comments” aligns with “understands site norms”. Would be a truly hideous query unless there’s structured data regarding “received PM from mod about not understanding site norms”, though.

          2. 22

            The percentages of contributions which lobste.rs users think it is okay to be self-promoting range from 10% to 49%, including a bunch of values in between

            It’s interesting that both of the people in the posts that you linked to said that they’d engage more, but then didn’t. One never posted again, the other has posted on two threads that were both about their employer. I wonder if some of these people ever intended to become involved with the community.

            The best bit of marketing from the last few decades has been to call advertising platforms ‘social media’. That phrasing has tainted everything that actually has a social component.

            1. 9

              If somebody doing self-promotion wrings their hands and goes “gee I sure wish I could’ve engaged more, but my stuff got flagged”, and then doesn’t…fuck’em.

              I’m sure we’ve lost a few really great contributors. We’ve also managed to keep out legions of growth hackers, spammers, and exploiters who just see us as a really valuable marketing venue–and hilariously, the more effective we are at keeping them out the more attractive a target we become!

          3. 10

            I posted a blog post of mine a couple of weeks ago, and before posting I was looking for a rule about self-promotion, but couldn’t find one (other than perhaps common sense). Now, I’m totally willing to believe I missed it, but I just don’t know how I would have known the norm without this thread.

            1. 5

              I’ve seen that a lot. I joined because a paper I was a coauthor on was submitted and I went to IRC to ask for an account. The folks there gave me an invite and talked a bit about self promotion. When I’ve invited others, I’ve given them the 10% rule of thumb. I’d always assumed that introducing new people to the social norms was the job of whoever invited them. Maybe we need something written to that effect next to the ‘send invitation’ button, if these informal rules are not documented anywhere.

          4. 3

            Transparency has its benefits, but it might also lead to people trying to stay just below the line.

            It depends on how much that matters. Anti-fraud measures are almost never transparent for example.

            1. 1

              Transparency has its benefits, but it might also lead to people trying to stay just below the line.

                You say that like it’s a bad thing?

              1. 1

                I think it’s potentially bad if you end up with several marginally annoying accounts that you don’t want to remove because they’re following the rules.

                It’s not a huge problem, but it’s also not necessarily better.

      5. 6

        I’d even argue for a two-week limit. After several years on this site, it’s rare I see any one author putting out quality content (definitions vary) more frequently than that.

        1. 4

          Yeah, I try to write my blog weekly, but really in about 2/3 of the weeks I just do the bare minimum because it takes hours to write anything more than a basic “here’s the features I worked on this week” bullet list, and it is just hard to find that kind of time (and even then I think most of my blog isn’t that interesting). But, on the other hand, there are some cases that are better - Microsoft’s Raymond Chen’s blog is pretty consistently good, though he’s retelling 30 years of stories and writes up a bunch of backlog ahead of time too. So idk, I think some of it is a judgement call, but at the same time if you see a blog you do consistently like, you can always check it yourself instead of waiting for someone to post it here again.

        2. 3

          You mean between self-promotion posts? Yeah, I could see that working. Some really prolific folk manage slightly more than that, but not by a huge margin.

          1. 5

            I can also see a good argument that if you are prolific enough to manage more than one lobsters-worthy post per two weeks, then you have likely gained enough followers to submit the rest of your posts for you, or it doesn’t hurt to be a bit selective in what you submit here.

            I don’t know if I really like the idea of a hard rule in general but this seems a pretty reasonable upper bound of self-promotion.

    7. 2

      I think WebKit started JITing CSS selectors around 2014, and I presume other browser engines do the same thing, so this is replacing one Turing-complete JIT’d language with another Turing-complete JIT’d language. I’m still in favour of doing as much with declarative languages as possible (and so in favour of everything that the article proposes), but I find the framing of this somewhat amusing.

      1. 3

        I don’t follow how the implementation of a language affects whether it is declarative or not. Also CSS isn’t really Turing complete. (I think it technically is when combined with the DOM or an input source, but in practice no one is using it like that.)

      2. 1

        Are there any non-terminating CSS selectors?

    8. 3

      It’s a well-appreciated improvement; I can’t wait to update my clumsy inverse-gitignore file lists.

      1. 2

        That’s exactly what I did. Worked quite well. Before After.

        Not really any shorter but more to the point.

    9. 3

      I’ll be the first person to come out in favour of Doing it With CSS, but I do have to object to this snippet:

      It’s one of the core principles of web development and it means that you should Choose the least powerful language suitable for a given purpose.

      What?? According to whom? You can’t be seriously telling me that it’s preferable to centre content using <center> instead of with CSS? This logic would even support the usage of <font> tags and inline colours and all that HTML3 nonsense. This is a nightmare for developers and users of accessibility aids – literally nobody wins!

      I was under the impression that since HTML5 the accepted best practice was to separate content from form: your HTML markup should describe the entire document as well as the semantic relations between elements, and your stylesheet should contain all visual information which does not have explicit semantic importance.

      1. 5

        The rule of least power is a core W3C principle.

        Of course, it’s not an absolute rule, it’s a guiding principle. In particular here, in the sentence “Choose the least powerful language suitable for a given purpose”, we can argue about “suitable” and “purpose”. If you want some text to be visually centred, there is a good case that this is an HTML misfeature and should never have been a part of this language at all. Which is why center is deprecated, and that’s a pretty good reason not to use it.

        1. 3

          This document doesn’t support the author’s interpretation, though: specifically, it doesn’t support choosing HTML over CSS in all cases, as the author directly suggests. Rather, it reflects on choice of language generally and makes some broad recommendations for the web, viewing HTML and CSS as two parts of the same “least-power” solution.

          Many Web technologies are designed to exploit the Rule of Least Power. HTML is intentionally designed not to be a full programming language, so that many different things can be done with an HTML document: software can present the document in various styles, extract tables of contents, index it, and so on. Similarly, CSS is a declarative styling language that is easily analyzed. […] Thus, HTML, CSS and the Semantic Web are examples of Web technologies designed with “least power” in mind. Web resources that use these technologies are more likely to be reused in flexible ways than those expressed in more powerful languages.

          Yes, the rule of least power can be used as a useful guideline to develop best practices, but it’s not a rule about when to use HTML and when to use CSS (although it could be applied productively to the case of javascript).

          1. 2

            OK, so the part you are really objecting to is “On the web this means preferring HTML over CSS”, and I agree with your comments on that.

      2. 0

        What’s the biggest problem with <center>?

        Considering accessibility, I remember that being brought forward as a reason to prefer semantic HTML. But nowadays, aren’t screen readers much better and will figure out something is a headline if it is just a bold div? Why should I use the less convenient CSS if screen readers don’t care?

        If the strongest argument is elegance or similar, I think I would rather worry about the massive ad-hoc JS libraries, which to my FP eyes is a much bigger annoyance.

        What about backwards compat? I like using HTML 3 tags instead of flex box. It means my code is more compatible. CSS2 is harder to implement than HTML 3.

        Old HTML gives me a fuzzy feeling, it’s almost nostalgic. Marquee is a favorite relic from a less corporate web. Why doesn’t it make you nostalgic?

        1. 4

          aren’t screen readers much better and will figure out something is a headline if it is just a bold div?

          I’d bloody hope not. It’s annoying enough having to put role="list" on <ul> elements because other people abused them. Why should I, as someone who writes conformant HTML, have to use hacks to make sure my bold divs don’t turn into headings?

          Why do people constantly make the argument that browsers/AT should just “do accessibility” for them? If someone said “why does anyone care about performance when optimizing compilers exist” they’d be laughed out of the room. When browsers try to work around HTML syntax errors, they are maligned. And bringing more of these annoying heuristics to accessibility is supposed to be an improvement?

        2. 3

          I believe the main reason to centre with CSS instead of <center> remains the separation of content and presentation. If you decide to not centre a particular kind of element in the future, it’s easy to update the CSS in one place and apply the change consistently across a whole site. If you use <center> then you need to go and audit every use and find the ones that you do and don’t want to change.

          This is somewhat less relevant if your text is stored in some other format and the centred part is applied server-side via some template engine.

          1. 2

            Nah, just put

            center {
              text-align: left !important;
            }
            

            and you have applied the change consistently in one place.

        3. 3

          my programming career started with a “teach yourself web design” book I found in my primary school library and I distinctly remember convincing my parents to allow me to install firefox 2.0 on their home computer so that I could use the <blink> tag on my websites. So yes, HTML3 definitely brings back some good memories :)

          But oh my days, it’s not good! There is definitely a reason why we settled on the separation of markup from style. As soon as your website gets bigger than 2 files it is going to cause more headaches than it triggers warm fuzzy nostalgia. To answer your question directly, <center> causes problems as soon as you decide you don’t want those elements centred any more. If you want to provide custom stylesheets for your website (which should also bring back some web 1.0 fuzzies, I hope), it’s going to complicate that. More fundamentally, it puts one – and only one – aspect of your layout rules in with the markup, and makes it more difficult to write the inevitable stylesheet that will come later. Yeah, I still use it on some of my websites for old time’s sake, but not once they expand beyond 2 or 3 files.

          Your point about screenreaders is an interesting one – yes, they have become good at detective work because they had to. I don’t think that’s something to celebrate, and this detective work inevitably carries with it an element of uncertainty. I’d rather try and create websites with a beautiful structure; even if screenreaders have become sufficiently advanced that they can work around hostile design, there are a host of other accessibility aids (and more generally, cool things we could do with hypertext) currently rendered impossible because not enough websites provide friendly markup they can work with.

          And btw, I don’t use any javascript frameworks either :)

          1. 2

            As soon as your website gets bigger than 2 files it is going to cause more headaches than it triggers warm fuzzy nostalgia.

            Well no, because then you make it all .cgi scripts in perl which then print <span font=”$myfont”> (Since you know, you can’t trust that all browsers support this newfangled css thing yet).

            1. 2

              you joke, but perl+slowcgi is still my favourite way to bootstrap a quick web app. that book I read in primary school still haunts me

              1. 1

                you joke

                only half, I’m afraid.

      1. 4

        You can divide by zero in pretty much any programming language by using IEEE 754 floating point. It is clearly defined and will compute to infinity or NaN.
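
        For example, in Python (using NumPy here, since plain Python floats raise ZeroDivisionError for / rather than returning the IEEE 754 result):

        import numpy as np

        # IEEE 754 defines all of these results; no trap or exception is required.
        with np.errstate(divide="ignore", invalid="ignore"):
            print(np.float64(1.0) / np.float64(0.0))    # inf
            print(np.float64(-1.0) / np.float64(0.0))   # -inf
            print(np.float64(0.0) / np.float64(0.0))    # nan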

        1. 1

          It has to be special cased, otherwise in programming it would just be an endless loop.

          Subtracting 0 from $VARIABLE until it reaches 0, which will never happen.

          1. 3

            It probably is special cased, but also, people generally use much smarter algorithms for division than “subtract B from A until you get 0”.

          2. 2

            That algorithm doesn’t work for floating point numbers anyway. For example 2^64/1 would never complete, because 2^64 minus 1 rounds back to 2^64 in double precision. So you can subtract 1 all day and it will keep rounding back to itself.
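
            A quick check of that in Python:

            import math

            x = 2.0 ** 64
            print(math.ulp(x))    # 4096.0: the gap up to the next representable double
            print(x - 1.0 == x)   # True: the subtraction rounds straight back to x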

      2. 3

        J also allows it and so does JS, and in both cases the answer is more reasonably infinity.

      3. 2

        Aside: @hwayne, since you’re here, I’d like to feed back that I thought this was pretty confusing:

        […] I don’t know Pony, and I don’t have any interest in learning Pony.¹ But this tweet raised my hackles for two reasons:

        1. It’s pretty smug. I have very strong opinions about programming, but one rule I try to follow is do not mock other programmers.² […]
        2. It’s saying that Pony is mathematically wrong. […]

        Reading this linearly until hitting a footnote reference (“¹”) and then jumping down to where there was what looked like an inline footnote body, I thought you were saying you had no interest in learning Pony because Pony was too smug and mocked other programmers; only after reading the second (not-a-)“footnote” (Pony is saying Pony is wrong? huh?) did I catch my mistake.

        1. 2

          Well that’s a weird UX failure mode of footnotes + numbered lists! That footnote should go to this:

          In the year since I wrote this post, Pony added partial division. I still don’t know anything about the language but they’ve been getting grief over this post so I wanted to clear that up. [return]

      4. 1

        Previously discussed on Lobsters as: https://lobste.rs/s/ilwn5n/1_0_0

    10. 0

      edit: some of this is dumb. you are testing open and read, the python effectively is C. The ultimate conclusion is very interesting. Great work

      1. 38

        you are testing open and read, the python effectively is C

        You would hope so, but when you see one “effectively C” taking twice as long as another “effectively C” then there must be something other than open and read going on. That difference is what is interesting.

    11. 6

      A similar tip is to not put your personal files in the top level of $HOME, since it often becomes cluttered with other things. Instead, put your files in a sub-directory like $HOME/realhome.

      1. 12

        I am apparently some kind of Unix anarchist so I’ve done the opposite of this: I set my $HOME to /home/danso/home and configured my shells to open in /home/danso. I even made an alias ¬=/home/danso to recreate the functionality of ~ !

        It’s going great. Now all those rude programs which create files in $HOME are doing it in a subdir.

      2. 2

      Yep, in an ideal world that works perfectly, but the number of tools that assume $HOME=~ is just too high.

        1. 7

          ~ is pretty much just syntactic sugar for $HOME, though, right?

        2. 2

          Can you elaborate on why that would be a problem? Sorry if it’s obvious.

          1. 3

            I assume that you can override $HOME but ~ always comes from the user database.

            1. 7

              My immediate thought was “surely not?!” so I checked bash:

              If this login name is the null string, the tilde is replaced with the value of the shell parameter HOME. If HOME is unset, the home directory of the user executing the shell is substituted instead.

              and dash:

              If the username is missing (as in ~/foobar), the tilde is replaced with the value of the HOME variable (the current user’s home directory).
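
              For what it’s worth, the same precedence shows up outside the shell too; a quick illustration with Python’s os.path.expanduser (the override path is just an example):

              import os

              os.environ["HOME"] = "/tmp/elsewhere"   # override $HOME for this process
              print(os.path.expanduser("~"))          # /tmp/elsewhere, not the passwd entry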

    12. 11

      I don’t really see the problem with containers running as root. Root in the container is not root outside. On Linux, UID namespaces map this to a different UID. On FreeBSD, it’s marked as ‘in prison’ in the kernel and is trusted no more than any other unprivileged user. We stopped running things as root for two reasons:

      • Blast radius
      • Compromise persistence

      A bug in something running as root can take out the whole system. Even fairly benign things like allowing a log file to grow too much can be catastrophic because *NIX systems reserve a small amount of disk space for root to use to recover when the disk is full. This does not apply in containers. At worst, it can take out the container, but the host won’t let it completely fill the filesystem or exercise any of the normal root privileges because it’s not root.

      Compromise persistence is a big problem in general: a compromise of a root process can modify any configuration file or (unsigned) system binary (potentially including the kernel) or even modify system firmware, so guaranteeing that it’s fixed is almost impossible. In contrast, containers are immutable. If their configs are not part of the base image, they’re going to be a read-only bind mount or a volume. Things that should persist are on separate volumes and these need auditing in the event of a compromise, but these can typically be mounted noexec, so at least have defence against the easy attacks.

      For containers, you typically have one container per trust domain. If running as root in the container is a problem, the underlying problem is usually that you have too many things in the container.

      For a lot of things, the main benefit of the Dockerfile or Containerfile is a machine-checked set of build instructions. Build instructions in the docs may be stale, but the ones used to build the container are up to date.

      I’m not sure I buy the ease of use argument. I can build RPMs, Debs, and FreeBSD packages by setting a few flags in my CMakeLists.txt. If I create a FreeBSD package with the ports tree, I typically write less than I would for a Dockerfile, unless the build is particularly baroque. But if I have custom requirements for my deployment (e.g. must add these extra compiler flags for security, must link with this hardened allocator) then I find it much easier to look in the container build instructions than most packaging files (which include a load of convenience functions that hide the details of how the build is actually run).

      I do agree that I see it as a red flag if a container image is the only way that a package is distributed, or if the only build instructions are the container (or, worse, an Ansible script). These things are always going to be a pain to deploy in any non-default configuration and are probably tightly coupled to things by accident.

      1. 20

        I don’t really see the problem with containers running as root. Root in the container is not root outside.

        Root in the container has repeatedly offered access to kernel bugs that could be used escalate to root outside the container.

        1. 3

          Are we talking escalation by eventually mapping container root to host root, or escalation by container root having access to kernel mechanisms that eventually allow privileged access outside the container?

          I’m not very familiar with recent container escape vulns so I’m not on very steady ground here but the ones I remember from the first crop of container bugs largely abused the container - host mappings of users and “capabilities” (scare quotes because Linux doesn’t really have capabilities). Once Docker started using UID namespaces, that vector was pretty much gone, as any access to host resources that the container’s root could gain were subject to whatever restrictions were enforced on the (unprivileged) users to which container root was mapped.

          The way Docker does isolation on Linux is inherently limited by Linux’ weird isolation mechanisms (the plural form there being a problem in and of itself, too…), which meant that early on “container root” was very much a real root that the kernel second-guessed on every step (and, of course, you could never second-guess everything, so eventually some operations slipped through the cracks).

          I, too, was under the impression that this is no longer the case now and that container root really is unable to perform privileged operations on non-container resources, modulo privilege enforcement bugs which would otherwise be accessible to unprivileged container users, too. Do I have it wrong?

          (Edit: to clarify, I’ve mainly used Docker for development so when I’m asking if I have it wrong, I’m not passive-aggressively suggesting I’m not, I’m really asking if I’m wrong :-D. Sorry if that seems obvious but this is the Internet and all, sometimes this kind of feels like it needs to be said…)

          1. 15

            No, there were multiple escapes related to USERNS in particular, e.g.

            1. 7

              *sigh* so good to know things are still awful.

          2. 5

            Are we talking escalation by eventually mapping container root to host root, or escalation by container root having access to kernel mechanisms that eventually allow privileged access outside the container?

            The latter.

            https://sysdig.com/blog/detecting-mitigating-cve-2022-0492-sysdig/

            https://www.crowdstrike.com/blog/crowdstrike-discovers-new-container-exploit/

            There’s been others, but I’m being bad and replying from bed.

            I think of it this way: container root offers a larger attack surface.

        2. 2

          This is true, but for what it is worth you shouldn’t be trusting any code running in the same kernel as strongly isolated. The attack surface is just way too large. Even cross-VM attacks are common, so the truly paranoid shouldn’t be sharing hardware with any third party that isn’t largely trusted. I tend to treat UNIX users, Linux namespaces and FreeBSD jails as good enough isolation between first-party services, but not enough that I would consider an RCE in one service to not risk other services on the same machine.

          1. 3

            I could not agree more. I note that professional container hosts generally prefer hardware-assisted VM-per-container isolation.

      2. 2

        I don’t really see the problem with containers running as root. Root in the container is not root outside. On Linux, UID namespaces map this to a different UID.

        I haven’t seen this behavior with how Docker for Windows interacts with other WSL2 guests that share volume mounts. I constantly run into issues where a container’s root user has scribbled new directories with root perms all over, and while I haven’t tested it specifically, I don’t imagine that there’s something preventing a container guest’s root user from writing arbitrary setuid executables as root to the volume mount either.

      3. 2

        I think some of the filesystem complaints are downstream from “everyone” using Macs and not hitting the filesystem issues, because the filesystem layer in Docker for Mac basically handles everything. So when a Linux user shows up, they end up with a bunch of weird problems despite Linux supposedly being the “right” OS for all of this.

        To be honest I think this is a Docker failing, where they could have offered some “reasonable defaults” setup for file handling. But computers are complicated

    13. 38

      Sorry if I sound like a broken record, but this seems like yet another place for Nix to shine:

      • Configuration for most things is either declarative (when using NixOS) or in the expected /etc file.
      • It uses the host filesystem and networking, with no extra layers involved.
      • Root is not the default user for services.
      • Since all Nix software is built to be installed on hosts with lots of other software, it would be very weird to ever find a package which acts like it’s the only thing on the machine.
      1. 20

        The number of Nix advocates on this site is insane. You got me looking into it through sheer peer pressure. I still don’t like that it has its own programming language; it still feels like it could have been a Python library written in functional style instead. But it’s pretty cool to be able to work with truly hermetic environments without having to go through containers.

        1. 22

          I’m not a nix advocate. In fact, I’ve never used it.

          However – every capable configuration automation system either has its own programming language, adapts someone else’s programming language, or pretends not to use a programming language for configuration but in fact implements a declarative language via YAML or JSON or something.

          The ones that don’t aren’t so much config automation systems as parallel ssh agents, mostly.

          1. 6

            Yep. Before Nix I used Puppet (and before that, Bash) to configure all my machines. It was such a bloody chore. Replacing Puppet with Nix was a massive improvement:

            • No need to keep track of a bunch of third party modules to do common stuff, like installing JetBrains IDEA or setting up a firewall.
            • Nix configures “everything”, including hardware, which I never even considered with Puppet.
            • A lot of complex things in Puppet, like enabling LXD or fail2ban, were simply a […].enable = true; in NixOS.
            • IIRC the Puppet language (or at least how you were meant to write it) changed with every major release, of which there were several during the time I used it.
        2. 15

          I still don’t like that it has its own programming language

          Time for some Guix advocacy, then?

          1. 8

            As I’ll fight not to use SSPL / BUSL software if I have the choice, I’ll make sure to avoid GNU projects if I can. Many systems do need a smidge of non-free to be fully usable, and I prefer NixOS’ pragmatic stance (disabled by default, allowed via a documented config parameter) to Guix’s “we don’t talk about nonguix” illusion of purity. There’s interesting stuff in Guix, but the affiliation with the FSF is a no-go for me, so I’ll keep using Nix.

            1. 11

              Using unfree software in Guix is as simple as adding a channel containing the unfree software you want. It’s actually simpler than NixOS because there’s no environment variable or unfree configuration setting - you just use channels as normal.

              1. 13

                Indeed, the project whose readme starts with:

                Please do NOT promote this repository on any official Guix communication channels, such as their mailing lists or IRC channel, even in response to support requests! This is to show respect for the Guix project’s strict policy against recommending nonfree software, and to avoid any unnecessary hostility.

                That’s exactly the illusion of purity I mentioned in my comment. The “and to avoid any unnecessary hostility” part is pretty telling on how some FSF zealots act against people who are not pure enough. I’m staying as far away as possible from these folks, and that means staying away from Guix.

                The FSF’s first stated user freedom is “The freedom to run the program as you wish, for any purpose”. To me, that means prioritizing Open-Source software as much as possible, but pragmatically using some non-free software when required. Looks like the FSF does not agree with me exercising that freedom.

                1. 11

                  The “avoid any unnecessary hostility” is because the repo has constantly been asked about on official Guix channels and isn’t official or officially supported, and so isn’t involved with the Guix project. The maintainers got sick of getting non-Guix questions. You only have the illusion that there’s an “illusion” of purity with the Guix project - Guix is uninvolved with any unfree software.

                  To me, that means prioritizing Open-Source software as much as possible, but pragmatically using some non-free software when required.

                  This is both a fundamental misunderstanding of what the four freedoms are (they apply to some piece of software), and a somewhat bizarre, yet unique (and wrong) perspective on the goals of the FSF.

                  Looks like the FSF does not agree with me exercising that freedom.

                  Neither the FSF or Guix are preventing you from exercising your right to run the software as you like, for any purpose, even if that purpose is running unfree software packages - they simply won’t support you with that.

                  1. 5

                    Neither the FSF or Guix are preventing you from exercising your right to run the software as you like, for any purpose, even if that purpose is running unfree software packages - they simply won’t support you with that.

                    Thanks for clarifying what I already knew, but which you were conveniently omitting from your initial comment:

                    Using unfree software in Guix is as simple as adding a channel containing the unfree software you want. It’s actually simpler than NixOS because there’s no environment variable or unfree configuration setting - you just use channels as normal.

                    Using unfree software in NixOS is simpler than in Guix, because you get official documentation, and are able to discuss it in the project’s official communication channels. The NixOS configuration option is even displayed by the nix command when you try to install such a package. You don’t have to fish for an officially-unofficial-but-everyone-uses-it alternative channel.

            2. 4

              I sort of came to the same conclusion while evaluating which of these to go with.

              I think I (and a lot of other principled but realistic devs) really admire Guix and FSF from afar.

              I also think Guix’s developer UI is far superior to the Nix CLI, and I like the fact that Guile is used for everything, including even configuring the boot loader (!).

              Sort of how I admire vegans and other people of strict principle.

              OT but related: I have a 2.4 year old and I actually can’t wait for the day when he asks me “So, we eat… dead animals that were once alive?” Honestly, if he balks from that point forward, I may join him.

              1. 3

                OT continued: I have the opposite problem: how to tell my kids “hey, we try not to use the shhhht proprietary stuff here”.

                I have no trouble explaining to them why I don’t eat meat (nothing to do with “it was alive”, it’s more to help boost the non-meat diet for environmental etc reasons. Kinda like why I separate trash.). But how to tell them “yeah you can’t have Minecraft because back in the nineties people who taught me computer stuff (not teachers btw) also taught me never to trust M$”. So, they play Minecraft and eat meat. I… well I would love to have time to not play Minecraft :)

        3. 9

          I was there once. For at least 5-10 years, I thought Nix was far too complicated to be acceptable to me. And then I ran into a lot of problems with code management in a short timeframe that were… completely solved/impossible-to-even-have problems in Nix. Including things that people normally resort to Docker for.

          The programming language is basically an analogue of JSON with syntax sugar and pure functions (which then return values, which then become part of the “JSON”).

          This is probably the best tour of the language I’ve seen available. It’s an interactive teaching tool for Nix. It actually runs a Nix interpreter in your browser that’s been compiled via Emscripten: https://nixcloud.io/tour/

          I kind of agree with you that any functional language might have been a more usable replacement (see: Guix, which uses Guile which is a LISPlike), but Python wouldn’t have worked as it’s not purely functional. (And might be missing other language features that the Nix ecosystem/API expects, such as lazy evaluation.) I would love to configure it with Elixir, but Nix is actually 20 years old at this point (!) and predates a lot of the more recent functional languages.

          As a guy “on the other side of the fence” now, I can definitely say that the benefits outweigh the disadvantages, especially once you figure out how to mount the learning curve.

        4. 7

          The language takes some getting used to, that’s true. OTOH it’s lazy, which is amazing when you’re trying to do things like inspect metadata across the entire 80,000+ packages in nixpkgs. And it’s incredibly easy to compose, again, once you get used to it. Basically, it’s one of the hardest languages I have learned to write, but I find it’s super easy to read. That was a nice surprise.

        5. 3

          Python is far too capable to be a good configuration language.

        6. 3

          Well, most of the popular posts mainly complain about the problems that Nix strives to solve. Nix is not a perfect solution, but any other alternative is IMO worse. The reason for Nix’s success, however, is not in Nix alone, but in the huge repo that is nixpkgs, where thousands of contributors pool their knowledge.

      2. 8

        Came here to say exactly that. And I’d add that Nix also makes it really hard (if not outright impossible) for shitty packages to trample all over the file system and make a total mess of things.

      3. 6

        I absolutely agree that Nix is ideal in theory, but in practice Nix has been so very burdensome that I can’t in good faith recommend it to anyone until it makes dramatic usability improvements, especially around packaging software. I’m not anti-Nix; I really want to replace Docker and my other build tooling with it, but the problems Docker presents are a lot more manageable for me than those that Nix presents.

      4. 4

        came here to say same.

        although I have the curse of Nix now. It’s a much better curse though, because it’s deterministic and based purely on my understanding or lack thereof >..<

      5. 2

        How is it better to run a service as a normal user outside a container than as root inside one? Root inside a container = insecure if there is a bug in docker. Normal user outside a container typically means totally unconfined.

        1. 7

          No, root inside a container means it’s insecure if there’s a bug in Docker or the contents of the container. It’s not like breaking out of a VM, processes can interact with for example volumes at a root level. And normal user outside a container is really quite restricted, especially if it’s only interacting with the rest of the system as a service-specific user.

          1. 10

            Is that really true with Docker on Linux? I thought it used UID namespaces and mapped the in-container root user to an unprivileged user. Containerd and Podman on FreeBSD use jails, which were explicitly designed to contain root users (the fact that root can escape from chroot was the starting point in designing jails). The kernel knows the difference between root and root in a jail. Volume mounts allow root in the jail to write files with any UID but root can’t, for example, write files on a volume that’s mounted read only (it’s a nullfs mount from outside the jail and so root in the container can’t modify the mount).

            1. 10

              I thought it used UID namespaces and mapped the in-container root user to an unprivileged user.

              None of the popular container runtimes do this by default on Linux. “Rootless” mode is fairly new, and I think largely considered experimental right now: https://kubernetes.io/docs/tasks/administer-cluster/kubelet-in-userns/

              https://github.com/containers/podman/blob/main/rootless.md

            2. 8

              Is that really true with Docker on Linux?

              Broadly, no. There’s a mixture of outdated info and oversimplification going on in this thread. I tried figuring out where to try and course-correct but probably we need to be talking around a concept better defined than “insecure”

            3. 4

              Sure, it can’t write to a read-only volume. But since read/write is the default, and since we’re anyway talking about lazy Docker packaging, would you expect the people packaging to not expect the volumes to be writeable?

              1. 1

                But that’s like saying alock is insecure because it can be unlocked.

                1. 1

                  I don’t see how. With Docker it’s really difficult to do things properly. alock presumably has an extremely simple API. It’s more like saying OAuth2 is insecure because its API is gnarly AF.

        2. 3

          This is orthogonal to using Nix I think.

          Docker solves two problems: wrangling the mess of dependencies that is modern software and providing security isolation.

          Nix only does the former, but using it doesn’t mean you don’t use something else to solve the latter. For example, you can run your code in VMs or you can even use Nix to build container images. I think it’s quite a lot better at that than Dockerfile in fact.

        3. 2

          How is a normal user completely unconfined? Linux is a multi-user system. Sure, there are footguns like command lines being visible to all users, sometimes open default filesystem permissions or ability to use /tmp insecurely. But users have existed as an isolation mechanism since early UNIX. Service managers such as systemd also make it fairly easy to prevent these footguns and apply security hardening with a common template.

          In practice neither regular users or containers (Linux namespaces) is a strong isolation mechanism. With user namespaces there have been numerous bugs where some part of the kernel forgets to do a user mapping and think that root in a container is root on the host. IMHO both regular users and Linux namespaces are far too complex to rely on for strong security. But both provide theoretical security boundaries and are typically good enough for semi-trusted isolation (for example different applications owned by the same first party, not applications run by untrusted third parties).

    14. 8

      Jitter becomes more important as network performance increases, also.

      1. 4

        I never thought about this. Is it because the reliability and latency of the network introduce their own source of jitter, so the more reliable and low latency, the less variance?

        1. 4

          Yes. You always get some amount of jitter for free. So it hurts when the base amount isn’t sufficient.

        2. 1

          Exactly that. Out in the world it’s not all that important, but when it’s two machines on your side of a firewall, on very fast local wired network connections, jitter can save you from the kind of situation where you end up manually bringing machines back up one at a time, waiting, bringing up the next one, etc. to avoid triggering some cascading failure that keeps you from getting back up and running smoothly even when there’s no longer some other underlying problem to be addressed.
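
          For the curious, a tiny sketch of one common way to add that randomness (the “full jitter” approach; the base and cap values are arbitrary examples):

          import random

          def retry_delay(attempt, base=0.1, cap=30.0):
              # Exponential backoff, but sleep a uniformly random amount up to the cap,
              # so a fleet of clients restarting together doesn't retry in lockstep.
              return random.uniform(0, min(cap, base * (2 ** attempt)))

          print([round(retry_delay(n), 3) for n in range(6)])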

    15. 9

      I have seen spreadsheets that have survived generations of engineers at companies I’ve come to work at, and which have survived my own departure unbothered.

      The enduring power of spreadsheets, seemingly in the face of maintainability, scalability, revision control, and so forth–what the Boondocks would refer to as “talkin’ all that good shit”–should serve as a reality check for everyone that wants to spend their dwindling life force reinventing the wheel in the name of aesthetics.

      Even more bluntly: a properly functioning simple-but-ugly spreadsheet has made companies I’ve been at more profit than any individual set of devops or application initiatives, and more galling to us, it’s pretty much completely handled by normies.

      1. 3

        This is sort of like warning people away from inventing the table saw because lots of carpenters have hand saws that their fathers passed down to them. No one is arguing that a hand saw isn’t a good tool, but it is clear a table saw can be faster and produce better cuts.

        The reason why reinventing the spreadsheet is so popular is because it is clear that spreadsheets are an incredibly valuable tool, but also that they have major flaws (robustness, maintainability, …). I’m sure that some tool will hit the right set of tradeoffs to improve the situation for many use cases. But I don’t think anyone thinks it is a free lunch. Plus spreadsheets aren’t standing still. Microsoft has put in many reliability features into Excel over the years (although usually they are opt-in).

        So I say the opposite. Give it a try! Disrupt spreadsheets! You are unlikely to succeed. But the value if you do is immense.

    16. 2

      Another great feature is $CDPATH. This makes cd look for relative directories in all of these locations rather than just the current directory. Mine currently looks like .:/home/kevincox/p because I clone all of my projects and other git repos to ~/p. This way I can just type cd foo and jump into ~/p/foo no matter where I am. In my case I put the . first so that directories in the current directory take priority but that has never been a problem for me.

    17. 1

      One nice thing about setopt auto_cd is that it enables tab-completion for directories at the start of the prompt. This can make running scripts that aren’t in your path much easier. Usually script<Tab> doesn’t find anything (well, except the script command on your path), but with auto_cd on it will find scripts/, then you can continue tab-completing the files inside.

      Right now I sometimes work around this by typing ./ first so that directories (and files) in the current working directory will be suggested. So for example ./scr<Tab> will complete ./scripts/.

      I should probably see if this is a completion option, but the workaround of just enabling auto_cd even if I don’t use it is acceptable.