1. 3

    Given the “localhost” hard coded there, my guesses would be:

    • someone has the tcp port forwarded to production and all the configuration variables overridden to match production credentials in order that they can use their WIP working copy code to view live production data. I think this is most plausible.
    • that script was run on production instead of on someone’s isolated dev machine
    • disgruntled insider left a logic bomb
    • maybe it was actually something else unrelated? coincidental ransomware hit at the same time, which happened to succeed in deleting the data but failed to deliver the ransom note? Eh, kinda far-fetched.

    A safe replacement for this would be to put your local dev DB in a sandbox (VM or container) that isn’t even orchestrated via the same mechanism as prod. Recreate it by deleting & recreating the sandbox.

    Also: much sympathy. Btdt, never want to again.

    1. 1

      I’d agree with your most-plausible prediction, and hence will probably take something valuable away from the article: programmatically make sure localhost is really localhost before you do something destructive, and only keep code that’s designed for mass destruction if it’s absolutely necessary. You could imagine this mistake ending in “Error: refusing to run on production system” or “Access denied: invalid credentials” and everyone sleeping well.
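
      For what it’s worth, a hostname check alone would not catch the port-forwarding case, since a forwarded port still looks like localhost. One possible guard, sketched here under assumptions, is to ask the database itself which environment it belongs to before doing anything destructive. The meta table, the environment key, and every function name below are hypothetical and not from the article; SQLite is used only so the sketch is self-contained.

      ```python
      import sqlite3


      class RefusingToRun(RuntimeError):
          """Raised when the target database does not identify itself as a dev database."""


      def environment_of(conn):
          # Hypothetical convention: every database carries a `meta` table with a row
          # naming its environment ("dev", "staging", "production").
          row = conn.execute(
              "SELECT value FROM meta WHERE key = 'environment'"
          ).fetchone()
          return row[0] if row else None


      def wipe_dev_database(conn):
          """Delete all users, but only if the database says it is a dev database."""
          env = environment_of(conn)
          if env != "dev":
              raise RefusingToRun(f"refusing to run on {env!r} database")
          conn.execute("DELETE FROM users")
          conn.commit()


      def make_db(env):
          """Build a throwaway in-memory database tagged with the given environment."""
          conn = sqlite3.connect(":memory:")
          conn.execute("CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT)")
          conn.execute("CREATE TABLE users (name TEXT)")
          conn.execute("INSERT INTO meta VALUES ('environment', ?)", (env,))
          conn.execute("INSERT INTO users VALUES ('alice')")
          conn.commit()
          return conn
      ```

      If the meta table is missing entirely, the query raises sqlite3.OperationalError, so the script still fails closed instead of deleting anything.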

      1. 1

        someone has the tcp port forwarded to production and all the configuration variables overridden to match production credentials in order that they can use their WIP working copy code to view live production data. I think this is most plausible.

        take something valuable away from the article: programmatically make sure localhost is really localhost before you do something destructive

        There’s a much simpler “take away” from this: Don’t ever do that. Like, ever. I mean ever.

        There is literally zero reason to connect your local, untested, in-dev code, to a production database. None. Zip. Zilch. Nada. No matter what “but..” you can think of, there is a better solution.

        To clarify something, I didn’t say no one should have access to prod. I’m talking about processes, not access rights. To use a git analogy, if prod is the main/default/stable branch of your codebase, the take away is, “don’t do dev on the main branch”.

        1. 1

          Agree that it is an absolute footgun of an idea, but it’s definitely possible to do by accident in disorganized or overly permissive environments.

          1. 1

            Anything is possible “by accident” if you have zero idea what you’re doing.

            1. 1

              Plenty of accidents are caused by people who thought they knew damn well what they were doing.

    1. 1

      for (;;) {
          ...
      }
      
      1. 1

        yes sure. also

        while(true) { ... }

        but i think this is much clearer code instead of abusing notation for things they were never intended for..

      1. 1

        This is certainly lazier than encoding the interleaved advertisements and content as a single MPEG DASH stream.

        1. 4

          This lost me within the first paragraph. Others are not required to use their own resources to publish your chosen speech for free. Declining such a transaction is not “censorship”. Seek remedy in antitrust law IMO.

          1. 3

            Same here. I have a lot of sympathy for individuals or groups who are caught up by social media moderation - sometimes in error - but to complain that the freakin’ New York Post, a newspaper that’s part of one of the largest media companies in the world, is being “silenced” is beyond silly.

            The Post is free to promote the story on its own property - and when I checked yesterday it didn’t seem to bother to do so.

            1. 3

              It’s absolutely censorship, done by powerful entities that control ostensibly-public and widely used mediums of information flow. It’s really no different from the majority of the citizens in some place where a particular religious tradition is strong ordering the local library to remove books whose contents offend that religious tradition. This doesn’t prevent other libraries from stocking those books. It doesn’t even prevent the New York Post from writing an approving article detailing exactly what those banned books are, with the implication that you might want to read them yourself. It’s still censorship.

              1. 2

                I can publish on the internet myself for less than five dollars per month using a common carrier. You’re entitled to speech, not venue. (N.B. And please don’t pretend that political disagreements are akin to religious discrimination.)

            1. 9

              Youtube-dl allows me to time-shift, space-shift, and format-shift videos to devices that don’t have Youtube clients, browsers, or even network connections. I do this in the privacy of my own home to avoid filling landfills and spending money that doesn’t need to be spent. I seem to remember some DMCA exemptions in/around this area.

              1. 0

                hope this can help an independent project with no legal team

              1. 2

                I guess it’s safe to say that mmap is a sort of DMA mechanism? Really an interesting model. I guess most computers spend a lot of time reading files, but I’m always surprised at the diversity and the tooling around something that often (in high-level languages) just gets mapped to “here’s a stream of bytes”

                1. 3

                  I’ve always thought about it as an inevitable logical consequence of demand paging implemented by a hardware MMU: e.g. “if we can do it for program memory and swap files, why not all memory and normal files; there’s hardware support”. There are even some neat tricks you can do with madvise(2) to control automatic (potentially parallel) readahead.
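
                  A minimal sketch of the madvise idea in Python, assuming Python 3.8+ on a platform that exposes mmap.MADV_SEQUENTIAL (the hint is simply skipped otherwise). The helper name map_with_readahead_hint is made up for illustration, and the hint is only advisory: the kernel is free to ignore it.

                  ```python
                  import mmap
                  import tempfile


                  def map_with_readahead_hint(fileobj):
                      """Map a whole file read-only and hint that it will be read sequentially."""
                      m = mmap.mmap(fileobj.fileno(), 0, access=mmap.ACCESS_READ)
                      # madvise() and the MADV_* constants only exist where the OS supports
                      # them, so probe for them rather than assuming.
                      if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                          m.madvise(mmap.MADV_SEQUENTIAL)
                      return m


                  with tempfile.TemporaryFile() as f:
                      f.write(b"x" * (1 << 20))  # 1 MiB of throwaway data
                      f.flush()
                      with map_with_readahead_hint(f) as m:
                          chunks = 0
                          # Walk the mapping front to back in 64 KiB steps.
                          while m.read(65536):
                              chunks += 1
                  ```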

                  Would love to re-run the author’s experiment with the kernel’s memmove implementation patched in place of the compiler-generated AVX instructions.

                  1. 1

                    “DMA” usually refers to external devices like a NIC or GPU accessing RAM without the CPU’s involvement; I haven’t heard anyone refer to anything that happens on-CPU as “DMA”.

                    something that often (in high-level languages) just gets mapped to “here’s a stream of bytes”

                    Well, that’s the more “common” interface because stream processing is just a very common thing to do (think grep/sed, streaming parsers, sending contents of a file to a socket, writing a log file…) but like, both RAM and SSD are random-access devices, so mapping a file into an address space is the interface that matches the underlying storage more closely.

                    Streams are for rolls of magnetic tape :)

                    On a more serious note though, there is plenty of support in higher level languages for mmap.
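
                    As one example, Python ships an mmap module in its standard library. A small sketch comparing the two interfaces on a throwaway temp file (the file contents and offsets are made up for illustration):

                    ```python
                    import mmap
                    import tempfile

                    with tempfile.TemporaryFile() as f:
                        f.write(b"0123456789" * 1000)
                        f.flush()

                        # Stream style: position the cursor, then read.
                        f.seek(5005)
                        via_read = f.read(5)

                        # Mapped style: the file behaves like a byte sequence you can
                        # index, slice, and search directly.
                        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                            via_mmap = m[5005:5010]
                            last_nine = m.rfind(b"9")
                    ```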

                    1. 1

                      my understanding of DMA comes more from stuff like game consoles, where you’re writing to certain memory addresses to do stuff to the GPU (or like… reading memory to see the state of buttons, instead of making some sort of system call). This might not actually be the right use of the term though.

                      Seeing stuff like the Python thing, or the original post…. is there any reason not to go with mmap for files? Since it seems that behind the scenes it does trickery to basically accomplish the same as syscalls?

                      1. 4

                        you’re writing to certain memory addresses to do stuff to the GPU

                        I think what you’re describing is memory-mapped I/O, or kernel-bypass I/O when you’re doing memory-mapped I/O from a process on an OS without going via the OS. DMA is when a device directly accesses memory. You can often trigger this via MMIO. For example, writing commands into a ring buffer and then poking a memory-mapped control register to instruct the device to read the next entries from memory (and, on a GPU, those commands may trigger additional DMAs to read data from other bits of memory).

                        Seeing stuff like the Python thing, or the original post…. is there any reason not to go with mmap for files?

                        There are a bunch of reasons:

                        • It consumes virtual address space. Less of a problem on 64-bit systems (though they’re typically actually 48-bit systems, in terms of virtual address size), but calling mmap on a lot of 16GiB files will use a nontrivial amount of your address space. Depending on how fragmented it is, you will eventually run out, though that might not matter in practice (for example, lld mmaps pretty much everything and even a big program doesn’t require it to map more than around 10GiB of stuff, which is trivial on a 64-bit system).
                        • The sharing semantics are a bit weird. When you read, you get a snapshot at the current time. With mmap and MAP_SHARED, you will see a view of the file that another process can modify while you’re reading it (i.e. in between reading two adjacent bytes). If you use MAP_PRIVATE, you have a private copy, but only after the first time a page has actually been mapped for you. If you touch page A in a region, another process modifies the data corresponding to page B, and then you read page B, it is entirely nondeterministic whether you will see the ‘before’ or ‘after’ view of the page. If the OS happens to have populated that page for you, you will see the ‘before’ view, if it has not then you will see the ‘after’ view. Any subsequent access will see whatever you saw the first time.
                        • There’s no good way of modifying the length of a file via mmap-like views. You can separately ftruncate the file, but if you make it smaller then you will have stale mappings (I think), if you make it larger then you need to map the additional bits. Linux has mremap, but it is similar to realloc and will break any pointers that you have into that region. With write, you can trivially append to a file. Doing the same thing with mmap is much harder.
                        • The mmap family of things only work for file-like objects. They don’t work for things like sockets or pipes, which are inherently stream-oriented. If you want your code to work on any kind of readable file descriptor, using read will work, using mmap will not. You may still choose to use mmap for performance reasons if the file descriptor is a mappable object, but now you have two independent code paths to test. If performance doesn’t matter too much, just using read / write is probably fine.

                        The read / write and mmap mechanisms address different problems. Both enable some use cases that the other doesn’t.
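
                        A rough Python sketch of the sharing and file-like-only points above, assuming Linux or a similar Unix with a unified page cache (behavior elsewhere may differ). The helper can_mmap and the sample data are invented for the demonstration:

                        ```python
                        import mmap
                        import os
                        import tempfile


                        def can_mmap(fd, length):
                            """Return True if the OS maps `length` bytes of fd, False if it refuses."""
                            try:
                                with mmap.mmap(fd, length, access=mmap.ACCESS_READ):
                                    return True
                            except (OSError, ValueError):
                                return False


                        with tempfile.TemporaryFile() as f:
                            f.write(b"hello")
                            f.flush()
                            file_ok = can_mmap(f.fileno(), 5)

                            # Shared mapping: a later write to the file shows up in the
                            # view that was created before the write happened.
                            with mmap.mmap(f.fileno(), 5, access=mmap.ACCESS_READ) as view:
                                before = view[:]
                                os.pwrite(f.fileno(), b"HELLO", 0)
                                after = view[:]

                        # A pipe holding the same bytes cannot be mapped, but read() works.
                        read_end, write_end = os.pipe()
                        os.write(write_end, b"hello")
                        pipe_ok = can_mmap(read_end, 5)
                        pipe_data = os.read(read_end, 5)
                        os.close(read_end)
                        os.close(write_end)
                        ```

                        Code that has to accept any readable descriptor would need both paths: try the mapping, and fall back to read when it is refused.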

                  1. 3

                    Please get a TLS certificate and serve this over https.

                    1. 8

                      TBH i get it when people don’t want to set up HTTPS for every server. I’ve had issues with TLS in the past, that even broke domain names, because of some minor mistakes here and there.

                      1. 5

                        “Minor mistakes here and there” are an opportunity to learn. Now that we have Let’s Encrypt, there are few excuses not to provide secure connections everywhere.

                        Personally, I have mostly stopped visiting websites that offer insecure http only.

                        1. 3

                          Personally, I have mostly stopped visiting websites that offer insecure http only.

                          Well I guess if you’re not visiting the site your opinion doesn’t really count for much. There are reasons to support HTTP, and reasons not to use HTTPS. Just because they might not apply to you doesn’t mean they don’t apply to the person creating the content. Nobody owes you a HTTPS connection.

                          1. 4

                            True, I’m just stating my preferences and asking nicely.

                          2. 3

                            I’m talking about Let’s Encrypt; I would never have set TLS up if it weren’t free.

                            Personally, I have mostly stopped visiting websites that offer insecure http only.

                            I don’t get why? What’s the problem, especially if it’s a personal or a hobby site? No accounts, no important data, nothing to care about.

                            1. 2

                              I live in a country where the government is spying on its citizens and is logging all data connections.

                              1. 2

                                Denmark?

                                1. 2

                                  Yes, Denmark.

                              2. 1

                                Private entities in the United States (e.g. your hotel, that hotspot you used over coffee) often make use of user data to further solicit commercial transactions. I’d take it as a fun excuse to play with Let’s Encrypt – or to see if you can get your CA to issue you a domain-validated certificate backed by an 8192-bit key (higher is probably possible, but it sacrifices compatibility and is a bit too absurd even for me).

                            2. 1

                              I agree, TLS is often a huge barrier for someone, e.g. whose device clock is not set.

                              I don’t think it is always necessary for just reading text.

                              If your ISP is MITMing you to the extent that this is an issue, you’ve got bigger problems.

                            3. 5

                              Yah, no, why?

                              I mean, the spec for twtxt doesn’t require https. Non-modern computer systems can’t use https, stuff like plan9 and such.

                              Why is it important that this be put over https? Why not comment on the spec, or on the implementation of the site, or pretty much anything to do with the OP than this load of nonsense.

                              Seriously, this comment is like something you’d see on /. or HN.

                              1. 1

                                Chill, man. For reasons that I have already made clear in this thread, I have a strong preference for encrypted connections, and all I did was ask nicely for https. That’s a comment that is just as valid as if I’d commented on the CSS or on the concept of aggregation. Implementing https doesn’t necessarily remove http, so people who wish to connect over an insecure connection can still do so. The opposite is not true: if https is unavailable, you cannot choose it.

                                I’ve been on twtxt for more than three years and the site mentioned is not a new one. I’m not a big fan of these aggregation sites: they keep obsolete feeds around and put a burden on the publisher of the original individual twtxt streams to update them or have them removed.

                            1. 2

                              I’ve dropped a live database by accident as well, and seen a few colleagues do the same. Happens to the best of us, but mistakes like that let you grow quickly.

                              Enjoy the journey in figuring out what went wrong!

                              1. 1

                                This is a tangent, but is there any good work on human factors in SRE? I’ve always done some informal things (prompt coloring, aliases to warn, trying to actually sleep, etc.) – but has anyone presented a more systematic approach?

                                1. 2

                                  Fine-grained ACLs. Only let one user from one location do drops. It should be like a ceremony to get a database dropped.

                                  1. 1

                                    I think the molly-guard (apparently named after the plastic cover that guards a physical flip switch) package is a good example. It prevents you from accidentally running shutdown or reboot on remote hosts by requiring you to type in the name of the host before actually running the command. Very useful when you forget that you’re ssh’d into a host on a certain xterm and absentmindedly want to reboot your own computer.

                                    Would be nice if there were more things like this!
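
                                    The core of that idea is small enough to sketch. This is not molly-guard’s actual implementation, just an illustration of the same check in Python; confirm_hostname and its parameters are hypothetical names, and the ask and hostname arguments exist only so the check can be exercised without a terminal.

                                    ```python
                                    import socket


                                    def confirm_hostname(action, ask=input, hostname=None):
                                        """Return True only if the operator types this machine's hostname."""
                                        expected = hostname or socket.gethostname()
                                        typed = ask(
                                            f"About to {action}. "
                                            "Type the hostname of this machine to continue: "
                                        )
                                        # Anything other than an exact match counts as "no".
                                        return typed.strip() == expected
                                    ```

                                    A destructive script could call confirm_hostname("drop the database") first and exit unless it returns True.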

                                1. 1

                                  I always thought it to mean “cast as”, i.e. “to view a thing in a particular limited or different way”.

                                  1. 16

                                    Locate the offset at 3M lines, then use fallocate to punch a hole at that offset.

                                    1. 3

                                      Specifically with --collapse-range, which I wasn’t aware of until now. It’ll drop data and renumber the blocks for you; neat.
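
                                      A rough sketch of the “locate the offset” step in Python. The helper names are hypothetical. One caveat, stated as an assumption about the filesystem: collapse-range generally requires the offset and length to be multiples of the filesystem block size, so the byte offset is rounded down and a partial block of old lines may remain at the front.

                                      ```python
                                      def offset_after_lines(path, n, chunk_size=1 << 20):
                                          """Return the byte offset just past the n-th newline.

                                          Returns None if the file has fewer than n lines.
                                          """
                                          seen = 0      # newlines counted so far
                                          position = 0  # byte offset of the current chunk
                                          with open(path, "rb") as f:
                                              while chunk := f.read(chunk_size):
                                                  count = chunk.count(b"\n")
                                                  if seen + count >= n:
                                                      # The n-th newline is in this chunk.
                                                      idx = -1
                                                      for _ in range(n - seen):
                                                          idx = chunk.index(b"\n", idx + 1)
                                                      return position + idx + 1
                                                  seen += count
                                                  position += len(chunk)
                                          return None


                                      def round_down_to_block(offset, block_size):
                                          """Largest multiple of block_size not above offset."""
                                          return offset - (offset % block_size)
                                      ```

                                      With the rounded value in hand, the call would look something like fallocate --collapse-range --offset 0 --length <rounded offset> <file>, where the block size might come from os.statvfs(path).f_bsize.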

                                    1. 6

                                      If management is sad and perplexed and regretful to see an employee leave, then that employee probably did not demand enough change from their managers. Management in this scenario, and in so many others, is not interested in the best interests of the employee. Accordingly, if an employee is depressed or otherwise experiencing negative mental health effects because of employment, then they have a reasonable expectation that their manager, if competent, must do something to improve the situation.

                                      1. 14

                                        +1. I’ve seen engineers fall into two camps when they’re unhappy with their job. The really vocal ones who never stop fighting / advocating for what they want / should happen, and the ones who just bottle it up and get really quiet.

                                        The former will either get what they want eventually, or they will leave but no one will be surprised.

                                        The latter though… they get stuck in their head, working at the company becomes increasingly depressing / stressful, and one day they just quit. And everyone is surprised because, well, they didn’t say anything (or didn’t communicate it in a way that others perceived as important / etc.).

                                        Not saying this is necessarily true of the author, but a more general comment.

                                        It’s always the silent ones.

                                        1. 9

                                          Why ask for something from your manager and risk getting fired when you can keep your job, find a new job, and then resign at your own time on your own timeline?

                                          Maybe your manager won’t fire you, but why take that risk? You don’t owe them anything.

                                          1. 6

                                            This is worth answering.

                                            The last two times I spoke up to management like this are still clear in my mind. In one situation, upper management had instructed employees in ways which contradicted federal law; in another situation, state law had been enacted which affected our products. In the former case, I spoke up not for myself, but for other employees who did not understand that they were being disenfranchised; in the latter case, I spoke up not for myself, but for the entire business’s legal safety.

                                            I don’t owe my managers anything, yes, but I do certainly owe my employer and fellow employees quite a bit. I have an ethical obligation, as well as possible legal obligations.

                                            Also, to be frank, I don’t really mind taking a break of a few months between jobs. Employment is difficult, employers are terrible people, and the entire system of wealth and labor extraction makes me sick and tired.

                                            1. 2

                                              I do certainly owe … fellow employees quite a bit

                                              Definitely! This is a great point. If you have the privilege to be able to handle it, sticking your neck out for fellow employees is absolutely worth trying.

                                            2. 4

                                              Managers and ICs are not antagonists. The manager’s job is to interface with other teams and keep the IC out of endless meetings and business processes. This shouldn’t be a yes-boss situation.

                                              1. 4

                                                Except the manager has the power to fire the underling, and the underling has no influence over the manager. The power imbalance is real, even if most managers like to pretend it is not.

                                              2. 3

                                                I’m really confused why asking for any of these things would put you at risk of being fired? I’m a manager and I cannot imagine thinking about firing someone for expressing their desire for things to be different or better.

                                                1. 3

                                                  I cannot imagine thinking about firing someone for expressing their desire for things to be different or better.

                                                  Well, then I’m glad you’re a good manager :) But it is not generally safe to assume your manager is a good manager. “Asking for things to be different or better” is very contentious, and is definitely not just a safe thing to do in general.

                                                  1. 2

                                                    Absolutely agree with this.

                                                    If there is genuine concern about being fired for asking a question about additional opportunities, it may be how the question is framed?

                                                    1. 1

                                                      You’re making the right call, but not everyone does. I lost a job because of this. I escalated a normalization of deviance issue (social and technical – extreme callousness in communication re: legitimate high-severity defects, and an absurdly high potential for PR and/or technical damage to the enterprise), and it ultimately ended my tenure there. I can’t say any more, unfortunately – that’s how bad it went.

                                                1. 2

                                                  I can sympathize with the author.

                                                  Long ago, in a medical setting, I participated as an analyst in a study that charged a “retainer fee” to patients who would ordinarily have fallen under the hospital’s charity policy. Results revealed that a significant number of people were subject to this fee while still paying full price, out of pocket, due to being at 400%+ of the Federal Poverty Level. This work is published, and I – as a data analyst – saved the paper from a retraction by arguing with a department chair for an hour about external validity. It didn’t change the results, but entire providers had been omitted from the EHR pull, which was corrected pre-publication.

                                                  I am still angry and upset about my participation to this day. People in need – predominantly suffering from depression, anxiety disorders, and metabolic disorders, not to mention pregnancies – were told to pay more than $100 extra at the point of care in order to be seen, or were diverted. This was deeply unethical and I regret my participation.

                                                  I’ve considered writing to the journal to try to lessen my own feelings of guilt, and may well do so. I don’t know if it hurt anyone. I don’t know if a depressed patient walked out after that and committed suicide. Nobody else does either.