1. 7

    The irony is that he’s now trying to build better tools that use embedded DSLs instead of YAML files, but the market is so saturated with YAML that I don’t think the new tools he’s working on have a chance of gaining traction, and that seems to be the major source of the angst in that thread.

    One of the analogies I like about the software ecosystem is yeast drowning in the byproducts of their own metabolic processes after converting sugar into alcohol. Computation is a magical substrate but we keep squandering the magic. The irony is that Michael initially squandered the magic, and in the new and less magical regime his new tools don’t have a home. He contributed to the code-less mess he’s decrying, because Ansible is one of the buggiest and slowest infrastructure management tools I’ve ever used.

    I suspect that, like all hype cycles, people will figure it out eventually: Ant used to be a thing, and now it is mostly accepted that XML for a build system is a bad idea. Maybe eventually people will figure out that infrastructure as YAML is not sustainable either.

    1. 1

      What alternative would you propose to DSLs or YAML?

      1. 4

        There are plenty of alternatives. Pulumi is my current favorite.

        1. 3

          Thanks for bringing Pulumi onto my radar; I hadn’t heard of it before. It seems quite close to what I’m currently trying to master, Terraform. So I ended up here: https://pulumi.io/reference/vs/terraform.html – where they say

          Terraform, by default, requires that you manage concurrency and state manually, by way of its “state files.” Pulumi, in contrast, uses the free app.pulumi.com service to eliminate these concerns. This makes getting started with Pulumi, and operationalizing it in a team setting, much easier.

          Which to me seemed rather dishonest. Terraform’s way seems much more flexible and doesn’t tie me to Hashicorp if I don’t want that. Pulumi seems like a modern SaaS money vacuum: https://www.pulumi.com/pricing/

          The positive side, of course, is that doing many programmatic things in Terraform HCL is quite painful, as it tends to be in any non-Turing-complete language once you stray from the path the language designers built for you … Pulumi obviously handles that much better.
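
          To make the contrast concrete, here is a sketch of the kind of thing that is trivial in a general-purpose language but clumsy in HCL: creating several similarly configured resources in a loop. Note that `Bucket` below is a hypothetical stand-in for a Pulumi-style resource class, not a real API.

```python
# Hypothetical stand-in for a Pulumi-style resource class; real Pulumi
# programs would use classes from the pulumi_* provider packages.
class Bucket:
    def __init__(self, name, versioning=False):
        self.name = name
        self.versioning = versioning

# A plain loop replaces what would be copy-pasted blocks (or count/for_each
# gymnastics) in HCL.
buckets = [Bucket(f"logs-{region}", versioning=True)
           for region in ("us-east-1", "eu-west-1", "ap-south-1")]

print([b.name for b in buckets])
# ['logs-us-east-1', 'logs-eu-west-1', 'logs-ap-south-1']
```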

          1. 3

            I work at Pulumi. To be 100% clear, you can absolutely manage a state file locally in the same way you can with TF.

            The service does have a free tier though, and if you can use it, I think you should, as it is vastly more convenient.

            1. 3

              You’re welcome to use a local state file the same way as in Terraform.

            2. 1


        1. 1

          Can someone elaborate on what he means by agile burning people out?

          1. 6

            I interpreted it as in the context of free software contribution. It simply isn’t sensible on a modern software project to suggest spending considerable time working on random tooling, because project management style and language has evolved to the point where the value of doing so can’t even be captured. “I’d like to add [X] to software [Y]” .. “Great, but what user story does this relate to? We’re focusing on the billing feature this week - there are 17 tickets still open for that due end of sprint Friday”, etc.

          1. 3

            I like the diagrams on these pages. Does anyone recognize the tool? I guess it’s possible they’re simply hand-drawn with great care.

            1. 1

              It looks like they are “hand drawn” using Inkscape or something. I guess you could use Dia to achieve something similar.

              1. 2
            1. 1

              For reference, existing tools presently require full debug info installed, which on my Ubuntu VM is 583MB on disk for the main kernel image alone.

              1. 42

                This is what running a VM in Microsoft Azure means. Microsoft controls the underlying datacenter hardware, host OS, hypervisors etc.

                There is no defense against sustained physical access.

                1. 10

                  I think the distinction raymii is making is between online and offline access - yes, they can unrack the servers and look at your disks with a magnifying glass, but online access, where they can log in live to your running instance, is a different threat model. If you rack hardware somewhere, sure, they have your hardware, but they most likely don’t have (an equivalent of) the root password. This story surprised me.

                  1. 18

                    But we’re talking about virtual machines here, right? So you don’t need to unrack anything; your magnifying glass is just /proc/$(pgrep qemu)/mem (or whatever the hyper-v equivalent is), to peruse at your leisure, online, from the host.
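
                    For anyone who hasn’t seen it first-hand, this primitive is simple enough to sketch. The snippet below demonstrates it against the current process rather than a real qemu guest; it is Linux-only, reading another process’s memory this way additionally requires root or ptrace permission on the host, and the planted “secret” is obviously illustrative.

```python
import ctypes
import os

def read_process_memory(pid, address, length):
    """Read raw bytes from a process's address space via procfs.

    This is the same mechanism a hypervisor host can use against the
    qemu process backing a guest VM.
    """
    # buffering=0 so exactly `length` bytes are requested at `address`
    with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem:
        mem.seek(address)
        return mem.read(length)

# Plant a recognizable "secret" in our own address space...
secret = ctypes.create_string_buffer(b"hunter2-root-password")
addr = ctypes.addressof(secret)

# ...and recover it purely through the procfs interface.
leaked = read_process_memory(os.getpid(), addr, len(secret.value))
print(leaked)  # b'hunter2-root-password'
```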

                    (And even in the case of rented physical servers, there are still probably BMCs and such in scope that could achieve analogous things.)

                    1. 2

                      But that is still more work than just executing commands via an agent that’s already running. You still have to do something to get root access to each specific machine, instead of being able to script against some agent and access all machines at once.

                      Leaving your door unlocked is one thing; setting it wide open with a sign “Enter here” is another.

                      1. 2

                        On the plus side, though it is “easy” it also appears to be logged and observable within the VM, which is the part most obviously unlike actual backdoors.

                    2. 13

                      There is absolutely nothing that can be done from within a VM to prevent the host flipping a bit and backdooring it arbitrarily, or snapshotting it without shutting it down and doing the same. I’d be very surprised if all the big names didn’t have this functionality available internally – at least Google supports live migration, which is the same tech.

                      There are open toolkits for doing arbitrarily nasty poking and introspection on a running VM, e.g. the Volatility framework.

                      Hard to point fingers at Microsoft here.

                      1. 3

                        Moreover, live migration of VMs has been available in widely deployed VMware ESXi software since the early 2000s. I suppose even longer than that on Big Iron.

                      2. 2

                        They can access the memory. That is equivalent to a root password. IMHO, CPU-supported memory encryption like Intel SGX is snake oil at best if you are targeted by the physical host of your VM.

                        Hosting in the cloud is a matter of trust and threat analysis.

                      3. 5

                        I’m really surprised that everybody seems to treat this as common knowledge and consider it normal. I don’t like my hosting provider having this level of access to my data and machines. Surely we are smart enough to find a way to host infrastructure without giving up on all security…

                        1. 26

                          With managed virtualized infrastructure, “this level of access” is completely unavoidable. They run the virtualized hardware your “server” is running on; they have complete memory and CPU state access, and they can change anything they want.

                          I guess writing a guest-side agent makes backdooring things marginally simpler, but their actual capabilities are totally unchanged.

                          This is something that ought to be common knowledge, but unfortunately doesn’t seem to be.

                          1. 1

                            The risk of your provider taking a snapshot of your disk and RAM is always there with virtualization. But you could encrypt the disk, which would make it harder for them (they have to scan RAM for the key, then decrypt). But just an agent with root privileges… What bothers me the most, I guess, is that it is not made clear. A note in /etc/issue or motd saying “we have full root access in your VM, read http://kb.ms.com/kb77777 for more info” would make it clear right from the get-go.

                            1. 10

                              (they have to scan ram for the key, then decrypt)

                              Not even that, just put a backdoor in the BIOS, boot loader, initramfs, or whatever code is used to unlock the encrypted disk to intercept key entry.

                              1. 3

                                Do you know of any isolated / trusted vm like solution? Where provider access is mitigated?

                                1. 12

                                  No. Even the various “gov clouds” are mainly about isolation from other customers and data center location.

                                  The cloud providers are executing the cpu instructions that the VM image provided by you (or picked from the store) contains. There isn’t any escaping that access level.

                                  The only option is to actually run your own physical hardware that you trust in an environment you consider good enough.

                                  1. 4

                                    In my comment about host and TLA resistance, I had a requirement for setups resistant to domestic TLAs that might give orders for secrets to be turned over, or that might use advanced attacks (which are getting cheaper and more popular). It can be repurposed for an untrusted-host setup.

                                    “If it has to be U.S. and it’s serious, use foreign operated anti-tamper setup. The idea is all sensitive computations are run on a computer stored in a tamper detecting container that can detect radiation, temperature changes, power surges, excessive microwaves, etc. Tamper detection = data wipe or thermite. The container will be an EMSEC safe and the sensors/PC’s will always be located in a different spot in it. The system is foreign built and operated with the user having no control of its operation except what software runs in deprivileged VM’s in it. Status is monitored remotely. It helps to modify code so that most sensitive stuff like keys are stored in certain spot in memory that will be erased almost instantly.”

                                    The clouds aren’t built anything like this. They have total control like those in physical possession of hardware and software almost always have total control. They can do what they want. You won’t be able to see them do it most of the time without some clever detection mechanisms for security-relevant parts of the stack. That’s before we get to hardware risks.

                                    Bottom line: external providers of computing services should always be considered trusted with full access to your data and services. By default. Every time. It’s why I encourage self-hosting of secrets. I also encourage pen, paper, and people for the most confidential stuff. Computers aren’t as trustworthy.

                                    1. 3

                                      What is your threat model?

                                      There is something based on SELinux for Xen, https://wiki.xen.org/wiki/Xen_Security_Modules_:_XSM-FLASK, which can by design prevent the privileged “dom0” from reading the memory of unprivileged guest domains. But that assumes you trust your provider to actually implement this when they say they do.

                                  2. 7

                                    A note in /etc/issue or motd with “we have full root acces in your vm, read http://kb.ms.com/kb77777 for more info” would make it clear right from the get-go.

                                    I think this is a combination of “common knowledge, so not worth mentioning specially” for users who already know this, and “let sleeping dogs lie” for people who don’t. I mean, why rub people’s noses in a fact that the competitor is equally mum about? It seems like a bad PR move; you’d get clueless people all alarmed and leaving your platform for reasons that are totally bogus, as any competitor has the same kind of access.

                            1. 16

                              Dell XPS 13 9350 (over two years old now). Previous two were ThinkPad X series. None of them with 15” displays, though.

                              My main problem with cheap laptops, and even some expensive “consumer market” laptops is flimsy keyboards with poor key travel or (worse) flex in the top of the chassis when typing (I’m a relatively heavy typist.)

                              Plus I value a docking station or a USB type C cable where I can quickly plug in/out at my desk.

                              (Your priorities may vary, of course.)

                              If you’re on a budget, I recommend looking for something highly specced and a couple of years old. My laptop before this one was bought used (two years old) and had belonged to the CTO of a high-frequency trading company. It was optioned up completely when new, so build quality and specs were still way above anything available new at that price.

                              1. 5

                                I have had the XPS 13 9343 for around three years, I think. It’s great.

                                If you’re on a budget

                                I bought this particular one refurb from Amazon for ~$900. I feel like I gambled and got lucky.

                                After having used this one for so long, I think I’d prefer a laptop with more memory. Everything else has been excellent.

                                1. 1

                                  I bought my laptop used as well. It was an in-person sale and the seller let me test it, so it didn’t feel like a huge gamble, but it was more time-consuming.

                                2. 4

                                  Another (new) XPS user. I’m enjoying it so far; I had a Zenbook before this, and it was cheap components by comparison. I’ve only had mine for 3 months, but so far I’m very happy.

                                  1. 2

                                    Thanks. I’m looking at the XPS 15; the non-touch model is a strong contender.

                                    1. 4

                                      I have an XPS 15 running Linux – no trouble whatsoever, and it’s an amazingly nice experience.

                                      1. 3

                                        Maybe I just got a bad release, because I’ve usually had good luck with Dells, but my XPS 15 had tons of thermal problems. The battery started swelling and popped off the trackpad! It was a refurb unit off eBay (but Dell certified), so who knows.

                                      2. 3

                                        After dragging my heels forever, I finally settled on an XPS last week as a replacement for the endless series of 2011 Macbook Pros I’ve been wearing out for the past 10 years (2007 Macbooks before that). I don’t like buying new hardware, so ended up with a 4K 9550 / i7 quad / 32 GB RAM from eBay.

                                        The machine is almost everything I was hoping for, including the touchpad, with one exception: the panel response time is so bad you could measure it with a sand timer. Looking around, it seems this is a long-running complaint with the XPS line. I’m chatting with the seller to see whether he repasted the machine or whether there is some trick to make the panel behave sanely, but otherwise it looks like this is not the Macbook replacement I’ve been dreaming of :(

                                        Currently travelling with my trusty beaten up “hobo” Macbook Pro and its barely functional keyboard – it’s almost impossible to beat this machine, and it’s increasingly looking like its final replacement is going to be yet another 2011 Macbook Pro

                                        Note that many of the XPS 13 models have soldered disk / RAM.

                                      3. 2


                                        If you are willing to spend as much, the XPS 15” is great. For a cheaper option, consider Dell’s Inspirons: https://www.dell.com/en-us/shop/dell-laptops/new-inspiron-15-7000/spd/inspiron-15-7580-laptop/dncwwc120h. They used to be of awful quality, but the new series is decent (15” 1080p IPS, metallic body, thin bezels, great Linux support, reliable build quality, and dual drives – SSD and HDD together). I’ve been using one myself for over a year now. But don’t expect more than 3 hours of battery life for serious work, the webcam is garbage, and the aluminium edges will cut your wrists.

                                      1. 3

                                        Urgh, damn it. I guess I should download Wikipedia while Europeans like me are still allowed to access all of it… It’s only 80 GB (wtf?) anyway.

                                        1. 3

                                          That and the Internet Archive. ;)

                                          Regarding Wikipedia, do they sell offline copies of it so we don’t have to download 80GB? Seems like it would be a nice fundraising and sharing strategy combined.

                                          1. 3

                                            I second this. While I know the content might change in the near future, it would be fun to have memorabilia of a digital knowledge base. I regret throwing away the Solaris 10 DVDs that Sun sent me for free back in 2009. I was too dumb back then.

                                            1. 2

                                              It’s a bit out of date, but there’s wikipediaondvd.com, and lots more options at dumps.wikimedia.org.

                                              I wonder how much traffic setting up a local mirror would entail, might be useful. Probably the type of thing that serious preppers do.

                                              1. 1

                                                You can help seeding too.

                                            2. 4

                                              Actually Wikipedia is exempt from this directive, as is also mentioned in the linked article. While I agree that this directive will have a severely negative impact on the internet in Europe, we should be careful not to rely on false arguments.

                                              1. 1

                                                Do you remember the encyclopedias of the 90s? They came on a single CD. 650MB.

                                                1. 5

                                                  To be explicit, this is not a “modern systems are bloated” thing. The English Wikipedia has an estimated 3.5 billion words. If you took out all the multimedia, talk pages, metadata, and edit history, it’d still be 30 GB of raw, uncompressed text.
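
                                                  A quick back-of-envelope check of that figure (the word count is from above; the bytes-per-word value is a rough assumption for English prose):

```python
# Rough size estimate: words * (average word length + separator byte).
words = 3.5e9            # estimated words in the English Wikipedia
avg_word_len = 5.1       # assumed average English word length, in characters
bytes_per_word = avg_word_len + 1  # one byte of space/punctuation per word

size_gb = words * bytes_per_word / 1e9
print(f"~{size_gb:.0f} GB of bare prose")
```

                                                  That is bare prose alone; wiki markup, links, and tables plausibly push the total toward the 30 GB quoted above.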

                                                  1. 4

                                                    Oh, that’s not what I was implying. The commenter said “It’s only 80 GB (wtf?)”

                                                    I too was surprised at how small it was, but then remembered the old encyclopedias and realized that you can put a lot of pure text data in a fairly small amount of space.

                                                    1. 1

                                                      Remember that they had a very limited selection, with low-quality images, at least on the ones I had. So it makes sense there’s a big difference. I feel you, though, on how we used to get a good pile of learning in a small package.

                                                    2. 1

                                                      30 GB of raw text uncompressed

                                                      That sounds like a fun text encoding challenge: try to get that 30GB of wiki text onto a single layer DVD (about 4.6GB?)

                                                      I bet it’s technically possible with enough work. AFAIK Claude Shannon experimentally showed that human-readable English text carries only about one bit of information per character. Of course there are lots of languages, but they must each have some optimal encoding. ;)

                                                      1. 2

                                                        Not even sure it’d be a lot of work. Text packs extremely well; IIRC compression ratios over 20x are not uncommon.

                                                        1. 1

                                                          Huh! I think gzip usually achieves about 2:1 on ASCII text, and lzma up to roughly twice that. At least one of those two beliefs must be incorrect, then.

                                                          Okay, so let’s make it challenging: same problem, but this time a 700MB CD-R. :)
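
                                                          For reference, the compression ratios the two versions of the challenge would require (using the rough 30 GB figure from upthread):

```python
# Required ratio = uncompressed corpus size / target medium capacity.
corpus_gb = 30.0   # raw text estimate quoted upthread
dvd_gb = 4.6       # single-layer DVD
cd_gb = 0.7        # 700 MB CD-R

print(f"DVD:  {corpus_gb / dvd_gb:.1f}x needed")   # 6.5x
print(f"CD-R: {corpus_gb / cd_gb:.1f}x needed")    # 42.9x
```

                                                          The DVD target is plausible with a strong general-purpose compressor; the CD-R target needs serious text-specific modelling.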

                                                          1. 4

                                                            There is actually a well-known text compression benchmark based around Wikipedia, the best compressor manages 85x while taking just under 10 days to decompress. Slightly more practical is lpaq9m at 2.5 hours, but with “only” 69x compression.

                                                            1. 1

                                                              What does 69x compression mean? Is it just 30 GB / 69 = .43 GB compressed? That doesn’t match up with the page you linked, which (assuming it’s in bytes) is around 143 MB (much smaller than .43 GB).

                                                              1. 5

                                                                From the page,

                                                                enwik9: compressed size of first 10e9 bytes of enwiki-20060303-pages-articles.xml.

                                                                So 10e9 = 9.31 GiB. lpaq9m lists 144,054,338 bytes as the compressed output size + compressor (10e9/144,054,338 = 69.41), and 898 nsec/byte decompression throughput, so (10e9*898)/1e9/3600 = 2.49 hours to decompress 9.31GiB.
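
                                                                The arithmetic above is easy to reproduce (all figures as quoted in the comment, with “10e9” read as 10^10 bytes, which is what the quoted ratio implies):

```python
# Reproducing the quoted benchmark arithmetic.
input_bytes = 10e9               # "10e9" as used in the arithmetic above
compressed_bytes = 144_054_338   # lpaq9m output size as quoted
ns_per_byte = 898                # quoted decompression throughput

ratio = input_bytes / compressed_bytes
size_gib = input_bytes / 2**30
hours = input_bytes * ns_per_byte / 1e9 / 3600

print(f"{ratio:.2f}x compression")     # 69.42x
print(f"{size_gib:.2f} GiB input")     # 9.31 GiB
print(f"{hours:.2f} h to decompress")  # 2.49 h
```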

                                                              2. 1

                                                                Nice! Thanks.

                                                  1. 15

                                                    This is completely moronic and will have a massive negative impact. It requires websites to deploy technology that literally doesn’t exist. There is no tool that can work out whether content is copyrighted or not. Even humans struggle to tell the difference between fair use and not; how on earth will a computer tell the difference between a copyrighted recording of a public-domain song and a public-domain recording of a public-domain song? Audio-wise, the two are almost exactly the same thing.

                                                    Of course those with power don’t care, because systems exist that can detect all of their content; the system will only harm everyone else, whose content will be constantly removed by automated systems that can’t tell they did nothing wrong.

                                                    1. 4

                                                      It sounds like this is referring to article 13, but article 13 has little in common with what was described above. Ignoring the explicit exemptions made for small businesses and education (arguably including Wikipedia), it obliges information providers that store and provide public access to large amounts of copyright-protected works to take appropriate and proportionate measures to ensure the protection of those works, such as implementing effective technologies. It also places the onus on rightsholders to supply the data necessary to power effective countermeasures.

                                                      In other words, the requirement is only on a particular class of (IMHO well-described) web sites to implement countermeasures, those measures must be deemed proportionate and effective, and the data that powers them must be supplied by the rightsholder.

                                                      The example given is of a countermeasure that falls short of the plain criteria - if an ineffective technology creates a needless burden on the operation of a large copyrighted content web site, it is easily argued to be neither appropriate nor proportionate.

                                                      1. 2

                                                        “Easily argued” where? When the regulators come to you and say you are in violation… where do you fight that? How do you fight that? What are the costs? Being right has very little to do with it being easy; you can easily be right and still be bankrupted by regulation and legal fees.

                                                        1. 5

                                                          The same applies to every new regulation; article 13 is no exception. Like GDPR, it will slowly be tested out, and those cases will be paid for by a wide variety of deep pockets interested in such things, as you’re no doubt already aware. By the time it filters down to the little guy, most of the legal legwork will already have been done, and words like “large”, “proportionate” and “effective” will have well-defined meanings.

                                                          It only takes one case from e.g. a big content aggregator site to begin setting precedent.

                                                          Just to be clear, this isn’t my department, and I didn’t spend any time reading article 11, but it’s tiring to see a pirate party representative’s blog post repeated everywhere with no counterpoints, especially when that post is demonstrably inaccurate according to the most basic reading of the regulation.

                                                          1. 4

                                                            The same applies to every new regulation

                                                            Which is why regulations should not be broad or generalized. “Large”, “proportionate” and “effective” are very much words that get defined at will by those given the power to interpret them. I don’t like regulations with those words any more than an official “Be ‘nice’ and ‘kind’ to your neighbor, or face the penalties” – “nice” and “kind”, of course, to be defined later, by someone who is not you.

                                                            and those cases will be paid by a wide variety of deep pockets

                                                            I don’t see how you are so certain of this; GDPR absolutely hasn’t been paid for by the “deep pockets” alone. Its cost has hurt many of the “excluded small businesses” (under 250 employees), because the follow-on inclusion criteria were so broad as to bring many small businesses back into needing to deal with GDPR – in ways that impacted costs, cut employee headcount, and even forced transnational relocation.

                                                            it’s tiring to see a pirate party representative’s blog post repeated everywhere with no counterpoints

                                                            Fair enough, and I understand the contrarian instinct. I get in trouble for it in my home all the time, I will almost reflexively argue the other side – not the best plan for domestic bliss.

                                                            1. 2

                                                              I will almost reflexively argue the other side – not the best plan for domestic bliss.

                                                              Same here. I don’t know why I do this.

                                                    1. 3

                                                      Never heard of billiard before; it looks like a ripe place to mine unknown fixes for forking quirks. I guess it’s preaching to the choir on lobste.rs, but here is another long yet hopelessly incomplete list of reasons you should avoid fork like the plague. There’s a reason newer OS designs like NT never publicly supported it.

                                                      1. 2

                                                        I shared this kind of as an experiment; it’s very detailed in a specific way and not really widely applicable to lots of people. Mostly it’s documentary essay material that can eventually migrate in part to the docs, but there is always the possibility someone has some brilliant related idea or correction to make :)

                                                        1. 1

                                                          Thank you for the planet. There seem to be about 100 blogs/feeds coming into the planet, but the planet RSS feed is just 100 items, most of which seem to come from just a couple of blogs that don’t have proper timestamps?

                                                          1. 2

                                                            Well spotted… it wasn’t apparent yesterday, but I just fixed an SSL problem and suddenly there are quite a few. I’ll remove any more I spot, but please, feel free to go crazy on pull requests :)

                                                            edit: this is way more broken than I thought. Planet doesn’t seem to do anything about feeds that lack timestamps, which is surprising. Anyone got a recommendation for better software? The main value in the existing setup is the Travis configuration and the list of feed URLs.

                                                            edit: OK, I /think/ I’ve got it this time… there were some bad settings in there, and squelching untimestamped feeds doesn’t happen after the first time they’re seen, so I had to wipe the cache and start again.
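
                                                            The squelching logic itself is simple to sketch, whatever aggregator ends up being used. A minimal stdlib-only illustration (RSS 2.0 assumed; a real aggregator would also need Atom’s `updated` element handled):

```python
import xml.etree.ElementTree as ET

def items_with_timestamps(rss_xml):
    """Keep only feed items that carry a usable pubDate."""
    root = ET.fromstring(rss_xml)
    kept = []
    for item in root.iter("item"):
        date = item.findtext("pubDate")
        if date and date.strip():
            kept.append(item.findtext("title"))
    return kept

feed = """<rss version="2.0"><channel>
  <item><title>dated</title><pubDate>Mon, 02 Jul 2018 08:00:00 GMT</pubDate></item>
  <item><title>undated</title></item>
</channel></rss>"""

print(items_with_timestamps(feed))  # ['dated']
```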

                                                            1. 1

                                                              I’m tempted to write something better, or at least help improve what you have currently got working :)

                                                              1. 1

                                                                I once authored a planet generator named Uranus, but I don’t really maintain it anymore. It does have the advantage of not having any dependencies other than Ruby, though (no gems, just plain stdlib). There’s another planet generator named Pluto that is still maintained.

                                                            1. 6

                                                              Thanks - I was going to suggest doing something similar but didn’t get around to even making a suggestion :( Perhaps it can be made “official” and we could have a planet.lobste.rs?

                                                              1. 6

                                                                I think there may be room for it! Although I note Planet itself is starting to show its age quite badly. Still, it’s hard to beat the simplicity, and with a little theming and e.g. setting a max length on the articles, things might start to look very nice.

                                                                Can’t hate on Planet too much; I was setting this up for private use before I realized it might be worth sharing. The fact that Planet is still a go-to RSS aggregator is quite impressive given its vintage!

                                                                1. 2

                                                                  Good point about the age of Planet - I’ve not looked around seriously for a replacement myself but a few of the alternatives are also a bit long in the tooth. Moonmoon looks to be maintained, although it’s written in PHP rather than Python.

                                                              1. 4

                                                                I write a mixture of shoot-from-the-hip spam and technical deep dives, usually after too much coffee, and often with at least some tangential relation to Python. Very occasionally I’ve accidentally broken some big stories. For the past year I’ve been documenting progress on developing Mitogen and its associated Ansible extension.


                                                                Favourite tech: Guerilla optimization for PyPy, Deploying modern apps to ancient infrastructure, Fun with BPF

                                                                Favourite rants: I will burn your progress bar on sight, Data rant

                                                                1. 2

                                                                  I learned of this project some time ago when you posted about it in one of the “what are you working on” posts. Since then I’ve been waiting to use this, since I only really use ansible twice a year, and on those occasions I have to work remotely over a very slow and convoluted connection (bouncing through multiple hosts, then traveling down a home DSL connection and a PtP WiFi link in someone’s garage). When I use ansible there are time constraints: the infra in the garage is only online for so long every weekend, so ansible runs that take 40 minutes or longer are super annoying (especially during the developing and testing stage). This project looks promising to me, I plan on using it very soon for my work and I hope to see improvements to my productivity as a result, thanks!

                                                                  1. 1

                                                                    Its network profile has “evolved” (read: regressed!) a little since those early days, but it should still be a massive improvement over a slow connection. Running against a local VM with simulated high latency works fine, though I’ve never run a direct comparison of vanilla vs. Mitogen with this setup.

                                                                    That’s a really fun case you have there – would love a bug report even just to let me know how the experience went.

                                                                    edit: that’s probably a little unfair. Roundtrips have reduced significantly, but to support that the module RPC has increased in size quite a bit. For now it needs to include the full list of dependency names. As a worst-case example, the RPC for the “setup” module is 4KiB uncompressed / 1KiB compressed, due to its huge dependency list. As a more regular case, a typical RPC for the “shell” module is 1177 bytes uncompressed / 631 bytes compressed.

                                                                    A lot of this is actually noise and could be reduced to once-per-run rather than once-per-invocation, but it requires a better “synchronizing broadcast” primitive in the core library, and all my attempts to make a generic one so far have turned out ugly.

                                                                  1. 3

                                                                    It’s probably way out of the intended scope, but could Mitogen be used for basic or throwaway parallel programming or analytics? I’m imagining a scenario where a data scientist has a dataset that’s too big for their local machine to process in a reasonable time. They’re working in a Jupyter notebook, using Python already. They spin up some Amazon boxes, each of which pulls the data down from S3. Then, using Mitogen, they’re able to push out a Python function to all these boxes, and gather the results back (or perhaps uploaded to S3 when the function finishes).
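
                                                                    The scatter/gather shape of that workflow can be sketched with the stdlib alone. Here `summarize` and the local worker processes are hypothetical stand-ins for the per-host analysis function and the Amazon boxes; a tool built on Mitogen would swap the executor for remote contexts:

```python
from concurrent.futures import ProcessPoolExecutor

def summarize(chunk):
    # Hypothetical stand-in for the data scientist's per-host function.
    return sum(chunk) / len(chunk)

def scatter_gather(data, workers=4):
    # Split the dataset into one slice per worker, push the function
    # out to each, and gather the partial results back.
    chunks = [data[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize, chunks))

if __name__ == "__main__":
    print(scatter_gather(list(range(100))))  # → [48.0, 49.0, 50.0, 51.0]
```

The final gather step (or an S3 upload) is where a remote-execution library would earn its keep; locally it is just `pool.map` collecting results.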

                                                                    1. 3

                                                                      It’s not /that/ far removed. Some current choices would make processing a little more restrictive than usual, and the library core can’t manage much more than 80MB/sec throughput just now, limiting its usefulness for data-heavy IO, such as large result aggregation.

                                                                      I imagine a tool like you’re describing with a nice interface could easily be built on top, or maybe as a higher level module as part of the library. But I suspect right now the internal APIs are just a little too hairy and/or restrictive to plug into something like Jupyter – for example, it would have to implement its own serialization for Numpy arrays, and for very large arrays, there is no primitive in the library (yet, but soon!) to allow easy streaming of serialized chunks – either write your own streaming code or double your RAM usage, etc.
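
                                                                      The “stream serialized chunks without doubling RAM” idea can be approximated in plain Python with `memoryview`, which slices a buffer without copying it. This is only an illustration of the property, not a Mitogen API; `stream_chunks` is a hypothetical name:

```python
def stream_chunks(buf, chunk_size=64 * 1024):
    # Yield zero-copy views over an already-serialized buffer.
    # memoryview slicing never materializes a second copy of the
    # payload, so peak RAM stays ~1x the serialized size.
    view = memoryview(buf)
    for offset in range(0, len(view), chunk_size):
        yield view[offset:offset + chunk_size]

payload = bytes(256 * 1024)  # stands in for a pickled NumPy array
chunks = list(stream_chunks(payload))
print(len(chunks), sum(len(c) for c in chunks))  # → 4 262144
```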

                                                                      Interesting idea, and definitely not lost on me! The “infrastructure” label was primarily there to allow me to get the library up to a useful point – i.e. permits me to say “no” to myself a lot when I spot some itch I’d like to scratch :)

                                                                      1. 3

                                                                        This might work, though I think you’d be limited to pure python code. On the initial post describing it:

                                                                        Mitogen’s goal is straightforward: make it child’s play to run Python code on remote machines, eventually regardless of connection method, without being forced to leave the rich and error-resistant joy that is a pure-Python environment.

                                                                        1. 1

                                                                          If it’s just simple functions you run, you could probably use pySpark in a straightforward way to go distributed (although Spark can handle much more complicated use-cases as well).

                                                                          1. 2

                                                                            That’s an interesting option, but it presumably requires you to have Spark set up first. I’m thinking of something a bit more ad-hoc and throwaway than that :)

                                                                            1. 1

                                                                              I was thinking that if you’re spinning up AWS instances automatically, you could probably also configure a Spark cluster to be set up with them, and with that you get the benefit that you neither have to worry much about memory management and function parallelization nor about recovery in case of instance failure. The performance aspect of pySpark (mainly Python object serialization/memory management) is also actively worked on, transitively through pandas/pyArrow.

                                                                              1. 2

                                                                                Yeah that’s a fair point. In fact there’s probably an AMI pre-built for this already, and a decent number of data-science people would probably be working with Spark to begin with.

                                                                        1. 17

                                                                          You’d save yourself a lot of trouble upfront by not borrowing the filezilla name: it’s trademarked. There’s already an argument over whether a “-ng” suffix constitutes a new mark, so why invite the trouble? Just rename it completely.

                                                                          Hilariously their trademark policy seems to prohibit their use of their own name

                                                                          1. 3

                                                                            Oh, great point. We will need to think of a new name.

                                                                            How about godzilla-ftp?

                                                                            1. 14

                                                                              How about filemander? It’s still in the same vein as “zilla,” but far more modest. The fact that you’re refusing cruft provides a sense of modesty.

                                                                              Also, “mander” and “minder” — minder maybe isn’t exactly right for an FTP client, but it’s not completely wrong…

                                                                              1. 4


                                                                                Great name! A quick ddg search does not show any existing projects using it.

                                                                                1. 1

                                                                                  And it sounds a bit like “fire mander”, which ties in well with the mythological connections between salamanders and fire.

                                                                                  1. 1

                                                                                    Yeah, the intention was to have a cute salamander logo–way more modest a lizard than a “SOMETHINGzilla!”

                                                                                2. 8
                                                                                  1. 5

                                                                                    Just remember to make sure it’s easy for random people to remember and spell. They’ll be Googling it at some point.

                                                                                1. 2

                                                                                  The USR1 signal is supported on Linux for making dd report its progress. Implementing something similar for cp and other commands, and then making ctrl-t send USR1, can’t be too hard. Surely it is not blocked by the Linux kernel itself?
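
                                                                                  On GNU/Linux this can be tried directly: dd installs its SIGUSR1 handler at startup and prints I/O statistics to stderr each time the signal arrives (the `dd_stats.log` filename here is just for capturing that output):

```shell
# Start a long copy in the background, capturing dd's stderr.
dd if=/dev/zero of=/dev/null bs=1M 2>dd_stats.log &
pid=$!
sleep 1                  # let dd install its handler and start copying
kill -USR1 "$pid"        # dd prints "N+0 records in/out ..." and keeps going
sleep 1
kill "$pid"              # now actually stop it
wait "$pid" 2>/dev/null || true
cat dd_stats.log
```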

                                                                                  1. 8

                                                                                    SIGUSR1 has a nasty disadvantage relative to SIGINFO: by default it kills the receiving process if no handler is installed. 🙁 The behavior you really want is SIGINFO’s, which defaults to a no-op when no handler is installed.

                                                                                    • I don’t want to risk killing a long-running complicated pipeline that I was monitoring by accidentally sending SIGUSR1 to some process that doesn’t have a handler for it
                                                                                    • there’s always a brief period between process start and the call to signal() or sigaction() during which SIGUSR1 will be lethal
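
                                                                                    Both points are easy to demonstrate from Python; a minimal sketch, delivering the signal to our own process:

```python
import os
import signal

done = 0

def report(signum, frame):
    # Cheap progress hook: report state without interrupting the work.
    print(f"processed {done} items")

# Until this call runs, SIGUSR1 would terminate the process --
# that's the startup race window described above.
signal.signal(signal.SIGUSR1, report)

for done in range(1, 4):
    pass  # pretend to process an item

os.kill(os.getpid(), signal.SIGUSR1)  # simulate `kill -USR1 <pid>`
```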
                                                                                    1. 1

                                                                                      That’s interesting. The hacky solution would be to have a whitelist of processes that could receive SIGUSR1 when ctrl-t was pressed, and just ignore the possibility of someone pressing ctrl-t at the very start of a process.

                                                                                      A whitelist shouldn’t be too hard to maintain. The only tool I know of that handles SIGUSR1 is dd.

                                                                                    2. 5

                                                                                      On BSD it’s part of the TTY layer, where ^T is the default value of the STATUS special character. The line printed is actually generated by the kernel itself, before sending SIGINFO to the foreground process group. SIGINFO defaults to ignored, but an explicit handler can be installed to print some extra info.

                                                                                      I’m not sure how equivalent functionality could be done in userspace.

                                                                                      1. 1

                                                                                        It would be a bit hacky, but the terminal emulator could send USR1 to the last started child process of the terminal, when ctrl-t is pressed. The BSD way sounds like the proper way to do it, though.

                                                                                        1. 4

                                                                                          I have a small script and a tmux binding for linux to do this:

                                                                                           # tmux-signal pid [signal] - send signal to running processes in pid's session
                                                                                           # bind ^T run-shell -b "tmux-signal #{pane_pid} USR1"
                                                                                           [ "$#" -lt 1 ] && exit 1
                                                                                           sid=$(cut -d' ' -f6 "/proc/$1/stat")
                                                                                           sig=${2:-USR1}
                                                                                           ps -ho state,pid --sid "$sid" | \
                                                                                           while read state pid; do
                                                                                                   case "$state" in
                                                                                                   R) kill -s"$sig" "$pid" ;;
                                                                                                   esac
                                                                                           done
                                                                                          1. 4

                                                                                            Perfect, now we only need to make more programs support USR1 and lobby for this to become the default for all Linux terminal emulators and multiplexers. :)

                                                                                    1. 10

                                                                                      This post is pure fluff. It hints at Linux preferring throughput over latency in some cases, but fails to give a single concrete example of that being true. It’s reminiscent of the popular “BSD vs. Linux” arguments I heard (and sadly accepted as gospel) in the late 90s

                                                                                      1. 1

                                                                                         The user still has to allow the website USB access to the device, and if they are one of the people who own these USB keys they are probably smart enough not to allow it.

                                                                                        1. 12

                                                                                          This comment contains content of such life-changing awesomeness that we must request your permission before revealing it to you. If you dare, click “Accept” in the dialog now displayed at the top of the window.

                                                                                          I’ve been interested in all things infosec for around 20 years at this point, and I /still/ regularly get those permission dialogs wrong, and the problem isn’t me, the problem is that the dialogs exist at all. Nobody can be expected to get them right, and even when present the behaviour they gate should never be game-ending as in WebUSB

                                                                                          1. 11


                                                                                            • There’s some other vulnerability that combines with this
                                                                                            • Some killer consumer device uses WebUSB, dramatically increasing the number of users
                                                                                            • You’re drunk or sleep deprived or distracted
                                                                                            • The malware exploits a dark pattern
                                                                                            1. 9

                                                                                              they are probably smart enough not to allow it.

                                                                                              Such users empirically do not exist.

                                                                                              1. 4

                                                                                                [citation needed]

                                                                                                At quick glance, I found this study: https://pdfs.semanticscholar.org/4c40/c0ea6b02630839658ba7939dd609c621bf17.pdf

                                                                                                Popular opinion holds that browser security warnings are ineffective. However, our study demonstrates that browser security warnings can be highly effective at preventing users from visiting websites: as few as a tenth of users click through Firefox’s malware and phishing warnings. We consider these warnings very successful.

                                                                                                People do react to unknown notifications. (The study goes on and talks about the efficiency of such notifications related to their design)

                                                                                                Sure, at enterprise scale, that still means something is going through, so you might want to deploy your browser with appropriate policies which deny such request every time if you want.

                                                                                                1. 1

                                                                                                  Oh, neat link!

                                                                                                  That said, with a million users only a tenth is still a hundred thousand.

                                                                                                  1. 1

                                                                                                     Sure, but we have that problem on so many levels.

                                                                                                    For individuals, it’s protection, for cohorts, less so.

                                                                                            1. 1

                                                                                               I don’t understand how it’s possible to pick all three here: “full-native speed”, a single address space OS (everything in ring 0), and security. I believe you can only pick two.

                                                                                              1. 1

                                                                                                Well, that’s what nebulet is trying to challenge.

                                                                                                  1. 1

                                                                                                    I haven’t yet read the whole paper but in the conclusion they say that performance was a non-goal. They “also improved message-passing performance by enabling zero-copy communication through pointer passing”. Although I don’t see why zero-copy IPC can’t be implemented in a more traditional OS design.

                                                                                                    The only (performance-related) advantage such design has in my opinion is cheaper context-switching, but I’m not convinced it’s worth it. Time (and benchmarks) will show, I guess.

                                                                                                    1. 1

                                                                                                      When communication across processes becomes cheaper than posting a message to a queue belonging to another thread in the same process in a more traditional design, I’d say that that’s quite a monstrous “only” benefit.

                                                                                                       I should have drawn your attention to section 2.1 in the original comment; that’s where your original query is addressed. Basically the protection comes from static analysis, a bit like the original Native Client or Java’s bytecode verifier.