1. 5

    I’m unconvinced that the size of binaries is correlated at all with any metric people actually care about. Anecdotally, people used to write games in assembly, and then C++ - both languages that produce reasonably sized binaries - but nowadays it’s common to include interpreters (Lua, etc.), drivers for many different controllers, whatever crap the Unity standard library includes, etc. This is great for dev productivity, but has no value to the consumer (or even negative value, since they need to download all of that).

    I know that this is mentioned in the post, but I think that it completely undermines the point of all of the analysis that uses binary size.

    1. 2

      It’s mostly the size of the assets. Binaries are nothing compared to them.

      1. 1

        C++ - both languages that produce reasonably sized binaries

        Wait, what?

        Including a single template in your C++ code can easily dwarf the size of the Lua interpreter.

      1. 4
        y = false, true; // returns true in console
        console.log(y); // false (left-most)
        

        Huh, that definitely tripped me up for a second. Is this because the comma is higher precedence than the assignment?

        1. 9

          The assignment to y belongs wholly to the expression on the left side of the comma operator: y = false. The left and right sides of the comma operator don’t interact. The comma operator is just a way to squeeze in two or more expressions where only one expression is valid, e.g. the first clause of a for loop. The list of expressions is treated as a single expression that always evaluates to the result of the right-most expression. For that reason y = (false, true) has your expected result. Along the same lines var x = 1, y = 2 expands to var x; var y; x = 1; y = 2; because of variable hoisting.
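
          To make that concrete, here’s a quick sketch (mine, not from the comment above) that you can paste into a console:

              var y;

              y = false, true;   // parses as (y = false), true; the whole expression is true
              console.log(y);    // false, because the assignment only saw `false`

              y = (false, true); // parentheses group the comma expression first
              console.log(y);    // true

              // the comma operator is handy where only one expression fits,
              // e.g. the update clause of a for loop:
              for (var i = 0, j = 3; i < j; i++, j--) {}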

        1. 6

          Recently there’s been a lot of discussion of keyboard latency, as it is often much higher than reasonable. I’m interested in how much the self-built keyboard community is aware of the issue. Tristan Hume recently improved the latency of his keyboard from 30ms to 700µs.

          1. 2

            The Planck that Dan and I tested had 40ms of latency - not sure how much that varies from unit to unit though.

            1. 3

              I would expect very little, using the QMK firmware with a custom keymap. There’s typically only a handful of lines of C with a couple of ifs, no loops.

            2. 2

              Why are those levels of latency problematic? I would think anything under 50ms feels pretty much instantaneous. Perhaps for people with very high typing speeds or gamers?

              1. 1

                The end-to-end latency on a modern machine is definitely noticeable (often in the 100s of ms). Many keyboards add ~50 ms alone, and shaving that off results in a much nicer UX. It is definitely noticeable comparing, say, an Apple 2e (~25ms end-to-end latency) to my machine (~170ms end-to-end latency, IIRC).

              2. 1

                I recall reading about that. I’ll see about getting some measurements made, and see what it’s like on my Planck.

                I’m interested in how much the self-built keyboard community is aware of the issue

                I haven’t really seen much about it :/ If we could find an easy way of measuring latency without needing the RGB LEDs and camera, that would be good.

                1. 2

                  A simple trick: use a contact microphone (piezo) and jack it into something like https://www.velleman.eu/products/view/?id=435532

              1. 2

                I sort of disagree that this should be personal preference - Git itself, as well as many prominent figures, advocates for imperative. Seeing as this is already relatively standard, I don’t see any benefit in taking other approaches. In order to be convinced that another approach is better, I’d have to be convinced that the benefits are worth doing something non-standard, which I think is unlikely for the vast majority of projects.

                Like most style-guide things, I didn’t like it at first, but stockholm syndrome has set in and it’s all good now :P

                1. 14

                  I am back from Recurse Center. I’m catching up with family and friends, and doing the hundred put-off chores that come with returning from a long trip and finishing a big project. Probably not much code in my future this week, but I hope to sneak in some Advent of Code.

                  1. 5

                    Thanks for writing that report! It sounds like it was a very productive and fun trip.

                    And thanks a lot for hosting lobste.rs too! It’s been a great resource for me while I develop my shell.

                    I was considering attending Recurse, as a change of environment to “finish” up my shell in 2018. One silly question: do they have computers there? Or is everyone coding on a laptop? Do they have monitors?

                    I looked here and couldn’t find the answer:

                    https://www.recurse.com/manual#sec-environment

                    It’s a little silly, but I’m most productive on Linux, while my laptop is a Mac. I have used VirtualBox but somehow it feels a little wrong. Probably something to do with the screen size. Also my tests take a fair amount of computing power.

                    Do they have a printer there? Another thing is that I frequently print out CS papers to read (I don’t like reading long docs on a laptop or tablet.)

                    They aren’t dealbreakers as I can make my own arrangements, but I’m just curious.

                    1. 4

                      Folks bring their own computers. Mostly that’s laptops, but one or two people brought desktop PCs. I think it was because they wanted the processing power for ML tasks, but you could bring one just because you prefer it, sure. There are a half-dozen monitors available for use. There are two printers, one of which can take print jobs via email (I spent an hour or two with cups but never got either working).

                      1. 4

                        Just going to chime in to say that RC is great! I finished up an 18-week stint there a month or so ago, and I really enjoyed my time there.

                        Re: printing out papers - as @pushcx mentioned, there are a couple printers, and one of the parts of RC that I enjoyed a lot was finding interesting papers that folks had printed out lying around the space and reading them :)

                        1. 3

                          Hi! RC is awesome – I’m not sure if they have a printer but there are probably between 5 and 10 monitors, and almost always some are free.

                          1. 1

                            There is at least one working laser printer there as of May of this year :)

                      1. 11

                        Hey @loige, nice writeup! I’ve been aching to ask a few questions to someone ‘in the know’ for a while, so here goes:

                        How do serverless developers ensure their code performs to spec (local testing), handles anticipated load (stress testing) and degrades deterministically under adverse network conditions (Jepsen-style or chaos-testing)? How do you implement backpressure? Load shedding? What about logging? Configuration? Continuous Integration?

                        All instances of applications written in a serverless style that I’ve come across so far (admittedly not too many) seemed to offer a Faustian bargain: “hello world” is super easy, but when stuff breaks, your only recourse is $BIGCO support. Additionally, your business is now non-trivially coupled to the $BIGCO and at the mercy of their decisions.

                        Can anyone with production experience chime in on the above issues?

                        1. 8

                          Great questions!

                          How do serverless developers ensure their code performs to spec (local testing)

                          AWS, for example, provides a local implementation of Lambda for testing. Otherwise normal testing applies: abstract out business logic into testable units that don’t depend on the transport layer.
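
                          For instance, a minimal sketch of that separation (mine; the names and discount rule are made up):

                              // business logic: a plain function, no AWS types, unit-testable directly
                              function applyDiscount(order) {
                                var rate = order.total >= 100 ? 0.1 : 0;
                                return { id: order.id, total: order.total * (1 - rate) };
                              }

                              // thin transport layer: only unwraps the Lambda event and wraps the response
                              exports.handler = function (event, context, callback) {
                                var order = JSON.parse(event.body);
                                callback(null, { statusCode: 200, body: JSON.stringify(applyDiscount(order)) });
                              };

                              exports.applyDiscount = applyDiscount; // so tests can import and call it directly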

                          handles anticipated load (stress testing)

                          Staging environment.

                          and degrades deterministically under adverse network conditions (Jepsen-style or chaos-testing)?

                          Trust Amazon / Microsoft / Google. Exporting this problem to your provider is one of the major value adds of serverless architecture.

                          How do you implement backpressure? Load shedding?

                          Providers usually have features for this, like rate limiting for different events. But it’s not turtles all the way down, eventually your code will touch a real datastore that can overload, and you have to detect and propagate that condition same as any other architecture.
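
                          As a sketch of what “detect and propagate” might look like in a Node.js Lambda hitting DynamoDB (my example; the table name and Retry-After policy are made up):

                              var AWS = require("aws-sdk");
                              var db = new AWS.DynamoDB.DocumentClient();

                              exports.handler = function (event, context, callback) {
                                db.put({ TableName: "orders", Item: JSON.parse(event.body) }, function (err) {
                                  if (err && err.code === "ProvisionedThroughputExceededException") {
                                    // the datastore is saturated: shed load and ask the client to back off,
                                    // rather than retrying and making the overload worse
                                    return callback(null, { statusCode: 503, headers: { "Retry-After": "2" } });
                                  }
                                  if (err) return callback(err); // anything else is a real failure
                                  callback(null, { statusCode: 201 });
                                });
                              };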

                          What about logging?

                          Also a provider value add.

                          Configuration?

                          Providers have environment variables or something spiritually similar.

                          Continuous Integration?

                          Same as local testing, but automated?

                          but when stuff breaks, your only recourse is $BIGCO support

                          If their underlying infrastructure breaks, yep. But every architecture has this problem, it just depends on who your provider is. When your PaaS provider breaks, when your IaaS provider breaks, when your colo provider breaks, when your datacenter breaks, when your electrical provider blacks out, when your fuel provider misses a delivery, when your fuel mines have an accident. The only difference is how big the provider is, and how much money its customers pay it to not break. Serverless is at the bottom of the money food chain, if you want less problems then you take on more responsibility and spend the money to do it better than the provider for your use case, or use more than one provider.

                          Additionally, your business is now non-trivially coupled to the $BIGCO and at the mercy of their decisions.

                          Double-edged sword. You’ve non-trivially coupled to $BIGCO because you want them to make a lot of architectural decisions for you. So again, do it yourself, or use more than one provider.

                          1. 4

                            And great answers, thank you ;)

                            Having skimmed the SAM Local doc, it looks like they took the same approach as they did with DynamoDB Local. I think this alleviates a lot of the practical issues around integrated testing. DynamoDB Local is great, but it’s still impossible to toggle throttling errors and other adverse conditions to check how the system handles these, end-to-end.

                            The staging-env and CI solution seems to be a natural extension of server-full development, fair enough. For stress testing specifically, though, it’s great to have full access to the SUT, and to be able to diagnose which components break (and why) as the load increases. That approach runs contrary to the opaque nature of the serverless substrate. You only get the metrics AWS/Google/etc. can provide you. I presume dtrace and friends are not welcome residents.

                            If their underlying infrastructure breaks, yep. But every architecture has this problem, it just depends on who your provider is. When your PaaS provider breaks, when your IaaS provider breaks, when your colo provider breaks, when your datacenter breaks, (…)

                            Well, there’s something to be said for being able to abstract away the service provider and just assume that there are simply nodes in a network. I want to know the ways in which a distributed system can fail – actually recreating the failing state is one way to find out and understand how the system behaves and what kind of countermeasures can be taken.

                            if you want less problems then you take on more responsibility

                            This is something of a pet peeve of mine. Because people delegate so much trust to cloud providers, individual engineers building software on top of these clouds are held to a lower and lower standard. If there is a hiccup, they can always blame “AWS issues”[1]. Rank-and-file developers won’t get asked why their software was not designed to gracefully handle these elusive “issues”. I think the learned word for this is the deskilling of the workforce.

                            [1] The lack of transparency on the part of the cloud providers around minor issues doesn’t help.

                            1. 3

                              For stress testing specifically, though, it’s great to have full access to the SUT, and to be able to diagnose which components break (and why) as the load increases.

                              It is great, and if you need it enough you’ll pay for it. If you won’t pay for it, you don’t need it, you just want it. If you can’t pay for it, and actually do need it, then that’s not a new problem either. Plenty of businesses fail because they don’t have enough money to pay for what they need.

                              This is something of a pet peeve of mine. Because people delegate so much trust to cloud providers, individual engineers building software on top of these clouds are held to a lower and lower standard. If there is a hiccup, they can always blame “AWS issues”[1]. Rank-and-file developers won’t get asked why their software was not designed to gracefully handle these elusive “issues”

                              I just meant to say you don’t have access to your provider’s infrastructure. But building more resilient systems takes more time, more skill, or both. In other words, money. Probably you’re right to a certain extent, but a lot of the time the money just isn’t there to build out that kind of resiliency. Businesses invest in however much resiliency will make them the most money for the cost.

                              So when you see that happening, ask yourself “would the engineering cost required to prevent this hiccup provide more business value than spending the same amount of money elsewhere?”

                          2. 4

                            @pzel You’ve hit the nail on the head here. See this post on AWS Lambda Reserved Concurrency for some of the issues you still face with Serverless style applications.

                            The Serverless architecture style makes a ton of sense for a lot of applications, however there are lots of missing pieces operationally. Things like the Serverless framework fill in the gaps for some of these, but not all of them. In 5 years time I’m sure a lot of these problems will have been solved, and questions of best practices will have some good answers, but right now it is very early.

                            1. 1

                              I agree with @danielcompton that serverless is still a pretty new practice in the market and we are still lacking an ecosystem able to support all the possible use cases. Time will come and it will get better, but having spent the last 2 years building enterprise serverless applications, I have to say that the whole ecosystem is not so immature and it can be used already today with some extra effort. I believe in most cases the benefits (not having to worry too much about the underlying infrastructure, not paying for idle, higher focus on business logic, high availability and auto-scalability) outweigh by a lot the extra effort needed to learn and use serverless today.

                            2. 3

                              Even though @peter already gave you some great answers, I will try to complement them with my personal experience/knowledge (I have used serverless on AWS for almost 2 years now building fairly complex enterprise apps).

                              How do serverless developers ensure their code performs to spec (local testing)

                              The way I do is a combination of the following practices:

                              • unit testing
                              • acceptance testing (with mocked services)
                              • local testing (manual, mostly using the serverless framework’s invoke local functionality, but pretty much equivalent to SAM). Not everything can be tested locally, depending on which services you use.
                              • remote testing environment (to test things that are hard to test locally)
                              • CI pipeline with multiple environments (run automated and manual tests in QA before deploying to production)
                              • smoke testing

                              What about logging?

                              In AWS you can use CloudWatch very easily. You can also integrate third parties like Loggly. I am sure other cloud providers will have their own facilities around logging.

                              Configuration?

                              In AWS you can use Parameter Store to hold sensitive variables, and you can propagate them to your lambda functions using environment variables. In terms of infrastructure as code (which you can include in the broad definition of “configuration”) you can adopt tools like Terraform or CloudFormation (the latter being the predefined choice of the serverless framework on AWS).
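
                              Inside the function that pattern ends up looking something like this (a sketch; the variable names are hypothetical and the values would be populated from the parameter store at deploy time):

                                  // configuration arrives as plain environment variables
                                  var DB_URL = process.env.DB_URL;
                                  var API_KEY = process.env.THIRD_PARTY_API_KEY;

                                  exports.handler = function (event, context, callback) {
                                    if (!DB_URL || !API_KEY) {
                                      return callback(new Error("missing configuration")); // fail fast on misconfiguration
                                    }
                                    // ... business logic using DB_URL / API_KEY ...
                                    callback(null, { statusCode: 200 });
                                  };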

                              Continuous Integration?

                              I tried serverless successfully with both Jenkins and CircleCI, but I guess almost any CI tool will do it. You just need to configure your testing steps and your deployment strategy into a CI pipeline.

                              when stuff breaks, your only recourse is $BIGCO support

                              Sure. But it’s fair to say that your hand-rolled solution will be more likely to break than the one provided by any major cloud provider. Also, those cloud providers very often provide refunds if you have outages caused by the provider’s infrastructure (assuming you followed their best practices on high-availability setups).

                              your business is now non-trivially coupled to the $BIGCO

                              This is my favourite as I have a very opinionated view on this matter. I simply believe it’s not possible to avoid vendor lock-in. Of course vendor lock-in comes in many shapes and forms and at different layers, but my point is that it’s fairly impractical to come up with an architecture that is so generic that it’s not affected by any kind of vendor lock-in. When you are using a cloud provider and a methodology like serverless it’s totally true you have a very high vendor lock-in, as you will be using specific services (e.g. API Gateway, Lambda, DynamoDB, S3 in AWS) that are unique in that provider and equivalent services will have very different interfaces with other providers. But I believe the question should be: is it more convenient/practical to pay the risk of the vendor lock-in, rather than spending a decent amount of extra time and effort to come up with a more abstracted infrastructure/app that allows switching the cloud provider if needed? In my experience, I found out that it’s very rarely a good idea to over-abstract solutions only to reduce the vendor lock-in.

                              I hope this can add another perspective to the discussion and enrich it a little bit. Feel free to ask more questions if you think my answer wasn’t sufficient here :)

                              1. 6

                                This is my favourite as I have a very opinionated view on this matter. I simply believe it’s not possible to avoid vendor lock-in. Of course vendor lock-in comes in many shapes and forms and at different layers, but my point is that it’s fairly impractical to come up with an architecture that is so generic that it’s not affected by any kind of vendor lock-in.

                                Really? I find it quite easy to avoid vendor lock-in - simply running open-source tools on a VPS or dedicated server almost completely eliminates it. Even if a tool you use is discontinued, you can still use it, and have the option of maintaining it yourself. That’s not at all the case with AWS Lambda/etc. Is there some form of vendor lock-in I should be worried about here, or do you simply consider this an impractical architecture?

                                When you are using a cloud provider and a methodology like serverless it’s totally true you have a very high vendor lock-in, as you will be using specific services (e.g. API Gateway, Lambda, DynamoDB, S3 in AWS) that are unique in that provider and equivalent services will have very different interfaces with other providers. But I believe the question should be: is it more convenient/practical to pay the risk of the vendor lock-in, rather than spending a decent amount of extra time and effort to come up with a more abstracted infrastructure/app that allows switching the cloud provider if needed? In my experience, I found out that it’s very rarely a good idea to over-abstract solutions only to reduce the vendor lock-in.

                                The thing about vendor lock-in is that there’s a quite low probability that you will pay an extremely high price (for example, the API/service you’re using being shut down). Even if it’s been amazing in all the cases you’ve used it in, it’s still entirely possible for the expected value of using these services to be negative, due to the possibility of vendor lock-in issues. Thus, I don’t buy that it’s worth the risk - you’re free to do your own risk/benefit calculations though :)

                                1. 1

                                  I probably have to clarify that for me “vendor lock-in” is a very high level concept that includes every sort of “tech lock-in” (which would probably be a better buzz word!).

                                  My view is that even if you use an open source tech and you host it yourself, you end up making a lot of complex tech decisions that are going to be difficult (and expensive!) to move away from.

                                  Have you ever tried to migrate from redis to memcache (or vice versa)? Even though the two systems are quite similar and a migration might seem trivial, in a complex infrastructure, moving from one system to the other is still going to be a fairly complex operation with a lot of implications (code changes, language-driver changes, different interface, data migration, provisioning changes, etc.).

                                  Also, another thing I am very opinionated about is what’s valuable when developing a tech product (especially in a startup context). I believe delivering value to the customers/stakeholders is the most important thing while building a product. Whatever abstraction makes it easier for the team to focus on business value deserves my attention. In that respect I found Serverless to be a very good abstraction, so I am happy to pay some tradeoffs in having less “tech-freedom” (I have to stick to the solutions given by my cloud provider) and higher vendor lock-in.

                                2. 2

                                  I simply believe it’s not possible to avoid vendor lock-in.

                                  Well, there is vendor lock-in and vendor lock-in… Ever heard of Oracle Forms?

                              1. 4

                                I find the “doord” analogy incorrect. It makes systemd look like it is based on a lousy idea from the start. Opening doors faster in a car is not as important as booting an OS. While I’m not a systemd fan, I find the comparison unfair, which weakens the argument against it. Systemd was based on an important fact: existing init systems were a mess to manage. Sadly, the implementation grew into something that’s even more complex and huge.

                                I was expecting more focus on what Devuan is doing for the open-source community, like supporting software that does not depend on an init system, or encouraging simple ideas instead of overengineered ones (looking at you, systemd-hostnamed…).

                                Instead, this article looks just like any other rant against systemd, with the same arguments everyone brings up that all fall in the “bugs” category.

                                After all, systemd brings some kind of “stability”, as its interface is consistent (even though it has bugs). For many people, the new shiny features of systemd are definitely not worth its complexity, and it is for these people that the work from the Devuan guys is important. By keeping the alternative to systemd alive, they keep the spirit of Linux, which aims to keep every piece of software running on top of the kernel swappable, instead of relying on a rigid and complex API.

                                1. 5

                                  There’s a really good comment by someone who maintained Arch Linux’s init scripts pre-systemd about why they switched over. I’m as anti-systemd as the next person, but it’s important to understand why it became so successful.

                                  1. 7

                                    Having a standard init system is incredibly valuable for package maintenance and having full process control does require having code in init to track children, grandchildren and even detached child processes. You can’t do that without being the init process.

                                    All that being said, systemd is terrible from a usability standpoint. I honestly haven’t seen all the random/crashing bugs people complain about, but I do think systemctl is a terrible command, the bash completion is terribly slow, you can’t just edit a target file; you have to reload the daemon process for those changes to take effect, you have to call status after a command to see the limited log output, binary logs, etc. etc. etc.

                                    There have been so many attempts to take the one good thing (standardized init scripts) and make drop-in replacements (uselessd and others), and they all hit some pretty hard limits and are eventually abandoned. It’s sad that systemd is so integrated that replacements aren’t even remotely trivial.

                                    Without systemd, you need one of the udev forks, consolekit and a few other things to make things work. Void Linux, Gentoo and Devuan are pretty critical in keeping this type of architecture viable. Maybe one day someone will come up with an awesome supervisor replacement and get other distributions on-board to have a real alternative.

                                    1. 5

                                      Having a standard init system is incredibly valuable for package maintenance

                                      The problem here is that Systemd can never be a standard init system, because it’s Linux only.

                                      Maybe one day someone will come up with an awesome supervisor replacement and get other distributions on-board to have a real alternative.

                                      I’m working on it :) https://github.com/davmac314/dinit

                                      This has been my pet project for some time, although I’m long due to write a blog post update on progress. (Not a lot of commits recently, I know - that’s because Dinit uses an event loop library, Dasynq, which I’ve been focussing on instead - that should be able to change now, as I’ve just released Dasynq 1.0).

                                    2. 5

                                      it was impossible to say when a certain piece of hardware would be available […] this was solved by first triggering uevents, then waiting for udev to “settle” […] Solution: A system that can perform actions based on events - this is one of the major features of systemd

                                      udev is not a system that can perform actions based on events, like devd does on FreeBSD? What is it then?

                                      we have daemons with far more complex dependencies

                                      The question is… WHY?

                                      Sounds like self-inflicted unnecessary complexity. I believe that services can and should start independently.

                                      I run several webapps + postgres + nginx + dovecot + opensmtpd + rspamd + syncthing on my server… and they’re all started by runit at the same time, because none of them expect anything to be running before them. nginx doesn’t care if the webapps are up, it connects dynamically. webapps don’t care if postgres is up, they will retry connection as needed. etc. etc.

                                      Why can’t Linux desktop developers design their programs in the same way?

                                  1. 4

                                    Does anyone know why Ada is seeing a bit of a resurgence (at least, among the HN/Lobsters crowd)? I’m quite surprised by it, so I’m wondering if there are any interesting lessons that can be taken from it in terms of what causes languages to become popular.

                                    Also, what terms should I search for to find out more about Ada’s type system? It seems quite interesting - I’d love to learn more about what tradeoffs it’s making.

                                    1. 12

                                      Personally, after shunning Ada when I was younger because it felt cumbersome and ugly, I have seen enough failures where I’ve thought, “gee, those decisions Ada made more sense than I thought, for large projects”. I think some people are experiencing that; at the same time there’s this new wave of systems languages (often with stated goals like safety or being explicit about behavior) which is an opportunity for reflection on systems languages of the past; and SPARK is impressive and is part of the wave of new interest in more aggressive static verification.

                                      An earlier article posted on lobste.rs had some nice discussion of some interesting parts of Ada’s type system: http://www.electronicdesign.com/embedded-revolution/assessing-ada-language-audio-applications

                                      Also, the Ada concurrency model (tasks) is really interesting and feels, in retrospect, ahead of its time.

                                      1. 2

                                        I’m with you that it’s the new wave of systems languages that helped. The thing they were helping was people like me and pjmpl on HN who were dropping its name [among other old ones] on every one of those threads. There have been two main memes demanding such a response: (a) the idea that Rust is the first safe systems language; (b) the idea that low-level, esp OS, software can’t be written in languages with a GC. There’s quite a few counters to the GC part, but Burroughs MCP in ALGOL and Ada are the main counters to the first. To avoid dismissals, I was making sure to drop references such as Barnes’ Safe and Secure Ada book in every comment, hoping people following along would try stuff or at least post other references.

                                        Many people were contributing tidbits about the obscure systems languages on threads with similar topics that had momentum. The Ada threads might be ripple effects of that.

                                        1. 3

                                          Now that I think about it, your posts are probably why I automatically associate Ada with Safe Computing these days.

                                      2. 7

                                        I think it’s part of the general trend of interest in formal methods and correctness (guarantees or evaluation of). We’ve also seen a lot on TLA+ recently, for example.

                                        1. 10

                                          I think lobsters, at least, is really swingy. A couple people interested in something can really overrepresent it here. For example, I either found or wrote the majority of the TLA+ articles posted here.

                                          And things propagate. I researched Eiffel and Design by Contract because of @nickpsecurity’s comments, which led to me finding and writing a bunch of stuff on contract programming, which might end up interesting other people…

                                          One focused person can do a lot.

                                        1. 10

                                          … how in the world is Microsoft going to extinguish SSH? How can Microsoft extinguish core infrastructure in widely-used OSes they have no control over?

                                          1. 14

                                            Don’t question the meme.

                                            1. 3

                                              There are lots of potential extensions to standards in CompSci that improve pain points anywhere from individual use to enterprises. They could do an extension for one of those, buying it or tweaking it a bit so they can slap a patent on it. Obfuscate it a bit, too. Then, they deliver that advantageous benefit in their version. It gets widespread after a Windows release or two. Once people are locked into it, they can extend it in some new ways. Maybe some cool tricks like SSH proxies for Microsoft applications, VPNs into their cloud, or something for Xbox Live like people used to do with LogMeIn Hamachi. They might even be cross-licensing it to third parties. Those might have already built stuff on it since it’s in Windows at no cost to them.

                                              You’re not on open-source SSH any more with those applications. Now, you’re depending on their tech that plays by their rules on their paid platforms. It’s also called SSH so anyone Googling for SSH on Windows might find the “genuine software.” ;)

                                              1. 3

                                                It gets widespread after a Windows release or two.

                                                Here’s the point which doesn’t add up. It won’t become “widespread” beyond the desktop world.

                                                A lot of things are widespread in the Windows world and not really beyond it, so Microsoft could do this to those things, but SSH is not one of them, and is not going to become one of them.

                                                I’m aware of Embrace, Extend, Extinguish. I was alive and aware in the 1990s. I’m also alive and aware in a world where Linux and Open Source in general is so widespread that it isn’t going away.

                                            2. 5

                                              Isn’t that the GNU philosophy? ;)

                                              1. 2

                                                They are pretty similar in mechanism haha.

                                            1. 4

                                              I usually can’t speak about my work, but this week I’m giving a talk to a university robotics club about motion planning. Trying to do more talks in general.

                                              1. 1

                                                I usually can’t speak about my work

                                                Man, that sucks. I hope you’re keeping secrets for the right moral reasons. Too often people do unkind things when they have secrecy.

                                                1. 1

                                                  Could I get a link to your materials? I’ve been writing a blog series on control theory, so I’m curious to see how others teach somewhat similar topics.

                                                  1. 2

                                                    Sure. I don’t have them collected yet, but I plan to provide a collection to the students so I’ll send it along.

                                                1. 1

                                                  I’m curious about how prospective employers outside the web-frontend/app development would evaluate someone with her skillset and experience. First let me add the disclaimer that I am a researcher and have never had or hired for a “traditional” programming job.

                                                  My hypothesis is that candidates with Computer Science (or equivalent) degrees from reputable universities can be expected to have some “stock” skills: basic to moderate programming skills, knowledge of data structures and algorithms, some systems programming experience, and perhaps some more specialized knowledge in webdev or graphics or compilers or machine learning etc. depending on what higher level courses they took. As a potential employer, I can be fairly confident that such a candidate could pick up whatever the current hip technology is in a couple weeks and work on a variety of projects across my system. With the proper initial training and guidance, such a candidate could work on a front-end app, or a back-end server, or maybe even on mission critical parts of my distributed system (under the guidance of a senior engineer), depending on how good they are and what prior experience they have.

                                                  By contrast, for someone like the author, unless my project specifically involves JavaScript, Angular, or ReactJS, I wouldn’t be comfortable hiring them, or I would have to put in lots of time and money to train them. The university degree experience has done a lot more training and evaluation of a potential employee than a bootcamp has, or than I can easily learn from a GitHub resume.

                                                  1. 1

                                                    There’s no reason that people can’t be self-taught at systems/data structures/algorithms topics, and while the author might not have that experience, I think that there are many people without degrees that do. That being said, showing that off in a résumé screen or interview is probably more important for candidates without CS degrees.

                                                    I can be fairly confident that such a candidate could pick up whatever the current hip technology is in a couple weeks and work on a variety of projects across my system.

                                                    I think people often underestimate the depth of knowledge that’s possible in things like this. React is extremely complicated, and while I’m sure I could throw together a webapp using it in a day or so, knowing all of the tools/patterns/idioms available, and knowing how to effectively debug and diagnose problems, is a skill that I think would take a lot longer to learn. I’ve definitely made statements along these lines before about webapp development/React/etc., but the fact is, frameworks are skills just like any other, and there’s no reason to expect that you can learn all of a large/complicated framework in a couple of weeks.

                                                  1. 12

                                                    It’s good to see more people talking about skipping university/college as an option! A lot of people who I went to high school with went to college because it’s the next thing that they’re “supposed” to do, rather than thinking about costs/benefits and other options. So far, I’ve been pretty happy with my choice not to go, but I definitely had a lot of doubt at first, mostly because I wasn’t seeing anyone else pursue the same sort of path.

                                                    1. 8

                                                      I replaced an iconfont that I was using on my site with inlined svg logos a week or so ago. Got rid of a request, a ton of wasted bytes for icons I wasn’t using + css for it, and improved accessibility at the same time.

                                                      SVG is great :)

                                                      1. 2

                                                        It’s good to see more research into practices/testing! I do think it’s a bit strange to say that this is how unit testing “affects” codebases though - this is only looking at correlation, not causation.

                                                        1. 1

                                                          I wonder if it would be possible to use version control history to see if there’s a difference between test-first and test-after codebases.

                                                        1. 6

                                                          What isn’t community driven about man pages? Do man page authors not accept patches?

                                                          1. 3

                                                            Having it on Github does make it easier for many people to contribute - the process to submit a patch to man-pages on linux involves emailing the patch to the mailing list using the correct patch format, which can be tricky to figure out.

                                                            It does worry me that Github is such a central place for open source projects, given that it’s closed source, but they do have a far, far easier interface for contribution than most of the kernel.org projects.

                                                             That being said, I don’t think that something like TLDR pages is a good solution to the lack of good documentation. It seems like it’ll duplicate a lot of work that’s already been done, and I’d have a hard time trusting it until it has much more of a track record. I’d also be concerned about a project like this getting out of date quickly. Personally, I think that more useful avenues of contribution would be contributing more examples to the linux/etc man-pages project, and contributing directly to the documentation of programs that are not included in the generic man-pages project.

                                                            While this is an interesting project, and I’m glad that people are working on improving docs, fragmentation and stale information is a pretty big concern with documentation, which makes me wary of projects like this.

                                                            1. 1

                                                               This is easier to approach: just fork it on Github and everyone will npm install your contribution. Or something.

                                                              Better documentation is always welcome, but there’s this strange aura of laziness around this I can’t really put my finger on. Maybe because there are so many clients?

                                                            1. -1

                                                              People use cat in the weirdest ways…

                                                              1. 8

                                                                I’m aware of useless uses of cat, but in this case I wanted to use it to ensure that wc -c wasn’t relying on the filesystem’s count of the number of bytes in the file - sending it through a pipe ensures that.

                                                                1. 3
                                                                  wc -c < foo
                                                                  

                                                                  Also, POSIX specifies that wc shall read the file.

                                                                  1. 5

                                                                    If you check out the GNU coreutils wc source, if it’s only counting bytes, it will try to lseek in the fd to find the length. wc -c < foo is not the same as cat foo | wc -c in this case, because the seek will succeed in the first case and not in the second.

                                                                    1. 8

                                                                      I still prefer cat |. I actually prefer cat | in almost every case, because the syntactic flow matches the semantic flow precisely. With an infile, or even having the first command open the file, there’s this weird hiccup where the first syntactic element of the pipeline isn’t the initial source of the data, but the first transformation thereof.

                                                                      The main argument against it seems to be “but you’re wasting a process”, which, uh, with all due respect, I can’t see ever being a problem on a system you’d ever run a full multiprocessing Unix system on. If your system were constrained enough that that was an issue, a multiprocessing Unix would be too much overhead in and of itself, extra cats notwithstanding.

                                                                      1. 2

                                                                        < foo

                                                                        This does not guarantee that bytes are actually being read(); redirecting a file to stdin like that lets the process call fstat() on it if it wants. A naughty implementation of wc -c could call fstat(), check st_mode to verify that stdin is a regular file rather than a pipe or something, and then return the filesystem’s reported size from the st_size field without actually reading any bytes from stdin. Having some other process like cat or dd or something read the bytes onto a pipe does prevent wc -c from being able to see the original file & hence prevents it from being able to cheat and return st_size.

                                                                        Also, POSIX specifies that wc shall read the file.

                                                                        I guess this does. :)

                                                                        1. 0

                                                                          … and this is, indeed, how I would have done it.

                                                                        2. 1

                                                                          Interesting. Thank you for the great response.

                                                                      1. 4

                                                                        This looks like a nice intro to lockless concurrency - that being said, I think that the main thing for programmers to know about lockless concurrency is to fear it :P

                                                                        1. 3

                                                                          I think that the main thing for programmers to know about lockless concurrency is to fear it :P

                                                                           The problem with rolling your own concurrency is not the complexity of the individual components, but the exponential nature of their interactions. Unfortunately at the systems level there is often not much one can do to avoid it. I like the approach Rust has taken here by encoding concurrency guarantees in the type system, but there isn’t much help for C programmers.

                                                                          1. 2

                                                                            In the past, the limitation made people adopt a CSP-like style for programming where the global interactions were explicit. Then, they’d check a model of that with something like SPIN. These days, there’s more effort on static analysis among those who haven’t given up on C concurrency entirely. ;) Found another one just now:

                                                                            https://lobste.rs/s/hiwfqh/locksmith_practical_static_race

                                                                        1. 7

                                                                          I would (moderately) prefer the slightly adjusted suggested description: “Array programming languages such as APL, J, and K”.

                                                                          And would definitely appreciate this tag. Lots more array/vector language stuff to come!

                                                                          1. 3

                                                                            I think maybe “Array languages” or “Array programming” would be a good descriptiveness/briefness tradeoff.

                                                                            1. 1

                                                                              Tag descriptions are even briefer than that. My guess would be “Array programming languages” or “APL, J, K, Q” if you want to list them in the tag instead of making it generic.

                                                                            1. 2

                                                                              Something that I’m asking during all of my interviews is “What is the goal of your hiring process and how do you evaluate if you’re being successful?”

                                                                              I’d say that about 25% of companies can’t answer the first question. For the second question, some companies have a formal method to tell if the people who they hire work out, but no one has ever mentioned thinking about analysing if the company made the right decision in the no-hire case.

                                                                              I think applying the engineering process to hiring is really important, but basically nobody is doing it.

                                                                              1. 3

                                                                                I guess I’m grouchy today, but is there a name for blog posts that just lazily rephrase the existing, excellent, and even interactive (!) literature (the-raft-website) to position oneself as an expert on the subject? This seems like that sort of post.

                                                                                1. 4

                                                                                     This is actually something that I thought about when I was writing this. I think that this post does add something new in the form of the list of problems at the end - I haven’t seen anyone talking about those before (let me know if I missed something though!). Maybe I should have just done a blog post on that, instead of including the more basic stuff.

                                                                                     Also - I think that this post covers stuff that isn’t on the raft website - they have a brief explanation of what consensus is, a visualization, and the paper. I could be missing something, but I don’t see anything that is a more plain English explanation of how it works and why it works that way (although the paper is very close).

                                                                                  1. 3

                                                                                    I don’t really intend to hammer on this (or at least didn’t, I suppose I am at this point), but /u/Irene suggested I give more concrete feedback, so here goes:

                                                                                    Problem: Latency) Latency is not “briefly mentioned in the paper”, it is dealt with precisely in the same section where you pulled the inequality you cite: “The broadcast time should be an order of magnitude less than the election timeout so that leaders can reliably send the heartbeat messages required to keep followers from starting elections; … The broadcast time and MTBF are properties of the underlying system, while the election timeout is something we must choose. Raft’s RPCs typically require the recipient to persist information to stable storage, so the broadcast time may range from 0.5ms to 20ms, depending on storage technology. As a result, the election timeout is likely to be somewhere between 10ms and 500ms.” In any case, configuring the election time is an issue with running an implementation of raft, not with writing one, so you’d refer to something like the etcd time parameters page (which, incidentally, by default uses exactly one order of magnitude difference between heartbeat time and election timeout – exactly as the paper advises)
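
                                                                                       To make the quoted rule of thumb concrete, here’s a tiny sketch (mine, using etcd’s default numbers) of how an implementation typically picks a randomized election timeout:

                                                                                           var heartbeatMs = 100;         // etcd's default heartbeat interval (~ broadcast time)
                                                                                           var electionTimeoutMs = 1000;  // etcd's default election timeout, one order of magnitude higher

                                                                                           // each node randomizes its timeout in [T, 2T) so split elections rarely repeat
                                                                                           function randomElectionTimeout() {
                                                                                             return electionTimeoutMs + Math.floor(Math.random() * electionTimeoutMs);
                                                                                           }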

                                                                                    Problem: Sync) fsync has been reliably able to flush disk caches without modifying hdparm/sdparm since at least 2012, and while true: “in the real world, disk failures happen”, Raft is 100% ok with that. It’s not a problem.

                                                                                    It feels like you were trying to brainstorm problems where there aren’t any, especially because, as you say, you’ve never implemented Raft.

                                                                                    Otherwise, saying that a plain English explanation is missing seems disingenuous when we’re talking about an English language paper/algorithm that exists precisely to be (and is celebrated for being) easy to understand. It’s not a real problem anyone has.

                                                                                  2. 3

                                                                                    It’s probably not within most people’s understanding of spam, but it’s definitely self-promotion. I think there’s enough disagreement around where the lines are there that it’s helpful to explain why you didn’t find the post valuable, especially since the author is a lobste.rs user.