1. 8

    I wish that we had a better way to refer to this than as “nines”. I agree with all of your points; I wish that folks understood that going from X nines to X + 1 nines is always going to cost the same amount of resources.

    Here’s a further trick that service reliability engineers should know. If we compose two services which have availabilities of X nines and Y nines respectively into a third service, then the new service’s availability can be estimated within a ballpark of a nine by a semiring-like rule. If we depend on both services in tandem, then the estimate is around minimum(X, Y) nines, but should be rounded down to minimum(X, Y) - 1 as a rule of thumb. If we depend on either service in parallel, then the estimate is maximum(X, Y) nines.

    As a technicality, we need to put a lower threshold on belief. I use the magic number 7/8 because 3-SAT instances are randomly satisfiable 7/8 of the time; this corresponds to about 0.90309 nines, just below 1 nine. So, if we design a service which simultaneously depends on two services with availabilities of 1 nine and 2 nines respectively, then its availability is bounded below 1 nine, resulting in a service that is flaky by design.
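    For the curious, the rule of thumb transcribes almost directly into code. A minimal sketch with hypothetical helper names, availabilities in whole nines (min and max are the Go 1.21 builtins):

        package main

        import "fmt"

        // tandemNines and parallelNines are made-up names transcribing
        // the rule of thumb above, with availabilities in whole nines.
        func tandemNines(x, y int) int   { return min(x, y) - 1 } // depend on both
        func parallelNines(x, y int) int { return max(x, y) }     // depend on either

        func main() {
            // Dependencies at 1 nine and 2 nines:
            fmt.Println(tandemNines(1, 2))   // 0: below the 7/8 (~0.903 nines) threshold
            fmt.Println(parallelNines(1, 2)) // 2
        }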

    1. 4

      If we depend on either service in parallel, then the estimate is maximum(X, Y) nines.

      Shouldn’t that be X + Y? If you have a service that can use either A or B, both of which are working 90% of the time, and there is no correlation between A working and B working, then at least one of A or B should work 99% of the time.

      It’s possible that I misunderstand you, because I don’t understand the last paragraph at all.

      1. 2

        If you have two services with a 10% failure rate (90% uptime), the odds of both failing are .1 x .1 = 1% (99% uptime).

        1. 2

          I had a hidden assumption! Well done for finding it, and thank you. I assumed that it was quite possible for the services to form a hidden diamond dependency, in which case there would be a heavy correlation between dependencies being unavailable. When we assume that they are independent, then your arithmetic will yield a better estimate than mine and waste fewer resources.
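          Under the independence assumption, the arithmetic is a one-liner each way. A quick sketch (nines here is just -log10 of the failure probability):

              package main

              import (
                  "fmt"
                  "math"
              )

              // nines expresses availability a as -log10(1 - a).
              func nines(a float64) float64 { return -math.Log10(1 - a) }

              func main() {
                  a, b := 0.9, 0.99 // 1 nine and 2 nines, failures assumed independent

                  both := a * b             // tandem: both must be up
                  either := 1 - (1-a)*(1-b) // parallel: at least one must be up

                  fmt.Printf("tandem:   %.3f (%.2f nines)\n", both, nines(both))     // 0.891 (0.96)
                  fmt.Printf("parallel: %.3f (%.2f nines)\n", either, nines(either)) // 0.999 (3.00)
              }

          Note how the parallel case comes out at X + Y nines, matching the objection above.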

        2. 2

          Article says:

          Adding an extra “9” might be linear in duration but is exponential in cost.

          You say:

          I wish that folks understood that going from X nines to X + 1 nines is always going to cost the same amount of resources.

          I’m not sure what the article means by “linear in duration” or what you mean by “the same amount”. That said, your comment seems to conflict with my understanding: because 9s are a logarithmic scale, going from X to X + 1 9s should be expected to take an order of magnitude more resources than going from X - 1 to X 9s, and that’s an important fact about 9s. How am I misunderstanding your comment?

          1. 1

            Let me try specific examples first. Let’s start at 90% (1 nine) and go to 99% (2 nines). This has a cost, and it gets us a fixed amount of improvement worth 9 percentage points of our total goal. If we do that again, going from 99% to 99.9% (3 nines), then we get another fixed amount of improvement, 0.9%. My claim is that the cost of incrementing the nines is constant, which means that we get only roughly a tenth of the improvement for each additional nine. The author’s claim is that the overall cost of obtaining a fixed amount of our total goal is exponential; we get diminishing returns as we increase our overall availability. They’re two ways of looking at the same logarithmic-exponential relation.
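            The arithmetic behind both framings, as a quick sketch:

                package main

                import (
                    "fmt"
                    "math"
                )

                func main() {
                    prev := 0.0
                    for n := 1; n <= 5; n++ {
                        avail := 1 - math.Pow(10, -float64(n))
                        fmt.Printf("%d nines: %.5f (absolute gain %.5f)\n", n, avail, avail-prev)
                        prev = avail
                    }
                    // Each extra nine buys a tenth of the previous gain, which is
                    // why "constant cost per nine" and "exponential cost per unit
                    // of availability" are the same claim.
                }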

            I don’t know what the author is thinking when they say “linear in duration”. I can understand their optimism, but time is only one of the costs that must be considered.

            1. 3

              Tbh, I’m still confused by the explanation. I’d make a simpler example claim - adding an extra nine does not have the same cost. Going from 90% to 99% is close to free. Going from 99.999% to 99.9999% is likely measured in millions of $. (with an exponential growth for every 9 in between) (we may agree here, I’m not sure :) )

              1. 1

                This hasn’t been my experience. I have seen services go from best-effort support (around 24% availability for USA work schedules) to a basic 90% or 95% SLA, and it takes about two years. A lot of basic development has to go into a service in order to make it reliable enough for people to start using it.

              2. 3

                I’m also rather confused by your claim. Are you saying that going from 99.9 -> 99.99 “costs” the same amount as going from 99->99.9, but you get 10x less benefit for it? I think that’s a rather confusing way to look at it, since from a service operator’s perspective, you’re looking for “how much effort do I need to expend to add a 9 to our reliability?” I also disagree that the cost for a 9 (the benefit aside) is at all linear.

                Going from 90->99 might be the difference between rsyncing binaries and running them under screen, and building binaries in CI and running them under systemd. Going from 99.9->99.99 is very clearly understanding your fault domains and baking redundancy into multiple layers, geographic redundancy, canaries, good configuration change practices. 99.999 is where you need to start thinking about multiple providers (not just geographic redundancy), automated recovery, partial failure domains (i.e., fault-focused sharding), much longer canaries, isolation between regions.

                The effort (and cost) required to achieve greater reliability increases by at least an order of magnitude for each nine, and to your point, it’s also worth less.

                1. 1

                  I appreciate your focus on operational concerns, but code quality also matters. Consider this anecdote about a standard Web service. The anecdote says that the service restarted around 400 times per day, for an average uptime of 216 seconds. Let’s suppose that the service takes one second to restart and fully resume handling operational load; that is roughly 400 seconds of downtime in an 86,400-second day, so by design the service can’t possibly exceed around 99.5% availability on a single instance.

                  In some sense, the tools which you espouse are not just standard ways to do things well, but also powerful levers which can compensate for the poor underlying code in a badly-designed service. While we might be able to eventually achieve high reliability by building compositions on top of this bad service, we should really consider improving the service’s code directly too.

                  1. 1

                    I think we’re generally on the same page here: I’m not saying you don’t need to improve your service’s code. Quite the opposite. “Baking redundancy into multiple layers” and “understanding your fault domains” fall into this category.

                    There’s also just general bugfixing and request profiling. A pretty typical way of measuring availability is by summing the requests that failed (for non-client reasons) and dividing it by the total number of requests. Investigating the failed requests often leads to improvements to service behavior.

                    That being said, there will still be unknowable problems: a cache leaks data and eventually takes down the task. You need multiple tasks to keep servicing requests while you solve the problem. A client query of death starts hitting your tasks: if you’re lucky, you have enough tasks to not notice, but perhaps they’re making requests fast enough that your entire fleet is downed. Perhaps they should have been consistently directed to a smaller pool of tasks to limit their blast radius.

                    You need both a well-written service and systemic reliability. The effort is greatly increased with every 9.

          1. 8

            Everybody’s written a version of this, right?

            Our house version is called errorwatch, and it now boasts the following ridiculous yet eventually necessary set of options:

            errorwatch -p "PROGRAM" -s success@somewhere -f failure@somewhere -i -r "regexp" --failregexp "FAILED" --successregexp "" -l logdir -m "message" -t "title" -T timeout -d -q -e

            -p is required; everything else has a plausible default.

            It looks at error codes, it can look at stdout and stderr for regexps, it can expire on a timeout, and it sends variable sizes of mail and logs. It would be nice if it could report that it hasn’t run, but that’s too meta even for Perl.
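            For flavor, the core of such a tool is small. This is not errorwatch itself, just a sketch of the same shape in Go, with the real flags echoed only in comments:

                // Not the real errorwatch; a minimal sketch of the same shape.
                package main

                import (
                    "bytes"
                    "context"
                    "fmt"
                    "os"
                    "os/exec"
                    "regexp"
                    "time"
                )

                func main() {
                    if len(os.Args) < 2 {
                        fmt.Fprintln(os.Stderr, "usage: errorwatchish PROGRAM")
                        os.Exit(2)
                    }
                    failRe := regexp.MustCompile(`FAILED`) // stand-in for --failregexp

                    // Stand-in for -T: kill the program if it runs too long.
                    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
                    defer cancel()

                    cmd := exec.CommandContext(ctx, "sh", "-c", os.Args[1]) // stand-in for -p
                    var out bytes.Buffer
                    cmd.Stdout, cmd.Stderr = &out, &out
                    err := cmd.Run()

                    if err != nil || failRe.Match(out.Bytes()) {
                        // The real tool would mail the -f address and write to -l here.
                        fmt.Printf("FAILURE: %v\n%s", err, out.String())
                        os.Exit(1)
                    }
                    fmt.Println("SUCCESS") // or mail the -s address
                }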

            1. 6

              Make a cron job that watches for its emails and, if it doesn’t see any, reports via a different email service. Then set errorwatch to watch this first cron job. That way both have to fail independently at the same time for notifications to vanish.

              …I wish I knew whether this was a joke or not…

              1. 7

                Not a joke, my important cron commands always post to a dead man’s snitch-style service on success.

                https://deadmanssnitch.com/
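                The pattern in miniature, as a sketch (the job path and check-in token are placeholders, not real endpoints):

                    // Do the work, then check in only on success. If the check-ins
                    // ever stop arriving, the snitch service raises the alarm.
                    package main

                    import (
                        "log"
                        "net/http"
                        "os/exec"
                    )

                    func main() {
                        if err := exec.Command("/usr/local/bin/nightly-backup").Run(); err != nil {
                            log.Fatalf("job failed: %v", err) // no check-in, so the snitch alerts
                        }
                        resp, err := http.Get("https://nosnch.in/EXAMPLE_TOKEN")
                        if err != nil {
                            log.Fatalf("check-in failed: %v", err)
                        }
                        resp.Body.Close()
                    }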

                1. 3

                  We actually have an end-to-end check to see if mail is working, which works by negative feedback: it accumulates mail into a directory, and if the most recent timestamp is too far in the past, it goes off.
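                  A sketch of that check (the spool path and two-hour threshold are invented for illustration):

                      // Alarm when the newest file in the spool is too old.
                      package main

                      import (
                          "fmt"
                          "os"
                          "time"
                      )

                      func main() {
                          entries, err := os.ReadDir("/var/spool/e2e-mail")
                          if err != nil {
                              fmt.Fprintln(os.Stderr, err)
                              os.Exit(1)
                          }
                          var newest time.Time
                          for _, e := range entries {
                              if info, err := e.Info(); err == nil && info.ModTime().After(newest) {
                                  newest = info.ModTime()
                              }
                          }
                          if time.Since(newest) > 2*time.Hour {
                              fmt.Println("ALERT: no mail has arrived recently")
                              os.Exit(1)
                          }
                      }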

                2. 1

                  It would be nice if it could report that it hasn’t run

                  And when it doesn’t finish running and loops instead of slowly progressing, thanks

                1. 35

                  This is a good opportunity to plug a technology that’s been around since well before WebSockets, is widely supported on both the client and server sides, was designed for precisely the kind of use case the article is focused on, and, bizarrely, seems virtually unknown in the developer community: Server-Sent Events.

                  1. 8

                    Server-Sent Events are one of my favorite technologies.

                    1. 4

                      SSE is a specific implementation of long-polling, no?

                      1. 9

                        Sort of. The implementations of long-polling I’ve seen are usually, “Keep the connection open until the server has an event to deliver, then deliver it as the response payload and end the request.” The client then immediately makes another long-polling request.

                        SSE is more of a streaming approach. A single connection stays open indefinitely and events are delivered over it as they become available. In that sense it’s more like WebSockets than traditional long-polling.
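                        A minimal SSE endpoint, sketched in Go to show how little is involved; the browser side is just new EventSource("/events"):

                            package main

                            import (
                                "fmt"
                                "net/http"
                                "time"
                            )

                            func main() {
                                http.HandleFunc("/events", func(w http.ResponseWriter, r *http.Request) {
                                    w.Header().Set("Content-Type", "text/event-stream")
                                    w.Header().Set("Cache-Control", "no-cache")
                                    flusher, ok := w.(http.Flusher)
                                    if !ok {
                                        http.Error(w, "streaming unsupported", http.StatusInternalServerError)
                                        return
                                    }
                                    for i := 0; ; i++ {
                                        select {
                                        case <-r.Context().Done(): // client disconnected
                                            return
                                        case <-time.After(time.Second):
                                        }
                                        // "id:" is what lets the browser resume from Last-Event-ID
                                        // after an automatic reconnect; a blank line ends each event.
                                        fmt.Fprintf(w, "id: %d\ndata: tick %d\n\n", i, i)
                                        flusher.Flush()
                                    }
                                })
                                http.ListenAndServe(":8080", nil)
                            }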

                        1. 2

                          It is, with a different interface and more efficient bandwidth usage.

                          1. 3

                            And a built-in protocol for reconnections and catching up on messages you missed while you were out.

                        2. 1

                          I remember discovering Server-Sent Events and taking great joy in the simplicity of just going one way.

                          1. 1

                            Indeed.

                            You can also get away with implementing an “endless” request delivering JSON lines, like the Twitter Streaming API.

                          1. 21

                            The article never mentions what is, in my humble opinion, the most important part of good logging practice: structured logging. Without it you end up with weird regexes or other hacks trying to parse your log messages.

                            1. 4

                              As a sibling post notes, if you use structured logging you’re mostly throwing away the idea that the entries must be easily parsable by a human. If that’s the case, and we’ll need a custom method of displaying the structured logs in a human-friendly way, I believe we should forgo plain text altogether and gain the benefits of logging directly to binary.

                              1. 5

                                You can do human readable structured logging if you use key="value" formats inside text messages. Some people still prefer json, but there is a middle ground.
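                                As one concrete example of that middle ground, Go’s log/slog (Go 1.21+) renders the same call as key=value text or as JSON depending on the handler:

                                    package main

                                    import (
                                        "log/slog"
                                        "os"
                                    )

                                    func main() {
                                        text := slog.New(slog.NewTextHandler(os.Stderr, nil))
                                        text.Info("failed to open file", "file", "/tmp/battery charge", "err", "permission denied")
                                        // time=... level=INFO msg="failed to open file" file="/tmp/battery charge" err="permission denied"

                                        json := slog.New(slog.NewJSONHandler(os.Stderr, nil))
                                        json.Info("failed to open file", "file", "/tmp/battery charge", "err", "permission denied")
                                        // {"time":"...","level":"INFO","msg":"failed to open file","file":"/tmp/battery charge",...}
                                    }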

                                1. 2

                                  If you need just key=value, that’s not really structured in my opinion.

                                  1. 4

                                    Why not?

                                    1. 2

                                      Because the amount of information added by this format would be infinitesimal over a line-based logger with manual tokenization. The reason why you’d want a structured logger is to allow proper context for a message. Unless you’re working with simple cases, the structure that would offer such context is more than one level deep.

                                      1. 3

                                        Hmm, definitely not.

                                        Structured logging is about decorating log events with just enough of a schema to make them machine parseable, so that searching, aggregating, filtering, etc. can be more than a crapshoot. Deeply nested events significantly increase the complexity of that schema, and therefore the requirements of the consumer.

                                        By default, structured logs should be flat key/value pairs. It gets you the benefits of richer parseability, without giving up the ability to grep.

                              2. 2

                                Excellent point. That’s become such second nature to me by now, that I forgot to even mention it!

                                1. 2

                                  I’m surprised it wasn’t mentioned, but the larger advantage of passing a logger around to constructors is the ability to then have nested named loggers, such as

                                  Battery.ChargingStatus.FileReader: Failed to open file { file: "/tmp/battery charge", error: ... }
                                  Battery.ChargingStatus: Failed to access status logs, skipping report
                                  
                                  1. 1

                                    On top of that, a structured logger, if implemented properly, can often be faster and can be operated at granular levels (as the other comments pointed out, sometimes you do want to turn on some logs on the fly at some locations, not all logs at all locations).

                                    1. 1

                                      I love structured logging, with one caveat: the raw messages emitted (let’s assume JSON) are harder for me to scan when tailing directly (which I usually only do locally as we have better log querying tools in the cloud), in contrast to a semi-structured simple key-value format. Do you all use a different format than JSON? Or a tool that transforms structured logs to something more friendly to humans, eg. with different log levels displayed in different appropriate colors, eg. JSON syntax characters diminished, for local tailing?

                                      1. 5

                                        At Joyent, we used the Bunyan format. Each line in the file was a separate JSON object with standard properties, some mandatory and some optional, and freeform additional properties. We shipped a tool, bunyan, that was capable of acting as a filter that would render different human readable views of the JSON. For example, you would often run something like:

                                        tail -F $(svcs -L manatee) | bunyan -o short
                                        

                                        It also had some rudimentary filtering options. It also had a relatively novel mode that would, instead of reading from a file or standard input, use DTrace probes for different log levels to allow you to dynamically listen for DEBUG and TRACE events even when those were not ordinarily present in the log files. The DTrace mode could target a particular process, or even all processes on the system that emitted Bunyan logs.

                                        1. 1

                                          Hi, what were the required fields? Was it just a unique request ID? Thanks for sharing about bunyan. Even though it’s been out for a while I was unaware of it.

                                        2. 5

                                          Do you all use a different format than JSON? Or a tool that transforms structured logs to something more friendly to humans, eg. with different log levels displayed in different appropriate colors, eg. JSON syntax characters diminished, for local tailing?

                                          We use JSON and the only tools I use are grep and jq. And although I am pretty much still a novice with these two, I found that with the power of shell piping I can do almost anything I want. Sometimes I reach for the Kibana web interface, get seriously confused and then go back to the command line to figure out how to do it there.

                                          I wrote a simple tutorial for the process, just a couple of weeks ago.

                                          1. 1

                                            If you rely on external tools to be able to make sense of your logs, why not go all the way, gain the speed and size benefits that binary logs would bring, and write your own log pager? I feel like the systemd folks had the right idea even when everyone was making fun of them.

                                            1. 3

                                              I don’t think the average employer would be happy subsidizing an employee writing a log pager instead of implementing something that would bring a tangible result to the business. The potential money saved by using binary logs probably doesn’t outweigh the new subscriptions/increased profits from churning out more features.

                                              1. 1

                                                To me that sounds like an excuse. The world is not made up of only software that is beholden to the almighty shareholder.

                                                1. 1

                                                  I mean, yes, if you’re developing something in your personal time, go bananas on what you implement.

                                                  But I also know my manager would look at me funny and ask why I’m not just shoving everything into CloudWatch/<cloud logging service>

                                              2. 2

                                                I’m sure most problems with systemd journals are fixable, but they’ve left a very bad taste in my mouth for two main reasons: if stuff gets deleted from under them they apparently never recover (my services continue to say something like “journal was rotated” until I restart them), and inspecting journals is incredibly slow. I’m talking magnitudes slower than log files. This is at its worst (I often have time to make a cup of tea) when piping the output into grep or, as journalctl already does by default, less, which means every byte has to be formatted by journalctl and copied only to be skipped over by its recipient. But it’s still pretty bad (I have time to complain on IRC about the wait) when giving journalctl filters that reduce the final output down to a few thousand lines, which makes me suspect that there are other less fundamental issues.

                                                I should note that I’m using spinning disks and the logs I’m talking about are tens to hundreds of GB over a few months. I feel like that situation’s not abnormal.

                                                1. 1

                                                  If you rely on external tools to be able to make sense of your logs, why not go all the way, gain the speed and size benefits that binary logs would bring, and write your own log pager?

                                                  It’s hard to imagine a case at work where I could justify writing my own log pager.
                                                  Here are some of the reasons I would avoid doing so:

                                                  • Logs are an incidental detail to the application.
                                                  • Logs are well understood; I can apply a logging library without issues.
                                                  • My application isn’t a beautiful and unique snowflake. I should use the same logging mechanisms and libraries as our other applications unless I can justify doing something different.
                                                  • JSON is boring, has a specification, substantial library support, tooling, etc.
                                                  • Specifying, documenting, and testing a custom format is a lot of work.
                                                  • Engineering time is limited; I try to focus my efforts on tasks that only I can complete.
                                                  1. 2

                                                    Logs are an incidental detail to the application.

                                                    I think this is trivially disproved by observing that if the logs stop working for your service, that is (hopefully!) a page-able event.

                                                    Logs are a cross-cutting concern, but as essential as any other piece of operational telemetry.

                                                    1. 1

                                                      Logs are a cross-cutting concern, but as essential as any other piece of operational telemetry.

                                                      I rely heavily on logging for the services I support but the applications I wrote for work have only error reporting. They are used by a small audience and problems are rare; I might get a crash report every 18 months or so.

                                                      1. 1

                                                        Ah, yeah, I presume the context here is services.

                                                2. 1

                                                  Agreed. jq is a really nice tool. It made the decision to transition to using JSON for logging very easy.

                                                3. 3

                                                  Don’t use JSON, use logfmt.

                                                  1. 1

                                                    Yes! Logfmt is the good stuff. But it’s only semi-structured. Why not use JSON and a tool to transform to logfmt (with nested data elided probably) when needing to scan as a human?

                                                    1. 1

                                                      Logfmt is fully structured, it just doesn’t support nesting, which is an important feature! Structured logs should be flat.

                                              1. 9

                                                Add unreliable translation of filesystem events for the purpose of live reload, live rebuild, etc.

                                                1. 2

                                                  I wrote watchexec originally, which is a Rust CLI utility focused on running commands when files are modified.

                                                  I suspect about 50% of the GitHub issues involve Docker somehow. Not that they didn’t reference some real concerns, but it felt like Docker was essentially Weird Linux and it would subtly break user workflows, encouraging tools to adapt to it.

                                                  I really dislike cottage industries that grow up around stuff like this.

                                                1. 5

                                                   The graphs are for free space, not space used, and this disoriented me. Maybe this will help others be less puzzled?

                                                  1. 7

                                                     Many people asked me that, and these are the reasons:

                                                     • I don’t want to reveal the size of the database (I also omit the time axis to not reveal the growth rate)

                                                     • We host on RDS so we actually monitor when we are out of storage (e.g. free space approaching zero) and not the size of the overall db.

                                                    1. 3

                                                       I noticed this, thanks for calling it out. I believe the reason folks do this is that 0 is broadly meaningful as a reference to scarcity of resources, where the absolute number (e.g. $STORAGE_CAPACITY - 65GB) isn’t. And then, I am speculating here, the reason they don’t put it in terms of percentage utilization is for technical reasons; sometimes that’s harder to calculate than just the absolute amount of available resources. Would be curious to hear from someone who has thought carefully about metrics and their presentation.

                                                      1. 1

                                                        Thanks, added it to the readme.

                                                      1. 10

                                                        Embedding timezone data seems like a recipe for your binaries being out of date very very quickly. There are important user-visible changes to the database all the time: https://github.com/eggert/tz/commits/master

                                                        1. 8

                                                          I believe the implementation only uses the bundled tzdata when loading a time location from the system fails. So a Go program running on an up-to-date system should continue to work fine, as the bundled tzdata is just a fallback.
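                                                           For reference, opting in is a single underscore import (or building with -tags timetzdata). A small sketch, with an arbitrarily chosen zone:

                                                               package main

                                                               import (
                                                                   "fmt"
                                                                   "time"

                                                                   // Embeds a fallback copy of the IANA database in the
                                                                   // binary (Go 1.15+); the system database is still preferred.
                                                                   _ "time/tzdata"
                                                               )

                                                               func main() {
                                                                   loc, err := time.LoadLocation("Australia/Lord_Howe")
                                                                   if err != nil {
                                                                       panic(err) // without the fallback, this fails on systems lacking tzdata
                                                                   }
                                                                   fmt.Println(time.Now().In(loc))
                                                               }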

                                                          1. 6

                                                             But it will silently have bad behavior on a system without tzdata rather than failing in a way that will allow the operator to install tzdata through system package management - and get updates. If you’re somewhere with relatively stable timezone rules you might never notice that your users are getting bad timezones until they complain.

                                                            1. 4

                                                              You’ll still get updates through Go: just compile with the latest Go release. Most people do this already since it’s very compatible.

                                                              1. 2

                                                                But the timezone db changes daily or weekly. And your OS will pull those updates automatically while Go releases are much slower and you’d need to recompile and redeploy your binary.

                                                                1. 7

                                                                  The tzdata package doesn’t release daily or weekly? 2020a is from a few days ago; 2019c is from September (and there were only 3 releases in 2019).

                                                                  1. 6

                                                                    Remember that this is opt-in, and only a fallback. If you prefer to not risk using out of date information, don’t use the tzdata package? Though then you have to ensure that your users/machines are fully up to date.

                                                                    1. 3

                                                                      If I understand correctly, yes it’s technically opt-in – but there’s no easy way to opt-out if a library dependency opts-in nor a mechanism to discourage libraries from importing it? cf https://github.com/golang/go/issues/38679#issue-607112207

                                                                      1. 2

                                                                        Your libraries can do plenty of bad things already, though. If anything, embedding tzdata is harmless compared to some of the nasty stuff one could do in a library’s init, like modify std error variables or declare global flags.

                                                                        I think the answer here, besides good docs, is vetting what dependencies you add to your module (and the amount of extra dependencies they bring in).

                                                                        1. 1

                                                                          That’s true and a fair perspective to take. I don’t care much personally, I was just trying to understand/clarify why some people resent this direction.

                                                            2. 5

                                                              Go prefers the system timezone database but will use the embedded data if it’s not available.
                                                              #38017 explains the use cases that this change resolves.
                                                              This change will mostly affect Unixlike systems without tzdata and Windows systems that don’t have Go installed.

                                                            1. 3

                                                              Why is RSpec performance unaffected?

                                                              By default in testing Rails wraps each scenario in a transaction, and that transaction is never actually committed, instead it’s rolled back during teardown so as to leave the database pristine. The author may have opted for another non-transaction-based database cleaning strategy in Cucumber (another strategy is commonly needed when the test server and client are two different processes, e.g. when interacting with a web app through a browser), and left the default in place in RSpec.

                                                              1. 11

                                                                Would you tell the community so that other devs don’t have to go through the same problem?

                                                                 • Cease work sooner next time. Not necessarily the 31st day after invoice if NET30 (especially if the customer has no prior late payments, it may be appropriate to grant a bit of slack), but soon enough that “big debt” is closer to medium debt or small debt.
                                                                 • Implement a small percentage discount for early payment (e.g. a 2% discount if paid within 15 days on NET30 terms). This small discount apparently is enough to move your account 15 days up the Accounts Payable list, and it is a common practice that a customer who would not accept, say, straight-up NET15 will still accept.

                                                                I have not filed a legal complaint or used a debt collection agency, it hasn’t come to that for me.

                                                                Do you believe the customer is insolvent? Are they a big established company or a startup? How is their credit? What are their assets?

                                                                Debt collection and legal fees are inevitably going to be large, and may be for naught if the company is in bad shape. I faced these circumstances in the past and settled directly with the customer for 50% of the debt wire transferred that day, and moved on, with no regrets to this day.

                                                                1. 11

                                                                   What I miss from those times: keyboard layout switching worked reliably and happened instantly. Now it causes loss of focus in the current window and long delays (so after pressing the “switch layout” keybinding, a few typed characters are still in the old layout), and there are few keybinding choices, such as only ctrl+shift and alt+shift. It’s painful now in all distros and I’m not sure if the old functionality (built into the X server, I think) can still be used.

                                                                   Also, both the latest Gnome and KDE have terrible UIs. Gnome tries to treat the desktop as a tablet computer and KDE has a Vista-era shiny plastic look.

                                                                  1. 8

                                                                     This is why I’ve stuck with xfce for so long. I am a bit concerned now that xfce is moving to gtk3; I fear it will end up more like gnome 3 than xfce.

                                                                    1. 1

                                                                      I think that’s unlikely. Most things are already ported to Gtk3 and they look exactly like they did on Gtk2.

                                                                    2. 5

                                                                       I don’t understand the hate for Gnome. When you critique Gnome’s UI, are you comparing it to the high water mark, the best desktop UI you’ve ever experienced, or to the latest iterations of macOS and Windows? Gnome isn’t developed in a vacuum; it is competing with the mainstream commercial desktop environments, which means compromises that negatively affect highly technical users, but results in a product that in some dimensions of UI may still be better than Windows and macOS, which is impressive IMO.

                                                                      1. 6

                                                                         I only dislike its desktop elements, mostly the top bar, which consists of a strange menu item in the left corner and a clock in the center. There is too much unused space around the clock. This is a bad UI decision originally implemented on the iPad, which has a standard mobile phone status bar on top (it originates from “feature phones”, not even the iPhone).

                                                                         GTK3, however, is great (at least on Linux) and I like the settings dialogs and Gnome apps.

                                                                        1. 3

                                                                           I appreciate that Gnome is at least trying to do something other than the “yet another Windows 95 clone” that the X11 world is fixated on. (Unless it’s a tiling WM… I wonder what a UX-oriented desktop organized around tiling would be like.)

                                                                          1. 2

                                                                            I’m comparing it to Gnome 2 and XFCE, and it fails terribly in this regard.

                                                                            XFCE is well-liked because it doesn’t try to abandon its user-base in favor of chasing some mass-adoption unicorns.

                                                                            If mass-adoption of Linux on the desktop ever happens, it will not be caused by Gnome 3 displaying fewer options in their GUI.

                                                                            1. 1

                                                                              Ah I definitely felt similarly when moving from Gnome 2 to Gnome 3. Once I got used to Gnome 3 though, I forgave Gnome. The spotlight is better than macOS. The built-in tiling is good enough. The default Debian themes are classy. The animations are classy and smooth even with integrated graphics. It never freezes or crashes. The best part is all these batteries are included so Gnome requires very few user choices or customization.

                                                                              1. 2

                                                                                 I use GNOME 3 on my Linux machine, but I can’t say that I am happy. How do you live without menus? Or system tray icons (for e.g. Dropbox, Keybase)?

                                                                                I know that there are some extensions that bring these things back, but they tend to reduce stability of GNOME. And with Wayland bugs tend to crash gnome-shell/mutter and log you out of the session completely.

                                                                                I wonder who they are targeting when they are removing features that have been part of the WIMP paradigm for more than three decades? No one wants big innovation on the desktop, just provide a robust, predictable desktop environment that is up to date with the latest standards (Wayland, Vulkan rendering, etc.).

                                                                                (Of course, it’s their project and they can do whatever they want to do with it, I just don’t understand the philosophy.)

                                                                                1. 1

                                                                                  I use spotlight for everything, it’s brilliant :/

                                                                          2. 5

                                                                             KDE has a Vista-era shiny plastic look.

                                                                             That’s much easier to solve (through the thousands of available themes) than this “Gnome tries to treat the desktop as a tablet computer”. For example, my own KDE setup looks like this: https://i.imgur.com/8eAze8v.png

                                                                            1. 1

                                                                               Does it crash often? The last time I used it, it crashed periodically (but that was a few years ago).

                                                                              1. 2

                                                                                I haven’t actually had a crash in over a year. It’s become much more stable in the past few months, no more flickering when adding/removing monitors quickly either.

                                                                          1. 4

                                                                             I currently work in fintech and, prior to that, ad tech. It’s soulless work on the best of days. My only issue is that there don’t seem to be many altruistic companies hiring, or at least in SV the signal-to-noise ratio is so bad that they get squelched by all the startups hiring.

                                                                            Maybe another weworkremotely needs to be made (think goodtechjobs.org) so that we can connect altruistic orgs with engineers who want to make a difference.

                                                                            1. 5

                                                                              Check out Binti. We provide web software to child welfare agencies that makes it easy for members of the public to become foster parents and helps agency staff approve foster families. Based in SF. https://binti.com/binti-careers/

                                                                              1. 2

                                                                                Is it open source or are you really just trying to replace these agencies with your proprietary apparatus?

                                                                                1. 2

                                                                                  Thanks for the good questions both of you.

                                                                                  We are in it for the long haul, and have plans to become a B Corporation.

                                                                                  Binti is certainly not replacing child welfare agencies - after all, agencies have entire teams of staff providing services - we’re giving them modern IT to help them do their jobs better.

                                                                                  Right now, Binti’s customers and prospective customers are widely interested in fully-managed SaaS - they are grateful for our operations, security, and compliance expertise - they aren’t seeking to operate their own systems. It still feels really good developing software that helps find kids homes, even though the software is proprietary, not open source.

                                                                                  We do allow the agencies to download their data in standard formats in a self-service manner, so we don’t believe we are introducing any undue lock-in. Also we provide source code escrow that grants our customers a wide license to our software in the event that we severely breach our contract.

                                                                                   I’d love to hear your ideas to reduce risk for our customers in the case of acquisition or going out of business, as well as ideas to harness capitalism to benefit the little guy, building an organization that gets many of the benefits of capitalism while causing the least harm.

                                                                                  1. 1

                                                                                    I am sorry that you do not see that by making proprietary tools you effectively replace the internal know-how of the public sector and make it more vulnerable to attacks from the private sector in the long run.

                                                                                    I have seen accounting departments that are no longer able to function without consulting the private supplier. And it gets worse every year as the accountants leave. I do not expect anything else here.

                                                                                    But don’t stress it much. Somebody will eventually rewrite the stuff, put it out in the open and drive you out of business. It’s cheaper.

                                                                                  2. 2

                                                                                    That’s a good point. The software startups that survive either IPO into Wall St control or get acquired by companies that didn’t get big playing nice. Most of them like lock-in and scheme on people. So, if it’s not FOSS or a non-profit, there’s a good chance that what’s helping child welfare agencies now might become something causing them problems later. That’s the kind of risk I’d never want to happen.

                                                                                     Also, @gkop, do note you can charge money for GPL’d software so long as you provide source on request or just have it in an FTP directory somewhere. You get the market share by branding, networking, and execution in general. There are lots of FOSS also-rans to proprietary apps whose companies just out-marketed and out-executed them.

                                                                                    1. 2

                                                                                      Thanks Nick, replied as sibling!

                                                                                1. 1

                                                                                  Sorry, I didn’t know that Lobste.rs discouraged this. Thanks for explaining rather than just downvoting!

                                                                                1. 3

                                                                                   I’ve encountered errors like the ones mentioned here and I’ve never even rolled my own CSV code before.

                                                                                  It’s actually a pretty terrible “standard.”

                                                                                  1. 4

                                                                                     There is an actual CSV standard, namely RFC 4180, so it isn’t merely a “standard” in scare quotes. Whether producers and consumers follow the standard is a different matter.

                                                                                    1. 1

                                                                                      Writing your own CSV parser isn’t that hard though. This post pretty much tells you all you need to know, it’s extremely straightforward to write a parser that handles all of it. If you’ve ever written a basic lexer for a programming language with strings you’ve done more than a CSV parser.
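                                                                                       To illustrate the claim (Go’s encoding/csv already implements RFC 4180, so this is purely a sketch of the per-record state machine; it ignores quoted newlines across records and error reporting):

                                                                                           package main

                                                                                           import "fmt"

                                                                                           // parseRecord splits one CSV record into fields, handling
                                                                                           // quoted fields and doubled-quote escapes.
                                                                                           func parseRecord(line string) []string {
                                                                                               var fields []string
                                                                                               var cur []byte
                                                                                               inQuotes := false
                                                                                               for i := 0; i < len(line); i++ {
                                                                                                   c := line[i]
                                                                                                   switch {
                                                                                                   case inQuotes && c == '"' && i+1 < len(line) && line[i+1] == '"':
                                                                                                       cur = append(cur, '"') // doubled quote = literal quote
                                                                                                       i++
                                                                                                   case c == '"':
                                                                                                       inQuotes = !inQuotes
                                                                                                   case c == ',' && !inQuotes:
                                                                                                       fields = append(fields, string(cur))
                                                                                                       cur = cur[:0]
                                                                                                   default:
                                                                                                       cur = append(cur, c)
                                                                                                   }
                                                                                               }
                                                                                               return append(fields, string(cur))
                                                                                           }

                                                                                           func main() {
                                                                                               fmt.Printf("%q\n", parseRecord(`a,"b,""c""",d`)) // ["a" "b,\"c\"" "d"]
                                                                                           }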

                                                                                      1. 3

                                                                                        It’s straightforward if your users are cool with your parser barfing on their malformed inputs. Lexer users expect it to barf. Not CSV users.

                                                                                    1. 1

                                                                                      I have a T460s and it seems like a solid machine overall. Granted, I’ve had it for .. uh, about half a year?

                                                                                       I wouldn’t vouch for the quality of its keyboard though. The scissor mechanism is really quite flimsy and cheap, and if you’re unlucky, you might find a “sticky” key. Oh, and the trackpad sucks for small motions. Like when you land the cursor right next to a tiny link you intend to click, moving your finger slowly results in no cursor movement at all until at some point it suddenly jumps over the link. There’s probably a software workaround waiting to be discovered.

                                                                                      I would like it more if it had easily replaceable & extensible batteries. Yes, it has two batteries..

                                                                                      It’s hard to give a more informative comment since I’ve no idea what sort of things you expect from a developer laptop.

                                                                                      1. 1

                                                                                        I would like it more if it had easily replaceable & extensible batteries.

                                                                                         Its predecessor, the T450s, had such batteries, and it’s awesome. The T460s introduced the tapered, wedge-shaped lower body (like an X1 or MacBook Air); it’s thinner and lighter and overall an excellent machine, but unfortunately without the swappable battery.

                                                                                      1. 2

                                                                                        In general http://www.notebookcheck.net/ is a great site for checking out laptops.

                                                                                         I can recommend the Dell XPS 15; I used it as my work laptop for the past 2 years with Linux Mint 17 and was very happy with it. I recently purchased a Dell XPS 13 as my personal couch/travel laptop and am also somewhat happy with it. Build quality of both is superior though.

                                                                                         A minor downside is that the most powerful (especially more RAM) versions of the XPS 15 came with a glossy 4K touch display that I have absolutely no use for. Similar for the XPS 13. The XPS 13 sadly is the first Linux laptop I ever had trouble with, which is weird as it is the only officially Linux-supported one I ever bought. Wifi only worked after package upgrades (good that I had an adapter for a wired connection around), plus I needed to deactivate some stuff in the BIOS. Plus the USB-C to VGA/HDMI adapter they sell does not work with Linux… the one I bought only works for HDMI. So, be aware.

                                                                                        As I just got back from researching laptops here are a few others:

                                                                                         • Thinkpads T/X1 as mentioned by others are quite nice/good. Also considered an “Ideapad”.
                                                                                         • The upcoming Asus Zenbook 3 also looks very promising, glossy display though.
                                                                                         • I’ve read good things about the HP EliteBook.

                                                                                         Personally I prefer laptops without a dedicated graphics card; I have a desktop for that. It saves weight, and switching between the internal and dedicated GPU is still subpar in Linux (in Linux Mint switching is built in, but you have to log out/in).

                                                                                        1. 3

                                                                                          Personally I prefer laptops without a dedicated graphics card

                                                                                           Ditto. There’s also the issue of dedicated graphics often being a point of failure (see, e.g., the GeForce 8600M issues of a few years ago). Intel video support is also often less troublesome when using Linux/*BSD systems.

                                                                                          1. 2

                                                                                            The last couple generations of Intel graphics are very impressive indeed. More than adequate for a development workstation (for example, they can push > 10 million pixels 3D-accelerated). And great battery life and pretty good Linux drivers.

                                                                                        1. 3

                                                                                          The universal SSL thing is egregious. You just can’t take the company seriously when they allow this so nonchalantly. Browsers should mark these endpoints the same as plaintext endpoints.

                                                                                          1. 1

                                                                                            Even if Go were perfect on a technical level, the online community is really stiff compared to Elixir, JavaScript, Ruby, and Rust. That’s reason enough to stay away.

                                                                                            1. 2

                                                                                              This is a pretty nice primer.

                                                                                              I wish Docker was more polished. In practice, I’ve found it buggy and brittle and altogether annoying to use as a development environment. I do not share the rosy attitude of the author :(