1. 3

      I like the overarching idea, but for those who need more details, the readme on this repo gives a more detailed high-level overview as well as a link to the spec, talks, tutorials and implementations: https://github.com/solid/solid

      1. 1
        • Polygon Mesh Processing — Just put this on the shelf because the math was getting too dense for me. The topics were interesting, I just couldn’t grok it.
        • Learning From Data — An overview of machine learning recommended to me by a research scientist. The math gets pretty dense also, but I might be more motivated to get through it than Polygon Mesh Processing.
        • The Deep Learning book — Currently on the shelf while I make it through Learning From Data. More of a long-form text book on the subject, but it does a very good job of teaching any math required to understand the topic.
        • A Primer on Infinitesimal Analysis — Being delivered soon. I realize that I suck at calculus and someone recommended this book as the best way to get more comfortable with calculus. Every other book in this list uses calculus pretty heavily and…yeah, I don’t do calculus.
        1. 4

          Hopefully this will give you some more motivation, but can I just say how much I love calculus? It has absolutely changed the way I think.

          If I’m driving, I’m thinking about calculus, relative rates of change, area between curves.

          If I’m thinking about game design, it usually involves calculus. For instance, what is the nature of the advantage one team in League of Legends obtains when they get an early first kill? If you think about the power curve of each team, then getting a kill is essentially bumping your team’s curve by adding in a constant. But power curves in the game have a positive second derivative (think of an exponential curve: as you get powerful, it is easier to get more powerful).

          With that knowledge, we can see that the area between the two curves will be at its greatest just after the kill, when the constant is having the greatest impact, but will taper off as the game progresses, getting lost in the other natural power gains. This is the most natural negative feedback loop in the game, and absolutely essential to keeping the game competitive.
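A rough sketch of the argument above, with made-up numbers. It assumes the baseline power curve is a plain exponential and that a kill adds a fixed bonus; the function names, the growth rate and the bonus size are all hypothetical. It tracks the bonus as a fraction of baseline power, which is one way to read "getting lost in the other natural power gains".

```python
import math

def power(t, growth=0.1):
    """Baseline team power at minute t (assumed exponential, not real game data)."""
    return math.exp(growth * t)

def relative_advantage(t, bonus=1.0, growth=0.1):
    """The constant kill bonus expressed as a fraction of baseline power at minute t."""
    return bonus / power(t, growth)

# The bonus is the same size all game, but it matters less and less as power grows.
for minute in (0, 10, 20, 30):
    print(minute, round(relative_advantage(minute), 3))
```

The absolute gap never changes in this toy model; only its weight relative to everything else shrinks, which is the negative feedback being described.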

          There are other negative feedback loops too: a character that has died many times in a row being worth less money, and ending an enemy kill streak rewards extra gold. A lot of people tend to see negative feedback loops as more casual and less competitive, like blue shells in Mario Kart games. With one more insight, we can see why this isn’t the case.

          You win a game of League by destroying the enemy base, which requires some given amount of power advantage over the enemy team (that is to say, there is a cost of destroying the base: the damage output of the turrets, the locality of the enemy spawn and minions, the time it takes to destroy everything). Because the power level of the game is constantly increasing, the relative cost of winning the game shrinks.

          Winning the game means gaining enough advantages in a row that the space between the power curves is larger than the power cost of destroying the enemy base (and executing it all properly, of course, but that can be thought of as an additional power cost).

          So why is diminishing impact of advantage so important? It means your team can’t just build up enough advantage over the game in one-off victories, but must do so quickly. You must stack enough advantages in a time period, and if you waste the opportunities, or the enemy team interferes sufficiently, then your advantage fades and the game tends back towards neutral*.

          (* Neutral isn’t actually neutral, because different heroes have different power curves. Some heroes are strong early and taper off, others are weak but become very powerful towards the middle or end game. The game is constantly tending towards this character defined power curve, more or less, and in the absence of other factors, the pressure to win is on the team that would naturally lose in the long run.)

          A quick proof by contradiction: if there weren’t negative feedback loops, predicting the victor of a match would use a simple tally system, comebacks would be rare, whoever started out winning would probably end up winning, and the end of most matches would have very little tension. Skill would be far less of an advantage, because you would only have to get an advantage, not fight to keep and expand it.

          Without using too much of the language of calculus, hopefully you can see how important the intuition of it all is. The relative rates of change, how advantages impact that rate over time, how advantage dissipates relative to the overall power of the game, and how advantage and winning are time sensitive (which is to say, they change).

          Just thinking in arithmetic and algebra won’t get you there, and the game can’t really be modeled discretely.

          Anyway, I love calculus and it seems to come up everywhere, even outside of “hard math/science” areas. I really hope you enjoy studying it :D

          (The irony in all of this is that I don’t actually play League. I tend to get too frustrated to enjoy it.)

        1. 12

          I hope with the backlash from TDD, people will stop trying to shoehorn all their testing as unit tests. Most sprawling software projects would, overall, get far more useful test coverage from an integration test suite. Unit tests should be used for actual units: test your parser, test an especially complicated pure function, etc.

          1. 3

            My takeaway from the article (maybe I’m reading too much into it) is that we can do a better job of creating units, and that we should focus on getting better at making independent, loosely coupled components. If that happens, unit tests won’t get in the way. But, in the meantime, unit tests are definitely getting in the way so maybe we should write larger tests (component, integration, etc) and spend the saved time focusing on learning better design. It still seems very theoretical to me, but I’d love for someone to demonstrate what this looks like in practice.

          1. 6

            I send people links to this talk on a more-than-monthly basis. It’s absolutely the most powerful introduction to functional programming I’ve ever seen, and provides a way to fit the benefits of FP into most any code base in most languages.

            1. 6

              Posted a comment on the blog but since the author is answering here…

              “Abandon SQL” and do what? As long as there is a “query language” there will be concatenating strings.

              If you’re saying, use one of those JSON-as-QL things, to what degree do they avoid the problem? If it’s just a matter of making group_by, where and other clauses keys in a dictionary…well, now “and” should be a key in a dictionary, too. As long as everything in the query is represented in an AST-like form, it’s safe; but as soon as there is an expression language it’s back to zero. When you’ve got an AST in JSON, the underlying language doesn’t matter all that much. I don’t think we want to ask analysts to type or read things like: {"select": {"from": ["a_table", "b_table", {"select": { ...sub-select... }}], "where": {"and": [{"equals": [...expression..., ...other-expression...]}]}, ... } }

              1. 3

                Pasting my reply here because there’s a lot of good thought going on here…

                The issue is that the default way to interact with a relational database is SQL strings. The SQL is parsed on the data plane, so developers commonly concatenate strings to form queries. This is the most obvious way to do SQL, and that’s the problem.

                If the executable code was loaded out of band and we only sent data parameters this wouldn’t be an issue. This is exactly what stored procedures and prepared statements are. These tools have existed for decades, yet we’re still getting hacked with easy-to-prevent vulnerabilities.
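For anyone who hasn't seen the difference side by side, here is a minimal sketch using Python's built-in sqlite3 module. The table, the rows and the hostile input are invented for illustration. The first query splices user input into the SQL text; the second sends the SQL and the value separately, which is the "only send data parameters" idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

# Hypothetical attacker-controlled input.
hostile = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the quote characters
# change the meaning of the WHERE clause.
unsafe_sql = "SELECT name FROM users WHERE name = '" + hostile + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the SQL text is fixed and the input travels as a bound parameter,
# so it is only ever compared as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated version matches every row in the table, while the parameterized version matches none, because no user is literally named that string.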

                This is where I think, “why are we even using SQL?” I mean, we’ve proven that we can be productive with RethinkDB, DynamoDB, and many other NoSQL databases. I’d love to keep using relational databases, but is there anything inherently tying relational databases to SQL? No, not really, other than history. Maybe it’s time to make some changes.

                1. 5

                  RethinkDB is very similar to the so-called “SQL expression languages”, DSLs – like arel for Ruby and SQLAlchemy for Python – that let one express queries with method chaining and then take care of the SQL for you.

                  Is that something like what you had in mind?

                  One thing I wonder about is the inclination to build brand new databases along with these new approaches to data access. A modern relational database offers consistent and good performance, reasonable efficiency with regards to memory and CPU, and stability. In principle, ReQL could be used to drive Postgres or MySQL. Adopting ReQL in this way could really benefit the teams at any company where I’ve worked. Adopting yet another remix of how to put rows and columns on disk, on the other hand…

                  1. 4

                    Datalog and Prolog come to mind on your last question given they surpass it:

                    http://stackoverflow.com/questions/2117651/comparing-sql-and-prolog

                    Datomic is getting a lot of mileage out of one, too.

                    1. 3

                      When you use any programming language, you also write strings. You most certainly don’t write abstract syntax trees directly, although hopefully those strings are relatively easy to convert to ASTs.

                      However, in any sensible programming environment, there’s a separation between a validation phase, where your program might fail for arbitrary reasons (because humans can’t reliably write meaningful programs), and an execution phase, where ideally failure may only be triggered by external factors (bad user input, dropped network connection, etc.), not internal ones (bugs). In statically typed languages, the validation phase usually coincides with “compile time”, but all that is required is that it strictly precedes the execution phase.

                      The problem with raw query string manipulation is that it unnecessarily creates more things to validate at runtime, defeating the purpose of the phase separation. But this problem is by no means exclusive to SQL, or even caused by it. Raw string manipulation isn’t the only way to embed SQL (or any other DSL) into a general-purpose programming language.
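A small sketch of that phase separation in a dynamically typed language, using Python's built-in sqlite3 module. Everything here is hypothetical: the `SelectByColumn` class, the `ALLOWED_COLUMNS` schema table and the `users` table are invented for illustration. The shape of the query is checked once, when the query object is built at startup; at request time only values are supplied, and they go in as bound parameters.

```python
import sqlite3

# Hypothetical schema description used for validation.
ALLOWED_COLUMNS = {"users": {"id", "name"}}

class SelectByColumn:
    """A query whose structure is validated once, at construction time."""

    def __init__(self, table, column):
        if column not in ALLOWED_COLUMNS.get(table, set()):
            raise ValueError(f"unknown column {table}.{column}")
        # Only names that passed validation reach the SQL text.
        self.sql = f"SELECT * FROM {table} WHERE {column} = ?"

    def run(self, conn, value):
        # Execution phase: the value is bound, never spliced into the text.
        return conn.execute(self.sql, (value,)).fetchall()

# Validation phase: runs once, before any user input exists.
FIND_USER_BY_NAME = SelectByColumn("users", "name")

# Execution phase.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
rows = FIND_USER_BY_NAME.run(conn, "alice")
print(rows)
```

A typo in a column name fails when the module loads rather than when a particular request happens to hit that code path, which is the point of keeping the two phases apart.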

                  1. 9

                    I use Haskell SQL libraries (Persistent + Esqueleto) that make these problems impossible, through a well typed interface.

                    There’s also a Haskell library that sets up prepared statements for you automatically: http://hackage.haskell.org/package/hasql

                    1. 4

                      Every language has a library or tool that makes it easy to avoid SQL injection. This has been true for decades. Yet here we are, still paying the high cost of a system that’s insecure by default.

                      1. 4

                        Persistent and Esqueleto make building and running SQL queries by concatenating raw String values more inconvenient than they make running safe queries. So do nice ORMs like Python’s sqlalchemy. For example, in sqlalchemy, you are unlikely to write:

                        query = query + " WHERE foo='" + bar + "'"
                        

                        with strings because it’s more characters than writing:

                        query = query.where(Table.foo == bar)
                        

                        with Query objects, and the documentation deliberately de-emphasises the former way.

                        The point isn’t only that there’s a way to avoid SQL injection, but also that the way that avoids SQL injection is strictly more convenient than the way that permits SQL injection.

                    1. 8

                      I get where he’s coming from but it’s still bad hyperbole. Who says that bad developers won’t write insecure code using other DBs?

                      If a given website has SQL injection errors there’s a really good chance it has a whole bunch of other available attack vectors.

                      The real issue, to me, is that jobs like plumber and electrician have licenses required to be able to sell your services as one of those. Programming does not, and I would argue it’s a much more complicated job that is harder to see problems in for lay people.

                      1. 2

                        This is a great reaction to the post. Honestly, I don’t expect “programming licenses” to appear tomorrow. In my experience, that’s going to take decades. Yet in the meantime we still have a huge problem.

                        The bigger point that I wanted to make (not sure how well I did) was that sometimes a problem is simply so disastrous that it’s not worth all the value it offers. If you tell me that “all we need to do is be more careful with user input” I’m going to fucking flip shit. We’ve been doing it for decades and we still can’t do it. Elections are being compromised. Identities are being stolen. People are going broke. When does it end? When do we finally decide that SQL injection is just too disastrous a problem that it’s not even worth using SQL?

                        1. [Comment removed by author]

                          1. 3

                            If the lock truly was faulty and provably inadequate you could absolutely go after the locksmith and/or the lock manufacturer for their culpability in not doing their job sufficiently.

                            1. 3

                              I can get locks right now that nobody knows how to bypass and others that most thieves don’t know how to bypass. There should be no break-in if the lock design works as advertised. I’d absolutely blame the artisan using a defective lock. I wouldn’t blame them if a new class of attack made a previously-good lock breakable. See how that works?

                              Besides, there are correct-by-construction techniques for doing many common things in programming. “pyon” and I mention a few in another part of this thread. Using broken methods when not necessary is negligence.

                              1. 3

                                There’s a big difference: Physical systems can always be broken, if you have enough resources and try hard enough. You can’t make your house impossible to break in. You can only make it not worth a thief’s time to try. And law enforcement officers will probably still be able to break into your house if they deem it necessary. (Not precisely because locks can magically distinguish between thieves and LEOs.)

                                On the other hand, SQL injection is the kind of attack that is outright impossible in a properly designed software system.

                                1. 1

                                  But if your village had two artisans, one with a licence and the other without, I would blame myself for picking the wrong one.

                                  1. 2

                                    When the US government hires the wrong developer to write an app for the elections agency and he uses SQL in the simplest, most obvious way, who do you blame?

                                    1. 4

                                      Both he and the government. DOD & NSA have standards for highly-assured systems. Medium, too, if cost or complexity prohibits high. There are also lots of solutions in GOTS and COTS to these problems. They could negotiate them free or at cost for budget operations as a term for their other lucrative contracts. They just don’t care. That’s what it boils down to. Neither does Congress in their policies. So, I blame Congress and Executive branches primarily if we’re talking bad INFOSEC in government.

                                      Contractor was incompetent, too, but wouldn’t have been hired if the government was acting competently.

                                2. 2

                                  The problem is that’s just kicking the can down the road.

                                  You can say the same thing about a million things; people still don’t handle credit cards securely, even big companies. Does that mean we stop using credit cards?

                                  No, there’s no easy solution (not that abandoning SQL is an easy solution but it is a simplistic one.)

                                  I agree licensed programmers are not happening any time soon but that’s really what’s needed. We need to form a union and have a licensing process.

                                  1. 1

                                    In a very wide range of applications - perhaps not all, to be sure - SQL is overkill and more limited alternatives that are better integrated with mainstream programming languages are perfectly adequate, and reduce the rate of security vulnerabilities.

                                    1. 3

                                      So, in other words, “let’s dumb down databases until they’re just as inexpressive as most programming languages”?

                                      1. 5

                                        I definitely support simplifying error-prone systems until they are safer, in general. I wouldn’t have said it how you did, but I’m willing to agree with that statement of intent.

                                        SQL is already a DSL though, and it’s about as simple as it can be for the problem it solves. What we can do on the technical front is build type-safe wrapper APIs and heavily discourage or prohibit the use of direct string concatenation to build queries, but getting people to use them is still hard.

                                        I’m gradually coming around to the idea that licensing has the ability to imbue a sense of responsibility, but it’s hard to imagine… perhaps because we’ve never had it in this industry.

                                        1. 4

                                          You won’t find a stronger proponent than me of disallowing raw string concatenation as a means to build queries, except perhaps as a low-level implementation detail that application programmers shouldn’t concern themselves with. As my other posts on this thread make patently clear, my favorite take on how to safely build queries is Ur/Web, mostly because it’s based on typed metaprogramming, which makes error detection happen as early as possible during the build process. But code generation (Lisp macros, Template Haskell, etc.) is of course another perfectly valid approach.

                                          However, I have to vehemently oppose limiting what can be expressed in a database schema. Types and database schemata are how we make sure that our programs and our data make sense, respectively. The ability to fearlessly refactor an arbitrarily complex piece of database code comes from the certainty that any nonsense in your queries will be located and properly flagged. If programming languages “don’t grok” the structure of your database, they have to be smartened up until they do.

                                        2. 1

                                          Yes. Ideally I’d dumb down the datastore to, at least, a non-Turing-complete level. Just as I’d dumb down e.g. web templates. Having two different ways to express business logic is asking for trouble - sooner or later you’ll get the same logic expressed in both, and sooner or later it’ll get out of sync.

                                          Either the datastore should be dumb, or it should be language-integrated. If SQL were really a first-class general-purpose programming language then it would make sense to write complete systems in it, but I don’t think I’ve ever heard of that approach succeeding.

                                          1. 3

                                            I’m not a huge fan of the one feature that makes SQL Turing-complete (common table expressions), but the database is the natural place to express many business rules, which are often declarative in nature.
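For readers who haven't met the feature, this is a minimal sketch of a recursive common table expression, run through Python's built-in sqlite3 module. The `fib` CTE and its column names are invented for illustration. It only shows that a query can loop and carry state; it is not a demonstration of Turing-completeness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each recursive step reads the previous row (n, a, b) and emits the next one.
rows = conn.execute("""
    WITH RECURSIVE fib(n, a, b) AS (
        SELECT 0, 0, 1
        UNION ALL
        SELECT n + 1, b, a + b FROM fib WHERE n < 10
    )
    SELECT a FROM fib ORDER BY n
""").fetchall()

fib_numbers = [a for (a,) in rows]
print(fib_numbers)
```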

                                            The inability to embed SQL into a metalanguage (namely, your programming language of choice) is a failure of the metalanguage, not SQL. Java’s type system is incapable of typing the relational algebra operators. GHC Haskell and Scala probably can, but only by explicitly using type-level maps (to keep track of the names and types of a relation’s attributes), which makes this approach unusable in practice. But Ur/Web, whose type system has just the right features to type derived relations (row polymorphism, disjointness assertions), makes using SQL as an embedded DSL a breeze.

                                            1. 1

                                              In principle I’m sympathetic. But if these features can only be expressed in an obscure/immature language, are they really that critical to business functionality? How have non-database programs managed without them so far?

                                              1. 2

                                                Data integrity is nonnegotiable in the kind of applications I write. The question is not whether to enforce it or not, but rather whether to enforce it automatically in the database or manually in the application. At first sight, this is just a matter of personal preference, and, given how annoying SQL can be, it’s hard not to prefer to do it in the application side. But when you take transactions, concurrency control and error recovery into account, the advantages of doing it in the database side become clear.
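A minimal sketch of what enforcing a rule in the database looks like, using Python's built-in sqlite3 module. The `accounts` table, the non-negative balance rule and the `transfer` helper are all invented for illustration. The constraint is declared once in the schema, and the transaction makes sure a rejected transfer leaves no half-applied update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100), (2, 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; the database rejects overdrafts."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint fired; the credit above was rolled back too.
        return False

ok = transfer(conn, 1, 2, 60)
rejected = transfer(conn, 1, 2, 60)
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(ok, rejected, balances)
```

The application code never checks the balance itself. Doing the same check by hand in the application would also need its own answer for two transfers racing each other, which is where the database side starts to pay off.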

                                                As for how non-database programs have managed so far, I don’t know, because my professional experience is primarily with database applications. But in my admittedly biased opinion, database applications are a common enough use case that they deserve special attention from programming language designers. Database applications tend to manipulate data in ways not previously foreseen by the database designer, so you want to optimize for flexibility (which is precisely what the relational model does!). By contrast, if you write, say, a compiler or a text editor, you can (and most likely want to) plan ahead what data structures you will use, and if you ever decide to change data structures, it will trigger a Major Refactoring Event ™.

                                        3. 2

                                          …more limited alternatives that are better integrated with mainstream programming languages…

                                          Like what?

                                          1. 1

                                            SQL is great if you really need ad-hoc querying with partial indices created on insert, can fit your data model into square tables, and really need full ACID. IME that’s actually quite rare and what tends to happen is either people use it as a key-value store and do their aggregation in an explicit userspace batch process (in which case they would be better off with a key-value store designed as such, and map-reduce style batch processing or CQRS/lambda-architecture style near-realtime-but-nonblocking aggregation), or people need truly ad-hoc querying and put indices on every column and then have write load problems caused by the transactional model, and would be better off with an analytics-oriented store that was explicitly more lossy. Needing the ad-hoc model and the strict transactionality at the same time is quite rare (for exploratory programming you rarely need precise answers, if you need a precise answer you normally know the precise question), and you’re better off picking one or the other.

                                            1. 2

                                              But to be concrete, which datastore are you recommending?

                                              Typically businesses need to see ad-hoc query, analysis, and OLTP performed on “the same data”. Not the same datastore, but the same dataset. Having all the data in one integrated datastore does make that really easy for even moderately large sites. Replacing a SQL database with multiple datastores and pipelines between them is a hard sell purely on its merits, not even considering the current state of the industry.

                                              1. 1

                                                If you make me pick just one, Cassandra.

                                                In the current industry it’s easier than ever to run multiple datastores and have a replication pipeline or regular batch job. You can even keep the traditional SQL database interface for ad-hoc BI-type work - the transactionality issues don’t matter if it’s read-only and the security issues matter less if it’s internal-use-only.

                                                1. 1

                                                  So coming back to your original comment, with regards to Cassandra as a more limited alternative that is “better integrated with mainstream programming languages” – what are the big advantages that you see with regards to the API that it exposes?

                                                  1. 1
                                                    • Easier to make a value look like a programming-language value (in particular, collection types rather than having to map collections to tables)
                                                    • Explicit distinction between looking up by key and queries that will result in a full table scan - these operations have very different behaviour but in SQL they look the same.
                                                    • “External” map-reduce aggregation possible - so you can reuse your business logic when doing aggregations
                                                    • Key-value model covers the most important cases of the “arbitrary subset of columns” model that SQL supports, but you can write the type of it easily
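To illustrate the second bullet, here is a small sketch using SQLite through Python's built-in sqlite3 module as a stand-in for "an SQL database"; it says nothing about Cassandra's own API. The `users` table and the `plan` helper are invented for illustration. The two SELECT statements look almost identical, but the planner treats one as a key lookup and the other as a walk over the whole table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def plan(sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    # sql is a fixed literal in this sketch, never user input.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

by_key = plan("SELECT * FROM users WHERE id = 1")
by_value = plan("SELECT * FROM users WHERE email = 'a@example.com'")

print(by_key)
print(by_value)
```

The exact wording of the plan text varies between SQLite versions, but the first plan starts with SEARCH and the second with SCAN. Nothing in the query text itself signals that difference.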
                                                    1. 2

                                                      “External” map-reduce aggregation possible - so you can reuse your business logic when doing aggregations

                                                      I’m understanding this to mean, having the app map over rows; but that doesn’t seem like a feature of Cassandra.

                                                      Key-value model covers the most important cases of the “arbitrary subset of columns” model that SQL supports, but you can write the type of it easily

                                                      If I understand this right, Cassandra is focused on storage and retrieval of whole objects; the type of every result is very likely a type that already exists in your application; whereas a SQL query can return any product of the columns of these types or any subset thereof.

                                                      1. 1

                                                        I’m understanding this to mean, having the app map over rows; but that doesn’t seem like a feature of Cassandra.

                                                        It exposes a hadoop-compatible API directly. You can of course build the same thing for an SQL database but it’s harder: popular querying APIs are not streaming-oriented (I once saw an SQL server brought down because someone had visited a web page 23 days previously - it had been chugging through figuring out the result set for those 23 days, and then started trying to actually stream the results and stopped responding to any other queries), and you have to be very careful with your transaction isolation levels if you’re going to run a long-running query. (And it can be difficult to test, since there’s a “fun” failure mode where the results start off streaming fine, but get slower and slower the longer the query runs as the “snapshot” diverges from the live database).

                                                        If I understand this right, Cassandra is focused on storage and retrieval of whole objects; the type of every result is very likely a type that already exists in your application;

                                                        Yeah. You end up meeting in the middle - you split your datatypes up to be more storage-friendly, or else you have a separate DTO layer with a storage-oriented representation. Which may sound like more overhead, but I find it’s a lot more practical (in terms of testability etc.) to have an explicit transformation in plain old code before a network boundary than to have complex mapping commingled with the remote call.

                                                        1. 1

                                                          I’m understanding this to mean, having the app map over rows; but that doesn’t seem like a feature of Cassandra.

                                                          It exposes a hadoop-compatible API directly. You can of course build the same thing for an SQL database but it’s harder: popular querying APIs are not streaming-oriented (I once saw an SQL server brought down because someone had visited a web page 23 days previously - it had been chugging through figuring out the result set for those 23 days, and then started trying to actually stream the results and stopped responding to any other queries), and you have to be very careful with your transaction isolation levels if you’re going to run a long-running query. (And it can be difficult to test, since there’s a “fun” failure mode where the results start off streaming fine, but get slower and slower the longer the query runs as the “snapshot” diverges from the live database).

                                                          I gather this is all client side.

                                                          I hear what you are saying about streaming – relatively few query APIs are built using the async APIs exported by databases or build off of cursors.

                                                          I am not sure the transaction snapshot handling of Cassandra is a selling point, since it seems to be possible to see old rows or mixed writes in a variety of scenarios. Please correct me if I am wrong.

                                                          1. 1

                                                            I gather this is all client side.

                                                            Query prioritization/scheduling is the server’s responsibility - the 23-day query should never have been permitted to drown out all the usual business queries. Using the correct transaction isolation level is the client’s responsibility up to a point, but if getting it right is difficult and error-prone then that’s a system failure.

                                                            I am not sure the transaction snapshot handling of Cassandra is a selling point, since it seems to be possible to see old rows or mixed writes in a variety of scenarios.

                                                            Depends on the use case. IME in the case of big, aggregating, ad-hoc queries (usually BI-type work), you would rather have inaccuracy than deadlocks. And I think Cassandra’s limitations are more visible, which drives better data design, whereas SQL databases tend to paper over the problems with your data model more.

                                                            1. 1

                                                              I gather this is all client side.

                                                              Query prioritization/scheduling is the server’s responsibility …

                                                              I am sorry to belabor the point but the map/reduce – it is just a client-side façade? It’s not actually distributed or anything?

                                                              1. 1

                                                                Map/reduce on Cassandra is, or at least can be, genuinely distributed (assuming your Cassandra instance is). Sorry if I was unclear. I don’t think distributed vs not is the important property though.

                                  1. 1

                                    I’m skeptical about how valuable this will be. The JVM has a really good garbage collector and Scala leans on GC very heavily.

                                    1. 3

                                      My guess is that it will continue to be garbage collected, but instead of compiling to java byte code, it will compile to llvm IR.

                                      1. 2

                                        Sort of by definition, right?

                                        I think @kellogh’s point is that it’s unclear what the benefit of Scala compiling natively would be. Java has gone through this already with gcj and that hasn’t been a big success. .NET is trying it now too, though. IMO, compiling natively is only a real benefit if the language happens to have some semantics where native compilation is an advantage. I’m not sure that is true for any JVM-based language; the semantics are just really not flattering for a native architecture.

                                        1. 1

                                          I think the difference is that Scala devs are a bit more pragmatic and are willing to make some small concessions toward making things run better, see for example Scala.js: Multi-threading isn’t really supported in the browser, but the world hasn’t ended yet.

                                          As far as I know, all that crazy every-instance-has-an-associated-monitor won’t be supported for instance on Scala Native.

                                          If Scala.js can be used as guidance, another important difference is that these projects take their host platform seriously, it’s not like some half-assed “let’s pretend to be Java like in gcj’s case”.

                                          JavaScript is a first-class citizen in Scala.js, just as Java is a first-class citizen when using Scala. I think Scala Native will be the same with first-class interop with native libraries.

                                    1. 4

                                      I’m quite pleased with the syntax they’ve chosen. I think it will do an effective job of lowering the cognitive overhead of FP principles for developers of C-like languages. For instance, I love that, in the tuple syntax, they seem to be forcing you to write out both the type as well as the name:

                                      var ll = new (double lat, double lon) { lat = 0, lon = 0 };
                                      

                                      When Java developers see my Scala code, they tend to point at usages of tuples and say things like “hey! it’s not readable!”. I imagine the bulk of C# developers feel the same way, so this is probably a necessary decision to make C# developers feel comfortable with the new syntax. I know for certain that this will rub many Haskell developers the wrong way (and they have good reasons), but I think they’ve made the correct decision for C#.

                                      On the same note, I also think using switch/case/break is actually the correct decision for the same reasons (familiarity). Also, the way they implement custom destructuring/pattern matching is different, but overall seems like a very C#-way to do this.

                                      I’m very excited to see how the language is developing. More and more, it seems like the perfect platform to express the benefits of FP while (mostly) avoiding the elitist side.

                                      1. 2

                                        Eh, but I think the entire Iceland thing is a sideshow here. It’s almost as if they’re ignoring the Big Flashing Marquee and concentrating on the tiniest, furthest, least significant corner of it.

                                        1. 1

                                          It is possible that the company just doesn’t deal with citizens of the USA.

                                          1. 2

                                            No, US citizens were definitely involved. I saw somewhere (Reddit) that they’re planning on releasing that info but they haven’t yet due to an unspecified reason. Some were speculating that certain finance laws made it more difficult to release information for US citizens (can’t find the source, sorry).

                                              1. 1

                                                OK, ok. I stand very well corrected.

                                          1. 26

                                            My first job, almost 8 years ago, was sub-contracting to test software on the F-35. They subcontracted the unit testing to us, a much smaller company (“they” were actually a subcontractor of another company). We didn’t have requirements, but we did have pressure to stay on schedule, so we wrote tests anyway. We had excellent coverage – every single byte of machine code was covered as well as every possible branch (if we couldn’t cover an instruction or branch, we left an official note explaining why). But since we didn’t know what the software was supposed to do, these tests just gained coverage. Our customer was happy to pay us and report 100% test coverage to their superiors.

                                            1. 4

                                              I wonder if non-programmers understand the difference between lines-of-code test coverage and input domain test coverage.

                                              1. 5

                                                Not at all. Code is the closest thing to a physical artifact, so saying you’ve tested 100% of the code sounds great. Internal state and input domains are much more abstract concepts.

                                                1. 5

                                                  In engineering I’ve met quite a few people who do, though maybe not phrased exactly like that. It helps that input domains can be roughly mapped to the non-software idea of testing equipment by establishing a set of operating conditions that it is supposed to function within, and then getting good test coverage of the conditions within those parameters (including any unusual/extreme/unlikely conditions included in the range). The key bit anyway is that you’re supposed to get good coverage of the conditions, not only of the equipment, i.e. full test coverage of every last bit of a part isn’t good enough if you test it only under one temperature or load. And somewhat analogously with software.
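
                                                  A tiny sketch of the same distinction in code, in Python. Everything here is hypothetical (safe_ratio and the helper names are invented for the illustration): a single test executes every line of the function, so a line-coverage tool would report 100%, yet sweeping a few more points of the input domain still finds failures.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Hypothetical function under test: one straight-line statement."""
    return numerator / denominator


def line_coverage_test() -> bool:
    # This single call executes every line of safe_ratio,
    # which is all that a line-coverage report measures.
    return safe_ratio(6.0, 3.0) == 2.0


def input_domain_failures(cases):
    """Return the (numerator, denominator) pairs on which safe_ratio raises."""
    failures = []
    for numerator, denominator in cases:
        try:
            safe_ratio(numerator, denominator)
        except ZeroDivisionError:
            # Same lines executed as in the passing test, different operating condition.
            failures.append((numerator, denominator))
    return failures
```

                                                  The passing test and the failing inputs exercise exactly the same code; only the conditions differ, which is the part the coverage number cannot see.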

                                              1. 3

                                                US, PST

                                                1. 2

                                                  Out of curiosity, will there be Windows support in the future?

                                                  1. 9

                                                    How do any of these points prove they have a quality problem?

                                                    The last point seems particularly stretched. Is there any organization anywhere that doesn’t experience fewer introduced defects when nobody is working? In other news, the Coca Cola bottling plant broke fewer bottles when everybody was on vacation too.

                                                    1. 8

                                                      The Coca Cola bottling plant had zero activity when no one was working. Facebook’s activity remains the same on weekends and holidays.

                                                      1. 3

                                                        It’s just a bad analogy - presumably the same number of people were drinking Coca Cola on the weekends.

                                                        1. 2

                                                          Nor do people stop drinking Coke when the plant that makes it is closed.

                                                          1. 1

                                                            Bugs and breakages and defects have a power law distribution in software and gross number is not a good measure of total cost. Most bugs are annoying and minor; occasionally, there’s a major bug that is a serious threat to the business (or worse).

                                                            Defect/crisis count is going to go up when people are working, but a lot of that is issue detection.

                                                            So yeah, I’m underwhelmed by this finding. It doesn’t prove that Facebook has a worse code quality problem than any other organization or even (although I don’t assert this to be true) refute a potential claim that it has better code quality.

                                                            1. 8

                                                              Defect/crisis count is going to go up when people are working

                                                              I love that we, as a community, stress the importance of distributed systems and fault tolerance, but statements like this indicate that human error is still by far the leading cause of outages. It’s not just your statement, this article also says the same thing and so does my experience.

                                                              If you have more bugs when people are working, what do you think the cause is? Probably the humans. How do you fix the humans? Well you make it harder for them to make mistakes. Code quality could be one thing to fix, also deployment procedures, design reviews or possibly company culture. From my experience, fixing code quality goes a long way toward reliable software. It’s not certain, but I’d also bet that Facebook is running into a code quality issue.

                                                        1. -2

                                                          Seems like pbcopy and pbpaste for OS X but strays further from the Unix philosophy (do only one thing and do it well). There’s a lot of cruft there for a utility that only needs to set & read the clipboard. It doesn’t even do a good job of interacting with the clipboard. What about HTML vs TXT? What about UTF-16? Big endian or little?

                                                          1. 8

                                                            Obviously you didn’t read the README very carefully.

                                                            Also, your concerns regarding HTML and encoding are irrelevant.

                                                            1. 2

                                                              My bad, I didn’t read carefully enough. This is a very cool addition to pbcopy

                                                          1. 5

                                                            It’s probably appropriate to link to Eric Lippert’s Top 10 Worst C# Features, posted on lobste.rs earlier this week. The TL;DR is that most of his regrets stemmed from making a language that felt familiar but worked differently just enough to be highly confusing. It seems like Crystal might be following exactly in these footsteps.

                                                            1. 8

                                                              I think a thing missing from Stack Overflow’s worldview is that, as you move along in your career, your questions either:

                                                              1. Start being super super super specific to your situation (“On WhateverBSD 4.2, using Erlang HPE compiled with Clang 3.0, I’m getting E_OBSCUREST_OF_ERRORS…what can I mount to /proc to work around this?”) and so you may well receive no answer whatsoever before you grind through it.

                                                              2. Are high-level opinion questions, which are totally valid (“Which of these two approaches is currently the industry standard? Have you ever been bitten by this?”) but appear to be silly subjective slapfights to mods.

                                                              There probably could stand to be a place (C2 wiki, maybe?) for at least questions of the second type.

                                                              1. 2

                                                                I think Programmers Stack Exchange, which claims to focus on “conceptual questions about software development”, was created specifically to handle questions of the second type.

                                                                1. 2

                                                                  Slant was created to be stack overflow for opinions. I’ve followed them for a couple years, but I don’t find myself regularly using them. I think they accomplish their purpose reasonably well, but I don’t often find myself seeking out opinions on the Internet.

                                                                1. 11

                                                                  I think it’s due to the absurdly low (and increasingly lower) barrier to entry to be a “developer” (i.e. You can be a developer in two weeks, here’s how!)

                                                                  People who spend a dozen years in medical school and residential programs do not send emails telling people they jerked off to their conference talks.

                                                                  1. 36

                                                                    Well, hm, these terrible folks get in because we support them by looking away, by ignoring our women colleagues’ concerns, or outright disbelieving them, and by ignoring our systemic gender biases in hiring.

                                                                    There is nothing in a long professional filter that handles those concerns and removes men who are bad to women. Many of those filters, in fact, are “boys' clubs” that have terrible amounts of attrition w.r.t. women. Google “sexual harassment doctor” or “lawyers” or whichever professional field and you’ll see plenty of evidence that filters are not the fix.

                                                                    While this doesn’t solely exist in our industry, it is ours to fix! And it’s going to be fixed by listening to and amplifying women’s voices, addressing our systemic biases in hiring, and taking on the emotional labor we tend to put on women.

                                                                    1. 1

                                                                      And a lot of those men probably didn’t have positive interactions with women in their formative years, and probably were pandered to by media and games that took advantage of that fact, and luckily for them they found a career field that promised “meritocracy” and that it would turn a blind eye to their social issues as long as they got shit done.

                                                                      How many of them were called creeps or left alone when just a little bit of compassion could’ve changed things? How many of them were hit with harassment charges or teasing when a patient “Now, man, that’s not a polite thing to say, we don’t talk about women like that” would have done?

                                                                      Any discussion about these things needs to acknowledge the entire pipeline.

                                                                      EDIT:

                                                                      Thanks for the troll flag. I’m suggesting a bit of compassion here, same as you. It’s a lot easier to wring our hands about the evil men dominating the work force and literally oppressing women by their mere existence than it is to realize that hey, men are people too, and that if we don’t want to just write off an entire generation of people whose views and actions we deplore then we need to try and engage with them and show them the right way to behave. And part of that is showing the empathy that we ourselves claim to require so much.

                                                                    2. 6

                                                                      Why would the barrier to entry to being a developer change anything? Let’s say we start doing credentials and certificates and licensing like other fields. So? People can still send garbage over the internet.

                                                                      1. 4

                                                                        I’m surprised that no one points out the obvious difference between software engineering and medicine. Doctors have to face each other in person, whereas programmers often do not. It’s a lot easier on the internet to forget that these are real people, and it’s a lot harder to see the damage you’ve inflicted when you can’t see in their face the sorrow & pain caused by your words. Unfortunately, I think this is going to get worse with our latest push for remote working.

                                                                        1. 3

                                                                          Have you ever seen the backstabbing and infighting bigwig or self-important doctors get into?

                                                                          Many of them are pleasant in person, but that’s just because they’re quite practiced at being two-faced.

                                                                          1. 3

                                                                            For what it’s worth, my remote jobs have had healthier cultures than my on-site jobs. Arguably the flexibility of remote work makes it especially appealing to programmers with family responsibilities, and it certainly makes it easier for those of us who can’t or don’t want to live in the Silicon Valley echo chamber, both of which I think help to diversify the industry beyond the SF-unattached-twentysomething-male monoculture which has been so toxic.

                                                                            More broadly, I don’t think there is much similarity between the dynamics of a remote or distributed team of coworkers and the dynamics of a message board of anonymous strangers, and I don’t think it is valid to draw conclusions about one based on the other.

                                                                          2. 2

                                                                            If only because those programs taught them the importance of at least not expressing misogyny in ways they could face public backlash for.

                                                                          1. 2

                                                                            I think there’s a similar intuition with programming, but it’s possibly even harder to explain. Some people “get” it, others don’t. It doesn’t have anything to do with intelligence, but it’s hard to teach.