Threads for dege

  1. 10

    I’ve (mercifully) never needed to deal with CVEs, but my understanding is that maintainers often dislike them because the process isn’t run by vendors/developers/maintainers but by “anyone who plugs details in the MITRE form”. After looking at the process a bit, it looks like it would be easy for me to submit a CVE for any product I wanted, give a link to a self-referential security page (“Foo has security issue bar, see CVE-XXX”), and have the same thing happen.

    Strange system.

    1. 8

      Part of the problem here is that the idea of CVEs is not aligned with what many people think CVEs are.

      Ultimately the core idea of the CVE system is just to have a common identifier when multiple people talk about the same vulnerability. In that sense having bogus CVE entries doesn’t do much harm, as it’s “just another number” and we won’t run out of numbers.

      But then some security people started treating CVEs as an achievement, aka “I did this research and I found 5 CVEs!” etc. - and suddenly you have the expectation of the CVE system being a gatekeeper of what vulnerability is “real” and what not. (And having seen a lot of these discussions, they are imho a waste of energy: you’ll have corner cases that might be a vuln, or might only help if you have another vuln to chain together, and people will never agree on whether to call them vulns.)

      1. 2

        But then some security people started treating CVEs as an achievement, aka “I did this research and I found 5 CVEs!” etc. - and suddenly you have the expectation of the CVE system being a gatekeeper of what vulnerability is “real” and what not.

        That is a maddening behavior that I’ve also observed. It’s also a hard thing to fix, considering that the CVE database was conceived in the face of vendors who refused to admit that they shipped security issues. We needed a common way to reference them even if the vendor disagreed.

        I’m not sure how I think we should fix it, yet.

      2. 3

        The fact that you can is a strong checks-and-balances mechanism for making sure that a maintainer cannot stonewall an actual vulnerability and pretend it doesn’t exist. The process is not without flaws, but it’s the devil we know, and for an honor system it works surprisingly well. Speaking as a member of a security team in a well-known project, the CVE part of the security process is among the least complicated aspects.

        1. 2

          It’s even weirder. I needed a CVE once for a library I maintain, and I couldn’t get one! Apparently ranges of numbers are allocated to certain organizations (big corporations and Linux distros), and you’re supposed to ask “your” organization to give you a number. I was independent, and nobody wanted to talk to me.

        1. 2

          It should be noted that this is a 7-year-old article, and while it is primarily about a feature that hasn’t changed all that much, it refers to a PostgreSQL version which is long out of support. https://www.postgresql.org/docs/current/sql-select.html discusses the feature in the current version.

          1. 9

            I converted from MySQL (before the whole MariaDB fork), and I’ve been happier with every new version. My biggest moment of joy was JSONB, and it keeps getting better. Can we please make connections lighter so that I don’t have to use stuff like pg-bouncer in the middle? I would love to see that in future versions.
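
            The usual workaround today is to put PgBouncer between the application and Postgres. A minimal sketch of a pgbouncer.ini (names, ports, and paths are placeholders, not from this thread):

            ```ini
            [databases]
            ; clients connect to pgbouncer, which multiplexes them onto a small server-side pool
            mydb = host=127.0.0.1 port=5432 dbname=mydb

            [pgbouncer]
            listen_addr = 127.0.0.1
            listen_port = 6432
            auth_type = md5
            auth_file = /etc/pgbouncer/userlist.txt
            ; transaction pooling gives the biggest reduction in server connections,
            ; but breaks session-level features (prepared statements, LISTEN/NOTIFY)
            pool_mode = transaction
            max_client_conn = 1000
            default_pool_size = 20
            ```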

            1. 6

              Connections are lighter in Postgres 14!

              1. 5

                Do you have more information about it? I am interested too :)

                1. 4

                  This writeup (not by me) might be of interest: https://pganalyze.com/blog/postgres-14-performance-monitoring

                2. 4

                  I am all ears!

                  1. 2

                    One link is in a reply to another comment in this tree, and some other details can be found via links from this blog post: https://www.depesz.com/2020/08/25/waiting-for-postgresql-14-improvements-for-handling-large-number-of-connections/

              1. 2

                This experience really reinforced my pre-existing experience that Postgres is enormously complex and full of poorly-labelled operational landmines.

                I believe this—along with synchronous replication—is the main reason hyperscalers continue to use MySQL over Postgres.

                1. 2

                  Since postgres can do both async and sync replication, can you elaborate on what you mean?

                  1. 1

                    Postgres sync replication was immature and basically unused until a couple of years ago. Async log shipping has been the go-to, or using something slow and fraught with complexity like Slony. MySQL sync replication just works, and has worked for many years.

                    See this Uber post about switching off Postgres for examples of problems with early versions of Postgres streaming replication.
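
                    For reference, Postgres’s built-in synchronous replication is driven by a couple of settings on the primary; a minimal sketch (the standby name is a placeholder):

                    ```
                    # postgresql.conf on the primary (a sketch, not a tuned config)
                    # wait for at least one named standby to confirm each commit
                    synchronous_standby_names = 'FIRST 1 (standby1)'
                    # 'on' waits for the standby to flush the WAL; 'remote_apply' also waits
                    # until the standby has applied it, so reads there see the commit
                    synchronous_commit = on
                    ```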

                  2. 1

                    Tell me more about this. Tbh I’ve only used redshift for a clustered rdbms.

                    1. 2

                      Articles like this about Postgres are pretty common. There is no one “silver bullet” Postgres config that will handle any workload predictably, you need to tune it based on what you’re doing. As this post mentions, VACUUM comes up a lot. MySQL’s storage engine is pretty simple in comparison, so even though it can’t do as many useful things as Postgres’, it usually Just Works™️. Hyperscalers don’t want to tune per application, they want to deploy a zillion homogeneous database instances and have them all work for a variety of applications without babysitting. Having a Postgres expert per product team doesn’t scale.

                      Often this means they don’t use MySQL or Postgres at all. But when they do, they almost always use MySQL.
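
                      As a concrete example of the per-workload tuning meant here, autovacuum is commonly adjusted table by table; a sketch with a hypothetical table name:

                      ```sql
                      -- Make autovacuum far more aggressive on a hot, update-heavy table,
                      -- so dead tuples are reclaimed before bloat builds up.
                      ALTER TABLE orders SET (
                          autovacuum_vacuum_scale_factor = 0.01,  -- vacuum after ~1% of rows are dead
                          autovacuum_vacuum_cost_delay   = 2      -- sleep less between work chunks
                      );
                      ```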

                  1. 1

                    Relevant follow-up from the PostgreSQL Security Team: https://www.postgresql.org/about/news/1935/

                    1. 2

                      This is very old and somewhat outdated, I recommend a recent presentation as a complement to it: https://www.postgresql.eu/events/pgconfeu2018/sessions/session/2058/slides/96/hackingpg-present.pdf

                      1. 1

                        Greatly appreciated! I got the initial one from the PostgreSQL FAQ - must say the amount of documentation and presentations regarding developer on-boarding in the project is amazing.

                      1. 3

                        Tagged this with “Law” as well since the interesting angle here is how this can be seen as a response from AWS to the MongoDB license change.

                        1. 2

                          Good reasoning, but I think it’d be a little better to reserve that for submissions that are primarily about law.

                        1. 8

                          yet in many respects, it is the most modern database management system there is

                          It’s not though. No disrespect to PostgreSQL, but it just isn’t. In the world of free and open source databases it’s quite advanced, but commercial databases blow it out of the water.

                          PostgreSQL shines by providing high quality implementations of relatively modest features, not highly advanced state of the art database tech. And it really does have loads of useful features, the author has only touched on a small fraction of them. Almost all those features exist in some other system. But not necessarily one single neatly integrated system.

                          PostgreSQL isn’t great because it’s the most advanced database, it’s great because if you don’t need anything state of the art or extremely specialized, you can just use PostgreSQL for everything and it’ll do a solid job.

                          1. 13

                            but commercial databases blow it out of the water

                            Can you provide some specific examples?

                            1. 16

                              Oracle has RAC, which is a basic install step for any Oracle DBA. Most Postgres users can’t implement something similar, and those who can appreciate that it’s a significant undertaking that will lock you into a specific workflow, so you’d better get it right.

                              Oracle and MS-SQL also have clustered indexes. Not what Postgres has, but indexes where updates maintain the clustering as well. Getting Pg to perform sensibly in this situation is so painful that it’s worth spending a few grand to simply not worry about it.

                              Ever run Postgres on a machine with over 100 cores? It’s not much faster than 2 cores without a lot of planning and partitioning, and even then, it’s got nothing on Oracle and MS-SQL. “Open checkbook and it’s faster” might sound like a loss, but programmers and sysadmins cost money too! Having them research how to get your “free” database to perform like a proper database isn’t cost effective for a lot of people.

                              How about big tables? Try to update just one column, and Postgres still copies the whole row. Madness. This turns something that ought to be 100GB of IO into tens of TBs of IO. Restructuring this into separate partitions would have been the smart thing to do if you’d remembered to do it a few months ago, but this is a surprise coming from commercial databases, which haven’t had this problem for twenty years. Seriously! And don’t even try to VACUUM anything.
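
                              The usual Postgres-side mitigation for the whole-row rewrite is a vertical split: keep the frequently-updated column in its own narrow table so each update produces only a small new row version. A sketch with hypothetical names:

                              ```sql
                              -- Wide, mostly-static data stays in one table...
                              CREATE TABLE items (
                                  id      bigint PRIMARY KEY,
                                  payload jsonb  -- large, rarely updated
                              );

                              -- ...while the hot counter lives in a narrow side table, so bumping it
                              -- rewrites a few bytes per row version instead of the whole wide row.
                              CREATE TABLE item_counters (
                                  item_id bigint PRIMARY KEY REFERENCES items (id),
                                  views   bigint NOT NULL DEFAULT 0
                              );

                              UPDATE item_counters SET views = views + 1 WHERE item_id = 42;
                              ```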

                              MS-SQL also has some really great tools. Visual Studio actually understands the database, and its role in development and release. You can point it at two tables and it can build ALTER statements for you and help script up migrations that you can package up. Your autocomplete can recognise what version you’re pointing at. And so on.

                              …and so on, and so on…

                              1. 3

                                Thanks for the detailed response. Not everyone has money to throw at a “real” enterprise DB solution, but (having never worked with Oracle and having only administered small MSSQL setups) I did wonder what some of the specific benefits that make a DBA’s life easier were.

                                Of course, lots of the open source tools used for web development and such these days seem to prefer Postgres (and sometimes MySQL), and developers like Postgres’ APIs. With postgres-compatible databases like EnterpriseDB and redshift out there, my guess is we’ll see a Postgres-compatible Oracle offering at some point.

                                1. 7

                                  Not everyone has money to throw at a “real” enterprise DB solution

                                  I work for a commercial database company, so I expect I see a lot more companies’ databases than you and most other crustaceans: most companies have a strong preference to rely on an expert who will give them a fixed cost (even if it’s “money”) to implement their database, instead of trying to hire and build a team to do it with open source. Because it’s cheaper. Usually a lot cheaper.

                                  Part of the reason why: an expert can give them an SLA and has PI insurance, and the solution generally includes all costs. Building an engineering+sysadmin team is a big unknown for every company, and they usually need some kind of business analyst too (often a contractor anyway; more £££) to get the right schemas figured out.

                                  Professional opinion: Business logic may actually be some of the least logical stuff in the world.

                                  lots of the open source tools used for web development and such these days seem to prefer Postgres

                                  This is true, and if you’re building an application, I’d say Postgres wins big. Optimising dbmail’s Postgres queries was hands down much easier than for any other database (including commercial ones!).

                                  But databases are used for a lot more than just applications, and companies who use databases don’t always (or even often) build all (or even much) of the software that interacts with the database. This should not be surprising.

                                  With postgres-compatible databases like EnterpriseDB and redshift out there, my guess is we’ll see a Postgres-compatible Oracle offering at some point.

                                  I’m not sure I disagree, but I don’t think this is a good thing. EnterpriseDB isn’t Postgres. Neither is redshift. Queries that work fine in a local Pg installation run like shit in redshift, and queries that are built for EnterpriseDB won’t work at all if you ever try to leave. These kinds of “hybrid open source” offerings are anathema: often sold below a sustainable price (and much less than what a proper expert would charge), leaving uncertainty in the SLA, and with none of the benefits of owning your own stack that doing it on plain postgres would give you. I just don’t see the point.

                                  1. 3

                                    Professional opinion: Business logic may actually be some of the least logical stuff in the world.

                                    No kidding. Nice summary also.

                                    1. 0

                                      Queries that work fine in a local Pg installation run like shit in redshift

                                      Not necessarily true; when building your redshift schema you optimize for certain queries (like your old pg queries).

                                  2. 4

                                    And yet the cost of putting your data into a proprietary database format is enough to make people find other solutions when limitations are reached.

                                    Don’t forget great database conversion stories like WI Circuit Courts system or Yandex where the conversion to Postgres from proprietary databases saved millions of dollars and improved performance…

                                    1. 2

                                      Links to those stories?

                                      1. 1

                                        That Yandex can implement clickhouse doesn’t mean everyone else can (or should). How many $100k developers do they employ to save a few $10k database cores?

                                        1. 2

                                          ClickHouse has nothing to do with Postgres, it’s a custom column oriented database for analytics. Yandex Mail actually migrated to Postgres. Just Postgres.

                                      2. 2

                                        You’re right about RAC, but over the last couple of major releases Postgres has gotten a lot better about using multiple cores and modifying big tables. Maybe not at the Oracle level yet, but it’s catching up quickly in my opinion.

                                        1. 3

                                          Not Oracle-related, but a friend of mine tried to replace a disk-based kdb+ with Postgres, and it was something like 1000x slower. This isn’t even a RAC situation; this is one kdb+ core versus a 32-core server with PostgreSQL on it (no failover even!).

                                          Postgres is getting better. It may even be closing the gap. But gosh, what a gap…

                                          1. 1

                                            Not to be that guy, but when tossing around claims of 1000x, please back that up with actual data, a blog post, or something…

                                            1. 6

                                              You remember Mark’s benchmarks.

                                              kdb doing in 0.051sec what postgres was taking 152sec to complete.

                                              1000x is nothing.

                                              Nobody should be surprised by that. It just means you’re asking the computer to do the wrong thing.

                                              Btw, starting a sentence with “not to be that guy” means you’re that guy. There’s a completely normal way to express curiosity in what my friend was doing (he’s also on lobsters), or to start a conversation about why it was so much easier to get right in kdb+. Both could be interesting, but I don’t owe you anything, and you owe me an apology.

                                              1. 2

                                                Thanks for sharing the source, that helps in understanding.

                                                That’s a benchmark comparing a server grade setup vs essentially laptop grade hardware (quad-core i5), running the default configuration right out of the sample file from the Git repo, with a query that reads a single small column out of a very wide dataset without using an index. I don’t doubt these numbers, but they aren’t terribly exciting/relevant to compare.

                                                Also, there was no disrespect intended; not being a native English speaker, I may have come off clumsy though.

                                                1. 1

                                                  kdb doing 0.051sec what postgres was taking 152sec to complete.

                                                  That benchmarks summary points to https://tech.marksblogg.com/billion-nyc-taxi-rides-postgresql.html which was testing first a pre-9.6 master and then a PG 9.5 with cstore_fdw. Seems to me that neither was fair and I’d like to do it myself, but I don’t have the resources.

                                                  1. 1

                                                    If you think a substantially different disk layout of Pg, and/or substantially different queries would be more appropriate, I think I’d find that interesting.

                                                    I wouldn’t like to see a tuning exercise including a post-query exercise looking for the best indexes to install for these queries though: The real world rarely has an opportunity to do that outside of applications (i.e. Enterprise).

                                              2. 1

                                                Isn’t kdb+ really good at stuff that postgres (and other RDBMS) is bad at? So not that surprising.

                                                1. 1

                                                  Sort of? Kdb+ isn’t a big program, and most of what it does is the sort of thing you’d do in C anyway (if you liked writing databases in C): Got some tall skinny table? Try mmaping as much as possible. That’s basically what kdb does.

                                                  What was surprising was just how difficult it was to get that in Pg. I think we expected, with more cores and more disks it’d be fast enough? But this was pretty demoralising! I think the fantasy was that by switching the application to Postgres it’d be possible to get access to the Pg tooling (which is much bigger than kdb!), and we massively underestimated how expensive Pg is/can be.

                                                  1. 3

                                                    Kdb+ isn’t a big program, and most of what it does is the sort of thing you’d do in C anyway (if you liked writing databases in C)

                                                    Well, kdb+ is columnar, which is pretty different than how most people approach naive database implementation. That makes it very good for some things, but really rough for others. Notably, columnar storage doesn’t deal with update statements very well at all (to the degree that some columnar DBs simply don’t allow them).

                                                    Even on reads, though, I’ve definitely seen postgres beat it on a queries that work better on a row-based system.

                                                    But, yes, if your primary use cases favor a columnar approach, kdb+ will outperform vanilla postgres (as will monetdb, clickhouse, and wrappers around parquet files).

                                                    You can get decent chunks of both worlds by using either the cstore_fdw or imcs extensions to postgres.

                                                    1. 1

                                                      which is pretty different than how most people approach naive database implementation.

                                                      I blame foolish CS professors emphasising linked lists and binary trees.

                                                      If you simply count cycles, it’s exactly how you should approach database implementation.

                                                      Notably, columnar storage doesn’t deal with update statements very well at all (to the degree that some columnar DBs simply don’t allow them).

                                                      So I haven’t done that kind of UPDATE in any production work, but I also don’t need it: Every customer always wants an audit trail which means my database builds are INSERT+some materialised view, so that’s exactly what kdb+ does. If you can build the view fast enough, you don’t need UPDATE.
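
                                                      The INSERT-plus-materialised-view pattern described above can be sketched in Postgres SQL (names are hypothetical):

                                                      ```sql
                                                      -- Append-only event log: rows are only ever INSERTed, so the
                                                      -- audit trail exists by construction.
                                                      CREATE TABLE account_events (
                                                          account_id  bigint      NOT NULL,
                                                          amount      numeric     NOT NULL,
                                                          recorded_at timestamptz NOT NULL DEFAULT now()
                                                      );

                                                      -- Derived current state lives in a materialised view;
                                                      -- refresh it instead of UPDATEing rows in place.
                                                      CREATE MATERIALIZED VIEW account_balances AS
                                                      SELECT account_id, sum(amount) AS balance
                                                      FROM account_events
                                                      GROUP BY account_id;

                                                      REFRESH MATERIALIZED VIEW account_balances;
                                                      ```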

                                                      Even on reads, though, I’ve definitely seen postgres beat it on a queries that work better on a row-based system.

                                                      If I have data that I need horizontal grabs from, I arrange it that way in memory. I don’t make my life harder by putting it on the disk in the wrong shape, and if I do run into an application like that, I don’t think gosh using postgres would really speed this part up.

                                          2. 3

                                            Spanner provides globally consistent transactions even across multiple data centers.

                                            Disclosure: I work for Google. I am speaking only for myself in this matter and my views do not represent the views of Google. I have tried my best to make this description factually accurate. It’s a short description because doing that is hard. The disclosure is long because disclaimers are easier to write than useful information is. ;)

                                            1. 2

                                              @geocar covered most of what I wanted to say. I also have worked for a commercial database company, and same as @geocar I expect I have seen a lot more database use cases deployed at various companies.

                                              The opinions stated here are my own, not those of my former or current company.

                                              To put it bluntly, if you’re building a Rails app, PostgreSQL is a solid choice. But if you’ve just bought a petabyte of PCIe SSDs for your 2000 core rack of servers, you might want to buy a commercial database that’s a bit more heavy duty.

                                              I worked at MemSQL, and nearly every deployment I worked with would have murdered PostgreSQL on performance requirements alone. Compared to PostgreSQL, MemSQL has more advanced query planning, query execution, replication, data storage, and so on and so forth. It has state of the art features like Pipelines. It has crucial-at-scale features like Workload Profiling. MemSQL’s competitors obviously have their own distinguishing features and qualities that make them worth money. @geocar mentioned some.

                                              PostgreSQL works great at smaller scale. It has loads of useful features for small-scale application development. The original post talks about how Arcentry uses NOTIFY to great effect, facilitating their realtime collaboration functionality. This already tells us something about their scale: PostgreSQL uses a fairly heavyweight process-per-connection model, meaning they can’t have a huge number of concurrent connections participating in this notification layer. We can conclude Arcentry deployments using this strategy probably don’t have a massive number of concurrent users. Thus they probably don’t need a state of the art commercial database.
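
                                              For readers unfamiliar with the mechanism: NOTIFY delivers a message to every connection currently listening on a channel, which is why each participant needs its own (heavyweight) connection. A sketch (channel name and payload made up):

                                              ```sql
                                              -- Session A: subscribe to a channel.
                                              LISTEN doc_changes;

                                              -- Session B: fire a notification, e.g. from a trigger after a write.
                                              NOTIFY doc_changes, '{"doc": 17, "op": "update"}';
                                              -- equivalent: SELECT pg_notify('doc_changes', '{"doc": 17, "op": "update"}');
                                              ```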

                                              There are great counterexamples where specific applications need to scale in a very particular way, and some clever engineers made a free database work for them. One of my favorites is Expensify running 4 million queries per second on SQLite. SQLite can only perform nested loop joins using 1 index per table, making it a non-starter for applications that require any kind of sophisticated queries. But if you think about Expensify, its workload is mostly point look ups and simple joins on single indexes. Perfect for SQLite!

                                              1. 1

                                                But MemSQL is a distributed in-memory database? Aren’t you comparing apples and oranges?

                                                I also highly recommend reading the post about Expensify usage of SQLite: it’s a great example of thinking out of the box.

                                                1. 1

                                                  No. The author claims “Postgres might just be the most advanced database yet.” MemSQL is a database. If you think they’re apples-and-oranges different, might that be because MemSQL is substantially more advanced? And I used MemSQL as one example of a commercial database. For a more apples-to-apples comparison, I also think MSSQL is more advanced than PostgreSQL, which geocar covered.

                                                  And MemSQL’s in-memory rowstore serves the same purpose as PostgreSQL’s native storage format. It stores rows. It’s persistent. It’s transactional. It’s indexed. It does all the same things PostgreSQL does.

                                                  And MemSQL isn’t only in-memory, it also has an advanced on-disk column store.

                                          1. 7

                                            CTEs are great, but it’s important to understand the implementation characteristics as they differ between databases. Some RDBMSs, like PostgreSQL, treat a CTE like an optimization fence while others (Greenplum for example) plan them as subqueries.
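
                                            As of PostgreSQL 12 this is controllable per CTE; older versions always materialize. A sketch (table and column names hypothetical):

                                            ```sql
                                            -- Default pre-12 behaviour: the CTE is computed in full first, and the
                                            -- outer filter on customer_id cannot be pushed down into it.
                                            WITH recent AS MATERIALIZED (
                                                SELECT * FROM orders WHERE created_at > now() - interval '7 days'
                                            )
                                            SELECT * FROM recent WHERE customer_id = 42;

                                            -- Ask the planner to inline the CTE like a subquery instead, so the
                                            -- customer_id filter can use an index on orders.
                                            WITH recent AS NOT MATERIALIZED (
                                                SELECT * FROM orders WHERE created_at > now() - interval '7 days'
                                            )
                                            SELECT * FROM recent WHERE customer_id = 42;
                                            ```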

                                            1. 2

                                              The article mentions offhand they use SQL Server, which AFAIK does a pretty good job of using them in plans. I believe (not 100% sure) its optimiser can see right through CTEs.

                                              1. 2

                                                … and then you have RDBMSs like Oracle, whose support for CTEs is a complete and utter disgrace.

                                                I’m praying for the day Oracle’s DB falls out of use, because I imagine that will happen sooner than them managing to properly implement SQL standards from 20 years ago.

                                                1. 2

                                                  At university we had to use Oracle via the iSQL web interface for all the SQL-related parts of our database courses. It was the slowest, most painful experience: executing a simple SELECT could take several minutes, and navigating the interface/paginating results would take at least a minute per operation.

                                                  I would always change it to show all results on one page (no pagination), but the environment would do a full reset every few hours, requiring me to spend probably 15-30 minutes changing the settings back to my slightly saner defaults. Every lab took at least twice as long because of the pain of using this system. I loved the course and the lecturer; it was probably one of the best courses I took during my time at university, but I did not want to use Oracle again after that point.

                                                  I’ve heard that they have nowadays moved the course to PostgreSQL instead, which seems like a much saner approach. What I would have given to be able to run the code locally on my own computer at that time!

                                                2. 1

                                                  I didn’t know this; so using a CTE in current Postgres would be at a disadvantage compared to subqueries?

                                                  I haven’t really used CTEs in Postgres much yet, but I’ve looked at them and considered them. Are there any plans to enable optimization through CTEs in pg? Or is there a deeper, more fundamental underlying problem?

                                                  1. 5

                                                    would be at a disadvantage compared to subqueries

                                                    it depends. I have successfully used CTEs to circumvent shortcomings in the planner, which was mis-estimating row counts no matter what I set the stats target to (this was also before CREATE STATISTICS).

                                                    Are there any plans to enable optimization through CTEs in pg

                                                    it’s on the table for version 12

                                                    1. 2

                                                      It’s not necessarily less efficient due to the optimization fence, it all depends on your workload. The underlying reason is a conscious design decision, not a technical issue. There have been lots of discussions around changing it, or at least to provide the option per CTE on how to plan/optimize it. There are patches on the -hackers mailing list but so far nothing has made it in.

                                                    2. 1

                                                      Does anyone know if CTEs are an optimization fence in DB2 as well?

                                                    1. 8

                                                      Notable from the release announcement for those not reading the article:

                                                      • CVE-2018-10915: Certain host connection parameters defeat client-side security defenses

                                                      libpq, the client connection API for PostgreSQL that is also used by other connection libraries, had an internal issue where it did not reset all of its connection state variables when attempting to reconnect. In particular, the state variable that determined whether or not a password is needed for a connection would not be reset, which could allow users of features requiring libpq, such as the “dblink” or “postgres_fdw” extensions, to login to servers they should not be able to access.

                                                      You can check if your database has either extension installed by running the following from your PostgreSQL shell:

                                                      \dx dblink|postgres_fdw

                                                      Users are advised to upgrade their libpq installations as soon as possible.

                                                      The PostgreSQL Global Development Group thanks Andrew Krasichkov for reporting this problem.

                                                      • CVE-2018-10925: Memory disclosure and missing authorization in INSERT ... ON CONFLICT DO UPDATE

                                                      An attacker able to issue CREATE TABLE can read arbitrary bytes of server memory using an upsert (INSERT ... ON CONFLICT DO UPDATE) query. By default, any user can exploit that. A user that has specific INSERT privileges and an UPDATE privilege on at least one column in a given table can also update other columns using a view and an upsert query.
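
                                                      The libpq issue above is a classic bug class: a reconnect path that reinitializes some, but not all, of a connection’s state. Below is a minimal, hypothetical sketch of the pattern (not libpq’s actual code or API; all names are made up) in which a security-relevant flag silently survives a reconnect:

                                                      ```c
                                                      #include <stdio.h>

                                                      /* Toy connection object; password_needed is the security-relevant
                                                       * state that should be re-evaluated on every (re)connect. */
                                                      struct conn {
                                                          const char *host;
                                                          int password_needed;
                                                      };

                                                      static void connect_to(struct conn *c, const char *host, int needs_pw) {
                                                          c->host = host;
                                                          c->password_needed = needs_pw;
                                                      }

                                                      /* Buggy reconnect: it resets the host but forgets password_needed,
                                                       * so the previous connection's answer is silently reused. */
                                                      static void reconnect_buggy(struct conn *c, const char *host) {
                                                          c->host = host;
                                                          /* BUG: c->password_needed is never reset or re-evaluated. */
                                                      }

                                                      int main(void) {
                                                          struct conn c;
                                                          connect_to(&c, "trusted-host", 0);      /* no password required here    */
                                                          reconnect_buggy(&c, "restricted-host"); /* this host should require one */
                                                          printf("%d\n", c.password_needed);      /* still 0: the check is skipped */
                                                          return 0;
                                                      }
                                                      ```

                                                      The fix for this class of bug is equally classic: reinitialize the whole object to a known-good default on every connect attempt instead of patching individual fields.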

                                                      1. 3

                                                        To this day I’m surprised that Postgres cannot be upgraded without downtime. I guess there are maintenance windows, but it feels like so many DBs out there have uptime requirements.

                                                        EDIT: don’t want to be too whiny about this, Postgres is cool and has a lot of stuff. I guess it’s mostly the webdev in me thinking “well yeah of course I need 100% uptime” that made me expect DBs to handle this case. But I guess the project predates these sorts of expectations.

                                                        1. 1

                                                          I don’t disagree… but just to be clear:

                                                          Minor versions, i.e. bug fixes (e.g. 9.4.6 -> 9.4.7), don’t really need any downtime: you just replace the binaries and restart.

                                                          Major versions (9.4 -> 9.5) do need a dump/restore of the database, which is annoying. You can avoid this almost completely now with logical replication, which is included in PG 10 (before that version it’s available as a module, back to PG 9.4 I think).

                                                          1. 2

                                                            Ah thanks for the information, super helpful! Previously, when reading up on upgrading PG I got the impression I couldn’t do this on major versions.

                                                            1. 1

                                                              see: https://www.2ndquadrant.com/en/resources/pglogical/ it’s one of the use-cases.

                                                            2. 2

                                                              Major versions (9.4 -> 9.5 ) do need a dump/restore of the database

                                                              pg_upgrade has been available and part of the official codebase since 9.0 (7ish years). It’s still not perfect, but it’s been irreplaceable for me when migrating large (45+TB) databases.

                                                              1. 1

                                                                True, I had forgotten. I’ve been using PG since the 8.x days. pg_upgrade didn’t work for me from 9.0 -> 9.1 (or thereabouts, definitely early in pg_upgrade’s existence) and I haven’t tried it since. I should probably try it again and see if it works better for us!

                                                              2. 2

                                                                There have also been numerous logical replication tools (Slony for example) that allowed upgrades without downtime since at least around 8.0, but probably earlier.

                                                              1. 4

                                                                I find debate around this weird. This code is protected by a license, just like the Linux kernel is protected by a license (and the Linux kernel has 20 year old bits still protected by that license today). Anyone who would be angry with a company for violating the GPL should applaud sending the disc back. Agreeing with a license is different from honouring it, but we should honour others licenses just as we expect the open source code we love and use to be honoured.

                                                                1. 6

                                                                  It’s actually not hard to understand. If you support the GPL because it is a pragmatic way to, within our legal system, push for more software to be open source, then it is not inconsistent at all to also want people to violate proprietary licenses in order to make more software open source. You have a goal (people should be able to read the source code of software they use) and you use whatever means you have available to make that happen.

                                                                  The idea that this is hypocritical (not saying you are necessarily saying this, but it’s a common argument) is based on a particular liberal political philosophy (liberal as in the individualist, equal application of rules, not its use to mean further to the left on the political spectrum). Certain people take that political philosophy as somehow a ground truth, and then claim people are hypocritical if they don’t fit their beliefs into it.

                                                                1. -3

                                                                  When can I expect food neutrality next, so I can get steak and lobster at the same price as the guy who got a salad?

                                                                  1. 22

                                                                    Maybe when eating a salad prevents you from using the infrastructure your taxes paid for.

                                                                    1. 1

                                                                      Or when your internet connection and mobile plan cost twice as much because you have to tick the Gmail, Facebook, Netflix and Youtube boxes that were once provided for free.

                                                                    2. 18

                                                                      While it’s easy for me to scoff at anything you post because of your username, I’m gonna guess that you didn’t mean any harm with this joke. Still, over 17 million American households were food insecure in 2012, and 18 million Americans live in a food desert, where access to perishables is either overwhelmingly expensive or simply absent.

                                                                      Maybe food neutrality wouldn’t be such a bad idea :)

                                                                      1. 3

                                                                        That’s an incorrect comparison; food neutrality in your hypothetical restaurant already exists since the restaurant doesn’t charge differently for using their cutlery and crockery depending on what you eat.

                                                                        1. 4

                                                                          You aren’t a libertarian, you’re just a capitalist.

                                                                        1. 2

                                                                          Comes with among other things:

                                                                          • CVE-2017-12172: Start scripts permit database administrator to modify root-owned files
                                                                          • CVE-2017-15098: Memory disclosure in JSON functions
                                                                          • CVE-2017-15099: INSERT … ON CONFLICT DO UPDATE fails to enforce SELECT privileges
                                                                          1. 1

                                                                            This was 16 years ago. Did they complete the stuff under Future Work or are there still gains to be had there?

                                                                            1. 3

                                                                              Yep, I believe that most of that work is now complete, although it took a long time. For example, VFS giant lock compatibility was still in the 9 tree, ten years later.

                                                                              For those that missed them, the FreeBSD 5.x releases (the first with SMPng) were painful

                                                                              1. 2

                                                                                Not sure about all the details, but one big thing that happened soon after was Matthew Dillon’s Dragonfly fork: http://www.dragonflybsd.org/history/

                                                                                1. 2

                                                                                  Dragonfly forked off FreeBSD 4 rather than the 5 branch in which the SMPng work took place, so it’s not entirely relevant for this.

                                                                                  1. 4

                                                                                    The fork took place two years after this paper. For many reasons, but a large part of it was the difficulty of realizing the course plotted here. So very relevant, IMO. DFly went back and forked from 4 because 5 was floundering.

                                                                                    1. 2

                                                                                      Correct, reading my comment again I realize I was very unclear. With “not entirely relevant” I was referring to the code in Dragonfly not being based on the ideas/realizations in the paper but on a previous version, so anyone wanting to read/trace code shouldn’t expect to find SMPng in Dragonfly.

                                                                              1. 9

                                                                                This seems like more or less a copy-paste of Jacques Mattheij’s blogpost, https://jacquesmattheij.com/sorting-two-metric-tons-of-lego, adding nothing but advertising and webtrackers. The blogpost is awesome, read that instead of click-revenue republishing.

                                                                                1. 3

                                                                                  the only bit that surprised me was the intro

                                                                                  The idea of Ikea plus internet security together at last seems like a pretty terrible one, but having taken a look it’s surprisingly competent.

                                                                                  why would ikea not be expected to do it right? they have a reputation for being extremely competent when it comes to getting all the small details correct.

                                                                                  1. 5

                                                                                    For furniture, sure, but do they have previous IOT experience? I wouldn’t expect Ikea to produce super high quality IOT lighting software, or any software, just because it’s not their thing.

                                                                                    1. 18

                                                                                      You’d be surprised:

                                                                                      They do seem like a company that cares about their software.

                                                                                      1. 14

                                                                                        75% of their catalog is CGI

                                                                                        I’m obviously getting old - I initially thought “75% of their catalogue uses CGI scripts - that’s not terribly modern!”.

                                                                                      2. 17

                                                                                        Albeit many years ago, I worked with IKEA as a consultant on their catalog printing, and at least back then they had very competent software engineers in the company to support that effort (making that catalog is no simple task). Writing software might not be where their revenue stream comes from, but don’t underestimate the in-house capabilities of such a monster company, where everything from catalog production to website to logistics and inventory management runs on… software.

                                                                                        1. 12

                                                                                          do they have previous IOT experience?

                                                                                          So many companies with long and storied histories suck ass at security. “Experience” might just mean “Oh, hey, we know we can leave gaping security holes and the market will let us get away with it”.

                                                                                          The thing to always remember is that a culture of competent and thorough engineering is nearly always going to trump buzzword “experience”.

                                                                                          1. 6

                                                                                            This is 100% correct. It was even true when INFOSEC was invented: the founders started as teams of smart people just carefully thinking about security’s effects on each aspect of the lifecycle. A small, almost-fringe number of people doing that caused the emergence of high-assurance security. They also reused anything proven to help subgoals, as engineers do.

                                                                                            I’m obviously not expecting Ikea to repeat that. However, just the right culture and time/effort invested could lead engineers to Google their ass off, read INFOSEC books/articles, and talk to people in the field. They’d then apply what they could within their constraints. Many IT people do this, especially in smaller firms. Hell, they have to do that for everything, since they can’t afford specialists. Works pretty well, too, for most I meet.

                                                                                          2. 8

                                                                                            i am a bit sceptical about the idea of IOT needing to be “in a company’s dna” for them to do a good job of it. getting things right is very much in ikea’s dna; i’d trust them to hire the right people to make that happen.

                                                                                          3. 2

                                                                                            What sort of reputation does IKEA have with respect to software?

                                                                                            1. 7

                                                                                              that’s the thing - the actual software implementation is something you can hire for. what you need from ikea’s side is a willingness to identify the people who know what’s important to get right, hire them, and then listen to what they tell you. from what else i’ve seen of ikea i’d definitely trust them not to override the people who say “look, we need to get security right before shipping anything”

                                                                                            2. 1

                                                                                              Wait, what? Aren’t they known for incredibly-hard-to-assemble, fiddly-as-hell furniture kits with such laughably tiny tolerances and complex instructions that the notion of most customers sitting in a pile of screws and bolts crying into their hands is almost a cliché?

                                                                                              1. 19

                                                                                                that’s the popular joke, yes, but in reality i’ve found their furniture startlingly well-made for flat-pack stuff, and relatively easy to assemble as long as i get another person to help me (it’s a pain with just one person).

                                                                                                1. 1

                                                                                                  Fair enough. Me, not so much. I don’t think it’s easy to assemble, and I often found bad stuff like threads just not machined properly, tolerances so tight that tears and breaks are inevitable, etc., to the point where I stopped buying and using it. Maybe it’s better these days. I just know that I saved myself aggro by not buying it any more.

                                                                                                  1. 1

                                                                                                    Also, where would a “popular joke” come from if it had no basis at all in truth? Do people joke that, I don’t know, Apple products are poorly designed and don’t work properly?

                                                                                                    1. 6

                                                                                                      Also, where would a “popular joke” come from if it had no basis at all in truth?

                                                                                                      Assume the following:

                                                                                                      a) Flat pack furniture is, in general, very difficult to assemble.

                                                                                                      b) IKEA is the most well-known manufacturer of flat pack furniture, serving as a readily identifiable eponym for the genre.

                                                                                                      Now consider proposition c: IKEA furniture is incredibly easy to assemble, and !c: IKEA furniture is very difficult to assemble.

                                                                                                      (A ^ B) ^ !C allows the joke to hold quite nicely. (A^B)^C does not; but that means you need to find another readily identifiable company to serve as a specimen for “flat pack furniture” - I’ve actually considered this for many seconds and can’t, can you?

                                                                                                      Therefore the joke is made irrespective of C! \qed

                                                                                                      1. 3

                                                                                                        no, they joke that you cannot rename an MP3 in iTunes without a personal phone call to apple HQ for permission. again, a slight exaggeration.

                                                                                                        1. 2

                                                                                                          Are we using the same OSX?

                                                                                                    2. 7

                                                                                                      IKEA wouldn’t be the global giant it is if their products were that poor or hard to assemble (hint: they’re not, and the instructions are very well done).

                                                                                                      1. 1

                                                                                                        Well, in my experience that’s very much not the case. It’s time-consuming and fiddly, and I’ve bought pieces from other places that were so much better designed, prepared and tooled, with so much better quality materials, that it stunned me how much better they were than IKEA, and how much hassle and time IKEA’s things take by comparison. Personally I think they’re the global giant they are because they’re cheap and they use aspirational styling. Which is obviously totally fine, and if you like their stuff that’s fair and great for you. But, “hint”: that doesn’t make your opinion fact and mine “incorrect”.

                                                                                                        1. 1

                                                                                                          Could you tell me what those pieces are and where you got them from, because I’ve found IKEA uniformly excellent and I’d be delighted if I could find something even better.

                                                                                                          1. 1

                                                                                                            Main one was a WaterRower. You’d think a rowing machine of all things would be complicated. I was just blown away by how easy and solid it was to put together, everything just slipped into place and the thing is built like a tank.

                                                                                                    1. 42

                                                                                                      In case anyone wants to cross-check, out of the 23 curl CVEs in 2016, at least 10 (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) are due to C’s manual memory management or weak typing and would be impossible in a memory-safe, strongly-typed language. (Note that, while I like Rust and it seems to have been the motivator for this post, many modern languages meet this bar.) While “slightly more than half” as non-C-related vulnerabilities may technically be “most”, I’m not sure it’s fitting the spirit of the term.

                                                                                                      There are some very compelling advantages to C, certainly, which the author enumerates; in particular, its portability to nearly every platform in existence is a major weakness of Rust (and, to the best of my knowledge, any other competitor) at the moment. But it’s very important to note that nontrivial C code practically always contains serious vulnerabilities, and nothing we’ve tried (especially “code better”, the standard advice for avoiding C vulnerabilities) works to prevent them. We should be conscious that, by writing C, we are trading away security in favor of whatever benefits C provides at that moment.

                                                                                                      edit: It’s worth noticing and noting, as I failed to, that 2016 was an unusual year for curl vulns. /u/amaurea on Reddit helpfully counted and cataloged all the vulns on that page, and 2016 is an obvious outlier for raw count, strongly suggesting an audit or new static analysis tool or something. However, the proportion of C to not-C bugs is not wildly varied over the entire list, so the point stands.

                                                                                                      1. 9

                                                                                                        […] 2016 is an obvious outlier for raw count, strongly suggesting an audit or new static analysis tool or something.

                                                                                                        It was an audit.

                                                                                                        1. 5

                                                                                                          especially “code better”, the standard advice for avoiding C vulnerabilities

                                                                                                          If the curl codebase is as bad as its API then this is honestly a completely fair response.

                                                                                                          We had this code recently:

                                                                                                           int status;
                                                                                                           void * some_pointer;
                                                                                                           /* BUG: CURLINFO_RESPONSE_CODE expects a long*, not an int*,
                                                                                                              so this write runs past status; here it landed in some_pointer. */
                                                                                                           curl_easy_getinfo( curl, CURLINFO_RESPONSE_CODE, &status );
                                                                                                          

                                                                                                           which trashes some_pointer on 64-bit Linux, because curl_easy_getinfo( CURLINFO_RESPONSE_CODE ) takes a pointer to a long, not an int. The compiler would normally warn about that, but curl_easy_getinfo is a varargs function, which brings no benefits and means the compiler can’t check the types of its arguments. WTF, seriously? Why would you do that??
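
                                                                                                           To make the varargs complaint concrete, here is a self-contained sketch; getinfo_varargs and getinfo_typed are hypothetical stand-ins, not curl’s real API. The typed prototype catches the int-vs-long mistake at compile time, while the varargs version cannot:

                                                                                                           ```c
                                                                                                           #include <stdarg.h>
                                                                                                           #include <stdio.h>

                                                                                                           /* A varargs getter in the style of curl_easy_getinfo: the compiler
                                                                                                            * cannot see that the variadic argument must be a long*, so passing
                                                                                                            * an int* would compile cleanly and then corrupt adjacent memory. */
                                                                                                           static void getinfo_varargs(int what, ...) {
                                                                                                               va_list ap;
                                                                                                               va_start(ap, what);
                                                                                                               long *out = va_arg(ap, long *); /* type is blindly trusted */
                                                                                                               *out = 200;
                                                                                                               va_end(ap);
                                                                                                           }

                                                                                                           /* The same getter with a typed prototype: handing it an int* is a
                                                                                                            * compile-time error, exactly the check that varargs throws away. */
                                                                                                           static void getinfo_typed(long *out) {
                                                                                                               *out = 200;
                                                                                                           }

                                                                                                           int main(void) {
                                                                                                               long status = 0;
                                                                                                               getinfo_typed(&status);      /* an int* here would not compile       */
                                                                                                               getinfo_varargs(0, &status); /* an int* here would compile just fine */
                                                                                                               printf("%ld\n", status);
                                                                                                               return 0;
                                                                                                           }
                                                                                                           ```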

                                                                                                          I also recall reading somewhere that curl is over 100k LOC, which is insane. If the HTTP spec actually requires the implementation to be that large (and it wouldn’t surprise me if it does), then you are free to, and absolutely should, just not implement all of it. If the spec is so unwieldy that nobody could possibly get it right, then why try? Implement a sensible subset and call it a day.

                                                                                                          If you know you’re not going to be using many HTTP features, it’s not hard to implement it yourself and treat anything that isn’t part of the tiny subset you chose as an error. For example, it’s only a few hundred lines to implement synchronous GET requests with non-multipart responses and timeouts, and that’s often good enough.
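
                                                                                                           For a sense of scale, here is a sketch of two of the pure-string pieces of such a subset (request building and status-line parsing); the sockets, header loop, and timeouts would wrap around these, and everything outside the subset is simply rejected. All names here are illustrative, not from any real library:

                                                                                                           ```c
                                                                                                           #include <stdio.h>

                                                                                                           /* Build a minimal HTTP/1.1 GET request for the tiny subset described
                                                                                                            * above: one synchronous request, no keep-alive, no fancy headers. */
                                                                                                           static int build_get_request(char *buf, size_t len,
                                                                                                                                        const char *host, const char *path) {
                                                                                                               int n = snprintf(buf, len,
                                                                                                                                "GET %s HTTP/1.1\r\n"
                                                                                                                                "Host: %s\r\n"
                                                                                                                                "Connection: close\r\n"
                                                                                                                                "\r\n",
                                                                                                                                path, host);
                                                                                                               return (n > 0 && (size_t)n < len) ? n : -1; /* -1 on truncation */
                                                                                                           }

                                                                                                           /* Extract the status code from a response status line; anything that
                                                                                                            * doesn't fit the subset is treated as an error, as suggested above. */
                                                                                                           static int parse_status_line(const char *line) {
                                                                                                               int code;
                                                                                                               if (sscanf(line, "HTTP/1.%*1d %3d", &code) != 1)
                                                                                                                   return -1;
                                                                                                               return code;
                                                                                                           }

                                                                                                           int main(void) {
                                                                                                               char req[512];
                                                                                                               if (build_get_request(req, sizeof req, "example.com", "/") < 0)
                                                                                                                   return 1;
                                                                                                               printf("%d\n", parse_status_line("HTTP/1.1 200 OK"));
                                                                                                               return 0;
                                                                                                           }
                                                                                                           ```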

                                                                                                          1. 5

                                                                                                            I also recall reading somewhere that curl is over 100k LOC, which is insane. If the HTTP spec actually requires the implementation to be that large (and it wouldn’t surprise me if it does), then you are free to, and absolutely should, just not implement all of it.

                                                                                                            curl supports a lot more protocols than just http though.

                                                                                                            1. 3

                                                                                                              Indeed. From the man page.

                                                                                                              curl is a tool to transfer data from or to a server, using one of the supported protocols (DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP).

                                                                                                              1. 1

                                                                                                                damn, that’s a juicy attack surface

                                                                                                            2. 3

                                                                                                              CURL is highly compatible with a lot of the strange behaviors that browsers do support and are usually outside of (or even prohibited by) the spec/standard. Just implementing the spec doesn’t quite make it useful to the world, when the world isn’t even spec compliant. Even if you write down the standard, the real standard is what all the other browsers do, not what a piece of paper says.

                                                                                                              Example: https://github.com/curl/curl/issues/791

                                                                                                              1. 1

                                                                                                                But it is useful even if you only implement a tiny subset of HTTP, because most use cases involve sending trivial requests to sensible servers.

                                                                                                                1. 3

                                                                                                                  The point is that cURL isn’t a project that supplies that subset, regardless of it being useful or not. cURL supplies a complete and comprehensive package that runs pretty much anywhere and supports pretty much any protocol you might need at some point (and some you might not need).

                                                                                                      Nothing wrong with making a slimmed-down, works-most-of-the-time-and-will-be-enough-for-most-people project; it might be very useful indeed, but that’s not the goal of the cURL project. There’s space for both.

                                                                                                                  1. 1

                                                                                                          This is the way. Start small. I would assume that 90% of the use cases for curl are just simple HTTP(S) queries, and those can be implemented in any language quite quickly.

                                                                                                                    For example, D currently has curl in its standard library, which will probably be deprecated and removed. For simple HTTP(S) queries, there is requests, which is pure D except for the ssl and crypto stuff.

                                                                                                              2. 8

                                                                                                                nothing we’ve tried works to prevent them

                                                                                                                Formal verification actually works. seL4 exists.

                                                                                                                1. 10

                                                                                                          Verifying seL4 took a few years, and it was roughly 10,000 LoC. Curl has an order of magnitude more: 113,316 as counted by sloccount on the GitHub repo right now. Verification is getting easier, but only very slowly.

                                                                                                                  There is no immediate commercial advantage since curl works fine. This leaves it to academia to get the ball rolling.

                                                                                                                  1. 4

                                                                                                                    Verifying seL4 took a few years and it was roughly 10000 LoC.

                                                                                                                    Formally verifying 15,000ish lines of Haskell-generated C in seL4 took ~200,000 lines of proof, actually, per this. Formally verifying all of curl would easily run into the millions of lines of proof, and you'd basically be rewriting it into C-writing Haskell to boot.

                                                                                                                  2. 3

                                                                                                                    seL4 has two versions, a Haskell version that’s used to verify model safety and a C version that’s just a translation of the Haskell version. It may actually be a bit of a counter-example to your claim (that formal verification on C works in practice).

                                                                                                                    1. 1

                                                                                                                      This is incorrect. The seL4 project actually proved that the C version is equivalent to (technically, refines) the Haskell version. And then they (semi-automatically) proved that the generated assembly is equivalent to (refines) the C, so they don't need to rely on C compiler correctness.

                                                                                                                  3. 2

                                                                                                                    Yes but a lot of these are only published and fixed because curl is so widely used—and scrutinized. For example number 2 on your list:

                                                                                                                    If a username is set directly via CURLOPT_USERNAME (or curl's -u, --user option), this vulnerability can be triggered. The name has to be at least 512 MB big on a 32-bit system. Systems with 64-bit versions of the size_t type are not affected by this issue.

                                                                                                                    Literally this doesn’t matter.

                                                                                                                    Also, how would Rust prevent this? I’m pretty sure multiplication overflow happens in Rust too.

                                                                                                                    1. 14

                                                                                                                      Rust specifies that:

                                                                                                                      1. If overflow happens, it is a “program error,” but is well-defined as two’s complement wrapping.
                                                                                                                      2. In debug builds, overflow must be checked for and panic.

                                                                                                                      In the future, if overflow checking is cheap enough, this gives us the ability to require it. Who knows when that’ll ever be :)

                                                                                                                      Also note that this means it might lead to a logic error, but not a memory-safety error. Just making it defined helps a lot.
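These semantics (specified in RFC 560) can be exercised directly in safe Rust; a minimal sketch using the explicit `wrapping_*` and `checked_*` methods, which behave the same way in both debug and release builds:

```rust
fn main() {
    // 2^31 as a u32: doubling it overflows 32 bits.
    let big: u32 = u32::MAX / 2 + 1;

    // Explicit wrapping: a defined two's-complement result, never UB.
    assert_eq!(big.wrapping_mul(2), 0);

    // Explicit checking: overflow surfaces as None instead of a bogus value.
    assert_eq!(big.checked_mul(2), None);
    assert_eq!(2u32.checked_mul(3), Some(6));

    println!("overflow handled explicitly");
}
```

A plain `big * 2` would panic in a debug build and wrap in release, which is exactly the split described above.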

                                                                                                                      1. 3

                                                                                                                        Is there a formal or semi-formal Rust specification anywhere?

                                                                                                                        1. 9

                                                                                                                          Not quite yet; or at least, it’s not all in one place. While all those universities are working on formalisms, we’re not working hard to get one in place, since it’d have to take that work into account, which would mean throwing stuff out and re-writing it that way, I’d imagine.

                                                                                                                          There is some work going on to make the reference (linking to nightly docs since some work has recently landed to split it up into manageable chunks) closer to a spec; there’s also been an RFC accepted that says before stabilization, we must have the reference up-to-date with the changes, but we have to backfill all the older ones. So currently, it’s always accurate but not complete.

                                                                                                                          This area is well-specified though, in RFC 560 https://github.com/rust-lang/rfcs/blob/master/text/0560-integer-overflow.md (one RFC I refer to so often I remember its number by heart)

                                                                                                                          1. 1

                                                                                                                            Thank ye

                                                                                                                        2. 2

                                                                                                                          That’s neat! Still, I find it hard to believe anything would have coverage of all multiplication overflows in allocations, even if it were written in Rust. If anyone can show me a single Rust project that deliberately trips the debug panic for multiplication overflow during allocation in its unit tests, I’ll be impressed. But I’ll bet the only way to really be robust against this class of error is to use something like OpenBSD’s reallocarray. That’s equally possible in C and Rust.
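For illustration, a reallocarray-style guard is also easy to express in safe Rust; a hypothetical sketch (the function name is made up here, not taken from curl or any library):

```rust
// Hypothetical reallocarray-style size check: compute count * elem_size
// with checked_mul, so an overflowing request fails up front instead of
// silently allocating a too-small buffer.
fn alloc_len(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

fn main() {
    assert_eq!(alloc_len(4, 8), Some(32));
    // A request whose byte count would exceed usize::MAX is rejected.
    assert_eq!(alloc_len(usize::MAX, 2), None);
}
```

The caller then handles `None` as an allocation failure, which is the same contract reallocarray gives C code.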

                                                                                                                          1. 3

                                                                                                                            I do have an few overflow tests in one of my projects, but not for that specifically: https://github.com/steveklabnik/semver-parser/blob/master/src/range.rs#L682

                                                                                                                            We have pretty decent fuzzer support; that seems like something it would be likely to find.

                                                                                                                            1. 2

                                                                                                                              I guess that depends on how often you run your fuzzer on 32-bit systems long enough for it to accumulate gigabytes of input.

                                                                                                                              The example here triggers after half a gig, but many bugs of this class would need more.