1. 5

    What does this do to all the branches forked off of master? Do they now track main?

    What does this do to everyone else’s repository? Does their local master branch automatically start tracking the remote main branch?

    1. 13

Branches don’t reference the branch they are forked from; they “just” have a common history, which can be used to merge them.

      1.  

They do reference it, as well as sharing a common history. Many git commands rely on that reference. You can change the referenced branch with git branch --set-upstream-to and see it with git branch -vv.

        1. 5

          I think you’re talking about two different things - remote relations vs what commit a branch was created from.

      2. 7

        All good questions!

Branches based on master do not “switch” to main, but since no tracked data has changed, those branches can be automatically rebased.

        Others who have master checked out and write permission could still push to master and re-create it. There’s no real solution to this unless the “upstream” can be set to reject branches named “master”.

        1. 5

          No, it breaks clones, and every fork has to rename as well, or change the tracking.

        1. 22

          I agree lots of people don’t, because they never even bother to learn anything past GRANT ALL…

          So, who out there has used these features?

          We use PG’s row-level security exclusively. It is 100% worth the introduction pain. Every user of our software has their own PG DB login, and that’s how they login to the application.

          How did that impact your application/environment?

The application does only one thing with access control: what shows up on the menus (well, and logging in). 100% of the rest of the app is done via PG RLS, and the app code is a bunch of “select * from employees;” kind of thing.

          What have you used them for?

          Everything, always! :) lol. (also see next answer)

          Do they provide an expected benefit or are they more trouble than they’re worth?

When we got a request to do a bunch of reporting stuff, we connected Excel to PG, had them log in with their user/password to the PG DB, and they were off and running. If the user knows SQL, we just hand them the host name of the PG server and let them go to town; they can’t see anything more than the application gives them anyway.

When we added Metabase, for even more reporting, we had to work harder: we added a reporting schema, then created some views, and Metabase handles the authorization, which sucks. Metabase overall is great, but it’s really sad there isn’t anything in reporting land that will take advantage of RLS.

          How did you decide to use them?

When we were designing the application, PG was just getting RLS. We tried it out and were like, holy cow… why try to create our own when PG did all the work for us!

          Trying to get access control right in an application is miserable.

          Put permissions with the data, you won’t be sorry.
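A minimal sketch of what that setup can look like (not our actual schema; table, column, and role names are made up):

```python
# Each application user is a real PostgreSQL role, and a row-level security
# policy decides which rows they can see and touch.
import psycopg2

SETUP_SQL = """
CREATE TABLE employees (
    id      serial PRIMARY KEY,
    name    text NOT NULL,
    manager text NOT NULL          -- PostgreSQL role name of whoever owns the row
);

ALTER TABLE employees ENABLE ROW LEVEL SECURITY;

-- Users only see, and can only modify, rows they own.
CREATE POLICY employees_by_owner ON employees
    USING (manager = current_user)
    WITH CHECK (manager = current_user);

-- 'app_users' is assumed to be a group role that every application user is in.
GRANT SELECT, INSERT, UPDATE, DELETE ON employees TO app_users;
"""

# Run the setup once as the table owner/admin.
with psycopg2.connect("dbname=appdb user=admin") as conn:
    with conn.cursor() as cur:
        cur.execute(SETUP_SQL)

# Application code then connects as the end user's own role and just selects;
# the policy does the filtering, not the app.
with psycopg2.connect("dbname=appdb user=alice") as conn:  # auth details omitted
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM employees;")
        print(cur.fetchall())
```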

          1. 6

            Doesn’t this require a lot of open connections? IME, postgres starts to struggle past a couple of hundred open connections. Have you run into that at all?

            1. 5

              If you run everything inside of transactions you can do some cleverness to set variables that the RLS checks can refer to, emulating lots of users but without requiring more connections.
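A sketch of that trick, with invented names: the policy reads a per-transaction setting via current_setting(), so one shared connection can serve many users:

```python
# The policy checks a setting instead of current_user, and the application sets
# that setting at the start of each transaction.
import psycopg2

NOTES_POLICY = """
CREATE POLICY notes_by_app_user ON notes
    USING (owner_id = current_setting('app.current_user_id')::int);
"""

def fetch_notes_for(conn, user_id):
    with conn:                        # one transaction
        with conn.cursor() as cur:
            # set_config(..., is_local=true) only lasts until the transaction ends
            cur.execute("SELECT set_config('app.current_user_id', %s, true)",
                        (str(user_id),))
            cur.execute("SELECT * FROM notes;")   # RLS filters to this user's rows
            return cur.fetchall()

# The connecting role must itself be subject to RLS (not the table owner / BYPASSRLS).
conn = psycopg2.connect("dbname=appdb user=app_backend")
print(fetch_notes_for(conn, 42))
```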

              1. 2

                See my other comment, but you don’t have to work quite that hard, PG has a way now to become another user.

                see: https://www.postgresql.org/docs/12/sql-set-session-authorization.html

                1. 2

                  How many users does this support? I might be being overly cautious, but we have been told to look at user counts in the millions.

                  1. 2

                    We are an internal staff application, we max around 100 live open DB connections, so several hundred users. This runs in a stable VM with 32GB ram and < 1TB of data. We will never be a Google or a Facebook.

                    One can get really far by throwing hardware at the problem, and PG can run on pretty big hardware, but even then, there is a max point. Generally I recommend not optimizing much at all for being Google size, until you start running into Google sized problems.

                    Getting millions of active users out of a single PG node would be hard to do, regardless of anything else.

              2. 2

In our experience, the struggle is around memory: PG connections each take up some memory, and you have to account for that. I don’t remember the exact amount per connection, but memory is what I remember being the limit.

It’s not entirely trivial, but you can re-use connections. You authenticate as a superuser (or equivalent) and send AUTH or something like that after you connect; too lazy to go look up the details.

                We don’t currently go over about 100 or so open active connections and have no issues, but we do use pgbouncer for the web version of our application, where most users live.

                EDIT: it’s not AUTH but almost as easy, see: https://www.postgresql.org/docs/12/sql-set-session-authorization.html
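A rough sketch of that reuse pattern (role names and connection details are made up):

```python
# Hold one privileged pooled connection, then switch the effective user per
# request with SET SESSION AUTHORIZATION, and switch back afterwards.
import psycopg2
from psycopg2 import sql

conn = psycopg2.connect("dbname=appdb user=pool_owner")  # role allowed to switch users
conn.autocommit = True

def run_as(role_name, query, params=None):
    """Run one query as role_name, then switch back to the pool role."""
    with conn.cursor() as cur:
        # Role names are identifiers, so quote them safely.
        cur.execute(sql.SQL("SET SESSION AUTHORIZATION {}").format(sql.Identifier(role_name)))
        try:
            cur.execute(query, params)
            return cur.fetchall()
        finally:
            cur.execute("RESET SESSION AUTHORIZATION")

print(run_as("alice", "SELECT * FROM employees;"))
```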

              3. 3

                How do RLS policies impact performance? The Postgres manual describes policies as queries that are evaluated on every returned row. In practice, does that impact performance noticeably? Were there gotchas that you discovered and had to work around?

                1. 3

                  Heavily.

                  It is important to keep your policies as simple as possible. E.g. if you mark your is_admin() as VOLATILE instead of STABLE, PG is going to happily call it for every single row, completely destroying performance. EXPLAIN is your best friend.

But even then, some queries are performed needlessly. Imagine you use transitive ownership. For example, Users own Boxes, and Boxes contain Widgets. When you want to determine which Widgets a User can manipulate, you usually cache the User–Boxes set at the application server level and query “downwards”. With RLS, you need to establish a link between the Widget and the User, joining over Boxes “upwards”, as there is no cache.
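For illustration only (invented schema), the policy ends up having to make that link on every row it checks:

```python
# Illustration only: the widgets policy links Widget -> Box -> User per row,
# because there is no application-level cache of the user's box set.
WIDGETS_POLICY = """
CREATE POLICY widgets_by_owner ON widgets
    USING (EXISTS (
        SELECT 1
        FROM boxes b
        WHERE b.id = widgets.box_id
          AND b.owner = current_user
    ));
"""
```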

The real problem here is that with a sufficiently large schema, the tools are lacking. It’s really inconvenient to develop within pgAdmin4, away from git, basically within a “live” system with its object dependencies and so on.

                  1. 2

It can. As I mentioned in my other comment in this thread, we have only run into a few instances where performance was an issue we had to do something about.

As for tools, we use liquibase[0], and our schemas are in git, just like everything else.

                    0: https://www.liquibase.org/

                    1. 1

                      I’ll check it out.

                      1. 1

                        How does the Liquibase experience compare to something like Alembic or Django migrations?

                        The main difference I see is whether your migrations tool is more tightly coupled to your app layer or persistence layer.

                        With Alembic you write migration modules as imperative Python code using the SQL Alchemy API. It can suggest migrations by inspecting your app’s SQL Alchemy metadata and comparing it to the database state, but these suggestions generally need refinement. Liquibase appears to use imperative changesets that do basically the same thing, but in a variety of file formats.

                        1. 2

I’m not very familiar with alembic or django migrations. Liquibase (LB) has been around a long time; it was pretty much the only thing doing schema-in-VCS back when we started using it.

                          Your overview tracks with my understanding of those. I agree LB doesn’t really care about the file format, you can pick whatever suits you best.

                          The LB workflow is pretty much:

                          • Figure out the structure of the change you want in your brain, or via messing around with a development DB.

• Open your favourite editor and type that structure change into your preferred file format.

                          • Run LB against a test DB to ensure it’s all good, and you didn’t mess anything up.

                          • Run LB against your prod DB.

                          • Go back to doing whatever you were doing.

                    2. 1

We actually use an OSS extension, veil[0], and while performance can be an issue, like @mordae mentions, if you are careful about your use it’s not too bad. We have only had a few performance issues here and there, but with explain and some thinking we have always managed to work around them without much hassle. You absolutely want indexes on the columns you use for checking permissions.

                      Veil makes the performance a lot less painful, in our experience.

0: https://github.com/marcmunro/veil Though note, veil2 is the successor and more relevant for new implementations; we don’t currently use it (and have no experience with it): https://github.com/marcmunro/veil2

                      Veil2 talks about performance here in 23.1: https://marcmunro.github.io/veil2/html/ar01s23.html

                    3. 3

                      Same here, I used row level security everywhere on a project and it was really great!

                      1. 2

One mistake I’ve made is copy-pasting the same permission checks on several pages of an app. Later I tried to define the permissions all in one place, but still in the application code (using “django-rules”). But you still had to remember to check those permissions when appropriate. Also, when rendering the page, you want to hide or gray out buttons if you don’t have permission for that action (not for security, just niceness: I’d rather see a disabled button than click and get a 403).

                        With row-level permissions in the DB, is there a way to ask the DB “Would I have permission to do this update?”

                        1. 2

Spitballing, but maybe you could try running the query in a transaction and then rolling it back?

                          Would be a bit costly because you’d have to run the query twice, once to check permissions and then again to execute, but it might be the simplest solution.
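A sketch of that dry-run idea with psycopg2 (table and column names are hypothetical):

```python
# Attempt a no-op UPDATE inside a transaction, look at the affected row count,
# then roll back either way.
import psycopg2

def can_update(conn, doc_id):
    """True if the current DB user could update this row under the RLS policies."""
    try:
        with conn.cursor() as cur:
            cur.execute("UPDATE documents SET title = title WHERE id = %s", (doc_id,))
            allowed = cur.rowcount == 1   # 0 rows touched: RLS hid or blocked the row
    except psycopg2.Error:
        allowed = False                   # e.g. a WITH CHECK violation
    finally:
        conn.rollback()                   # never keep the trial write
    return allowed
```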

                          1. 1

                            Maybe what you need is to define a view that selects the things you can update and use that view to define the RLS. Then you can check whether the thing you want to update is visible through the view.

                            1. 1

                              With row-level permissions in the DB, is there a way to ask the DB “Would I have permission to do this update?”

Yes, the permissions are just entries in the DB, so you can query/update whatever access you want (provided you have the access to view/edit those tables).

I’m writing this from memory, so I might be wrong in the details… but what we do is have a canAccess() function that takes a row ID and returns the permissions that user has for that record. So on the view/edit screens/pages/etc., the permissions get returned to us as well, so it’s no big deal to handle.

                          2. 1

Follow-up question: How did you handle customers (even accidentally) writing expensive SQL queries?

                            1. 2

We would admonish them appropriately :) Mostly the issue is making sure they know about the WHERE clause. It hasn’t been much of an issue so far. We have _ui table views that probably do 90% of what they want anyway, and they know to just use those most of the time. The _ui views flatten the schema out to make the UI code easier, and use proper where clauses and FK joins to minimize resource usage.

If our SQL user count grew enough that we couldn’t handle it off-hand like this, we would probably just spin up a RO slave mirror, let them battle each other over resources, and otherwise ignore the problem until we got enough complaints to upgrade resources again.

                          1. 1

                            Kyber decryptions have a failure probability of about 1 in 2^160.

This is substantially smaller than the chance that a cosmic ray interferes with the computation of the algorithm, in which case you would get nonsense out with any algorithm.

                            In fact without looking into these algorithms further, I’m willing to bet that the 200KB keys have a worse real-world chance of screwing up because there’s more memory in which bits can be swapped.

                            1. 3

                              What does “proactive” IO mean? Is it the same as other async-io systems in rust like tokio/async-std/smol or different?

                              1. 2

It uses the proactor pattern (https://en.wikipedia.org/wiki/Proactor_pattern) instead of the reactor pattern, which the others use. This pattern is a better fit for the way io-uring and IOCP work, as both are completion-based, not poll-based.

                              1. 2

                                Use a reverse shell to get shell access to any system (trying to debug a jenkins pipeline? Want to look at a file downloaded into a Kubernetes container somewhere?)

                                Useful cheat sheet: http://pentestmonkey.net/cheat-sheet/shells/reverse-shell-cheat-sheet

Obviously only use it to look at things you’re allowed to look at, but it’s happened quite a few times that I owned a system and was allowed to do whatever I wanted with it, yet getting shell access to it was difficult.

                                1. 9

                                  This is a great idea for a post that I’ve wanted to write myself. Leaving aside trees, hash tables, and stacks, I probably used less than one “interesting” data structure/algorithm PER YEAR of programming. Some notable counterexamples are two kinds of reservoir sampling, regular languages/automata, random number generation, some crypto, and some info retrieval algorithms.

One thing that sparked it is an obscure but long-running debate over whether dynamic programming interview questions are valid.

I don’t think they’re valid. It’s mostly a proxy for “I got a CS degree at a certain school” (which I did; I learned dynamic programming in my algorithms class and never used it again in ~20 years).

                                  I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                  I’ve tried this in the past and nobody has been able to point to a concrete instance. I think the best I’ve heard is someone heard about a professor who heard about some proprietary software once that used it.


                                  Related: algorithms used in real software (although this is certainly not representative, since compiler engineering is a subfield with its own body of knowledge):

                                  https://old.reddit.com/r/ProgrammingLanguages/comments/b22tw6/papers_and_algorithms_in_llvms_source_code/

                                  https://github.com/oilshell/blog-code/blob/master/grep-for-papers/llvm.txt

                                  Linux kernel algorithms:

                                  https://github.com/oilshell/blog-code/blob/master/grep-for-papers/linux.txt

                                  1. 10

                                    I challenge anyone to name ANY piece of open source software that uses dynamic programming.

                                    Git, or most reasonable implementations of “diff”, will contain an implementation of the Myers Algorithm for longest-common-subsequence, which is very dynamic-programmy.

                                    No concrete example for this one, but I know that bioinformatics code is full of dynamic programming algorithms for the task of sequence alignment, which is similar to diff — identifying a way to align two or more base sequences so that they coincide with the minimal number of changes/additions/deletions required to make them identical.
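For reference, the textbook O(MN) dynamic-programming table at the heart of LCS-based diffs looks roughly like this (a sketch, not Git’s actual implementation):

```python
# dp[i][j] = length of the longest common subsequence of a[:i] and b[:j]
def lcs_length(a, b):
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1              # extend a common subsequence
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])   # drop one element
    return dp[m][n]

assert lcs_length("ABCBDAB", "BDCABA") == 4
```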

                                    1. 1

                                      Hm I’m familiar with that algorithm but I never thought of it as dynamic programming.

                                      Wikipedia does say it’s an instance of dynamic programming. Although when I click on the paper, it appears to contrast itself with “the basic O(MN) dynamic programming algorithm” (section 4).

                                    2. 8

                                      Since you mentioned dynamic programming, it’s worth pointing out that the name “dynamic programming” was chosen for political reasons, as pointed out in the history section of the Wikipedia article on dynamic programming. So I think it’s a really bad name.

                                      1. 1

                                        That’s funny, I remembered xkcd’s “dynamic entropy” comic, and it quotes the same passage:

                                        https://xkcd.com/2318/

                                        It also has a very interesting property as an adjective, and that is it’s impossible to use the word dynamic in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible.

                                        LOL

                                        Agree it’s a terrible name… I would say it was chosen for “marketing” reasons

                                      2. 7

                                        I have thought about whether dynamic programming questions are fair to ask, and I ended up where you are: they are not.

Dynamic programming was the one I struggled most to understand and implement correctly. And while there are semi-practical examples (like the knapsack problem), I have not found any practical, day-to-day use cases for it.

                                        I had an argument with my colleague who asked this kind of problem, saying it’s basic knowledge. Turns out he did competitive programming and there, it is table stakes. But in practice, it just filtered for anyone who has learned and practiced this approach.

I stay away from asking these kinds of problems, ones that need dynamic programming to solve.

                                        1. 4

                                          I’m familiar with dynamic programming mostly from high-school competitive programming as well. Otherwise I can’t say I’ve encountered real-life problems where it occurred to me to use the technique.

                                        2. 8

                                          I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

I’m literally about to implement something that could be classified as dynamic programming at work, which can be summarized as “computing a few simple statistics, such as number of bytes changed, for each directory in a large filesystem”. Dynamic programming is such a general term that it applies regularly if you want to use it.

                                          1. 4

                                            I’d like to see it. I don’t think dynamic programming is a general term.

In fact, one time when I participated in this argument (which was years ago, so my memory is fuzzy), a former CS professor I worked with explained the difference between memoization and dynamic programming. A bunch of 10–20+ year programmers like myself went “ah OK”. Unfortunately I don’t even remember the difference, but the point is that most programmers don’t, because dynamic programming is very uncommon.

                                            What you’re describing sounds like an obvious algorithm that anyone could implement, which is not the same as dynamic programming interview questions, or dynamic programming in competitive programing.

                                            As other posters mentioned, competitive programming is the main place you see it outside of a CS class.

                                            1. 2

It’s absolutely an obvious algorithm; so is most dynamic programming. That was sort of my point.

Can’t share the code unfortunately, but it’s just: iterate over a sorted list of file changes in reverse order and collect statistics as we go. The dynamic part comes from the fact that we can just look at the subdirectories of a dir (that we already have numbers for) instead of recursing into it.
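In sketch form (made-up data layout, since I can’t share the real thing):

```python
# Process directories deepest-first and fold each directory's total into its
# parent, so every inode is handled once instead of being re-traversed.
import os
from collections import defaultdict

def bytes_changed_per_directory(file_changes):
    """file_changes maps file path -> bytes changed; returns totals per directory."""
    totals = defaultdict(int)
    dirs = set()
    for path, nbytes in file_changes.items():
        d = os.path.dirname(path)
        totals[d] += nbytes
        # Register every ancestor directory so the roll-up below sees it.
        while d not in ("", "/"):
            dirs.add(d)
            d = os.path.dirname(d)
    # Deepest directories first: each one is finished before its parent reads it.
    for d in sorted(dirs, key=lambda p: p.count("/"), reverse=True):
        parent = os.path.dirname(d)
        if parent not in ("", "/", d):
            totals[parent] += totals[d]
    return dict(totals)

print(bytes_changed_per_directory({"/a/b/c/file1": 10, "/a/b/file2": 5, "/a/file3": 1}))
# {'/a/b/c': 10, '/a/b': 15, '/a': 16}
```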

                                              1. 2

                                                What you’re talking about could be called memoization, or it probably doesn’t even deserve that name. It’s just what a “normal” programmer would come up with.

                                                That’s not the type of thing that’s asked in interview questions or competitive programming. The wikipedia page gives some examples.

                                                Dynamic programming usually changes the computational complexity of an algorithm in some non-obvious way. There’s very little you can do recursing over a directory tree that doesn’t have a clear O(n) way to code it (e.g. computing statistics).

                                                1. 7

                                                  I like Erik Demaine’s explanation, that problems where dynamic programming can be applied are ones where their subproblems and their dependencies can be modeled as a directed acyclic graph [1]. Up to you if you’d like to tackle that with a top down approach where you look at a node and calculate its solution based on the solutions of its ancestors, or a bottom up approach starting from the nodes in the DAG with no dependencies and propagate the solutions in topological order.

                                                  My colleague and I used it for a generalization of matrix chain multiplication (for tensors) [2].

                                                  [1] https://youtu.be/OQ5jsbhAv_M?t=2736

                                                  [2] https://github.com/TensorCon/ratcon/blob/release/opt/gencon/pathopt.py#L198

                                                  edit: by the definition above, even naive memoization can count as DP, if you’re caching solutions to subproblems of that structure. Doesn’t have to be at the difficulty level of competition to count as DP in that case.

                                                  1. 1

                                                    Hm interesting, searching through my personal wiki, I found a related definition which is different. I haven’t really thought about it enough to form an opinion.

Either way, it doesn’t change my overall point: that there are certain kinds of algorithm problems that appear in coding interviews, and in competitive programming, that do not show up in 99% of programming jobs. They are easy to pose and have cute solutions, but aren’t testing very much.

                                                    I think the “DP tables” part is key but again I haven’t thought about it enough …

                                                    https://blog.racket-lang.org/2012/08/dynamic-programming-versus-memoization.html

                                                    Memoization is fundamentally a top-down computation and DP is fundamentally bottom-up. In memoization, we observe that a computational tree can actually be represented as a computational DAG

                                                    In DP, we make the same observation, but construct the DAG from the bottom-up. That means we have to rewrite the computation to express the delta from each computational tree/DAG node to its parents. We also need a means for addressing/naming those parents (which we did not need in the top-down case, since this was implicit in the recursive call stack). This leads to inventions like DP tables, but people often fail to understand why they exist: it’s primarily as a naming mechanism (and while we’re at it, why not make it efficient to find a named element, ergo arrays and matrices).

                                                    This bottom-up / top-down distinction might have been the same as what the aforementioned professor said 5+ years ago, but I don’t remember exactly.

                                                    1. 1

So, is memoization of factorial top-down, or bottom-up?

                                                      1. 1

                                                        I would probably say neither… Either factorial or Fibonacci is so trivial that it doesn’t help to be thinking about it that way.

                                                        Though I think the quote hints at a clear test for whether it’s top-down or bottom-up: if you need extra state outside the call stack. I’m getting curious enough to try this out, but right now all I can do is quote what other people say.

                                                        In any case it’s clear to me that there’s some controversy over what dynamic programming really is. I think the issue is that a lot of algorithms could be framed that way but were not discovered that way, and not taught and learned that way.

                                                        1. 1

                                                          I would probably say neither… Either factorial or Fibonacci is so trivial that it doesn’t help to be thinking about it that way.

I think that the triviality is actually helpful here. If it’s actually true that memoization and dynamic programming are different (and there’s clearly debate on this), can 2 implementations of a trivial function that everyone can understand highlight the differences?
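Here is one attempt at exactly that, using Fibonacci rather than factorial, since factorial’s subproblems never overlap and give neither technique anything to reuse:

```python
from functools import lru_cache

# Top-down / memoization: write the recursion naturally; the call stack
# discovers the subproblems, and the cache remembers their answers.
@lru_cache(maxsize=None)
def fib_top_down(n):
    if n < 2:
        return n
    return fib_top_down(n - 1) + fib_top_down(n - 2)

# Bottom-up / "DP table": name the subproblems explicitly (table[i]) and fill
# them in an order we choose ourselves, smallest first, with no recursion.
def fib_bottom_up(n):
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

assert fib_top_down(30) == fib_bottom_up(30) == 832040
```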

                                                  2. 1

On the contrary, the stupidly naive way (recurse on every directory) is O(n^2).

                                                    Dynamic programming is just solving a series of problems while using the answers of shared subproblems multiple times. Memoization is a common way to implement this.

Yes, there are some very clever algorithms that use dynamic programming; this doesn’t make obvious algorithms that use dynamic programming not also fit under the definition.

                                                    1. 3

Why would recursing into every directory be O(n^2)? You’re still only visiting every directory/file once? It seems like something is missing.

                                                      1. 1

                                                        Say you have a directory structure with a single file in it, /a/b/c/d/e

                                                        To get the number of bytes changed in e you need to visit e, then to get the number of bytes changed in d you need to visit d and then e, then for c you need to visit c, d, and e, and so on.

                                                        Like I said, it takes a really naive solution, but if you don’t remember the values you calculate anywhere for some reason it’s sum over inodes (depth of inode)… which is O(n^2) (for bad directory structures).

                                                        Note that I need these values for every directory, not just for one particular directory.

                                                        1. 2

That’s still linear complexity space. Unless you’re hardlinking directories (in which case you have to deal with potential recursion), it’s still O(n). If you add a file at /a/b/c/file you only visit 1 more file and no more dirs, not an exponential. O(n + n + n) or O(n + 3) still simplifies to O(n).

                                                          1. 1

If you add /a/b/c/file you add 4 more visits, not 1. Going from n=3 /a/b/file to n=4 /a/b/c/file adds 4 more visits. In other words this worst-case example takes time O(sum from 1 to n of i) = O(n(n+1)/2) = O(n^2).

N is the number of inodes in an arbitrary tree, not the number of files in a fixed tree.

                                                            1. 1

That’s still adding a linear number of operations for each file; the depth could technically be considered a different variable, say m. So for each file (n+1) you add, you also add the number of directory traversals (m), resulting in O(m+n), which simplifies again to O(n); but in reality folders are files too, so they are part of n in the first place, so again O(n). Ultimately your n space is the total number of inodes, which both files and folders have.

                                                              Abstractly, you’re just traversing a tree structure (or a directed graph if using links), which is well understood to be O(n) (maybe O(n^2) worst case if all folders are links, resulting in a fully connected graph), because you only visit each node once. If it were O(n^2), you would visit each node n times.

                                                              Remember, Big O notation is about scaling, not the actual concrete number of operations, which is why you drop any constants or variables other than n.

                                                              1. 1

It’s O(mn), not O(m+n) (in the insanely naive algorithm that recalculates things every time).

                                                                It’s not a single tree traversal but #internal nodes tree traversals.

                                                                1. 1

Even if it was O(mn) (it’s not), that still simplifies to O(n). An ‘internal nodes’ tree traversal is still O(n); n is just smaller. But again, your problem is not an internal-nodes traversal, it’s a full traversal, because you have to look at the blocks attached to the file (leaf) inodes, which means you need to read all inodes of all files and of all folders one time each. n = # of files + # of folders = O(n)

                                                                  1. 1

I suppose an extremely naive solution could be to fully traverse each subtree for every folder visited, which would be… O(log n)? But even that isn’t O(n^2), as the total repeated space shrinks the deeper you get.

                                                                    1. 1

                                                                      You’re assuming a balanced tree, which is not guaranteed. Depth of tree is O(n) in pathological cases (and average case O(sqrt(n)) is typical for randomly generated trees)

                                                                      1. 1

Ah yeah, I think it would be O(n log n), not O(log n), because you traverse the tree once for each node, and a subset of the tree for almost every n (except leaves), at least in the worst case. Still not O(n^2), and the solution for O(n) is almost easier to conceptualize than the completely naive solution :)

                                                                        1. 1

                                                                          and the solution for a O(n) is almost easier to conceptualize than the completely naive solution :)

                                                                          No argument here…

                                                                          I think it would be O(n log n)

We agree it’s O(n) * O(time tree search) now, right? And you’re trying to call the tree search time log(n)? Because trees are height log(n)? Then see the post you replied to: that’s true in a balanced tree, it’s not true in a random tree (where it is sqrt(n)), and it’s definitely not true in a pathological worst case (where a tree is just an n-length linked list).

                                                                          1. 2

Yeah, the part I was hung up on before was that your naive solution traverses the entire subtree below a node for each node visit; I was stuck in the simple optimal solution. For the pathological case, basically just a bunch of folders in folders with a single file at the bottom, the depth of the tree is n, and the file inode at the bottom would be accessed n times, so O(n^2). For the common case it would be about O(n log n), where you can skip traversing larger and larger parts of the tree the deeper you get on each ‘path.’ Thanks for the discussion, I enjoyed it :)

                                                  3. 1

                                                    I think comparing memoization to dynamic programming is a category mistake: they are different kinds of things.

                                                    ‘Dynamic programming’ is a description of a style of algorithm. It’s divide-and-conquer, usually with overlapping subproblems, making it possible to reuse intermediate results.

                                                    Memoization is a programming technique to remember intermediate results, by remembering the results of function calls. You can e.g. also store the intermediate results somewhere explicitly, usually in a matrix, in which case you don’t memoize the result ‘transparently inside the function’, but use a lookup table ‘external to the function that computed the result’.

                                                    1. 1

                                                      I dunno I find that in addition to the Racket language resource I gave elsewhere in the thread, lots of people compare them:

                                                      https://medium.com/@afshanf/cracking-the-coding-interview-questions-part-7-a7c8f834d63d

                                                      A note on terminology: Some people call top-down dynamic programming “memoization” and only use “dynamic programming” to refer to bottom-up work. We do not make such a distinction here. We call both dynamic programming.

                                                      There does seem to be disagreement on what dynamic programming is. And many algorithms that were not derived with dynamic programming techniques could be described as dynamic programming.

                                                      But it seems that most people agree it’s related to memoization.

                                                2. 4

                                                  GCC uses dynamic programming to split IA-64 instructions into bundles.

                                                  1. 2

                                                    Thanks, nice example and link! Still I would say it’s a niche skill, especially to come up with from scratch in an interview.

                                                  2. 4

                                                    I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                                    Ever do video encoding or transcoding with anything built on FFmpeg or x264? Encode images with MozJPEG? Encode an AV1 video or AVIF image with libaom? Trellis quantization in advanced lossy compression encoders is a dynamic programming algorithm.

                                                    1. 3

                                                      Hm very interesting! I was not aware of that algorithm. Paper I found:

                                                      https://www.mp3-tech.org/programmer/docs/e00_9.pdf

                                                      I would still say it’s a bad interview topic, but it’s cool to see real world usages of it.

                                                      1. 2

                                                        Oh, no disagreement there! Even after coding it up myself, I’d hate to have someone ask me to whiteboard a working implementation of trellis quantization in 40 minutes or whatever (though I’m pretty sure I could sketch out an explanation now).

In general I’m not a fan of whiteboard coding exercises at all. Whenever I’ve interviewed candidates I’ve always preferred the old-fashioned method of just reading their resume well ahead of time, looking up whatever piques my interest on it, and then having a friendly conversation about that. Usually that provides plenty of esoteric material for me to quiz them on and it also lets them show me their strengths and enthusiasm.

                                                        1. 1

                                                          My current company doesn’t do a whiteboard exercise, but my previous one did… but the thing is, the usual task was to implement a basic “grep”. That is, read a file and print all of the lines that contain a user-specified string, in a language of your choice, with whatever libraries make you happy (it’s not a trick, you’re not supposed to implement Boyer-Moore on your own). Assuming you succeeded at that, we would ask you to implement a few more features, like a -v flag (only print lines that don’t match), and -A and -B flags (print context lines before and after the matching line), until you got stuck or the time for that segment was up. It wasn’t graded on minutiae like correct semicolon placement, it was just an evaluation of whether a candidate could do a trivial task, how they handled additional requirements, whether they asked sensible questions and got clarification when needed, etc. I found it pretty reasonable.

                                                    2. 4

                                                      I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                                      I used Warshall’s algorithm (which is dynamic programming) to compute the transitive closure of a graph for a typechecker. This is, in my experience, a very common algorithm.

In high school, I wrote a program for my professor that places students into groups of 4 such that their Myers-Briggs personalities are as different as possible. This used dynamic programming.

                                                      A professor of mine (who taught the subject to me) used dynamic programming for some kind of RNA sequencing problem in a paper he published. One of our homework assignments had us arrive at a watered down version of his (and his co-authors’) algorithm.

                                                      I’m fairly certain that at least some fuzzy string matching algorithms use string distance, which is also solved using dynamic programming.

                                                      These are all diverse applications of DP. In my personal, subjective experience, the idea that DP is in any way obscure or dated is absurd.

                                                      Edit:

                                                      To be more concrete, the “transitive closure of a graph” is for the graph of dependencies, computing the set of all functions that a particular function depends on. This is as described in the Haskell Report.

                                                      For fuzzy string matching, I have in mind something like fzf, though I cannot say with certainty that it uses string distance (I’m unfamiliar with its implementation).

                                                      Here’s the paper that I think I’m referencing: Statistical Mechanics of Helix Bundles using a Dynamic Programming Approach
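As a sketch (obviously not the actual typechecker code), Warshall’s transitive closure is dynamic programming over which intermediate vertices a path is allowed to use:

```python
# After iteration k, reach[i][j] says whether j is reachable from i using only
# intermediate vertices from {0..k} -- the classic DP recurrence.
def transitive_closure(n, edges):
    reach = [[False] * n for _ in range(n)]
    for i in range(n):
        reach[i][i] = True
    for i, j in edges:
        reach[i][j] = True
    for k in range(n):
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach

# e.g. a tiny "function 0 calls 1, 1 calls 2" dependency graph
closure = transitive_closure(3, [(0, 1), (1, 2)])
assert closure[0][2]   # 0 depends on 2 transitively
```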

                                                      1. 2

                                                        Thanks for the examples. The claim is definitely not that it’s outdated or obscure; the claim is that it’s not a good interview question because it doesn’t show up much at work. Although there were lots of people here who pointed out interesting uses of dynamic programming, that’s not incompatible with the idea that you could have a 10 or 20 year programming career and never use it.

Side note: I’m familiar with the Floyd-Warshall algorithm but I never thought of it as dynamic programming. I think part of the issue is that I may have a more narrow definition of it than others. (I think people even say the linear-time Fibonacci is an example of dynamic programming, which I find silly. I just call that the obvious algorithm. I guess it can be used to illustrate a principle.)

                                                        Even so, I definitely think it’s more popular in universities, and certain domains like bioinformatics. In contrast to what people on this site typically do “for work”.

                                                      2. 3

                                                        I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                                        I do a lot of work with text processing – computing the edit distance between two strings is something I do very often. That’s a classic dynamic programming algorithm. There are probably hundreds of open source packages that do this or some variation thereof.
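A sketch of the classic Levenshtein table (one of many ways those packages implement it):

```python
# dp[i][j] = edit distance between the first i characters of a and the first j of b.
def edit_distance(a, b):
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i            # delete everything
    for j in range(n + 1):
        dp[0][j] = j            # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[m][n]

assert edit_distance("kitten", "sitting") == 3
```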

                                                        1. 3

                                                          Just to add to the list of responses clearly demonstrating Cunningham’s Law:

                                                          I believe the Knuth-Plass line-breaking algorithm used in LaTeX to lay out text “optimally” uses dynamic programming. This was done for efficiency, as opposed to using some kind of general global optimization routine. It’s also the reason why LaTeX doesn’t support “backtracking”.

                                                          1. 2

                                                            It’s also the reason why LaTeX doesn’t support “backtracking”.

                                                            Sile uses a variant of the same dynamic programming algorithm to lay out paragraphs on a page. The original paper describing the algorithm says that TeX wanted to use it like that, but it would require more than one entire megabyte of state for a large document, which was infeasible.

                                                            1. 1

                                                              Definitely an instance of Cunningham’s law at work :) I should make another go for my pet problems:

                                                              • it’s impossible to make a zsh-like interactive interface on top of GNU readline
                                                              • you can’t make a constant-space linear-time model of computation that’s more powerful than regular languages, and that can parse shell/JS/Python/C++
                                                              • you can’t make an extended glob to regex translator in less than a week (https://github.com/oilshell/oil/issues/192)

                                                              Thanks for the example. If there were more specific links I would make a blog post out of this :)

                                                              And although my comment was a tangent, it did motivate me to get out the “Algorithm Design Manual” and flip to the part on dynamic programming. Though I remember the applications in that book being interesting but seemingly far removed from what programmers do day-to-day. It seemed to be by a professor who consulted on algorithms for various companies, which is an interesting job!

                                                            2. 1

The GraphHopper routing library uses contraction hierarchies, which are implemented using Dijkstra’s shortest path algorithm and A* search, which are special cases of dynamic programming.

                                                              I have to agree it’s not something most people will use every day, but it never hurts to have a general idea how it works.

                                                              1. 1

                                                                Here is a concrete example of Dynamic Programming that you use every day: Word Wrap. Knuth has an algorithm that is often used for maximizing the number of words per line.

                                                                Also the field of Bioinformatics often uses the Levenshtein distance when matching two dna strands.

Also I would like to mention the single most important thing I learned from Dynamic Programming: start at the end case, and figure out what constraints can work from there. For example, think about the last recursion call and what constraints it needs, and go backwards from there.
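A heavily simplified sketch of the word-wrap idea (not the real Knuth-Plass algorithm), which also happens to follow that “start at the end” advice:

```python
# Minimise the sum of squared trailing spaces over all line breaks.
# best[i] is the cheapest way to lay out words[i:], computed from the back
# of the paragraph forwards.
def wrap_cost(words, width):
    n = len(words)
    best = [0.0] * (n + 1)          # best[n] = 0: nothing left to lay out
    for i in range(n - 1, -1, -1):
        best[i] = float("inf")
        line_len = -1               # accounts for the space before each word
        for j in range(i, n):
            line_len += len(words[j]) + 1
            if line_len > width:
                break               # line i..j no longer fits
            slack = 0 if j == n - 1 else (width - line_len) ** 2
            best[i] = min(best[i], slack + best[j + 1])
    return best[0]

print(wrap_cost("the quick brown fox jumps over the lazy dog".split(), 15))
```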

                                                              1. 9

                                                                The first comment ( https://lobste.rs/s/gigoo8/end_redis_adventure#c_say5sq ) you give as an example of “pretending systematic racism, sexism, and bias aren’t a thing” is absolutely not an example of that. The point made in the linked comment is valid.

Statistics vary, but open source contributors skew heavily male; for instance, one survey by GitHub and some academics came up with a value of 95% being male. If we assume open source maintainers have the same gender ratio (I expect it’s similar), then 4 randomly selected projects have an 81% chance of all being maintained by men. Trying to shame some random poster posting on a non-gender-related post for not deciding to include an “affirmative action” item in his list of projects is ridiculous. Including an affirmative action item at all has questionable value.

                                                                Nothing about the above is related to systematic bias of any form, there is nothing systematic about a single list generated by a random poster. At best you could be trying to argue not that this is an example of pretending systematic bias isn’t a thing, but an example of pretending a particular poster isn’t biased. The evidence doesn’t even support the idea that they were pretending that a particular poster wasn’t biased, since the evidence points to that poster actually not being biased.

                                                                1. 5

With all the enthusiasm for zettelkasten/second-brain-like systems (roam, org-roam, now this), I’m surprised that I haven’t heard of an external format/tool that various UIs can interface with. VSCode, at least that’s my impression, is the kind of editor that gets displaced from its throne every few years by the next new thing, as has happened to Sublime and Atom before, so I certainly wouldn’t be too confident in making my “second brain” depend on it, except maybe if it’s used as a brainstorming tool for projects, but then it would have to be distributable too – but from skimming the article that doesn’t seem to be the case.

                                                                  Edit: Fixed the first sentence, sorry for my ignorance. Also I missed that this is markdown based, so I guess the rest of the comment isn’t quite right either, but I guess/hope my general point is still legitimate.

                                                                  1. 6

                                                                    I’m surprised that nobody has been working on an external format/tool that various UI’s can interface

Check out neuron, which is editor-independent, has native editor extensions, but can also interface (in the future) with editors through LSP.

                                                                    Some examples of neuron published sites:

                                                                    Easiest way to get started (if you don’t want to install yet): https://github.com/srid/neuron-template

                                                                    1. 3

That sounds cool, but I don’t really get why LSP would help? I (personally) would much prefer a native client, in my case for Emacs, to something that forces itself into a protocol meant for program analysis.

                                                                      1. 2

                                                                        Well, neuron does have native extensions for emacs and vim (see neuron-mode and neuron.vim) - but LSP support just makes multiple editor support easier by shifting common responsibility to a server on neuron.

                                                                        EDIT: I’ve modified the parent comment to clarify this.

                                                                      2. 1

Is there any easier way to install (i.e. without nix)? I’m on a laptop and installing new toolchains is prohibitive for the low storage I have.

                                                                        1. 1

                                                                          Nix is the only way to install neuron (takes ~2GB space including nix and deps), until someone contributes support for building static binaries.

                                                                          But I’d encourage you give Nix a try anyway, as it is beneficial even outside of neuron (you can use Nix to install other software, as well as manage your development environments).

                                                                          1. 2

                                                                            I got a working binary with nix-bundle, that might be a simpler option. It’s a bit slow though, especially on first run when it extracts the archive. nix-bundle also seems to break relative paths on the command line.

                                                                            1. 1

                                                                              Interesting. Last time I tried nix-bundle, it had all sorts of problem. I’ll play with it again (opened an issue). Thanks!

                                                                      3. 3

                                                                        Isn’t the markdown that this thing runs on exactly that external format, and one that has been getting adoption across a wide range of platforms and usecases at that?

                                                                        1. 3

                                                                          There is tiddlywiki and the tiddler format.

                                                                          1. 2

                                                                            I wish the extension used the org format instead of markdown (so if something happens to vscode, I can use it with emacs), but otherwise I totally agree with your comment!

                                                                            1. 2

                                                                              You can use markdown files with org-roam in emacs by using md-roam. I prefer writing in Markdown most of the time, so most of my org-roam files are markdown files.

                                                                          1. 13

Huh? For 10 users you need a separate database and for 100 users you need 2 web servers??? I’m not sure if those numbers are meant to be taken literally, but they’re off by orders of magnitude.

                                                                            We’re going to take our new photo sharing website, Graminsta, from 1 to 100k users.

                                                                            Yeah this seems totally wrong. I think back in the day imgur did this all on one box, and that’s how they were able to economically provide free photo hosting for so long (it seems to have gone south now, but in the early days it was very good).

                                                                            1. 8

                                                                              At $big_streaming_company we got to ~1M users with a few servers before needing to do anything clever.

                                                                              1. 4

                                                                                Totally agreed – though it does depend a bit on what he means by “N users”. Is it “N users who are using your site flat-out at the same time” or “N users in your database, but only 1 or 2 use the site once a month or so”. Those are very different, but even for the first option I suspect he’s still out by an order of magnitude.

                                                                                At a previous company we didn’t really have users, but we handled millions of page views on a single, relatively small database and a monolithic application with about 10 server nodes. We could handle about 300 requests per second if I recall correctly.

                                                                                1. 3

                                                                                  I think it really depends on what you’re doing.

                                                                                  At home, I just made https://droid.cafe/starlink. It’s running off a free f1-micro instance from google. Each page load takes roughly 40 microseconds of cpu time and is just serving up a few static files. Eyeballing it I think I could probably get rid of cloudflare (which I added out of curiosity not necessity) and still handle more than 10m daily users (computers are really fast these days). By upgrading to a faster instance, and minimizing the html/css/js, I bet I could push that to well over a billion a day. Of course at those scales I’d be paying a lot for geocoding services, but it would be 1 server managing to serve a webpage to most of the world every day.

At work I don’t know that anyone has counted, but I bet we have roughly 1 production server per user (user defined as someone who directly interacts with our apps). That’s because each of our users gives our servers a lot of work to do, and our servers provide that user’s company a lot of value.

                                                                                  Anyways, the real point is to not focus on the numbers.

                                                                                  1. 2

                                                                                    Yeah, we’ve got a particularly slow rails app serving 6 million users/day. It runs on 6 AWS instances, but we could just buy a server 4x or 8x bigger and skip the load balancing.

                                                                                    1. 1

                                                                                      At the bottom of the article it mentions:

                                                                                      This post was inspired by one of my favorite posts on High Scalability.

                                                                                      If you read the High Scalability post, you can see that a lot of the numbers given were ranges, not hard limits. In “Scaling to 100k Users”, the headings are:

                                                                                      • 1 User: 1 Machine
                                                                                      • 10 Users: Split out the Database Layer
                                                                                      • 100 Users: Split Out the Clients
                                                                                      • 1,000 Users: Add a Load Balancer.
                                                                                      • 10,000 Users: CDN
                                                                                      • 100,000 Users: Scaling the Data Layer - caching, read replicas

                                                                                      High Scalability’s article has:

                                                                                      • 1 User - 1 machine
                                                                                      • Users > 10 - Separate DB and app server
                                                                                      • Users > 100 - Store data on RDS
• Users > 1000 - Elastic Load Balancer with 2 instances, multi-AZ for app servers, standby database in another AZ
                                                                                      • Users > 10,000s - 100,000s - Read replica, CDN, maybe caching, autoscaling
                                                                                      • Users > 500,000+ - Autoscaling groups, caching, monitoring, automation, decouple infrastructure, maybe Service Oriented Architecture (SOA)
                                                                                      • Users > 1,000,000+ - Requires Multi-AZ, Load balancing between tiers, SOA, S3+CloudFront for static assets, caching in front of DB.
                                                                                      • Users > 10,000,000+ - Data partitioning/sharding, moving some data to specialised DBs
                                                                                      • Users > 11 Million - More SOA, Multi-region, deep analysis of entire stack.

                                                                                      If you read Scaling to 100k Users with a > instead of an = then it is slightly more understandable (though splitting out the clients at 100 users doesn’t make a lot of sense to me).

                                                                                      1. 6

                                                                                        Right now - today - you can comfortably serve > 11 million users off one server in a good colo.

                                                                                        There are smaller systems with decade+ uptime using this approach.

                                                                                        IMO most of these practices are about altering your risk profile rather than your performance profile (which is also important!). The risk profile of a single server in a single DC is unacceptable for many, many businesses.

                                                                                    1. 4

                                                                                      Why was the framework written in the first place?

Why is URL canonicalization not outsourced to an open source library that is called from the bespoke framework?

                                                                                      1. 5

Why is URL canonicalization not outsourced to an open source library that is called from the bespoke framework?

                                                                                        While a good question, I think the relevant question here is actually “why was the authentication code working on the unparsed URL”? Using a good canonicalization library doesn’t help if you don’t use it in the authentication path.

                                                                                      1. 1

                                                                                        1.14 — Rustup 1.0

I’ve never really liked it. The fact that the official website still tells me to use curl ... | sh to install yet another tool makes rust seem unprofessional, at least in this regard. And of course I could install it from my package manager’s repositories, but then building relatively simple tools like a linter or a completion engine fails because they depend on some nightly features.

                                                                                        I guess it’s my problem that I expect these things, but either way those have been the things that have been keeping me away from the language.

                                                                                        1. 14

We’ve had many discussions on the topic of whether you should let people pipe your malware(tm) directly into their shell or not. Every time with varying arguments of

                                                                                          • but I want signatures
                                                                                          • nobody checks signatures
                                                                                          • you have to trust those signatures at one point also
• which is like the first rustup installation; you don’t pipe into shell afterwards anyway
• but I want my malware(tm) downloaded to HDD before executing the installer blindly
• but I always inspect the malware(tm) I download, before executing it anyway
• you can always build rustc by yourself, enough about that in the readme
• but you can’t verify the correctness of every line in the compiler anyway
• you can’t verify the integrity of the binary, as you can’t verify the code and thus can’t create a clean build to compare with anyway
                                                                                          • you can just use your package manager

                                                                                          Pick your poison..

                                                                                          1. 3

                                                                                            My issue isn’t even security, I just want to centrally manage all software on my system.

                                                                                            1. 7

Then… install rustup or rust through your package manager, not from the official website? Just like every other piece of software you want managed centrally?

                                                                                              1. 0

I use Debian, and as far as I can see, no version packages rustup. And the versions of Rust that are packaged appear to lack something that the nightly versions have. And that’s what annoys me: that if you want tooling you need to use nightly (whatever the technical or political reason for this may be). Imagine having to use the dev build for Java to get autocompletion, or having to track the master branch for Go if you want static analysis.

                                                                                                1. 1

                                                                                                  I dunno, I’ve never really used nightly but the tooling works just fine in my experience. What features do you want that are nightly-only? rust-analyzer builds just fine on stable.

                                                                                                  As an outside observer, I’m given to understand that Rust is a tricky thing to package for the Debian crew, since they want to be able to package Rust libraries but the Rust ABI is only stable within a particular compiler version. So, not packaging Rustup is kinda the best of bad options there: either you use Debian’s rustc (1.34 on current Stable which seems to be a sorta informal “if you want to use a stable version this is a good one”, 1.42 on Testing which is pretty close to the current most-recent) and can use the libs that Debian distributes, or you use rustup and it’s up to you to provide the libs you want that are built with the compiler you want.

                                                                                                  1. 1

                                                                                                    rust-analyzer needs the latest stable though, I think this might be where the confusion about nightly features comes from (ra uses “nightly” features from the perspective of 1.34). If folks want to build rust-analyzer from source with old compilers, they have to stick to old versions of rust-analyzer as well.

                                                                                                    1. 1

                                                                                                      Is there a way to tell cargo what version to use? If yes, that should be better documented.

                                                                                                      1. 1

                                                                                                        Yes, I think cargo install can accept tags/commits when building from a git repo. When building from source, it’s possible to check out the required commit.

                                                                                                    2. 1

I tried to install racer a while ago, and it failed because of missing nightly features. Because of your comment, I tried again, but it still seems to fail, citing a lot of “unstable” or “experimental” features. This all just adds up to a very bad introductory experience.

                                                                                                      1. 4

                                                                                                        The README for racer tells you that it requires nightly Rust. Its dependencies tell the story: it depends on a bunch of rustc specific crates.

                                                                                                        racer bootstrapped the IDE experience for Rust using what I’m guessing was the quickest possible method: using the compiler directly. The future of IDE for Rust is being concentrated in rust-analyzer, which I guess does compile with Rust stable. But probably not the version of Rust in Debian, because Debian packages an older version of Rust, as I’m sure you’re aware. The simplest way to build an older version of rust-analyzer is to clone its repository and checkout an older version. But at that point, you’re now faced with the research task of figuring out which version is compatible with your Rust compiler, if such a version even exists.

                                                                                                        In general, if you’re writing Rust code, the Rust ecosystem generally assumes that you’re not only using rustup, but are using the latest version of the compiler. Archlinux, for example, has no problems packaging rustup. I don’t know why, off hand, Debian doesn’t. There isn’t a lot of flexibility built in for folks using a compiler older than, say, two releases. That work flow is not well supported, but is possible. It is certainly not a good way to introduce yourself to the language, and I don’t really see that changing any time soon. There has been talk about LTS releases, but it’s not particularly popular. There’s also been some effort toward making the minimum supported Rust version (“MSRV”) an officially supported concept by Cargo, which I think would, at minimum, give you better error messages.

                                                                                                        It’s a sad hill to die on in my opinion, but I guess you have your requirements. Personally, if I were stuck using Debian, I’d be thankful for the easy curl ... | sh command I could run to liberate myself and use the latest version of the Rust compiler. Not even necessarily because I have to, if only to get the latest compilation time improvements, which are happening pretty frequently.

                                                                                                        1. 1

                                                                                                          There’s also been some effort toward making the minimum supported Rust version (“MSRV”) an officially supported concept by Cargo, which I think would, at minimum, give you better error messages.

                                                                                                          That sounds great, thanks for the insight!

                                                                                                          It’s a sad hill to die on in my opinion …

                                                                                                          I don’t have to use Rust, I just want to explore and play with it. That’s why I’d want things like completion and static analysis (preferably without LSP, but that’s getting harder and harder). If I had to use it, then all these things wouldn’t be requirements or I would use rustup.

                                                                                                          1. 2

Gotcha. Yeah, LSP appears to be the future. In a couple of years’ time, I’d be surprised if there were any actively maintained non-LSP solution for Rust.

                                                                                                            I’ve made my peace with LSP. I used to use ctags for goto-definition, but it was just plain wrong a noticeable number of times. The only way to fix that AFAIK, for Rust at least, is to use one of the smarter tools which are all LSP based.

LSP stuff outside of VS Code still appears to be early days. The user experience has overall just not been great from my perspective, and that’s coming from someone using vim (well, now neovim), which actually enjoys a wide variety of LSP client implementations to choose from. I shudder to think what it’s like for folks using less supported editors. Hopefully it will get better though.

                                                                                                      2. 1

                                                                                                        For example, just yesterday

                                                                                                        • I used miri (which requires nightly) to help decide if I had any faith in some unsafe code being correct.
• I used -Z macro-backtrace to help debug some macros I was using (only available on nightly as an unstable compiler flag). Using the flag was suggested by the compiler.
                                                                                                      3. 1

I’m not 100% clear on what the Debian packaging story is in general these days, but google suggests this?

                                                                                                    3. 1

                                                                                                      If you need nightly and your distribution doesn’t package it, it sounds like that the instructions telling you to use your distribution’s package manager wouldn’t help anyway. Certainly you could package rust-nightly yourself. Or, install rustup and manage toolchains that way.

                                                                                                      1. 1

                                                                                                        If your distro doesn’t package rustup (which is reasonably common to package) or nightly (which would be really weird to directly package), then ya, you’re out of luck and need to install it by hand. So… use a distro that packages a wide variety of things if you care about a wide availability of packages.

Here is an arch package in the official repos, and here is a void linux package in the official repos. I’m not 100% clear on how it fits into the packaging story, but ubuntu seems to be moving towards “snaps”, and here is a rustup snap for ubuntu and a rustup snap for debian.

                                                                                                  2. 11

At least it’s not asking you to pipe it as root, and I really appreciate having one solution for most *nix-y systems, at least after dealing with Python’s installation and package ecosystems. I’m pretty confident I won’t break my rust install and be stuck wondering which environment variable should point where.

                                                                                                  1. 2

So unsurprisingly, unless your language uses the C way, you’ll have trouble trying to integrate with anything relevant on *nix? At least that’s the message I get from this. (The article is really interesting; I’m just disappointed about this cruft, and that this is the solution in the all-new Wayland.)

                                                                                                    1. 10

It varies. Some libraries map very well, and are a pleasure to use. When they fit, Rust can add borrow checking and thread-safety, and automate memory management on top of existing C code.

When a C library has straightforward memory management rules, like “every widget_init must be followed by widget_free”, it works great. The more it goes towards “if florp flag is 3, then XOR the pointer with the name of your pet and dealloc &ptr[-17]”, the harder it gets to explain that to the borrow checker.

                                                                                                      It’s worth noting that it’s never a showstopper. As the article says, you can always shrug and wrap the whole code in unsafe or keep it in check with runtime assertions. The trouble is self-inflicted when Rust programmers bend backwards to have everything zero-cost and guaranteed safe.
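To make the first, friendly case concrete, here is a minimal sketch assuming a hypothetical C library that follows exactly the widget_init/widget_free rule above (none of these names come from a real library). The Drop impl ties the C object’s lifetime to the Rust wrapper, so the pairing rule is enforced automatically:

```rust
use std::os::raw::c_void;

// Hypothetical C API, as described above: every widget_init must be
// followed by widget_free. These declarations are illustrative only.
extern "C" {
    fn widget_init() -> *mut c_void;
    fn widget_free(widget: *mut c_void);
}

/// Safe wrapper: the C allocation's lifetime is tied to this struct.
pub struct Widget {
    ptr: *mut c_void,
}

impl Widget {
    pub fn new() -> Option<Widget> {
        // SAFETY: the hypothetical widget_init has no preconditions.
        let ptr = unsafe { widget_init() };
        if ptr.is_null() { None } else { Some(Widget { ptr }) }
    }
}

impl Drop for Widget {
    fn drop(&mut self) {
        // SAFETY: ptr came from widget_init and is freed exactly once here.
        unsafe { widget_free(self.ptr) };
    }
}
```

Once the rules get as hairy as the second example, this kind of wrapper stops being a mechanical translation and starts needing unsafe blocks or runtime assertions, as noted above.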

                                                                                                      1. 6

Wayland is, in my experience, actually fairly unique in how hard it is to integrate with from Rust. People have been integrating with all sorts of other “relevant” things without trouble: for example kernel modules, x11, various foreign languages (like writing python modules in rust, or even eBPF in the kernel), user space libraries like gtk, and so on and so forth.

                                                                                                        1. 5

                                                                                                          wayland-rs contains both a from-scratch Rust implementation and a libwayland binding. If you don’t need to use EGL or Vulkan, you can get away with the pure Rust implementation… but yeah, libEGL/libvulkan themselves use libwayland (and xlib/xcb, and equivalent native things on other platforms) so you can’t exactly give them your objects from your custom implementation of the protocol.

                                                                                                          1. 3

                                                                                                            Yes, the protocol itself fits Rust fairly nicely, but a window manager without accelerated graphics for at least the clients is pretty much worthless. Last I checked it wasn’t (reasonably) possible to do that without libwayland integration.

                                                                                                            Curiously that readme says

                                                                                                            There are also two auxilliary crates:

                                                                                                            • wayland-egl, which is necessary client-side for OpenGL integration

                                                                                                            I might have to look into what that means/if they managed to avoid the libwayland requirement somehow.

                                                                                                            1. 3

                                                                                                              No, of course that crate is meant to be used with the libwayland option. If you wanted to avoid the requirement, you’d have to reimplement the actual libwayland-egl.

                                                                                                      1. 2

I’m currently running arch linux and void linux. Future installs will probably be void. Basically via process of elimination:

A while back I was bitten hard by trying to upgrade non-rolling-release distros from one version to the next. I also like being on recent versions of software, so that disqualifies Ubuntu/Fedora/most versions of debian.

                                                                                                        Debian sid was never advertised as particularly good for end users, and issues like this will continue to keep me far away.

                                                                                                        Gentoo sounds like too much work.

                                                                                                        Arch linux has systemd, which isn’t the end of the world but has caused me a few headaches.

So far the only bad things I have to say about Void are that it doesn’t package openssl, only libressl, and that the documentation is a big step down from arch. On the plus side, the fact that it has its package repository on github makes it incredibly easy to see what is happening and to contribute, compared to most distros.

                                                                                                        1. 10

I am currently using Void Linux. I hate systemd; it’s a hassle to maintain. Void uses runit, and its package manager is very good.

                                                                                                          1. 1

How would you compare its package manager to pacman?

                                                                                                            1. 3

It is much the same, except split into multiple binaries. Even the system upgrade process uses -Syu!

However, the package building system is much more complex IMO (though really powerful!). Building a package from scratch will spin up a basic void chroot with your dependencies to make sure you didn’t forget anything, usually resulting in high quality packages. This raises the bar to entry though, and I say that as the maintainer of a lot of packages on crux (and arch, back in the day).

                                                                                                              I do have void installed on a notebook though, because it runs well without systemd, and just works. If I need a package from the repo, it is easy to install it. If it is not in the repos, I compile it manually and use my own package manager to install to /usr/local or $HOME/.local.

                                                                                                              1. 2

The UI is mildly superior, at least once you install xtools; the functionality is basically identical, at least from an end user perspective. The lack of an AUR equivalent means installing things from other people’s forks is slightly more painful, but it’s not hard to build packages from source.

                                                                                                            1. 2

I feel like the “dot-map” one does the best job. You get a quick and accurate view of which regions are crammed with infected people and which are not, but also where within the region. I’m surprised it isn’t used more (I’ve never seen one before this).

                                                                                                              1. 23

                                                                                                                but also where within the region

This is wrong: points are spread randomly within the province; they don’t correlate to actual people or cases. This makes the dot-density map really misleading, and this comment is evidence.

                                                                                                                1. 6

I don’t like the dot one. It doesn’t correct for population. It implies that cases are spread uniformly over the area, but it’s clear that that assumption isn’t even close to correct. Cases are undoubtedly heavily weighted towards cities just because that’s where the people are, different parts of the province are almost certainly doing differently, etc.

                                                                                                                  I particularly don’t think it’s responsible to make it look like the entire area is doing the same when we are talking about something like a virus. People use the press to decide how they should respond to the virus, and if they are in the region, local variations matter. Note that this is different from the other maps that make it clear that we don’t have fine grained enough data to divide up the region, but don’t suggest the absence of local variation.

For this data in particular, glhaynes also has a very good point that it looks like the color has been inverted or something. I was similarly confused at first; removing the label might help.

                                                                                                                  1. 1

                                                                                                                    Replying to:

                                                                                                                    Don’t you think it’s more useful to report an estimation of where an infection happened, rather than saying “we don’t know, somewhere in this massive area”?

                                                                                                                    Cases are undoubtedly heavily weighted towards cities just because that’s where the people are, different parts of the province are almost certainly doing different, etc.

And this is what the dot density map implies anyway? Or are we talking about two different maps? X) I’m seeing very, very sparse dots outside cities and dense as hell in cities…?

                                                                                                                    1. 2

                                                                                                                      I feel like we must be interpreting the map differently, or looking at different things.

                                                                                                                      What I’m seeing, in this image, is a map divided into a bunch of provinces. Provinces contain both rural and urban areas. Within each province dots are spread uniformly at random, i.e. any point within the province is equally likely to have a dot no matter the local population density, and the local propensity to have the virus relative to the rest of the province.

                                                                                                                      1. 2

                                                                                                                        Within each province dots are spread uniformly at random

                                                                                                                        Oh, they are placed randomly in the province? :( Ok, that does suck.

                                                                                                                  2. 3

                                                                                                                    I had a little trouble with the dot density map: because the dots were so dense in Hubei Province as to almost entirely fill it, I initially read Hubei as being rendered in inverse color for emphasis, leading me to interpret the small remaining non-dotted negative spaces there as being its “dots”.

                                                                                                                    Probably not a problem in a higher-resolution setting — or for smarter viewers!

                                                                                                                    Even then, though, it could easily be interpreted as showing a harder boundary at the province level than actually exists — as though the province was basically filled with cases that abruptly end right at the edge. That seems particularly likely to mislead a viewer in this case into thinking this is showing the effect of a hard quarantine right on the province boundary.

                                                                                                                    1. 1

                                                                                                                      FWIW I experienced the exact same thing

                                                                                                                  1. 46

The link displayed at the bottom of the web browser window, when you hover over a link, cannot be trusted.

Google has been doing it for ages. You can reproduce it on Firefox: just search for something on Google, then hover over a result link. It will say it goes directly to the website. Then right-click the link and dismiss the menu. Now hover over it again. The link has changed to a Google tracker URL. It’s shady and it’s not okay, and in my eyes it is a UI bug / exploit that needs to be fixed in the browsers. Links shouldn’t change right under your nose.

                                                                                                                    1. 10

This is a feature that all browsers have had for ages: “Disable JavaScript”.
The whole concept of running arbitrary code on the client side is a feature, and it means exactly that: any arbitrary code can run on the client side. The browser has no way of knowing which snippet is good or bad.
You can argue that this specific case is obviously malicious and should be treated explicitly, but then you have two major issues:

                                                                                                                      1. You cannot treat this without breaking something else (changing href attribute is a valid usage)
                                                                                                                      2. You will soon have to treat billions of other “malicious cases” (changing content based on user-agent, hiding content, …)

And anyway, people will find a way to break through, for example by making calls to an analysis website on load, or even by replacing the current page with the target link passed through an analysis website.

                                                                                                                      The only safe way to counter this is to disable javascript, or at least, ask the user for the right to run it (though this will only get in the way…).

                                                                                                                      1. 2

But the browser could kill the whole pattern by changing the order of execution. Currently the execution flow looks like:

                                                                                                                        1. Detect click from operating system
                                                                                                                        2. Run javascript
                                                                                                                        3. If click is on a href, go to link

                                                                                                                        Just change the order of 2 and 3. The browser controls the javascript vm. There is no reason it needs to let it run between detecting the click and checking for href’s.

Legitimate patterns that need the onclick to run before going to a link can still work; they just need to follow the link in JS instead of doing it through the href attribute, which avoids the browser displaying the contents of the false original link at the bottom of the page.

                                                                                                                        1. 3

Running the code after following the link means giving priority to the elements themselves rather than the code. That would indeed fix the issue for “malicious” links, but it would kill a whole lot of other (legitimate) stuff.

Take as an example someone filling in a form and submitting it from a link. The onClick event can be used to check user input before submitting anything, and to cancel the submission if, e.g., one field is not filled in or is incorrect.
By running the JS after following the link, you simply prevent that process. And I can barely imagine a (valid!) use-case with onClick that would work as intended if the code is run after the link is followed.

                                                                                                                          1. 1

                                                                                                                            For forms, the url isn’t displayed by the browser so this change isn’t needed.

But let’s pretend that it was, so we can use your example. The “fix” for making on-click continue to be useful would be as simple as removing the url from the html attribute and instead passing it to the JavaScript; the JavaScript could then load the url (or send the post request) after validating the input. The only difference here is the url isn’t part of the element, so the browser doesn’t display it.

                                                                                                                      2. 4

You find this out the hard way when you want to simply copy a link from google, only to paste a load of tracking horseshit.

                                                                                                                        1. 3

                                                                                                                          First, I agree with what you said, that this is not OK.

                                                                                                                          But how would you fix this in a way that doesn’t break legitimate uses of onClick?

                                                                                                                          1. 3

                                                                                                                            Interestingly, there exists a ping attribute on the a element, which would facilitate click tracking without sneakily changing the link with JavaScript. Google uses it on their search results, but only if you’re using Chrome. On Firefox, it appears to use <a onmousedown> instead, which would be responsible for the swap. The ping attribute is supported on Firefox, but it’s not enabled by default. It is on Chrome, though. Hmm.

                                                                                                                        1. 1

I’m having some difficulty identifying the source of the data the author uses; my guess is GitHub, which would make this interesting but not very practical or representative of “real life”. Anyone have any ideas?

                                                                                                                          1. 1

I checked out the source data (he links a tarball of code and data); it looks like it’s an analysis of just two repositories, probably evolution and apache judging by googling a file name.

                                                                                                                            1. 1

                                                                                                                              Interesting! I need to dig into that a little bit later - seems like too small of a sample size to really draw conclusions.

                                                                                                                          1. 2

                                                                                                                            I see it uses SMT solvers to check the annotations. I know that Z3 is quite impressive, but am interested in how scalable this would be. Does the language design ensure these checks are kept ‘local’ (i.e. adding N definitions/calls adds O(N) time to the build), or can the constraints interact over a longer distance (potentially causing quadratic or exponential time increase)? I’d also like to know if there are common failure-modes, e.g. where some common code pattern can’t be solved without extra annotations.

                                                                                                                            For example, the classic “hello world” for dependent types is to define Vector n t (lists containing n elements of type t), and the function append : Vector n t -> Vector m t -> Vector (plus n m) t whose output length is the sum of the input lengths. Woe betide anyone who writes plus m n instead, as they’d have to prove that plus commutes!

                                                                                                                            1. 3

                                                                                                                              With SMT based verification, each property is proved separately and assumed true when proving others. So there’s no problem with locality. The problems are very different vs. dependent types. SMT is undecidable, so unless the tool is very careful about using decidable logic (like Liquid Haskell is), you’ll very often see that the solver just times out with no solution, which is pretty much undebuggable. It’s difficult to prove anything through loops (you have to write loop invariants), etc.

                                                                                                                              1. 2

                                                                                                                                In this case, SMT isn’t undecidable: a friend pointed out the write!(solver,"(set-logic QF_UFBV)\n").unwrap(); line to me, which means “quantifier free with uninterpreted functions and bitvectors”. It’s decidable, just super hard.

                                                                                                                                1. 2

                                                                                                                                  Yeah, if you can get away with quantifier-free, everything is much better. But you can’t do much with C code without quantifiers. zz probably doesn’t support verifying loops.

                                                                                                                                  1. 2

                                                                                                                                    I agree, unless ZZ has its own induction system for loops, using SMT only to prove the inductive hypothesis. It’s a lot of work though.

                                                                                                                              2. 2

                                                                                                                                Does the language design ensure these checks are kept ‘local’ (i.e. adding N definitions/calls adds O(N) time to the build), or can the constraints interact over a longer distance (potentially causing quadratic or exponential time increase)?

                                                                                                                                I’m also interested in this in terms of API stability. If interactions happen at a longer distance it can be difficult to know when you are making a breaking change.

                                                                                                                                1. 2

Z3 “knows” addition commutes, so that’s no problem. The usual trouble with dependent types is that you define addition yourself, so it is hard for Z3 to see that your addition is in fact addition.

                                                                                                                                  1. 1

                                                                                                                                    Yes, I gave that example for dependent types (the type requiring a different sort of branching than the computation) since I’m more familiar with them.

The general question was about the existence of pathological or frustrating cases in this SMT approach (either requiring much more time, or extra annotations) arising from very common assertion patterns, or where the “obvious” approach turns out to be an anti-pattern.

                                                                                                                                1. 9

                                                                                                                                  There is a documented public use of the term “open source” from 1996…

                                                                                                                                  https://hyperlogos.org/article/Who-Invented-Term-Open-Source

                                                                                                                                  1. 8

                                                                                                                                    And 1993. There are probably more references if you care to search long enough.

I have no reason to believe that Christine is dishonest in her retelling of events, by the way. It’s just a logical term that got independently coined on separate occasions, since “open” was (and still is) a commonly used adjective (Open Software Foundation, OpenBSD, etc.)

                                                                                                                                    1. 4

Another scattered reference, but if OSI weren’t to be credited for the phrase, we should be able to find a lot of uses of the phrase before 1998. The fact that we have to look really hard to find two very rare usages before 1998 does not undermine OSI’s contribution to the explosive popularity and meaning of the term.

                                                                                                                                    2. 6

Everyone always brings this up as some kind of proof that “open source” didn’t enter public discourse before Tim O’Reilly bankrolled OSI’s publicity campaign in 1998:

                                                                                                                                      https://books.google.com/ngrams/graph?content=open+source%2Cfree+software&year_start=1980&year_end=2008&corpus=15&smoothing=3&share=&direct_url=t1%3B%2Copen%20source%3B%2Cc0%3B.t1%3B%2Cfree%20software%3B%2Cc0

                                                                                                                                      Yes, there were a few scattered and seldom repeated uses of the phrase, but I find Caldera’s usage unconvincing here. The phrase “open source” appears just once in the title and in the body it’s used as “open source code”, which to me reads more like “open sourcecode” than like “opensource code”.

                                                                                                                                      We can also find the phrase “free software” in a lot of places that have little to do with what the FSF means by it. This doesn’t mean that “free software” doesn’t mean what the FSF and OpenBSD mean by it.

                                                                                                                                      1. 5

                                                                                                                                        In the body it is used (hyphens not added)

                                                                                                                                        • 2x: “open-source” not followed by the word code.
                                                                                                                                        • 3x: “open-source code”.
                                                                                                                                        • 1x: “open source code”.

Plus in the 6-word title as “open source”, not followed by the word code. And open source is the main object.

                                                                                                                                        Moreover it’s not just a throwaway phrase but a label they are using for their model

                                                                                                                                        Caldera believes an open source code model benefits the industry in many ways.

                                                                                                                                        They continue to be extremely interested in DOS and support our open-source technology direction

                                                                                                                                        Caldera’s OEM and Channel Partners can utilize the open-source code models for DOS and Linux to create

I’m honestly not sure how it could be any clearer that, if this is the original use, they are coining the phrase.

                                                                                                                                        1. 3

                                                                                                                                          Oh, right, they do hyphenate it.

Anyway, it didn’t catch on. It wasn’t because of Caldera that we all started saying “open source”; it was because of Tim O’Reilly bankrolling OSI. They deserve the credit for coining and popularising the phrase.

                                                                                                                                        2. 3

                                                                                                                                          It seems to me that ngram chart rather undermines your own point by showing that there was already an upward trend of “open source” before 1998, which continued on the same trajectory for a few years after 1998.

                                                                                                                                          I find your argument regarding “open source code” unconvincing; “source” is just short for “source code”, and clearly the adjective “open” was combined with the noun “source” (or “source code”) to mean “the source is available for people to download”.

                                                                                                                                      1. 5

Anyone know if it is possible to make multiple dynamically linked Rust libs that are usable from C without including the Rust runtime multiple times?

                                                                                                                                        1. 7
• Rust doesn’t guarantee a stable ABI. You can link Rust’s libstd dynamically, but the library will be very version-specific. You can get away with this if you distribute it and all the libraries that depend on it together.

                                                                                                                                          • You can make one dynamic C ABI library that consists of multiple static Rust libraries. LTO will dedupe them.
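To make the second option concrete, here’s a minimal sketch of such a wrapper crate, assuming it is built with crate-type = ["cdylib"] and depends on two ordinary Rust crates (called `foo` and `bar` purely for illustration); everything Rust, including libstd, ends up in the one resulting .so:

```rust
// lib.rs of a hypothetical wrapper crate built with crate-type = ["cdylib"].
// `foo` and `bar` are stand-ins for two ordinary Rust dependencies that get
// statically linked in; only the extern "C" functions below are intended to
// be called from C, and libstd is included exactly once.

#[no_mangle]
pub extern "C" fn mylib_foo_add(a: i32, b: i32) -> i32 {
    foo::add(a, b) // hypothetical function from the `foo` crate
}

#[no_mangle]
pub extern "C" fn mylib_bar_mul(a: i32, b: i32) -> i32 {
    bar::mul(a, b) // hypothetical function from the `bar` crate
}
```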

                                                                                                                                          1. 1

                                                                                                                                            You can make one dynamic C ABI library that consists of multiple static Rust libraries. LTO will dedupe them.

Right, but if two of your downstream dependencies both produce a .so file written in Rust, won’t it all break down because you have two copies of the std allocator, one in each .so file?

                                                                                                                                            1. 5

No. It’s no different from having Rust’s allocator (which may not be your system’s allocator) alongside C’s allocator. When you cross FFI boundaries, you generally try not to assume any particular allocator. It’s up to each library to allocate memory internally and expose its own free routines. The same goes for Rust when exposing a C interface.
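As a rough illustration of that convention, a Rust cdylib can hand C an opaque pointer and provide a matching free function, so the caller never has to care which allocator produced the memory (the names here are invented):

```rust
// A hypothetical C ABI exposed from a Rust cdylib. Allocation and
// deallocation both happen on the Rust side, so the C caller never
// needs to know (or match) the allocator in use.

pub struct Counter {
    value: u64,
}

#[no_mangle]
pub extern "C" fn counter_new() -> *mut Counter {
    // Allocate with whatever allocator Rust uses and hand ownership out.
    Box::into_raw(Box::new(Counter { value: 0 }))
}

#[no_mangle]
pub extern "C" fn counter_increment(c: *mut Counter) -> u64 {
    let c = unsafe { &mut *c }; // caller promises a valid, non-null pointer
    c.value += 1;
    c.value
}

#[no_mangle]
pub extern "C" fn counter_free(c: *mut Counter) {
    if !c.is_null() {
        // Rebuild the Box so the same Rust allocator frees the memory.
        unsafe { drop(Box::from_raw(c)) };
    }
}
```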

                                                                                                                                          2. 3

                                                                                                                                            Rust doesn’t have a runtime, in the sense that Python or Go have a runtime, but it does have a standard library that gets statically linked into the resulting binary.

                                                                                                                                            1. 2

That is a runtime; I’m confused about how this can work without causing multiple redefinitions of the stdlib symbols.

                                                                                                                                              1. 2

As long as these aren’t publicly exposed (i.e. they are defined as static in C terms), you are good to go. Also, you can always use no_core and not use any of the Rust stdlib in the library.
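For the no-stdlib route, a sketch of what a #![no_std] C ABI library can look like (assuming panic = "abort" in the profile and crate-type = ["cdylib"] or ["staticlib"]; no_core would be the even more drastic, unstable variant):

```rust
#![no_std]

use core::panic::PanicInfo;

// With no_std there is no default panic handler, so supply one.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

// Nothing from libstd is linked in; only this symbol is exported.
#[no_mangle]
pub extern "C" fn saturating_add_u32(a: u32, b: u32) -> u32 {
    a.saturating_add(b)
}
```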

                                                                                                                                                1. 1

If you are going to do this, you might want to define your exposed API surface with a common naming convention (my_api_* or something), then use a linker version script to ensure that only the symbols you expect to export are actually exported.

This looks possible in Rust: https://users.rust-lang.org/t/linker-version-script/26691 For information about linker scripts: https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_25.html

I do this with C++ libraries. I’ll statically link libstdc++ into libwhatever.so, then hide all of the stdc++ symbols, libgcc symbols, etc. My libraries usually only have dynamic dependencies on libc. I have no idea if you can replicate this pattern in Rust (I haven’t tried yet), but doing this has been tremendously helpful for deploying libraries in the environments I deploy them to.
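One way this pattern might be wired up from a Rust cdylib (a sketch; the file names are invented) is to pass the version script to the linker from a build script:

```rust
// build.rs (sketch): ask the linker to apply a version script so that only
// my_api_* symbols stay visible in the resulting cdylib.
//
// exports.map (a GNU ld version script, file name invented) would contain:
//
//   {
//     global: my_api_*;
//     local: *;
//   };

fn main() {
    let dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
    // Re-run the build script if the version script changes.
    println!("cargo:rerun-if-changed=exports.map");
    // Only applied when linking cdylib crate types.
    println!("cargo:rustc-cdylib-link-arg=-Wl,--version-script={dir}/exports.map");
}
```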

                                                                                                                                              2. 1

The obvious option is to dynamically link the Rust standard library as well: rustc test.rs --crate-type=dylib -C prefer-dynamic (or pass those options through cargo similarly). Or to use no_std.

                                                                                                                                                I don’t know if there is a more elegant way though.

                                                                                                                                                1. 1

                                                                                                                                                  Does that work in the presence of generics?

                                                                                                                                                  1. 1

                                                                                                                                                    It should, though it’s probably duplicating the asm versions of the monomorphized methods in both libs.

                                                                                                                                                    1. 1

No, it does not, because you cannot represent generics in C’s ABI. You can probably compile the standard lib as a Rust dylib, but then you don’t get a stable ABI and it’s of limited use.