1. 3

    I once played a platformer game that looked like this (circa 2010?). The whole window looked like white noise, but the walls moved/changed at a different rate to the background & the character. It was quite interesting and not uncomfortable to play.

In particular I remember that every time after I played it, my eyes would see the rest of the world as super-smooth for the next few minutes. It was a weird and interesting effect.

    I’ve been trying to find this game again for years. Was it win32? Was it flash? :(

    1. 6

      Hmm, was it Lost in the Static, as inspired by that guy doing the 300 mechanics website?

      1. 1

I remember playing the FKR games by NAL on Yoyogames back when that site was still a thing. FKR5 especially is interesting since it never clears the frame buffer but instead continues to partially draw on top of it. It demonstrates how incredible our capability for visual understanding really is.

      1. 2

        Who doesn’t want more autonomy? Most of what he said could be applied to any kind of individual contributor. I think a lot of people like to think that software engineers are special in some way, but we’re really just humans like everyone else.

        1. 15

          I think it’s because we don’t have better tooling. Wireshark makes observing binary protocols much easier and richer than just text, and if we had richer structural editing (think Interlisp-like editing s-exprs in memory) and structures (again, closer to TLV over C structures), we could just make binary encodings for everything reasonable.
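
For concreteness, a toy TLV codec is only a few lines (tags and layout invented for illustration); the point is that a generic tool could walk any encoding shaped like this:

    import struct

    def encode_tlv(fields):
        """Encode (tag, bytes) pairs as tag:u8, length:u32 (big-endian), value."""
        out = bytearray()
        for tag, value in fields:
            out += struct.pack(">BI", tag, len(value)) + value
        return bytes(out)

    def decode_tlv(buf):
        """Decode back into (tag, bytes) pairs; unknown tags pass through untouched."""
        fields, i = [], 0
        while i < len(buf):
            tag, length = struct.unpack_from(">BI", buf, i)
            i += 5
            fields.append((tag, bytes(buf[i:i + length])))
            i += length
        return fields

    msg = encode_tlv([(1, b"hello"), (2, b"\x00\x2a")])
    assert decode_tlv(msg) == [(1, b"hello"), (2, b"\x00\x2a")]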

          Alas, Unix.

          1. 25

            But text as a common denominator is the reason that there are good tools. Bespoke binary formats introduce a fundamental O(M * N) problem. You have M formats and N operations, and writing M*N tools is infeasible, even for the entire population of programmers in the world.

For example, I can grep Python, JavaScript, Markdown, HTML, a log file, etc. How do I grep a PDF, Word, or RTF file? You need a tool for each format.

            I can git pull and git diff any of those files too. How do I diff two Word files? I have to wait for somebody to build it into Word.

            How do I version control a bunch of video clips cut from the same source? You need a custom tool if you want to do anything but copy the whole thing. (Again video editing tools have this knowledge built in, e.g. with an app-specific notion of a “project”)

            How do I diff two Smalltalk images?

            I’m not saying those are necessarily bad designs, but there’s an obvious tradeoff, and it’s why Unix is popular. And it’s especially popular in open source where you can’t just throw highly paid programmers at the problem.

It is not an absolute; in reality people do try to fill out every cell in the O(M * N) grid, and they get partway there, and there are some advantages to that for sure.

            Editing with a human-friendly UI is one of the most expensive “rows” in the grid to fill. Many structural editors exist, but they don’t catch on. They have a fundamental limitation. The problem is: which structure? The lowest common denominator of structural formats doesn’t exist. In other words, the lowest common denominator is an unstructured byte stream.

            IntelliJ is partway there – they have a generic text editor, but structure is laid on top. It’s nontrivial to get this layering right, and you still have a bunch of work to do for each language. Try doing Zig in IntelliJ, etc. I think you will mostly have a text editor (and that’s good).

            Don’t forget that parsing is at least two rows in the grid: parsing for validity/interpretation/compilation, and parsing incomplete code for a rich UI. IntelliJ has its own parsing framework to make these operations easier in a polyglot manner. But it’s expensive to produce and Eclipse wasn’t able to duplicate it in the open source world.

Wireshark supports pretty-printing network protocols. But what about a sed, grep, wc (counting records), diff, and awk for all network protocols? I’m sure it has support for many of those things by now, although it’s not easy to find the abstraction that lets you do it. There’s an inherent tradeoff.

            Some projects addressing the tradeoff:

            (I should probably write a blog post about this)


            Other resources: Always Bet On Text by Graydon Hoare

            Text is the most efficient communication technology

            Text is the most socially useful communication technology

            It can be compared, diffed, clustered, corrected, summarized and filtered algorithmically. It permits multiparty editing. It permits branching conversations, lurking, annotation, quoting, reviewing, summarizing, structured responses, exegesis, even fan fic. The breadth, scale and depth of ways people use text is unmatched by anything. There is no equivalent in any other communication technology for the social, communicative, cognitive and reflective complexity of a library full of books or an internet full of postings. Nothing else comes close.


            A related rant about protobufs from a couple days ago: https://news.ycombinator.com/item?id=25582962

            Protobufs give you structured data – now what’s the problem? Isn’t that what we want?

No: the problem is that simple operations like copy are no longer generic. Copying doesn’t work like Unix cp or memcpy(); you need machine code to implement this one operation for every single type. Tons of complexity results from this (and this is explained by multiple primary authors of protobufs):

            The protobuf core runtime is split into two parts, “lite” and “full”. If you don’t need reflection for your protos, it’s better to use “lite” by using “option optimize_for = LITE_RUNTIME” in your .proto file (https://developers.google.com/protocol-buffers/docs/proto#op…). That will cut out a huge amount of code size from your binary. On the downside, you won’t get functionality that requires reflection, such as text format, JSON, and DebugString().

Even the lite runtime can get “lighter” if you compile your binary to statically link the runtime and strip unused symbols with -ffunction-sections/-fdata-sections/--gc-sections flags. Some parts of the lite runtime are only needed in unusual situations, like ExtensionSet which is only used if your protos use proto2 extensions (https://developers.google.com/protocol-buffers/docs/proto#ex…). If you avoid these cases, the lite runtime is quite light.

            However, there is also the issue of the generated code size ..

            1. 5

              Another concept to look into is the “thin waist” of networking, of operating systems, and compilers. They are all least common denominators that solve O(M*N) problems:

• TCP/IP is a thin waist. On one side you have low-level networks like Ethernet, wireless, etc. On the other side you have application formats like HTTP, e-mail, IRC, etc.
                • It doesn’t make sense to have e-mail work over wired, then have to port it to wireless, etc. In the early days, it DID work like that. You lacked the abstraction of the thin waist. This was the whole point of the Internet – to bridge incompatible networks.
• Unix is a thin waist. On one side you have heterogeneous hardware (supercomputers, PCs, embedded devices). On the other side you have applications.
              • LLVM is a thin waist. On the one side you have languages, and on the other you have ISAs.
              1. 2

                We should be careful to note that the cost is not O(1), as we might hope, but O(|M| + |N|). We have M parsers, N backends, and a single format that they all use for interchange. This cost shows up in tools like Pandoc and in the design of protocols like NDN which aim to explicitly have a thin-waist model.
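
A toy sketch of that shape (formats and converters invented, in the spirit of Pandoc): M readers parse into one shared representation, N writers serialize out of it, for M + N adapters total:

    import json

    # Each reader parses its format into the shared intermediate representation;
    # each writer serializes out of it. Adding a format costs one adapter, not N.
    readers = {
        "markdown": lambda src: {"blocks": src.splitlines()},  # toy stand-ins
        "html":     lambda src: {"blocks": [src]},
    }
    writers = {
        "plain": lambda doc: "\n".join(doc["blocks"]),
        "json":  lambda doc: json.dumps(doc),
    }

    def convert(src, from_fmt, to_fmt):
        return writers[to_fmt](readers[from_fmt](src))

    print(convert("# hi\nsome text", "markdown", "json"))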

              2. 4

                or an RTF file

                bad example, RTF IS a text-based format. https://en.wikipedia.org/wiki/Rich_Text_Format

And the text parts of PDF and Word documents are frequently stored as text too. The problem is grep doesn’t actually work on “text”… try something like strings foo.doc | grep …; there’s a decent chance it will actually work. Of course “text” is in scare quotes because you might have to convert from UTF-16 or something too.

And grep only works if there are reasonable line breaks, which is frequently untrue: grepping HTML often just returns one gigantic line, ditto for diff. You might have to put some format-specific formatter in the middle of the pipeline to make it actually usable by those other tools too.

                The general principle of just sticking another pipe in the middle to convert applies to lots of formats. In my other comment I used disassemblers as an example and they work pretty well.

Text and binary aren’t really that different.

                1. 1

                  Adding to your comment; PDF is also a text-based format. New word documents too (OOXML).

                  1. 1

                    That’s all true, but it doesn’t invalidate the main point. Text is a compromise that works with a lot of different tools. You get more reuse.

                    strings foo.doc | grep is a good example.

There are tools that take an HTML or JSON tree structure and make it line-based so you can grep it, e.g.

                    https://github.com/tomnomnom/gron
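
Roughly, gron flattens the tree into one assignment per line, so grep and diff apply again. A toy approximation of the idea:

    import json

    def gron(value, path="json"):
        """Flatten JSON into one greppable assignment per line, gron-style."""
        if isinstance(value, dict):
            yield f"{path} = {{}};"
            for k, v in value.items():
                yield from gron(v, f"{path}.{k}")
        elif isinstance(value, list):
            yield f"{path} = [];"
            for i, v in enumerate(value):
                yield from gron(v, f"{path}[{i}]")
        else:
            yield f"{path} = {json.dumps(value)};"

    for line in gron({"user": {"name": "alice", "tags": ["a", "b"]}}):
        print(line)  # e.g. json.user.name = "alice";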

                    These are features, not bugs.

                  2. 2

I think a good argument against your O(M * N) problem is that the onus should be on the binary format provider to supply some kind of binary -> text tooling. FlatBuffers, for instance, provides tooling to go to and from JSON.

                    Now, an argument against that is that it’s herding cats to get everyone to actually provide said tooling. Which I guess is why we can’t have nice things.

                    1. 1

                      You can also write your own text -> binary conversion, and lots of people have done that for specific use cases.

                      That is an argument FOR text, not against it (but also not against binary). It’s not either/or or absolute like I said in the original comment. Binary formats and text formats are both valid tools, but unstructured text is a useful “thin waist” that reduces engineering effort.

                      Related: https://lobste.rs/s/vl9o4z/case_against_text_protocols#c_jjkocy

                      I recommend reading “The Art of Unix Programming”, there is a section about textual formats for images.

                      Related: Use echo/printf to write images in 5 LoC with zero libraries or headers

                      How to create minimal music with code in any programming language

                      (i.e. use the Unix style. If you’re writing code in a niche language, you don’t need to wait for somebody to write a library.)
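
To give a flavor of the image trick: a plain-text PPM file is already valid image data, so any language that can print can write one. A minimal sketch (pixel values arbitrary):

    # Write a 2x2 image as plain-text PPM (netpbm), zero libraries.
    pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
    with open("tiny.ppm", "w") as f:
        f.write("P3\n2 2\n255\n")  # magic number, width height, max color value
        f.write("\n".join(" ".join(map(str, p)) for p in pixels) + "\n")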

                    2. 2

                      I have two recurring problems with text diffs:

One is when a colleague has changed something in a UI definition and it has been serialised into XML or JSON, and values in a dictionary have switched places, or a numeric value has been re-rounded, leading to a bigger diff than necessary.

                      The other is when I’ve refactored an if clause into a guard, re-indenting much of a function. The structure has changed, but the diff covers an entire function instead of the structure itself, which is just a few lines.

                      Text is just an approximation of structure.

                      1. 1

Well, yes… Most programming languages and text formats are liberal when it comes to whitespace and newlines. This is a design choice with trade-offs like the one you mention. But it is not always the case: HTTP headers, for example, do not let you squeeze in ad hoc whitespace or newlines.

                        Notice that the problem arises from using a specialized UI tool to output text from structure… Then it doesn’t play well with simple text based tools. Using a simple text editor gives you full control of every single char and will not have the problem you describe.

                        This is exactly the same problem with binary formats, because you pretty much have to use a specialized tool.

                        1. 1

                          Yes, that’s valid and true. But it doesn’t contradict anything I said – byte streams are a compromise, a thin waist, just like TCP and LLVM are. TCP is very inefficient for certain applications; likewise LLVM is an awkward fit for many problems. But it’s uneconomical to rewrite the entire ecosystem for a specific application.

                          Though note you can also write your own GIT_EXTERNAL_DIFF for XML and JSON, and I’m sure someone has done so already. Unix is flexible and customizable.
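
For example, a minimal GIT_EXTERNAL_DIFF sketch for JSON (error handling, added/deleted files, etc. omitted; git invokes the program with path, old-file, old-hex, old-mode, new-file, new-hex, new-mode):

    #!/usr/bin/env python3
    # Canonicalize JSON (sorted keys, fixed indentation) before diffing,
    # so key reordering and whitespace churn drop out of the diff.
    import difflib, json, sys

    path, old_file, new_file = sys.argv[1], sys.argv[2], sys.argv[5]

    def canon(fname):
        with open(fname) as f:
            return json.dumps(json.load(f), indent=2, sort_keys=True).splitlines()

    for line in difflib.unified_diff(canon(old_file), canon(new_file),
                                     "a/" + path, "b/" + path, lineterm=""):
        print(line)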

                        2. 2

                          I realize this isn’t the main thrust of your argument, but Word can definitely diff two Word documents, and act as a merge tool for them. It’s not as straightforward as regular diff, but it’s definitely capable there.

                          1. 1

                            If you have used uchex, how different is that approach from simply using a parser with automatic error correction (something like what ANTLR does)? (I find it strange that they do not mention any standard error recovery except for a passing mention in 2.1).

                            1. 2

                              I haven’t used it, but I think I should have called it “micro-grammars” rather than “uchex” (they use that name in the paper). It’s an approach for partial parsing and semantic analysis that works on multiple languages.

                              Looks like some people have implemented this:

                              https://github.com/atomist/microgrammar

I think error recovery is related, but philosophically different? There is no error recovery in micro-grammars. They purposely don’t specify some information: they don’t specify ALL of the language and then “recover” from errors. And as I understand it, they also do semantic analysis. I’d be interested in details from anyone who has more experience.

                              1. 1

Thanks for the link! Much appreciated.

                            2. 1

                              Thanks a bunch for this comment, I’ve personally learned a lot!

                              An interesting thin waist solution for binary formats would be having canonical text representation. WASM does this right I think.

                              1. 1

                                Right, WASM needs to be compact on the wire and quickly parsed. It is mostly generated by compilers, but occasionally you want to write it by hand, and read it. The economical solution is to project it onto text.

Sure, you can write a structured editor for WASM as the OP was suggesting, but does anyone really think that is a good solution? Do you also want to write a WASM version control system that knows how to diff and merge WASM functions perfectly?

                                It’s definitely possible, but I want to see someone pursue those projects with a straight face.

When you convert to text, you get a bunch of tools for “free”. You can use sed, grep, git diff, pull, merge, etc. on it. You are moving onto the thin waist; WASM binary is off of the thin waist.


                                Glad you got something out of this comment!

                                I often find myself in these “systems vs. code” arguments. Most people are trying to optimize the code (for local convenience) whereas I’m trying to optimize the system (for global properties). The whole point of Oil is to optimize systems, not code.

Ken Thompson understands this; he wrote about the lack of records in his “sermonette” on Unix design in the very first paper on Unix shell: https://lobste.rs/s/asr9ud/unix_command_language_ken_thompson_1976#c_1phbzz

                                (UTF-8 is also a Thompson design that lacks metadata unlike UTF-16, and that’s a feature.)

                                Systems vs. code is a pretty big mindset difference, and it often leads to opposite conclusions. A lot of people here claim they don’t understand Rich Hickey

                                https://news.ycombinator.com/item?id=23905198 (my reply as chubot)

                                which linked:

                                https://lobste.rs/s/zdvg9y/maybe_not_rich_hickey

                                The point here is that dynamic typing has advantages in systems (similar to unstructured text). (And I just got a raised eyebrow from a CS professor when I made this claim, so I know it’s not obvious.)

                                Field presence is checked at runtime, not compile time, because it enables graceful upgrade. That is a systems thing and not a code thing.
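
A toy illustration of that point (field names invented): presence is a runtime condition with a default, so old writers and new readers can coexist without redeploying everything at once:

    def handle_request(msg: dict) -> str:
        user = msg["user"]
        region = msg.get("region", "default")  # field added in v2; optional at runtime
        return f"routing {user} to {region}"

    print(handle_request({"user": "alice"}))                     # v1 writer still works
    print(handle_request({"user": "bob", "region": "eu-west"}))  # v2 writer too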

                                Static typing is also valid, but should be layered on top. Just like binary is valid, but should gracefully co-exist with tools for text. Haskell, OCaml and Rust are great, but they run on top of untyped Unix, which is working as intended. I consider them all domain-specific type systems. (This is a real argument: Unikernels are optimizing locally, not globally, which is a fundamentally limited design.)

                                I have some blog posts queued up on this, referencing Ken Thompson vs. Kubernetes, and system design, although it’s not clear when they will appear …

                          1. 4

                            For an update, see On the Information Bottleneck Theory of Deep Learning.

                            The article itself asks:

                            It remains to be seen whether the information bottleneck governs all deep-learning regimes, or whether there are other routes to generalization besides compression.

                            The paper I linked answers:

                            Moreover, we find that there is no evident causal connection between compression and generalization: networks that do not compress are still capable of generalization, and vice versa.

I think it was a great idea, but the evidence suggests that it’s just not true.

                            1. 2

My intuition has been that discovering sparse representations is usually necessary for any kind of generalization – a model learning speech-to-text, for instance, will necessarily have somewhere inside it an understanding of the individual vowel/consonant sounds and utterances which are then building blocks for generating text.

                              “Compression” ~= “sparse representation”, right? So the paper refutes that idea?

                              1. 1

Thank you kindly for the link! Having cursorily looked at it and the arguments raised by Tishby et al., it seems that information bottleneck might still be relevant…

                                1. 1

                                  Why do you think information bottleneck might still be relevant? I am curious. (I consider the theory mostly failed at this point.)

                                  1. 2

                                    In that link @sanxiyn posts, there seems to be a very vigorous back and forth between Tishby et al. (the IB theory crew) and the article criticizing IB (EDIT: with neither side conceding defeat). The program committee accepting the paper to the conference may only mean they thought it worthy of a much broader discussion in the community than their review process.

                                    Since that was 2 years ago, perhaps other papers or discussion have taken place in the understanding of IB or its critique. I think the link itself and publication is non-conclusive, even of community opinion, never mind the fact of the matter.

One kind of “obvious” point about “compression” and “generalization” is that they are almost semantically the same. To find a well generalizing representation means to have some representation that has been “properly de-noised”. Representing noise takes up space (probably a lot, usually, but that is problem specific). This applies to all fitting, from multi-linear regression on up, and maybe to all “induction”. (The transduction case, more analogous to “interpolation”, is different.)

That is just one piece of the puzzle, of course, and there may be controversy over how to define/assess “compression” (e.g. a neural net with a bunch of near-zero weights may take up computer memory, but be the same as one without those weights at all), and also controversy over specific quantitative relationships between compression, however assessed, and out-of-sample error rates.

                                    TL;DR - I think @sanxiyn has more work to do in order to establish “mostly failed” or “debunked” status.

                                    1. 2

@cblake, I don’t think I could have said it better than you did. Thank you!

                                      1. 2

You’re welcome. The Wikipedia entry on the Information Bottleneck Method covers some of this controversy in the “Information theory of deep learning” section (today’s version; future folk may someday have to go back to that in the wiki history). They also have more references.

                              1. 6

                                My five favorite packages:

1. Ivy. The list-and-narrow model has radically changed how I use Emacs and I think this model should be emulated more widely. It’s a great way to deal with hundreds or even thousands of options in an easy manner.

                                (I tried selectrum and I like it and am philosophically more aligned with it, but Ivy is what I know and it requires fewer configs to get it to work the way I want it. Ivy also allows matching from external programs, for example ripgrep, which I use constantly.)

2. Magit. No surprise there, everyone who uses Emacs pretty much loves Magit, it’s just an awesome git porcelain.

3. dumb-jump. For programming modes where I don’t want to bother with configuring LSP, dumb-jump offers a very workable jump-to-def approach. It uses regular expressions to find lines that look like they may define the word under the cursor. Not always completely accurate, but it requires 0 configuration and works well enough for most cases.

4. move-dup and shift-text. I’m cheating here by including both, but they do similar jobs and allow me to manipulate lines of text similarly to how I got used to in Eclipse (Ctrl+Alt+Down/Up to copy the current line down/up, Alt+Down/Up to move a line down/up, etc.)

5. deadgrep. When counsel-rg gives too many results, deadgrep offers a very clean interface to view and navigate the results of a search. I especially love that by default it finds the root of the project (a Git project for example) so it DWIMs pretty well.

                                1. 2

                                  Great choices!

I’ve been using ag for ages, but I should try rg one of these days. I’m always using such search packages via Projectile, though, as they seem to be most useful within a project context.

Btw, how fast is dumb-jump for you on bigger projects?

                                  1. 2

                                    I tried dumb-jump on GCC with ripgrep and it was unbearably slow. I reverted to ggtags as it is considerably faster.

                                  2. 2

                                    I keep trying magit but I can never get good enough at it to the point where it makes sense to actually use it. The documentation is sort of laughably impenetrable too (I say this as a long time emacs user, but not a power user). The git blame docs for instance are just hilariously unhelpful.

                                    Anywhoo, would be interested in giving it yet another go and would appreciate any resources you’re aware of that might be helpful.

                                    1. 1

                                      Anywhoo, would be interested in giving it yet another go and would appreciate any resources you’re aware of that might be helpful.

                                      I found this introductory presentation from Howard Abrams helpful. Mostly though, once I bound C-x g to magit-status and got into the habit of using it, the combination of my existing (basic) git knowledge and the ? key-bind allowed me to learn by doing. Hope that helps :-)

                                      1. 1

                                        Thanks, I’ll give it a watch! I pretty much do the same thing but it never seems to work out (for example, I still can’t find blame anywhere).

                                        1. 1

                                          I’ve used magit for many years yet I rely on the ? shortcut for using it.

                                    2. 1

                                      I tried selectrum and I like it and am philosophically more aligned with it, but Ivy is what I know and it requires fewer configs to get it to work the way I want it.

                                      Selectrum is new to me! I’ve been using Helm, but am considering switching, as Helm’s display doesn’t quite seem to be configurable enough – sometimes when switching buffers, it truncates the buffer name! Ridiculous!

                                      Is this the philosophy you’re talking about?

                                      The design philosophy of Selectrum is to be as simple as possible, because selecting an item from a list really doesn’t have to be that complicated, and you don’t have time to learn all the hottest tricks and keybindings for this. What this means is that Selectrum always prioritizes consistency, simplicity, and understandability over making optimal choices for workflow streamlining. The idea is that when things go wrong, you’ll find it easy to understand what happened and how to fix it.

                                      How does that differ from Ivy, which says it “aims to be more efficient, smaller, simpler, and smoother to use yet highly customizable”? I haven’t tried Ivy either, so high-level info is great.

                                      1. 2

The section on Ivy in the Selectrum README is where they go into most of the detail. Essentially, Ivy was originally designed for Swiper, and got abstracted out of that at some point along the way, but not very cleanly, resulting in a lot of hardcoded special cases in the code for different functions. Selectrum on the other hand only wants to plug in to completing-read and let a user choose from that list in a more convenient way, which allows the codebase to be something like ten times smaller.

                                      2. 1

                                        deadgrep was really nice, thanks for sharing!

                                      1. 35

                                        Describing Wren as “classy” because it has classes; and its VM implementation as “under 4,000 semicolons”; is cute, and earns it a +1.

                                        1. 9

                                          Hehe - gotta watch that semicolon budget 😎

                                          1. 1

                                            Maybe switch to Python if you run out…

                                            1. 1

                                              I use Clojure at work, and Emacs Lisp at home, so I’ve got semicolons to burn in my prose ;-)

                                          2. 2

                                            Careful, you just used two semicolons, or .05% of Wren’s total budget.

                                          1. 7

I think I would summarize it as “Security with obscurity is a good idea.”

                                            1. 1

I agree. One should, though, make sure it’s cleanly separated (like in the article), doesn’t increase complexity much, and doesn’t thereby accidentally create more attack surface. It’s also important to keep a clear mind about this over time and not accidentally end up slowly turning the “with” into a “by”.

                                              These things can and do happen as projects grow, owners switch, focuses change, etc.

                                            1. 11

                                              A CRM for personal relationships

                                              1. 2

                                                I have read about https://www.monicahq.com/ as an example. Never tried it. Have you tried it?

Personally I find the concept a bit … autistic/creepy, but I have still considered it as a possibly useful tool.

                                                1. 9

Personally I find the concept a bit … autistic

                                                  Well, yeah, that’s me. Thanks for the link.

                                                  1. 5

                                                    If you think it might be useful, but found a dedicated CRM app a bit much, have you tried using the notes field in your phone’s address book? I use it to jot down names of kids & spouses and things like “vegan”, “teetotal”, “pronounced […]” etc. They’re synced everywhere automatically and they’re searchable in a hurry from your phone.

                                                    1. 4

                                                      I think it may seem creepy because of associations with corporations and marketing.

                                                      However, when I actually think about it… Would my life be richer and better if I was more consistent about staying in touch with people? Almost certainly!

                                                      1. 1

I tried this but had difficulty getting the self-hosted version to work. As far as creepy, I think of it as just a memory extension. It isn’t anything someone with a good memory couldn’t do; it just helps mortals remember birthdays, people’s interests, etc.

                                                      2. 1

                                                        …the more I think about this the more I want it

                                                        1. 1

                                                          I found this one a while ago: https://www.monicahq.com/ (not affiliated)

                                                          It needs a lot more automation to become useful IMO.

                                                          1. 1

                                                            Thanks

                                                          2. 1

Why do you need this, if I may ask?

                                                            1. 1

                                                              Help me follow up with my friends and family

                                                          1. 1

                                                            As an aside, the random complexity classes have always been interesting to me. It feels somewhat wrong to be able to make use of randomness, kind of like how thermodynamics doesn’t let you extract useful work from pure entropy. In that vein, P=BPP wouldn’t be so surprising.

                                                            But at the same time, the random algorithms seem so much more elegant and straightforward than their deterministic counterparts. So then P=BPP would actually be a bit unexpected: the additional expressiveness buys exactly nothing.

                                                            1. 5

Regular expressions are a great example of where Lisp/Scheme’s code = data approach shines. Take the Scheme Regular Expression SRFI, or Elisp’s rx. The same could be done in other languages, but writing something like

RE urlMatcher = new RE.Sequence(           // start matching a URL
    new RE.Optional("http://"),
    new RE.OneOrMore(
        new RE.Group("domain",
            new RE.Sequence(new RE.OneOrMore(RE.Word), ".")
        )
    ),
    // etc. ...
);
                                                              

is far more cumbersome, even though it would allow you to insert comments, structure the expression, and statically ensure that the syntax is valid.

                                                              1. 4

                                                                There are libraries that do this: https://github.com/VerbalExpressions

# assuming the PythonVerbalExpressions package
from verbalexpressions import VerEx

verbal_expression = VerEx()
tester = (verbal_expression.
            start_of_line().
            find('http').
            maybe('s').
            find('://').
            maybe('www.').
            anything_but(' ').
            end_of_line()
)
                                                                
                                                                1. 3

                                                                  I still find

                                                                  (rx bol "http" (? "s") "://" (? "www.") (* (not (any " "))) eol)
                                                                  

                                                                  nicer, plus it evaluates at compile-time. But good to know that other languages are thinking about these ideas too.

                                                                  1. 6

                                                                    So the regexp this represents is this one I think?

^https?://(www\.)?[^ ]*$
                                                                    

                                                                    I don’t know … I find the “bare” regular expression easier. Especially the Scheme/Lisp variant essentially uses the same characters, but with more syntax (e.g. (? "s") instead of s?). Maybe the advantages are clearer with larger examples, although I find the commenting solution as mentioned in this post much better as it allows you to clearly describe what it’s matching and/or why.

                                                                    1. 3

This example is rather simple, but in Elisp, I’d still use it, because it’s easier to maintain. I can break the line wherever I want and insert real comments. Usually it’s more verbose, but in one case, I even managed to write a (slightly) shorter expression using rx than a string literal, because of escape symbols:

                                                                      (rx (* ?\\ ?\\) (or ?\\ (group "%")))
                                                                      

                                                                      vs

                                                                      "\\(?:\\\\\\\\\\)*\\(?:\\\\\\|\\(%\\)\\)"
                                                                      

                                                                      but with more syntax (e.g. (? “s”) instead of s?).

                                                                      If that’s the issue, these macros usually allow the flexibility to choose more verbose keywords. The example from above would then become (combined with the previous points)

                                                                      (rx line-start
                                                                          "http" (zero-or-one "s") "://"	;http or https
                                                                          (zero-or-one "www.")		;don't require "www."
                                                                          (zero-or-more (not (any " ")))	;just no spaces
                                                                          line-end)
                                                                      

                                                                      Edit: With Emacs 27 you can even extend the macro yourself, to add your own operators and variables.

                                                                      1. 2

Right; that example shows the advantages much more clearly. It still looks kinda unnatural to me, but that’s probably just lack of familiarity (both with this method and Scheme in general; it’s been years since I did any Scheme programming, and never did much in the first place: my entire experience is going through The Little Schemer and writing two small programs). But I’m kinda warming to the idea of it.

                                                                        One way you can do this in other languages is by adding a sort of sexpr_regex.compile() which transforms it to a normal regexp object for the language:

                                                                        regex = sexpr_regex.compile('''
                                                                        	(rx line-start
                                                                        		"http" (zero-or-one "s") "://"	;http or https
                                                                        		(zero-or-one "www.")		    ;don't require "www."
                                                                        		(zero-or-more (not (any " ")))	;just no spaces
                                                                        		line-end)
                                                                        ''')
                                                                        

                                                                        And then regex.match(), regex.find(), what-have-you.

                                                                        Dunno if that’s worth it …

                                                                        1. 1

                                                                          It would be possible, but you’d lose the fact that rx and similar macros can be expanded and checked at compile-time.

                                                                          1. 2

                                                                            You already don’t have that in those languages anyway, so you’re not really losing much. And you can declare it as a package-level global (which isn’t too bad if you’re consistent about it) and throw an exception if there’s an error, so you’ll get an error on startup, which is the next best thing after a compile-time check. You can also integrate it in your test suite.

                                                                            1. 2

                                                                              Well Elisp does (Byte compilation), and some Schemes do too (but SRFI 27 couldn’t make use of it in most cases anyway).

                                                                    2. 2

                                                                      I agree – the problem with the chained-builder approach is that it’s shoe-horning this abstract idea of a regex through the lens of the syntax of a programming language. Designed from first-principles, a regex syntax is much more likely to look like the s-expression syntax.

                                                                      1. 2

                                                                        That is really pretty. Is this elisp?

                                                                        1. 1

                                                                          Yes.

                                                                        2. 1

                                                                          An aside about compile-time: There exists a C++ compile-time regex parser, surely one of the most terrifying examples of template metaprogramming ever. It’s also quite feasible to compile regexes at compile time in Rust or Nim thanks to their macro facilities.

                                                                          That LISP syntax is nice, though.

                                                                          1. 1

                                                                            Do you have a link to some site that describes the C++ parser?

                                                                            1. 1

                                                                              No, sorry, or I’d have given it. I just remember a video of a conference presentation, by a woman with a Russian name…

                                                                              Edit: a quick search turned up this, which looks familiar: https://youtu.be/3WGsN_Hp9QY

                                                                            2. 1

                                                                              That sounds very appealing to me!

                                                                        3. 1

                                                                          I program in Lua, and there, I use LPEG. There’s a submodule of LPEG that allows one to use a BNF-like syntax. Here’s the one I constructed from RFC-3986.

                                                                        1. 9

                                                                          This is a great idea for a post that I’ve wanted to write myself. Leaving aside trees, hash tables, and stacks, I probably used less than one “interesting” data structure/algorithm PER YEAR of programming. Some notable counterexamples are two kinds of reservoir sampling, regular languages/automata, random number generation, some crypto, and some info retrieval algorithms.
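
For reference, the simplest kind, Algorithm R, fits in a few lines (the weighted variant is more involved):

    import random

    def reservoir_sample(stream, k):
        """Uniform sample of k items from a stream of unknown length."""
        sample = []
        for i, item in enumerate(stream):
            if i < k:
                sample.append(item)
            else:
                j = random.randint(0, i)  # inclusive on both ends
                if j < k:
                    sample[j] = item
        return sample

    print(reservoir_sample(range(1_000_000), 5))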

One thing that sparked it is an obscure but long-running debate over whether dynamic programming interview questions are valid.

                                                                          I don’t think they’re valid. It’s mostly a proxy for “I got a CS degree at a certain school” (which I did, I learned dynamic programming in my algorithms class and never used it again in ~20 years.)

                                                                          I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                                                          I’ve tried this in the past and nobody has been able to point to a concrete instance. I think the best I’ve heard is someone heard about a professor who heard about some proprietary software once that used it.


                                                                          Related: algorithms used in real software (although this is certainly not representative, since compiler engineering is a subfield with its own body of knowledge):

                                                                          https://old.reddit.com/r/ProgrammingLanguages/comments/b22tw6/papers_and_algorithms_in_llvms_source_code/

                                                                          https://github.com/oilshell/blog-code/blob/master/grep-for-papers/llvm.txt

                                                                          Linux kernel algorithms:

                                                                          https://github.com/oilshell/blog-code/blob/master/grep-for-papers/linux.txt

                                                                          1. 10

                                                                            I challenge anyone to name ANY piece of open source software that uses dynamic programming.

                                                                            Git, or most reasonable implementations of “diff”, will contain an implementation of the Myers Algorithm for longest-common-subsequence, which is very dynamic-programmy.

                                                                            No concrete example for this one, but I know that bioinformatics code is full of dynamic programming algorithms for the task of sequence alignment, which is similar to diff — identifying a way to align two or more base sequences so that they coincide with the minimal number of changes/additions/deletions required to make them identical.
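
For reference, the textbook O(MN) dynamic program for longest-common-subsequence length (the baseline that Myers improves on) is tiny:

    def lcs_length(a, b):
        """dp[i][j] = length of the LCS of a[:i] and b[:j]."""
        dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i, x in enumerate(a, 1):
            for j, y in enumerate(b, 1):
                dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
        return dp[len(a)][len(b)]

    assert lcs_length("ABCBDAB", "BDCABA") == 4  # the classic CLRS example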

                                                                            1. 1

                                                                              Hm I’m familiar with that algorithm but I never thought of it as dynamic programming.

                                                                              Wikipedia does say it’s an instance of dynamic programming. Although when I click on the paper, it appears to contrast itself with “the basic O(MN) dynamic programming algorithm” (section 4).

                                                                            2. 8

                                                                              Since you mentioned dynamic programming, it’s worth pointing out that the name “dynamic programming” was chosen for political reasons, as pointed out in the history section of the Wikipedia article on dynamic programming. So I think it’s a really bad name.

                                                                              1. 1

                                                                                That’s funny, I remembered xkcd’s “dynamic entropy” comic, and it quotes the same passage:

                                                                                https://xkcd.com/2318/

                                                                                It also has a very interesting property as an adjective, and that is it’s impossible to use the word dynamic in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible.

                                                                                LOL

                                                                                Agree it’s a terrible name… I would say it was chosen for “marketing” reasons

                                                                              2. 7

                                                                                I have thought about whether dynamic programming questions are fair to ask, and I ended up where you are: they are not.

Dynamic programming was the one I struggled with most in understanding and implementing correctly. And while there are semi-practical examples (like the knapsack problem), I have not found any practical, day-to-day use cases for it.

                                                                                I had an argument with my colleague who asked this kind of problem, saying it’s basic knowledge. Turns out he did competitive programming and there, it is table stakes. But in practice, it just filtered for anyone who has learned and practiced this approach.

I stay away from asking this kind of problem: ones that need dynamic programming to solve.

                                                                                1. 4

                                                                                  I’m familiar with dynamic programming mostly from high-school competitive programming as well. Otherwise I can’t say I’ve encountered real-life problems where it occurred to me to use the technique.

                                                                                2. 8

                                                                                  I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

I’m literally about to implement something that could be classified as dynamic programming at work, which can be summarized as “computing a few simple statistics such as number of bytes changed for each directory in a large filesystem”. Dynamic programming is such a general term that it applies regularly if you want to use it.

                                                                                  1. 4

                                                                                    I’d like to see it. I don’t think dynamic programming is a general term.

In fact, one time when I participated in this argument (which was years ago, so my memory is fuzzy), a former CS professor I worked with explained the difference between memoization and dynamic programming. A bunch of 10-20+ year programmers like myself went “ah OK”. Unfortunately I don’t even remember the difference, but the point is that most programmers don’t, because dynamic programming is very uncommon.

                                                                                    What you’re describing sounds like an obvious algorithm that anyone could implement, which is not the same as dynamic programming interview questions, or dynamic programming in competitive programing.

As other posters mentioned, competitive programming is the main place you see it outside of a CS class.

                                                                                    1. 2

                                                                                      It’s absolutely an obvious algorithm, so is most dynamic programming. That was sort of my point.

Can’t share the code unfortunately, but it’s just iterating over a sorted list of file changes in reverse order and collecting statistics as we go. The dynamic part comes from the fact that we can just look at the subdirectories of a dir (which we already have numbers for) instead of recursing into it.

                                                                                      1. 2

                                                                                        What you’re talking about could be called memoization, or it probably doesn’t even deserve that name. It’s just what a “normal” programmer would come up with.

                                                                                        That’s not the type of thing that’s asked in interview questions or competitive programming. The wikipedia page gives some examples.

                                                                                        Dynamic programming usually changes the computational complexity of an algorithm in some non-obvious way. There’s very little you can do recursing over a directory tree that doesn’t have a clear O(n) way to code it (e.g. computing statistics).

                                                                                        1. 7

                                                                                          I like Erik Demaine’s explanation, that problems where dynamic programming can be applied are ones where their subproblems and their dependencies can be modeled as a directed acyclic graph [1]. Up to you if you’d like to tackle that with a top down approach where you look at a node and calculate its solution based on the solutions of its ancestors, or a bottom up approach starting from the nodes in the DAG with no dependencies and propagate the solutions in topological order.

                                                                                          My colleague and I used it for a generalization of matrix chain multiplication (for tensors) [2].

                                                                                          [1] https://youtu.be/OQ5jsbhAv_M?t=2736

                                                                                          [2] https://github.com/TensorCon/ratcon/blob/release/opt/gencon/pathopt.py#L198

                                                                                          edit: by the definition above, even naive memoization can count as DP, if you’re caching solutions to subproblems of that structure. Doesn’t have to be at the difficulty level of competition to count as DP in that case.

                                                                                          1. 1

                                                                                            Hm interesting, searching through my personal wiki, I found a related definition which is different. I haven’t really thought about it enough to form an opinion.

Either way, it doesn’t change my overall point: that there are certain kinds of algorithm problems that appear on coding interviews, and in competitive programming, that do not show up in 99% of programming jobs. They are easy to pose and have cute solutions, but aren’t testing very much.

                                                                                            I think the “DP tables” part is key but again I haven’t thought about it enough …

                                                                                            https://blog.racket-lang.org/2012/08/dynamic-programming-versus-memoization.html

                                                                                            Memoization is fundamentally a top-down computation and DP is fundamentally bottom-up. In memoization, we observe that a computational tree can actually be represented as a computational DAG

                                                                                            In DP, we make the same observation, but construct the DAG from the bottom-up. That means we have to rewrite the computation to express the delta from each computational tree/DAG node to its parents. We also need a means for addressing/naming those parents (which we did not need in the top-down case, since this was implicit in the recursive call stack). This leads to inventions like DP tables, but people often fail to understand why they exist: it’s primarily as a naming mechanism (and while we’re at it, why not make it efficient to find a named element, ergo arrays and matrices).

                                                                                            This bottom-up / top-down distinction might have been the same as what the aforementioned professor said 5+ years ago, but I don’t remember exactly.

                                                                                            1. 1

                                                                                              So, is memoization of factorial top-down, or bottom-up?

                                                                                              1. 1

                                                                                                I would probably say neither… Either factorial or Fibonacci is so trivial that it doesn’t help to be thinking about it that way.

                                                                                                Though I think the quote hints at a clear test for whether it’s top-down or bottom-up: if you need extra state outside the call stack. I’m getting curious enough to try this out, but right now all I can do is quote what other people say.

                                                                                                In any case it’s clear to me that there’s some controversy over what dynamic programming really is. I think the issue is that a lot of algorithms could be framed that way but were not discovered that way, and not taught and learned that way.

                                                                                                1. 1

                                                                                                  I would probably say neither… Either factorial or Fibonacci is so trivial that it doesn’t help to be thinking about it that way.

                                                                                                  I think that the triviality is actually helpful here. If it’s actually true that memoization and dynamic programming are different (and there’s clearly debate on this), can 2 implementations of a trivial function that everyone can understand highlight the differences?
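                                                                                                  For what it’s worth, here’s my attempt at exactly that (a sketch, not anyone’s canonical definition): the same Fibonacci recurrence twice, once top-down with memoization and once bottom-up with an explicit table, which is the “naming mechanism” the quote above talks about.

                                                                                                  from functools import lru_cache

                                                                                                  # Top-down: memoization. The recursion discovers the subproblem DAG;
                                                                                                  # the call stack implicitly "names" each subproblem for us.
                                                                                                  @lru_cache(maxsize=None)
                                                                                                  def fib_memo(n):
                                                                                                      if n < 2:
                                                                                                          return n
                                                                                                      return fib_memo(n - 1) + fib_memo(n - 2)

                                                                                                  # Bottom-up: dynamic programming. We build the DAG ourselves, so we
                                                                                                  # need an explicit way to name solved subproblems: the DP table.
                                                                                                  def fib_dp(n):
                                                                                                      table = [0, 1] + [0] * max(0, n - 1)
                                                                                                      for i in range(2, n + 1):
                                                                                                          table[i] = table[i - 1] + table[i - 2]
                                                                                                      return table[n]

                                                                                                  assert fib_memo(30) == fib_dp(30) == 832040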

                                                                                          2. 1

                                                                                            On the contrary, the stupidly naive way (recurse on every directory) is O(n^2).

                                                                                            Dynamic programming is just solving a series of problems while using the answers of shared subproblems multiple times. Memoization is a common way to implement this.

                                                                                            Yes there are some very clever algorithms that use dynamic programming, this doesn’t make obvious algorithms that use dynamic programming not also fit under the definition.

                                                                                            1. 3

                                                                                              Why would recursing into every directory be O(n^2)? You’re still only visiting every directory/file once? It seems like something is missing?

                                                                                              1. 1

                                                                                                Say you have a directory structure with a single file in it, /a/b/c/d/e

                                                                                                To get the number of bytes changed in e you need to visit e, then to get the number of bytes changed in d you need to visit d and then e, then for c you need to visit c, d, and e, and so on.

                                                                                                Like I said, it takes a really naive solution, but if you don’t remember the values you calculate anywhere for some reason, it’s the sum over inodes of depth(inode)… which is O(n^2) (for bad directory structures).

                                                                                                Note that I need these values for every directory, not just for one particular directory.
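                                                                                                To make the disagreement concrete, here’s a sketch of both versions over a toy tree (dicts for directories, ints for file sizes; all names invented): the naive one re-walks each subtree once per directory, the single-pass one reuses child results, which is all the “remembering” amounts to here.

                                                                                                # Toy model (invented names): dirs are dicts, files are ints (byte counts).
                                                                                                tree = {"a": {"b": {"c": {"d": {"e": 100}}}}}

                                                                                                def size(node):
                                                                                                    # Walks the whole subtree every time it is called.
                                                                                                    if isinstance(node, int):
                                                                                                        return node
                                                                                                    return sum(size(child) for child in node.values())

                                                                                                def all_dir_sizes_naive(node, path="", out=None):
                                                                                                    # Calls size() once per directory, re-walking each subtree:
                                                                                                    # total work is the sum of subtree sizes, O(n^2) on a deep chain.
                                                                                                    out = {} if out is None else out
                                                                                                    if isinstance(node, dict):
                                                                                                        out[path or "/"] = size(node)
                                                                                                        for name, child in node.items():
                                                                                                            all_dir_sizes_naive(child, path + "/" + name, out)
                                                                                                    return out

                                                                                                def all_dir_sizes(node, path="", out=None):
                                                                                                    # One post-order walk: each directory's size is the sum of its
                                                                                                    # children's already-computed sizes; every inode is visited once.
                                                                                                    out = {} if out is None else out
                                                                                                    if isinstance(node, int):
                                                                                                        return node
                                                                                                    total = sum(all_dir_sizes(child, path + "/" + name, out)
                                                                                                                for name, child in node.items())
                                                                                                    out[path or "/"] = total
                                                                                                    return total

                                                                                                print(all_dir_sizes_naive(tree))  # both give the same answer; only the work differs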

                                                                                                1. 2

                                                                                                  That’s still linear complexity space. Unless you’re hardlinking directories (which you then have to deal with potential recursion), it’s still O(n). If you add a file at /a/b/c/file you only visit 1 more file and no more dirs, not an exponential. O(n + n + n) or O(n + 3) still simplifies to O(n).

                                                                                                  1. 1

                                                                                                    If you add /a/b/c/file you add 4 more visits, not 1. Going from n=3 /a/b/file to n=4 /a/b/c/file adds 4 more visits. In other words this worst-case example takes time O(sum from 1 to n of i) = O(n(n+1)/2) = O(n^2).

                                                                                                    N is the number of inodes in an arbitrary tree, not the number of files in a fixed tree.

                                                                                                    1. 1

                                                                                                      That’s still adding a linear number of operations for each file; the depth could technically be considered a different variable, say m. So for each file (n+1) you add, you also add the number of directory traversals (m), resulting in O(m+n), which simplifies again to O(n); but in reality folders are files too, so they are part of n in the first place, so again O(n). Ultimately your n space is the total number of inodes, which both files and folders have.

                                                                                                      Abstractly, you’re just traversing a tree structure (or a directed graph if using links), which is well understood to be O(n) (maybe O(n^2) worst case if all folders are links, resulting in a fully connected graph), because you only visit each node once. If it were O(n^2), you would visit each node n times.

                                                                                                      Remember, Big O notation is about scaling, not the actual concrete number of operations, which is why you drop any constants or variables other than n.

                                                                                                      1. 1

                                                                                                        It’s O(mn) not O(m+n) (in the insanely naive algorithm that recalculates things every time).

                                                                                                        It’s not a single tree traversal but #internal nodes tree traversals.

                                                                                                        1. 1

                                                                                                          Even if it was O(mn) (it’s not), that still simplifies to O(n). An ‘internal nodes’ tree traversal is still O(n), n is just smaller, but again, your problem is not an internal nodes traversal, it’s a full traversal because you have to look at the blocks attached to the file (leaf) inodes, which means you need to read all inodes of all files and of all folders one time each. n = # of files + # of folders = O(n)

                                                                                                          1. 1

                                                                                                            I suppose an extremely naive solution could be to fully traverse each subtree for every folder visited, which would be… O(log n)? But even that isn’t O(n^2), as the total repeated space shrinks the deeper you get.

                                                                                                            1. 1

                                                                                                              You’re assuming a balanced tree, which is not guaranteed. Depth of tree is O(n) in pathological cases (and average case O(sqrt(n)) is typical for randomly generated trees)

                                                                                                              1. 1

                                                                                                                Ah yeah, I think it would be O(n log n) not O(log n), because you traverse the tree once for each node, and a subset of the tree for almost every n (except leaves), at least in the worst case. Still not O(n^2), and the solution for O(n) is almost easier to conceptualize than the completely naive solution :)

                                                                                                                1. 1

                                                                                                                  and the solution for O(n) is almost easier to conceptualize than the completely naive solution :)

                                                                                                                  No argument here…

                                                                                                                  I think it would be O(n log n)

                                                                                                                  We agree it’s O(n) * O(time of tree search) now, right? And you’re trying to call the tree search time log(n)? Because trees are height log(n)? Then see the post you replied to: that’s true in a balanced tree, it’s not true in a random tree (where it is sqrt(n)), and it’s definitely not true in a pathological worst case (where a tree is just an n-length linked list).

                                                                                                                  1. 2

                                                                                                                    Yeah, the part I was hung up on before was that your naive solution traverses the entire subtree below a node for each node visit; I was stuck on the simple optimal solution. For the pathological case, basically just a bunch of folders in folders with a single file at the bottom, the depth of the tree is n, and the file inode at the bottom would be accessed n times, so O(n^2). For the common case it would be about O(n log n), where you can skip traversing larger and larger parts of the tree the deeper you get on each ‘path.’ Thanks for the discussion, I enjoyed it :)

                                                                                          3. 1

                                                                                            I think comparing memoization to dynamic programming is a category mistake: they are different kinds of things.

                                                                                            ‘Dynamic programming’ is a description of a style of algorithm. It’s divide-and-conquer, usually with overlapping subproblems, making it possible to reuse intermediate results.

                                                                                            Memoization is a programming technique to remember intermediate results, by remembering the results of function calls. You can e.g. also store the intermediate results somewhere explicitly, usually in a matrix, in which case you don’t memoize the result ‘transparently inside the function’, but use a lookup table ‘external to the function that computed the result’.

                                                                                            1. 1

                                                                                              I dunno, I find that in addition to the Racket language resource I gave elsewhere in the thread, lots of people compare them:

                                                                                              https://medium.com/@afshanf/cracking-the-coding-interview-questions-part-7-a7c8f834d63d

                                                                                              A note on terminology: Some people call top-down dynamic programming “memoization” and only use “dynamic programming” to refer to bottom-up work. We do not make such a distinction here. We call both dynamic programming.

                                                                                              There does seem to be disagreement on what dynamic programming is. And many algorithms that were not derived with dynamic programming techniques could be described as dynamic programming.

                                                                                              But it seems that most people agree it’s related to memoization.

                                                                                        2. 4

                                                                                          GCC uses dynamic programming to split IA-64 instructions into bundles.

                                                                                          1. 2

                                                                                            Thanks, nice example and link! Still I would say it’s a niche skill, especially to come up with from scratch in an interview.

                                                                                          2. 4

                                                                                            I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                                                                            Ever do video encoding or transcoding with anything built on FFmpeg or x264? Encode images with MozJPEG? Encode an AV1 video or AVIF image with libaom? Trellis quantization in advanced lossy compression encoders is a dynamic programming algorithm.

                                                                                            1. 3

                                                                                              Hm very interesting! I was not aware of that algorithm. Paper I found:

                                                                                              https://www.mp3-tech.org/programmer/docs/e00_9.pdf

                                                                                              I would still say it’s a bad interview topic, but it’s cool to see real world usages of it.

                                                                                              1. 2

                                                                                                Oh, no disagreement there! Even after coding it up myself, I’d hate to have someone ask me to whiteboard a working implementation of trellis quantization in 40 minutes or whatever (though I’m pretty sure I could sketch out an explanation now).

                                                                                                In general I’m not a fan of whiteboard coding exercises at all. Whenever I’ve interviewed candidates I’ve always preferred the old-fashioned method of just reading their resume well ahead of time, looking up whatever piques my interest on it, and then having a friendly conversation about that. Usually that provides plenty of esoteric material for me to quiz them on and it also lets them show me their strengths and enthusiasm.

                                                                                                1. 1

                                                                                                  My current company doesn’t do a whiteboard exercise, but my previous one did… but the thing is, the usual task was to implement a basic “grep”. That is, read a file and print all of the lines that contain a user-specified string, in a language of your choice, with whatever libraries make you happy (it’s not a trick, you’re not supposed to implement Boyer-Moore on your own). Assuming you succeeded at that, we would ask you to implement a few more features, like a -v flag (only print lines that don’t match), and -A and -B flags (print context lines before and after the matching line), until you got stuck or the time for that segment was up. It wasn’t graded on minutiae like correct semicolon placement, it was just an evaluation of whether a candidate could do a trivial task, how they handled additional requirements, whether they asked sensible questions and got clarification when needed, etc. I found it pretty reasonable.
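                                                                                                  For anyone curious, a rough sketch of what a passing first answer might look like in Python (flag semantics as described above; the context handling is deliberately naive and will reprint overlapping lines, which is exactly the kind of thing the follow-up questions would poke at):

                                                                                                  import argparse

                                                                                                  def grep(pattern, path, invert=False, before=0, after=0):
                                                                                                      with open(path) as f:
                                                                                                          lines = f.read().splitlines()
                                                                                                      for i, line in enumerate(lines):
                                                                                                          if (pattern in line) != invert:
                                                                                                              # Naive context: nearby matches reprint shared lines.
                                                                                                              for out in lines[max(0, i - before):i + after + 1]:
                                                                                                                  print(out)

                                                                                                  if __name__ == "__main__":
                                                                                                      p = argparse.ArgumentParser()
                                                                                                      p.add_argument("pattern")
                                                                                                      p.add_argument("file")
                                                                                                      p.add_argument("-v", action="store_true", help="print non-matching lines")
                                                                                                      p.add_argument("-A", type=int, default=0, help="context lines after")
                                                                                                      p.add_argument("-B", type=int, default=0, help="context lines before")
                                                                                                      args = p.parse_args()
                                                                                                      grep(args.pattern, args.file, args.v, args.B, args.A)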

                                                                                            2. 4

                                                                                              I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                                                                              I used Warshall’s algorithm (which is dynamic programming) to compute the transitive closure of a graph for a typechecker. This is, in my experience, a very common algorithm.
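                                                                                              A sketch from memory (not from that typechecker’s code):

                                                                                              def transitive_closure(n, edges):
                                                                                                  # reach[i][j] after considering intermediate node k depends only
                                                                                                  # on the k-1 stage -- the classic DP-table structure.
                                                                                                  reach = [[False] * n for _ in range(n)]
                                                                                                  for i, j in edges:
                                                                                                      reach[i][j] = True
                                                                                                  for k in range(n):
                                                                                                      for i in range(n):
                                                                                                          for j in range(n):
                                                                                                              reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
                                                                                                  return reach

                                                                                              # 0 -> 1 -> 2 implies 0 -> 2:
                                                                                              assert transitive_closure(3, {(0, 1), (1, 2)})[0][2]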

                                                                                              In high school, I wrote a program for my professor that places students into groups of 4 such that their Myers-Briggs personalities are as different as possible. This used dynamic programming.

                                                                                              A professor of mine (who taught the subject to me) used dynamic programming for some kind of RNA sequencing problem in a paper he published. One of our homework assignments had us arrive at a watered down version of his (and his co-authors’) algorithm.

                                                                                              I’m fairly certain that at least some fuzzy string matching algorithms use string distance, which is also solved using dynamic programming.

                                                                                              These are all diverse applications of DP. In my personal, subjective experience, the idea that DP is in any way obscure or dated is absurd.

                                                                                              Edit:

                                                                                              To be more concrete, the “transitive closure of a graph” is for the graph of dependencies, computing the set of all functions that a particular function depends on. This is as described in the Haskell Report.

                                                                                              For fuzzy string matching, I have in mind something like fzf, though I cannot say with certainty that it uses string distance (I’m unfamiliar with its implementation).

                                                                                              Here’s the paper that I think I’m referencing: Statistical Mechanics of Helix Bundles using a Dynamic Programming Approach

                                                                                              1. 2

                                                                                                Thanks for the examples. The claim is definitely not that it’s outdated or obscure; the claim is that it’s not a good interview question because it doesn’t show up much at work. Although there were lots of people here who pointed out interesting uses of dynamic programming, that’s not incompatible with the idea that you could have a 10 or 20 year programming career and never use it.

                                                                                                Side note: I’m familiar with the Floyd-Warshall algorithm but I never thought of it as dynamic programming. I think part of the issue is that I may have a narrower definition of it than others. (I think people even say the linear-time Fibonacci is an example of dynamic programming, which I find silly. I just call that the obvious algorithm. I guess it can be used to illustrate a principle.)

                                                                                                Even so, I definitely think it’s more popular in universities and certain domains like bioinformatics, in contrast to what people on this site typically do “for work”.

                                                                                              2. 3

                                                                                                I challenge anyone to name ANY piece of open source software that uses dynamic programming. Or to name an example in your own work – open source or not.

                                                                                                I do a lot of work with text processing – computing the edit distance between two strings is something I do very often. That’s a classic dynamic programming algorithm. There are probably hundreds of open source packages that do this or some variation thereof.
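                                                                                                For reference, the classic bottom-up Levenshtein table, as a sketch:

                                                                                                def edit_distance(a, b):
                                                                                                    # dp[i][j] = edits to turn a[:i] into b[:j]; each cell depends
                                                                                                    # only on three already-filled neighbours (bottom-up DP).
                                                                                                    m, n = len(a), len(b)
                                                                                                    dp = [[0] * (n + 1) for _ in range(m + 1)]
                                                                                                    for i in range(m + 1):
                                                                                                        dp[i][0] = i
                                                                                                    for j in range(n + 1):
                                                                                                        dp[0][j] = j
                                                                                                    for i in range(1, m + 1):
                                                                                                        for j in range(1, n + 1):
                                                                                                            cost = 0 if a[i - 1] == b[j - 1] else 1
                                                                                                            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                                                                                                                           dp[i][j - 1] + 1,         # insertion
                                                                                                                           dp[i - 1][j - 1] + cost)  # substitution
                                                                                                    return dp[m][n]

                                                                                                assert edit_distance("kitten", "sitting") == 3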

                                                                                                1. 3

                                                                                                  Just to add to the list of responses clearly demonstrating Cunningham’s Law:

                                                                                                  I believe the Knuth-Plass line-breaking algorithm used in LaTeX to lay out text “optimally” uses dynamic programming. This was done for efficiency, as opposed to using some kind of general global optimization routine. It’s also the reason why LaTeX doesn’t support “backtracking”.

                                                                                                  1. 2

                                                                                                    It’s also the reason why LaTeX doesn’t support “backtracking”.

                                                                                                    Sile uses a variant of the same dynamic programming algorithm to lay out paragraphs on a page. The original paper describing the algorithm says that TeX wanted to use it like that, but it would require more than one entire megabyte of state for a large document, which was infeasible.

                                                                                                    1. 1

                                                                                                      Definitely an instance of Cunningham’s law at work :) I should make another go for my pet problems:

                                                                                                      • it’s impossible to make a zsh-like interactive interface on top of GNU readline
                                                                                                      • you can’t make a constant-space linear-time model of computation that’s more powerful than regular languages, and that can parse shell/JS/Python/C++
                                                                                                      • you can’t make an extended glob to regex translator in less than a week (https://github.com/oilshell/oil/issues/192)

                                                                                                      Thanks for the example. If there were more specific links I would make a blog post out of this :)

                                                                                                      And although my comment was a tangent, it did motivate me to get out the “Algorithm Design Manual” and flip to the part on dynamic programming. Though I remember the applications in that book being interesting but seemingly far removed from what programmers do day-to-day. It seemed to be by a professor who consulted on algorithms for various companies, which is an interesting job!

                                                                                                    2. 1

                                                                                                      The Grasshopper routing library uses contraction hierarchies, which are implemented using Dijkstra’s shortest path algorithm and A* search, which are special cases of dynamic programming.

                                                                                                      I have to agree it’s not something most people will use every day, but it never hurts to have a general idea how it works.

                                                                                                      1. 1

                                                                                                        Here is a concrete example of Dynamic Programming that you use every day: Word Wrap. Knuth has an algorithm that is often used for maximizing the number of words per line.

                                                                                                        Also the field of Bioinformatics often uses the Levenshtein distance when matching two dna strands.

                                                                                                        Also I would like to mention the single most important thing I learned from Dynamic Programming: start at the end case, and figure out what constraints can work from there. For example, think about the last recursion call and what constraints it needs, and go backwards from there.
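                                                                                                        A sketch of that word-wrap DP, working backwards from the end case as described (this is minimum sum-of-squared-slack rather than full Knuth-Plass boxes/glue/penalties, and it assumes no single word is longer than the line width):

                                                                                                        def wrap(words, width):
                                                                                                            # Assumes no single word exceeds width.
                                                                                                            n = len(words)
                                                                                                            # best[i] = minimal badness laying out words[i:];
                                                                                                            # best[n] = 0 is the end case we build back from.
                                                                                                            best = [float("inf")] * n + [0.0]
                                                                                                            split = [0] * n
                                                                                                            for i in range(n - 1, -1, -1):
                                                                                                                line_len = -1
                                                                                                                for j in range(i, n):
                                                                                                                    line_len += len(words[j]) + 1  # +1 for the space
                                                                                                                    if line_len > width:
                                                                                                                        break
                                                                                                                    slack = width - line_len
                                                                                                                    badness = 0.0 if j == n - 1 else slack ** 2  # last line is free
                                                                                                                    if badness + best[j + 1] < best[i]:
                                                                                                                        best[i] = badness + best[j + 1]
                                                                                                                        split[i] = j + 1  # break after words[j]
                                                                                                            lines, i = [], 0
                                                                                                            while i < n:
                                                                                                                lines.append(" ".join(words[i:split[i]]))
                                                                                                                i = split[i]
                                                                                                            return lines

                                                                                                        print(wrap("the quick brown fox jumps over the lazy dog".split(), 12))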

                                                                                                      1. 9

                                                                                                        I think I broke it, doesn’t seem to be able to handle this question:

                                                                                                        http://istheanswertothisquestionno.ynaas.com/

                                                                                                        1. 5

                                                                                                          Haven’t watched yet, but I don’t think that we are. Going to watch the vid shortly! =]

                                                                                                          Some of us are artists, some another flavor of assembly-line worker. I think the first do it out of passion for software, while the second are only interested in making a living.

                                                                                                          Think about how the day feels to you. Do you really feel like you’re doing any real engineering when you’re writing software? I’d be surprised if so.

                                                                                                          1. 8

                                                                                                            Having worked alongside and very closely with mechanical, electrical, and chemical engineers, I would say that yes, writing software is an engineering discipline as well. The similarities are pretty striking in terms of problem decomposition, modularity, planning, creativity, and so on. Engineering is about building something from nothing, which all these fields have in common.

                                                                                                            1. 3

                                                                                                              I think that based on your analysis, most artists are engineers too.

                                                                                                              In my opinion, creating something from nothing is art, not necessarily engineering. Also, most software “engineers” maintain things that already exist. They aren’t creating something from nothing.

                                                                                                              1. 3

                                                                                                                I think that based on your analysis, most artists are engineers too.

                                                                                                                Not sure I would jump all the way there, but they do have some things in common. Engineering also involves all the things that I listed in the previous sentence.

                                                                                                                Also, most software “engineers” maintain things that already exist. They aren’t creating something from nothing.

                                                                                                                Have you worked with any mechanical, electrical, civil, or any other of the traditional engineers? It’s at least 10x worse for them. How many bridges are designed from scratch vs an iteration of a previous design? How many EEs do something more than mix and match pre-existing circuits? And so on.

                                                                                                                1. 3

                                                                                                                  How many bridges are designed from scratch vs an iteration of a previous design?

                                                                                                                  Writing a program is like building a bridge not like designing a bridge. Designing a bridge is like coming up with a new algorithm or improving an old one.

                                                                                                                  1. 1

                                                                                                                    Writing a program is like building a bridge not like designing a bridge.

                                                                                                                    I think this analogy is poor. If pressed to tell you what software process was like “building a bridge”, I would say installing software on a server - and we’ve gotten very good at making that fast and easy.

                                                                                                                    1. 2

                                                                                                                      Both analogies partly work and partly don’t. That’s why I made such an effort to talk to people with firsthand experience in both sides of the discussion.

                                                                                                            2. 2

                                                                                                              I like this, I’ve always thought of myself as an artist.

                                                                                                            1. 17

                                                                                                              “This is Rust, running at 60fps.” Immediately below that: “FPS: 18.5”

                                                                                                              Snark aside, is it really necessary to render every frame when there have been no I/O events? Maybe redraw only on mouse click / drag, or when the control under the mouse changes?

                                                                                                              1. 10

                                                                                                                No, it’s not necessary, but it does make the GUI far, far easier to write, and also makes the code using it easier to write, as long as it doesn’t get too elaborate. The opposite model to an immediate mode GUI is a “retained mode GUI”, where you build a data structure representing your GUI and generally only redraw the parts of it that change state; this model is used by Real Gui Libs(tm) like Qt, Cocoa and WinAPI. Imguis such as Dear Imgui or Nuklear are particularly popular in gamedev as debug interfaces, since the performance is generally Good Enough and it’s easy to throw them together and modify them.
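                                                                                                                The contrast, in Python-flavored pseudocode (both APIs invented for illustration; real libraries differ):

                                                                                                                # Immediate mode: the whole UI is re-declared every frame; the library
                                                                                                                # retains nothing, so application state is the only state there is.
                                                                                                                def draw_frame(ui, state):
                                                                                                                    if ui.button("Increment"):          # draws the button AND reports a click
                                                                                                                        state.count += 1
                                                                                                                    ui.label(f"count = {state.count}")  # redrawn from scratch each frame

                                                                                                                # Retained mode: build a widget tree once, then mutate it on events.
                                                                                                                def build_window(toolkit, state):
                                                                                                                    label = toolkit.Label(f"count = {state.count}")
                                                                                                                    def on_click():
                                                                                                                        state.count += 1
                                                                                                                        label.set_text(f"count = {state.count}")  # explicit update
                                                                                                                    button = toolkit.Button("Increment", on_click)
                                                                                                                    return toolkit.Window([button, label])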

                                                                                                                1. 17

                                                                                                                  Another reason why immediate GUIs are popular with games and other 3D visualization tools is that you’re typically dirtying your whole window on every frame and making frames as fast as you can anyway; they’re already pegging your CPU drawing the application, so pushing into that peg a little more for a nice UI library is fine :)

                                                                                                                  1. 6

                                                                                                                    You can have a mix though, where you write all your GUI as if it was immediate mode but then you just pause the render loop when nothing at all is happening.

                                                                                                                    1. 4

                                                                                                                      I find the modern chiasmus from retained mode UI (Win32 et al) and immediate mode graphics (OpenGL 1.x) to immediate mode GUIs (React) and retained mode graphics (modern GL) amusing.

                                                                                                                      1. 2

                                                                                                                        There’s nothing about immediate mode GUIs that requires them to render at 60fps in the absence of input events. Games do this because they naturally have a realtime render loop, but desktop applications don’t and there’s no reason to impose one.

                                                                                                                        1. 1

                                                                                                                          I think you can do immediate mode but then render that through HTML / CSS. You can then style from outside the code with CSS, and submit HTML immediate mode from your code. Keeps the game-code-friendly ergonomics. I was trying this out and it’s actually been kinda fun: https://twitter.com/snikhilesh/status/1275945505168584704?s=21

                                                                                                                          You can even reuse existing “web components” and whatnot that way. I think I’ll end up building the CSS up from scratch though cuz it’s fun. :D

                                                                                                                        2. 1

                                                                                                                          It runs at 75fps for me :)

                                                                                                                        1. 1

                                                                                                                          Older versions of Turbo Pascal had the exact same problem – something in its standard library would busy-loop on startup to calibrate some delay function. So any binaries it built would be unusable on faster hardware.

                                                                                                                          1. 2

                                                                                                                            I ran across ART a few weeks ago when I was idly wondering about data structures for text indexing, so I’m happy to see someone else thought it looked cool too. Are there any other common use cases for it?

                                                                                                                            As an aside, I also found MergedTries – seems like they could be pretty cool when used together, getting both prefix and suffix compression.

                                                                                                                            1. 1

                                                                                                                              In the new glossary, WAL is missing its id attribute, which means none of the references to it work.

                                                                                                                              1. 4

                                                                                                                                It’s there, just in another transaction. Wait for the commit. ;)

                                                                                                                                1. 3

                                                                                                                                  That joke was so bad my eyes did a rollback into my head

                                                                                                                                  1. 2

                                                                                                                                    LOL

                                                                                                                              1. 3

                                                                                                                                I don’t think it falls in the category of supercomputer, but it’s pretty good. It’s not mine, but I have access to the server of the Federal University of Minas Gerais (UFMG):

                                                                                                                                RAM: 1 terabyte
                                                                                                                                LSCPU output:
                                                                                                                                CPU(s):                64
                                                                                                                                On-line CPU(s) list:   0-63
                                                                                                                                Thread(s) per core:    2
                                                                                                                                Core(s) per socket:    8
                                                                                                                                Socket(s):             4
                                                                                                                                NUMA node(s):          8
                                                                                                                                CPU family:            21
                                                                                                                                Model:                 2
                                                                                                                                Model name:            AMD Opteron(tm) Processor 6376
                                                                                                                                CPU MHz:               1400.000
                                                                                                                                CPU max MHz:           2300,0000
                                                                                                                                CPU min MHz:           1400,0000
                                                                                                                                

                                                                                                                                I use it for performing genome assembly and genome annotation, mostly.

                                                                                                                                Genome assembly: a genome, when sequenced, comes out fragmented; we then need to join the pieces and close the gaps by checking for overlaps among the many fragments.

                                                                                                                                Genome annotation: annotating the assembled genome, i.e. finding where genes begin and where they end, finding intron and exon regions, finding binding regions.

                                                                                                                                1. 1

                                                                                                                                  What do the cool kids use for assembly and alignment these days? I worked in biotech ~10 years ago and I can only imagine things have changed considerably.

                                                                                                                                  1. 2

                                                                                                                                    For assembly: SPAdes, Flye, Canu, Velvet. With 16 GB of RAM, you might be able to perform assembly on some smaller datasets.

                                                                                                                                    For alignment, the same old tools are still in use: BLAST, Clustal, MUSCLE, DNAstar.

                                                                                                                                1. 1

                                                                                                                                  It looks like the bug is an out-of-bounds write to an array that was subsequently fixed.

                                                                                                                                  But:

                                                                                                                                  a lot of people liked the custom maps that utilized this mechanism to create some fun games, called “EUD Maps”, so much that a tool called EUDEnable was created to patch the game in memory and re-introduce the bug.

                                                                                                                                  …what crazy custom maps were people making that depended on writing to arbitrary addresses in memory?

                                                                                                                                  1. 2

                                                                                                                                    https://youtu.be/JhPdfz-IglQ Starts with some good examples.

                                                                                                                                    1. 1

                                                                                                                                      I’m mesmerised😲

                                                                                                                                      1. 1

                                                                                                                                        …that’s about as glorious and horrifying as I could imagine

                                                                                                                                    1. 1

                                                                                                                                      I have a hard time understanding why things like Django’s ORM are a win. They make simple queries possible and complex queries tedious at worst, as compared to just writing the SQL. My workflow for writing non-trivial queries is something like:

                                                                                                                                      1. Think about the query I want to write
                                                                                                                                      2. Figure out how to express said query in the ORM
                                                                                                                                      3. Look at the generated query to figure out why it’s slow and what I need to nudge to get the ORM to generate something reasonable

                                                                                                                                      I’d be curious to hear about success stories from people using these tools in large, non-trivial codebases.

                                                                                                                                      1. 1

                                                                                                                                        Two benefits that come to mind are

                                                                                                                                        1. Portability: using an ORM reduces the dependency on a particular database engine. For products / projects that need to operate on multiple DB engines, using an ORM makes things much easier.

                                                                                                                                        2. Composability: using an ORM, with Django QuerySets as an example, you can easily compose queries. Think about a website with a list of filters: some filters are on fields of the same table, others on related tables that require a join, some are on expression fields, etc. How would you implement that without an ORM? String interpolation? I think you’d end up creating your own tiny ORM to add fields and joins lazily (see the sketch below).

                                                                                                                                        As someone who is not bound by the first constraint, I often resort to raw SQL in places where the ORM falls short (example).
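                                                                                                                                        To make the composability point concrete: the QuerySet calls below are standard Django, but the Product model and its fields are invented for illustration. Because QuerySets are lazy, each branch just extends the query, and no SQL runs until the result is iterated.

                                                                                                                                        from django.db.models import F
                                                                                                                                        from myapp.models import Product  # hypothetical app/model

                                                                                                                                        def search_products(name=None, category=None, on_sale=False):
                                                                                                                                            qs = Product.objects.all()                      # no query executed yet
                                                                                                                                            if name:
                                                                                                                                                qs = qs.filter(name__icontains=name)        # field on the same table
                                                                                                                                            if category:
                                                                                                                                                qs = qs.filter(category__slug=category)     # join via a foreign key
                                                                                                                                            if on_sale:
                                                                                                                                                qs = qs.filter(sale_price__lt=F("list_price"))  # expression field
                                                                                                                                            return qs                                       # SQL runs on iteration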