Threads for madhadron

  1.  

    This seems very much in the vein of what FFTW does for Fourier Transforms, though that’s with genetic algorithms.

    FFTW tunes on a per-CPU basis, though, while this is targeting LLVM, which is interesting. I am curious if the agent could do better by targeting the underlying CPU?

    1. 10

      So.. there’s a lot to unpack here.

But the thing that really irks me is that Douglas made some very vague assertions in the video, which is fine for click-bait, but the author of this article is making a ton of assumptions about what those “bad foundations” are.

      I am not a Douglas Crockford stan.. but a good place to start if you want to see the world from his perspective is his programming language “E”.

      Another thing to pick apart, in the video he said “it may be the best language right now to do what it does, but we should do better.” That would imply we’re including all the modern languages in there.

      Now for my take:

      When we’re talking about application development, raw performance is the last characteristic that is interesting to me.

And the things holding us back from writing massive, understandable applications aren’t solved just by stronger core data structures and destructors; these are baby steps. We need to go beyond programming based on procedures and data structures.

      1.  

        As far as I’m aware, Pony and Monte both follow in the tradition of E in one way or another.

        1.  

          raw performance is the last characteristic that is interesting to me.

Using a lang like Rust means I never have to encounter a case where the language does not meet requirements due to performance (unless I need to drop down to hand-optimized assembly or something, and I just don’t think I’ll ever really need that). Even though I don’t “need” the performance most of the time, it is nice knowing that it’s there if a problem comes up. With Ruby, if you hit an issue it’s “write a C extension”, but then I write Ruby because I don’t want to write C.

          The other thing I think about is expressivity. If a language does not have semantics that allow me to express my low level performance requirements to the machine (so it can be optimized), what other kinds of things are hard to express?

We spent decades trying to decouple logic (“compute”) from memory, only to come full circle: it turns out the two are deeply intertwined.

          1.  

I don’t organize my tech choices around the 1% bottleneck in my system. I guess the difference between us is that I don’t mind writing a C/Zig/Rust extension if 99% of my code can stay in Ruby. I think we could find new ways to solve the two-language problem, but I don’t believe it’s as simple to solve as people think. You cannot build Rails in Rust, and Rails is still more boilerplate-heavy and restrictive than I’d prefer. The system I want doesn’t exist yet, but I know it couldn’t be written in Rust.

            1.  

              I guess what I was trying to express is that the two language problem isn’t about performance but rather expression (for me).

              I love that I have the capability to express to my computer “I want exclusive access to this input” (via a mutable reference) or that I can express “this function does not mutate its input” or “this function should not allocate”.

I am a Ruby core contributor and maintain syntax_suggest. After about a year of writing Rust code, I ran into an internal mutation bug in syntax_suggest (written in Ruby) that cost me a few hours of my life. In Rust, that kind of logic bug would have been prevented by default: the code wouldn’t have even compiled. Yes, that is the same limitation that allows for not needing a GC, but it also has benefits beyond performance.

I’m not advocating that everyone use it for everything, but I am saying you cannot avoid thinking about memory (even in GC languages). It’s just a question of how much you have to think about it, and when.
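As a rough illustration of the guarantees described above (a minimal sketch; the function names and toy scenario are mine, not from syntax_suggest):

```rust
// "This function does not mutate its input": a shared reference (&[String])
// cannot be used to modify the lines.
fn count_blank_lines(lines: &[String]) -> usize {
    lines.iter().filter(|l| l.trim().is_empty()).count()
}

// "I want exclusive access to this input": a mutable reference (&mut [String])
// guarantees no other alias can observe or interleave with this mutation.
fn strip_trailing_whitespace(lines: &mut [String]) {
    for line in lines.iter_mut() {
        let new_len = line.trim_end().len();
        line.truncate(new_len);
    }
}

fn main() {
    let mut lines = vec!["def hello  ".to_string(), "".to_string()];

    let blanks = count_blank_lines(&lines); // shared borrow: read-only
    strip_trailing_whitespace(&mut lines);  // exclusive borrow: may mutate

    // The internal-mutation class of bug can't sneak through: holding a shared
    // borrow across a call that mutates the same data is a compile error.
    // let first = &lines[0];
    // strip_trailing_whitespace(&mut lines); // error[E0502]: cannot borrow
    // println!("{first}");                   // `lines` as mutable

    println!("{blanks} blank line(s)");
}
```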

              1.  

                That’s a good point, thanks for clarifying.

              2.  

Are we sure that we can’t have both, though? For example, Common Lisp allows incremental typing, which acts as hints to the compiler, and the mature compilers will use them to generate very efficient machine code if they can. The V8 VM and the HotSpot JIT for the JVM both come from work on heavily optimizing Smalltalk and Self (see Strongtalk).

                1.  

I do think we can have both; I like the approach I see from Red, with Red and Red/System.

For my own language I’ve been imagining a high-level DSL that generates C code to interface with, but with analysis done to generate safe C and make it highly interoperable. Maybe a pipe dream, but I do think there’s a lot of unexplored space here.

              3.  

Using a lang like Rust means I never have to encounter a case where the language does not meet requirements due to performance (unless I need to drop down to hand-optimized assembly or something, and I just don’t think I’ll ever really need that)

                C is faster than rust though, just so everyone knows

                1.  

                  Asm is faster than C though, just so everyone knows

                  1.  

                    I don’t think that’s necessarily the case. Why do you think that C is faster?

                    It’s a very broad topic with a lot of nuances but let me share a few short points.

On one hand, Rust’s design enables fearless concurrency. Clear ownership makes it easier to write concurrent code and avoid costly synchronization primitives. Stricter aliasing rules give more optimization opportunities (modulo compiler backend bugs).

                    On the other hand, there are programs which are easy to express in C but writing them in Rust is painful (see Learn Rust With Entirely Too Many Linked Lists). The cost of array bounds checking is probably non-zero as well (although I haven’t seen a good analysis on this topic).
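A toy sketch of why that cost is hard to pin down (my own illustration, not a benchmark): indexed access pays a per-element check unless the optimizer can prove it away, while iterator-style code sidesteps the question entirely.

```rust
// Indexed access: each `xs[i]` is bounds-checked unless LLVM can prove the
// index is always in range (it often can for simple loops like this one).
fn sum_indexed(xs: &[f64]) -> f64 {
    let mut total = 0.0;
    for i in 0..xs.len() {
        total += xs[i];
    }
    total
}

// Iterator style: no index, so there is no bounds check to elide.
fn sum_iter(xs: &[f64]) -> f64 {
    xs.iter().sum()
}

fn main() {
    let xs = vec![1.0, 2.0, 3.0];
    assert_eq!(sum_indexed(&xs), sum_iter(&xs));
}
```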

                    1.  

                      The cost of array bounds checking is probably non-zero as well (although I haven’t seen a good analysis on this topic).

                      Will this do? https://lobste.rs/s/yibs3k/how_avoid_bounds_checks_rust_without

                      1.  

                        Marketing language like “fearless concurrency” tells us nothing about what Rust design enables or is good for. I’ve never been scared of concurrency, just annoyed by it. What practices or features does Rust afford the programmer that improves their experience writing concurrent code? This is something I haven’t yet heard.

                        This Rust book explains it in meaningful terms: https://doc.rust-lang.org/book/ch16-00-concurrency.html
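To make the “fearless” part concrete, here is a minimal sketch (my own example, not taken from that chapter): the racy version is a compile error, and the version that compiles spells the synchronization out in the types.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // This compiles: shared ownership (Arc) plus a lock (Mutex) are stated in
    // the types, so every thread's access to the counter is synchronized.
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    println!("{}", *counter.lock().unwrap());

    // The racy version is rejected at compile time: each spawned closure
    // would borrow `count` mutably and outlive the current function.
    // let mut count = 0u64;
    // for _ in 0..4 {
    //     thread::spawn(|| count += 1); // error[E0373]: closure may outlive
    // }                                 // the current function, borrows `count`
}
```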

                  2.  

It’s important to note that Crockford, Mark Miller, and other E veterans deliberately steered ECMAScript to be more like E. From that angle, Crockford is asking for something better than ECMAScript, which is completely in line with the fact that “E” should be more like Joe-E, then E-on-Java, then Caja, then Secure ECMAScript…

                    1.  

                      When we’re talking about application development, raw performance is the last characteristic that is interesting to me.

There was a recent story about how Mojo takes Python and adds a sub-language for optimization. It seems like a similar approach would be great for JavaScript. WASM is great and all, but it suffers from the two-language problem, only worse, because of how sandboxed WASM is.

                      1.  

                        Mojo doesn’t appear aimed at application development, their headline is “a new programming language for all AI developers.”

                        1.  

Though “for AI” is everyone’s tagline at the moment. I’m not sure I’d read too much into it. I just saw AWS advertising on my IAM login screen that I should store vectors in RDS Postgres with pgvector because AI.

                        2.  

                          That is a problem with WASM, but the funny thing is that it started exactly like that – asm.js was a subset of JavaScript that the browser JITs could optimize to native performance. And that became WASM, which reduced the parsing burden with a binary code format.

                          The reason the subset approach works for Mojo is because it’s a language for machine learning. It’s meant for pure, parallel computation on big arrays of floats – not making DOM calls back to the browser, dealing with garbage-collected strings, etc.

                          The Mojo examples copy arrays back and forth between the embedded CPython interpreter and the Mojo runtime, and that works fine.

                          WASM also works fine for that use case – it’s linear memory with ints and floats. It would be slower because there’s no GPU access, and I think no SIMD. But you wouldn’t really have the two language problem – the simple interface works fine.

                          Machine learning is already distributed, and JavaScript <-> WASM (+ workers) is basically a distributed system.
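To make “linear memory with ints and floats” concrete, here is roughly what the module side of that simple interface can look like (a sketch assuming a Rust module built for a wasm32 target; the function name and setup are made up):

```rust
// Built with e.g. `cargo build --target wasm32-unknown-unknown`, this exposes
// a plain "pointer + length in, float out" function. The JavaScript host
// writes a Float32Array into the module's linear memory, then calls this
// export with the offset and element count.
#[no_mangle]
pub extern "C" fn mean(ptr: *const f32, len: usize) -> f32 {
    if ptr.is_null() || len == 0 {
        return 0.0;
    }
    // Safety: the host promises `ptr..ptr + len` lies inside linear memory.
    let data = unsafe { std::slice::from_raw_parts(ptr, len) };
    data.iter().sum::<f32>() / data.len() as f32
}
```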

                          1.  

                            asm.js was a subset of JavaScript that the browser JITs could optimize to native performance. And that became WASM, which reduced the parsing burden with a binary code format.

                            Wasn’t AssemblyScript the answer to the loss of asm.js? That was my understanding, but maybe I’m wrong.

                            1.  

                              I don’t think AssemblyScript could fill that niche since it is both not a superset of JavaScript and compiles entirely to WASM.

                              1.  

                                I don’t understand. asm.js was a strict subset of JavaScript designed to be a compilation target. AssemblyScript is, supposedly, “designed with WASM in mind,” so presumably spiritually successive, given asm.js inspired WASM?

                                1.  

                                  AssemblyScript is not any sort of successor to asm.js. It’s a programming language (not a compilation target) that uses TypeScript syntax and compiles to WebAssembly.

                        3.  

                          Urks?

                          1.  

                            Language confuses me, I spell phonetically - I believe I was looking for “irk”, updated

                        1. 7

Based on the recommendation of a coworker at the time whom I respect greatly, I spent a week in good faith trying to use ChatGPT 4 as a programming assistant. I wasn’t trying to give it gotchas to confuse it or asking it to do algorithmic heavy lifting. I was writing a system that was mostly plumbing (a prototype of using a Temporal.io workflow as a CI system, posting to GitHub Checks as the output — that deserves a writeup of its own; it turned out to be really nice).

When I was working against the GitHub Checks API I asked it to generate some code for me. It produced a hunk of code. I asked it to rewrite it using a Go library that exists for the purpose. It did that. I asked for a few more variations, and by chance the necessary security checks showed up in one of the code outputs. In the end I cut and pasted probably twenty lines from different versions, but basically it was a way of generating code examples of varying quality. So it probably saved me some time in that it gave me the names of things I needed to search for and some basic examples, but it was basically a bad documentation generator.

                          When I tried to get help with writing the Temporal.io workflow, it choked entirely. It couldn’t produce even fragments of code that were useful.

I kept trying to feed it various subtasks of the workflow, such as cloning the git repository. It never produced output that I could use more than a few fragments of, though it did continue to operate as a bad documentation and example generator. I am an experienced enough programmer, and knew enough about the systems I was working with, that I could look at that output and say, “Obviously that can’t be right,” and research what would be correct. But if you have someone who doesn’t have a couple decades of experience to course-correct, the tool may be worse than useless.

Over the course of a week it probably saved me a couple hours on tasks. It felt like it imposed a high cognitive burden at the same time, so I’m not sure whether it cost me a similar amount of time in context switches. And if I were working with systems with excellent documentation, it would have been useless. I have felt no inclination to reach for it again since that week.

                          1. 1

                            IMO programmers who think that AI cannot help them aren’t being creative enough in how they use them. I don’t use ChatGPT to write whole programs for me, I use it for things like getting implementation details of third party libraries.

                            1. 4

                              Yes, but vice versa, I think for most programmers it’s not even a 10% improvement in productivity. It’s an occasional two hour task cut down to 10 minutes of back and forth with the bot.

                              1. 6

                                …followed by 90 minutes of going out to confirm what the bot said.

                                1. 5

What makes it good for CSS is that you can instantly see that it’s completely full of crap and not working at all. For tasks without clear testing conditions, it’s very dangerous, e.g. the insecure POSTing on GitHub’s Copilot demo page.

                                2. 5

                                  I’ve found it really variable and I can easily see people considering it a complete game changer or a total waste of time, depending on where their day-to-day work falls on the spectrum of things I’ve tried.

                                  For knocking together some JavaScript to do something that’s well understood (and probably possible to distill from a hundred mostly correct StackOverflow answers), it’s been great. And, as someone who rarely writes JavaScript, a great way to find how some APIs have changed since I last looked. Using a LLM here let me do things in 10 minutes that would probably have taken a couple of hours without. If you are working in a space where a lot of other people live but you typically don’t, especially if you jump between such spaces a lot and so don’t have the time to build up depth of expertise, it’s a great tool for turning breadth of experience into depth on demand.

I tried it for some things in pgfplots, a fairly popular LaTeX package (and therefore part of a niche ecosystem). It consistently gave me wrong answers. Some were close to right and I could figure out how to do what I wanted from them, a few were right, and a lot were very plausible-looking nonsense. In fairness, I used DuckDuckGo to try to find the answer while it was generating the response. In almost all cases, I was about the same speed with or without the LLM if I was able to solve the problem. Some things I was unable to solve at all (for example, I had a table column in bytes and I wanted to present those numbers with the base-2 SI prefix - Ki, Mi, and so on - and I completely failed). I probably wasted more time with plausible-but-wrong answers here than I gained overall, because I spent ages trying to make them work where I’d probably have just given up without the LLM. If you’re doing something where there’s a small amount of data in the training sets then you might be lucky or you might not. I can imagine a 10% or so improvement if the LLM is fast.

                                  I’ve also tried using it to help with systems programming tasks and found that it routinely introduces appalling security holes of the kind I’d expect in example code (which routinely omits error handling and, particularly, the kind of error handling that’s only necessary in the presence of an active attacker). Here, I spent far more time auditing the generated code than I’d have spent writing it from scratch. This is the most dangerous case because, often, the code it generated was correct when given valid input and so non-adversarial testing would have passed. Writing adversarial tests and then seeing that they failed and tracking down the source of the bugs was a huge pain. In this scenario, it’s like working with an intern or a student, something that you would never do to be more productive, but to make them more productive in the longer term. As such, the LLM was a significant productivity drain.

                                  1. 4

I find that LLMs really shine when you give them all the context needed to do their task and rely on the “grammatical” understanding they learned. Relying on their training corpus somehow being qualitatively good enough to generate good code is a crapshoot and a proper time sink. But asking it to rewrite the one written-out unit test to cover 8 more edge cases I specify? Spot on. Asking it to transform the Terraform to use an iterator and a variable instead of the hardcoded subnets? Right there. I like writing the first version, or designing the DSL that can then be transformed by the LLM. You don’t see many of these approaches around, but that’s where all the stochastic underpinnings really work. Think of it as human-language-driven DSL refactoring. Because its output will be quite self-consistent, it will often be “better” than what I would do, because my stamina is only so large.

I do use LLMs to generate snippets of code and have a pretty good flair for “OK, this probably doesn’t exist”, but even then, I do get proper test scaffolding and maybe a hint of where to look in the manual, or even better, what API I actually should implement. It’s a weird thing to explain without showing it. I was very skeptical of using LLMs to learn something (in this case, Emacs and Emacs Lisp) where I don’t know much and I knew the training corpus would be haphazard, but it turned out to be the most fun I’d had in a long time.

                                3. 2

Honestly, I think it would sell me if, instead of trying to give me the answer, it provided links to various sources that should help me out.

                                  “Maybe you should check out pages 10-15 of this paper.” or “This article seems to achieve part of your goal [x], and this one shows how to bring them together [y]”

The problem is that it assumes it can give me an answer better than the original source, and while sometimes that’s true, it’s often not.

                                  I’m sure I could learn to prompt it in a way that would give me these types of answers though..

                                1. 1

                                  This kind of thinking is why I think classes were an early mistake in object oriented programming. Of course is-a and has-a are insufficient. It’s like trying to write Prolog with only two predicates. And, yes, there are failed schools of object oriented design that tried to do that, and they’re still taught in some academic settings.

                                  We would serve students better if we started with Self or JavaScript (without classes): conjure up the objects you want. Don’t try to classify them. Learn to work with them directly. Later start talking about static analysis of such programs. Even in Java I tend to define interfaces and reify up anonymous implementations of them as needed instead of defining classes for most things.

                                  1. 2

                                    In my experience people don’t know where their data sources are. You can’t ask them and walk the tree to get total knowledge.

                                    Sure they know the big ones. But those are not the ones that’ll get you paged at 2 am when a mutation broke the service you didn’t know existed.

                                    1. 4

                                      Yes, and I think that this is the best way. Outage happens, you resolve it, write a postmortem, add the newly-discovered data dependency to the list. Now you have some potential energy to improve it. Several postmortems later you’re in a better place.

                                      1. 3

                                        This ties into a “you’re probably gonna need it” that I feel I should write for infrastructure. One of the things is ownership: a single catalog of all resources with a team attached. It’s not only for outages. It’s for reorgs (what have we just inherited?), data compliance (we have to delete this everywhere…where is everywhere?), cost control (who is spending half a million a day on this ElasticSearch cluster and why?), and a bunch of other stuff.

                                        1. 1

                                          I agree, and also this sort of record-keeping is a core prong of GDPR, so anyone who does business in Europe or with EU citizens really ought to be thinking about this stuff.

                                          I’ve been kind of upset that there isn’t a ton of public discussion of the value that data governance records can have operationally, to engineers. It seems like everyone is treating it as just something for the lawyers to care about.

                                      1. 1

                                        Another option: run it on a local machine on your internet connection, and use Cloudflare as a proxy to it. Then you can run the much nicer development environment of processes on a Linux machine as opposed to wiring stuff together across the internet.

                                        1. 21

                                          I’ve been using Migadu for a few years now; they’re great. The best thing about them is that I always get very quick replies from their support teams when needed.

                                          1. 6

                                            The pricing looks great for individual use, but I’m a little concerned about the limits for incoming and outgoing mail.

                                            Are you on the Micro plan? Have you ever exceeded the limit?

                                            1. 10

                                              I’m on the Micro plan and never come close to the limits. YMMV.

                                              1. 5

                                                Me too, and I am subscribed to a few mailing lists

                                              2. 2

                                                I host the email for a few dozen accounts on their largest account, and it works smoothly. The webmail is okay. No calendar integration or the like, which was a pain point for a few of the users when I migrated from an old GMail service when Google decided to start charging for it.

                                                Their support really is excellent.

                                                1. 1

                                                  This might not be what you’re looking for but they do have basic CalDAV support. No web interface for this though.

                                                  1. 1

                                                    I wonder what caldav server they use.

                                                    1. 3

                                                      I believe it’s sabre/dav.

                                                2. 1

                                                  Best thing to do if you’re considering it is to look at how much mail you’ve sent and received in previous months. I think there’s a half-decent Thunderbird add-on that’ll summarise that information for you if nothing else.

                                                  Also, if you’re keen on moving but are pushing the limit on sends then remember that there’s no reason you need to always use their SMTP service! I often use my ISP’s (sadly now undocumented) relay and never had any bother.

                                                  1. 1

                                                    I had to bump up to the mini plan to accommodate a family member’s small business running under my account. It was painless.

                                                    1. 1

                                                      Yes, I’m on the Micro plan, and I haven’t come close to the limits. If there were one day when I exceeded the limits I’m sure they wouldn’t mind; if I had higher email flux in general though I’d be happy to pay more.

                                                    2. 1

                                                      I just started using them for some things, and the amount of configurability they give you is crazy (in a great way, that is). I’m going to move all of my mail hosting to them some day.

                                                      1. 1

The only bad thing is their web portal - it doesn’t remember logins and the search is slow and dysfunctional.

Otherwise I love Migadu and will always sing its praises.

                                                        1. 1

                                                          As in the web mail interface? I assumed that was more of a toy/demo since I use IMAP.

                                                      1. 10

                                                        Does no one else remember the period when everyone was going to be using tablets and similar devices and we had to redesign the desktop environments for that? My memory is that we got Windows 8’s UI and GNOME 3 from that push.

                                                        1. 9

                                                          We remember. We’re trying to forget.

                                                          1. 3

                                                            You say that as if it were misguided. Almost everyone does use oversized smartphones and/or tablets and many, many people own no other devices of any kind.

                                                          1. 6

I spent a couple of weeks trying to use GPT4 heavily, in a context where lots of people were sharing tips for how to get value out of it. And, in the end, it just wasn’t very useful to me. Can it produce code? Yes. But I’ve spent a lot of time learning how to remove unnecessary pieces, and its usefulness in producing code seems to increase dramatically as you ask it for the kind of code I just wouldn’t write in the first place.

                                                            I may just be a weird outlier (it wouldn’t be the first time), but I cancelled my subscription. I can find many more useful things to do with $20/month, and that’s before we get into the issues around intellectual property, data governance, etc.

                                                            1. 20

                                                              “Is it possible for a peer-reviewed paper to have errors?”

                                                              My sweet, sweet summer child. You are vastly overestimating the competence of this species.

                                                              1. 12

                                                                The standard pre-publication peer review process is only intended to reject work that is not notable or is fairly obviously bullshit. It’s not intended to catch all errors.

                                                                Post-publication review (in the form of other scholars writing review papers, discussing the work, trying to replicate it, challenging it, etc) is where the academic process slowly sifts truth from fiction.

                                                                1. 1

                                                                  Expanding: reviewers certainly can and do catch minor errors — but it’s not their primary job, and they generally don’t have the time to be very thorough about it.

                                                                2. 6

I am definitely a summer child. Academically, I wasn’t so fortunate. I have worked through a few papers, but this was the first time I encountered a publication error.

When I shared this story with a few people, they were surprised that I didn’t know papers could have errors. I simply didn’t know, because no one had told me till now!

                                                                  1. 4

                                                                    We’ve all been sweet summer children at different times in different walks of life.

Before I had any insight into the details through friends and colleagues, I, too, had illusions of academic publishing being this extremely rigorous process, triple- and quadruple-checked and reproduced before publishing. Discovering the fallibility of authors, peer reviewers, and publishers was both mildly heartbreaking and extremely relieving. :)

                                                                    It really isn’t something they teach you in school — or at least they haven’t in any of the ones I’ve attended.

                                                                    1. 4

Pre-publication peer review is a process where papers are filtered via a set of biases. Papers are as likely to be rejected for not being the right style for a venue as for technical content. At this stage, no one tries to reproduce results, and reviewers often will not check the maths (my favourite example here is the Marching Cubes algorithm, which was both patented and published at the top graphics conference, and is fairly obviously wrong if you spend half an hour working through it. The fix is simple but the reviewers didn’t notice it).

After publication, 90% of papers are then ignored. The remaining 10% will have people read them and use them as inspiration for something. An even smaller fraction will have people try to reproduce the results in order to build on them. Often, they discover that there was a bug. When we tried to use the lowFAT Pointers work, for example, we discovered that the compression scheme was described in prose, maths, and a circuit in the original paper, and these did not quite agree (I think two of them were correct). For a tiny subset of papers, a lot of things will be built on the result and you can have confidence in them.

                                                                      The key is to think of academic papers as pretentious blogs. They might have some good ideas but until multiple other people have reproduced them that’s the most that you can guarantee.

It was sobering for me to attend ISCA back in 2015, when there was a panel on the role of simulation in computer architecture papers. I expected the panel to say ‘don’t rely on simulators, build a prototype’. The panel actually went the other way and said some abstract model was probably fine. This was supported by the industry representative (I think he was from NVIDIA), who said that they look at papers for ideas but skip the results section entirely because they don’t trust the results: the error margins are often 50% for a 20% speed up, so unless they’ve internally reproduced the results on their own simulation infrastructure they assume the results are probably nonsense.

                                                                      1. 2

                                                                        the error margins are often 50% for a 20% speed up, so unless they’ve internally reproduced the results on their own simulation infrastructure they assume the results are probably nonsense.

                                                                        To be fair to simulators, I suspect industry has to ignore the results sections even on papers which do have implementations. So experimenting on a simulator is reasonable because it’s cheaper.

                                                                        Say I have an idea for making arithmetic faster. I implement an ALU with my idea (the experiment) and an ALU without it (the control) and compare performance. If my control has a mistake in it which tanks its performance, the experiment will look great by comparison.

                                                                        I have seen people ranting about precisely this problem with literature on data structures and algorithms here on lobsters. That’s largely solved by making sure you benchmark against an industrially relevant open source competitor, but those largely aren’t available in the hardware space?

                                                                    2. 2

                                                                      Indeed. This was just a typo, too. Wait until the author gets deep into the literature and starts finding stuff that is just outright wrong. By the end of grad school, I assumed that any biology paper in Nature, Science, or Cell was probably flawed to the point of being unusable until proven otherwise.

                                                                      1. 5

                                                                        At the end of high school, you believe you know everything.

                                                                        At the end of college, you believe you know nothing.

                                                                        At the end of a PhD, you believe nobody knows anything.

                                                                        (No clue, where I got it from)

                                                                        1. 2

                                                                          After five years in industry, you start assuming active malice…

                                                                          (edit: Software developer, for context. I’m not talking about papers so much as the garbage that makes up the modern ecosystem.)

                                                                          1. 3

                                                                            We each choose whether to be that kind of person.

                                                                            1. 2

Or, just maybe, the system of peer review that was instituted when there were fewer researchers has been overwhelmed by the massive increase in researchers who also have a direct economic interest in publishing.

                                                                              Anyway, peer review is just a basic step to tell you a paper isn’t entirely worthless. It states conclusions and presents evidence that’s not totally unbelievable. The real science starts when (like in the linked post) someone tries to reproduce the findings.

                                                                          2. 2

                                                                            Or if not wrong then incomplete, vague, or low-key deceptive. Frankly, the format rather sucks and we should do better. The failures are often a lot more interesting and useful, but the successes are what get reported. So anyone who wants to actually reproduce the work has to re-tread all the failures from scratch, again.

                                                                            1. 6

                                                                              The researchers that I respect often have fascinating blogs. I am aware, for example, that @ltratt publishes some papers to keep the REF and his funding bodies happy. Occasionally I might even read one. I’ll read everything that he writes on his blog though. If the REF actually measured impact, they’d weigh the blog far more heavily than a lot of publications.

                                                                        1. 10

                                                                          While this is true to some extent, it’s important to note that when you type on a keyboard, you are also essentially predicting and adding the next word.

                                                                          I’m genuinely offended by this. I try to synthesise something closest to my desired result using words in order. The only “predicting” is that the next word which comes to mind is most useful for that synthesis.

                                                                          1. 19

                                                                            It’s fascinating that so many of the folks who get into these models seem to think that this is how language works, as if language were just a statistical pattern isolated from behavior and environment and our other mental faculties.

                                                                            1. 13

                                                                              It’s the result of taking the positivist, behaviourist approach of “we’re all biological deterministic computing machines” (which is a fair-enough aspect for a model of human behaviour) with the classic tech-bro attitude of not having understood the problem domain, but posturing just enough to convince people you know what you’re talking about

                                                                              1. 13

                                                                                It’s the result of taking the positivist, behaviourist approach of “we’re all biological deterministic computing machines” (which is a fair-enough aspect for a model of human behaviour) with the classic tech-bro attitude of not having understood the problem domain, but posturing just enough to convince people you know what you’re talking about

Actually, no. I wrote it the way I did for a specific reason; it was designed as a jab at people who don’t realize that the low-level behavior of predicting the next token doesn’t really mean that there are no higher-level abstractions or processes behind it. That attitude is quite common, especially among people who have some knowledge of Markov chains and other similar outdated techniques of generating text (“it just completes the text it saw somewhere” is a response I’ve gotten many times while discussing LLMs with people). My hope was that it would catch their attention and make them think about the next part of the blog, instead of just dismissing it.

                                                                                1. 6

                                                                                  Only a few folks appreciate that predicting the next word accurately requires a healthy measure of ‘thinking ahead’.

Sure, it comes out one word at a time; that’s probably why it can accurately model real writing, where the cursor is at the end and doesn’t jump around everywhere (at least on a first draft, and a first draft by an excellent draftsman is the only draft).

                                                                                  Don’t let them get to you.

                                                                                  1. 4

Markov chains

Is an LLM a Markov chain or not?

edit: I believe that it is a Markov chain. Technically.
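For reference, a sketch of the property in question (my framing: take the “state” to be the whole context window rather than a single token):

```latex
% Markov property: the next state depends only on the current state.
\[
  \Pr(X_{t+1} \mid X_t, X_{t-1}, \dots, X_0) \;=\; \Pr(X_{t+1} \mid X_t)
\]
% For an autoregressive model with a context window of k tokens, let the
% state be the window s_t = (w_{t-k+1}, \dots, w_t). The model conditions
% only on that window, so
\[
  \Pr(w_{t+1} \mid w_1, \dots, w_t) \;=\; \Pr(w_{t+1} \mid s_t),
\]
% i.e. the chain over windows is Markov (equivalently, the token sequence
% is a Markov chain of order k).
```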

                                                                                    1. 3

                                                                                      Depends on the architecture. GPTs might not count, but RWKV does. You’re right to check for the Markov property directly.

                                                                                      1. 2

I’m just thinking about a basic decoder-only autoregressive LLM. I’m certain that it is Markov, but googling around, people seem to strongly believe that it is not, seemingly based on personal desires. I do not understand why one would be emotionally attached to LLMs not being Markov.

                                                                                      2. 1

                                                                                        Well, on the token level, I think you could argue that it has some similarities in how the temperature works, but apart from that, I wouldn’t say so.

                                                                                        1. 1

                                                                                          Absolutely not, because they have a context window and because the user inserts steps. What they generate depends on your prompts and what they previously generated in that session.

                                                                                          1. 3

                                                                                            You’re telling me that an LLM is not a Markov Chain (definition 1.2)? Why is it not?

                                                                                            1. 4

                                                                                              Because the indices of its ‘transition matrix’ don’t deterministically correspond to states in its (reduced) state space. Even at a temperature of 0 it won’t always generate the same completion twice.

                                                                                      3. 6

                                                                                        not having understood the problem domain, but posturing just enough to convince people you know what you’re talking about

                                                                                        There’s plenty of that in all levels of academia and industry, let’s put away the tired old meme of the tech bro.

                                                                                        1. 2

                                                                                          I would posit the differentiating factor is tech bros move fast to make money and “hustle”, for better or for worse. Off-topic though.

                                                                                      4. 3

                                                                                        From the human perspective, what else is there? From the memetic perspective, of course, memes are evolving in a way akin to the RNA world (a selfish meme theory), but I’m not sure if there’s more to it than statistical mechanics and biological evolution.

                                                                                        1. 2

                                                                                          How does language actually work?

                                                                                          1. 2

                                                                                            Is this a rhetorical question?

                                                                                            Edit to clarify, I am genuinely curious.

                                                                                          2. 1

                                                                                            I’m not an expert in LLMs, but having read what I’ve read about them, the conclusion I’ve drawn is that “statistically predicting the next word” is a good description of their training process, but it’s not actually a good description of how they work. We don’t really understand how they work on a very deep level. From probing and whatnot, it seems like they’re building world models in there, but who knows. It’s like saying human brains are just “trying to reproduce their genes.” That is how the brain evolved, but it’s not what brains do.

                                                                                            1. 3

                                                                                              it’s not actually a good description of how they work

Is it factual? If not, is there something factually incorrect about this description?

                                                                                              1. 1

                                                                                                How they work seems to involve some kind of world model having been encoded into the system by the training: https://thegradient.pub/othello/.

                                                                                                1. 2

                                                                                                  I’m not an expert either, but it doesn’t seem to me your link contradicts the “they just predict the next word” assertion. It might be that the best way to predict the next word involves learning a superficial world model during training, but the way the text is constructed is still by picking the most likely word.
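Spelled out (the standard autoregressive formulation, not anything specific to the linked work): the model defines a distribution over the next token given everything generated so far, and the text is built by repeatedly sampling, or taking the most likely token, from it.

```latex
\[
  \Pr(w_1, \dots, w_n) \;=\; \prod_{t=1}^{n} \Pr(w_t \mid w_1, \dots, w_{t-1}),
  \qquad
  w_t \;\sim\; \Pr(\,\cdot \mid w_1, \dots, w_{t-1})
\]
```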

                                                                                          3. 5

                                                                                            This pretty much reflects my personal experience. I offload most of the process of typing to my autonomous nervous system, which (for most words) handles the spelling and predicts the next word. If my attention moves to something else, I don’t immediately stop typing but I do stop making sense after a few words. I can’t imagine how slow and painful it would be if I had to use the bit of my brain that I use for conscious thought for this.

                                                                                          1. 5

                                                                                            Another good resource:

                                                                                            Formal Algorithms for Transformers https://arxiv.org/abs/2207.09238

It contains pseudocode for the most-used algorithms, with clear input and output annotations.

                                                                                            1. 1

                                                                                              they really woke up and chose to go with 1 indexed arrays 💀

                                                                                              thanks though, this is a fantastic document. I’ll add it to my list of materials.

                                                                                              1. 3

:shrug: 1-indexing is the standard in mathematical notation. It has some advantages in terms of terseness. The main advantage of zero-indexing, IMO, is machine empathy and performance. Neither of those is relevant for pseudocode.

                                                                                                1. 2

                                                                                                  I found the opposite: when I was doing large calculations involving indexes, 0-indexing produces much terser mathematics. It’s just habit that most mathematicians use 1-indexing.
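A small worked example of the kind of calculation that comes up (my own illustration, not from either comment): flattening a 2-D index into row-major storage.

```latex
% 0-indexed: i = 0, ..., m-1 and j = 0, ..., n-1
\[
  \mathrm{flat}(i, j) = i \cdot n + j
\]
% 1-indexed: i = 1, ..., m and j = 1, ..., n
\[
  \mathrm{flat}(i, j) = (i - 1) \cdot n + j
\]
```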

                                                                                            1. 40

If all the author needed was a blog, maybe the problem is that his tech stack is way too big for his needs? A bunch of generated HTML files behind an Nginx server would not have required this amount of maintenance work.

Is caching images at the edge really necessary? So what if they take a little while to load? Just by not having to load a front-end framework and make 10 API calls before anything is displayed, the site will already load faster than many popular sites.

If the whole point is to have fun and learn stuff, then of course the busywork is the very point. Yet all this seems to be the very definition of non-value-added work.

                                                                                              1. 13

                                                                                                At the end he says

                                                                                                I know that I could put this burden down. I have a mentor making excellent and sober use of Squarespace for his professional domain - and the results look great. I read his blog posts myself and think that they look good! It doesn’t have to be like this. […]

                                                                                                And that’s exactly why I do it. It’s one of the best projects I’ve ever created.

                                                                                                So I think the whole point is to have fun and learn stuff.

                                                                                                1. 7

                                                                                                  Inventing your own static site generator is also a lot of fun. And because all the hard work is done outside the serving path, there’s much less production maintenance needs.

                                                                                                  1. 14

                                                                                                    Different people find different things fun

                                                                                                    1. 1

                                                                                                      IMO if you do it right, inventing your own static site generator is only fun for about half a day tops. Because it only takes a couple hours. :)

                                                                                                      1. 2

                                                                                                        Not if you decide to write your own CommonMark compliant Markdown parser :]

                                                                                                        1. 2

                                                                                                          Pandoc is right there.

                                                                                                          1. 1

                                                                                                            I’ve been seriously considering dropping Markdown and just transforming HTML into HTML by defining custom tags. Or finally learning XSLT and using that, and exposing stuff like transforming LaTeX math into MathML via custom functions.

                                                                                                    2. 9
                                                                                                      • Node.js or package.json or Vue.js or Nuxt.js issues or Ubuntu C library issues
• CVEs that force me to bump some obscure dependency past the last version that works in my current setup
                                                                                                      • Debugging and customizing pre-built CSS frameworks

                                                                                                      All of these can be done away with.

                                                                                                      I understand that the point may be to explore new tech with a purposefully over-engineered solution, but if the point is learning, surely the “lesson learned” should be that this kind of tech has real downsides, for the reasons the author points out and more. Dependencies, especially in the web ecosystem, are often expensive, much more so than you would think. Don’t use them unless you have to.

                                                                                                      Static html and simple CSS are not just the preference of grumpy devs set in their ways. They really are easier to maintain.

                                                                                                      1. 5

                                                                                                        There’s several schools of thought with regards to website optimization. One of them is that if images load quickly, you have a much lower bounce-rate (or people that run away screaming), meaning that you get more readers. Based on the stack the article describes, it does seem a little much, but he’s able to justify it. A lot of personal sites are really passion projects that won’t really work when scaled to normal production workloads, but that’s fine.

                                                                                                        I kinda treat my website and its supporting infrastructure the same way, a lot of it is really there to help me explore the problem spaces involved. I chose to use Rust for my website, and that seems to have a lot less ecosystem churn/toil than the frontend ecosystem does. I only really have to fix things when bumping packages about once per quarter, and that’s usually about when I’m going to be improving the site anyways.

                                                                                                        There is a happy medium to be found, but if they wanna do some dumb shit to see how things work in practice, more power to them.

                                                                                                        1. 4

A bunch of generated HTML files behind an Nginx server would not have required this amount of maintenance work.

                                                                                                              Sometimes we need a tiny bit more flexibility than that. To this day I don’t know how to enable content negotiation with Nginx the way I used to with Apache. Say I have two files, my_article.fr.html and my_article.en.html. I want to serve them under https://example.com/my_article: English by default, French if the user’s browser prefers it over English. How do I do that? Right now, short of falling back to Apache, I’m genuinely considering writing my own web server (though I don’t really want to, because of TLS).

                                                                                                              This is the only complication I would like to address; it seems pretty basic (surely there are lots of multilingual web sites out there), and I would have guessed the original dev, not being American, would have thought of linguistic issues. Haven’t they, or did I miss something?

                                                                                                          1. 4

                                                                                                                Automatic content negotiation sucks though? It’s fine as a default first-run behavior, but as someone who lived in Japan and often used the school computers, I can tell you that you really, really need a button on the site to explicitly pick your language instead of just assuming that the browser already knows your preference. At that point, you can probably just put some JS on a static page and have it store the language preference in localStorage or something.
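
                                                                                                                Something like this sketch would do it; the element IDs, the preferredLang storage key, and the /my_article.<lang> URL scheme are all made up here for illustration:

                                                                                                                    // Remember an explicit language choice so it wins over any guess next time,
                                                                                                                    // then jump to that version of the page.
                                                                                                                    function setLanguage(lang: "en" | "fr"): void {
                                                                                                                      localStorage.setItem("preferredLang", lang);
                                                                                                                      window.location.href = `/my_article.${lang}`;
                                                                                                                    }

                                                                                                                    // Wire the choice up to two plain buttons somewhere on the page.
                                                                                                                    document.getElementById("pick-en")?.addEventListener("click", () => setLanguage("en"));
                                                                                                                    document.getElementById("pick-fr")?.addEventListener("click", () => setLanguage("fr"));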

                                                                                                            1. 1

                                                                                                              There’s a way to bypass it: in addition to

                                                                                                              https://example.com/my_article
                                                                                                              

                                                                                                              Also serve

                                                                                                              https://example.com/my_article.en
                                                                                                              https://example.com/my_article.fr
                                                                                                              

                                                                                                              And generate a bit of HTML boilerplate to let the user access the one they want. And perhaps remember their last choice in a cookie. (I would like to avoid JavaScript as much as possible.)

                                                                                                              1. 1

                                                                                                                    If JS isn’t a deal breaker, you can make my_article a blank page that JS redirects to a language-specific page. You can use <noscript> to have it reveal links to those pages for people with JS turned off.
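
                                                                                                                    A rough sketch of that stub page’s script, assuming the same made-up preferredLang key and /my_article.<lang> layout as above:

                                                                                                                        // Send the reader to the version they picked earlier, or to the one the
                                                                                                                        // browser prefers, defaulting to English.
                                                                                                                        const stored = localStorage.getItem("preferredLang");
                                                                                                                        const lang =
                                                                                                                          stored === "fr" || stored === "en"
                                                                                                                            ? stored
                                                                                                                            : navigator.language.toLowerCase().startsWith("fr") ? "fr" : "en";
                                                                                                                        window.location.replace(`/my_article.${lang}`);
                                                                                                                        // The page's <noscript> block would simply list plain links to
                                                                                                                        // /my_article.en and /my_article.fr for readers without JavaScript.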

                                                                                                              2. 1

                                                                                                                Browsers have had multiple user profiles with different settings available, for more than a decade now (in the case of Firefox I distinctly remember there being a profile chooser box on startup in 2001–2).

                                                                                                                1. 2

                                                                                                                      Which is fine if you can actually make a profile to suit your needs. If you cannot make a profile, you are stuck with whatever settings the browser has, and what you get back may be gibberish in a local language you don’t understand.

                                                                                                                  1. 1

                                                                                                                    Look, the browser is a user agent. It’s supposed to work for the user and be adaptable to their needs. If there are that many restrictions on it, then you don’t have a viable user agent in the first place and there’s nothing that web standards can do about that.

                                                                                                                  2. 1

                                                                                                                    The initial release of Firefox was 2004. Did you typo 2011 or mean one of its predecessor browsers?

                                                                                                                    1. 2

                                                                                                                      Yeah I’m probably thinking of Phoenix.

                                                                                                                2. 3

                                                                                                                      There’s no easy way, AFAIK: you either run a Perl server to get redirects or add an extra module (although if you were doing that, I’d add the Lua module, which gives you much more freedom for these kinds of shenanigans).

                                                                                                                  1. 1

                                                                                                                    Caddy allows you to match HTTP headers, and you can probably achieve what you want with a bunch of horrible rewrite rules.

                                                                                                                    You can always roll your own HTTP server and put it behind Caddy or whatever TLS-capable HTTP server.
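
                                                                                                                        If it came to that, the negotiation logic itself is small. A minimal Node/TypeScript sketch (filenames are hypothetical, and the Accept-Language handling is deliberately crude: it ignores q-values and just checks which language is mentioned first):

                                                                                                                            import { createServer } from "node:http";
                                                                                                                            import { readFile } from "node:fs/promises";

                                                                                                                            // Serve /my_article from my_article.en.html or my_article.fr.html,
                                                                                                                            // preferring French only when the browser lists it ahead of English.
                                                                                                                            const server = createServer(async (req, res) => {
                                                                                                                              const accept = String(req.headers["accept-language"] ?? "").toLowerCase();
                                                                                                                              const fr = accept.indexOf("fr");
                                                                                                                              const en = accept.indexOf("en");
                                                                                                                              const lang = fr !== -1 && (en === -1 || fr < en) ? "fr" : "en";
                                                                                                                              const body = await readFile(`my_article.${lang}.html`);
                                                                                                                              res.writeHead(200, {
                                                                                                                                "Content-Type": "text/html; charset=utf-8",
                                                                                                                                "Vary": "Accept-Language",
                                                                                                                              });
                                                                                                                              res.end(body);
                                                                                                                            });

                                                                                                                            server.listen(8080); // TLS termination stays with Caddy/Nginx in front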

                                                                                                                    1. 1

                                                                                                                      You could put Apache behind Nginx; I’ve done that before, and I might do it again.

                                                                                                                      • I prefer nginx for high load; it’s great with static files.
                                                                                                                          • Apache config for some things (redirects, .htaccess, I think?) feels easier.

                                                                                                                          It’s been quite a while since I delved into these.

                                                                                                                  1. 2

                                                                                                                        The Vue 2-to-3 transition somehow exactly repeated the error of Python’s 2-to-3 transition. :-( The ecosystem seems to have snapped back faster than Python’s did, with only a two- or three-year loss of productivity, but it still sucks. The lessons from those transitions are really, really clear:

                                                                                                                    • An automated tool to move some stuff from version N to N+1 isn’t going to be good enough.
                                                                                                                    • You can break backwards compatibility, fine.
                                                                                                                    • Do not under any circumstances break forward compatibility.

                                                                                                                        Once forward compatibility is broken, you’ve slammed the brakes on the whole ecosystem. Instead of users upgrading to N+1, finding all the spots that use the deprecated API, and then fixing them, they have to somehow do the upgrade and the fixes simultaneously. It doesn’t work.

                                                                                                                    It also means all of your dependency roots need to upgrade first (and break all their dependents!) before you can upgrade.

                                                                                                                    I can see why it was a tempting move for Vue, since breaking forwards compatibility had real advantages in file size, but it was ultimately a big mistake.

                                                                                                                    1. 1

                                                                                                                      Did it have the same reasons that made Python 2 to 3’s transition necessary, though?

                                                                                                                      1. 1

                                                                                                                            Not really. Evan You rebuilt the core of Vue and added React-hooks-style composition functions, but there were also some breaking changes made purely to get a nicer or more logical API. One example off the top of my head: in Vue 2, if you have a ref named x on an HTML element in a loop, $refs.x is an array, but in Vue 3 you just aren’t allowed to do that. Instead there’s some more flexible system you can tap into and do other stuff with, but by default it won’t work.

                                                                                                                            Again though, it’s fine to break backwards compatibility as long as you keep forward compatibility, so the solution should have been:

                                                                                                                            • Figure out the new core and APIs
                                                                                                                            • Once those are stabilized, backport them to work with Vue 2.
                                                                                                                            • Encourage everyone to write code that is compatible with both Vue 2.x and Vue 3 during the transition period (a rough sketch of what that looks like is below).
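
                                                                                                                            That last step is roughly what the composition-API backport for Vue 2 later made possible. A sketch of a component written so the same source runs on either major version (only the import path differs, and the exact package name here is from memory):

                                                                                                                                // On Vue 2 with the backport installed; on Vue 3 the same names come from "vue".
                                                                                                                                import { defineComponent, ref } from "@vue/composition-api";

                                                                                                                                export default defineComponent({
                                                                                                                                  setup() {
                                                                                                                                    const count = ref(0);                          // reactive state
                                                                                                                                    const increment = () => { count.value += 1; }; // exposed to the template
                                                                                                                                    return { count, increment };
                                                                                                                                  },
                                                                                                                                });
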
                                                                                                                    1. 1

                                                                                                                      Science training doesn’t teach us where hypotheses come from.

                                                                                                                      Undergraduate science classes don’t. I remember graduate training to be a scientist (which is a different thing) certainly doing so. It was an oral tradition, mind you, taught in the setting of discussions and lab meetings. The character of that tradition has been studied (see things like Lakatos on research programs). More concretely, though, a number of foundations of quantum mechanics labs have started doing automated hypothesis generation for experiments. I can’t find the citations right now, though.

                                                                                                                      1. 11

                                                                                                                              I think the article would have benefited from being called “Writing code isn’t the hard part,” since that really seemed to be the main point. Once you have decomposed the problem and modeled solutions, writing the code is just a matter of translating that into whatever language you want (or need) to use.

                                                                                                                              About two years ago someone here recommended How to Design Programs and I started using it as the textbook for my Fundamental Programming Concepts course (the only FP course our school has so far), and hands-down the biggest benefit has been the Function Design Recipe (sketched below). I’ve started adapting it to my other classes, even the AP-level ones, with good results. Obviously it is tailored to an FP model, but there are similar things you can do in OOP too.

                                                                                                                              Anyway, it’s helped my students see that most of the work happens before you ever touch the keyboard. The way this article is titled and starts out makes it seem like the author is offended by “easy” languages or environments, and I don’t think that’s really the case. He just wants people to understand that the heavy lifting happens earlier in the process.
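
                                                                                                                              For anyone who hasn’t seen it, the recipe boils down to a fixed sequence of steps done before and around the body of the function. A compressed sketch, transplanted out of Racket into TypeScript with a toy function of my own choosing:

                                                                                                                                  // Step 1: signature and purpose statement.
                                                                                                                                  // initials : string -> string
                                                                                                                                  // Given a full name, return the upper-case first letter of each word.

                                                                                                                                  // Step 2: examples, written before the body (they double as tests).
                                                                                                                                  // initials("Ada Lovelace") === "AL"
                                                                                                                                  // initials("Grace")        === "G"

                                                                                                                                  // Step 3: the template suggested by the data (split the string into words,
                                                                                                                                  // process each word, combine the pieces), then Step 4: fill it in.
                                                                                                                                  function initials(fullName: string): string {
                                                                                                                                    return fullName
                                                                                                                                      .split(/\s+/)
                                                                                                                                      .filter((word) => word.length > 0)
                                                                                                                                      .map((word) => word[0].toUpperCase())
                                                                                                                                      .join("");
                                                                                                                                  }

                                                                                                                                  // Step 5: run the examples as tests.
                                                                                                                                  console.assert(initials("Ada Lovelace") === "AL");
                                                                                                                                  console.assert(initials("Grace") === "G");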

                                                                                                                        1. 2

                                                                                                                          Yes, the function design recipe is really useful. I was teaching a student a year ago, and even though we were in Python, I went through the same structured process with her.

                                                                                                                        1. 5

                                                                                                                          After reading this, my conclusion is: it’s a desktop environment, just like it’s always been. It’s a very customizable desktop environment, like it’s always been. Mixing and matching pieces was something you could do in KDE 1.x (which I used when it came out).

                                                                                                                          1. 1

                                                                                                                            That depends on whether we continue to consider aggregation for AI training to be fair use. If we start insisting that people be paid for the training data they create, then probably not much.

                                                                                                                            1. 2

                                                                                                                              Some people have even been anecdotally testing if these tools can do peer review.

                                                                                                                              Considering that ChatGPT has deep misunderstandings about things like molecular biology embedded in it, this is a complete nonstarter. I spent half an hour trying to get it to adjust those misunderstandings, and it flat out kept telling me that I was wrong.

                                                                                                                              1. 2

                                                                                                                                Are these likely to be common misunderstandings that it is amplifying?

                                                                                                                                1. 2

                                                                                                                                    Oh, very much so. It’s kind of like the relationship between pop science and the understanding of an actual researcher.