Threads for imadij

    1. 4

      The animations don’t seem to work for me (I tried enabling JS, still no luck)

      1. 8

        Firefox doesn’t support css view transitions.

        1. 4

          Forgive me for going off on a tangent but how do the people behind Bun make money or remain sustainable?

          1. 4

            In late 2022, they raised funding from VCs with support from Guillermo Rauch (CEO of Vercel), so they have some runway. At the time, the founder said they would offer hosting and cloud computing services after the stable release, but it seems things have shifted a bit.

            Bun’s long term goal is stated:

            to be a cohesive, infrastructural toolkit for building apps with JavaScript/TypeScript, including a package manager, transpiler, bundler, script runner, test runner, and more.

            So the goal is to capture the JS infra ecosystem, then take it from there. I imagine if the performance claims hold, the cost reduction alone will be enough for cloud companies like Vercel to keep investing and promote it further. Is it not pretty much the same as how other open source projects like nodejs are funded?

            1. 3

              The OpenJS Foundation, which supports nodejs development, is not VC-funded.

              1. 3

                I was talking about the second part where cloud companies like Vercel will be sponsoring its development and support the project. Not VC.

                The VC funding might be what got them here but it’s old and likely reaching its end.

          2. 6

            Honestly, the title is a bit clickbaity and as per the submission guidelines:

            We also tone down clickbait titles:

            • remove hyperbole and moralizing (“The reckless bug that caused the apocalypse” → “Debugging a null reference”)

            I would prefer something like “Typed Stack Traces in F#” so it would be just a touch more informative

            1. 5

              I wasn’t aware that’s now encouraged but you’re certainly right. This seems to be a recent change to submission guidelines (Sep 2024) and I missed the updates.

              I’m no longer able to edit the title for this story but will keep it in mind.

              1. 2

                If you click the “suggest” link under the title, and suggest your new title that way, the system will apply it if enough others make the same suggestion. I don’t know what “enough” is right now, but last time I looked it was a small number.

              2. 5

                I think one of the reasons people underestimate how good ad targeting can be without needing to listen to your mic is that other similar systems we consume aren’t on the same level. We build our image of what’s technically possible from experience with other products, where things like search or recommendations are terrible at times, so it’s easy to see how ad targeting feels paranormal by comparison and makes people raise their eyebrows.

                1. 16

                  An editor that can actually adjust to your way of typing and self adjust to increase typing speed. I have been working lately with researchers on this and it is still a lot to sort out.

                  I have no idea what this means, but everything else is covered by emacs, and has been for 40-45 years.

                  The modern JavaScript-based editors are not bad too, on a fast machine, but emacs is still fine on a sub-GHz (or even sub 100 MHz) Arm or RISC-V SBC or a soft core in an FPGA. “Eight Megabytes And Constantly Swapping” they used to say – yeah, ok, but the $3 Milk-V Duo has 64 MB RAM and a 1 GHz CPU.

                  1. 12

                    I wish the author good luck, although I personally think that it’s the language server area of code editors that should be improved, not the text editing part, or the fact that some new editor has some kind of a novel approach to keyboard shortcuts.

                    Remote development is also a good candidate for research and novel implementations – VSCode resolves it pretty well (from the UX side, because at the protocol level it seems like a mess), when it works, but the problem is that it still doesn’t work for lots of use cases.

                    I treat the text editing part as a completely solved problem, but LSP/remote are still a mess. Yet people for some reason focus on the first.

                    1. 3

                      although I personally think that it’s the language server area of code editors that should be improved, not the text editing part

                      +1 on this thought. I don’t ever feel slowness of editing code because of anything having to do with the render loop. The slow part is typing, expecting an auto-complete pop-up, but then nearly getting to the end of a line and THEN having the editor fill in because the LSP finally responded.

                      1. 1

                        This is one of my favorite things about JetBrains IDEs: autocomplete / intelligence is faster than in any editor I’ve used, and almost always does a significantly better job than whatever LSP implementation might be available. Closed source is a bummer, and it would definitely be exciting to see better open source alternatives, though!

                    2. 2

                      everything else is covered by emacs

                      I wouldn’t say so. There’s a lot of room for how to interpret those ideas and then a heck more room left for the execution which makes it impossible to draw any conclusions at this stage.

                    3. 20

                      I am very confused as to what the article is actually trying to say. It seems to be a wishlist without any actual details.

                      1. 4

                        The author already has his own take on Terminal emulators: Rio Terminal. So I’m reading this post as their next project announcement to implement his wish-list. And a call to action if any of the points resonate with others.

                        1. 1

                          I am very confused as well, but mostly because of said wishlist, since to me all those points are already present in quite a few popular (or not) editors.

                          1. 1

                            all those points are already present

                            There’s one item that I can’t quite picture. Can you share an example of an editor that

                            adjust to your way of typing and self adjust to increase typing speed

                            I’d like to see what the author is aiming for, especially since they mention working with researchers on this.

                            1. 2

                              Seems like my brain faulted and I somehow missed that point, mea culpa. I don’t recall any editor that “adjusts to the way of typing” (which I assume is learning your coding/typing pattern to offer suggestions and fly-checks once you’re done typing or in-between longer periods of inactivity), but the 2nd part of that point is either too vague or it just means that the editor needs to be faster at “typing”, which, depending on the editor, is already a solved problem (though it depends on more factors than just the editor, as the machine running the whole stack will contribute to performance).

                        2. 8

                          I’ve tried to read through the various posts but I really don’t understand the controversy. Guilt by association is unfortunate but it doesn’t make it any less of a fact of life especially when nation states are involved. I have yet to see any new or interesting takes on that subject which makes this feel more like run of the mill lkml drama with a fresh coat of geopolitics.

                          1. 59

                            This is the right decision and it has nothing to do with “US law” as some of the lwn people seem to be talking about. Russia is a dictatorship with sophisticated state-powered cyberwarfare capabilities. Regardless of whether a Russian-based maintainer has malicious intent towards the Linux kernel, it’s beyond delusional to think that the Russian government isn’t aware of their status as kernel developers or would hesitate to force them to abuse their position if it was of strategic value to the Russian leadership. Frankly it’s a kindness to remove them from that sort of position and remove that risk to their personal safety.

                            1. 39

                              It may or may not have been the right decision, but it was definitely the wrong way to go about it. At the very least there should have been an announcement and a reason provided. And thanks for their service so far. Not this cloak and dagger crap.

                              1. 19

                                Indeed, this was quite an inhumane way to let maintainers with hundreds of contributions go. This reply on the ML phrases it pretty well:

                                There is the form and there is the content – about the content one
                                cannot do much, when the state he or his organization resides in gives
                                an order.
                                
                                But about the form one can indeed do much. No "Thank you!", no "I hope
                                we can work together again once the world has become sane(r)"... srsly,
                                what the hell.
                                

                                Edit: There is another reply now with more details on which maintainers were removed, i.e. people whose employer is subject to an OFAC sanctions program - with a link to a list of specific companies.

                                1. 19

                                  I hope we can work together again once the world has become sane(r)

                                  This would be a completely inappropriate response because it mischaracterizes the situation at hand: if the maintainers want to continue working on Linux, they only have to quit their jobs at companies producing weapons and parts used to kill Ukrainian children. It has nothing to do with the world being (in)sane, and everything to do with sanctions levied against companies complicit in mass murder.

                                  1. 2

                                    it has everything to do with sanity or lack thereof, when such a standard is applied so unevenly

                                2. 13

                                  Yes, the decision is reasonable whether or not it is right, but the communication and framing is terrible. “Sorry, but we’re forced to remove you due to US law and/or executive orders. Thanks for your past contributions” would have been the better approach.

                                3. 61

                                  This is true of quite a few governments, including those you think are friendly, and it is a huge blind spot to believe otherwise. Dictatorship doesn’t have anything to do with it, it isn’t as though these decisions are made right at the top.

                                  1. 5

                                    Dictator, you say? I chuckled. Linus is literally a “BDFL”.

                                    Maybe we’ll eventually see an official BRICS fork of the Linux kernel? Pretty sure China has been working on it.

                                  2. 21

                                    Do you have the same reaction to contributions from US-based companies that have military contracts? While the US isn’t a dictatorship, the security and foreign policy apparatuses are very distant from democratic feedback.

                                    1. 0

                                      much more distant than russia’s in fact

                                    2. 16

                                      Regardless of whether a Russian-based maintainer has malicious intent towards the Linux kernel, it’s beyond delusional to think that the Russian government isn’t aware of their status as kernel developers or would hesitate to force them to abuse their position if it was of strategic value to the Russian leadership.

                                      It’s hard to single out Russia for this in a post-Snowden world. Not to mention that if maintainers can be forced to do something nefarious, then they can do the same thing of their own will or for their own benefit.

                                      Frankly it’s a kindness to remove them from that sort of position and remove that risk to their personal safety.

                                      Did you hear this from the affected parties?

                                      1. 5

                                        The Wikimedia Foundation has taken similar action by removing Wikipedia administrators from e.g. Iran as a protective measure (sorry, don’t have links offhand), but even if that’s the reason, the Linux actions seem to have a major lack of compassion for the people affected.

                                        1. 35

                                          It wasn’t xenophobia. The maintainers who were removed all worked for companies on a list of companies that US organizations and/or EU organizations are prohibited from “trading” with.

                                          The message could have (and should have) been wrapped in a kinder envelope, but the rationale for the action was beyond the control of Linus & co.

                                          1. 3

                                            Thank you for the explanation, makes sense as is common and compatible with sanctions to other countries. I was replying to the comment above mostly.

                                            1. 2

                                              This is what Hangton Chen had to say about this:

                                              Hi James,

                                              Here’s what Linus has said, and it’s more than just “sanction.”

                                              Moreover, we have to remove any maintainers who come from the following countries or regions, as they are listed in Countries of Particular Concern and are subject to impending sanctions:

                                              Burma, People’s Republic of China, Cuba, Eritrea, Iran, the Democratic People’s Republic of Korea, Nicaragua, Pakistan, Russia, Saudi Arabia, Tajikistan, and Turkmenistan. Algeria, Azerbaijan, the Central African Republic, Comoros, and Vietnam. For People’s Republic of China, there are about 500 entities that are on the U.S. OFAC SDN / non-SDN lists, especially HUAWEI, which is one of the most active employers from versions 5.16 through 6.1, according to statistics. This is unacceptable, and we must take immediate action to address it, with the same reason

                                              1. 6

                                                did you just deliberately ignore the fact that huawei is covered by special exemption in the sanctions?

                                          2. 0

                                            The same could be said of US contributors to Linux, even moreso considering the existence of National security letters. The US is also a far more powerful dictatorship than the Russian Federation, and is currently aiding at least two genocides.

                                            The Linux Foundation should consider moving its seat to a country with more Free Software friendly legislation, like Iceland.

                                            1. 15

                                              The Linux Foundation should consider moving its seat to a country with more Free Software friendly legislation, like Iceland.

                                              I’m Icelandic and regret I only have two eyebrows to raise at that.

                                              1. 3

                                                it’s an incredibly low bar that Iceland has to clear, as this story demonstrates

                                                1. 5

                                                  Please expand on how Iceland would act to be seen as a more FLOSS friendly place, as opposed to for example the United States.

                                                  1. 2
                                                    1. 8

                                                      In other words, refusing to comply with international sanctions. This is in fact an incredibly high bar to clear for Iceland. It would require the country to dissociate itself from the Nordic Council, the EEA, and NATO.

                                                      1. 1

                                                        a kernel dev quoted in the Phoronix article wrote:

                                                        Again, we’re really sorry it’s come to this, but all of the Linux infrastructure and a lot of its maintainers are in the US and we can’t ignore the requirements of US law. We are hoping that this action alone will be sufficient to satisfy the US Treasury department in charge of sanctions and we won’t also have to remove any existing patches.

                                                        that made me think it was due to US (not international) sanctions and that the demand was made by a US body without international jurisdiction. what am I missing?

                                                        1. 4

                                                          Without a citation of which sanction they’re referencing it’s really hard to say. I assumed this sanction regime was one shared by the US and the EU, and that Iceland would follow as a member of NATO and the EEA. If it is specific to the US, like their continued boneheaded sanctions against Cuba, then basing the Linux foundation in another country would prevent this specific instance (a number of email addresses removed from a largely ceremonial text file in an open source project) from happening again.

                                                          Note however that Icelandic law might impose other restrictions on the foundation’s work. The status of taxation as a non-profit is probably different.

                                                          1. 2

                                                            even if it has to do with international sanctions, their interpretation and enforcement seems to have been particular to the US. it reeks of “national security” with all the jackbootery that comes with it.

                                          3. 3

                                            I had been eagerly waiting for this series to finish! I will read the posts again. Thank you!

                                            Could you please consider adding an RSS/Atom feed to your blog?

                                            1. 2

                                              That’s a good idea. Could you suggest a blog that does a good job of this? I’ll copy whatever they’re doing :)

                                              1. 1

                                                Ok, I’ve added https://jacko.io/rss.xml and linked to it from my homepage. Please let me know if it works with your reader.

                                                1. 2

                                                  it does, thank you!

                                                  I will be looking forward to the future posts

                                              2. 8

                                                Related blog post that adds more context:

                                                Introducing runes - Rethinking ‘rethinking reactivity’ (Sep 2023)

                                                1. 3

                                                  This might put the nail in the coffin of the question of whether you should put the whole post’s content in the feed (I always thought a site can offer a better reading experience with better styling, syntax highlighting, etc., so a summary/abstract/intro was a better feed body). If the bots can just read all the content from the feed, they will.

                                                  1. 44

                                                    As an RSS user, it makes me sad when I have to click through to the site to read a full post. The whole point of syndication is so that I can read all my news feeds in one place, and having the full text in my reader means I can read offline too.

                                                    1. 8

                                                      This feels more like a limitation of the reader. The entry contains a link to the content. A reader that wants to support offline reading can grab the linked page and cache it. If you want to be able to display images and so on offline you already need to fetch and cache linked things. Keeping a short summary in RSS with a link will make things much faster for readers that don’t support offline reading.

                                                      1. 5

                                                        There is also the added benefit of less downloading of old news. Most feed readers will respect ETag and related headers, but when something changes, the entire feed must be downloaded, even though only a single article has been added/updated.
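The conditional-request behaviour described above can be sketched roughly as follows. This is a minimal illustration rather than any particular reader's implementation: `CachedFeed`, `conditional_headers`, and `apply_response` are hypothetical names, and the actual HTTP call is left out so the logic stays self-contained. Only the header names (`ETag`, `Last-Modified`, `If-None-Match`, `If-Modified-Since`) and the 304 status come from HTTP itself.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class CachedFeed:
    """What a (hypothetical) reader remembers about a feed between polls."""
    body: bytes
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def conditional_headers(cached: Optional[CachedFeed]) -> dict:
    """Build request headers that allow the server to answer 304 Not Modified."""
    headers = {}
    if cached is not None:
        if cached.etag:
            headers["If-None-Match"] = cached.etag
        if cached.last_modified:
            headers["If-Modified-Since"] = cached.last_modified
    return headers


def apply_response(
    cached: Optional[CachedFeed],
    status: int,
    body: bytes,
    response_headers: Mapping[str, str],
) -> CachedFeed:
    """Return the feed state to keep after one poll."""
    if status == 304 and cached is not None:
        # Nothing changed: the server sent no body, so keep the cached copy.
        return cached
    if status == 200:
        # Something changed: the whole feed document is transferred again,
        # even if only one entry was added or edited.
        return CachedFeed(
            body=body,
            etag=response_headers.get("ETag"),
            last_modified=response_headers.get("Last-Modified"),
        )
    raise ValueError(f"unexpected status {status}")
```

The 200 branch is where feed size matters: a summary-only feed keeps that full re-download small, while a full-text feed re-sends every article body whenever any single entry changes.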

                                                    2. 11

                                                      If a feed contains entire posts, some people will prefer to read the post with its full styling on the original site.

                                                      If a feed contains a summary, some people will prefer to read the post within their reader, which they will configure to fetch and process posts from the original site.

                                                      No matter which nails or coffins exist, people will invent and use new ones. Cloudflare is part of an arms race, alongside bots, human nature, convenience, and everything else.

                                                      1. 5

                                                        If a feed contains entire posts, some people will prefer to read the post with its full styling on the original site.

                                                        They can still visit the website as usual, their workflow shouldn’t be affected, no? Idk, it seems both crowds would be satisfied by having the full content in the feed.

                                                        I only heard this argument from website owners who do something special with styling and want to make their audience live the full experience.

                                                        1. 1

                                                          I only heard this argument from website owners who do something special with styling and want to make their audience live the full experience.

                                                          Ugh yuck. I use Dreamwidth for my feed reader, and one of its unfortunate features is that it fails to completely strip inline styles from feeds, so the occasional article looks really ugly and out of place. It’s like email with too much HTML, jarringly inappropriate like someone typeset the Economist in the style of 1990s WiReD.

                                                        2. 2

                                                          No matter which nails or coffins exist, people will invent and use new ones.

                                                          That’s good, real good!

                                                      2. 5

                                                        As someone who built a few personal sites in my youth and then ran a few personal blogs I think about setting up a site again sometimes, but:

                                                        • privacy is more of a concern these days
                                                        • time is more of a concern these days
                                                        • the exercise of getting attention on the web just feels like such a slog
                                                        1. 2

                                                          privacy, time, attention on the web

                                                          These are all valid reasons, but aren’t you effectively doing just that here? If not on other platforms as well

                                                          1. 1

                                                            It’s a matter of effort vs expected reward. Comments on lobsters are low effort, with a moderate chance at getting attention/reward. Personal sites are moderate to high effort, with a low chance at getting attention. This is just my personal calculus, of course. (The sting of putting things out there and getting no response factors in as well!)

                                                        2. 3

                                                          Not the low-level things you listed, but maybe also a bit niche. Among other usual software stuff the core repeating topic (accompanying me for more than 10 years) in my blog is using natural language processing to help me learn foreign languages (please don’t expect too much now, I’m just trying this and that, it’s all very high-level).

                                                          I think the first article was a very, very simple scheme to tokenize Japanese. I don’t know anymore what I wanted to use it for.

                                                          There’s really so much you could do with software nowadays:

                                                          The articles are not very well structured, you’ll probably find them somewhere in Natural Language Processing category or nlp tag.
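The comment does not say what the "very, very simple scheme" was. One common naive approach, assumed here purely for illustration, is to split text wherever the script changes (kanji, hiragana, katakana). The function names below are hypothetical, and this is not a substitute for a real morphological analyser, since it cannot separate words written in the same script.

```python
from itertools import groupby


def script_of(ch: str) -> str:
    """Classify one character by Unicode block (a rough approximation)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:  # includes the long-vowel mark U+30FC
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:  # CJK Unified Ideographs
        return "kanji"
    if ch.isspace():
        return "space"
    return "other"


def tokenize(text: str) -> list:
    """Split text into runs of characters that share the same script."""
    tokens = []
    for script, run in groupby(text, key=script_of):
        if script != "space":  # whitespace only separates tokens
            tokens.append("".join(run))
    return tokens


print(tokenize("私はコーヒーを飲む"))
```

Splitting on script boundaries works tolerably for particles and loanwords but breaks inflected words apart (the verb in the example is split between its kanji stem and hiragana ending), which is one reason dictionary-based tokenizers are normally used instead.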

                                                          1. 1

                                                            Thank you for sharing. I love how you covered a lot of different topics over the years.

                                                          2. 5
                                                            1. 1

                                                              Those are some great suggestions. Though you forgot one: yours ;)

                                                              Do you have more by any chance? It doesn’t have to be about the topics I mentioned, could be some of your personal favorites.

                                                            2. 4

                                                              Here’s a very different site for you. https://destevez.net/

                                                              Daniel Estevez writes a wonderfully detailed blog about wireless signals and the signal processing involved in working with them. From satellite communications to radio astronomy I find it very interesting even when the math gets deeper than I can easily follow. He shares the raw data and Python code for most posts

                                                              1. 1

                                                                Great recommendation! I had to look up things right off the bat

                                                                Even though this isn’t a field I’m exactly interested in, it is the type of variation I wanted to explore. Something where I’m (a little) out of my depth and get introduced to new things constantly.

                                                              2. 7

                                                                So this paper is about testing the performance of LLMs on:

                                                                1. Resolving already solved issues
                                                                2. From 12 popular repositories
                                                                3. Only written in Python

                                                                This is a very limited scope, which is critical for interpreting the results correctly, but it isn’t highlighted enough outside of the paper:

                                                                • It’s never mentioned explicitly in Github’s README
                                                                • On the website it’s only mentioned at the very end of the page in the about section

                                                                People are likely misinterpreting the results because of this

                                                                Also, considering the issues were already solved, were the models trained on datasets that include the solution? How does it fare with new green issues?

                                                                1. 1

                                                                  Also, considering the issues were already solved, were the models trained on datasets that include the solution? How does it fare with new green issues?

                                                                  In the paper (page 7), they split issues into “before 2023” and “after 2023”, and found no difference.

                                                                2. 44

                                                                  I watched the video attached to the email and even for me it was really painful to watch. It seems that part of the audience judged the presented solutions before they even had a chance to hear what the presenter had to say and wanted to undermine the presenter’s competence.

                                                                  1. 6

                                                                    Rust is great, but promoting it in such a way to this crowd is not a smart strategy. Rust is not really new anymore, it’s almost 10 years old (since 1.0). Linux maintainers have been aware of it for a while and have their own opinions and concerns. I think the engineer’s reaction is a buildup from past interactions.

                                                                    Regardless of what the discussion is about, it’s very natural to oppose change. If you beat the drum too hard, people will refuse whatever you’re offering. A better approach to make them adopt your ideas would be to manifest them in an independent project and show them how nice it is.

                                                                    1. 55

                                                                      A better approach to make them adopt your ideas would be to manifest them in an independent project and show them how nice it is.

                                                                      The presenter is asking of the existing maintainers “I want to know what the semantics of this API are, so that we can properly model them, we will maintain the bindings” and the response is “you can’t make me care about Rust, if I break the API semantics, you need to adapt”, which was a side-comment that was already preempted in the question itself.

                                                                      1. 34

                                                                        I wonder whether a root problem here is that the exact semantics of the decades-old mass of C are not known to the C maintainers either, so they can’t tell the Rust people what the semantics are nor know whether a change in the C code will change them. I wonder whether the C maintainers understand this tacitly and only tacitly, such that what seems to the Rust proponents like a reasonable request (tell us the semantics) seems to the C maintainers like an absurd imposition but neither side quite realizes why the request so agitates the C side.

                                                                        1. 22

                                                                          Crass thought, but maybe the C maintainers know that they have no idea how the code actually works, but don’t want to admit it? They’d rather just continue to play whack-a-mole when bugs crop up.

                                                                          1. 11

                                                                            That’s needlessly inflammatory. We who use and advocate for Rust need to do all we can to de-escalate, including assuming the best in others.

                                                                            1. 26

                                                                              I am physically incapable of assuming the best in someone who behaves the way they do in that video. Though I should say that I am not assuming the thing I posted above, it’s just a terrible possibility that popped into my head. I can think of other explanations for their behavior, though none of them are good. For example, maybe Ted Ts’o really really really hates Rust, and is being obstructionist and caustic to intentionally stall the Rust for Linux effort enough to kill it.

                                                                              1. 9

                                                                                How do you assume the best in people who are demonstrating their worst? That’s going to get you nowhere but burning everyone out who tries to improve things somewhat (insert “improve society somewhat” comic meme here for a moment of levity)

                                                                              2. 4

                                                                                I do think that (or rather a less hyperbolic version of it¹) is also plausible, though I chose not to say it in my previous comment.

                                                                                ¹ the maintainers literally having no idea how the code works seems unlikely

                                                                      2. 10

                                                                        Absolutely. This is also a great attitude to have in general. Understand how the objects around you work, take things apart, and see how they can be implemented in different ways and what the tradeoffs are. The knowledge and experience you gain will yield compounded benefits and likely come in handy down the line many times over, often unexpectedly.

                                                                        I’m always in disbelief when I see people discouraging experimentation, which seems to be a prevalent trait in the software industry for some reason. Now that I think about it, it might be because side projects here are more visible and publicly advertised as a side effect of FOSS culture, while in other industries you only hear about them if you’re a close friend, especially at early stages, so the environment tends to be more supportive in comparison.

                                                                        1. 14

                                                                          I’m always in disbelief when I see people discouraging experimentation, which seems to be a prevalent trait in the software industry for some reason.

                                                                          It kills me when I see people respond to a language announcement by complaining that there are too many languages. Most languages aren’t going to become common so it’s really not a big deal if people make something new. Just let people have some fun!

                                                                          1. 20

                                                                            This also leads to programming language creators feeling like they have to justify creating a new language. I’ve seen it on introductions to languages where the creator explicitly answers the question, “Why Another Programming Language?”.

                                                                            It’s like saying, “There is already plenty of music, why write a new song?”.

                                                                            1. 4

                                                                              It’s a classic misconception of “not reinventing the wheel.”

                                                                            2. 5

                                                                              I’m working on my own lisp interpreter. Along the way I came up with a seemingly novel way to embed arbitrary data into ELF files and have the kernel load it automatically. Wrote a blog post about it to explain the implementation details, and showed off my language a little bit by demonstrating how I can use it to embed arbitrary code into the interpreter and have it become a self-contained application. It was shared here, and the top comment just expressed surprise at the fact someone had made yet another lisp. Someone replied “but it’s so fun”.

                                                                              I really am having lots of fun. I feel like this is the project of a lifetime for me, it’s the kind of thing I’ve always wanted to make all along. I basically want to remake all of Linux user space in my image. I probably won’t make it, I’m just one guy. There is no doubt that it’s fun though. I definitely encourage everyone to try it.

                                                                              1. 2

                                                                                It depends a lot on the point of the new language. There are two good reasons for creating a new language:

                                                                                • It’s fun, and you want to either enjoy yourself, learn something, or both.
                                                                                • You have a novel idea for how to do programming better and your language is there to showcase these ideas.

                                                                                Some people will object to the first one, but (I hope) not that many. The entire esoteric languages community is building things in this space.

                                                                                The problem with a lot of new-language announcements is that they’re presented as if they’re the second, but they’re not actually either. It’s a slightly new syntax on old ideas, which is fine as a thing that you do for fun, but the announcement is trying to get people to use the language.

                                                                                1. 2

                                                                                  I don’t have a problem with people pushing back on claims that are too strong for what is presented. That’s just healthy discourse. I have an issue with people whose knee-jerk reaction is to complain about a new language or really demand a justification for its existence.

                                                                                  1. 1

                                                                                    There is nothing wrong with trying to get people to use the language you made. If people like it, if the ideas make sense, they should use it.

                                                                                    No one needs permission to create a new language or to show it to people.

                                                                                    1. 1

                                                                                      Back when I was in university, I had a lot of ideas for that second one. I implemented some of them, and they usually turned out to be less good in practice than in my head (I spent a lot of time on a weird concatenative/actor hybrid). But doing those implementations taught me a lot about language implementation and design.

                                                                                      I don’t do language development at the moment - my job doesn’t call for it, and I haven’t had the motivation to code recreationally for the last year and a half. But the last language project I did was solidly the first kind: I made a weird little Lisp just for my own enjoyment, and all the exploration I did of unusual implementation details was entirely because they seemed like they’d be fun to tinker with, not because I thought they’d ever amount to The Next Big Thing.

                                                                                  2. 1

                                                                                    Yes, and part of the process of creating a new language is also the enjoyment! PL is also a very deep field; there are so many subfields to learn and experiment in. Sometimes the best way to do so is to create many small programming languages and throw them away.

                                                                                    Aside: I believe that learning how to develop programming languages can be valuable in industry. We encounter internal DSLs more often than we realize, and they can be decent ways of modeling and solving domain-specific problems in industry.

                                                                                    1. 2

                                                                                      You also learn how to metaprogram by doing this, which can be quite valuable, IME

                                                                                    2. 1

                                                                                      Made me think of this classic:

                                                                                      I’m doing a (free) operating system (just a hobby, won’t be big and professional like gnu) for 386(486) AT clones.

                                                                                    3. 8

                                                                                      As discussed before, the problem with “opt-out” here is that the bad actors will just create novel “opt-out” requirements to stay one step ahead. Enforced refusal to “opt-in” is the only real strategy.

                                                                                      1. 2

                                                                                        I shared this repo specifically for the IP ranges file. I was thinking of putting one together if none existed. The one in the repo is a little bare-bones at the moment, mainly focused on OpenAI, but I promoted it here hoping to garner more contributions. Something like this is better and more effective when crowdsourced.

                                                                                        Blocking by IP address seems like the only practical approach to fight back. Can you expand on what an enforced refusal to “opt-in” would look like?
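
                                                                                        For anyone wondering how a ranges file like that could be consumed, here is a minimal sketch, assuming the file boils down to a list of CIDR strings. The names `BLOCKED_RANGES`, `parse_ranges` and `is_blocked` are hypothetical, and the ranges are reserved documentation blocks, not real crawler addresses.

```python
import ipaddress

# Hypothetical CIDR ranges for illustration only (reserved documentation
# blocks), not the contents of the repo's actual file.
BLOCKED_RANGES = ["192.0.2.0/24", "198.51.100.0/25"]


def parse_ranges(cidrs):
    """Turn CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_blocked(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


networks = parse_ranges(BLOCKED_RANGES)
print(is_blocked("192.0.2.17", networks))
print(is_blocked("203.0.113.5", networks))
```

                                                                                        In practice the check would more likely live in a firewall or reverse proxy rule than in application code, but the matching logic is the same.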

                                                                                        1. 1

                                                                                          Nothing fancy: “opt-in” is another word for positive consent, meaning they may not use my data unless they ask and I say yes. “Enforced” just means legal entities actually enforcing laws & protections, which may be a hopeless dream.

                                                                                          edit: In other words, I think this is a social problem at least as much as a technical one, and no scripts or tags will stop these bad actors from stealing or attempting to steal.

                                                                                        2. 2

                                                                                          Enforced refusal to “opt-in” is the only real strategy.

                                                                                          What kind of enforcement do you imagine would suffice?

                                                                                          As much as I’d like something to suffice, I’m not sure even end-to-end Digital Rights Management on all content on the Internet would be enough to enforce this — I don’t think the cost of cameras to copy content through the analog hole and an extra step of optical character recognition would be much compared to what these AI companies are already spending.

                                                                                          1. 1

                                                                                            I think it would be quite interesting to serve up quasi-randomized content so as to poison the models. I’m not sure how exactly one would target it in such a way that the AI companies wouldn’t get wise to it and it wouldn’t negatively impact the user experience, but one can dream.

                                                                                            1. 1

                                                                                              Maybe quasi-randomized factually incorrect content to poison the models with. That would be a great way to fight back. They’d probably start blacklisting domains doing that, but you could say that’s “mission fucking accomplished”.

                                                                                              I’m not sure how to do that in a way that doesn’t impact actual human visitors (and search bots?) though.

                                                                                        3. 5

                                                                                          I saw the domain on the link and thought at first this was about GitHub specifically. How do I opt-out all my repos on GitHub? Is that even possible?

                                                                                          1. 3

                                                                                            I believe the only option is private repositories.

                                                                                            GitHub says the following: What data has GitHub Copilot been trained on?

                                                                                            GitHub Copilot is powered by generative AI models developed by GitHub, OpenAI, and Microsoft. It has been trained on natural language text and source code from publicly available sources, including code in public repositories on GitHub.

                                                                                            Unfortunately, even if GitHub had some opt-out mechanism, anything public will be scraped and accessible to other parties. The general consensus now is to grab the data, sort out any issues later.

                                                                                            In some alternative universe, GitHub would’ve taken matters into its own hands to defend open source against AI scraping bots on the ground and in courts. This might have been possible with old GitHub and would probably have been a win for them publicity-wise and good for business. But we live in a different universe, where they’re now advocating that anything public is fair game.

                                                                                            1. 4

                                                                                              Much as I am critical of AI-generated slop, I believe that training on FLOSS licensed code is probably the least objectionable from a legal/copyright point of view.

                                                                                              1. 1

                                                                                                At least if they provided the source code + data for their models.

                                                                                              2. 3

                                                                                                Which also means that you can’t opt-out your self-hosted code from being taken by MS / OAI / GitHub.