Threads for tedu

    1. 12

      Let’s look at a few of these. 3 out of the top 6 would have been mitigated by standard rust features:


      1. CVE-2023-34362: Progress MOVEit Transfer

      https://www.horizon3.ai/moveit-transfer-cve-2023-34362-deep-dive-and-indicators-of-compromise/ The function that extracts the X-siLock-Transaction header to compare its value to folder_add_by_path has a bug. It will incorrectly extract headers that merely end in X-siLock-Transaction, so an attacker can trick the function into passing the request on to machine2.aspx by providing a header such as xX-siLock-Transaction=folder_add_by_path, while additionally providing the correctly formatted header with their own arbitrary transaction to be executed by the machine2.aspx endpoint.

      Rust’s type system, specifically http::header::HeaderName, would have caught this error: the headers passed to a web application are not treated as arbitrary strings, but compared against known-good header names. “xX-siLock-Transaction” would have been ignored. Of course you could write brain-dead code that converts the header name to a string and then does the suffix comparison, but the idiomatic use of this would lead you away from that path.


      1. CVE-2023-33246: Apache RocketMQ

      https://attackerkb.com/topics/YBI7e7fY0a/cve-2023-33246 The command that gets executed by Runtime.getRuntime().exec is created by the method FilterServerManager.buildStartCommand(). We see from the last else block that if the system we’re exploiting is not Windows, the command run will be sh %s … where %s gets substituted with getRocketmqHome(), a user-controlled parameter set when the user sends a request to update the broker configuration.

             } else {
                 return String.format("sh %s/bin/startfsrv.sh %s",
                    this.brokerController.getBrokerConfig().getRocketmqHome(),
                     config);
             }
      

      Rust’s std::process::Command accepts arguments individually rather than as a full command string like this, so the code would have looked like:

      Command::new("sh")
          .arg(format!("{rocketMqHome}/bin/startfsv.sh"))
          .arg(config)
          .spawn();
      

      There never would have been a chance for the command to treat the first argument as multiple arguments or as a second command. Can you work around that? Probably, but you’d have to try hard to write code that breaks in the same way.


      1. CVE-2023-22515: Atlassian Confluence

      https://attackerkb.com/topics/Q5f0ItSzw5/cve-2023-22515/rapid7-analysis While not obvious from reading the diff above, we must note that the class com.atlassian.confluence.core.actions.ServerInfoAction extends the class com.atlassian.confluence.core.ConfluenceActionSupport. This will be important during exploitation.

      Rust’s lack of inheritance suggests that there’s less likelihood of accidentally inheriting unintended behavior from a parent class - emulating inheritance via Deref polymorphism may still lead to this, but that would be less common and is a known antipattern.

      We know we can leverage the XWork feature of supplying HTTP parameters to call setter methods on objects. We need to identify an unauthenticated endpoint whose Action object also exposes a suitable get method that will allow us to access the application configuration. … We can see this class has a getter method getBootstrapStatusProvider which returns the BootstrapStatusProviderImpl instance we are looking for. BootstrapStatusProviderImpl, in turn, has a getter method getApplicationConfig to return the application’s configuration. … Finally, we can see the class com.atlassian.config.ApplicationConfig implements the setter method setSetupComplete.

      Rust’s general approach to mutability would have helped here. For the most part, if you want a mutating setter you need to make that explicit. The calls down the tree would generally have been obviously tagged with mutability indicators (get_bootstrap_status_provider_mut(), get_application_config_mut(), and the set_setup_complete method would only compile with a mutable reference to the application config). Here there would be no reason for the server info action to mutably borrow the application config.


      In summary, the article doesn’t go deep enough into the rust idioms and standards that would reduce some of these problems. I didn’t look at all the mentioned vulnerabilities - just 3 of the top 6 - so I’m sure there are more than a few, perhaps even most, that rust doesn’t help with.

      1. 7

        Every web framework in the world has a function to extract an exact match header.

        1. 10

          You could say that the fundamental problem in that particular CVE is that the MOVEit app writes its own HTTP handling, and I’d definitely agree with you. The rust-specific point I’m making here is that there is a blessed library (http) that’s used by basically every rust web framework for header parsing. This makes doing the wrong thing difficult to justify in any code.

          The important part isn’t that the ability to perform an equality comparison is there, but that the idiomatic way to get the header requires creating a HeaderName and using that to get the header. Many web frameworks treat the header names as strings which leads to the potential for making this sort of error more generally.

        2. 4

          Yes. The rust community seems to have an unusually high standard for security and code quality. That means that the defaults you’ll reach for are less likely to bite you than in other languages (e.g. Django is great, but the default way to handle JSON is more error-prone than serde).

          There’s bad code written in rust too, of course, but the core packages and language seem higher quality and more security conscious than the equivalents in other languages I’ve used.

          1. 2

            The CVEs from both #1 and #2 stem from unidiomatic implementations, and I don’t think #2 is actually related to how arguments are passed.

            In #1, someone decided to roll their own header extract-and-match function, figuring it’s “just string matching”. Maybe Cargo being so easy to use would’ve made that less likely, as it would’ve made it less likely for someone to roll their own thing instead of using something that actually works, but I’m not convinced. MOVEit Transfer isn’t exactly some deep embedded thing with a weird build system cross-compiling for some esoteric platform no one heard of. They had zero technical reason not to use one of the myriad options already in existence. Whatever “social” screw-up prevented the original developer from using Beast or whatever may have prevented them from using http, too.

            FWIW #2 is already unidiomatic, Java has had a std::process::Command-like interface for at least 15 years.

            But the CVE stems from rocketMqHome being passed to the shell despite being under the attacker control. So I think this snippet:

            Command::new("sh")
                .arg(format!("{rocketMqHome}/bin/startfsv.sh"))
                .arg(config)
                .spawn();
            

            is equally vulnerable, it’s just cleaner :-).

            #3 is more contrived OOP than I can stomach so early in the morning so I can’t comment on it. Maybe no inheritance would’ve helped, as setupComplete is effectively exposed by accident.

            But this hinges so much on architectural choices that IMHO any analysis of how one would implement that in Rust isn’t very relevant. I’d like to think I wouldn’t make this mistake in Rust or in Java, but commenting about it on lobste.rs is one thing, doing it while everything is burning around me is another. Here, there’s no reason to mutably borrow the application config (although one could argue there’s “no reason” for about half the things done in that CVE, too). IRL I’ve seen people doing worse under management pressure.

            Edit: FWIW, I do think a strong typing system would’ve helped in all three cases, but a) not without significant changes to the underlying architecture that make this sort of analysis completely hypothetical and b) not without additional architectural choices (e.g. capabilities).

            #2 and a million problems like it stem from the fact that it’s hard to track what’s under user control and what isn’t. Tagging user-controlled variables and rejecting them in e.g. the process API helps. I did that as a prototype in Common Lisp a long time ago; it’s probably doable in Rust, but it’s obviously not in the standard library.

            #3’s bigger deal is that you can reset application installation state without presenting a capability to do so. It’s obviously a bug that this is also exposed externally, but there’s probably more where that came from, including from unexposed endpoints, and they’re there because all it takes to reset state is a call to setSetupComplete, rather than setSetupComplete plus a setup session token and a token that says you’re allowed to reset it.

            #1 is… probably the only one that’s already fixable out of the box, in just about any tech stack newer than 2005 or so, I don’t think there’s much to add there :-D.

            1. 1

              #2 doesn’t seem to be exploitable in the way you’re saying (see parallel comment at https://lobste.rs/s/hws1cu/rust_won_t_save_us_analysis_2023_s_known#c_qv0m7y).

              The issue at play is that the entire string is treated as the command to Runtime.exec, rather than each argument being passed individually to e.g. Java’s ProcessBuilder. I wonder does Rust have a footgun equivalent to Runtime.exec that would be exploitable? Perhaps manually constructing the string and then splitting it into args would have hit the same problem.

              1. 3

                It’s not exactly the same mechanism, but escaping variables under user control in a command-line-building environment is a pretty common pitfall. The CVE was easier to exploit because it could get you a pipe straight to a shell, but even if they’d used ProcessBuilder like that, it would still have been vulnerable. For example:

                use std::process::Command;

                fn main() {
                    let rocketMqHome = String::from("/www/upload/");
                    let config = String::from("/etc/config");
                
                    Command::new("sh")
                    .arg(format!("{rocketMqHome}/bin/startfsv.sh"))
                    .arg(config)
                    .spawn();
                }
                

                will gladly get you:

                [tabby@leena rust-exec-test]$ cargo run
                   Compiling rust-exec-test v0.1.0 (/home/tabby/workshop/rust-exec-test)
                warning: variable `rocketMqHome` should have a snake case name
                ...
                warning: `rust-exec-test` (bin "rust-exec-test") generated 1 warning
                    Finished dev [unoptimized + debuginfo] target(s) in 0.17s
                     Running `target/debug/rust-exec-test`
                [tabby@leena rust-exec-test]$ Oh noes!
                

                with nothing but public upload rights, or write access to a network-mounted path, or an auto-mounted firmware update USB stick and so on, because you’re giving the user control over the initial part of the path.

                (Also edit: this is fairly common knowledge, it may not be the kind of code you’d write in general and maybe you just took a shortcut when explaining the other exploit – I’m not assuming you don’t know how command injection works here, what I am assuming, based on practical experience, is that this is the kind of code lots of people do write, because they don’t want to jump through the hoops of constructing a proper execution environment and will absolutely escape initial path sections).

                Edit:

                I wonder does Rust have a footgun equivalent to Runtime.exec that would be exploitable?

                Depending on shell and arg mechanics, there are other tricks you can play sometimes. I don’t recall the exact mechanics, that was forever ago, but there were some PHP APIs that “gracefully” treated an array of arguments as separate args in the exec call, so with a little creativity you could get them to run ["-c", "/bin/sh", ... ].

                Command is pretty solid in my experience, I don’t think I ever saw it doing unexpected stuff, but once you start doing the escaping for it, there’s only so much it can do :-).

                Although FWIW you did hit the nail on the head here:

                Perhaps manually constructing the string and then splitting it into args would have hit the same problem.

                Easily half the vulnerable code I patched back when people started to figure out doing system() is bad was old system()-using code which did exactly that.

            2. 1

              This code is also vulnerable in a very similar way. Specifically, a -c argument can get in:

              Command::new("sh")
                  .arg(format!("{rocketMqHome}/bin/startfsv.sh"))
                  .arg(config)
                  .spawn();
              

              You already wrote code that breaks the same way.

              1. 4

                I’m probably missing something in my understanding of where the vulnerability you’re suggesting is (or possibly you are, I’m not sure):

                dbg!(Command::new("sh").arg("-c echo foo").output()?);
                

                passes the arg which is “vulnerable” as a single arg, which results in sh not seeing that as a -c:

                    status: ExitStatus(
                        unix_wait_status(
                            512,
                        ),
                    ),
                    stdout: "",
                    stderr: "sh: - : invalid option\nUsage:\tsh [GNU long option] [option] ...\n\tsh [GNU long option] [option] script-file ...\nGNU long options:\n\t--debug\n\t--debugger\n\t--dump-po-strings\n\t--dump-strings\n\t--help\n\t--init-file\n\t--login\n\t--noediting\n\t--noprofile\n\t--norc\n\t--posix\n\t--protected\n\t--rcfile\n\t--restricted\n\t--verbose\n\t--version\n\t--wordexp\nShell options:\n\t-irsD or -c command or -O shopt_option\t\t(invocation only)\n\t-abefhkmnptuvxBCHP or -o option\n",
                }
                

                for comparison:

                dbg!(Command::new("sh").arg("-c").arg("echo foo").output()?);
                

                works as you’d expect:

                    status: ExitStatus(
                        unix_wait_status(
                            0,
                        ),
                    ),
                    stdout: "foo\n",
                    stderr: "",
                }
                
            3. 38

              I came for an explanation of why Rust won’t save us. I didn’t get that. I mean, they say it, but they don’t bother to explain it. It feels like the title is clickbait.

              1. 31

                Rust won’t save us because its primary feature (memory safety) only addresses 20% of the vulnerabilities discovered in the past year.

                1. 39

                  That number changes a lot if you count only ones that lead to arbitrary code execution. The thing that makes memory safety bugs so bad is that they step outside of the language’s abstract machine and so you can no longer reason about the attacker’s abilities in terms of things that are possible in the source language.

                  1. 30

                    Being able to reason about the attacker’s abilities didn’t seem to help 23andme.

                    There’s a big disconnect between the unsafe code whining and the security incidents that really upset people to the extent they raise their hand and say, this affected me. But a stupid bug in glibc which affected nobody and look out, here come the crab people.

                    1. 13

                      So what about all the zero days, many of which are actually exploited? They are almost always some low-level C code (e.g. bluetooth drivers), or the other case is JIT compilers - both employ “unsafe” code, as per rust’s lingo (but one can actually be replaced by safer alternatives for decades now).

                      Your claim simply doesn’t stand up to scrutiny.

                      1. 8

                        How many zero days were you exploited by last year? How did you respond?

                        1. 23

                          Being able to reason about the attacker’s abilities didn’t seem to help 23andme.

                          I don’t know what point you’re trying to make here. 23andme suffered a credential stuffing attack.

                          There’s a big disconnect between the unsafe code whining and the security incidents that really upset people

                          I have personally, in my professional life, responded to a breach involving 0day exploits. I have triaged memory safety vulns plenty of times. What’s your point? That if random lobsters users haven’t experienced these issues they don’t matter?

                          The reason why “crab people” talk about memory safety is because it is the highest impact vulnerability possible. It also has relatively high complexity to exploit on modern systems thanks to massive amounts of work across multiple decades to employ least privileged architectures and exploit mitigations.

                          Even just a decade ago, or so, memory safety exploitation was one of the most common ways you would get attacked through your browser. It was extremely common.

                          This is like saying “How many people do you know who got covid? Those vaccine pushers sure do get up in arms!” - the fact that you don’t see it is:

                          a) Irrelevant

                          b) A sign of how seriously memory safety vulns get taken, how fast patches get rolled out, and how decades of effort has gone into addressing a problem that rust just does not have

                          1. 6

                            a problem that rust just does not have

                            I like Rust, and it’s been the language I’ve programmed in the most for the past few years. I don’t think this is a fair characterization of the language.

                            Rust has a widely-used unsafe keyword which lets Rust code be memory-unsafe. I like the design of making memory-unsafe operations stand out in code because they have to go in an unsafe block, but that is not the same thing as “Rust does not have this problem.”

                            Rust absolutely has the problem, it just tends to affect a lower percentage of lines of code in a Rust code base compared to a C or C++ code base. If Rust didn’t have the unsafe keyword, I think it would be fair to say Rust didn’t have the problem, but in that world Rust wouldn’t be nearly as useful or popular as it is today.

                            1. 4

                              Every language has an unsafe escape hatch, and we hand wave it away because the bug density for memory safety issues in those languages is so low.

                              1. 3

                                I don’t think it’s an unfair characterization, in the same way I don’t think it’s unfair to characterize Haskell as purely functional despite FFI, or C/C++/Rust as providing type safety despite the ability to cast values to types, or asynchronous code as asynchronous even if calls that block an event loop are made.

                                I think it’s more about the intentionality and the guardrails provided to you. If you “know better” and want to escape the guardrails, you are free to do so, with the expectation you know better than the static analysis. But the guardrails are significant and invite a programming pattern that eliminates a class of issue.

                                1. 3

                                  Rust has a widely-used unsafe keyword

                                  citation needed. Last time I saw any statistics, less than 1% of Rust code on crates.io was unsafe.

                              2. 16

                                I was exploited every time a journalist exposing government corruption got arrested or shot because their phone was hacked with a bought zero day.

                                1. 1

                                  How often does that happen compared to simply demanding that popular web services and ISPs hand over data?

                                  1. 2

                                    I’m aware of at least one NGO running infosec training specifically for journalists. It’s not as though folk working in that industry are unfamiliar with what happened to e.g. the one who broke the panama papers story.

                                    1. 1
                          2. 28

                            20% is an insanely huge percentage, though. I can’t understand how someone could be underwhelmed by that.

                            Hell, if you told me that the code I write would have 20% fewer bugs of any kind by switching to a different programming language (and the same performance characteristics), I’d be very intrigued.

                              1. 10

                                If people just take the lesson that “rust is secure” then the risk is that someone will be tempted to take a mature codebase with a well designed and debugged security model in a memory unsafe language, and replace it with an immature rushed rust reimplementation.

                                1. 15

                                  There are people rolling their own crypto algorithms also. Let them shake hands, and go on with your life. I don’t think caring about this sorta made up situation is meaningful.

                                2. 7

                                  Absolutely, like I think rust is great. And 20% better by removing a class of flaws is great progress in the field.

                                  But I get overwhelmingly frustrated by having to constantly fend off the narrative around “rust is safe”, the implication being that it’s 100% safe and secure. When those mean very different things, and it’s definitely not 100% (nor is it 0%). To say nothing of the other costs (especially and specifically, the (re-)introduction of flaws by rewriting an existing solution) not ever seeming to be a consideration.

                                  1. 14

                                    I don’t know who you have to fend off - but anyone who claims that a language solves basically “every problem” is so junior that their opinion simply shouldn’t be taken seriously.

                                    Rust is a very novel and great development in that it actually provides a safe alternative to the relatively small niche of “no managed language allowed”. There were safe alternatives in the form of managed languages for decades, so it won’t meaningfully change the security of, say, web applications, but it really is a welcome change in this field, so some “hype” is absolutely warranted, and many safety critical systems should start to port their C parts to Rust (e.g. mobile OS drivers), as there is no good reason for having them in a memory unsafe language.

                                    1. 5

                                      There were safe alternatives in the form of managed languages for decades, so it won’t meaningfully change the security of, say, web applications, but it really is a welcome change in this field, so some “hype” is absolutely warranted, and many safety critical systems should start to port their C parts to Rust (e.g. mobile OS drivers), as there is no good reason for having them in a memory unsafe language.

                                      Actually, most languages used for web apps are not very good at enforcing invariants in the type system, compared to Rust, so I would bet that Rust webapps are safer in ways unrelated to UB than webapps in managed languages.

                                      1. 6

                                        A certain layer of the web is fundamentally “stringly typed”, so I would take a battle-hardened framework over a month-old rust library any time.

                                        Also, Rust’s type system is not particularly unique besides affine types, Haskell, Scala, OCaml all have just as expressive type systems, if not better.

                                        1. 2

                                          Also, Rust’s type system is not particularly unique besides affine types, Haskell, Scala, OCaml all have just as expressive type systems, if not better.

                                          I feel like besides is doing a lot of work in that sentence. Move semantics and borrowing are genuinely useful for expressing application logic. I’m learning OCaml right now and I can’t wait for the day Jane Street’s Rust-inspired additions are upstreamed.

                                          Certain layer of the web is fundamentally “stringly typed”, so I would take a battle-hardened framework, over a month-old rust library any time.

                                          That feels like a disingenuous comparison. Obviously you shouldn’t rely on some random brand-new project, but does that make Rust a fundamentally worse choice for webapps than Python, Java, or Go? Also, I can’t think of anything that’s really fundamentally stringly typed, so could you give me an example?

                                          1. 1

                                            Move semantics and borrowing are genuinely useful for expressing application logic

                                            I would say it is a good safety guard at certain kinds of applications. But in the general case it might be too restrictive, disallowing many kinds of valid programs. Nonetheless, I also look forward to what OCaml brings out of it - their work on limiting the harm that any given data race might cause is also really interesting.

                                            Rust [..] fundamentally worse

                                            No, but nor is it fundamentally better - which was my point.

                                            As for stringly typed, I meant mostly things like parsing http request bodies, headers, urls, or failing to escape the response. All quite frequent security bugs from the time before big frameworks that hide these details better.

                                            1. 2

                                              I would say it is a good safety guard at certain kinds of applications. But in the general case it might be too restrictive, disallowing many kinds of valid programs.

                                              Right, that’s what’s great about bringing such features to a language like OCaml. Rust needs to have those restrictions all the time to be safe, but in OCaml, you could restrict yourself only when it’s useful to express your application logic.

                                              Nonetheless, I also look forward to what OCaml brings out of it - their work on limiting the harm that any given data race might cause is also really interesting.

                                              Aye. I haven’t tried multicore OCaml yet, but from what I read, while data races are possible, they cannot cause type safety issues. The ability to prevent data races is one of the goals of Jane Street’s additions to the compiler.

                                              No, but nor is it fundamentally better - which was my point.

                                              Mm. I don’t think Rust is fundamentally better, but practically, I feel like it’s easier to write best-practices code in Rust than, say, Python or Java. Hell, this might be controversial, but even though I only write Rust code once or twice a year, I find it easier to prototype in than almost any language I’ve used. That’s not something I think I could say of other languages with fancy type systems. (Though I’ve heard very good things about OCaml, so I guess I’ll see how I feel once I get more experience with it)

                                              As for stringly typed, I meant mostly things like parsing http request bodies, headers, urls, or failing to escape the response. All quite frequent security bugs from the time before big frameworks that hide these details better.

                                              Ah, yeah. I think of those things as things that should be parsed into easy to deal with data structures. Throwing strings around is a dark and dangerous temptation.

                                    2. 8

                                      The key is to remind people that PHP is a memory safe language.

                                      1. 4

                                        Yes, this point is critical. Memory safety is not new. Memory safety without GC/ref counting (plus its possible cyclic memory leaks) or sacrificing low-level control is new.

                                        1. 3

                                          Also, PHP was a huge win for web security. I used to write web apps in C before I started using Perl & PHP. But once we had memory safety we found a whole lot more issues to address.

                                          1. 2

                                            oh fair point, and interesting take. I kind of thought you were dunking on php a little bit and, by way of doing so, also indicating that memory safety itself is really a mostly solved problem (in domains where we can support GC et al).

                                            But yeah, that makes sense. At the time I started to learn webdev it seemed like websites were written in ASP/Java in the enterprise and Perl in the small, with PHP up-and-coming, replacing Perl. This was probably circa 2003-2007ish.

                                            None of this is to say that Rust isn’t a great language and a fine choice. Just that, we’ve known how to do memory safe programming since 1959 (https://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#cite_note-McCarthy_1960-2).

                                            (of course, saying this will prompt a whole new round of arguments, which I totally remember experiencing in that era. Yes, GC requires more memory. Yes, this was impractical when memory is very highly constrained. The typical explanation for why GC languages became prominent when they did was that Java had a massive push right when memory generally became large enough to accommodate. That of course is a bit puzzling when considering that a huge push for java was mobile phone programming, which were very highly memory constrained; but popular narratives often don’t exactly mesh with facts. It may totally be that Java was a bad choice really for phones, and perhaps that whole thing held back phone tech for years and years. I don’t really know.)

                                    3. 1

                                      The problem is what you have to pay for those 20%. There is a cost, even if just in the uncertainty in switching. And I daresay the cost-benefit analysis is very different when the bug reduction rate goes from 60% to 20%. Not just in whether you’ll switch, but when: when a bug class exceeds 50% it’s natural to address that class first, but when it’s just 20%, it might be worth looking into other classes, and fix them first.

                                    4. 19

                                      On the other hand, Rust has a lot of desirable features beyond memory safety, so it’s also improper to reduce it to just that.

                                      1. 1

                                        I know, right? A great build tool, Option instead of null, sane monadic error handling, fast standard library data structures, DDoS-safe-by-default hash maps, great error messages and docs, not to forget rust-analyzer. The list goes on.

                                    5. 23

                                      In their dataset there’s a lot of web apps. Attacking systems by opening URLs they didn’t mean to expose is a low hanging fruit, so it’s not surprising it’s a popular vulnerability.

                                      And technically they’re correct that Rust won’t protect against that — most web apps are already written in higher-level languages, and Rust won’t magically configure server auth for you.

                                      But also clickbaity, because Rust wasn’t even meant to address these types of problems.

                                      1. 10

                                        There’s a more interesting result here: a lot of the things being exploited here are written in typesafe languages. When your data set is primarily things written in unsafe languages, the number is closer to 70%. To me, this shows that moving to type-safe languages is a huge win in terms of reducing the number of vulnerabilities.

                                        I’ve written before about why I think focusing on memory safety as a goal, rather than a building block, is a mistake. A lot of the other issues would have significantly reduced impact if we had better tools for building compartmentalised applications. That’s what CHERI and Verona are both trying to enable.

                                        1. 10

                                          I think in terms of common vulnerabilities I see: ruby/python/java have all had more than their fair share of “evaling untrusted input over the network seemed like a good idea at the time” bugs. Charitably they can also be described as “the most popular serialization frameworks are basically code execution frameworks and not dumb data containers and a surprising number of developers don’t know or care”.

                                          See:

                                          • java.io.Serializable
                                          • yaml
                                          • pickle

                                          You have to work pretty hard to have the same class of bug in Rust so you could say with a straight face that Rust would have prevented the Equifax hack: https://avatao.com/blog-deep-dive-into-the-equifax-breach-and-the-apache-struts-vulnerability/
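
The pickle case is easy to demonstrate with nothing but the standard library; this sketch uses eval as a stand-in for an attacker's payload:

```python
import pickle

class Evil:
    # pickle asks __reduce__ how to reconstruct the object: it returns a
    # callable plus arguments, and loads() will happily invoke them.
    def __reduce__(self):
        return (eval, ("6 * 7",))

payload = pickle.dumps(Evil())
# The "deserialized object" is whatever the attacker's callable returned.
print(pickle.loads(payload))  # 42
```

In other words, loading a pickle is running a program, which is exactly the "code execution framework" problem above.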

                                          [edit]

                                          I’m playing a little bit of devil’s advocate. I don’t think that auditing a codebase for pickle/bad yaml is nearly as hard as, for example, avoiding memory errors in C that lead to remote execution.

                                          1. 10

                                            Or more recently, gitlab account takeover. But somehow problems like these never get mentioned in the language wars. The usual suspects never swarm the thread to explain, you know, this could have been prevented if only.

                                            1. 2

                                              what makes yaml a code execution platform?

                                              the other two let you call any constructor/property on any type. unless there’s a deserializer that requires a list of allowed types, that’s going to be a problem

                                              with yaml, the code it can execute should be limited to the constructors/properties of the type you’re deserializing to (and the types of its properties), shouldn’t it?

                                              1. 1

                                                https://www.google.com/search?q=rails+yaml+vulnerability

                                                I promise it’s not a sarcastic let me Google that for you response. It’s just an easy way to get a list.

                                                I was actually surprised to see the top result was from 2024 and it read the same as every other version: someone sends a webapp framework some yaml and by default certain yaml directives specify the type of an object to construct. See: https://yaml.org/YAML_for_ruby.html#objects

                                                Python has yaml libraries with the same problem: https://book.hacktricks.xyz/pentesting-web/deserialization/python-yaml-deserialization

                                                Now a serialization format can’t force parsers to run eval or instantiate custom objects so “yaml” can’t technically be called a remote code execution framework.

                                                But. If you look at where yaml is being used commonly it turns out that a bunch of frameworks use yaml parsers configured to do this. Sometimes they depend on that behavior. Sometimes they are just unaware. It doesn’t help that more than once I’ve seen people throw yaml parsers at JSON because it’s mostly a subset of YAML (technically my understanding is that there are some subtle differences, but they are somewhat unintentional. YAML was made to be a superset of JSON, so people rely on that).

                                                https://john-millikin.com/json-is-not-a-yaml-subset
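
For contrast, Python's json module has no mechanism for naming a type at all, which is why the same trick doesn't translate:

```python
import json

# json.loads can only produce dicts, lists, strings, numbers, bools and None.
# There is no syntax for naming a class, so parsing untrusted JSON cannot
# instantiate arbitrary objects the way a tag-aware YAML loader can.
doc = json.loads('{"user": {"name": "alice", "admin": true}}')
print(type(doc["user"]))  # <class 'dict'>
```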

                                                1. 1

                                                  i believe you

                                                  it’s not really obvious to me why yaml libraries would be like this, though. (when compared to json, xml, etc.)

                                                  i have seen json libraries with a special "$type" key. there’s nothing stopping people from adding that vulnerability to libraries for other formats. but from what i hear, most of the json libraries don’t do that

                                                  is there something about yaml that makes it especially tempting to bolt on serialized type names?

                                                  1. 1

                                                    The essential language feature that enables these attacks is having enough reflection to look up types or functions by name at runtime, right? I suppose that means that Rust prevents these attacks, unlike most common memory-safe languages, in a way unrelated to memory-safety.

                                                    C as a language also has no reflection (AFAIK), but I have an impression that looking up functions by name at runtime (e.g., dlsym) is fairly common in C, whereas it’s (unsafely possible but) rare in Rust.
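
That reflective primitive can be sketched in a few lines of Python; a deserializer that accepts a type name from the wire is effectively handing attackers something like this (the resolve helper is hypothetical):

```python
import importlib

def resolve(path: str):
    # Turn a dotted string into a live Python object at runtime.
    module, _, name = path.rpartition(".")
    return getattr(importlib.import_module(module), name)

# A benign lookup, but the same call resolves "os.system" just as happily
# if the string comes from untrusted input.
print(resolve("math.sqrt")(9.0))  # 3.0
```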

                                              2. 6

                                                If you read it in the key of “what’s affected the most” rather than “what technology is affected the most”, it’s surprisingly un-clickbaity: about 80% of the vulnerabilities the authors consider are in things Rust isn’t meant to address and likely can’t do much about (default secrets, insecure exposed functions, weak encryption). It’s not a jab at Rust.

                                                Most web apps are indeed not written in Rust – and they’re the primary, and in many cases the only means of interaction for a lot of users out there. If data can be exfiltrated without even getting out of the JavaScript sandbox, it’s going to be exfiltrated just as well whether the web server handling the requests is written in Rust or C.

                                                It’s not a perfect analysis; you can’t have one that’s simultaneously this broad and sufficiently detailed. E.g. if you look at Talos’ list of most exploited vulns in 2023 (to mention just one other trendy analysis) you’ll see that many of them are core vulnerabilities in either nearly universally-deployed Windows components (e.g. print spooler) or widely-used applications. That “insecure exposed function” section of the pie chart is huge, but all of the applications in that blue half, taken together, probably have like 1/10th of the deployment count of one good memory corruption bug in Windows.

                                                But I also don’t think it’s clickbaity. There’s a lot that Rust won’t help because it won’t cover. No one’s expecting it to cover it but it’s also fair to point out that it doesn’t.

                                                1. 10

                                                  I would argue that it’s not super clickbaity in that there is a certain “Rust will solve all your problems!!!11!” that is repeated enough that some people will say “I can guarantee my web app is secure, it’s written in Rust!” and be believed (and maybe even believe it themselves).

                                                  1. 19

                                                    some people will say “I can guarantee my web app is secure, it’s written in Rust!” and be believed (and maybe even believe it themselves).

                                                    This feels like a bit of a straw man; I’ve never seen this kind of reasoning in the wild, and it runs contrary to how everyone I’ve worked with professionally tends to think. Is this something you’ve actually encountered, and if so in what context?

                                                    1. 10

                                                      It’s definitely a straw man, but I do have the issue that most Rust web frameworks value speed over everything while lacking common OWASP mitigations that are standard in any mature web stack. Things like the rack-security set of libraries in Ruby.

                                                      Rust does help at a lot of places outside of memory safety though, e.g. I would say that structured JSON parsing through e.g. serde is a huge win. There’s been so many attacks on GitHub that worked through sneaking fields into JSON the API didn’t expect, but pick up anyways.

                                                      1. 2

                                                        In general I am not fond of Rust as a language for web applications. Unless you have additional constraints (performance or resource consumption), it is almost never a good choice there. Instead I prefer to use managed languages and FFI out to Rust in tight loops and performance-sensitive code. The only other situation where I would use Rust for HTTP handling is when a super small web UI is needed for an application that is 90% Rust anyway. Beyond that, Rust as a web language just doesn’t fit my mind.

                                                        1. 1

                                                          I agree to an extent, though I don’t think the issue lies in the language per se. It has very useful features for such applications and servers.

                                                          But I am clearly of the opinion that the current stacks do not tackle the problems you need solved when writing a web application. Not that those are fundamentally unsolvable.

                                                          I’ve also formed the habit of calling what’s available in Rust “HTTP acceptors”; none cross the threshold that I would consider a framework.

                                                      2. 4

                                                        I haven’t yet encountered it with Rust specifically but as someone who did professional security audits for code for many years, having people say “I wrote this in $MEMORY_SAFE_LANGUAGE therefore it’s secure” was not a rare thing. Not super common, but not unheard of.

                                                        People who are new to programming, or not security-minded, or whatever, just read in a book “C caused all sorts of problems, Java makes it impossible to make those kinds of mistakes” and go on to think “I use Java, I’m protected.”

                                                        (I knew someone who was paid very well who wrote code that said “I set the encoding on the database column to BINARY, that means it’s encrypted and I can store passwords in it.” Not everyone is security-minded.)

                                                    2. 2

                                                      In their dataset there’s a lot of web apps. Attacking systems by opening URLs they didn’t mean to expose is a low hanging fruit, so it’s not surprising it’s a popular vulnerability.

                                                      Systems have to be networked in order to be remotely exploitable. Most network-accessible APIs are web APIs in this day and age. It only makes sense that those represent the vast majority of exploits in the wild.

                                                      1. 3

                                                        This is the assumption that gets cars hacked through front lights and information extracted through LED flickering. “Remote” is very stretchable.

                                                        1. 2

                                                          Obviously. And just as obviously that’s the exception that proves the rule. C’mon man.

                                                      2. 1

                                                        And yet, I have talked to a programmer who was learning Rust and Wasm for the safety benefits they thought it had.

                                                    3. 7

                                                      rand.N working for time durations will be really handy.

                                                      1. 11

                                                        I know very little about networking. Why do people care about this?

                                                        1. 23

                                                          There’s now enough IPv6 deployment that you can do most things with a pure IPv6 stack, but a bunch of big services are still missing. If you run a dual-stack implementation then you end up with a big attack surface. Being able to remove IPv4 support eliminates that and also significantly reduces the administrative overhead for managing a network (IPv6 is easier than v4, but v4 + v6 is a lot more complex). The sooner it’s plausible to run a v6-only network, the happier a lot of people will be.

                                                          More concretely: For CI systems, a bunch of hosting providers give a discount for machines with only v6 connectivity. For example, the cheapest Vultr node you can buy is $2.50/month with IPv6-only support, $3.50/month for v4 as well. If you want to run your own CI (or mirror of a git repo, or canonical git repo that you mirror to GitHub, or any other thing that integrates with GitHub) then you can save a chunk of money if you can reach GitHub via IPv6.
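
As a concrete illustration of dropping the v4 half of a dual stack, here's a minimal v6-only listener in Python; with IPV6_V6ONLY set, v4-mapped addresses are refused and the v4 code paths are simply unreachable (a sketch, not a hardening guide):

```python
import socket

def v6_only_listener(port: int = 0) -> socket.socket:
    s = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    # Refuse v4-mapped addresses (::ffff:a.b.c.d): this socket is v6-only.
    s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
    s.bind(("::1", port))
    s.listen()
    return s

listener = v6_only_listener()
addr = listener.getsockname()
print(addr[0])  # ::1
listener.close()
```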

                                                          1. 3

                                                            Is the attack surface of v6-only really smaller than v4-only? A quick review of the vulns I could find turned up quite a few more affecting v6 than v4, but I didn’t spend much time searching.

                                                            1. 3

                                                              The packet parsing code is simpler. I’ve recently been poking at the FreeRTOS network stack for CHERIoT RTOS and so had to read up on all of the various packet formats. If I wanted to write a network stack from scratch and be confident in its correctness and security, and had to pick v4 or v6, I would definitely pick v6. For IoT devices, we may be able to ditch v4 sooner than most other things, which would be great.

                                                              1. 2

                                                                There’s a whole slew of DHCP and NAT code that won’t be needed anymore, which should be factored into your search.

                                                                That, and a lot of code on switches, since the routing tables of big-iron internet routers are so fragmented now that they need additional handholding.

                                                                1. 1

                                                                  In my experience that part of the network stack is pretty well tested, so I’m not sure we’re really winning much. DHCPv6 is also a thing; not exactly for IP assignment, but for stuff like (S)NTP servers.

                                                                  Dual stack is definitely much harder to maintain.

                                                            2. 10

                                                              Because people want to move from IPv4 to IPv6

                                                              1. 3

                                                                I’ve been complaining about this for eight years and I’m in disbelief it could actually be happening.

                                                              2. 5

                                                                IPv4 address access is a bit like the water supply: you ignore it until it’s gone. Last year I attempted an IPv6 transition and discovered that GitHub’s lack of IPv6 broke my deployments. Thus I had to revert to IPv4 NAT access.

                                                                There are workarounds that carry the same complexity and cost burden as a proxy or vpn.

                                                                Alas, with IPv6 the weakest link breaks the chain. You need all of your dependencies to support IPv6, and a majority of internet apps depend on GitHub.

                                                                1. 2

                                                                  Because there are 8 billion people but only 4 billion IPv4 addresses. Here’s a quote from a GitHub user in Brazil:

                                                                  New ISPs in my country are IPv6-only because there is no new IPv4 space to be provided to them. They do have a over-shared IPv4 address by CGNAT but due to the oversharing, it is unstable and not rare to be offline. For these companies, the internet access is stable only in IPv6.

                                                                  Thinking about the server-side, some cloud providers are making extra charges for IPv4 addresses (e.g.: Vultr.com) so most of the servers in my company are IPv6-only. Cloning github repositories is very cumbersome due to the lack of IPv6 support and this issue affects me and my team mates on a daily basis.

                                                                2. 4

                                                                  I think this might be a silly question, but could the application also have defaults itself and not write anything to the config file by default?

                                                                  1. 7

                                                                    That’s what the author is advocating I think. I agree, and I would go further to say that the application should print its config by default when it starts up. (After applying the defaults, and with any secrets removed)
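
A minimal sketch of that idea in Python (the key names and redaction policy here are invented for illustration): layer the user's settings over the defaults, strip secrets, and print the result at startup.

```python
import json

DEFAULTS = {"host": "0.0.0.0", "port": 8080, "api_key": None}
SECRET_KEYS = {"api_key", "password"}

def effective_config(user_cfg: dict) -> dict:
    # User-supplied values win; anything unset falls back to the default.
    return {**DEFAULTS, **user_cfg}

def printable(cfg: dict) -> dict:
    # Never echo secrets back out, even in a startup dump.
    return {k: "<redacted>" if k in SECRET_KEYS and v else v
            for k, v in cfg.items()}

cfg = effective_config({"port": 9000, "api_key": "hunter2"})
print(json.dumps(printable(cfg), indent=2))
```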

                                                                    1. 1

                                                                      I tried that for a while and found the noise at startup annoying. What I landed on instead was an interface to retrieve the current config on demand (with defaults applied), plus a tool that gathers data from that interface and a couple of others for interacting with support.

                                                                      1. 3

                                                                        I went for the same approach in soupault. There’s an option --show-effective-config that prints the complete config as the application sees it, with all built-in defaults and user-supplied options together.

                                                                        A truly great system could show such configs with comments explaining each option’s purpose and indicate whether an option is at its default or modified, but just the ability to display the complete effective config is still a lot better than nothing.

                                                                      2. 1

                                                                        I actually like the approach rspamd takes - you have a folder full with config files that contain the defaults, and if you want to change anything, you create an override in a subdirectory.

                                                                        1. 4

                                                                          UCL was developed by the author of rspamd. I wish things would adopt it instead of TOML.

                                                                          1. 2

                                                                            oh, I really need to look into UCL.

                                                                    2. 1

                                                                      It’s much simpler to use the hash as a query parameter. /style.css?h=abracadabra
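
In Python terms the scheme amounts to something like this (the helper name is made up); the URL changes whenever the file's bytes do, so the asset itself can be cached forever:

```python
import hashlib

def asset_url(path: str, content: bytes) -> str:
    # A short content hash in the query string busts caches on every change
    # without renaming the file on disk.
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{path}?h={digest}"

print(asset_url("/style.css", b"body { color: black }"))
```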

                                                                      1. 2

                                                                        True, that would remove the need to create a separate asset file with the hash in its name, but generating the hash and updating the HTML (whether on a static web page or a regular app) would still need to happen.

                                                                        To me, the benefit of the separate files is that I could keep the last N around for a little while if stale frontends were still requesting them, or if I needed to roll back to prior changes.

                                                                        Thanks for responding!

                                                                      2. 2

                                                                        How does this happen? The go runtime is normally pretty thorough about setting O_CLOEXEC on new files.

                                                                        1. 1

                                                                          For something like this, close on exec feels fragile. I’d expect to set up a small set of known file descriptors to pass and then call closefrom to ensure everything else is gone, and to do this when creating the child, not waiting until exec. That said, last time I looked, Linux was still missing closefrom.

                                                                          1. 4

                                                                            Linux 5.9 and later has close_range() at least, which for example libbsd uses to implement closefrom().
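
For what it's worth, Python's runtime went the close-on-exec-by-default route (PEP 446): descriptors must be opted in to inheritance, which is the inverse of hunting down stray fds before exec. A small illustration:

```python
import os

# Since PEP 446 (Python 3.4), new fds are non-inheritable by default,
# the same effect as opening with O_CLOEXEC.
r, w = os.pipe()
print(os.get_inheritable(r))  # False

# Passing a descriptor to a child is an explicit opt-in
# (paired with pass_fds= on subprocess.Popen, for example).
os.set_inheritable(w, True)
print(os.get_inheritable(w))  # True
```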

                                                                        2. 7

                                                                          I propose measuring database load in Henrys.

                                                                          1. 8

                                                                            Suid binaries are evil.

                                                                             They can be run under very different initial conditions, and thus expose code paths that were never designed to be reachable that way.

                                                                             Instead of such binaries, one should provide services that run as root under well-known initial conditions.

                                                                             In https://stal-ix.github.io we don’t have any suid binaries in the system; even sudo works as an ssh client plus a local ssh daemon.

                                                                            1. 6

                                                                               Or make the suid binary not runnable AT ALL by users outside the wheel group. There is a NixOS option that does this, which makes sudo exit with an error, enforced by the kernel. This solves all suid-related vulnerabilities.

                                                                              I’m not sure how it is done on other distros, but it’s security.sudo.execWheelOnly on NixOS: https://github.com/RGBCube/NixOSConfiguration/blob/master/modules%2Fsudo.nix#L17

                                                                              Suid binaries are definitely not evil.

                                                                              1. 3

                                                                                This sounds like it just sets the permissions to rwsr-xr--.

                                                                                1. 4

                                                                                  The implementation of execWheelOnly:

                                                                                      security.wrappers = let
                                                                                        owner = "root";
                                                                                        group = if cfg.execWheelOnly then "wheel" else "root";
                                                                                        setuid = true;
                                                                                        permissions = if cfg.execWheelOnly then "u+rx,g+x" else "u+rx,g+x,o+x";
                                                                                      in {
                                                                                  

                                                                                  So, you seem to be correct.

                                                                              2. 2

                                                                                Is a suid binary a requisite for this flaw to be exploited?

                                                                                1. 11

                                                                                  It would be quite unusual for a daemon to be started with an overflowing argv0 by other means.

                                                                                  1. 3

                                                                                    Not that weird though. For example some services will create per-slice/task/customer processes that change the argv0 to include the identifier.

                                                                                    1. 2

                                                                                      Thanks for expanding! I found the linked article a bit light on these details.

                                                                                2. 4

                                                                                  Honestly we need to get together and standardize a new terminal protocol. Start with xterm-256color or something, cut out a bit of the crap nobody uses, agree on a few of the things with different variants like this, and give it a $TERM string that’s the same everywhere. Maybe even make a feature detection capability so future changes don’t need random non-xterm programs to call themselves xterm just to make things work.

                                                                                  Maybe then I’ll be able to use Helix from inside tmux from inside mosh, without all the colors being fucked over. You never know. Dream big!

                                                                                  1. 4

                                                                                    You can use DECRQSS or XTGETTCAP to query the terminal, but it gets really awkward if you don’t control the input event loop, and a lot of programs are not plumbed to make it easy to read the response.
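
As I understand the xterm protocol, building the XTGETTCAP query is the easy part, a DCS sequence with the capability name hex-encoded; reading the reply is what needs the event-loop plumbing:

```python
def xtgettcap(cap: str) -> bytes:
    # DCS + q <hex-encoded capability name> ST
    # e.g. querying "RGB" asks whether the terminal supports truecolor.
    return b"\x1bP+q" + cap.encode().hex().encode() + b"\x1b\\"

print(xtgettcap("RGB"))
```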

                                                                                    1. 3

                                                                                      Indeed. Terminal queries + standardized escape sequences have been the trend for several years now. We’re not quite at the point where we can obsolete terminfo, but we’re making progress.

                                                                                    2. 1

                                                                                      Don’t most programs support xterm-kitty now? Doing SetEnv TERM=xterm-kitty in my SSH config works for me without issue, and I get all the colors.

                                                                                    3. 12

                                                                                      Since when does unix support filenames with spaces?

                                                                                      1. 7

                                                                                        I could be wrong, but I believe that’s always been a property of UNIX filesystems because the kernel didn’t parse the path components at all and just treated them as a bag of bytes. I’m not sure if the original shell supported escaping them though.

                                                                                        1. 17

                                                                                          Given @tedu is an OpenBSD core dev with whom I’ve personally discussed Unix path parsing insanity, I’m pretty sure this is a joke about how the headline reads weird to anyone who’s done serious Windows dev work, because it shows up all the time

                                                                                        2. 4

                                                                                          At least according to this [0], “UNIX-like systems” accept any nonzero byte in filenames, so they don’t even need to be valid UTF-8. Slash has special meaning, of course, and I assume the no-NUL restriction comes from functions using pointers to bytes with NUL as a terminator.
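
Easy to verify from Python, which exposes the byte-level view directly (the path names here are arbitrary):

```python
import os
import tempfile

# A Unix path component is just bytes: anything except NUL and b"/" is
# legal, spaces included, and valid UTF-8 is not required by the kernel.
d = tempfile.mkdtemp().encode()
name = os.path.join(d, b"foo bar")
with open(name, "wb") as f:
    f.write(b"ok")
print(os.listdir(d))  # [b'foo bar']
```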

                                                                                          1. 1

                                                                                            Sometime between v0 and v6, it seems. v0 is so minimal and different that it’s kinda hard to test, but trying to write a file named “foo bar” from ed results in “foobar”. By v6 things work as expected.

                                                                                          2. 1

                                                                                            Ironically, the first image, of the forest, is completely blown out in HDR on my laptop. When I resaved and stripped the gain map, many more trees suddenly became visible.

                                                                                            1. 1

                                                                                              Hm, that’s strange, that’s not the case for me. I see the same photo, just brighter in the HDR case.

                                                                                              What laptop and browser and OS do you use?

                                                                                              I added a photo in the wiki page, where I try to show the difference between a browser which supports it (Chrome) vs one which just shows the SDR image (Safari).

                                                                                              1. 1

                                                                                                Samsung Chromebook OLED. So here’s a photo of my screen showing your laptop.

                                                                                                I tried playing with brightness, but it seems it simply clips all bright content to the “SDR” range. And then all the foliage above the road disappears.

                                                                                                On my phone, the comparison looks good. The HDR is brighter, but not blown out.

                                                                                                https://honk.tedunangst.com/d/7pF214RjBG4M43w61L.jpg

                                                                                                1. 1

                                                                                                  Do the photos look correct here? https://gregbenzphotography.com/hdr/ (This website mostly uses AVIF though, not Ultra HDR.)

                                                                                                  Do YouTube HDR videos look correct?

                                                                                                  1. 1

                                                                                                    Mostly correct, just with clipped highlights. And the HDR gradient maxes out pretty low. It’s just a screen (or maybe something else) that doesn’t like displaying brighter than normal. I still like it because the color saturation is amazing.

                                                                                                    Anyway, I was checking it out because I’ve been passively following developments in this space, and was curious.

                                                                                              1. 21

                                                                                                To me this comment is ambiguous and therefore unhelpful. I normally ignore such comments but you have an official looking flair and I’m trying to learn golang so I’m writing on the off chance that you might expand on this statement.

                                                                                                1. 5

                                                                                                  I agree. The article as it’s written makes a good case for doing it this way IMO.

                                                                                                  1. 5

                                                                                                    The problem is that the standard library doesn’t follow this rule, and most (~all) third party libraries don’t follow the rule. So you’ve got a little convention that just exists in your own code. But your own code is where the distinction is least helpful because you’re more likely to remember that you wrote First to be generic and Second to use an interface or whatever versus when you come to a new package someone else wrote and need to figure it out. I think if the standard library hadn’t switched to using any, it would make sense to try to follow it, but it’s too late now, and it’s mostly NBD either way.

                                                                                                    1. 1

                                                                                                      Thank you. This actually changed my view on it. I mean, the post still makes a good point. But if this isn’t a standard followed by literally everything else then yeah, there is little point.

                                                                                                  2. 5

You can do what you want in your own code, but trying to create semantic meaning where the compiler has expressly declared there is no difference is a fool's errand, IMO.

                                                                                                  3. 4

                                                                                                    Yeah, that’s my feeling on balance. I thought I would do this when 1.18 was in betas, but by the time it was released I think I had just given up and embraced any everywhere.

                                                                                                  4. 11

                                                                                                    If I’m writing my project in C, I probably don’t want a rust dependency.

                                                                                                    1. 7

It’s only written in Rust; it can manage any build system. So it’s just a static binary to have in your $PATH.

                                                                                                      1. 6

                                                                                                        I’m just saying, if people are writing C, they probably want to own the stack and the tools and utilities, and they want that stack to be C.

                                                                                                        1. 19

That seems like an arbitrary constraint that brings little to no value. CMake is written in C++, so if you write C you don’t want CMake either?

Meson, which is also used a lot, is written in Python; that is an even bigger dependency because you need Python to run it, while Rust is only needed to build shipp, not to run it.

                                                                                                          1. 11
                                                                                                            1. 3

                                                                                                              The “it’s just a static binary” argument falls apart as soon as a C programmer needs to fix or debug an issue in your Rust program. While IME most C programmers these days can understand and even write some C++, Rust is a very different language. The same goes for Python. So I believe a credible attempt at solving the C/C++ build system or package management problem will have to be written in C++ (you would be insane to do that in C).

BTW, the same fallacy applies to build system-agnostic package managers. The pitch goes like this: it doesn’t matter which build system your project uses, we support all/any build systems. But it does matter: if my dependency graph ends up using a dozen different build systems (CMake, Meson, Boost Build, Autotools, etc), sooner or later I will hit a build issue that I will have to debug/fix in a build system I don’t care about, written in a language I don’t know. There are other issues here, BTW, like building components of your dependencies that you don’t need (such as building the entire Boost while you only use a handful of libraries) and loss of parallelism because your build graph is aggregated at build system boundaries. You can find more on this here (including some concrete build speed numbers): https://build2.org/faq.xhtml#why-package-managers

                                                                                                              1. 9

                                                                                                                Honestly, I tried to write it in C++. But do you realize the amount of boilerplate code you need to reliably run a command, while redirecting its stdout/stderr to a file, and setting its environment variables in a cross-platform way? In Rust, I have std::process::Command right there.

C++ is great, and most of my code is modern C++20 / C++23 when the compiler supports it, but then you have a use case that is trivial in every other language, and somehow in C++ you have to write C code like it’s 1980 again. When you need more than 500 LOC just to run a command, you start questioning whether the language you are using is the right tool for the job.
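To make the comparison concrete, here is a minimal sketch of that use case with std::process::Command; the helper name, command string, log path, and variable name are all illustrative, not from any real project:

```rust
use std::fs::File;
use std::process::{Command, Stdio};

// Run `cmd` through the shell, sending both stdout and stderr to a log
// file, with an extra environment variable set for the child only.
fn run_logged(cmd: &str, log_path: &str) -> std::io::Result<bool> {
    let log = File::create(log_path)?;
    let status = Command::new("/bin/sh")
        .arg("-c")
        .arg(cmd)
        .env("BUILD_STEP", "1") // per-child env; the parent is untouched
        .stdout(Stdio::from(log.try_clone()?)) // redirect stdout to the file
        .stderr(Stdio::from(log)) // and stderr to the same file
        .status()?;
    Ok(status.success())
}
```

The portability caveat stands, of course: this sketch is Unix-only because of /bin/sh, and on Windows you would invoke the program directly or go through cmd.exe.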

                                                                                                                1. 1

                                                                                                                  But do you realize the amount of boilerplate code you need to reliably run a command, while redirecting its stdout/stderr to a file, and setting its environment variables in a cross-platform way?

Oh, I sure do: https://github.com/build2/libbutl/blob/master/libbutl/process.hxx I even know how to reliably move/remove a file, even on Windows: https://github.com/build2/libbutl/blob/master/libbutl/filesystem.cxx#L1784 ;-)

                                                                                                                  you have a use case that is trivial in every other language, and somehow in C++ you have to write C code like it’s 1980 again

                                                                                                                  True. Though part of the reason why it’s trivial in every other language is because they have a sane build system/package management story. Which means if process creation/management is not in the standard library, then by now there is likely a well-established package that most people use. So the way I see it the choices are basically:

                                                                                                                  1. Use a different language to try and solve the C/C++ build/packaging problem, don’t get any adoption (for reasons mentioned above), definitely fail to solve the problem, and still not have a well-established process creation/management package for C++.

2. Bite the bullet and do it in C++, which entails implementing most of the cross-platform abstractions from scratch. Maybe get adoption and maybe succeed, which will then hopefully lead to a well-established process creation/management package for C++.

                                                                                                                  Also, the “trivial in every other language” part is wishful thinking, at least if you are trying to create a cross-platform build system that is usable in the real world. Take the “removing a file on Windows” problem I mentioned above: to solve it we retry the operation for some time. This is acceptable for a build system (with some sensible limit, like a few seconds) but there is no way this is acceptable in a general-purpose API that is found, say, in the Rust standard library (but, reportedly, this is what Cosmopolitan libc does by default, incredibly). So I believe that as your trivially-written toy build system moves into the real-world direction, you will find yourself re-implementing more and more of the cross-platform abstractions from scratch to fit your needs. An example for the process management case that immediately comes to mind is support for thread-local environment variables (i.e., in a multi-threaded build system it is handy to be able to change the environment on the per-thread basis and then start a process from this thread that automatically inherits this thread-specific environment).

                                                                                                                  1. 7

                                                                                                                    i.e., in a multi-threaded build system it is handy to be able to change the environment on the per-thread basis and then start a process from this thread that automatically inherits this thread-specific environment

                                                                                                                    I have no idea why you would modify the environment in the build system to do this. Why would you not just have a dictionary (or a vector of key-value pairs as strings) that you pass as the environment into the routine that handles process creation? That’s what I do in every program where I’ve ever needed to do this in C++. Modifying your environment after process creation is such a spectacularly bad idea (and is undefined behaviour in a multithreaded program). If it’s not the environment and is just a dictionary that is per-thread or per-branch-of-the-build-process context then it shouldn’t need anything complicated, it’s just another collection in your state.
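A sketch of that shape in Rust terms, where the environment is just a value the caller passes in (the program path and variable names are illustrative):

```rust
use std::collections::HashMap;
use std::process::Command;

// The child's environment is an ordinary dictionary owned by the caller;
// nothing mutates the parent process's environment.
fn run_with_env(env: &HashMap<String, String>) -> std::io::Result<String> {
    let out = Command::new("/bin/sh")
        .arg("-c")
        .arg("echo \"$GREETING\"")
        .env_clear() // start from a pristine environment
        .envs(env)   // then add exactly what the caller chose
        .output()?;
    Ok(String::from_utf8_lossy(&out.stdout).trim().to_string())
}
```

The dictionary can be per-thread or per-branch-of-the-build context state; it only ever touches the child at spawn time.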

                                                                                                                    1. 1

                                                                                                                      Modifying your environment after process creation is such a spectacularly bad idea (and is undefined behaviour in a multithreaded program)

                                                                                                                      I came to say this too. I think this article speaks to the issue of setenv(3) not being thread-safe pretty well.

                                                                                                                      1. 1

                                                                                                                        I have no idea why you would modify the environment in the build system to do this. Why would you not just have a dictionary (or a vector of key-value pairs as strings) that you pass as the environment into the routine that handles process creation?

Yes, of course, we have this, and that’s what you would use if you wanted to, say, set a specific environment variable when creating a process from a rule implementation.

                                                                                                                        But we also have a notion of hermetic build configurations where environment variables that affect the project are saved in its configuration and then that’s the environment that the project “sees” from now on regardless of the changes to the environment from which the build is running. The way we implement this is by changing the thread-local environment whenever we switch the project any particular thread is working on. The benefit of this is that calls to getenv() (our custom wrapper that consults the thread-local overrides), process creation, etc., all “see” this project-specific environment without us having to drag it explicitly through many, many levels of calls.

                                                                                                                        such a spectacularly bad idea

                                                                                                                        I guess one man’s spectacularly bad idea is another man’s elegant solution. Unfortunately, in this field we too often pass judgement on a solution without fully understanding the problem.

                                                                                                                        1. 4

                                                                                                                          But we also have a notion of hermetic build configurations where environment variables that affect the project are saved in its configuration and then that’s the environment that the project “sees” from now on regardless of the changes to the environment from which the build is running.

                                                                                                                          That sounds like a useful thing, but doesn’t require modifying the current process’s environment.

                                                                                                                          The way we implement this is by changing the thread-local environment whenever we switch the project any particular thread is working on. The benefit of this is that calls to getenv() (our custom wrapper that consults the thread-local overrides), process creation, etc.,

                                                                                                                          Why is any of this using environment variables, rather than a dictionary that may be populated on startup from sources including environment variables?

                                                                                                                          I guess one man’s spectacularly bad idea is another man’s elegant solution. Unfortunately, in this field we too often pass judgement on a solution without fully understanding the problem.

                                                                                                                          No, it’s a spectacularly bad idea because it can make unrelated code in other threads segfault, depending on the libc implementation.

                                                                                                                          1. 1

Of course we don’t change the process’s physical environment (i.e., call libc’s setenv()). Rather, we’ve implemented our own wrappers for getenv()/setenv() that logically allow changing the environment on a per-thread basis (by keeping a thread-local dictionary of overrides to apply on top of the original process environment).

                                                                                                                            The point I was making in my original comment is that a process creation support in the language’s standard library or a third-party package is unlikely to integrate seamlessly with such a custom environment mechanism.
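For illustration only (this is not build2’s actual API), such a mechanism can be sketched as a thread-local overlay consulted by custom getenv()/setenv() wrappers:

```rust
use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
    // Per-thread overrides layered on top of the real process environment.
    static ENV_OVERRIDES: RefCell<HashMap<String, String>> =
        RefCell::new(HashMap::new());
}

// Consulted instead of libc getenv(); the process environment itself is
// never mutated, so other threads are unaffected.
fn getenv(name: &str) -> Option<String> {
    ENV_OVERRIDES
        .with(|o| o.borrow().get(name).cloned())
        .or_else(|| std::env::var(name).ok())
}

// Records an override visible only to the calling thread.
fn setenv(name: &str, value: &str) {
    ENV_OVERRIDES.with(|o| {
        o.borrow_mut().insert(name.to_string(), value.to_string());
    });
}
```

Process creation would then merge these overrides into the child’s environment explicitly at spawn time, rather than inheriting whatever the parent happens to have.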

                                                                                                                            1. 4

                                                                                                                              The point I was making in my original comment is that a process creation support in the language’s standard library or a third-party package is unlikely to integrate seamlessly with such a custom environment mechanism.

                                                                                                                              I don’t see why you would think that. The APIs I’ve seen for process creation all allow the caller to specify the environment variables in a similar way to the arguments. NSTask exposes a dictionary for the environment that you can set. The Python APIs that I remember used a kwarg thing to take the environment. The Rust ones allow you to either add / remove environment variables, or start with a pristine environment and add your own. The underlying execve takes the environment as a pointer to an array of strings (and inheriting the current process’ environment is not the default, it requires explicitly capturing it and passing it).

                                                                                                                              I have never come across a process-creation API in high- or low-level APIs that didn’t give complete control over the environment.

                                                                                                                              Redirecting arbitrary (not the first three) file descriptors is not always surfaced and various things like entering capability mode, attaching to a jail, applying a secomp-bpf filter, pledging, and so on are all likely to be unavailable in high-level abstractions, unfortunately.

                                                                                                                              1. 1

Yes, maybe this won’t force you to write process management from scratch. You would still probably want to wrap the existing implementation in something that automatically merges the environment bits coming from different sources. There are also nuances like searching for an executable in PATH, which the process creation API usually provides (the PATH may have to come from the thread-local override). When I was adding hermetic configuration support in build2, I was sure glad I could just put the necessary bits directly into our process management implementation rather than having to dance around third-party code I could not touch.

                                                                                                                  2. 5

                                                                                                                    The “it’s just a static binary” argument falls apart as soon as a C programmer needs to fix or debug an issue in your Rust program.

                                                                                                                    People with that sort of anti-learning mentality probably aren’t very good programmers anyways. Especially given that Rust can teach you a lot on how to write better C.

                                                                                                                  3. 1

                                                                                                                    Every modern C compiler is a C++ compiler (except tcc), so a C++ dependency is not so bad since it is already provided by the compiler you use to compile C.

                                                                                                                  4. 5

                                                                                                                    It’s a downside to need two languages, but not a big one. You probably have tools written in many different languages on your machine.

                                                                                                                    1. 1

                                                                                                                      I disagree. If your language package manager is written in the language it manages, it creates a circularity problem that is a consistent PITA. (Unless, like the NPM or Go tool, you build it into the compiler itself.) People keep running into this problem over and over again in Python. Node has a related problem with Babel being written in JS, so you need to install and configure some JS to read your JS configs. It’s better if you can install one thing with your OS package manager, and then delegate language management to it.

                                                                                                                      1. 1

                                                                                                                        I can see why some would prefer that, but I don’t agree. scons is written in Python, and is widely used by several C/C++ projects (eg: Godot). clang is written in C++, and yet people use it to compile C programs.

                                                                                                                        The JS ecosystem had been eating its own tail for the longest time (ESLint, Webpack, Prettier, etc.), until more performant tools written in other languages came about (Biome, ESBuild, SWC, etc.). Today, ESBuild is easily the best build system for JavaScript, although written in Go.

                                                                                                                    2. 3

It seems it’s not for Rust deps; it’s only written in Rust.

                                                                                                                    3. 1

                                                                                                                      The real question is why mDNS is authoritative for anything but *.local

                                                                                                                      1. 4

mDNS is just the name of the DNS daemon.

                                                                                                                          1. 2

                                                                                                                            From the article:

                                                                                                                            mDNSResponder, also known as Bonjour, is the macOS system daemon historically responsible for Multicast DNS, i.e. things like finding and resolving other devices on your network via a special .local TLD.

                                                                                                                            However, mDNSResponder also handles unicast (“normal”) DNS - and it handles all DNS queries that are not explicitly made via a resolver. In fact, it also handles queries done via getaddrinfo() through a UNIX domain socket on /var/run/mDNSResponder as well as a dedicated XPC service named com.apple.dnssd.service.

                                                                                                                      2. 38

                                                                                                                        I’ll just note that regardless of language used, when asked why is this program slow, programmers who know C provide more credible answers in my opinion. Programmers who don’t are far more likely to blame the hardware for being more than a year old. But nobody needs to write fast software. A new MacBook is only a few thousand dollars.

                                                                                                                        1. 14

                                                                                                                          I was reading the thread to find this.

It’s not about the language; it’s about the mindset the language instills after a while of playing with and twisting bytes. Most of the engineers on my technical team are web-related, and the CEO and I both have game programmer backgrounds. The difference in debugging workflow and mindset is enormous.

                                                                                                                          1. 8

As much as I hate the broad classification of web programmers in this light (myself being one), there is something to this.

I bought a $3000 MacBook 4 years ago (pre-M1) that is starting to struggle with completely reasonable tasks.

I wouldn’t say it’s because web developers are stupid, as I have worked with mostly very smart people, but people reach for an extremely complex architecture way too early: microservices, SPAs, x-overly-hard-to-debug-js-state-management-library, etc. Once you’ve started down this path, most people have lost the ability to achieve decent performance; it’s just about staying above water.

The part that really bothers me is that JavaScript isn’t slow. Even simple SPA setups with frameworks do not have to be that complex and can perform just fine, but you have to be extremely disciplined, go against the grain, and stay out of fashion-oriented programming.

                                                                                                                            1. 6

                                                                                                                              programmers who know C provide more credible answers in my opinion.

                                                                                                                              I wanted to provide a counter example of, say, you don’t need to know C… Zig programmers… but the reality is that most Zig programmers are coming from C anyway… so it’s not a counter example.

                                                                                                                              Maybe Forth programmers are a counter example? All 10 of them… :)

                                                                                                                              1. 4

                                                                                                                                I wonder how many programmers know x64 or arm64 assembly but don’t know C. It seems quite possible in our brave new world of Go / Rust for systems programming…

                                                                                                                                1. 2

My experience is that programmers who know C can only provide sometimes-credible-ish performance answers for things that are in their specific wheelhouse. Someone whose performance-analysis toolbox only contains things like alignment and cache coherency is not super helpful to me; I need the person whose performance-analysis toolbox is reading query plans or looking for N+1 issues. But then I also don’t particularly feel the need to sneer at and belittle programmers who work in fields that aren’t my own or whose day-to-day problems are different from my own or require different skillsets.

                                                                                                                                2. 3

                                                                                                                                  I was going to say it should be possible to shuffle the confetti from many bills more thoroughly and prevent reconstruction, but they don’t really get that far anyway. It would have been funny if they discovered that all the serial numbers were already punched out before shredding.

                                                                                                                                  1. 8

Is there a reason why this is better than the existing tar.bz2 builds? Download, untar, run. Works great. (Besides, the link on the page to the tar.bz2 builds is broken.)

                                                                                                                                    1. 8

                                                                                                                                      Nope, that’s the same build and that’s how I run Firefox (Nightly) in Linux. The only technical difference is that your computer is running different code during an update and it’s more controllable with apt.

                                                                                                                                      1. 3

                                                                                                                                        Do the tarballs update or does one have to remember to go download the latest version? A repository is usually preferred for this reason alone.

                                                                                                                                        1. 7

                                                                                                                                          Auto updates for me. I just leave it unpacked in the downloads folder, with a symlink in ~/bin.

                                                                                                                                          1. 5

                                                                                                                                            For me personally, I use the Debian version specifically because it doesn’t auto-update at random intervals. Other than security updates, I want new features when I’m good and ready for them, I don’t want random things moved or removed from one day to the next.

                                                                                                                                            1. 4

                                                                                                                                              You could get the same result by using Firefox ESR directly. IIRC that’s how Debian gets their stability anyway. (Though the Debian package in that case is probably still more convenient to install.)

                                                                                                                                          2. 2

                                                                                                                                            One reason it might be worse: I tried switching from the tar build to the deb package, and despite the blog’s promise that it will keep using your existing ~/.mozilla/firefox folder, it does not, and loses all open tabs & preferences. (I didn’t try to debug it much before giving up and switching back. I assume they just missed something simple.)

                                                                                                                                            1. 3

I had the same behaviour, but found that it does use the same files; it had just created a new profile. I navigated to about:profiles, set the previous one as the default, restarted the browser, and everything was back as I had left it.

                                                                                                                                          3. 8

                                                                                                                                            Thought it was a good quiz. If you weren’t able to answer correctly, it should humble you in your ability to write correct portable C. That’s a good thing overall. I’m concerned at the realistic prospect that I will be treated with a medical device whose software was written by an engineer that was incapable of responding correctly to a quiz like this. I’m terrified if that same engineer responded to their failure of this quiz with hubris instead of humility.

                                                                                                                                            1. 2

                                                                                                                                              Are you afraid the size of int in your medical device will change mid-treatment?

                                                                                                                                              1. 8

                                                                                                                                                No, but an integer might overflow in the medical device because the programmer made a wrong assumption about its size.

                                                                                                                                                1. 3

                                                                                                                                                  Still, overflows can happen in other languages too. The finite size of processor registers is hardly a problem unique to C.

                                                                                                                                                  1. 3

                                                                                                                                                    C is not unique, but is quite unusual, in having a default set of integer types whose ranges are implementation defined. Most other languages do not have this problem. The really fun ones are Lisp, Smalltalk, and similar, where small integers are stored aliased with pointers and are automatically promoted on overflow and so 32-bit platforms hit a massive performance cliff long before 64-bit ones.

                                                                                                                                                    1. 1

                                                                                                                                                      Languages like Rust, Perl, Go, Pascal, Fortran, etc. all have integer sizes which are based on the native size. It’s not at all unusual. Languages like Rust, Go and C/C++ also have fixed size ints available. I don’t really see how C’s unusual here, in having both fixed and native integer sizes available.

                                                                                                                                                  2. 3

                                                                                                                                                    And that’s why you write: _Static_assert(sizeof(int) >= 4, "Platforms where int is less than four bytes are not supported") somewhere in your code and forget about it. Or just use int32_t where you care.

                                                                                                                                                  3. 7

                                                                                                                                                    No, I’m worried that some math calculation promotes/converts in an unexpected way and some unlikely value/timing results in the code misbehaving and it wasn’t noticed in testing because the particular values that would trigger this are incredibly unlikely.

                                                                                                                                                    1. 3

                                                                                                                                                      Haha. I’m just paranoid about medical devices in general. The story of Therac-25 changed me. It is funny that my overall pessimism about other engineers has turned me into a Rust zealot. Is this the mentality that drives it? I think I preferred my naive optimism. Now when I look at other programmers, I no longer see human beings, I see undefined behavior.

                                                                                                                                                    2. 1

                                                                                                                                                      https://www.youtube.com/watch?v=yE5Tpp2BSGw

                                                                                                                                                      During the questions section, Rob Pike mentions that he wishes for ints in Go to be arbitrary precision.

As I understood it, he suggests that it might be a backwards-compatible change and that it can be reasonably performant.

                                                                                                                                                      I doubt it will happen, but it would be interesting to see the consequences of this change.

                                                                                                                                                      It would be interesting to have a relatively performant language where you don’t have to worry about numeric overflows by default.

                                                                                                                                                    3. 26

I wish it were as easy to delete my Microsoft GitHub account as it was to delete my Microsoft LinkedIn account. My network was never really on LinkedIn, but you can barely participate in open source without a Microsoft GitHub account. If you send an email patch, many folks don’t know what to do with it (answer: git am); if you send it to a mailing list (when one exists), many times the patchset just gets opened as a pull request on the Microsoft GitHub repository & discussions are expected to happen there (which you can’t do without an account). Compounding this, there is often no external bug tracker, or all too often the only alternative channel of communication is proprietary Discord (which I won’t create an account for; my GitHub account predates Snowden & exists as legacy from when we were told you had to build a portfolio exclusively there, before I “saw the light”). All but my Elm projects have moved elsewhere (since that community ties its identity to Microsoft GitHub & package management requires it), but contributing to other projects is something I do a lot more. Luckily a lot of older projects have been moving away, but just as I had an account when I was younger & looking for advice in the space, newer projects from the next generation are taking the place of those projects, likely used to the hustle act of social media (which, let us be real: Microsoft GitHub is a social media network).

Either way, I’m proud the writer migrated & slowly I hope we can see the shift away from these large proprietary services and neo-EEE. Even just a separate mirror is useful for contributions, whether for those locked out of Microsoft GitHub by sanctions or for those who wish not to use it on moral grounds. The writer may, however, want to include an issue tracker at some point, since you can find a bug yet not know how to fix it; they mentioned preferring mailed patches to issues (meanwhile I tried doing this with a PR this week & was told I should have opened an issue first 🤷).

                                                                                                                                                      1. 16

                                                                                                                                                        If you send an email patch, many folks don’t know what to do with it (answer: git am);

                                                                                                                                                        But only in a sandboxed environment, because you don’t want to be applying arbitrary patches to a build system and then running arbitrary code sent by some random internet person.

Honestly, for me, this is one of the biggest benefits of things like GitHub. They make it trivial to apply patches, build, and run tests in an ephemeral worker, which has no access to anything I care about. If you compromise a GitHub worker, I don’t care: mine all the bitcoin you like, it’s costing Microsoft money, not me.

                                                                                                                                                        It’s possible to do something similar via email, but it’s nontrivial.

                                                                                                                                                        1. 18

                                                                                                                                                          Yeah, good luck having contributors sending patches via email. Here is what will likely happen:

                                                                                                                                                          • I want to contribute to this project
                                                                                                                                                          • Oh it’s hosted on some random custom git stuff and I can’t even open an issue? Forget it.

I’ve been doing git related dev stuff for many years, and not ONCE have I had to send a patch via email.

I’m not a big fan of centralization, but after getting used to the GitHub/Lab level of PR review, build, lint, test, and the amazing tools available to review commit by commit, hide whitespace changes, leave comments on specific lines, etc., going to an email based workflow like you’re some kind of kernel dev seems a clear loss in my book!

                                                                                                                                                          But if it makes them all warm and fuzzy, why not, after all, they do what they think is best!

                                                                                                                                                          1. 50

                                                                                                                                                            I think I am the maintainer of the only GNU project hosted on GitHub. Richard Stallman periodically emails me to tell me I’m evil, but since I moved it to GitHub the number of external contributors has gone from zero to several highly motivated people. Most recently, a student found it and did RISC-V and PowerPC ports (some bits need to be in platform-specific assembly). Being on GitHub increased the amount of Free Software in the world.

                                                                                                                                                            I don’t like GitHub’s monopoly position. I refused to even create an account for many years but at the end of the day you have to pick your battles. Given a choice between using a non-Free platform and having fewer contributors, I regard the non-Free platform as the lesser evil. Particularly since there is nothing intrinsic to GitHub that we need, so we can always move to a better alternative if one exists. I’d love to see a truly distributed revision control system, with a mechanism for members to contribute the ability to run isolated VMs for other people’s CI, for example.

                                                                                                                                                            1. 11

                                                                                                                                                              Richard Stallman periodically emails me to tell me I’m evil

Wait, does he actually do this? Based on his reputation, I wouldn’t put it past him…

                                                                                                                                                              I’d love to see a truly distributed revision control system, with a mechanism for members to contribute the ability to run isolated VMs for other people’s CI, for example.

                                                                                                                                                              I think the greatest tragedy of source control that originated from the Unix world is that they just dealt with source, not the whole SDLC. Mainframe based ones emphasize things like deployment and testing and are generally called “change management”. Distributed change management could be quite interesting.

                                                                                                                                                              1. 10

                                                                                                                                                                Wait, does he actually do this

                                                                                                                                                                Yup, and they’re long. He is very articulate and they are well written, I just find the assumption that I haven’t considered the issues that he’s discussing to be mildly insulting.

                                                                                                                                                                1. 2

                                                                                                                                                                  Do they all start with

                                                                                                                                                                  [[[ To any NSA and FBI agents reading my email: please consider ]]]

                                                                                                                                                                  [[[ whether defending the US Constitution against all enemies, ]]]

                                                                                                                                                                  [[[ foreign or domestic, requires you to follow Snowden’s example. ]]]

                                                                                                                                                                  ?

                                                                                                                                                                  1. 2

                                                                                                                                                                    Yup:

                                                                                                                                                                    [[[ To any NSA and FBI agents reading my email: please consider    ]]]
                                                                                                                                                                    [[[ whether defending the US Constitution against all enemies,     ]]]
                                                                                                                                                                    [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
                                                                                                                                                                    

                                                                                                                                                                    Message goes here.

                                                                                                                                                                    --
                                                                                                                                                                    Dr Richard Stallman (https://stallman.org)
                                                                                                                                                                    Chief GNUisance of the GNU Project (https://gnu.org)
                                                                                                                                                                    Founder, Free Software Foundation (https://fsf.org)
                                                                                                                                                                    Internet Hall-of-Famer (https://internethalloffame.org)
                                                                                                                                                                    
                                                                                                                                                              2. 8

                                                                                                                                                                Richard Stallman periodically emails me to tell me I’m evil

                                                                                                                                                                This is amazing and I would print them out and frame them.

                                                                                                                                                                1. 7

                                                                                                                                                                  Particularly since there is nothing intrinsic to GitHub that we need, so we can always move to a better alternative if one exists.

                                                                                                                                                                  The thing that’s intrinsic to GitHub that the project needs is the network effect - you said so yourself when you argued that using GitHub is why the project has several contributors it otherwise wouldn’t have. This is exactly the same thing that makes GitHub pernicious - if you do want to move to a better alternative, including an ethically-better alternative according to Richard Stallman’s set of ethics around software, you’ll face the headwind of losing that network effect for yourself, and only marginally weakening it for GitHub as a whole.

                                                                                                                                                                  1. 1

                                                                                                                                                                    It’s unfortunate there is not enough discipline within the GNU organization to maintain their principles across all projects.

                                                                                                                                                                    1. 38

                                                                                                                                                                      It’s unfortunate that the FSF never invested in building infrastructure that makes it easy to contribute to Free Software projects and instead just tries to berate people into using Savannah.

                                                                                                                                                                      1. 4

                                                                                                                                                                        It is easy to contribute to Savannah projects. It does not require creating an account.

                                                                                                                                                                        1. 33

                                                                                                                                                                          Don’t be disingenuous. The user experience for Savannah is pretty miserable, especially as a non-contributor. It’s very much a SourceForge-era thing.

                                                                                                                                                                          I think running an FSF/GNU gitea/Sourcehut instance would probably be better than what they’re doing now, and take a lot less effort to maintain too.

                                                                                                                                                                          1. 8

                                                                                                                                                                            I think running an FSF/GNU gitea/Sourcehut instance would probably be better than what they’re doing now, and take a lot less effort to maintain too.

                                                                                                                                                                            I wonder why they don’t do that. Savannah/nongnu isn’t bad but it’s definitely emitting a moldy 90s odor. Tries to resist making moldy odor joke about FSF folks

                                                                                                                                                                            1. 4

                                                                                                                                                                              I completely reject your accusation of disingenuousness, and the rest of your comment is useless because it lacks specifics. The user experience of Savannah is perfectly acceptable, unlike more popular options which can’t even list the contents of a directory half the time.

                                                                                                                                                                        2. 6

                                                                                                                                                                          Major projects have left GNU over terrible policy enforced from above before, so that’s obviously never going to happen.

                                                                                                                                                                          1. 3

                                                                                                                                                                            Isn’t that an indication of discipline being enforced, rather than the opposite?

                                                                                                                                                                            1. 3

                                                                                                                                                                              No, because GNU doesn’t want to lose relevance by losing projects like GCC, and if they push things too far again they’ll quickly reverse course and kindly ask any project considering leaving to please not.

                                                                                                                                                                              1. 1

                                                                                                                                                                                did the major projects you have in mind all leave around the same time as a result of the same “terrible policy”?

                                                                                                                                                                                1. 1

                                                                                                                                                                                  putting our stupid argument aside, I was actually interested in the GNU policies and former GNU projects you had in mind, if you could provide any specifics.

                                                                                                                                                                                  1. 5

                                                                                                                                                                                    I got kinda tired, and I didn’t really like your line of questioning. Different groups have split, threatened to split, or come back to GNU over the years for different reasons. It’s late here but I’ll type out some stuff I can remember.

                                                                                                                                                                                    It’s worth noting that for the most part, projects can’t really leave GNU officially. If the developers decide to leave, the FSF will consider it a fork and ‘continue’ development of the GNU version. So most examples would usually be reported as forks.

                                                                                                                                                                                    Most recently, GCC probably would’ve left in 2022 if Stallman had actually tried to enforce his authority. (He had been kicked from the GCC steering committee* in 2021, but tried to make changes from above in 2022).

                                                                                                                                                                                    Libreboot I think is the most recent project to actually leave. It was only part of GNU for a short while, and left over allegations of transphobia.

                                                                                                                                                                                    Not long before that, the developer responsible for something like 95% of new code in Nano for years refused to sign a copyright assignment agreement with the FSF, and decided that Nano was no longer going to be a GNU project. The FSF considered this a hostile fork. “GNU Nano” continued to exist but was basically dead. Some time later, Nano rejoined GNU, though I don’t know the politics behind that.

                                                                                                                                                                                    (* you probably already know this one, but it’s kinda funny, the FSF-approved maintainer of GCC couldn’t really keep up back in the day, so a bunch of people representing the majority of development of GCC created a friendly fork called EGCS, which originally tried to simply streamline upstream GCC development, but GCC ended up lagging so far behind that EGCS became a hard fork. EGCS didn’t want any one company or group to control the project, so they created a steering committee to run it instead of the typical “maintainer” idea you usually see in open source. With GCC basically dead, and after a lot of back and forth and negotiating with the FSF and Stallman, the EGCS steering committee was appointed the role of maintainer of GCC, and GCC was replaced with EGCS. This is why the GCC logo is an egg.)

                                                                                                                                                                        3. 21

                                                                                                                                                                          after being used to the GitHub/Lab level of PR review

                                                                                                                                                                          The worst part is “the github level of PR review” still has serious flaws.

                                                                                                                                                                          For instance, if there’s >10 comments on a pull request, the API to show them is paginated, and users have to spot the unobtrusive button that says “load more”.

Many pages remove items from the DOM when they’re scrolled out of view (for performance), so such exotic techniques as ctrl-f to search for text on the page don’t work.

                                                                                                                                                                          I’d love to have something with the nice parts of github pull requests that didn’t feel like running through treacle (often over a second elapses from “click add comment” to the “add comment” form appearing). I suspect they haven’t considered performance from a non-USA location or something (Australia to USA is a bit of a hop, even at light speed).

                                                                                                                                                                          1. 11

Many pages remove items from the DOM when they’re scrolled out of view (for performance), so such exotic techniques as ctrl-f to search for text on the page don’t work.

                                                                                                                                                                            OMG thank you for explaining this. I wavered between blaming Firefox and assuming I was going crazy when ctrl-f wasn’t working recently.

                                                                                                                                                                          2. 4

For me the biggest pain of PRs is that I cannot send a patchset as a PR; it is always “whole branch or nothing”. This means I cannot reliably work on top of work from another PR and make mine an independent PR without a painful amount of effort, and that slows me down. So being able to send a set of commits, or just a patch, and let GitHub make a PR out of it “automatically” would be an enormous thing. But I feel that GH is not keen on that.
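For what it’s worth, plain git can already produce the “set of commits” half of that flow; a minimal sketch in a throwaway repo (branch names and file names are hypothetical):

```shell
# Sketch of a patchset flow, run in a throwaway repo so it works anywhere.
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main .
git -c user.email=you@example.com -c user.name=You commit -q --allow-empty -m "base"
git checkout -qb feature
echo "fix" > fix.txt && git add fix.txt
git -c user.email=you@example.com -c user.name=You commit -q -m "add fix"

# One .patch file per commit: this is the "set of commits" a reviewer
# could take without pulling the whole branch.
git format-patch -o patches/ main..feature
ls patches/    # 0001-add-fix.patch
```

With git send-email configured, the set could then go to a mailing list (e.g. `git send-email --to=~user/project@lists.sr.ht patches/*.patch`, address hypothetical); what GitHub lacks is the other half, turning such a set into a PR automatically.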

And that workflow is why I have been moving my projects to SourceHut recently, at least the hacking projects that I don’t expect to receive much “contributing traffic”. Simply because, AFAIK, no other forge supports such a feature. I would love to test Pijul, but that project is still very much in flux, and the only forge for it that I am aware of is based on Cloudflare thingies, so that is a no-go. I would love to see something like SourceHut, cgit, or even Forgejo, but for Pijul; then I would think about migrating. But Git is still king, and as no other forge allows me the flow that I want, I stay with the tooling that supports it, even if it is a little bit alien to potential contributors.

                                                                                                                                                                            1. 1

                                                                                                                                                                              What’s the problem with that forge using Cloudflare services?

                                                                                                                                                                              1. 6

Centralization of the internet, and making folks like me who don’t live in the West solve hCAPTCHA Sudokus to train their AI models in exchange for just visiting a site (like GitLab) … that is, if I haven’t been outright banned, with the error page saying no content will be served to my country’s IP.

                                                                                                                                                                                But this might be referring to using those worker models where you put everything in their cloud, not just DNS, not just a proxy.

                                                                                                                                                                                1. 2

I want to host it on my own. No Cloudflare or other extra services. Just like cgit: let me show the project code; you can clone it and read some docs. Nothing more than that is needed. Let the community write the rest of the forge if they require it; let’s focus on the tool, not forges.

                                                                                                                                                                                  1. 1

I strongly sympathize with the sentiment; I am just not sure we have many good self-hostable Git software candidates. I keep an eye on the Rust Git implementation, but it’s not moving fast, and it might be a long time before it’s ready for production usage.

RE: focusing on the tool and not the forge, I agree, though I wouldn’t give up GitHub’s or GitLab’s PR/MR UI flows. They are super convenient. Needing to fork is pretty dumb, though; I agree with your other comment that simply submitting a commit set should be quite enough for a PR/MR.

                                                                                                                                                                                2. 1

                                                                                                                                                                                  Darcs Hub or Smederee might fit your favored workflow.

                                                                                                                                                                              2. 4

I don’t care: mine all the bitcoin you like, it’s costing Microsoft money, not me.

                                                                                                                                                                                That’s a terrible way of thinking about things. What if your CI was not owned by a large corporation?

                                                                                                                                                                                1. 10

                                                                                                                                                                                  Then you’d approach it differently than you would with MS Github.

                                                                                                                                                                                  1. 5

                                                                                                                                                                                    I disagree, having good project hygiene should not depend on who’s hosting you.

                                                                                                                                                                                2. 4

                                                                                                                                                                                  But after you merge the pull request, aren’t you going to be running that code on your machine?

                                                                                                                                                                                  1. 7

                                                                                                                                                                                    Yes, but that’s after it’s had code review and after it’s run tests. I want tests to run before I even look at a patch, because automated testing scales a lot better than human attention and so serves as a filter. In most cases, I don’t look at a patch until CI is green (including code formatting things and clang-tidy warnings about naming). At that point, I know it works and is correctly formatted and I can look for things that need a human reviewer.

                                                                                                                                                                                  2. 2

True. The only patches I’ve ever received were trivial and easy to predict just by reading, and you should be reading the patchset anyway. SourceHut will run the project’s .build.yml in its CI on patches sent to the mailing list, if you set it up, which gives you integration while still keeping the mailing list if you prefer. That said, most DVCS tools let you diff pull/merge requests if you’re sent a repository, and CLI tools show the diff fairly straightforwardly (and give you better control over the diff tool). That’s not to discount the time investment and tool learning required to use these over a web GUI (and Microsoft GitHub’s is not the only, or the best, web interface).

                                                                                                                                                                                    1. 1

                                                                                                                                                                                      you don’t want to be applying arbitrary patches to a build system and then running arbitrary code sent by some random internet person.

                                                                                                                                                                                      If there’s one thing you can say about GitHub, it’s that they work hard to make contributing changes easy. Their hard work doesn’t go unappreciated.

                                                                                                                                                                                      https://johnstawinski.com/2024/01/11/playing-with-fire-how-we-executed-a-critical-supply-chain-attack-on-pytorch/

                                                                                                                                                                                    2. 4

                                                                                                                                                                                      It would honestly be really cool if Github supported some sort of patch based workflow, if only because the fact that people fork projects for the purpose of just sending in a patch is kinda silly!

From a practical perspective, Github still offers a lot of decent UX stuff for project management, and my understanding is that everyone else is playing catch-up. Would love to see someone show up with a completely fresh take instead. It’s hard to be merely reactive.

                                                                                                                                                                                      1. 4

Patch-based workflows are even better if you are using Darcs or Pijul, whose forges would be elsewhere, because with Git the order matters or you get merge conflicts. Microsoft GitHub doesn’t even let you attach *.diff or *.patch files in their UI, despite allowing things like Microsoft Word documents.

That said, there is some value in an integrated system for project management, but I’m not entirely convinced tying it to the forge is better; it’s just that the JIRAs of the world offer bad UX, and developers have a bias towards “the code is the product”, where I’ve heard non-developers aren’t a fan (the same applies to GitLab).
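The claim that patch order matters with Git is easy to demonstrate in a throwaway repo; a minimal sketch (commit and file names are hypothetical):

```shell
# Three commits each rewrite the same line; export the last two as
# patches and try to apply them out of order.
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main .
echo "v1" > file.txt && git add file.txt
git -c user.email=a@example.com -c user.name=A commit -q -m "c1"
echo "v2" > file.txt
git -c user.email=a@example.com -c user.name=A commit -qam "c2"
echo "v3" > file.txt
git -c user.email=a@example.com -c user.name=A commit -qam "c3"
git format-patch -2 -o p/ >/dev/null    # writes 0001-c2.patch, 0002-c3.patch

git reset -q --hard HEAD~2              # back to v1
git apply p/0002-c3.patch || echo "c3 alone conflicts: it expects v2"
git apply p/0001-c2.patch && git apply p/0002-c3.patch   # in order: fine
cat file.txt    # v3
```

Darcs and Pijul sidestep this in many cases because independent patches commute, so they can be applied in either order.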

                                                                                                                                                                                      2. 3

                                                                                                                                                                                        If you send an email patch, many folks don’t know what to do with it (answer: git am);

How does that work with webmail? Say Gmail, to keep it simple.

                                                                                                                                                                                        1. 7

With webmail you’d generally download the patch attachment (or copy/paste it if it was inline) to a text file and use git apply. git am just trawls a mailbox to get the patch text and runs git apply behind the scenes, also creating the commit from the mail headers.
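The difference between the two is easy to see in a throwaway repo; a minimal sketch, assuming git is installed (file and author names are hypothetical):

```shell
# Create a repo with one commit to export as a mail-style patch.
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main .
git -c user.email=alice@example.com -c user.name=Alice commit -q --allow-empty -m "base"
echo "hello" > greeting.txt && git add greeting.txt
git -c user.email=alice@example.com -c user.name=Alice commit -q -m "add greeting"

# format-patch writes the commit as an mbox-style mail file, the same
# shape you'd save out of webmail.
git format-patch -1 -o mail/ >/dev/null

# git am re-creates the commit, keeping the author and message from
# the mail headers.
git reset -q --hard HEAD~1
git -c user.email=bob@example.com -c user.name=Bob am mail/*.patch
git log -1 --format=%s    # add greeting

# git apply only patches the working tree; no commit is created.
git reset -q --hard HEAD~1
git apply mail/*.patch
cat greeting.txt          # hello
git log -1 --format=%s    # base
```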

                                                                                                                                                                                        2. 2

                                                                                                                                                                                          I think you’re mistaken about these websites. They’re not called “Microsoft Github” or “Microsoft Linkedin”. You have a lot of pretty weird and unnecessarily paranoid preconceptions about web services in general. You should focus on things that are real, instead of imaginary dangers.

                                                                                                                                                                                          1. 11

                                                                                                                                                                                            You should focus on specifics that can be falsified, not generalities that function primarily as personal attacks. What specifically are they mistaken about? There’s absolutely nothing wrong with including the name of the parent company when referring to a product or service.

                                                                                                                                                                                            1. 18

I’d argue it’s even somewhat important to mention the parent company’s name when they are clearly not using their name to obfuscate the fact that they are the new owners. Especially when said company has been a bad actor for decades before finally deciding it was in their financial interest to embrace “open source”.

                                                                                                                                                                                              1. 1

                                                                                                                                                                                                they are clearly not using their name to obfuscate the fact that they are the new owners

                                                                                                                                                                                                To me, not rebranding everything seems like the path of least resistance, and it seems overconfident to say that this was done “clearly” to “obfuscate” something rather than for any of the many possible combinations of reasons, including that it is the path of least resistance.

                                                                                                                                                                                              2. 5

                                                                                                                                                                                                I dunno, insisting on “Microsoft GitHub” reminds me of formulaic Slashdot comments around the turn of the millennium when people would go out of their way to write “Micro$$$oft” or similar. It certainly does communicate something, of course, but the thing it communicates may not be what the author hoped it would be.

                                                                                                                                                                                                1. 2

                                                                                                                                                                                                  What’s wrong with writing Micro$$$oft?
