I write programming stuff, cryptography stuff, and also just general ideas that are on my mind. Have a gander: https://cryptolosophy.org
Posted on a website that immediately took over my screen to get me to consent to signing away my privacy.
Cool project 👍🏻. I’m wondering: is it “correct” to say that “now we can write safer C” if the C code is transpiled to Rust?
The resulting Rust code is only slightly safer. Some things like array bounds that were not previously checked will be checked. For the most part this translation is just the first step in enabling more substantial refactoring from which the benefits from Rust can start to shine.
Why is the resulting Rust code only slightly safer? Rust as a language is a lot more memory-safe than C. If you’re talking about current transcompilers, then improving those should lead to improvements in C.
It’s translating to mostly-unsafe Rust (so does corrode, the other project that does this)
This means you still have the same burden of checking most of the invariants involved.
One use case for tools like these is an easy way to start converting a codebase from C to Rust, doing away with a bunch of the tedium.
Ah, I misread that. I was referring to Rust -> C compilers, which are useful to create if only to understand the domain well enough to bring improvements to C.
Or the opposite: submit compiling C code that doesn’t compile in Rust (or is otherwise broken) so that we can fix the translator to handle that case.
See the known limitations page: https://github.com/immunant/c2rust/wiki/Known-Limitations-of-Translation
I feel like most of these criticisms are superfluous and don’t really matter. Why does it matter that we call writing printing? etc
I came up with a solution to the first problem using solution 2 in the article. However, my solution still violates constraint 2.
Each developer generates a random number n and some key k. They then send h = hash(n || k) to the group chat. Once they have all sent this h, they then each send their n and k. Then each person can verify (by computing h and comparing with the given value) that the n sent by each other person was independent of any information that they had already received from the other developers.
Hence no participant can control the sum, product, or exclusive-or of the inputs, modulo 20.
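The commit-reveal scheme described above can be sketched as follows (a minimal illustration; the choice of SHA-256, the 32-byte key size, and the range of n are my assumptions, not part of the original description):

```python
# Commit-reveal sketch: each developer commits to (n, k) before any reveal,
# so nobody can choose their n based on what others have already shown.
import hashlib
import secrets

def commit(n: int, k: bytes) -> str:
    """Commitment h = hash(n || k); k is a random blinding key."""
    return hashlib.sha256(str(n).encode() + k).hexdigest()

def verify(h: str, n: int, k: bytes) -> bool:
    """Check a revealed (n, k) pair against the earlier commitment."""
    return commit(n, k) == h

# Each developer picks n in [0, 20) and a random key, then shares only h.
n, k = secrets.randbelow(20), secrets.token_bytes(32)
h = commit(n, k)
# Once everyone has posted their h, the (n, k) pairs are revealed:
assert verify(h, n, k)
# The agreed-upon result is then e.g. the sum of all revealed n, modulo 20.
```

The blinding key k matters: without it, a participant could brute-force the 20 possible values of n from the hash alone before the reveal.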
Honestly I don’t see how this is a big deal.
As a user, it’s a decent feature because I get to view, say, news coverage of some current event from multiple sources; and as a publisher, why do I care about where else my users go to consume media?
Why would commercial site operators care that a browser vendor is presenting unsolicited links to their competitors, to visitors?
Let’s switch industries: what if Chrome started showing links to Chrysler on Ford’s website? Why would Ford care?
Okay the links are clearly not on the page itself. You can see at the top that the UI is clearly distinct from the website itself and is more like a feature of the browser than anything else. As far as I can see, it’s just a convenient way of displaying search results for the headline without the user having to manually search it themselves.
And, well, that’s really a straw man argument, since that is not at all what Google is doing here. Equating a news site with the website of a corporation that sells actual products isn’t really meaningful.
Really, it’s like berating Spotify for having their radio feature because it can show me similar songs to the one I’m listening to. It’s a feature that doesn’t really hurt anyone, and on the contrary, benefits the majority of people. I’m sure Tyler’s fine with Spotify playing some Frank Ocean on his radio.
users don’t understand browser vs site differences.
You say that comparing two rival news businesses with two rival car businesses is a straw man and then bring up a music streaming service which pays each of the artists for the songs played.
Google isn’t running a news service or paying site owners for displaying their articles.
This would be like if, while using the Spotify app, Siri chirped out with “hey, we have songs on Apple Music”.
Have anything to back up that claim of user ignorance? Also, if we suppose that the claim is correct, does it make a difference?
The straw man is in promoting a product on a rival’s site. Nothing even remotely like that is happening here. The Spotify example was mentioned because music suggestions are vaguely analogous to information media, but I take your point. Even still, my point stands.
And I disagree that it’s like that. In that case both Spotify and Apple Music would be trying to get users listening to the same thing on different platforms. Articles on the other hand are some person’s unique view of some topic, and my browser giving me suggestions for other people’s viewpoints on the same topic is nothing to beef about.
Apple literally changed the way javascript alert()/etc look in Safari because users got confused that it was the webpage showing a dialog, not the OS/browser.
In this very statement you also acknowledge that it is possible for a UI to be unambiguous, as Apple did change it so as to make it unambiguous.
Hence your claim about the UI of the suggested pages is meaningless, since you’d need to show that this particular UI falls into one of the two categories.
Apple changed the elements in question from native-styled elements (i.e. they looked and behaved exactly like a native macOS/iOS element) that were modal above the whole browser (i.e. they blocked all interaction with Safari while open) into in-tab plain white elements that look like a plain-jane JavaScript HTML “modal” window.
The chrome UI in question is literally a white bar at the bottom of the page - how could anyone determine whether it’s chrome’s chrome, or in-page content?
I don’t think you’re looking at the right picture: https://pbs.twimg.com/media/Db1OtiSWkAEKPag.jpg
But we’re digressing. What actual point are you trying to make here, because I don’t see it.
look at the first picture, which is what the user sees, seemingly as part of the site.
This whole thing is in response to this claim:
You can see at the top that the UI is clearly distinct from the website itself
The next sentence was:
Also, if we suppose that the claim is correct, does it make a difference?
We could argue all day about whether or not a user will think it’s part of the site, but it doesn’t really matter.
And I made many points in the post that you quoted, not just that one.
The rest of your ‘points’ are arguing that the news sites in question wouldn’t be concerned by this move.
Did you happen to notice who wrote the tweet that’s linked to? The Executive Editor of The Verge. He seems none too pleased about this change, for what I think are pretty obvious reasons.
If you want to put Google on some nerd pedestal and believe nothing they do can be faulted, that’s your choice, but don’t expect other people to follow your logic.
If you want to put Google on some nerd pedestal and believe nothing they do can be faulted, that’s your choice, but don’t expect other people to follow your logic.
You’re taking some large leaps here. I do think that Google have fucked up with AMP overall, and I don’t think that Google can do no wrong: they’re a terrible company for user privacy and they’ve shit all over their “Don’t Be Evil” slogan recently.
However, we’re talking about a very specific feature of one of their services, and as a user I welcome a little tab that gives me related articles on X topic, regardless of where The Verge want me to consume information.
I didn’t realise that I had to outline my position on Google as a whole to be able to have an opinion on something that they do.
as a user I welcome
As a user you’re entitled to want what you want, but you seem to have forgotten the part where you said:
as a publisher, why do I care about where else my users go to consume media
You claim to acknowledge Google’s faults, but you seem unable to comprehend how this change could affect online news companies, either now or in any future incarnations of this ‘feature’.
The two aren’t mutually exclusive.
You claim to acknowledge Google’s faults, but you seem unable to comprehend how this change could affect online news companies, either now or in any future incarnations of this ‘feature’.
And you’ve failed to demonstrate how.
You’re fucking kidding me.
You don’t see how driving traffic away from a site to its competitors could affect them?
You’re being deliberately obtuse.
It’s the “driving traffic away” part I don’t agree with, but since you’ve devolved this discussion into pure ad hominem attacks, I won’t be continuing the conversation.
Have a nice day dude!
No, they changed it to confuse fewer people.
As someone who’s been doing UI design for more than a decade and has followed this part of the industry even longer, I can assure you there are no meaningful interfaces that won’t confuse at least some people. What we all try to do, every day, is reduce the number of confused people, which you can see in how UI widgets and patterns evolve over time.
The big issue is that every time Google has provided any kind of listing anywhere, ever, they’ve allowed companies through AdWords to get to the top.
And that becomes super shady.
I think a physics tag is a great idea. And I mean, if any of the pure sciences were closely linked to technology, it’d be physics.
Speaking of bash aliases, they can be great fun.
All of them:
The fact that they exist at all. The build spec should be part of the language, so you get a real programming language and anyone with a compiler can build any library.
All of them:
The fact that they waste so much effort on incremental builds when the compilers should really be so fast that you don’t need them. You should never have to make clean because it miscompiled, and the easiest way to achieve that is to build everything every time. But our compilers are way too slow for that.
Virtually all of them:
The build systems that do incremental builds almost universally get them wrong.
If I start on branch A, check out branch B, then switch back to branch A, none of my files have changed, so none of them should be rebuilt. Most build systems look at file modified times and rebuild half the codebase at this point.
Codebases easily fit in RAM and we have hash functions that can saturate memory bandwidth, so just hash everything and use that to figure out what needs rebuilding. Hash all the headers and source files, all the command line arguments, the compiler binaries, everything. It takes less than 1 second.
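The hash-everything approach can be sketched as a manifest of input hashes compared between builds (a minimal illustration; the manifest layout and the special flags key are my own invention):

```python
# Hash-based rebuild detection: hash every input that can affect the output
# (source files and compiler flags here), then diff against the last build.
import hashlib

def file_hash(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.blake2b(f.read()).hexdigest()

def build_manifest(sources, compiler_flags):
    """Map every build input to its content hash."""
    manifest = {src: file_hash(src) for src in sources}
    # Flags are an input too: changing -O2 to -O0 must trigger a rebuild.
    manifest["__flags__"] = hashlib.blake2b(
        " ".join(compiler_flags).encode()).hexdigest()
    return manifest

def needs_rebuild(old: dict, new: dict):
    """Return the set of inputs whose hashes changed since the last build."""
    return {k for k in new if old.get(k) != new[k]}
```

Because only content is compared, switching branches back and forth leaves the manifest identical and nothing is rebuilt, which fixes the mtime problem described above.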
Virtually all of them:
Making me write a build spec in something that isn’t a normal good programming language. The build logic for my game looks like this:
if we're on Windows, build the server and all the libraries it needs
if we're on OpenBSD, don't build anything else
build the game and all the libraries it needs
if this is a release build, exit
build experimental binaries and the asset compiler
if this PC has the release signing key, build the sign tool
with debug/asan/optdebug/release builds all going in separate folders. Most build systems need insane contortions to express something like that, if they can do it at all.
My build system is a Lua script that outputs a Makefile (and could easily output a ninja/vcxproj/etc). The control flow looks exactly like what I just described.
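The commenter’s generator is a Lua script; a rough sketch of the same “real language emits a Makefile” shape, written in Python with invented target and library names, might look like:

```python
# Toy build-spec generator: ordinary control flow decides what to build,
# then plain Makefile rules are emitted. All names here are made up.
import platform

rules = []

def bin(name, objs, libs=()):
    """Record a link rule for a binary built from object files."""
    deps = " ".join(objs)
    libflags = " ".join(f"-l{lib}" for lib in libs)
    rules.append(f"{name}: {deps}\n\tcc -o {name} {deps} {libflags}")

# Conditionals are just language-level ifs, not build-system contortions.
if platform.system() == "Windows":
    bin("server", ["server.o", "net.o"], ["ws2_32"])
bin("game", ["game.o", "render.o"], ["m"])

makefile = "\n\n".join(rules)
print(makefile)
```

The same front end could emit ninja or vcxproj instead of make rules, since the spec is just data by the time the control flow has run.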
The fact that they exist at all. The build spec should be part of the language, so you get a real programming language and anyone with a compiler can build any library.
I disagree. Making the build system part of the language takes away too much flexibility. Consider the build systems in XCode, plain Makefiles, CMake, MSVC++, etc. Which one is the correct one to standardize on? None of them because they’re all targeting different use cases.
Keeping the build system separate also decouples it from the language, and allows projects using multiple languages to be built with a single build system. It also allows the build system to be swapped out for a better one.
Codebases easily fit in RAM …
Yours might, but many don’t and even if most do now, there’s a very good chance they didn’t when the projects started years and years ago.
Making me write a build spec in something that isn’t a normal good programming language.
It depends on what you mean by “normal good programming language”. Scons uses Python, and there’s nothing stopping you from using it. I personally don’t mind the syntax of Makefiles, but it really boils down to personal preference.
Minor comment: the codebase doesn’t need to fit into RAM for you to hash it. You only need to store the current state of the hash function and can handle files X bytes at a time.
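The chunked-hashing point can be sketched like this (the chunk size is arbitrary; only one chunk is ever held in memory):

```python
# Streaming hash: the file never needs to fit in RAM, only the current
# chunk and the hash function's internal state do.
import hashlib

def hash_file(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):  # X bytes at a time
            h.update(chunk)
    return h.hexdigest()
```

The result is identical regardless of chunk size, so the same digest can be compared across machines with different memory budgets.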
When I looked at this thread, I promised myself “don’t talk about Nix” but here I am, talking about Nix.
Nix puts no effort into incremental builds. In fact, it doesn’t support them at all! Nix uses the hashing mechanism you described and a not-terrible language to describe build steps.
The build spec should be part of the language, so you get a real programming language and anyone with a compiler can build any library.
I’m not sure I would agree with this. Wouldn’t it just make compilers more complex, bigger, and error-prone (“anti-Unix”, if one may)? I mean, in some cases I do appreciate it, like with Go’s model of go build, go get, go fmt, …, but I wouldn’t mind having to use a build system either. My main issue is the apparent nonstandardness between, for example, Go’s build system and Rust’s via cargo (they might be similar; I haven’t really ever used Rust). I would want to be able to expect a similar, if not the same, structure for the same commands, but this isn’t necessarily a given if every compiler reimplements the same stuff all over again.
Who knows, maybe you’re right and the actual goal should be to create a common compiler system that interfaces with particular language definitions (isn’t LLVM something like this?), so that one can type compile prog.go, compile prog.c, or compile prog.rs and know to expect the same structure. It would certainly make it easier to create new languages…
I can’t say what the parent meant, but my thought is that a blessed way to lay things out and build should ship with the primary tooling for the language, but should be implemented and designed with extensibility/reusability in mind, so that you can build new tools on top of it.
The idea that compilation shouldn’t be a special snowflake process for each language is also good. It’s a big problem space, and there may well not be one solution that works for every language (compare javascript to just about anything else out there), but the amount of duplication is staggering.
Considering how big compilers/stdlibs are already, adding a build system on top would not make that much of a difference.
The big win is that you can download any piece of software and build it, or download a library and just add it to your codebase. Compare with C/C++, where adding a library is often more difficult than writing the code yourself, because you have to figure out their (often insane) build system and integrate it with your own, or figure it out, then ditch it and replace it with yours.
+1 to all of these, but especially the point about the annoyance of having to learn and use another, usually ad-hoc programming language, to define the build system. That’s the thing I dislike the most about things like CMake: anything even mildly complex ends up becoming a disaster of having to deal with the messy, poorly-documented CMake language.
Incremental build support goes hand in hand with things like caching type information, extremely useful for IDE support.
I still think we can get way better at speeding up compilation times (even if there are always edge cases), but incremental builds are a decent target for making compilation a bit more bearable, in my opinion.
Function hashing is also just part of the story, since you have things like inlining in C, and languages like Python allow for order-dependent behavior that goes beyond code equality. Though I really think we can do way better on this point.
A bit ironically, a sort of unified incremental build protocol would let compilers avoid incremental builds and allow for build systems to handle it instead.
I have been compiling Chromium a lot lately. That’s 77,000 mostly C++ (and a few C) files. I can’t imagine going through all those files and hashing them would be fast. Recompiling everything any time anything changes would probably also be way too slow, even if Clang were fast instead of compiling three files per second on average.
You could always do a hybrid approach: do the hash check only for files that have a more-recent modified timestamp.
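That hybrid could be sketched as follows (a minimal illustration; the shape of the stored hash table is my assumption):

```python
# Hybrid staleness check: trust mtimes when they say "untouched", and pay
# the hashing cost only for files whose mtime is newer than the last build.
import hashlib
import os

def changed_files(paths, last_build_time, known_hashes):
    """Return files that actually changed since the last build.
    A newer mtime with an unchanged content hash still counts as unchanged,
    which handles the branch-switching case from earlier in the thread."""
    changed = []
    for p in paths:
        if os.path.getmtime(p) <= last_build_time:
            continue  # mtime says untouched: skip the expensive hash
        with open(p, "rb") as f:
            digest = hashlib.blake2b(f.read()).hexdigest()
        if digest != known_hashes.get(p):
            changed.append(p)
    return changed
```

On a 77,000-file tree this only reads the handful of files whose timestamps moved, rather than hashing everything on every build.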
Do you use xmake or something else? It definitely has a lot of these if cascades.
It’s a plain Lua script that does host detection and converts lines like bin( "asdf", { "obj1", "obj2", ... }, { "lib1", "lib2", ... } ) into make rules.
Codebases easily fit in RAM and we have hash functions that can saturate memory bandwidth, so just hash everything and use that to figure out what needs rebuilding. Hash all the headers and source files, all the command line arguments, the compiler binaries, everything. It takes less than 1 second.
Unless your build system is a daemon, it’d have to traverse the entire tree and hash every relevant file on every build. Coming back to a non-trivial codebase after the kernel stopped caching files in your codebase will waste a lot of file reads, which are typically slow on an HDD. Assuming everything is on an SSD is questionable.
One shouldn’t have to consider the possibility of getting swatted for contributing to an open-source project.
I came to the conclusion that it is unlikely, but given that a community’s individual
it’s reasonable to think about what the next escalation steps will be.
In the end, I found myself thinking about steps to maintain my safety. That’s probably a good time to leave abusive and harassing environments.
That’s basically when the harassment started.
I got the response from leadership that it’s not needed, because “there are no beginners out there, everyone already knows Scala.”
With an attitude like that, there certainly will be no beginners.
I believe you’re making a mistake in generalizing this to all open-source projects, but I can understand how the PTSD can cause that.
Yes, that’s my impression as well.
Documentation shapes a community just as much as a community shapes documentation:
If your documentation is poor, you will only attract users (and contributors) which are fine with poor documentation. And with users (and contributors) that are fine with poor documentation, improving documentation will never be a priority.
I’d recommend looking at LEDE instead of OpenWrt, it seems that’s where all development is going on.
Regarding the part about Java using the same reference for small integers, the author wrote,
even if the performance benefits are worth it
But are they really? Making things faster by breaking the == operator is cringe-inducing. Is there some place where this property is so useful that it justifies itself?
I mean if you want to write something that’s really, really fast, you don’t really look at Java (fast as it is), you’d likely go for something like C.
Interning strings is something that the JVM supports, though I’ve never been able to find compelling studies that measure its usefulness. It’s also easy enough in Java that if I ever found a need, I would probably give interning a shot before doing a rewrite in C.
I could imagine it being worth it if you have an application that caches a lot of data with in-memory maps/lists and you need to make the list.contains/list.find/map.get methods fast. All those rely on Object.equals, so being able to speed up equality checking could have a real effect.
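CPython happens to have a close analog of Java’s Integer cache, which makes the == pitfall easy to demonstrate. Note this is a CPython implementation detail, not a language guarantee, and the exact cache range (-5..256) is CPython-specific:

```python
# CPython analog of Java's Integer cache (implementation detail, not spec).
import sys

a, b = 200, 56
x = a + b              # 256, computed at runtime: inside the small-int cache
y = 512 // 2
print(x is y)          # True in CPython: both refer to the one cached 256 object

c = a + b + 1          # 257, computed at runtime: outside the cache
d = 514 // 2
print(c == d, c is d)  # True False in CPython: equal values, distinct objects

# Explicit interning, similar in spirit to Java's String.intern():
s1 = sys.intern("a runtime string " + str(a))
s2 = sys.intern("a runtime string " + str(a))
print(s1 is s2)        # True: interned strings share one object
```

The lesson carries over directly: identity comparisons (Java’s == on Integer, Python’s is on int) only appear to work for small values because of the cache, which is exactly why they shouldn’t be relied on for value equality.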
I’m not aware of any other company that goes to these lengths to make their service so reliable.
I’d be really interested in seeing the lengths that Google goes to.
There’s some insight into that in this talk https://www.youtube.com/watch?v=H4vMcD7zKM0
I’ve been using Rust based commandline tools for a while now, ripgrep to replace grep and fd-find to replace find.
These are not command-line compatible with the old GNU versions; for me that’s a bonus: they feel modern, are fast and are written in a memory-safe language. In my experience you don’t usually write replacements of old utilities in a 100% compatible way, but just switch to using the new tools and change your assumptions.
100% drop-in replacement rewrites of old C software in Rust or other memory-safe languages are an impossible goal, and not something worth trying. Reimagining what old utilities would look like in a modern environment, however, is probably a worthy goal.
In my experience you don’t usually write replacements of old utilities in a 100% compatible way, but just switch to using the new tools and change your assumptions.
Basic tools like grep, wc, cut, etc. are all over the build chains of Unix-based systems. That’s why “POSIX-compliant” is so important: you need to know a new utility can handle an edge case in a Makefile somewhere.
Make is a really bad example here, given how much software needs gmake or bsd make and doesn’t support both in practice.
(Also, by the way, one of the reasons why I think it is good that rustc is not built with make anymore)
I’m not that knowledgeable about the differences between the flavors of make, but surely all of them support running POSIX utilities for whatever reason (searching for installed binaries, checking whether a file exists, etc.)?
The differences start appearing as soon as you try to do something as basic as wildcards.
they feel modern, are fast and are written in a memory-safe language
They “feel modern” is subjective, so I’ll pass on that point. Saying they’re fast implies that the current implementation is slow, and is “written in a memory-safe language” really an advantage of a program being written in a memory safe language? Seems a little circular.
I’m by no means an expert so my opinion doesn’t count for shit and so I take no sides in this, but it seems like the OP made a bunch of really good points that need addressing, and this really doesn’t.
For a tool like ripgrep, that it is written in a memory safe language isn’t a terribly direct benefit to an end user. In particular, folks don’t usually use the tool in a context where a vulnerability would be that damaging.
With that said, memory safety has other benefits, like (IMO) lower development time. I’ve been maintaining ripgrep for over a year now, and I literally don’t spend time debugging memory related bugs. I don’t just mean “it doesn’t happen that often,” I mean, “it’s never happened, not once.” That’s a lot more time I can put into fixing other types of bugs or adding features. This is good for users, albeit indirectly. Granted, this is probably a pretty lame argument, but I’m just relaying my own experience here as I see it.
This of course varies from programmer to programmer. A better C or C++ programmer than myself, for example, might not spend any time debugging memory related bugs either. But I’m not that good.
One might also make an argument in favor of using another memory safe language that has a GC, but the onus is on them to show that the tool can actually compete performance wise. I do suspect it is possible, although it hasn’t been done. I’d expect D to be capable, and probably also Go. (There are some ripgrep-like tools written in Go, but they break down once you push them a little harder than simple literal searches. I cannot conclude that this is a breakdown in Go itself because there is missing implementation work to make it run faster. Particularly in the regex engine.)
Saying they’re fast implies that the current implementation is slow
GNU grep isn’t slow in the vast majority of cases. On a single file search, ripgrep might edge it out. As you increase the complexity of the regex (say, by using lots of Unicode features without any literals), then ripgrep starts to get a lot faster (~order of magnitude). In the simpler cases, ripgrep tends to edge out GNU grep by using a mildly smarter heuristic that lets it spend more time in an optimized routine like memchr in common cases. But GNU grep could easily adopt the latter. It would be harder for them to fix their Unicode problems, and probably not worth their time. (Because once you stick a literal into the pattern, it speeds back up again.)
However, if you say “compare default usages of these tools, including recursive search,” then ripgrep is going to toast GNU grep just because it will use parallelism by default. Which makes this a mostly uninteresting comparison if you’re curious about the details of the performance difference, but that doesn’t stop it from being a very interesting UX comparison. That’s typically the source of claims like “ripgrep is so much faster than grep,” which are unfortunately also easily interpreted as a really interesting claim that provokes the question, “oh, so what is the interesting implementation choice that makes it faster?” When folks find out that it’s just parallelism or ignoring certain files (that are in your .gitignore), they rightfully feel cheated. :-) Another less common source of this is that people use “grep” to refer to both BSD and GNU grep, and BSD grep has markedly different performance characteristics than GNU grep.
Performance is complicated, so if you really want the dirty details, just skip straight to my analysis: http://blog.burntsushi.net/ripgrep/
I should throw out a compliment for your work on ripgrep. It’s impressive. I think ripgrep lands on the wrong side of XY problem style arguments (or more like gets stuck in the middle) because everybody wants to talk about what’s best without first defining requirements. :(
Hah, yes, indeed. And thanks. :-) I try to keep tabs on it as best I can and keep it in check, but it’s hard!
Although, putting my security hat on, memory-safety is one of things where you just never know how a certain program will get used. I mean, imagemagick is just one of the obvious cases, but what about more subtle stuff such as running something from CI? Untrusted data has a tendency to show up in unexpected places.
Yeah, that is a good point. And I’ve been slowly splitting ripgrep into smaller crates. Other CLI utilities (like fd) are benefiting from that, but that definitely increases the odds of the code showing up in even more unexpected places. The core search routines haven’t been split out yet, but it’s on the list!
I don’t just mean “it doesn’t happen that often,” I mean, “it’s never happened, not once.” That’s a lot more time I can put into fixing other types of bugs or adding features. This is good for users, albeit indirectly.
That’s not a lame argument: it’s the very argument that safer languages such as Ada, Java, and C# sold businesses on. Knocking out problems that take up lots of debugging time lets you improve your existing product more or build new ones. If you have competition, then you can potentially improve at a faster rate than them. Outside business, there’s quite a few FOSS projects each doing something similar that compete with each other on features and performance even if not security so much. So, being able to rapidly add or fix features could help one get ahead.
The champions of that in what few studies exist were the Common LISP and Smalltalk languages. However, if wanting low-level performance, one will have to go with something else. The prior studies put Ada ahead of C with half the defects plus easier maintenance. Rust is most comparable to it in focus areas or strengths. So, choosing Rust for low-defect development without a GC has some empirical support indirectly. I think an interpreted variant with REPL and great debugging might be great, though, for closing some of the gap between it and those dynamic languages.
Sorry, I didn’t mean “lame” as in “bad” or “incorrect,” but rather, “tired” or “old.” i.e., Everyone has heard it before.
I would add that writing grep in any language with a complicated runtime probably wouldn’t be nice for OpenBSD. pledge() relies on knowing what the program/runtime is going to do at any given point; I have had Go programs get killed by random syscalls that the Go runtime decided to make late into program execution.
Thankfully, Rust isn’t one of those languages :) Its “runtime” does uhh… stack backtraces… and possibly some other little things I guess. It’s similar to C++.
killed by random syscalls
That, IMO, is the worst part of pledge. Capsicum gracefully denies syscalls, letting the program handle the error instead of blowing up :P Sure blowing up lets you quickly debug a core dump of a simple C program (like OpenBSD’s base utilities), but with more complex software written in different languages, I’d much rather handle the damn error.
Ripgrep is a lot faster than grep for general use; not necessarily because the search algorithm is faster, but because it defaults to ignoring .gitignored and binary files, which is usually what you want.
They “feel modern” is subjective, so I’ll pass on that point. Saying they’re fast implies that the current implementation is slow, and is “written in a memory-safe language” really an advantage of a program being written in a memory safe language? Seems a little circular.
The parent said “they feel modern, are fast”, not “they feel modern, are faster”. The point is that they are on par, not that the current state is “slow”.
I’m by no means an expert so my opinion doesn’t count for shit and so I take no sides in this, but it seems like the OP made a bunch of really good points that need addressing, and this really doesn’t.
Given that around here, there’s a discussion on how people can’t find out what the OP exactly meant, I find this statement a bit confusing.
ripgrep does some terribly silly things. One is not having an option to write output as:
file1:line MATCH1
file1:line MATCH2
file2:line MATCH1
Which would be obvious to somebody familiar with awk or cut. ‘modern’ often means forgotten knowledge.
ripgrep does exactly that when you use it in a pipeline. When you print to a tty, it uses a “prettier” format. If you don’t like the prettier format, the --no-heading flag will force the more traditional format. Put it in an alias and then forget about it.
‘modern’ often means forgotten knowledge
And that’s often a good thing. Compare and contrast POSIX grep’s and ripgrep’s support for Unicode, for example. ripgrep “forgets” a lot of stuff and just assumes UTF-8 everywhere. This works well enough in practice that I’ve never once received a complaint. Well, that’s not true. I did receive one complaint: that ripgrep can’t search UTF-16 files, which are not uncommon on Windows. But I fixed that by adding a bit of code to ripgrep that transcodes the corpus from UTF-16 to UTF-8 when a UTF-16 BOM is detected, and hey, it works! But if I didn’t “forget” about POSIX, then that might not have been possible at all.
This comment should not be interpreted as an argument against POSIX, OpenBSD’s (or any other OS) use of it. Instead, I’d like you to interpret it as a criticism of the myth that “the good ol’ days” ever existed at all. My bottom line is that history has a lot to teach us, and just because we don’t copy it exactly as it was doesn’t mean it was forgotten. Sometimes mistakes are made, or perhaps even more frequently, technology simply changes and evolves that makes repeating historical choices a hard pill to swallow in some contexts.
So how about instead of focusing on “forgotten” knowledge (and, to the same extent on the other end of the spectrum, avoid the new and shiny—but I’m speaking to a certain audience here so I’ve omitted that) we just focus on the problem we’re trying to solve instead? If there’s some ancient (or new, I don’t care) wisdom that I don’t know about that could make ripgrep a better tool, then I’m all ears.
Also, I’ve never once marketed ripgrep as “modern.” ;-)
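The BOM-sniffing transcode described above can be sketched in a few lines of Python. This is an illustration of the general technique, not ripgrep's actual code, and the function name `read_as_utf8` is made up for the example:

```python
import codecs

def read_as_utf8(raw: bytes) -> str:
    """Decode a file's bytes, transcoding from UTF-16 when a BOM is present.

    The idea: assume UTF-8 everywhere, but sniff the first bytes for a
    UTF-16 byte order mark and transcode if one is found.
    """
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # Python's "utf-16" codec consumes the BOM and picks the right endianness.
        return raw.decode("utf-16")
    return raw.decode("utf-8")

# "utf-16" encoding in Python prepends a BOM, so this hits the transcode path.
assert read_as_utf8("hello".encode("utf-16")) == "hello"
assert read_as_utf8("hello".encode("utf-8")) == "hello"
```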
Most of what I got wrong was based on a mistake that wasted a bunch of my time (which annoyed me), so take comments written in a bad mood with a grain of salt. And I do like how it respects .ignore files; a tool was definitely needed to deal with that.
I do wonder why that “prettier” mode is needed in the first place, though. It just seems like a waste of effort, and I can’t see the gain. I think ag might do the same, so maybe you were just copying.
ag does do the same, and ag in turn copied it from ack. That particular format has been around since ack has been around, which certainly isn’t as long as grep, but ack has been around for a while by now.
In my experience, for every complaint about ripgrep’s defaults, there’s probably N other people that actually like the defaults. For example, I’ve heard from folks that think respecting .ignore by default is a terrible or silly decision. But the defaults don’t need to be some universal statement on what’s right; it’s just my approximation of what most people want based on my perception of the world. It also continues a long held tradition started by ack, and ripgrep definitely descends from ack.
While I don’t know for sure, if I didn’t use the pretty format by default, I’m pretty sure I’d have legions of former ag/ack users lining up to complain about it. You might think that’s silly, but I’ve heard of people who use ag over ripgrep because ag is easier to type than rg!
use ag over ripgrep because ag is easier to type than rg
haha. If they care about easy typing, they should use shell aliases. I have aliased ‘sr’ (for ‘search’) to mean rg||ag||pt||ack depending on what’s available.
Are you sure? I’m not at the computer right now, but IIRC it outputs exactly like that when it’s writing to a pipe, not a terminal. Try piping into less.
It seems you are right; “modern” just means violating least surprise. :) At least the output was colourised. I suppose that is to remind me I am not colour blind (or to rub the fact in if I am).
Even good old ls does that, though, to be fair. Send it to a pipe and it outputs a newline-separated list of file/directory names; send it to a tty and it outputs a nicely formatted table, probably with ANSI color escape codes.
It’s definitely not only GNU ls that outputs colors; see FreeBSD’s ls -G.
I absolutely hate that about ls too. I guess I’m wrong, but I don’t feel too bad about being wrong in this case.
I’m not surprised at applications providing nicely formatted output to terminals, checking whether stdout is a tty is an old trick :)
If by “nicely formatted” you mean “colored”, then sure. Changing the first-level textual content of the output is a huge and highly undesirable surprise to me, though. I would like to be able to predict what the input to the next piece of pipeline will look like without running it through cat first just to trick isatty.
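The old trick in question is just a one-line check on stdout. A minimal Python sketch of how a tool might switch between the "pretty" grouped format and the traditional `file:line` format (the `emit` function is hypothetical, for illustration only):

```python
import sys

def emit(matches):
    """Print (filename, line) matches, varying the format by destination.

    When stdout is a terminal, group matches under a filename heading
    (the "prettier" style); when it's a pipe, fall back to the
    traditional file:line format that awk and cut expect.
    """
    if sys.stdout.isatty():
        current = None
        for fname, line in matches:
            if fname != current:
                print(fname)
                current = fname
            print(line)
    else:
        for fname, line in matches:
            print(f"{fname}:{line}")

emit([("file1", "MATCH1"), ("file1", "MATCH2"), ("file2", "MATCH1")])
```

Piped into another command, this prints `file1:MATCH1` and so on; run at a terminal, it prints the filename once as a heading.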
I don’t fully understand the threat model. Presumably unprivileged processes cannot arbitrarily read another user’s process memory, and if they can, isn’t this just making the attack slightly harder? Don’t kernels wipe physical memory before reallocating it to other processes?
What does this defend against?
I think people take it a bit too far, but the various threats include:
1. Do a crypto op. Do an insecure op. An exploit finds the key left over from the previous crypto op.
2. A variation of sorts of the above: the memory gets reused, leaks, oops.
3. There’s some flavor of kernel bug that leaks memory, and you’d like to narrow the window of vulnerability.
4. Suspend, hibernate, cold boot, etc.
I think the threat model is basically adversary gets a snapshot of memory at some point, so you’d like it to be uninteresting.
Also: process crashes, core dump gets written with secrets inside of it, adversary gets access to disk.
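That particular vector has a cheap, direct mitigation worth pairing with memory wiping: forbid core dumps entirely. A minimal sketch using Python's standard `resource` module (Unix-only; an illustration of the mitigation, not a complete defense):

```python
import resource

# Forbid core dumps for this process: even if we crash while secrets
# are in memory, the kernel won't write them out to a core file.
resource.setrlimit(resource.RLIMIT_CORE, (0, 0))

soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
assert (soft, hard) == (0, 0)
```

Lowering a limit never requires privileges, and once the hard limit is zero the process cannot raise it again, so this is hard for later code to accidentally undo.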
Maybe add “code ran in VM that moved” to 4 depending on whether whatever manages them overwrites memory before dropping a new one. This could become a bigger risk for platforms that use lightweight VM’s/containers with rapid launch and shutdown.
It is, indeed, a lot of complexity for a threat model most don’t need to worry about: naive RAM scrapers, which the guard pages are meant to trip. A sophisticated attacker with remote code execution can easily sidestep such a defense, because the secrets are still sitting unencrypted in RAM.
A better approach would be to actually encrypt the sensitive information. This is particularly useful for cryptographic keys, because we can e.g. use a key-encrypting key (KEK) sitting in XMM registers to decrypt an encrypted data-encrypting key (DEK) into other XMM registers. That is to say, we can keep an encrypted copy of the DEK in memory, decrypt it into XMM registers when we want to use it, perform cryptographic operations with it, and then never have to worry about the unencrypted version sitting around in memory in the first place.
Or, if you’re really worried, use something like Intel SGX, or perform encryption using a separate physical device like a TPM, HSM, or YubiKey.
I agree that encrypting secrets is the next logical step. I’ve been planning a scheme for a while now and it should hopefully land in the next major update, time permitting.
Alright, telling consumers to ditch the most widely used e2e messaging platform won’t make them more secure. They’re not going to switch over to Signal; they’re going to go directly to Facebook Messenger or texts or something worse.
I’d rather my parents stuck with WhatsApp, where Facebook slurps their personal information (which it likely already has), while keeping the end-to-end security on their actual communications.
I don’t mean to say that those of us who actually care about privacy should use it instead of Signal, but this kind of click-bait misses the nuances of the issue and will only contribute to the widespread erosion of consumer privacy and security.
Also, Signal is just an inferior product for day-to-day usage. It constantly refuses to deliver messages to/from iPhone for me.
These things matter: I can understand and commiserate with product bugs, but my family trying to figure out where to pick me up from the airport cannot.