https://stackoverflow.com/questions/11360831/about-the-branchless-binary-search
suggests that branchless binary search is only faster for in-cache data structures. Once you're outside your CPU cache, branched search is faster because speculative execution on a modern CPU will issue multiple memory reads in parallel, so you get the maximum use out of your available memory bandwidth. The branchless search has a data dependency between successive memory reads & so cannot issue more than one read at a time. (Can modern CPUs execute through this kind of data dependency, though?)
Hence a branchless search is faster for in-cache sets, where the search loop time dominates, whereas a branched search is faster for out-of-cache sets, where memory latency dominates.
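To make the data-dependency point concrete, here's a rough sketch of the two loops (mine, not the linked answer's code): in the branchless version the comparison result feeds straight into the next load address, so the loads serialize, while the branchy version gives the CPU a branch to speculate past.

```rust
fn branchy_search(data: &[u32], target: u32) -> Option<usize> {
    let (mut lo, mut hi) = (0, data.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if data[mid] < target {
            lo = mid + 1; // mispredictable branch: the CPU speculates past it
        } else {          // and can have several probes' loads in flight
            hi = mid;
        }
    }
    (lo < data.len() && data[lo] == target).then_some(lo)
}

fn branchless_search(data: &[u32], target: u32) -> Option<usize> {
    if data.is_empty() {
        return None;
    }
    let (mut base, mut len) = (0, data.len());
    while len > 1 {
        let half = len / 2;
        // Conditional add instead of a branch: the comparison result feeds
        // directly into the next load address, so each probe's load has to
        // wait for the previous probe's load to finish.
        base += (data[base + half - 1] < target) as usize * half;
        len -= half;
    }
    (data[base] == target).then_some(base)
}

fn main() {
    let data: Vec<u32> = (0..1000).map(|i| i * 2).collect();
    assert_eq!(branchy_search(&data, 500), Some(250));
    assert_eq!(branchless_search(&data, 500), Some(250));
    assert_eq!(branchless_search(&data, 501), None);
}
```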
It looks to me as if the tests the author ran on their code topped out at ~30 MB of data, assuming they were searching sets of 32-bit ints. It might be instructive to run the tests on much larger sets to see how much difference it makes, and to try clearing the CPU caches before running the comparisons so that the data is no longer in cache.
NB, I have a suspicion that the latency spikes at power of 2 sizes in the standard library versions might be due to cache aliasing issues?
It’s suggestive that they start appearing at what looks like a small integer multiple of a typical 1st level cache size.
A few CPUs have done value prediction, which would allow them to speculate through the branchless version. We were quite lucky that the only recent one that was planning to do so was cancelled before it shipped, because value prediction opens up incredibly powerful transient-execution data leaks and it's almost certainly impossible to write secure code on such a device. As such, the performance delta that you describe is unlikely to change any time soon (it requires solving some very hard open research questions).
https://arxiv.org/ftp/arxiv/papers/1509/1509.05053.pdf is my go-to reference for reasoning about why so much branchless/less branchy stuff falls off a cliff when the working set size surpasses L2 - it’s a great comparison and guide for when you want to pick branch-free things or Eytzinger stuff, but at a certain point you get better performance by being branchier.
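For anyone who hasn't read it, here's a rough sketch of the Eytzinger (BFS-order) layout idea the paper compares against (mine, not the paper's code): the hot top levels of the implicit tree sit next to each other in memory, which is a big part of why it wins while the working set still fits in cache.

```rust
fn eytzinger_build(sorted: &[u32]) -> Vec<u32> {
    fn fill(sorted: &[u32], out: &mut [u32], src: &mut usize, node: usize) {
        if node < out.len() {
            fill(sorted, out, src, 2 * node); // left subtree first (in-order)
            out[node] = sorted[*src];
            *src += 1;
            fill(sorted, out, src, 2 * node + 1); // then right subtree
        }
    }
    let mut out = vec![0u32; sorted.len() + 1]; // 1-based; slot 0 unused
    let mut src = 0;
    fill(sorted, &mut out, &mut src, 1);
    out
}

fn eytzinger_contains(tree: &[u32], target: u32) -> bool {
    let mut i = 1;
    while i < tree.len() {
        if tree[i] == target {
            return true;
        }
        i = 2 * i + (tree[i] < target) as usize; // left = 2i, right = 2i + 1
    }
    false
}

fn main() {
    let sorted: Vec<u32> = (0..100).map(|i| i * 3).collect();
    let tree = eytzinger_build(&sorted);
    assert!(eytzinger_contains(&tree, 99));
    assert!(!eytzinger_contains(&tree, 100));
}
```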
Great paper. Using speculative execution as a more aggressive prefetcher to saturate the memory bandwidth… phew. Talk about a leaky abstraction :)
I remember that in the security lab I was briefly interning at in 2011, we were always talking about how, at least at a theoretical level, the interesting control-flow attacks were basically solved by control-flow-graph validators, and the grunt work to make them practical was "just around the corner", so we shifted focus to non-control-data attacks that required far more detailed invariant detection and enforcement to address. While I think we were a bit too optimistic about the timeline for control-flow-graph validation, it's nice to see stuff like this starting to make shallow aspects of it practical.
ASLR implementations tend to have weaknesses that manifest in ways that might not be obvious just looking at the scrambled output. For instance, take the program
fn main () { dbg![main as *const u8]; }
results in
0x000055a7a8b55410
0x000055a4d561b410
0x000055c683ec3410
0x0000560f7b369410
or, more succinctly:
fn main () { dbg![main as *const u8 as usize % 4096]; }
results in
1088
1088
1088
which would break things using this entropy source for indexing into some slice etc… (the low 12 bits never change because ASLR only randomizes the base at page granularity).
While the stack in the post's example gets a different distribution due to its reliance on a stack frame's placement in the address space, every system's ASLR implementation still has sharp edges like this, and it's kind of frustrating that only exploit writers seem to pay any attention to them, despite their significant impacts on performance.
This has really come a long way over the past few years! I think I'll try it out over the holiday break. I wonder how hard it would be to write an STM similar to Haskell's popular one…
What the author thinks of as having “integrity” I think of as simply being dead. I don’t think we need to accept oppression so that some aspect of reality can fit more concisely on somebody’s whiteboard.
I imagine that many people will be wondering how Hare differs from Zig, which seems similar to me as an outsider to both projects. Could someone more familiar with the goals of Hare briefly describe why (assuming a future in which both projects are reasonably mature) someone may want to choose Hare over Zig, and Zig over Hare?
I imagine that many people will be wondering how Hare differs from Zig, which seems similar to me as an outsider to both projects.
As someone who used Hare briefly last year when it was still in development (so this may be slightly outdated), I honestly see no reason to use Hare for the time being. While it provides huge advances over C, it just feels like a stripped-down version of Zig in the end.
My understanding is that Hare is for people who want a modern C (fewer footguns, etc) but who also want a substantially more minimalist approach than what Zig offers. Hare differs from Zig by having a smaller scope (eg it doesn’t try to be a C cross-compiler), not using LLVM, not having generic/templated metaprogramming, and by not having async/await in the language.
That definitely sounds appealing to me as someone who has basically turned his back on 95% of the Rust ecosystem due to it feeling a bit like stepping into a candy shop when I just wanted a little fiber to keep my programs healthier by rejecting bad things. I sometimes think about what a less-sugary Rust might be like to use, but I can’t practically see myself doing anything other than what I am doing currently - using the subset of features that I enjoy while taking advantage of the work that occasionally improves the interesting subset to me. And every once in a while, it’s nice to take a bite out of some sugar :]
If I remember correctly, there was some discussion about a kind of barebones Rust at some point around here. Is that what you would ideally have/work with? Which features would survive, and which be dropped?
It looks like it's a lot simpler. Zig is trying to do much more. I also appreciate that Hare isn't self-hosting and can be built using any standard C compiler, and that it chooses QBE over LLVM, which is simpler and more lightweight.
As I understand it, the current Zig compiler is in C++; they are working on a self-hosting compiler, but intend to maintain the C++ compiler alongside it indefinitely.
If it’s not self-hosting today it looks like self-hosting is a goal:
Well, at some point it would make sense, much like C compilers are ubiquitously self-hosted. As long as it doesn’t make it too hard to bootstrap (for instance, if it has decent cross-compilation support), it should be fine.
Speculating: The thing about testing in production is that it is maximally resistant to excuses. You can, as a group, decide to do tests like this, and you can do the tests, and the tests cannot fail to happen - however, assuming you have buy-in from the top, any failures will still be the fault of the people or group who failed to implement the mitigation/error handling setup. As such, chaos testing is a way to bypass internal resistance to fault tolerance overhead/effort. (The same concept applies to pentesting.)
Specifically, this sentence:
But, presumably we should be testing this in development, when we’re writing the code to contact that service.
Note that the “we” doing the chaos test and the “we” who should be testing this in development may not be the same “we”!
Yeah, and I feel that chaos engineering is in some ways symmetric to the same social friction-bypassing aspect of writing services at all. It's a messy technique for a messy world. It's not a particularly fast way to find bugs in distributed systems, and it can incur heavy reproduction costs (bisecting git commit logs for a big batch of commits under test takes a lot longer when you have to run the highly non-deterministic fault injection for long enough on each commit to gain confidence about whether the bug is present at that point). But it lets whoever is writing the bugs decouple themselves more from whoever is fixing them :P (and it often lets social credit accumulate with the bug producers rather than the bug fixers).
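As a back-of-the-envelope for the reproduction-cost point (my numbers, purely illustrative, not from any real system): if a single fault-injection run reproduces the bug with probability p, the number of clean runs you need before declaring a bisection step "good" with 95% confidence grows quickly as p shrinks.

```rust
// Hypothetical sketch: how many clean fault-injection runs are needed per
// bisection step before we're `confidence` sure the bug isn't there, given
// that a single run reproduces the bug with probability `p`.
fn runs_needed(p: f64, confidence: f64) -> u32 {
    ((1.0 - confidence).ln() / (1.0 - p).ln()).ceil() as u32
}

fn main() {
    for p in [0.5, 0.1, 0.01] {
        println!(
            "repro rate {:>4.0}% -> {} runs per bisection step",
            p * 100.0,
            runs_needed(p, 0.95)
        );
    }
}
```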
Is there any evidence at all that more efficient languages do anything other than induce additional demand, similar to adding more lanes to a highway? As much as I value Rust, I quickly became highly skeptical of the claims that started bouncing around the community pretty early on around efficiency somehow translating to meaningful high-level sustainability metrics. Having been privy to a number of internal usage studies at various large companies, I haven’t encountered a single case of an otherwise healthy company translating increased efficiency into actually lower aggregate energy usage.
If AWS actually started using fewer servers, and Rust's CPU efficiency could be shown to meaningfully contribute to that, this would be interesting. If AWS continues to use more and more servers every year, this is just greenwashing propaganda, and they are ultimately contributing to the likelihood of a severe population collapse in the next century, more like the BAU2 model than the massive-but-softer population decline of the CT model. We are exceedingly unlikely to sustain population levels. The main question is: do we keep accepting growth from companies like Amazon that makes sudden, catastrophic population loss much more likely?
We’ve always had the Gates’ law offsetting the Moore’s law. That’s why computers don’t boot in a millisecond, and keyboard to screen latency is often worse than it was in the ‘80s.
But the silver lining is that with a more efficient language we can get more useful work done for the same energy. We will use all of the energy, maybe even more (Jevons paradox), but at least it will be spent on something other than garbage collection or dynamic type checks.
I can tell you that I was part of an effort to rewrite a decent chunk of code from Python to C++, then to CUDA, to extract more performance when porting software from a high-power x86 device to a low-power ARM one. So the use case exists. This was definitely not in the server space though, I would love to hear the answer to this in a more general way.
I’m not going to try to extrapolate Rust’s performance into population dynamics, but I agree with the starting point that AWS seems unlikely to encourage anything that results in them selling fewer products. But on the flip side if they keep the same number of physical servers but can sell more VM’s because those VM’s are more lightly loaded running Rust services than Python ones, then everyone wins.
I’ve spent a big portion of the last 3+ years of my career working on cloud cost efficiency. Any time we cut cloud costs, we are increasing the business margins, and when we do that, the business wants to monitor and ensure we hold into those savings and increased margins.
If you make your application more energy efficient, by whatever means, it's also probably going to be more cost efficient. And the finance team is really going to want to hold onto those savings. So that is the counterbalance against the induced demand that you're worried about.
I helped debug the new pipeline mode (ability to run multiple statements at the same time) in libpq (official C client library) and this stuff is pretty complex. I wonder how many wire-compatible implementations from that table support this.
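For anyone curious what pipelining looks like from the client's perspective, here's a rough sketch using tokio-postgres rather than libpq (so an assumption on my part, and the connection string is a placeholder; it also needs the tokio and futures crates): it pipelines automatically whenever multiple query futures are polled concurrently.

```rust
// Sketch only: tokio-postgres sends the second statement on the wire before
// the first result has come back, as long as both futures are in flight.
use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let (client, connection) =
        tokio_postgres::connect("host=localhost user=postgres", NoTls).await?;

    // The connection object drives the actual socket I/O in the background.
    tokio::spawn(async move {
        if let Err(e) = connection.await {
            eprintln!("connection error: {e}");
        }
    });

    // Awaiting the joined future polls both queries concurrently, which is
    // what triggers pipelining on the wire.
    let (a, b) = futures::future::try_join(
        client.query_one("SELECT 1::INT4", &[]),
        client.query_one("SELECT 2::INT4", &[]),
    )
    .await?;

    let (x, y): (i32, i32) = (a.get(0), b.get(0));
    println!("{x} {y}");
    Ok(())
}
```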
If I were to implement the postgres pipeline mode in a library of my own for the server side of the conversation, does libpq have any randomized client tests around this to get bugs to pop out that I could take advantage of?
There are some tests for this but I don't believe there is anything randomized. Another thing that you could use is tests from projects that use libpq in the pipeline mode. For example, in ODB (a C++ ORM) we have a bunch of "bulk operation" support tests which we run against Oracle, MSSQL, and now PG. So you could run those against your server implementation.
Thanks for the pointer to the bulk operation tests! One of the nice things about targeting such a common interface is the access to heavy-duty testing suites. Even though I have a pretty high amount of confidence in the randomization + fault injection approach that I’m taking in my tests for a DB I’m writing, it’s nice to throw as many things as I can at it for finding my blind spots.
Good ruling - it is a privacy problem, and potentially a security problem, and in all likelihood a performance problem too, as the browser must do more lookups and connections (and the alleged benefit of caching across sites rarely worked anyway, and then the other issues recently made browsers stop even trying). This fad was always very iffy on multiple measures.
a performance problem too as the browser must do more lookups and connections
There was a time when browsers only made a small number of parallel connections per domain; that's when you saw www1, www2, www3 for assets. I think that's been changed for the most part, but I don't buy the argument that doing a (parallel) lookup for either your own subdomain or a 3rd-party domain would take longer. With HTTP/2 there should be more pipelining, so it's not a clear yes or no - or people don't use HTTP/2.
TLDR: While I don’t want to detract from the privacy problem, I don’t buy the performance problem. That’s a case by case decision, unless you’re actually including assets willy nilly from a ton of domains.
Roll the latency distribution dice N times and your overall load time is maxed to the worst of all parallel response times, plus local compute costs. You lose worse by rolling more often. Tail at scale.
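A toy illustration with made-up numbers: if each parallel fetch independently has a 1% chance of landing in the slow tail, the chance that the whole page load eats that tail is 1 - 0.99^N.

```rust
// Made-up numbers, just to show the shape of the effect: the page waits for
// the max of N parallel fetches, so one slow domain is enough to stall it.
fn slow_page_probability(n: u32) -> f64 {
    1.0 - 0.99f64.powi(n as i32)
}

fn main() {
    for n in [1, 3, 10, 30] {
        println!(
            "{n:>2} domains -> {:.1}% of page loads hit the slow tail",
            100.0 * slow_page_probability(n)
        );
    }
}
```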
Point taken, but that's what async is for, to a degree. Maybe my "case by case decision" was a bit handwavy, but maybe my experience is skewed towards teams that take care of this anyway and are already minimizing their external dependencies. So "if you include 2-3 it's probably not worse than with 0 external", but if you go towards 10-30, then your cited reference kicks in…
You don't know what those 2-3 extras are until your first HTML arrives, after having set up the connection and sent the request, etc… The only case where it wouldn't result in a slowdown is if the first server's throughput dropped so dramatically that the dependencies could be entirely fetched in the time between when the first bit of HTML arrives and when the response is fully received.
What are the other issues that stopped browsers caching? (I’m assuming you aren’t talking about JS because browsers cache JS in all the ways they can come up with)
Sites were checking the performance of loading various assets to do cross-site tracking; they could see if shared.com/thing was loading fast to figure it was in the cache and thus the user had been to shared.com before. There was a category of these side-channel attacks the browser vendors wanted to block. It went into effect last year.
But, even before the browsers changed the cache implementation, most javascript libraries and fonts wouldn't be in the cache anyway just due to the variety of sources and versions different sites used. And trying to use a global thing meant you couldn't bundle and strip things down to only what you used. (For example, with font files, if it is for your site in particular, you can ship only the subset of glyphs you use. This kind of aggressive stripping isn't super common IRL anyway, but it's entirely impossible with the CDN approach.)
Hmm, I wonder if this can be exploited through JS engine caches - you would not be likely to be able to find out about an arbitrary site, but you could probably do your own tracking.
I do not consider Erlang's supervisor hierarchies to encourage such bad coarse-grained error handling. They are a clear way to push error handling down into code with unique handling responsibilities, where issues only bubble up if the local handler can't deal with them. That's in stark contrast to the way most people use ? + From/Into in Rust to bubble all errors up to main without really caring about what they are, using big-ball-of-mud enums that end up destroying the ability to rely on pattern matching to locally handle local concerns, and severely contributing to the well-understood problem that most bugs in production networked systems live in incorrect, never-actually-tried error-handling code. It can be really frustrating trying to achieve the same level of single-responsibility error handling in Rust that comes very naturally in Erlang through hierarchies.
I strongly wish that Rust had either Java-like checked exceptions or a way to encode Erlang-like error-handling hierarchies in any way other than with nested result types. I enjoy error handling in situations where functions can clearly communicate to callers the specific mixture of failures that they should consider, rather than having most crates just give up due to poor error-handling ergonomics and throw everything into a big enum. I am well aware of the fact that programmers in all languages tend to suck at error handling. I know that the reason Kotlin and C# chose not to have checked exceptions was their creators' experience with vast ecosystems of people who ignored their power, basically the same way the Rust ecosystem creates terrible big-ball-of-mud top-level error enums that everything ends up being pushed into, causing correct error handling to become infeasible due to those decisions and perpetuating the unreliable malaise that is the ecosystem of anything that deals with sockets or files.
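A rough sketch of the contrast I mean (all names made up): a narrow, per-call error type the caller can exhaustively match on and handle locally, versus the catch-all enum everything gets ?-bubbled into.

```rust
// The narrow style: the signature tells the caller exactly what can go wrong
// at this call site, so it can be handled locally with an exhaustive match.
enum ConnectError {
    Refused,
    Timeout,
}

fn connect() -> Result<(), ConnectError> {
    Err(ConnectError::Timeout) // stand-in for a real connection attempt
}

fn caller() {
    match connect() {
        Ok(()) => println!("connected"),
        Err(ConnectError::Timeout) => println!("retrying with backoff"),
        Err(ConnectError::Refused) => println!("failing over to a replica"),
    }
}

// The wide style: every failure in the crate gets squashed into one enum and
// `?`-bubbled to main via From impls, so the local, error-specific handling
// above quietly disappears.
#[allow(dead_code)]
enum AppError {
    Io(std::io::Error),
    Parse(std::num::ParseIntError),
    Other(String),
}

fn main() {
    caller();
}
```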
Cancellation is one of those things that the RESF loves to sing praises about, but anyone who actually fault-injects and debugs their stuff tends to quickly come to similar conclusions: anything you put in a Drop impl is likely to surprise you with how late or early it runs in certain cases, due to generally poor understandings of drop ordering. Our minds tend to ignore all of the concurrent stuff that may happen between a type's initialization and the various stages of its Drop, which in practice leads to a lot of bugs, especially when mutexes are taken out somewhere in Drop in a way that may deadlock due to drop ordering in various scopes. In contrast, "let it crash" tends to actually work in Erlang because of the general culture of avoiding shared state. If the things that get cancelled in Rust had a similar culture of relying on unique state, cancellation would tend to work a lot better in Rust than it does today. It requires discipline that is a bit rare in the ecosystem.
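A minimal sketch of the mutex-in-Drop footgun (hypothetical names, not from any real crate): whether this deadlocks depends on nothing but the declaration order of locals in the enclosing scope.

```rust
use std::sync::Mutex;

struct Audit<'a> {
    log: &'a Mutex<Vec<&'static str>>,
}

impl Drop for Audit<'_> {
    fn drop(&mut self) {
        // std::sync::Mutex is not reentrant: if the enclosing scope still
        // holds this lock when we run, we block forever.
        self.log.lock().unwrap().push("audit flushed");
    }
}

fn main() {
    let log = Mutex::new(Vec::new());

    // Fine: locals drop in reverse declaration order, so `guard` is released
    // before `audit`'s Drop tries to take the lock.
    {
        let audit = Audit { log: &log };
        let guard = log.lock().unwrap();
        let _ = (&audit, &guard);
    }

    // Deadlocks if uncommented: now `audit` drops while `guard` still holds
    // the lock, and Audit::drop waits on it forever.
    // {
    //     let guard = log.lock().unwrap();
    //     let audit = Audit { log: &log };
    //     let _ = (&audit, &guard);
    // }
}
```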
This article talks at length about the tensions between tokio and mixed workloads that it cannot even theoretically serve well, and goes into a bunch of workarounds that needed to be put in place to mask these tensions, instead of just avoiding the tensions to begin with.
When a request’s dependency chain contains a mixture of components optimized for low latency (with short buffers in front) and components optimized for high throughput (with large buffers in front), you get the worst-of-all-worlds high-level system behavior.
It’s like taking a school bus of kids to McDonalds and going through the drive-through and looping around once for each kid on the bus. Each loop is latency-optimized and whichever kid whose turn it is to order will receive their meal at a low latency after the time that they get to order it, but their sojourn time where they are waiting around doing nothing before being served explodes. But by taking the whole bus’s orders at once, the overall throughput is far higher, and because we have to accomplish a whole bus’s worth of orders anyway, there’s no point optimizing for latency below the whole-bus threshold. The idea has strong descriptive power for so many things in life, especially in software. Most people would probably not have a social media web site server kick off an apache mapreduce job for each GET to the timeline, even if the mapreduce job was to looking at a miniscule amount of data, because it is a similar (more exaggerated but still the same idea) mixing of low-latency components with high throughput components. Mixing of queue depths in a request chain degrades both latency and throughput.
Sure, it's software, you can mix queue depths, and in this case it will probably actually give you a social advantage by signalling to a wider group of subcommunities that you are invested in the products of their social software activities, but you are leaving a lot on the table from an actual performance perspective. This is pretty important queueing theory stuff for people who want to achieve competitive latency or throughput. I strongly recommend (and give to basically everyone I work with on performance stuff) chapter 2, Methodology, from Brendan Gregg's book Systems Performance: Enterprise and the Cloud, which goes into the USE method for reasoning about these properties at a high level.
Drilling more into the important properties of a scheduler: parallelism (what you need for scaling CPU-bound tasks) is, from a scheduling perspective, the OPPOSITE of concurrency (what you need for blocking on dependencies). I love this video that illustrates this point: https://www.youtube.com/watch?v=tF-Nz4aRWAM&t=498s. By spending cycles on concurrent dependency management, you really cut into the basic compute resources available for accomplishing low-interactivity analytical and general CPU-bound work. The mind-exploder of this perspective is that a single-threaded execution is firmly in between parallelism and concurrency on the programmer-freedom vs scheduler-freedom spectrum. The big con is people like Rob Pike, who have managed to convince people (while selling compute resources) that concurrency is somehow an admirable path towards effective parallelism.
Sure, you can essentially build a sub-scheduler that runs within your async scheduler that runs on top of your OS’s scheduler that runs on top of your cluster’s scheduler that runs on your business’s capex resources etc… etc… but it’s pretty clear that from the high-level, you can push your available resources farther by cutting out the dueling subschedulers for workloads where their tensions drive down the available utilization of resources that you’re paying for anyway.
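A crude sketch of what I mean by keeping the queue depths separate (hypothetical names, plain std, obviously not the article's code): the latency path answers callers immediately and hands the bulk work to a throughput-oriented worker sitting behind a deep buffer.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Deep buffer in front of the throughput-optimized worker.
    let (batch_tx, batch_rx) = mpsc::sync_channel::<Vec<u64>>(10_000);

    // Throughput side: cares about items/second, not p99 latency.
    let batch_worker = thread::spawn(move || {
        let mut total = 0u64;
        while let Ok(batch) = batch_rx.recv() {
            total += batch.iter().sum::<u64>();
        }
        total
    });

    // Latency side: answer each "request" right away, enqueue the heavy part.
    for request in 0..100u64 {
        let _reply = request + 1; // respond to the caller immediately
        batch_tx.send(vec![request; 1024]).unwrap(); // heavy work goes to the deep queue
    }
    drop(batch_tx); // close the channel so the worker can finish

    println!("batch total = {}", batch_worker.join().unwrap());
}
```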
Basically, they want one process to deal with latency-optimized requests and another with throughput-optimized requests. I don't know if their use case is valid, but basically they try to separate them into different systems, which is something you actually advocate for.
It is not unlike having a UI main thread and offloading long running ops to background tasks which may, e.g. report progress back to the UI.
That is entirely reasonable, in my opinion.
They even do use OS scheduling for their needs by using low priority threads.
The decision whether to use tokio for the long running ops is more questionable. It might be just a matter of “it works well enough and we prefer using the same APIs everywhere.”
They also put a big buffer in front with the channel (and a rather slow one, I think).
I think it can obviously be optimized, but the question is more: do they need to? Or is there a simpler solution, also considering the new APIs it would require devs to know about?
Just a random thought related to not using OS schedulers to their full extent: in-process app schedulers relate to OS schedulers in a similar way to how Electron relates to native desktop toolkits.
Reimplementing part of the OS stack seems silly until you want apps to work cross-platform. Then your limited scheduler is still a compromise, but at least it works similarly everywhere. Overhead for threads, processes, and fibers is pretty different across operating systems, or at least it used to be.
Most services are deployed on Linux (and if not Linux then usually only one operating system), however, so discrepancies that cause performance drops on other operating systems are not that important.
I agree about deployment, but the ecosystem would still prefer a cross-platform approach.
Being able to reproduce problems on your dev machine without a VM is very valuable, though.
Also, I don't think an async approach optimized for a single OS would get adoption in Rust.
This was one of the features of scala that really made it challenging for me to understand what code may have been doing during stressful debugging sessions. It’s cute, which should always be a huge red flag. I find it to be an abstraction that inappropriately obscures complexity.
I’ve never used Scala but I have a former colleague who has and really hates it. The way he put it is you have to learn so much of it to be productive on other people’s codebases that you spend more effort learning the language than actually doing anything with it, and no matter how much of it you know, you still never know enough to productively extend code that you haven’t written yourself. Sort of like what we used to say about C++ a long time ago: it’s so large that nobody uses more than 10% of it, but nobody can agree on which 10% that is, so every C++ shop basically writes its own C++ dialect.
(I don’t know if this is an accurate description of Scala but if I’m being honest I kind of suspect it is…)
This is… kind of the impression I get after reading this. On the one hand, it’s super elegant. On the other hand there seem to be hundreds of these elegant things in Rust, and they keep popping up, and there’s only one of me. Embedding every single abstraction you’re ever going to need in the language’s definition is tempting from a language theory point of view but it’s a battle that you’re never going to win: there are infinitely many abstractions out there, but people can’t afford to invest infinitely many hours into learning a language that they use professionally.
(Edit: also, just sayin’: since I doubt people keep piling up new abstractions on languages just because they have weird hobbies, it’s hard not to think that maybe if the language was better at helping you build the ones you need, it wouldn’t need people with a PhD in Rustology to figure out a reasonable way to pass contexts around…)
Even Scala’s original designers will admit they missed the mark and are attempting to improve things in Scala 3. See “contextual abstractions” on this page: https://docs.scala-lang.org/scala3/new-in-scala3.html
Jury is out on whether or not this is meaningfully better, but reading through that and https://docs.scala-lang.org/scala3/reference/contextual/context-functions.html makes it clear to me that the design is substantially better.
Interesting, I feel it would do the opposite for me. Instead of dealing with hidden thread-local storage as context, which I can't touch at all unless it is exposed, I could pass that context explicitly at the entry point of the program, better capture it further down, and print it out with dbg!.
As a library author, it would simplify initialization for me and remove, in my opinion, the magic of it for users. If you didn't initialize something stored statically that was required, you'd have to handle that at runtime, but equipped with this proposal that becomes a compile-time error. I could also see it helping the async runtime problem, rather than depending on thread-locals or another weird marker feature that Rust could use in the future like #[global_allocator].
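A rough sketch of the contrast (hypothetical names, not the proposal's actual syntax): hidden thread-local context that fails at runtime if you forget to initialize it, versus an explicit context parameter that shows up in the signature and can be printed with dbg! anywhere.

```rust
use std::cell::RefCell;

struct Config {
    verbose: bool,
}

// Today: library state hides in a thread-local that callers can't easily see
// or print unless the library chooses to expose it; forgetting init() is a
// runtime failure.
thread_local! {
    static CONFIG: RefCell<Option<Config>> = RefCell::new(None);
}

fn init(config: Config) {
    CONFIG.with(|c| *c.borrow_mut() = Some(config));
}

fn do_work_implicit() {
    CONFIG.with(|c| {
        let cfg = c.borrow();
        let cfg = cfg.as_ref().expect("init() was never called");
        if cfg.verbose {
            println!("working (implicit context)");
        }
    });
}

// With an explicit context, the requirement shows up in the signature, so a
// missing context is a compile error instead of a runtime panic.
fn do_work_explicit(cfg: &Config) {
    if cfg.verbose {
        println!("working (explicit context)");
    }
}

fn main() {
    init(Config { verbose: true });
    do_work_implicit();

    let cfg = Config { verbose: true };
    do_work_explicit(&cfg);
}
```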
I’ll most definitely concede there are most certainly places where could feature could be abused. If a library author wanted, I assume they could provide the context with a wrapper function and keep the context type private making my first point moot. I wonder that this could hurt compile time and increase binary sizes due to monomorphization when different contexts are used, though that concern isn’t specific to this proposal, the current technique of static data pushes the context to be a single concrete type.
Maybe submit a pr to add the tags in a migration, linking to the proposals here that you feel resulted in a fairly clear amount of support for them?
this previous migration should be helpful in creating an appropriate new one https://github.com/lobsters/lobsters/blob/f25fc62d7603c1bf7089925ad5517948b5008d42/db/migrate/20200809023435_create_categories.rb
Not sure where DHH is coming from here.
Is it his faux hacker-bro persona, “I’m with you in the trenches” schtick? Is it the dude that races endurance cars on his days off? Is it the businessman who decided that “politics” was off-topic and lost 1/3 of his employees?
Borderline off-topic and flagged as such.
If you can extract any message from this ill-formed rant, please let us know what it is.
If this had been written by anyone other than DHH it would have been laughed out of the room.
The message I took away was that we should treat the programming profession as the serious field that it is, and not let ourselves diminish the skills we’ve learned by resorting to snide comments about how we all copy and paste.
Yes, that would be important if anyone outside programming humor subreddits and DHH’s fevered imagination acted that way.
I’ve seen and heard senior engineers say this, so people definitely use the phrase outside of the limited scope you listed.
I don’t necessarily agree with DHH’s analysis that it’s a real problem, but it’s interesting to consider.
I don't see a problem with both being true. At the beginning you copy to learn from examples. The higher you go, the more you copy things that either save you time or come from areas you don't deal with all the time. It doesn't make programming not serious. It's a relatively unique thing we're able to do in our field. I'm a senior engineer and I copy things all the time. Some are actually more accessible through SO than otherwise (I'll go back to the question about getting the current username in C# every few weeks, for example, and it's completely fine). It's great we can do that, and as long as you understand what you're copying, I don't care if you write the whole app that way - this will only get more common/legitimate now with Copilot.
I didn’t really have any difficulty in parsing out (what I believe to be) his point:
Other than the initial assertion of "hey, this is a widespread phenomenon in our culture"–and poor organization on DHH's part–this all seems like something any of us should be able to say "okay, yeah, I buy it" to. Even if the original assertion isn't true, the rest of this is still a laudable take.
The message I’m getting is “I alienated a huge swath of my company, they left, I went on right wing media to talk about how I’m actually not mad about how I drove a bunch of people to quit, and it’s also everyone else’s fault that we aren’t getting many qualified applicants.”
A message does not have meaning without a context. Assuming a context with an anonymous programmer author, the message has far less meaning than that which can be interpreted with the addition of this specific well-known author into the interpretive context. This message is simply a different message without knowledge of this well-known author who wrote it. There’s no use asking whether this message should actually be a different message. It’s not the situation at-hand.
Caught covid while waiting in line for my 3rd shot about a week ago, and it has actually been awesome to face what I've feared for the last 2 years while armed with such a high antibody count; it has been totally manageable after so much preparation and worrying about the unknown since the pandemic started. Despite being a little sick, I'm making far more progress on an experimental pagecache that might end up in sled than I've been able to make in months. Weird how these things work sometimes! I never imagined catching covid would be the ultimate burnout buster hahaha… thanks pfizer :D
I’m happy you are feeling better.
Weird how these things work sometimes! I never imagined catching covid would be the ultimate burnout buster hahaha…
As someone who guides his career path towards database implementation internals & distributed systems I just wanted to use this occasion to say that you are a large source of inspiration for me and I perceive you as super productive. Your lobste.rs comments on databases are often better reads than long-form blog posts. Kudos and keep on hacking! :)
I worked for a while in a comp sci lab that specialized in solving real-world problems with statistical analysis and (generally non-neural-network) machine learning. The PI was quite good at making contacts with interesting people who had interesting data sets. People would come to him with “big data” problems and he would say “this fits on a single hard disk, this isn’t big data. We can work with your data far more easily than these other people with Hadoop pipelines and whatever, you should work with us.”
We had a compute server with 1 TB of memory, in 2016, and it was not particularly new. Turns out that if you’re counting things that exist in the real world, there’s not that many actual things that require a terabyte of RAM to keep track of. It probably cost low six figures to buy, or rather, about the same as one full-time engineer for a year. (Or two grad students, including tuition.)
I didn't do particularly well at that job, but I did learn that 90% of the work of any data-oriented project is getting your data cleaned up and in the right shape/format to be ingested by the code that does the actual analysis. xsv, numpy and similar tools can make the difference between spending a day on it and spending a week on it. That was far more fun for me than the actual analysis was.
By contrast, at a previous gig I watched an entire data science group basically fuck around with Amazon and Google Bigtable and data lakes and lambdas and Kafka to handle ingests of a whopping…three events a day? maybe?
Their primary deliverable every month seemed to be five-figure AWS bills, occasionally punctuated with presentations of very impressive diagrams of pipelines that didn't do anything… The junior I was working with, by contrast, was whipping together SQL reports and making our product managers happy–a practice we had to hide from the org because it was Through The Wrong Channels.
And because reasons, it was politically infeasible to call bullshit on it.
Really understanding what problem you’re actually trying to solve is often overlooked in the desire to jump on the latest buzzwordy technologies.
Every time Bitcoin power consumption comes up, I go and look at the transactions per second that the entire Bitcoin network has done over the last week. I've never seen it average more than 7/second. If you want to use a cryptocurrency as a currency (rather than for the exciting smart-contract things that Ethereum allows, which may require a bit of compute on each transaction), each transaction is simply atomically subtracting a number from one entry in a key-value store and adding it to another.
A Raspberry Pi, with a 7 W PSU, could quite happily handle a few orders of magnitude more transactions per second with Postgres or whatever. Three of them in different locations could manage this with a high degree of reliability. You could probably implement a centralised system that outperformed Bitcoin with a power budget of under 50 W. Bitcoin is currently consuming around 6 GW. That's roughly 8 orders of magnitude more power consumption (6 GW / 50 W ≈ 10^8) in exchange for the decentralisation.
Really understanding what problem you’re actually trying to solve is often overlooked in the desire to jump on the latest buzzwordy technologies.
Perversely, the problem that's often being solved is "keeping the engineers from getting bored", "padding your resume to make it easier to jump", or "making your company sound more important to attract VC dollars".
Now that I think about it, “If we use technology $x we’ve got a better chance of nabbing VC money” can often be a sound business decision.
Every time Bitcoin power consumption comes up, I go and look at the transactions per second that the entire Bitcoin network has done over the last week. I've never seen it average more than 7/second. If you want to use a cryptocurrency as a currency (rather than for the exciting smart-contract things that Ethereum allows, which may require a bit of compute on each transaction), each transaction is simply atomically subtracting a number from one entry in a key-value store and adding it to another.
This is part of the design. Every 2016 blocks (roughly every two weeks) the protocol adjusts the mining difficulty to keep the average block rate at roughly one block per 10 minutes.
It probably cost low six figures
IIRC, my infrastructure team told me the cost to replace our 1U 40-core IBM server with 1TB RAM was going to be around $50k
Just checked: 48-core AMD EPYC CPU, 1TB RAM, no disks past boot, 1U: just over $18k. Call it $20k with a 40G Ethernet NIC and a couple TB of NVMe.
That’s a pretty affordable kit! The 1TB of RAM (8x128GB) alone from SuperMicro or NewEgg would cost $10k-$15k.
I was talking with my team again today and this came up, that $50k price tag was actually for a pair of servers.
Turns out that if you’re counting things that exist in the real world, there’s not that many actual things that require a terabyte of RAM to keep track of.
I’ve been saying this for years. I haven’t used numpy too much—I’m not a real data analyst—but I’ve gotten the job done with SQLite, and/or split; xargs -P | mawk
.
The biggest problem you'll have with evading sanctions in this manner is that US OFAC sanctions apply to any business that transacts in USD, which is basically the entire tech sector.
I’d suggest dual stacking, with a ‘local’ .ir domain to guard against the case where an ‘unfair’ set of sanctions is levied against Iran again that would cause you to lose another domain, and a domain from a European liberal democracy (like .is suggested in another comment) to guard against the “Revolutionary Guard doesn’t like my posts” case.
This. You have to identify your threat model before defending against it.
If your threat models are the US and Iran, in my view you can go with countries that are either too enlightened and self-sufficient to care much about their beef with each other or their own citizens (.is, .se, .ch, etc), or countries that are too small and mercantile for them to care much about (.io, .to, etc.) Of course, nothing is going to really save you from a determined state-level attacker.
or countries that are too small and mercantile for them to care much about (.io …)
.io isn’t run by the country it supposedly represents.
It’s worth noting that there are basically three cases where you need to care about the US blocking your blog:
In the first case, you have many, many other problems with anything technology related and you’re best off looking at a complete hosting stack in your country or a friendly one. Whether you can reach people in the US is a separate issue.
The second one is more likely. If someone in the US decides that you’re infringing their trademark with your domain, for example, then you have to defend the case in the US, which is expensive. In this case, you can lose the domain. The same is true for international trademarks elsewhere, so at least registering in a locale that you can easily travel to and where the rules about who pays costs are friendly to you may make sense. If there is a criminal prosecution of you in the US and you don’t attend the court, then remember that this will make it impossible for you to ever travel to / through the US and may make it difficult for most global financial institutions to do business with you.
In the third situation, control over your blog is going to be absolutely the last thing that you care about.
TL;DR: If US government action against you is a realistic part of your threat model, then the domain name under which your blog is registered is the least of your problems.
In the first case, you have many, many other problems with anything technology related and you’re best off looking at a complete hosting stack in your country or a friendly one. Whether you can reach people in the US is a separate issue.
This is all true I assume. But at the same time, all of this stuff can be done later. If you can’t use Amazon or Linode or DigitalOcean anymore due to an embargo, you can reasonably quickly move everything over to a cloud which does business with Iranians. But the domain can’t just be moved over; if you write a bunch of content on your .com blog, and links to your blog posts end up all over the place, all those links will break if Verisign doesn’t want to do business with you anymore.
The only part of the stack which can’t be replaced with an Iranian or self-hosted alternative is the domain, so it makes sense to worry more about that than about everything else.
Its biggest feature is that it is functionally very similar to Twitter, already with a bunch of the best shitposters having accounts, while not being actively under destruction by Elon Musk. The AT Protocol is kind of a side curiosity. I'm having a lot of fun on there, and I don't really view the AT Protocol as a meaningful aspect of what makes a social network interesting to me. While Elon's destructiveness may cause a lot of people to say that the system should be built in a way that is resistant to such hierarchical control, the actually interesting mechanism is that people will vote with their feet and go where they feel good, and the backing architecture is usually kind of irrelevant to anyone other than its operators.
Elon Musk is only destructive of twitter if you adhere to one particular set of Anglosphere political preferences. If you don’t share those politics, then Elon Musk’s management of twitter is neutral to positive compared to the previous status quo.
In any case the backing architecture of a social network is highly relevant to users even if they’re not directly aware of it. The protocol affects how operators can moderate the network, which affects how people can use it.
Elon’s leadership has been destructive to Twitter in countless and widely-reported dimensions, few of which are related to his politics.
In December Twitter started showing unrelated tweets in replies to a tweet, which made reading Twitter so unpleasant I quit within a week.
Lobsters, HN, the bigger tech subreddits and so on all lean in that particular political direction so the OPs view is likely the dominant one here.
We’ve seen with Mastodon what happens when you federate such a system: People will divide the social graph based on an arbitrary axis like political or sexual identity and make it impossible for users from these groups to communicate with each other. Bluesky will probably do that as well, if only to support their business model.