1. 2

200 SQL statements per webpage is excessive for client/server database engines like MySQL, PostgreSQL, or SQL Server.

Laughs in the ~5000 SQL statements executed on MySQL per user search at work.

1. 1

I used to think the same thing, but then I realized, depending on what you are shipping, there might be a caveat. The hackers who steal your code might not use it to set up a competing service, but to inspect it for vulnerabilities and hack you or your customers and do worse things.

Although this argument falls in the realm of security by obscurity, obscurity might be a very useful protection if you ship code that “wasn’t of the highest quality either: Half the time, the build would be broken. Testing and documentation were basically non-existent.”

I do agree with the rest of the argument though.

1. 1

I think the article is interesting for coming very close to saying that code is a liability but workers are an asset, and then not taking that step. Corporations are people, as US law famously declares, but that phrase can be read in two ways and every place I’ve worked has really hated to read it the second way.

1. 5

I’ve always had problems with RSS because I always found the list of “unread articles” stressful. I know you don’t need to read everything, but somehow my mind doesn’t cope well with the concept of “letting unread things remain unread” 🤷

I find that Lobsters/HN/Reddit works fairly well.

1. 2

I’m in the same boat. When I used newsboat, a while ago, I had to mark everything as “read” weekly to keep out noise.

I decided to make my own feed reader a few months ago. It ended up becoming a CLI that I pipe into fzf to read feeds. No read/unread, no complex navigation, and notifications on new items. It’s worked really well for me. If read/unread is overwhelming, you may consider it.

1. 1

Same, so I made my own that doesn’t have that.

1. 1

That’s for the “main news” sites, where I also find RSS doesn’t work so well. But it works great for following low-traffic blogs from people where I want to see everything they post.

1. 17

Any headline that ends in a question mark can be answered by the word no.

1. 7

The cover story of the January issue of the CACM was “Does Facebook Use Sensitive Data for Advertising Purposes?”.

1. 7

For every joke, somebody will point out that it’s not literally true

– gthm’s law

2. 6

The linked post doesn’t disagree with you.

can we see here that Microsoft is releasing more and more parts of Windows as open source?

Windows will probably remain a proprietary product for some time, but I can imagine that the trend of releasing more and more code will continue

This take seems quite reasonable.

1. 2

It was an open question and more of a thought than an answer. 😊

1. 2

I did read the article before replying and it is very sensible, I just couldn’t help myself 😅

2. 4

By ‘no’, do you mean:

• No, it won’t become open-source,
• Hard to say, but it’s unlikely it will become open-source,
• No, you don’t want it to happen, because it will be bad for MS,
• No, you don’t want it to happen, because it will be bad for other systems,
• You think even if MS releases the source, it will never be truly open-source,
• Something else?

:^)

1. 1

I thought this was a nice blog and wanted to follow it, but they don’t seem to have RSS or Atom feeds. A pity.

1. 1

Yep, like so many sites it’s offering an update via email instead.

1. 26

After performing over 100 interviews: interviewing is thoroughly broken. I also have no idea how to actually make it better.

yep

1. 2

Maybe Amazon’s interview is broken. This data-structure bullshit doesn’t help at all if the applicant doesn’t know shit about real work, system design, soft skills, security, teamwork, etc.

1. 4

As much as I dislike FAANG interviews, every attempt I’ve seen to fix them is also fraught with problems.

1. 6

I’d love to work for one of those big FAANG companies, but I don’t know off the top of my head how to do a BFS on a tree. So fuck it, my 20 years of development experience is garbage for them.

1. 4

There are many books and courses to prep candidates for FAANG interviews. For senior engineers it might be daunting, but to join a FAANG company some drills are to be expected.

Any company with a big enough pool of candidates will end up in a similar situation: assessing things that are largely irrelevant to the day-to-day job.

The real drama is that very smart people who could use their brainpower to improve society at large instead spend years on low-utility projects.

1. 4

You’re doing yourself a disservice by having this mindset

1. 1

Why?

1. 1

Because you don’t get to work at FAANG

2. 3

I seriously believe that if you are a good programmer, went to uni or similar, spend two weekends with “Cracking the Coding Interview”, and do 1 or 2 mock interviews to train the communication style, you have a good chance, provided you aren’t anxious or don’t have other such problems during the interview.

Without preparation most would be lost.

You can, of course, still think this is fucked, but it’s not unpassable for good programmers who don’t have anxiety problems and do have reasonably good communication skills and time to prepare.

If you are interested, I can do a mock interview with you.

1. 15

I don’t think the problem for most people is the details of playing the game. The game is learnable, and if one has gotten anywhere in this field it’s because one can learn things. The problem people have is that they question why the game, which everyone knows has no bearing on the ability to do the job at hand, needs to be played at all.

If we put our cynicism hat on (mine is pretty worn-out by now), we can answer that question by saying that what the game is about is testing people’s willingness to jump through arbitrary hoops. In that sense, it may actually accurately test their ability to function within the organization at hand, and thus may in fact be very good at its job of filtering out candidates who would not work out.

1. 5

but it’s not unpassable for good programmers without anxiety problems, reasonably good communication skills and time to prepare.

It’s not, but good programmers with 20 years of experience can always get a job someplace where they don’t have to jump through these silly hoops.

It works surprisingly well for both parties. It’s not like recruitment heads in Big Corp don’t already know this puts off experienced programmers, everyone’s been aware of that for a long time now. They just don’t want that many experienced programmers. If you’re recruiting for senior and lead positions, it’s much more efficient to go through recommendations (or promote from within) in which case the interview is… somewhat more relaxed, so to speak. The interviews are designed for something else.

(Edit: I’m with @gthm on what they’re designed for. The main aim is to select young graduates and mid-career developers who will put up with arbitrary requirements and don’t mind spending some of their free time on it every once in a while.)

1. 2

Having been through the Google interview gauntlet a few years ago, there’s quite a bit more than just whiteboarding algorithms.

I was completely unprepared for the ‘scale this data query service’ chunk, which I didn’t even know was going to be part of the interview (which is a failure of the Google recruiter frankly) but I now know is pretty standard amongst FAANG company interviews for SRE type roles. Didn’t help that the interviewer was a jerk who refused to answer any of my questions, but that’s hardly unusual!

1. 2

That part is also covered in “Cracking the Coding Interview”.

Not to invalidate your experience, but the vast majority of my interviewing experience was pleasant. Maybe you’ve had bad luck, or I’ve had good luck, or our standards are different.

1. 3

1 grumpy jerk who clearly didn’t want to be there, 2 decent guys & a third who was OK but stonewalled me when I asked questions about the problem he posed. Which was a little weird, but there it is.

(CtCI has 5 pages on system design & about 100 pages on data structures, algorithms & all the rest. When a quarter of the interview is system design, that’s not going to help you much. There are some good online resources around these days though.)

1. 4

It may be a good idea to copy down the encryption passphrase onto paper and put it in a safe space like a safety deposit box.

Just remember that “safe deposit” boxes are not particularly safe – it turns out banks don’t actually do a great job of securing them, either against loss or intrusion.

Giving the passphrase in an envelope to a trusted friend (or using a 2-of-3 secret-splitting scheme) may be the better option for some people.

1. 0

Be careful the Joker can’t exploit your threat model for dramatic purposes in the third act.

1. 4

what

1. 4

It would be really nice if DigitalOcean let you upload arbitrary ISO files and go from there, but that is apparently not the world we live in.

My cloud VM is a NixOS box on DigitalOcean. I can dig up the details of how that works if you want, @cadey. I build a NixOS VM image with some config stuff for DO, upload the image, and run that.

1. 1

I have a NixOS server on DigitalOcean. nixos-infect worked wonderfully, I just copied my configuration files over and up it went.

1. 6

People mock Perl as a write-only language but at least people write Perl instead of exclusively implementing yet another Perl interpreter.

1. 4

I’ve done a couple of these, a key-value database (once as an interview take-home question and once as a teaching tool) and a stock trading bot (when I worked in finance). They’re very different projects and both very fun and educational. Once I have time (ahahahahahaha) I’d like to try some of the others.

1. 2

I don’t know anything about the problem domain so maybe this is a silly question, but is the Euclidean metric especially meaningful here? Since the space is finite-dimensional all norms on it are equivalent and at a glance the l_1 or l_\infty norms look like they’re easier to compute.

1. 1

I wonder if the constant needed to “convert” to an L1/Linf norm would cause overflow in the uint16 implementation. I don’t think the engineer in this post had access to the underlying hash algorithm (which might be able to absorb that change).

Regardless, this is a good idea.

1. 5

Isn’t this generalized by enumerations?

I really liked the contrast of the original API vs the one using chained methods and the bitmasks. Drives the message home right away.

1. 2

Yes, and C# has an attribute called ‘Flags’ you can apply so that tooling can know these values are meant to be ORed together and help out.

Example from Microsoft:

[Flags]
public enum Days
{
None      = 0b_0000_0000,
Monday    = 0b_0000_0001,
Tuesday   = 0b_0000_0010,
Wednesday = 0b_0000_0100,
Thursday  = 0b_0000_1000,
Friday    = 0b_0001_0000,
Saturday  = 0b_0010_0000,
Sunday    = 0b_0100_0000,
Weekend   = Saturday | Sunday
}

1. 1

Is it? Bitmasks can be combined with bitwise OR; I’m not aware of similar enum combinations in the implementations I’m familiar with.

1. 1

C lets you do that with enums; from https://en.cppreference.com/w/c/language/enum :

Enumerated types are integer types, and as such can be used anywhere other integer types can, including in implicit conversions and arithmetic operators.

enum { ONE = 1, TWO } e;
long n = ONE; // promotion
double d = ONE; // conversion
e = 1.2; // conversion, e is now ONE
e = e + 1; // e is now TWO


This works in clang 12 and prints 6:

#include <stdio.h>

enum asdf { A = 1, B = 2, C = 4 };

int main() {
    enum asdf x = A;
    x = B | C;
    printf("%d\n", (int) x);
}

> clang --version
Apple clang version 12.0.0 (clang-1200.0.26.2)
Target: x86_64-apple-darwin20.1.0
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

1. 2

The lack of type- and range-checking can be considered a drawback. B | C isn’t a valid asdf value since the declaration has no item equal to 6; its type is actually int. C++ is stricter and won’t let you assign that back to an asdf variable without a typecast.

This makes using enums as bit-sets annoying in C++-compatible code. Apple’s frameworks work around this by making the bit-set type actually a typedef for unsigned, not the enum type itself.

1. 1

The lack of type- and range-checking can be considered a drawback.

So much of typing in C can and should be considered a drawback, and I wouldn’t shed a tear if software written in C went off into the sunset.

Swapping the stdio.h to cstdio and running the same file through clang++ does in fact error, which, yeah, makes sense:

clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated [-Wdeprecated]
bitmask_enum.c:7:8: error: assigning to 'enum asdf' from incompatible type
'int'
x = B | C;
~~^~~
1 error generated.


And C++ is kind of ridiculous about what you gotta do to use a scoped enumeration (c.f. https://en.cppreference.com/w/cpp/language/enum ) as content for a bitmask:

#include <cstdio>

enum class jkl { A = 1, B = 2, C = 4 };

int main() {
    enum jkl x = jkl::A;
    int f = (int)jkl::B | (int)jkl::C;
    printf("%d\n", f);
}


In conclusion, ¯\_(ツ)_/¯

1. 3

It’s another case where C++ gave us something good (enum classes) but failed to add language support for making it pleasant to use. (See also: variants; functors and iterators 2003-2010; and apparently async/await in C++20.)

You can fix this, but it requires writing a bunch of operator overloads on your enum class to add the necessary type-casts. Not rocket science, but why was this not added to the standard library?

2. 1

works in clang 12

It should work in any conformant C compiler.

1. 2

This post reads great until the conclusion, where it makes claims I find odd. The first one is:

Let’s start with computational complexity. […] which according to Cosma would correspond to an optimization problem that would take a thousand years to solve on a modern desktop computer. However, if Moore’s Law holds up, it would be possible in 100 years to solve this problem reasonably quickly.

We’re already claiming Moore’s law doesn’t hold anymore, and it seems odd to rely on it to hold for the next 100 years to make planned economies a reality at scale. As @luiz pointed out below, some modern corporations run large de facto planned economies, so an answer may lie in some direction other than raw increase in computational power.

As described earlier, the second serious issue with a centrally planned economy was data quality: […] Whether a government would be able to harness [data on demand] as competently as Amazon is doubtful, and it’s obviously worth asking whether we would ever want a government to be using that type of data.

And here I was thinking the lesson of the last decade of social media was that we couldn’t trust tech companies with our data. Why is it only worth asking whether we’d want governments to use that type of data? Where does this default trust of large corporations come from?

This latter objection seems much more serious than the former, which is “just” a tech problem. Even if we assume planned economies are better than capitalist ones, how can we trust an institution with the data it needs to run such an economy?

1. 12

I work at Booking.com. We have millions of lines of Perl in production, with more than 500 devs working on the same repo. The unpleasant parts of the experience are infra problems, not Perl.

1. 3

Oh interesting, booking(s) is still a lot/mostly Perl? I remember talking with people from the Dutch Perl community about 20 years ago, and I was wondering which direction booking has gone recently.

1. 5

Yeah, it’s mostly Perl by volume (14M lines in our main repo). Over the last five years or so some important new services have been written in Java, and we have a smattering of Go (in some infra tools), Python (infra and data science) and NodeJS (honestly I don’t know where).

The direction the company wants to go in is to carve the Perl monolith into services (which will be written either in Java or again in Perl). This creates a lot of developer work and job security, and will possibly have other effects as well.

1. 1

Booking is always named as one of the biggest (and most successful) companies using Perl, but I can’t help but wonder if it’s just the Facebook effect. If you have enough manpower and smart people you get to your goal with every half-decent technology (not a slight against PHP or Perl), but the real (and usually pointless) question is whether they succeeded because of or despite the use of said language. I’m pretty much in the “doesn’t matter, but it probably helped at some point” camp.

1. 1

This is first and foremost an academic journal snafu. I don’t like being that guy, but I don’t see how this is on topic?

1. 4

If it’s not on topic, then it’ll be removed when the moderators are awake. It wouldn’t be the first time that I’ve misunderstood what’s appropriate.

What I find interesting here is not just that the paper was retracted, but that the retraction breaks a pattern of this particular author being tenuously accepted by various journals and then the journal editors being unwilling to change their minds when presented with contrary evidence. Despite clear crankery, it was only when the topic turned to mathematics that they were defeated, and the defeat came in the form of experts pointing out that one of the implications is impossible. This is the sense in which, if mathematics were a science, it would be the hardest science; this is also the sense in which mathematics is too hard to be merely an empirical science.

Quoting the retraction notice:

Since publication, concerns have been raised about the integrity of the mathematics in the article. The main error is visible not only in the title, but in the abstract… Hurwitz’s theorem says all 8-dimensional normed division algebras over the reals are nonassociative and isomorphic to the octonions. This famous result, published in 1923, has been confirmed with a number of well-established proofs. Thus, based on these factors the Editors assess the author’s main result to be false.

The emperor had no clothes, but the empirical sciences had trouble assessing this by looking. This should teach us something about the blind spots of science.

1. 13

I love how Urbitters will just blithely claim they have magic and move on as if they had not claimed anything incredible. Easy example from the post:

exactly-once messaging: this is enforced at the network protocol level, which means you can have an app mirror another ship’s state and it “just works”

General exactly-once messaging is impossible, and it is a key part of Urbit. Not sure why people keep posting this snake oil promotion on here.

1. 2

Section 9.6 of the Urbit whitepaper explains the level of abstraction at which urbit can guarantee exactly-once messaging, and the assumptions and tradeoffs involved.

1. 9

Did you actually read it? I have read a number of Urbit ‘papers’ - every one of them (including this one) makes sense if one does not actually read it, but when you actually read it it falls apart immediately. The scheme that is proposed is trivial and only works if magic is real.

Like, they completely fail at understanding the problem, and think that the problem is that no one else has ever thought of retries, or thought of version numbers, or thought of opening a new socket when your socket unexpectedly closes, or thought of logs and replaying on node restart.

There are real schemes that look superficially similar – like, on possible network partition immediately die and lose all data, rebuild from scratch every time. This works if you have short-lived clusters and you don’t mind blowing them up and losing everything several times a day.

Or only send idempotent messages, and keep retrying forever until you get an ack - but this only works directly if you can rework everything you are doing into semilattices, which is not a general solution - for example LVars.

Bleh.

1. 8

Indeed; Section 9.6 begins:

Protocol design is much easier with the assumption of uniform persistence. In a protocol designed for uniformly persistent nodes, we can assume that no node involuntarily loses state.

The same is true for other problems in distributed computing; if we assume failures can’t happen, the solutions become simpler (or in this case, possible at all).

This is for example why Leslie Lamport published Time, Clocks and the Ordering of Events in a Distributed System in 1978, where he details a distributed state machine in a system that has no failures, and it took twelve years until the writing of The Part-Time Parliament in 1990 to figure out how to do the same in the face of component failures. (And even then, only in the face of so many failures. And it took eight years for the paper to be published. And no one understood it for years. And people invented other distributed algorithms to avoid having to understand the paper.)

1. 2

How can one protect themselves against unprovoked attacks on their weekend like this?

1. 6

Get a family. Powerful antidote to online time-wasting.

1. 2

Not sure if tongue in cheek advice or opening volley of an epic burn war.

1. 5

Can confirm; I have a family and spend no time online on the weekends.

1. 3

Nice! As a chat systems nerd, I’m always interested in people trying to do something new in this space :)

I like that a form of catchup is baked into the protocol from day 1 – being able to implement the equivalent of a message broker’s “durable subscription” is very valuable for not dropping messages on the floor (like in IRC where your connection drops). Minimalism is also a worthy goal – XMPP and Matrix do indeed try to promise the world, and are extensible enough that you can send anything over them, having a deliberately spartan alternative is a nifty idea.

One thing that does seem lacking is any sort of discussion around how bridging to other protocols might work – would it just be a special case of s2s (as in XMPP), or would you design a special extension for it (as in Matrix)?

1. 1

This is not a full federated your-server-talks-to-every-other-server type thing.

There is no s2s. It seems it is meant to compete with IRC-ish c2s only and use centralised servers.

1. 1

Yeah, because centralized servers are simple to do and relatively hard to break. And because I don’t know a whole lot about decentralized chat systems. Seems like with a centralized chat system you have to trust the owners/operators of the server your client talks to, while with a decentralized chat system you need to trust the owners/operators of every server your own server talks to.

Frankly, if I want to talk in the channel #foo@example.com then I see no reason to have to go through an intermediary home-server rather than just talk to example.com directly, and if you want a global topic #foo that any server can participate in then that seems Hard Enough I Don’t Want To Bother. And for bridging to other protocols, I’ve never seen it done well enough to be compelling enough to bother.

My mind may be changed on these points, however. Most importantly, authentication and any potential account metadata is decentralized, so no matter whose server you’re talking to, you can still own your own identity. (This part I HAVE thought about.)

1. 1

Yeah, because centralized servers are simple to do and relatively hard to break. And because I don’t know a whole lot about decentralized chat systems.

I’ve been idly wondering how one would do a decentralized chat system since I saw your sketch here. I think one would partition messages across peers with a distributed hash table, and have messages point to the most recent message posted in the chat/channel they’re in like you’ve discussed. That seems to lead to messages forming a DAG that we need some way of making eventually consistent, so maybe one needs some kind of CRDT for the messages? It’s fun to think about.

1. 1

For sure, except in a Global Network like original IRC dreams, chatrooms effectively end up being single-server. Though it’s useful as an admin to be able to choose the server my chatroom is on without requiring my users to connect to Yet Another Server.

If you have decentralized authentication/identity and account metadata (and 1:1 chats, if those are supported) then you are a fully federated type thing.

1. 2

Now I want to build something like this (IRC) distributed and federated. Great, as if I don’t have enough unfinished projects.

1. 2

I’d love to hear your thoughts on design, to be honest. Like I said, I don’t know much about distributed chat systems, but learning more would be nice.

2. 1

The primary use case of multiple servers is for operators to be able to bounce instances or have some die without losing what might be an important piece of infrastructure. Anything else, including possible load sharing or ping time optimization, is at best a nice side effect.

1. 1

Right, sorry, I should be careful to say “centralized” in this context. The number of “servers” is a red herring.

2. 1

Sometimes I wonder if federated is bad for implementations, but federated identity on whatever centralized/decentralized server we want is good.

1. 1

Like OpenID? That never took off. Nowadays Facebook is probably the largest identity provider online.

1. 1

Nice post! Where/how do you store the passphrase for automated backups?

1. 2

Thanks! The systemd unit that does the backups talks to pass to get the passphrase. It in turn relies on gpg-agent to not have to ask to unlock the password store. This works for me because I do backups during the day and my email client keeps the gpg-agent awake.
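For reference, a minimal sketch of that setup (all names invented; assuming a backup tool with a `--password-command` option, like restic’s; check your tool’s equivalent):

```ini
# backup.service (hypothetical name): the passphrase never appears here;
# pass fetches it, and pass relies on an already-unlocked gpg-agent.
[Unit]
Description=Encrypted home backup

[Service]
Type=oneshot
# %h expands to the user's home directory in user units
ExecStart=/usr/bin/restic backup --password-command "pass show backups/restic" %h
```

A matching .timer unit would then schedule it during the day, when gpg-agent is likely to be unlocked.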

1. 1

Aren’t you stuck in a chicken-and-egg problem? You encrypt your backups using a password, saved in a store. If you lose your whole $HOME, how do you recover? You need the password, which itself needs a gpg key, which is backed up, but encrypted, right?

Or maybe you back up your gpg keys and password store using other means?

1. 2

Not certain if this is what they meant, but I assume the idea is that they both memorize the passphrase (in case recovery is needed), and also don’t want to keep typing it in for automated daily backups.

1. 2

Yes, except 1password has it memorized for me.

1. 6

Between this and the recent home-manager post, I think I’ve found a new blog worth subscribing to.

1. 14

That’s very kind, but you’re in for disappointment! I hardly ever write anything.

1. 16

all the better, less cost to subscribe