1. 2

Fascinatingly (to me) the author of this article took nearly exactly the same journey I recently (I guess not so recent.. time flies) took when trying to solve the same problem. My version is here: https://dpzmick.com/posts/2021-03-28-polynomial-from-roots.html

Turns out there are also functions in numpy and co. that perform a special case of this operation, like the clearly named `zpk2tf`. These essentially just keep convolving new terms into a polynomial until all terms are consumed.
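A minimal sketch of that repeated convolution, assuming a leading coefficient of 1 and plain numeric roots. The names `convolve` and `poly_from_roots` are made up for this example; `numpy.convolve` computes the same coefficient product and `numpy.poly` does the whole expansion.

```python
def convolve(a, b):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_from_roots(roots):
    """Expand (x - r0)(x - r1)... by convolving in one (x - r) term at a time."""
    coeffs = [1]
    for r in roots:
        coeffs = convolve(coeffs, [1, -r])
    return coeffs


# (x - 1)(x - 2) = x^2 - 3x + 2
print(poly_from_roots([1, 2]))
```

Each pass through the loop grows the coefficient list by one entry, so n roots cost O(n^2) multiplications in this naive form.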

1. 1

Fascinatingly (to me) the author of this article took nearly exactly the same journey I recently (I guess not so recent.. time flies) took when trying to solve the same problem. My version is here: https://dpzmick.com/posts/2021-03-28-polynomial-from-roots.html

Oh, nice.

Turns out there are also functions in numpy and co. that perform a special case of this operation, like the clearly named zpk2tf. These essentially just keep convolving new terms into a polynomial until all terms are consumed.

Even with multiple characters available to name things, mathematicians can’t resist using cryptic names…

1. 3

Indeed, according to the official block diagram (https://i.mt.lv/cdn/product_files/CCR2004-1G-12Splus2XS_200459.png) the CCR2004 does all switching on the CPU…

Compare to the CRS309 (https://i.mt.lv/cdn/product_files/CRS309-1G-8Splus_190200.png) which has a real switching chip (but no SFP28)

1. 2

For a 25G system the CPU switching is kinda no-go..

1. 1

Along these lines, if you pass traffic between switching “groups” then the CPU will be used a lot more. My RB1100AHx4 has 2 (or 3) switching groups, so I use only the first one (first 6 ports) to avoid that.

1. 4

/tmp/ or \$HOME/tmp or \$HOME/trash/ for directories.

I have a personal policy (for files and git branches) to always delete anything I’ve previously named tmp without looking at what was in it any time I try to create a new temp thing in the same location. This ensures that I never work off of a tmp branch for very long or leave anything important in one of these files.

1. 5

`test.{ext}`

I have so many /tmp/test.go, /tmp/test.ex, /tmp/test.rb files it’s sad. Then they get lost on reboot and I curse myself out, but don’t change for some reason.

1. 2

One good thing about the name “test” is that it can be typed entirely with the left hand (on a QWERTY keyboard). Depending on the context, this can leave the right hand free to hover over the Enter key or the mouse.

1. 2

Always some variation of `test`! Unless I’m feeling particularly lazy OR there’s another `test` in that directory and I don’t want to delete it. Then `t`.

Anything with those names is always fair game for deletion, and in fact probably won’t be tolerated next time Future Me runs an `ls` command.

1. 2

I do this as well. I keep a handful of test.whatever files in \$HOME with a large number of accumulated includes and imports. All the includes I need to try something out are ready to go in those files.

1. 1

Then my mail app on my phone or email inside of emacs, or webmail

1. 1

Does anyone know what they are (might be) using for the hash function? I’ve worked on similar locality-sensitive-hash problems before and find the properties of such hashes to be pretty interesting

1. 1

According to their blog, they’re not telling

Fuzzy Hashing’s Intentionally Black Box

How does it work? While there has been lots of work on fuzzy hashing published, the innards of the process are intentionally a bit of a mystery. The New York Times recently wrote a story that was probably the most public discussion of how such technology works. The challenge was if criminal producers and distributors of CSAM knew exactly how such tools worked then they might be able to craft how they alter their images in order to beat it. To be clear, Cloudflare will be running the CSAM Screening Tool on behalf of the website operator from within our secure points of presence. We will not be distributing the software directly to users. We will remain vigilant for potential attempted abuse of the platform, and will take prompt action as necessary.

Which is unfortunate because that is much more interesting than the contents of this article.

1. 1

Thanks for the link! Seems like a reasonable call on their part.

Thinking on this a bit more, image fingerprinting seems like a really interesting problem; the problems I’ve thought about before all deal with byte streams, but with an image the hash has to be like, perceptual? Not sure what the right word is, but seems really interesting. I’ve probably found a weekend tangent haha
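For a feel of what "perceptual" can mean, here is a toy "average hash", one of the simplest published schemes. This is only a sketch and makes no claim about what Cloudflare actually uses. It assumes the image has already been converted to grayscale and downscaled to a small grid; the function names are invented for the example.

```python
def average_hash(pixels):
    """Hash a small grayscale grid: one bit per pixel, set if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits; a small distance means the images look similar."""
    return bin(a ^ b).count("1")


img = [[10, 200], [220, 30]]
brighter = [[p + 20 for p in row] for row in img]  # uniform brightness shift
inverted = [[255 - p for p in row] for row in img]  # very different picture

print(hamming(average_hash(img), average_hash(brighter)))
print(hamming(average_hash(img), average_hash(inverted)))
```

Unlike a cryptographic hash, small edits to the input are supposed to leave the output nearly unchanged, which is why comparison uses a bit distance rather than equality.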

1. 3

This comparison of perceptual hashing is a good intro to the topic: https://tech.okcupid.com/evaluating-perceptual-image-hashes-okcupid/

1. 1

I feel like this fundamentally is about stateful vs. stateless services, not push vs. pull (although this is a great TLA example!). The thing that accepts snapshots is effectively stateless, but the thing that has to store some local state and apply deltas to it is very stateful. It’s clear that the stateful thing is harder to manage operationally and likely has more failure modes.

Replacing long-lived statefulness with short-lived statefulness has gone a long way toward improving the reliability of the software I work on. My favorite example is the IGMP protocol. Consumers of a multicast group must periodically push out a message saying “I want to get packets sent to this group.” Switches only need to “remember” anything about what packets go where for a short interval, so the impact of a switch restart (which drops the what-goes-where state) is minimized.
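A rough sketch of that short-lived ("soft") state idea. This is not how any real switch implements IGMP snooping; the class name, the TTL value, and passing `now` in by hand are all assumptions made to keep the example small and deterministic.

```python
class SoftStateTable:
    """Membership entries expire unless refreshed, so a restart loses very little."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.expires_at = {}  # (group, port) -> time at which the entry expires

    def report(self, group, port, now):
        # A consumer periodically re-announces interest, which refreshes the entry.
        self.expires_at[(group, port)] = now + self.ttl

    def ports_for(self, group, now):
        # Forget anything that was not refreshed in time, then answer.
        self.expires_at = {k: t for k, t in self.expires_at.items() if t > now}
        return sorted(p for (g, p) in self.expires_at if g == group)


table = SoftStateTable(ttl_seconds=10)
table.report("239.1.1.1", port=3, now=0)
print(table.ports_for("239.1.1.1", now=5))   # still remembered
print(table.ports_for("239.1.1.1", now=11))  # expired without a refresh

# A "restart" is just an empty table; the next periodic report repopulates it.
table = SoftStateTable(ttl_seconds=10)
table.report("239.1.1.1", port=3, now=12)
print(table.ports_for("239.1.1.1", now=13))
```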

1. 4

I don’t see event-based and poll (request-reply) being mutually exclusive. Having a broker to distribute the published event is necessary and independent subscriptions (and progress) are necessary on the consumer-side. Assuming the event delivery eventually occurs, there is no real difference between that and a consumer performing a request to fetch the event.

Request-reply can be useful for bootstrapping clients or (as the article suggests) initiating a sync with the authoritative source of the information. Whether this initiation simply involves a transfer of a single state object or a stream of events to replay, that is necessary to get in sync (up to some moment). An event-based pub/sub model for general distribution is still superior to a server distribution or client poll for when online changes occur.

In other words, request-reply is useful during startup and if a timeout/fault is detected.

1. 2

Lots of financial exchange market data feeds work this way. They have a “snapshot channel” (which usually pulses on a timer) and an “incremental channel”. At startup (or restart after a crash), you can bootstrap state from the snapshot channel, then consume small deltas from the update channel. Since everything is produced by a single producer, you can use the timestamps in messages to know where you are in the stream.
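The bootstrap step can be sketched roughly like this. The message shapes (`ts`, `prices`, `symbol`, `price`) are invented for illustration; real feeds define their own formats and often carry sequence numbers as well.

```python
def bootstrap(snapshot, buffered_deltas):
    """Rebuild state from one snapshot plus only the deltas newer than it."""
    state = dict(snapshot["prices"])
    last_ts = snapshot["ts"]
    for delta in sorted(buffered_deltas, key=lambda d: d["ts"]):
        if delta["ts"] <= last_ts:
            continue  # already reflected in the snapshot
        state[delta["symbol"]] = delta["price"]
        last_ts = delta["ts"]
    return state, last_ts


snapshot = {"ts": 100, "prices": {"ABC": 10.0}}
deltas = [
    {"ts": 90, "symbol": "ABC", "price": 9.0},    # older than the snapshot
    {"ts": 110, "symbol": "ABC", "price": 11.0},
    {"ts": 105, "symbol": "XYZ", "price": 5.0},
]
print(bootstrap(snapshot, deltas))
```

The single-producer assumption is what makes the timestamp comparison safe: with several producers, timestamps from different sources would not order the stream.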

1. 24

I’m the original designer of the Atreus; happy to answer any questions.

1. 1

Why did you choose a fixed split keyboard instead of an adjustable split keyboard?

I can’t find the reason in your blog post or in the Atreus repository.

Notes:

• By fixed split, I mean something like the Atreus.
• By adjustable split, I mean something like the ErgoDox.
1. 1

Found. https://technomancy.us/172 Thanks for a very thorough history, reasoning, and decision.

I work from local coffee shops frequently, and the Advantage is just too clunky to toss in a bag and tote around.

Update: I’ve designed my own keyboard, which is meant to be a smaller, more travel-friendly complement to the Ergodox that shares a lot of its characteristics.

2. 1

Do you find it difficult to switch back and forth between the Atreus and a standard keyboard? I would be concerned that, given time, it would be problematic given how many keys on the Atreus require using a layer. Would switching between keyboard types cause me to focus too much on the typing and not on what I am typing?

1. 4

I’ve found that the weirder the weird keyboard is, the easier it is to switch between the weird one and a normal one. I used to use a standard qwerty 60% keyboard at work, with lots of special bindings/layers, and a normal laptop at home. This was constantly problematic because I’d try to use my special arrow key bindings and they obviously didn’t work anywhere.

I’ve since switched to a kinesis for “work” (now my desk) and I no longer have any problems typing on my laptop because it’s so much different in every way. I also got an atreus and played around with it for a bit and I feel like it is likely in the “weird enough to be okay” territory due to the non-staggered key layout (forgot the technical term for this)

The only exception to this rule is that I can hardly use a computer if caps-lock isn’t bound to control, but that’s a different problem.

1. 1

I actually do this. Surprisingly enough, switching is mostly painless. I use Colemak on all keyboards, and muscle memory works itself out somehow, at least 95%.

1. 1

My experience as a laptop user is that even though I greatly prefer the Atreus, having to plug it into my laptop means that I don’t use it 100% of the time; sometimes I’ll open my laptop for something really quick and won’t get the external keyboard plugged in. This is infrequent, but it has been enough for me to maintain my ability to type on a conventional keyboard.

However, if you only very rarely use a laptop, this might not apply; can’t speak to that.

2. 1

How easy is it to use a three-finger chord key? I have a keyboardio model 1 and find that three-finger chords - in particular the alt-shift-arrows that I use all the time in Eclipse - become an effectively impossible to type four-finger chord (since arrow keys need a modifier).

1. 1

Depends on which three fingers! I’ve been using ctrl-alt-letter chords since long before building the Atreus, because I’m an Emacs user. I don’t use any programs which require you to hold down shift while moving the cursor, so I can’t really say authoritatively, but alt-shift-arrows sounds like a key chord I would like to rebind to something less awkward even on a conventional keyboard.

If that was a combo I had to use a lot and could not fix in software for some reason, I would probably remap my keyboard so that the alt key was adjacent to the shift key so that a single thumb could hit both.

2. 1

Got mine one month ago and I’m experimenting with different layouts. I’m quite happy with just the main layer and a symbols+numbers+f-keys layer, and I still have a bunch of unused keys in the second layer.

The software is nice, but I wish it allowed sending macros (for typing accented characters using a non-international US keymap, for instance). I might try menelaus at some point if you think it can handle that.

The article mentions it was designed with a resting position for the pinkies at Z and ‘/’ in mind. Is that correct? I might experiment with that configuration using them also as shift modifiers when pressed.

1. 1

The software is nice, but I wish it allowed sending macros (for typing accented characters using a non-international US keymap, for instance).

I’m like … 99% sure that this limitation is part of the GUI frontend, not the underlying firmware implementation itself. So the path of least resistance would be to build Kaleidoscope.

I might try menelaus at some point if you think it can handle that.

It definitely can’t handle that out of the box, but depending on your relative familiarity with C++ toolchains vs Scheme, it could conceivably be easier to implement that functionality in Menelaus than to configure it as existing functionality in Kaleidoscope. Only one way to find out!

1. 1

What about the last bit? Do you rest the pinkies at Z and /?

1. 1

Oh, no I keep them on A and semicolon normally, but I hit the outermost top keys with my ring finger instead of the pinky. The pinky only hits A/Z and semicolon/slash (well, the dvorak equivalents of where those are on qwerty) and occasionally enter/esc; tho I usually use Ctrl-m instead of the enter key since it sends the ASCII equivalent of enter.
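The Ctrl-m equivalence follows from how terminals traditionally derive control codes: take the key's ASCII value and keep only the low five bits. A tiny sketch, with `ctrl` as a made-up helper name (GUI applications may still tell the two key presses apart):

```python
def ctrl(key):
    """Control code a terminal sends for Ctrl-<key>: the low five bits of the key."""
    return ord(key.upper()) & 0x1F


print(ctrl("m") == ord("\r"))  # Ctrl-m is carriage return, same as Enter
print(ctrl("i") == ord("\t"))  # Ctrl-i is Tab
```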

1. 5

I wonder if machine learning is popular not because it leads to optimal solutions, but because it appears to be a solution path that doesn’t seem to require a lot of knowledge about the problem domain, or even much in the way of mathematics.

The method used in the original video is slow, clumsy and suboptimal, but very likely to be applicable to a completely different problem. It’s presented as a general problem solving skill, rather than a way to solve a specific problem.

1. 1

I agree with the author here, but, playing devil’s advocate, it is sort of handy to have a tool you can whack a lot of problems with, then move on to other problems. I’m not one of these, but I’m guessing a machine learning practitioner could whack this with a few ML hammers pretty quickly, get something working, then move on to other problems. ML seems to be doing an okay job at “general hammering”

I guess another way of saying that is that I’m not sure this is true:

These polynomials are obviously much faster than a neural network, but they’re also easy to understand and debug.

If you’re on a team of ML people who are very comfortable hitting everything with pytorch, maybe the pytorch solution is easier?

1. 1

I’m not even talking about the practical applications of the pytorch hammer (and to be clear: I also prefer the author’s approach), I’m talking about its reputation. Every day we’re presented with new, exciting applications for machine learning. One day it’s generating faces, the next day it’s generating text. It’s not strange that people see it as a generalist skill that’s worthwhile to acquire.

As opposed to plain old boring mathematics.

1. 2

I’ve finally reached the tipping point of {running out of Dropbox space, hating YT Music for playing my personal MP3 collection (I used to be on Google Play Music), being worried about Google Photos changes} and have decided to roll my own solution.

Several approaches come to mind, and I need to figure out what the right subset of the following is:

• Install Syncthing on an old laptop and keep it running 24/7
• Regular backups of that to Backblaze or Tarsnap or something
• Install Syncthing on a DigitalOcean VPS
• Buy a server on Craigslist
• If I stick with the pure Syncthing route, use local photo viewers, music players, etc.
• Otherwise, investigate Ampache and Funkwhale for music
• Investigate photo apps (Piwigo and…there must be others)
1. 1

I’m very interested in hearing how this turns out! I’m in the same boat, Google Play Music was okay, and I had a bit of my own collection uploaded. YouTube music is terrible though, but I don’t see any reasonable alternatives at this time..

I have gone down the NAS route already, but gave up on both photos and music (some ill-formed ranting is here: https://dpzmick.com/posts/2020-02-01-homelab4-cloud.html). In short, nothing I found off-the-shelf to run on the server really worked for photos/music either.

1. 1

I’m pretty happy with my music solution now:

1. The “source of truth” copy is on an old laptop running Debian testing and syncing with Syncthing. Uptime is currently 17 days.
2. I have two folders, `music/core-set` and `music/archive`. The former syncs to both my main laptop and my phone, the latter only to my laptop (to save space on my phone). They’re 22GB and 1.5GB respectively, so maybe I should clean up a little harder. But there’s plenty of space on my phone.
3. I’ve been experimenting with different music players on my phone. It’s nice to have options. Currently I’m using Metro and it’s fine. I plan to try out Odyssey at some point.

I do not yet have an off-site backup (apart from YT Music). I plan to set one up soon, but I figured the odds of two computers at home and the phone in my pocket all blowing up at the same time this month were slim.

1. 2

This article looks fantastic. Wish I had seen it a few months ago when I set up some Alpine Linux routers on some (rather expensive) “Protectli Vault 4 Port, Firewall Micro Appliance/Mini PC” micro-PCs I found on Amazon.

1. 2

I’m using OpenBSD on one of those Protectli Vault boxes and it works great. I bought it to replace my old APU2-based OpenBSD router, which served me well for years but once I upgraded to gigabit service it couldn’t keep up.

1. 1

Alpine Linux is what I use for my router/firewall at home as well. It’s fantastic and I also use unbound for DNS. Alpine linux feels like a BSD at times, it’s quite nice.

1. 1

Tangentially related, but I’ve been bitten by another case of undefined behavior in production code with `mem*` functions.

consider this code:

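``````
#include <stddef.h>
#include <stdlib.h>  /* abort */
#include <string.h>  /* memcpy */

extern void* get_ptr(size_t* out_n);

int main(void) {
    size_t n_a, n_b;
    void* a = get_ptr(&n_a);
    void* b = get_ptr(&n_b);

    /* Undefined behavior if a or b is NULL, even when n_b is 0. */
    memcpy(a, b, n_b);
    /* The compiler may assume a != NULL after the call and drop this check. */
    if (!a) abort();
}
``````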

abort will never be called: the compiler assumes that memcpy’s pointer arguments are non-NULL, so it treats the later `!a` check as dead code (godbolt: https://godbolt.org/z/e34ETT). We had some code like `memcpy(a, b, n)` in which `a` would be NULL sometimes when `n` was zero.

The same is true for many other `mem*` functions, including `memcmp`

1. 7

I’m fine with this being here, I want to know about alternative platforms.

1. 2

agreed. I’m very interested in tracking what’s happening with hardware platforms like this and a big part of hardware platforms is hardware releases

1. 3

As often comes up here and on HN, it seems like self-hosting is often a step that young languages take to stress-test the language, but compiling a programming language is a different task from general-purpose programming.

I’m curious if anyone knows of any relatively popular languages that did not go through the self-hosting-compiler stepping stone, and, if any, what project did those languages use to sort of “prove themselves”?

1. 3

Swift hasn’t gone self-hosting at least yet, the compiler is still in C++.

1. 2

Rust is not self-hosting either, since it relies on LLVM and `rustc` alone is not sufficient to build `rustc` from source.

1. 2

Thinking for like 30 seconds more, I have a list:

• C# (recently it is self-hosting but was popular before self hosting)
• python, javascript, ruby, php, R; although not sure if interpreted languages count

It looks like most of the other relatively-mainstream (non-interpreted) languages took the self-hosting route so maybe there aren’t any success stories/project examples for languages that took a different approach

1. 1

Sticking with just compiled languages, these don’t have self-hosting compilers (brackets show compiler implementation language): Java (C++), D, Objective-C, Julia (all C/C++ and LLVM), Elm (haskell), etc. Fortran probably had a self-hosted compiler at one point, but nowadays all the implementations are in C.

1. 2

I didn’t include java b.c. this wikipedia page lists it as a language with self-hosting compiler: https://en.wikipedia.org/wiki/Self-hosting_(compilers) but I don’t see any evidence of that anywhere else.

Objective-C and Fortran/Julia are both interesting as far as the how did they prove themselves question goes. Both sort of had a target market in mind, so they just started building things to meet the needs of that market. I suppose elm falls into that category as well.

I guess something like a web server or database might be a good proof-of-viability for a language like zig as well, however, it looks like my premise for “why selfhost” is wrong in zig’s case, see: https://github.com/ziglang/zig/issues/89#issuecomment-328214707

1. 4

IMO the most compelling pro for self-hosting is that everyone who uses the language can read/debug/extend the compiler. It’s much easier to onboard new contributors if you don’t require expertise in an additional language.

1. 12

Protobufs are an attempt at a solution for a problem that must be solved at a much lower level.

The goal that Protocol Buffers attempt to solve is, in essence, serialization for remote procedure calls. We have been exceedingly awful at actually solving this problem as a group, and we’ve almost every time solved it at the wrong layer; the few times we haven’t solved it at the wrong layer, we’ve done so in a manner that is not easily interoperable. The problem isn’t (only) serialization; the problem is the concept not being pervasive enough.

The absolute golden goal is having function calls that feel native. It should not matter where the function is actually implemented. And that’s a concept we need to fundamentally rethink all of our tooling for because it is useful in every context. You can have RPC in the form of IPC: Why bother serializing data manually if you can have a native-looking function call take care of all of it for you? That requires a reliable, sequential, datagram OS-level IPC primitive. But from there, you could technically scale this all the way up: Your OS already understands sockets and the network; there is no fundamental reason for it to be unable to understand function calls. Maybe you don’t want your kernel to serialize data, but then you could have usermode libraries help along with that.

This allows you to take a piece of code, isolate it in its own module as-is and call into it from a foreign process (possibly over the network) without any changes on the calling sites other than RPC initialization for the new service. As far as I know, this has rarely been done right, though Erlang/OTP comes to mind as a very positive example. That’s the right model, building everything around the notion of RPC as native function calls, but we failed to do so in UNIX back in the day, so there is no longer an opportunity to get it into almost every OS easily by virtue of being the first one in an influential line of operating systems. Once you solve this, the wire format is just an implementation detail: Whether you serialize as XML (SOAP, yaaay…), CBOR, JSON, protobufs, flatbufs, msgpack, some format wrapping ASN.1, whatever it is that D-Bus does, or some abomination involving punch cards should be largely irrelevant and transparent to you in the first place. And we’ve largely figured out the primitives we need for that: Lists, text strings, byte strings, integers, floats.
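To make "the wire format is just an implementation detail" concrete, here is a toy sketch in which the call site looks like a local call and the codec is swappable. Every name here (`Service`, `JsonCodec`, `Proxy`, `handle`) is hypothetical, the "transport" is an in-process function standing in for a socket, and there is no error handling at all.

```python
import json


class Service:
    """The code being called; it knows nothing about transport or encoding."""

    def add(self, a, b):
        return a + b


class JsonCodec:
    """One possible wire format; anything with dumps/loads could replace it."""

    def dumps(self, obj):
        return json.dumps(obj).encode()

    def loads(self, data):
        return json.loads(data.decode())


def handle(service, codec, request_bytes):
    """Server side: decode the call, run it, encode the result."""
    call = codec.loads(request_bytes)
    result = getattr(service, call["method"])(*call["args"])
    return codec.dumps({"result": result})


class Proxy:
    """Client side: attribute access turns into an encoded request."""

    def __init__(self, transport, codec):
        self._transport = transport
        self._codec = codec

    def __getattr__(self, method):
        def call(*args):
            request = self._codec.dumps({"method": method, "args": list(args)})
            reply = self._transport(request)
            return self._codec.loads(reply)["result"]
        return call


codec = JsonCodec()
service = Service()
remote = Proxy(lambda req: handle(service, codec, req), codec)
print(remote.add(2, 3))  # reads like a local call
```

What the sketch leaves out is exactly the hard part: timeouts, partial failure, and versioning do not disappear just because the syntax looks local.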

Trying to tack this kind of thing on after the fact will always be language-specific. We’ve missed our window of opportunity; I don’t think we’ll ever solve this problem in a satisfactory manner without a massive platform shift that occurs at the same time. Thanks for coming to my TED talk.

1. 5

You might want to look into QNX, an operating system written in the 80s.

1. 1

It should not matter where the function is actually implemented.

AHEM OSI MODEL ahem

/offgetlawn

2. 3

I’ve been thinking along the same lines. I’m not really familiar with Erlang/OTP but I’ve taken inspiration from Smalltalk which supposedly influenced Erlang. As you say it must be an aspect of the operating system and it will necessitate a paradigm shift in human-computer interaction. I’m looking forward to it.

1. 2

I’ve been finding myself thinking this way a lot recently, but I’ve also been considering a counterpoint: all software is fundamentally just moving data around and performing actions on it. Trying to abstract moving data and generalizing performing actions always just gets me back to “oops you’re designing a programming language again.”

Instead, I’ve started to try and view each piece of software that I use as a DSL for a specific kind of data movement and a specific kind of data manipulation. In some cases, this is really easy. For example, the jack audio framework is a message bus+library for realtime audio on linux, dbus does the message bus stuff for linux desktopy stuff, and my shell pipelines are a super crude data mover with fancy manipulation tools.

Rampant speculation: the lack of uniformity in IPC/RPC mechanisms boils down to engineering tradeoffs and failure modes. Jack can’t use the same mechanism that my shell does because jack is realtime. dbus shouldn’t use full-blown HTTP with SSL to send a 64 bit int to some other process. Failure modes are even more important, a local function call fails very differently from an RPC over a TCP socket fails very differently than an RPC over a UDP socket fails very differently than a multicast broadcast.

I feel like the abstractions and programming models we have/use leak those engineering tradeoffs into everything and everybody ends up rolling their own data movers and data manipulator DSLs. From my limited exposure, it seems like orgs that are used to solving certain kinds of problems end up building DSLs that meet their needs with the primitives that they want. You say those primitives are “lists, text strings, byte strings, integers, floats”, but I’d just call all of those (except maybe floats) “memory” which needs some interpretation layer/schema to make any sense of. Now we’re back into “oops I’m designing an object system” or “oops I’m coming up with rust traits again” because I’m trying to find a way to wrangle memory into some nice abstraction that is easily manipulable.

In conclusion I keep finding myself saying things very similar to what you’ve written here, but when I’ve explored the idea I’ve always ended up reinventing all the tools we’ve already invented to solve the data movement + data manipulation problems that programs are meant to solve.

1. 2

cap’n proto offers serialisation and RPC in a way that looks fairly good to me. Even does capability-based security. What do you think is missing? https://capnproto.org/rpc.html

1. 3

Cap’n proto suffers from the same problem as Protobuffers in that it is not pervasive. As xorhash says, this mechanism must pervade the operating system and userspace such that there is no friction in utilizing it. I see it as similar to the way recent languages make it frictionless to utilize third-party libraries.

2. 2

well, the fundamental problem imho is pretending that remote and local invocations are identical. when things work you might get away with it, but mostly they don’t. what quickly disabuses you of that notion is that some remote function calls have orders of magnitude higher turnaround time than local ones.

what does work is asynchronous message passing with state-machines, where failure modes need to be carefully reasoned about. moreover it is possible to build a synchronous system on top of async building blocks, but not so the other way around…
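A small sketch of a synchronous call layered on asynchronous message passing, using threads and queues from the standard library. The actor, its message names, and the one-second default timeout are all assumptions for the example.

```python
import queue
import threading


def counter_actor(inbox):
    """A tiny state machine: its only state is `count`, changed only by messages."""
    count = 0
    while True:
        msg, reply_to = inbox.get()
        if msg == "stop":
            break
        if msg == "incr":
            count += 1
        if reply_to is not None:
            reply_to.put(count)


def call(inbox, msg, timeout=1.0):
    """Synchronous request/reply built from two asynchronous sends."""
    reply_to = queue.Queue(maxsize=1)
    inbox.put((msg, reply_to))            # asynchronous send
    return reply_to.get(timeout=timeout)  # block, raising queue.Empty on timeout


inbox = queue.Queue()
worker = threading.Thread(target=counter_actor, args=(inbox,), daemon=True)
worker.start()

inbox.put(("incr", None))   # fire-and-forget
print(call(inbox, "incr"))  # request/reply; the caller must handle a timeout

inbox.put(("stop", None))
worker.join()
```

The timeout is the point: the synchronous wrapper has to expose the failure mode of the underlying messages instead of hiding it.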

1. 5

Hacking on rustc to add codegen for some floating point intrinsics. This is my first ever real compiler work, so I’m fairly excited.

1. 1

for x86 or something else? which intrinsics? I’d be interested in skimming the patch when it’s ready purely to satisfy some curiosity (I’ll have nothing of value to add though)

1. 2

I’m working on adding AVX 512f floating point comparisons. I ended up not needing to do any modifications to rustc. There are simd codegen paths for floating point comparisons, but they don’t support specifying rounding mode and aren’t as fine grained as the full set of intrinsics. I was having trouble getting the relevant LLVM intrinsic to link, so I thought I was going to need to add to the simd codegen. In the end, I managed to get it to link and saved myself some time.

1. 2

I’ve been running an old dell r720 from ebay under my desk for a few months. It’s pretty quiet and doesn’t seem to need any extra cooling. The biggest advantage I’ve found with having a rack-mounted server is that I can log into the remote management console and kick it when I am not at home. Many enterprise tower workstations also have this feature, but a used rack-mount server is a lot cheaper than a used tower with comparable specs (at least it was when I was looking). I’d probably prefer the tower.

I was looking at buying used HP machines, but people have said that HPs are pretty unfriendly to “non-certified” PCIe cards and will spin the fans at 100% if one is installed. My dell did this too but I managed to find an IPMI trick to disable this feature. Not sure if the HP rumor is actually true or not.

I use this machine as a ZFS box and for a number of other long running services (getting ZFS to work was a bit of a chore, I had to replace the built in raid controller with the “low end” model of the same controller, flashed with an IT mode firmware (these can be purchased on ebay preflashed by hobbyists))

On speeds, I have much faster internet than you and still find it irritating that my local laptop drive can do 2-3 GB/s (NVMe) but if I want to send something to my NFS mount, I’m constrained to (best case) gigabit. My local network is all 10 gig ethernet for this reason. As others have said, ssh/tmux probably don’t need lots of bandwidth (though if you have a really crappy provider, you might have latency issues).
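The gap is easy to put numbers on. This back-of-the-envelope sketch assumes ideal line rates with no protocol or filesystem overhead, and the 100 GB transfer size is arbitrary:

```python
def transfer_seconds(size_gb, rate_gb_per_s):
    """Ideal transfer time, ignoring protocol and filesystem overhead."""
    return size_gb / rate_gb_per_s


GIGABIT = 1 / 8       # 1 Gbit/s is 0.125 GB/s
TEN_GIGABIT = 10 / 8  # 10 Gbit/s is 1.25 GB/s
NVME = 2.5            # midpoint of the 2-3 GB/s quoted above

for name, rate in [("1GbE", GIGABIT), ("10GbE", TEN_GIGABIT), ("NVMe", NVME)]:
    print(f"{name}: {transfer_seconds(100, rate):.0f} s for 100 GB")
```

So even 10 gigabit Ethernet only gets to about half of what the local drive can sustain, while plain gigabit is roughly twenty times slower.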

1. 2

but a used rack-mount server is a lot cheaper than a used tower with comparable specs (at least it was when I was looking)

That is a large factor in my thinking so far :)