1. 3

    I was wondering if you could go a bit more in-depth about your network storage.

    1. 5

      Sure, which parts are you interested in?

      I think https://michael.stapelberg.ch/posts/2016-11-21-gigabit-nas-coreos/ should give a good introduction, if you haven’t read that yet

    1. 7

      PostgreSQL is a great database, but I think people could use SQLite a lot more often.

      1. 5

        I bought a Planck EZ Glow and I love it! My typing speed increased by like 20 wpm over the shitty MacBook Pro keyboard, although I realize that’s a low bar.

        Now I want an ergonomic keyboard, but for portability + fitting onto small desks, the Planck EZ is hard to beat. I’ll probably end up buying either Keyboard.io’s Model 01 refresh (the apparently upcoming Model 100) if it’s good, or an Ergodox EZ refresh. I want USB-C.

        I also am backing the Keyboardio Atreus Kickstarter for funsies, even though it kind of fills the same role as the Planck EZ… I’ll see which one I like better, and give away or sell the other most likely.

        1. 2

          Second this, I’m also using a Planck-EZ as my daily driver. I customized it, put in some lubed Holy Pandas and a nicer keycap set. Take a look at my layout!

          1. 3

            Nice, I like the spacebar as a layer toggle when held! Very clever, I’m going to have to steal that.

            If we’re sharing layouts, here’s mine: https://configure.ergodox-ez.com/planck-ez/layouts/9wqxW/latest/0

          2. 1

            I was just looking at a Planck EZ the other day. A few questions:

            • Does the case build quality feel good? It looks like the ones straight from OLKB are aluminum and the EZs are plastic.
            • Do you like the MIT layout (2U space bar)?
            • Did you get an older one that didn’t have USB-C? It looks like all the ones today have it.

            As a no-longer-insecure-about-it Vim user (I tried Emacs three times and it’s not for me), I’m beginning to think that a 40% might be just what I’m looking for (coming from a mechanical 100%).

            1. 3
              • It feels pretty solid to me. I wouldn’t use it as a battering ram, but I don’t feel insecure about throwing it in my backpack and carrying it around. When I wrote to the company asking about carrying cases, they said that the main concern was keeping debris out of the keyboard, not really protecting it from bumping against things; it’s pretty sturdy. (I was annoyed they didn’t have any official carrying cases, but I bought a Nintendo Switch case and it fit the keyboard very well once I cut out the irrelevant stand for propping up the Switch inside the case.)
              • The space bar works pretty well for me. I actually held off on buying a Keyboardio Model 01 because of the strange “space button” configuration where there’s only a single little Space key on the right hand; I was like “I want to be able to hit the spacebar with either hand!” The Planck EZ’s 2U spacebar definitely works in that regard. But I found that when I’m touch-typing at speed, I never hit the spacebar with my left hand, so I would probably be fine with a 1U space button for my right hand only.
              • No, the Planck EZ I have has USB-C, but that was a reason I bought it over the current Ergodox EZ or the Keyboardio Model 01, neither of which have USB-C.
              1. 2

                … or the Keyboardio Model 01, neither of which have USB-C

                Just FYI, the Model 01 does actually have a USB-C port

                1. 2

                  Whoops, you’re totally right! Maybe I should just buy one after all.

                2. 1

                  Awesome, thanks! I think I might buy one.

            1. 1

              Interesting take! I think I understand what the author means. As things currently stand, AI can’t match human performance because it can’t deal with too much randomness in its input. That’s a fair point and I’d have to agree with it, but that doesn’t mean that advances in AI won’t enable a car to be fully autonomous.

              We will probably invent something that can better mimic the human way of learning, maybe even a supercharged version of it. Look at GPT-2.

              1. 3

                The last sentence of item 7 is dubious. The right language for the right job can make a lot of difference. It shapes the way you think by making you apply certain concepts which can be great or not-so-great for certain tasks. I think there’s a pretty thick line between tribalism / fanboyism / evangelism and knowing which language is better for which job, these are not mutually exclusive.

                1. 1

                  But “the right language for the right job” is a long way from what I suspect he experienced as tribalism, which was likely only ever using one language, no matter the situation. I think this is less prevalent than it used to be, but it’s definitely a thing.

                  1. 1

                    It definitely still is a thing, but it’s a far cry from “the language doesn’t matter”. It’s not as if there’s nothing in between tribalism and acknowledging that the language matters to a certain degree. Only a Sith deals in absolutes.

                  2. 1

                    For every 10 times I hear that the specific language makes a difference, probably 8 or 9 of them it’s tribalism. I’ve also been scarred by lots of enterprise mandates that try to limit language choice and let it stagnate.

                    Whenever I hear this, I try to check the commit history of the person. If they have a mix of languages used, then I worry less. If they have a sustained history of only using a single language or approach, not even toy projects or minor fixes, then I strongly suspect tribalism.

                  1. 24

                    In some cases, I have a great deal of sympathy for the author’s point.

                    In the specific case of the software that triggered this post? Not so much. The author IS TALKING ABOUT A SENDMAIL MILTER when they say that

                    Python 2 is only legacy through fiat

                    No. Not in this case. An unmaintained language/runtime/standard library is an absolute environmental hazard in the case of a sendmail milter that runs on the internet. This is practically the exact use case that it should absolutely be deprecated for, unless you’re prepared to expend the effort to maintain the language, runtime and libraries you use.

                    This isn’t some little tool reading sensor data for an experiment in a closed environment. It’s processing arbitrary binary data from untrusted people on the internet. Sticking with this would be dangerous for the ecosystem and I’m glad both python and linux distro maintainers are making it painful for someone who wants to.

                    1. 2

                      A milter client doesn’t actually process arbitrary binary data from the Internet in a sensible deployment; it encapsulates somewhat arbitrary binary data (email messages and associated SMTP protocol information that have already passed some inspection from your MTA), passes it to a milter server, and then possibly receives more encapsulated binary data and passes it to the MTA again. The complex binary milter protocol is spoken only between your milter client and your milter server, in a friendly environment. To break security in this usage in any language with safe buffer handling for arbitrary data, there would have to be a deep bug that breaks that fundamental buffer safety (possibly directly, possibly by corrupting buffer contents so that things are then mis-parsed at the protocol level and expose dangerous operations). Such a deep break is very unlikely in practice, because safe buffer handling is at the core of all modern languages (not just Python but also e.g. normal Rust) and it’s very thoroughly tested.

                      (I’m the author of the linked-to blog entry.)

                      1. 2

                        I guess I haven’t thought about one where it would be safe… the last one I worked on was absolutely processing arbitrary binary data from the internet, by necessity. It was used for encrypting/decrypting messages, and on the inbound side, it was getting encrypted message streams forwarded through from arbitrary remote endpoints. The server could do some inspection, but that was very limited. Pinning it to some arbitrary library version for processing the message structures would’ve been a disaster.

                        That’s my default frame of reference when I think of a milter… it processes information, either on the way in or the way out, that sendmail doesn’t know how to handle and therefore can’t really sanitize.

                        1. 1

                          For us, our (Python) milter client sits between the MTA and a commercial anti-spam system that talks the milter protocol, so it gets a message blob and some metadata from the MTA, passes it off to the milter server, then passes whatever the milter server says about the email’s virus-ness and spam-ness back to the MTA. This is probably a bit unusual; most Sendmail milter clients are embedded directly into an MTA.

                          If our milter client had to parse information out of the message headers and used the Python standard library for it, we would be exposed to any bugs in the email header parsing code there. If we were making security-related decisions based on header contents (even things like ‘who gets how much spam and virus checking’), we could have a security issue, not just a correctness or DoS/crash one (and crashes can lead to security issues too).

                          (We may be using ‘milter client’ and ‘milter server’ backward from each other, too. In my usage I think of the milter server as the thing that accepts connections, takes in email, and provides decisions through the protocol; the milter clients are MTAs or whatever that call up that milter server to consult it (and thus may be e.g. email servers themselves). What I’m calling a milter server has a complicated job involving message parsing and so on, but a standalone client doesn’t necessarily.)

                          1. 2

                            Mine was definitely in-process to the MTA. (I read “milter” and drew no client/server distinction, FWIW. I had to go read up just now to see what that distinction might even be.) Such a distinction definitely wasn’t a thing I had to touch in the late 2000s when I wrote the milter I was thinking about as I responded.

                            The more restricted role makes me think about it a little differently, but it’d still take some more thinking to be comfortable sitting on a parsing stack that was no longer maintained, regardless of whether my distro chose to continue shipping the interpreter and runtime.

                            Good luck to you. I don’t envy your maintenance task here. Doubly so considering that’s most certainly not your “main” job.

                      2. 1

                        Yeah, it’s a good thing they do; it’s not the distro maintainers’ fault that Python 2 became deprecated.

                      1. 15

                        Isn’t this a complaint about the lack of free support from distros? Am I misunderstanding?

                        Perhaps a group of people would like to start a paid support service for Python 2?

                        1. 8

                          RHEL will be supporting a Python 2 interpreter until at least June of 2024. Potentially longer if they think there’s enough money in offering another “extended lifecycle” (which got RHEL 6 up to a whopping 14 years of total support from its initial release date).

                          1. 2

                            Alternately, something can be “done” and never need to be touched again. Expiring toolchains break a lot of “done” code.

                            1. 20

                              In the current security landscape? Are you serious? No code is perfect. New flaws in old code are being found and exploited all the time.

                              1. 1

                                Obviously python is large enough to be a security problem, but take e.g. boltdb in the golang world. It doesn’t need more commits, unless golang shifts under it. I believe it’s possible to have code that’s useful and not a possible security problem.

                                1. 15

                                  I don’t understand where you’re coming from here. I’m not a Golang fan, but looking at the boltdb repo on github I see that it’s explicitly unmaintained.

                                  You’re saying that you don’t think boltdb will ever have any serious security flaws that need addressing?

                                  I don’t mean to be combative here, but I have a hard time swallowing this notion. Complex software requires maintenance in a world where the ingenuity of threat actors is ever on the increase.

                                  1. 3

                                    Maybe it wouldn’t need any more commits in terms of features, which is most likely true in the case of Bolt, as the README states that it focuses on simplicity and doing one thing. But there’s no way to prove that something is secure; you can’t know if there’s a certain edge case which will result in a security vulnerability. And in that sense, it does require maintenance. We can’t prove that something is secure; we can only prove that something is insecure.

                                    1. 1

                                      In fact, the most recent Go 1.14 release adds a -d=checkptr compiler flag to look for invalid uses of unsafe, which is enabled by default for -race and -msan builds. And because boltdb does invalid unsafe pointer casts all over the place, it causes fatal errors if you like to run with -race in CI, for example.

                                      So yeah, Go indeed did shift from under it very recently.

                                  2. 6

                                    Some things might be able to. I do not personally believe a sendmail milter is one of those things that can be “done” and never need to be touched again. Unless email itself becomes “done”, I suppose.

                                1. 3

                                  Honest question: why would you use nvim in a GUI that functions like the nvim TUI?

                                  1. 12

                                    Reasons I do:

                                    • Ligature support
                                    • Animated cursor
                                    • Faster performance
                                    • Possibility of future graphical features such as blurred floating windows, frameless window, etc

                                    Reasons I’ve heard from other users:

                                    • Identical cross platform experience

                                    Terminal is great for some things, but these days I use neovim as my terminal emulator, so I guess I’d ask it the other way around. Why use TUI when you can have a terminal inside of neovim?

                                    1. 0

                                      Reasons I do:

                                      • Ligature support

                                      Notice that many terminals support fonts with ligatures. You can see a handy table of terminals in the FiraCode documentation: https://github.com/tonsky/FiraCode In particular, Konsole, QTerminal, Windows Terminal, kitty, and iTerm have full ligature support.

                                      • Animated cursor

                                      What exactly do you mean by that? The blinking of the cursor? I prefer non-blinking cursors, but surely all terminals support a blinking cursor.

                                      • Faster performance

                                      I seriously doubt that the terminal inside the editor will be faster than a native terminal. Do you have benchmarks for that? Anyhow, terminal performance is very rarely an issue nowadays.

                                      • Possibility of future graphical features such as blurred floating windows, frameless window, etc

                                      What are “blurred floating windows” and why would you want such a horrific thing?

                                      1. 7
                                        • Ligature Support

                                        On Windows, the only terminal that supports ligatures with any semblance of performance is the Windows Terminal, which is a new app that has issues of its own and doesn’t have mouse pass-through at all. Maybe I’ve missed others?

                                        • Animated Cursor

                                        The GUI supports a smear effect on the cursor which helps the user track where the cursor jumps. I find in my usage that I lose track of the cursor in some cases. The readme has a good example of it. This helps with that.

                                        • Faster Performance

                                        The combination of ligature support and good performance is very difficult to get right. In my experience on Windows, terminal emulation is very slow. This isn’t the case with the GUI.

                                        • Blurred Floating Windows

                                        Floating windows are a feature inside of Neovim which lets windows appear on top of other windows. Neovim also supports some amount of fake transparency of the background, so that characters behind show through dimly in front. This effect is fun and interesting, but a GUI should be able to blur the background of these floating windows so that the text is less distracting but the effect is still visible.

                                        As mentioned in the other comment, you are being unnecessarily mean.

                                        1. 6

                                          Your tone is a bit dismissive. You could express your points more kindly.

                                          As for performance, terminal latency is still kinda bad for many. See the benchmarks from danluu and others. I don’t think there’s any particular reason to believe that a neovim GUI should have worse performance than a terminal. They do similar jobs.

                                          If you look at the readme, you will see what they meant by animated cursor.

                                          1. 1

                                            Having “full ligature support” on a feature list and actually doing it are very different things. Last I checked (June 2019-ish), Windows Terminal, every Linux terminal I could find, and iTerm don’t do ZWJ emoji sequences according to font rules (hacker cat comes to mind, but there’s more) or render various non-emoji double-width characters right. The only one I know that does (at least as far as my IRC usage and dadaist fish_prompt are concerned) is mintty/putty, and it’s unfortunately slow.

                                            1. 1

                                              I cannot speak about Windows, but I’m a happy user of FiraCode on Linux terminals (QTerminal and kitty), and ligatures have been working since I first heard about them a few years ago.

                                          2. 1

                                            Ligature support and animated cursors are good reasons, but is the GUI faster than e.g. Alacritty, which is GPU-accelerated? Also, many terminals can have blurry backgrounds, frameless windows, etc. An identical cross-platform experience is also a good reason, if it works consistently across platforms.

                                            Terminal is great for some things, but these days I use neovim as my terminal emulator, so I guess I’d ask it the other way around. Why use TUI when you can have a terminal inside of neovim

                                            I also run terminals inside of Neovim, but it’s far from a tmux replacement. That’s why I use the TUI, so I can use it with tmux.

                                            1. 9

                                              I don’t know exactly how nvim’s embedding api works, but in principle it should be easier to achieve high performance with a purpose-built editor frontend than a terminal. Reason being that with vt100 all you have is a character buffer, so redrawing only dirty sections requires extra work and increases coupling. But in principle there’s no reason why one would have to perform better than the other, and alacritty has had a lot more work done on it.

                                              1. 5

                                                Terminals cannot do blurry backgrounds for the floating windows inside of Neovim. I think in the end, it comes down to preference. tmux isn’t an option today on Windows, so for me the Neovim terminal emulation is miles ahead of anything else I easily have available.

                                                Perf-wise, it’s not nearly as good as Alacritty yet, but we are working on it, and as mentioned above, Alacritty doesn’t support ligatures, which is where a lot of the perf cost exists today.

                                                1. 3

                                                  How is tmux not an option on Windows? I use tmux in mintty a ton.

                                                  1. 1

                                                    Gotta use Cygwin for that. I’m not a fan, but if you are into that, tmux works great :)

                                                    1. 1

                                                      There’s a WSL port (full disclosure, I haven’t tried it) https://github.com/mintty/wsltty

                                                  2. 1

                                                    Fair point, I switched to Linux completely, so I kinda forgot about Windows. But yeah, especially on Windows it’d be nice to have this Neovim GUI.

                                              2. 6

                                                The terminal literally emulates a decades-old hardware design. Using that as a platform is a silly default.

                                                1. 1

                                                  Your cells emulate a billion-year-old hardware design. Please, update to a non-silly platform.

                                                  1. 12

                                                    Oh, man. Just tell me how.

                                                  2. 1

                                                    The terminal is just too convenient to not use as a platform.

                                                    1. 6

                                                      The concept isn’t the implementation. One can imagine a text-oriented user interface without multiple decades of legacy requirements.

                                                      1. 1

                                                        Hmm, could you expand on how you’d envision that in a bit more detail? Because I’m unsure I completely understand.

                                                        1. 3

                                                          One could reconsider/omit:

                                                          • The entire termcap/terminfo infrastructure.

                                                          • The limitation of the vt100 and friends as a medium (strict cell boundaries, no graphics, everything like DBCS/UTF-8 being a graft)

                                                          • The raw bytestream nature, using a more structured protocol, which could enable everything from low bandwidth form entry (the 5250/3270 reality) or rich objects in your CLI (think anything from Mathematica to PowerShell)

                                                          1. 2

                                                            The entire termcap/terminfo infrastructure.

                                                            I’m unfamiliar with this infrastructure; what does it do and why is it bad?

                                                            The limitation of the vt100 and friends as a medium (strict cell boundaries, no graphics, everything like DBCS/UTF-8 being a graft)

                                                            Fair point, having the option to have these things would be a very welcome addition.

                                                            The raw bytestream nature, using a more structured protocol, which could enable everything from low bandwidth form entry (the 5250/3270 reality) or rich objects in your CLI (think anything from Mathematica to PowerShell)

                                                            Well, if everything worked using that protocol / interface it’d be nice, but raw bytestreams seem to be the ultimate backwards-compatible “protocol”, if you can call it that. Having these battle-tested tools which are still usable and easily extensible is quite a boon. The cost of switching over to another system with a stricter protocol doesn’t seem worth the benefits to me personally.

                                                            Maybe if there were awk-like tools which could parse raw bytestreams into these objects? If that were simple enough, it could provide the same kind of backwards compatibility and extensibility.

                                                            1. 5

                                                              Well, if everything worked using that protocol / interface it’d be nice, but raw bytestreams seem to be the ultimate backwards-compatible “protocol”, if you can call it that.

                                                              This is the part of the discussion that really annoys me, because it’s so misleading.

                                                              You can just dump arbitrary bytes on a terminal, if you don’t mind it switching into an unpredictable mode, or even crashing. But in reality, there is a protocol made up of the ANSI escape sequences and your codepage (hopefully UTF-8).

                                                              Well-designed CLI apps, like vim, will escape these streams just like web apps escape tag characters, for readability purposes, but more importantly to prevent clickjacking.

                                                              There are also potential vulnerabilities on the input side, too. For example, what happens if your clipboard contains an ESC and you paste it into vim?
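
                                                              To make the “escape these streams” point concrete, here’s a rough Python sketch of the kind of defanging a well-behaved CLI app might do before echoing untrusted text (the regex and function names are just illustrative, not from any particular tool):

                                                                import re

                                                                # Strip ESC and the other C0 control characters (keeping tab and
                                                                # newline) so untrusted text can't smuggle in escape sequences.
                                                                CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")

                                                                def sanitize(untrusted: str) -> str:
                                                                    return CONTROL_CHARS.sub("", untrusted)

                                                                # The ESC byte is removed, defanging the clear-screen sequence:
                                                                print(sanitize("innocent\x1b[2Jtext"))  # -> innocent[2Jtext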

                                                              1. 1

                                                                I get your point, but that doesn’t take away that its backwards compatibility is unmatched. Rebuilding everything from scratch every x years takes a lot of effort which most of us are not willing to put in, I reckon.

                                                          2. 1
                                                  1. 10

                                                    Relevant discussion on the Apple developers forum, where the developers of Little Snitch, TripMode and Radio Silence, among others, express their concerns:

                                                    https://forums.developer.apple.com/thread/79590

                                                    Apple’s official position is for them to file an “enhancement request”. Good luck with that…

                                                    1. 2

                                                      And all of that was in 2017. Really unlikely that Apple is going to do anything given it’s been almost 3 years.

                                                      1. 6

                                                        Right. I was never a fan of the theory that Apple was iPad’ifying macOS. But it looks like we are heading in that direction, even if accidentally. I can understand Apple’s motivations for the individual changes. In principle SIP is great; it protects against many malware attacks. In principle user-space drivers are also great; a vendor’s crap drivers should not run in our ring-0 [1]. Signed applications were great, but the mechanism was somewhat sensitive to stolen developer keys. Now we have notarization, which makes Apple the gatekeeper, even outside the App Store.

                                                        With many of these steps, there are accommodations for more advanced users, but they are all half-baked. They do user-space drivers, but never complete the APIs necessary for developers to actually restore the old functionality in user space. They make the system volume read-only, but come up with a half-baked mechanism for users who actually need a top-level directory. E.g. installing Nix in Catalina requires creating a new volume, creating an entry in synthetic.conf, and creating an entry in fstab. And then it doesn’t really work well if you encrypt the volume, because encrypted volumes are only mounted upon login, which means that applications that rely on the store could be started before the Nix store is mounted. How about just providing a menu item in Disk Utility that says “Create a top-level mounted volume”?

                                                        The thing is that advanced users were just a gateway in the early 2000s for Apple to gain a foothold in the market and bootstrap a developer ecosystem. Now that the vast majority of Mac users are not advanced users, it’s just not their focus anymore. Their focus is providing a system that is as easy and secure as possible for the large majority of users and avoiding diverging from the iOS ecosystem to avoid maintenance costs. That’s a perfectly fine direction to take, but we as developers/advanced users should not expect much more than the occasional nice ‘back door’ that Apple developers manage to smuggle in, such as synthetic.conf.

                                                        [1] The situation is really different compared to Linux, because in Linux virtually all drivers are open source and upstreamed, so one can verify that they don’t do stupid stuff.

                                                        1. 0

                                                          Lately I’ve been a bit dissatisfied with macOS. It used to be great, IMO. I really hope they won’t completely dumb down macOS.

                                                    1. 4

                                                      Sounds pretty nice, but how well is this format supported?

                                                      1. 3

                                                          Just about not at all. No browser supports it, and no major tool apart from ImageMagick does.

                                                      1. 1

                                                        Brilliant. Learning these things should be mandatory.

                                                        1. 8

                                                          Use the same language for the little tools and scripts in your system too. There are few good reasons to drop down into bash or Python scripts, and some considerable disadvantages.

                                                          I don’t understand this. If what you need to do is string together a simple sequence of commands, isn’t bash the correct tool for the job? Why waste time on compile/debug/test when you can have something that works in seconds?

                                                          1. 4

                                                                Yup, it’s much easier to execute commands in bash than in most other languages. On the other hand, this is mostly accidental complexity; there isn’t any reason why it has to be hard. Deep down I believe there’s no reason we can’t develop both plumbing scripts and low-level system-critical programs in the same language.

                                                            Admittedly, in C, it’s pretty simple with the system(…) function, but in C a lot of other things are very hard.

                                                            1. 3

                                                                  “Standard” Unix (in practice Linux and the BSDs) doesn’t have an object view of the OS. Windows does - sorta, if you limit yourself to .NET. This means that PowerShell works on objects, not plain text strings.

                                                                  To bring that mindset into Linux would be a decades-long project, at least, and require the same level of focus and goal-orientation as systemd (if not more). It would probably be as resisted as systemd, too.

                                                              Maybe one could get halfway by ensuring that a decently capable scripting language (Lua?) was always available on a Unix.

                                                              1. 2

                                                                What would be neat is a way of using shell completion to generate wrapper libraries for use in non-scripting programming languages.

                                                                1. 1

                                                                  So it’d be like a subprocess control library generator? That’s a neat idea, although bear in mind that some of the complexity around shell scripting isn’t just enumerating the various available commands and their inputs but also leveraging their relationships through operating system interfaces.

                                                                2. 1

                                                                  Admittedly, in C, it’s pretty simple with the system(…) function, but in C a lot of other things are very hard.

                                                                  That’s precisely it. The devil’s in the details :) Sure, executing a command is easy, but even bash scripts very rarely JUST execute commands. You pipeline, check return codes, make decisions and the like.

                                                                  1. 1

                                                                    I find that all languages I frequently use make an equivalent of system() simple enough. Python and OCaml also make interacting with external processes (i.e. writing to their stdin and reading their stdout/stderr) simple enough.

                                                                    I’m pretty sure one can make a nice interface for that for any language, it’s just that some come without it in their standard libraries.
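
                                                                        For what it’s worth, here’s roughly what that stdin/stdout interaction looks like in Python with just the standard library (tr is a stand-in for whatever external tool you’d talk to):

                                                                          import subprocess

                                                                          # Start an external process and talk to its stdin/stdout.
                                                                          proc = subprocess.Popen(
                                                                              ["tr", "a-z", "A-Z"],
                                                                              stdin=subprocess.PIPE,
                                                                              stdout=subprocess.PIPE,
                                                                              text=True,
                                                                          )
                                                                          out, _ = proc.communicate("hello from python\n")
                                                                          print(out)  # -> HELLO FROM PYTHON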

                                                                    1. 1

                                                                          I recently wrote something in Python which called other tools and had to capture stdout. It was just inconvenient enough to not use Python for these kinds of scripts. I think I had to manually split the tool name and its arguments or something like that. Not terrible, but it’s just easier in bash. (Sadly, most other things are terrible in bash.)
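
                                                                          (For what it’s worth, shlex.split does that splitting for you; a minimal sketch, where my-tool and its flags are hypothetical:)

                                                                            import shlex
                                                                            import subprocess

                                                                            # shlex.split tokenizes the command line the way a shell would,
                                                                            # so you don't split the tool name and arguments by hand.
                                                                            cmd = shlex.split("my-tool --verbose 'some argument'")
                                                                            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
                                                                            print(result.stdout)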

                                                                      1. 1

                                                                        I’m pretty sure one can make a nice interface for that for any language, it’s just that some come without it in their standard libraries.

                                                                          No doubt this is true, but think about doing this in C, for instance: executing commands and checking return codes is trivial, but what if you need to watch the output and look for a certain string? Or build a pipeline and ensure that if any component fails, the entire pipeline fails? All of this is possible in C but arguably not easy.

                                                                    2. 3

                                                                        Not to mention that “plain text” executables are much easier to maintain (across space (other devs) and time) than binary blobs.

                                                                      1. 3

                                                                        Agreed. Would one be mistaken in thinking it a good idea to keep POSIX standards in mind if going down this route?

                                                                        1. 2

                                                                              Writing a shell script is way more convenient than writing it all in e.g. Java. If you use ShellCheck and adhere to the POSIX standards, there shouldn’t be a lot of problems. Shell scripts can be high quality too.

                                                                      1. 4

                                                                              Nice article! Minor correction: Android apps are not compiled to Java bytecode, but to Dalvik bytecode. Android apps run on the Dalvik VM, but when reversing Android apps it’s easier to convert DEX to JAR and decompile that to Java.

                                                                        1. 3

                                                                          I’m there! Will arrive at about 18h as well. DECT 6776 & 6767

                                                                          1. 2

                                                                            Together with @eloy, @gregory and me!

                                                                          1. 2

                                                                            Excellent slides on why Neovim is the future! The way they innovate and reinvent Vim is amazing!

                                                                            1. 12

                                                                              There is also a VIDEO of the talk.

                                                                              1. 3

                                                                                Ah, yes, thanks. I was about to ask for it, since the slides are sometimes a bit incomprehensible without the talk.

                                                                            1. 1

                                                                                    Currently I use iCloud for all my documents, so those are backed up anyway. I also use Time Machine with an external HDD. Still considering using something like Backblaze.

                                                                              1. 9

                                                                                I love Clojure, but we need to have a talk. Specifically, the use of nil-punning is pretty terrible. (if (seq xs) ...) is not actually a good way to spell (if (not (empty? xs)) ...) but both the language designers and the community embrace this conflation of null and the empty sequence. There are other issues around nil and collections as well.

                                                                                Kotlin is my current favorite, but it’s not OK that the best way to write Kotlin currently involves a behemoth of an editor that takes like 2 GB of memory. I can write Clojure in Emacs, which only takes about 80 MB.

                                                                                1. 1

                                                                                  I haven’t tried it, but you should take a look at the Kotlin language server.

                                                                                  1. 2

                                                                                          It looks promising, and I love the idea of a language server, but I failed to get it working in my allotted yak-shaving time. I need to take another crack at it. :-)

                                                                                1. 4

                                                                                  Playing the flare-on CTF. Practicing some reverse engineering.

                                                                                  1. 32

                                                                                    My position has essentially boiled down to “YAML is the worst config file format, except for all the other ones.”

                                                                                    It gets pretty bad if your documents are large or if you need to collaborate (it’s possible to have a pretty good understanding of parts of YAML but that’s not always going to line up with what your collaborators understand).

                                                                                    I keep wanting to say something along the lines of “oh, YAML is fine as long as you stick to a reasonable subset of it and avoid confusing constructs,” but I strongly believe that memory-unsafe languages like C/C++ should be abandoned for the same reason.

                                                                                    JSON is unusable (no comments, easy to make mistakes) as a config file format. XML is incredibly annoying to read or write. TOML is much more complex than it appears… I wonder if the situation will improve at any point.

                                                                                    1. 22

                                                                                      I think TOML is better than YAML. Sure, it has the complex date stuff, but that has never caused big surprises for me (just small annoyances). The article seems to focus mostly on how TOML is not Python, which it indeed is not.

                                                                                      1. 14

                                                                                        It’s syntactically noisy.

                                                                                        Human language is also syntactically noisy. It evolved that way for a reason: you can still recover the meaning even if some of the message was lost to inattention.

                                                                                                I have mixed feelings about TOML’s table syntax. I would rather have explicit delimiters like curly braces. But, if the goal is to keep INI-like syntax, then it’s probably the best thing to do. The thing I find really annoying is inline tables.

                                                                                                As for user-typed values, I came to the conclusion that everything that isn’t an array or a hash should just be treated as a string. If you take user input, you cannot just assume that the type is correct and need to check or convert it anyway, so why even bother having different types at the format level?

                                                                                        Regardless, my experience with TOML has been better than with alternatives, despite its flaws.
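
                                                                                                (That strings-only approach is roughly what Python’s configparser does with INI files, for what it’s worth: every value stays a string until you convert and validate it at the point of use. A minimal sketch, with a made-up [server] section:)

                                                                                                  import configparser

                                                                                                  cfg = configparser.ConfigParser()
                                                                                                  cfg.read_string("""
                                                                                                  [server]
                                                                                                  port = 8080
                                                                                                  debug = yes
                                                                                                  """)

                                                                                                  # Conversion (and loud failure on bad input) happens here, not in the format.
                                                                                                  port = cfg.getint("server", "port")
                                                                                                  debug = cfg.getboolean("server", "debug")  # accepts yes/no, true/false, on/off, 1/0
                                                                                                  print(port, debug)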

                                                                                        1. 6

                                                                                          Human language is also syntactically noisy. It evolved that way for a reason: you can still recover the meaning even if some of the message was lost to inattention.

                                                                                                  I have mixed feelings about TOML’s table syntax. I would rather have explicit delimiters like curly braces. But, if the goal is to keep INI-like syntax, then it’s probably the best thing to do. The thing I find really annoying is inline tables.

                                                                                          It’s funny how the exact same ideas made me make the opposite decision. I came to the conclusion that “the pain has to be felt somewhere” and that the config files are not the worst place to feel it.

                                                                                          I have mostly given up on different config formats and just default to one of the following three options:

                                                                                          1. Write .ini or Java properties-file style config-files when I don’t need more.
                                                                                                  2. Write a DTD and XML when I need tree or dependency-like structures.
                                                                                          3. Store the configuration in a few tables inside an RDBMS and drop an .ini-style config file with just connection settings and the name of the config tables when things get complex.

                                                                                                  As for user-typed values, I came to the conclusion that everything that isn’t an array or a hash should just be treated as a string. If you take user input, you cannot just assume that the type is correct and need to check or convert it anyway, so why even bother having different types at the format level?

                                                                                                  I fully agree with this as well.

                                                                                        2. 23

                                                                                          Dhall is looking really good! Some highlights from the website:

                                                                                          • Dhall is a programmable configuration language that you can think of as: JSON + functions + types + imports
                                                                                          • You can also automatically remove all indirection in any Dhall code, converting the file to a logic-free normal form for non-programmers to understand.
                                                                                          • We take language security seriously so that your Dhall programs never fail, hang, crash, leak secrets, or compromise your system.
                                                                                          • The language aims to support safely importing and evaluating untrusted Dhall code, even code authored by malicious users.
                                                                                          • You can convert both ways between Dhall and JSON/YAML or read Dhall configuration files directly into a language that supports a native language binding.
                                                                                          1. 8

                                                                                                    I don’t think the tooling should be underestimated, either. The dhall executable includes low-level plumbing tools (individual type checking, importing, normalization), a REPL, a code formatter, a code linter to help with language upgrades, and there’s full-blown LSP integration. I enjoy writing Dhall so much that for new projects I’m taking a more traditional split between a core “engine” and the logic pushed out into Dhall - then compiling it at load time into something the engine can work with. The last piece of the puzzle to me is probably bidirectional type inference.

                                                                                            1. 2

                                                                                              That looks beautiful! Can’t wait to give it a go on some future projects.

                                                                                              1. 2

                                                                                                Although the feature set is extensive, is it really necessary to have such complex functionality in a configuration language?

                                                                                                1. 4

                                                                                                  It’s worth understanding what the complexity is. The abbreviated feature set is:

                                                                                                  • Static types
                                                                                                  • First class importing
                                                                                                  • Function abstraction

                                                                                                  Once I view it through this light, I find it easier to convince myself that these are necessary features.

                                                                                                  • Static types enforce a schema on configuration files. There is almost always a schema on configuration, as something is ultimately trying to pull information out of it. Having this schema reified into types means that other tooling can make use of the schema - e.g., the VS Code LSP can give me feedback as I edit configuration files to make sure they are valid. I can also do validation in my CI to make sure my config is actually going to be accepted at runtime. This is all a win.

                                                                                                  • Importing means that I’m not restricted to a single file. This gives me the advantage of being able to separate a configuration file into smaller files, which can help decompose a problem. It also means I can re-use bits of configuration without duplication - for example, maybe staging and production share a common configuration stanza - I can now factor that out into a separate file.

                                                                                                  • Function abstraction gives me a way to keep my configuration DRY. For example, if I’m configuring nginx and multiple virtual hosts all need the same proxy settings, I can write that once, and abstract out my intention with a function that builds a virtual host. This avoids configuration drift, where one part is left stale and the rest of the configuration drifts away.

                                                                                                  1. 1

                                                                                                          That’s very interesting, I hadn’t thought of it like that. Do you mostly use Dhall itself as the configuration file, or do you use it to generate JSON/YAML configuration files?

                                                                                                2. 1

                                                                                                          I finally need to implement a Dhall evaluator in Erlang for my projects. I <3 the ideas behind Dhall.

                                                                                                3. 5

                                                                                                    I am not sure that there aren’t better options. I am probably biased as I work at Google, but I find Protocol Buffer syntax to be perfectly good, and the enforced schema is very handy. I work with Kubernetes as part of my job, and I regularly screw up the YAML, or don’t really know what the YAML means and so copy-paste from tutorials without actually understanding it.

                                                                                                  1. 4

                                                                                                    Using protobuf for config files sounds like a really strange idea, but I can’t find any arguments against it.
                                                                                                    If it’s considered normal to use a serialisation format as human-readable config (XML, JSON, S-expressions etc), surely protobuf is fair game. (The idea of “compiled vs interpreted config file” is amusing though.)

                                                                                                    1. 3

                                                                                                        I have experience with using protobuf to communicate configuration-like information between processes, and the schema that specifies the configurations, including (nested) structs/hashes and arrays, ended up really hacky. I forgot the details, but protobuf lacks one or more essential ingredients to nicely specify what we wanted it to specify. As soon as you give up and allow more dynamic messages, you’re of course back to having to check everything using custom code on both sides. If you do that, you may as well just go back to YAML. The enforced schema and multi-language support make it very convenient, but it’s no picnic.

                                                                                                      1. 2

                                                                                                        One issue here is that knowing how to interpret the config file’s bytes depends on having the protobuf definition it corresponds to available. (One could argue the same is true of any config file and what interprets it, but with human-readable formats it’s generally easier to glean the intention than with a packed binary structure.)

                                                                                                        1. 2

                                                                                                          At Google, at least 10 years ago, the protobuf text format was widely used as a config format. The binary format less so (but still done in some circumstances when the config file wouldn’t be modified by a person).

                                                                                                          1. 3

                                                                                                            TIL protobuf even has a text format. It sounds like it’s not interoperable between implementations / isn’t “fully portable”, and that proto3 has a JSON format that’s preferable… but then we’re back to JSON.

                                                                                                    2. 2

                                                                                                      JSON can be validated with a schema (lots of tools support it, including VSCode), and it’s possible to insert comments in unused fields of the object, e.g. comment or $comment.
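
                                                                                                      A minimal sketch of both points using the third-party jsonschema package (the schema and config here are made up; note the schema has to tolerate the $comment field):

                                                                                                        import json
                                                                                                        from jsonschema import validate  # pip install jsonschema

                                                                                                        schema = {
                                                                                                            "type": "object",
                                                                                                            "properties": {
                                                                                                                "$comment": {"type": "string"},
                                                                                                                "port": {"type": "integer"},
                                                                                                            },
                                                                                                            "required": ["port"],
                                                                                                        }

                                                                                                        config = json.loads('{"$comment": "dev setup, do not ship", "port": 8080}')
                                                                                                        validate(instance=config, schema=schema)  # raises ValidationError on mismatch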

                                                                                                      1. 17

                                                                                                        and it’s possible to insert comments in unused fields of the object, e.g. comment or $comment.

                                                                                                        I don’t like how this is essentially a hack, and not something designed into the spec.

                                                                                                        1. 2

                                                                                                          Those same tools (and often the system on the other end ingesting the configuration) often reject unknown fields, so this comment hack doesn’t really work.

                                                                                                          1. 8

                                                                                                            And not without good reason: if you don’t reject unknown fields it can be pretty difficult to catch misspellings of optional field names.
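
                                                                                                            For example (option name invented): a strict parser flags the typo below immediately, while a lenient one accepts it and the intended “retries” option silently keeps its default.

                                                                                                                { "retires": 3 }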

                                                                                                            1. 2

                                                                                                              I’ve also seen it become harder to add new fields when you don’t reject unknown fields: you don’t know who’s already using that field name for their own purposes and sending it to you (intentionally or otherwise).

                                                                                                          2. 1

                                                                                                            Yes, JSON can be validated by schema. But in my experience, JSON Schema implementations diverge widely, and it’s easy to write schemas that only work in your particular validator.

                                                                                                          3. 1

                                                                                                            JSON is unusable as a config file format (no comments, easy to make mistakes).

                                                                                                            JSON5 fixes this problem without falling prey to the issues in the article: https://json5.org/
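
                                                                                                            For instance, all of this is valid JSON5 (a sketch, keys invented):

                                                                                                                {
                                                                                                                  // comments, finally
                                                                                                                  unquotedKeys: true,
                                                                                                                  trailing: [1, 2, 3,],      // trailing commas are allowed
                                                                                                                  quotes: 'single or double',
                                                                                                                }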

                                                                                                            1. 2

                                                                                                              Yeah, and then you lose the main advantage of JSON, which is how ubiquitous it is.

                                                                                                              1. 1

                                                                                                                In the context of a config format, this isn’t really an advantage, because only one piece of code will ever be parsing it. It can matter in other contexts, though.

                                                                                                                I typically find that where YAML has been chosen over JSON, it’s for config formats where the ability to comment is crucial.

                                                                                                          1. 9

                                                                                                            I’m glad somebody wrote this up, because I feel like everybody who works with big data learns these lessons independently (sometimes multiple times within the same organization). If you teach CS, please make your students read this before they graduate.

                                                                                                            Understanding these lessons is basically why the office I work at is so much more productive than the rest of the company we’re a part of: there’s been an effort to get incoming devs to understand that, in most cases, it’s faster, cheaper, & easier to use unix shell tools to process large data sets than to use fancy hypebeast toolchains like Hadoop.

                                                                                                            There are a couple of things this essay doesn’t mention that would probably speed up processing substantially. One is using LC_ALL=C – if you force the locale to C, the tools in your pipeline skip locale-aware collation and character handling, which speeds everything up a lot. Another is that GNU awk can run commands and pipe to and from them internally, which means the downloads (and the POSTs) can be done inside awk itself – that lets you open multiple input and output streams and switch between them in a single batch, avoiding some merge steps. Also, one might want to use xargs instead of GNU parallel, because xargs is a more mature tool and one that’s available on basically all unix machines out of the box. (Rough sketches of all three below.)
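
                                                                                                            Roughly, with invented file and script names:

                                                                                                                # 1. C locale: skip locale-aware collation and multibyte handling
                                                                                                                LC_ALL=C sort data.tsv | LC_ALL=C uniq -c > counts.txt

                                                                                                                # 2. GNU awk can read from and write to external commands itself,
                                                                                                                #    e.g. fetching each URL listed in column 1 without leaving awk:
                                                                                                                gawk '{
                                                                                                                    cmd = "curl -s " $1
                                                                                                                    while ((cmd | getline line) > 0) print line
                                                                                                                    close(cmd)
                                                                                                                }' urls.txt

                                                                                                                # 3. xargs -P as the widely available alternative to GNU parallel:
                                                                                                                xargs -n 1 -P 8 ./process-one.sh < jobs.txt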

                                                                                                            1. 4

                                                                                                              One thing I found particularly useful about this post (not evident from the title, but it constitutes the first half) is the specifics of how big data science toolchains can fail, in this case Apache Spark, even after the author tried a bunch of the obvious and less-obvious fixes.

                                                                                                              The biggest win here seems to come not from the raw processing speed of low-level optimizations in awk, but from the big-picture algorithmic win of “manually” controlling data locality, where Spark didn’t do the right thing automatically and couldn’t be persuaded to do the right thing less automatically.

                                                                                                              1. 3

                                                                                                                Have them read “The Treacherous Optimization”, which is all about how GNU grep got so fast: grep is important for its own sake, of course, but the point is that these tools have had decades of work poured into them, even the relatively new GNU tools, which postdate the classic Unix codebases.

                                                                                                                It’s also an interesting introduction to code optimization and engineering tradeoffs – cases where multiple decisions are defensible because none of them is absolutely perfect.

                                                                                                                1. 3

                                                                                                                  Yeah I’ve personally run into exactly this kind of slowness with R (and Python to a lesser extent), and fixed it with shell. I love R but it can be very slow.

                                                                                                                  That’s part of the reason I’m working on Oil. Shell is still useful and relevant but a lot of people are reluctant to learn it.

                                                                                                                  I posted this in another thread, but it is good to eyeball your computations with “numbers every programmer should know”:

                                                                                                                  https://gist.github.com/jboner/2841832

                                                                                                                  https://people.eecs.berkeley.edu/~rcs/research/interactive_latency.html

                                                                                                                  In particular, most “parsing” is linear time, so my rule of thumb is that you want to be within 2x–10x of the hardware’s theoretical speed. With certain tools you’ll instead be in the 100x–1000x range, and then it’s time to use a different tool, probably to cut the data down first. Hardware is cheap, but waiting for data pipelines costs human time.
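
                                                                                                                  A quick way to eyeball that ratio (file names invented):

                                                                                                                      # upper bound: how fast can the machine merely scan the bytes?
                                                                                                                      time cat data.tsv > /dev/null

                                                                                                                      # your pipeline on the same input; if it’s 100x slower than the
                                                                                                                      # scan, the tool is the bottleneck, not the hardware
                                                                                                                      time ./pipeline.sh < data.tsv > /dev/null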

                                                                                                                  1. 4

                                                                                                                    When I was working in my first lab I did exactly the same – moved an existing computational biology pipeline from R to AWK, lots of shell plumbing, GNU Parallel, and a Flask front-end server that submitted jobs to GridEngine. That brought the runtime for one genome down from 40 minutes to about 30 seconds. R is nice, but it can be slow (also, the original was just a prototype).
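
                                                                                                                    The fan-out step was essentially this shape (a sketch with an invented script and paths, not the lab’s actual pipeline):

                                                                                                                        # run one shell/AWK job per genome, 16 at a time
                                                                                                                        parallel -j 16 ./process-genome.sh {} ::: genomes/*.fa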

                                                                                                                    The pivotal lesson I learned was to embrace the battle-tested technologies of the shell stack and everything Unix instead of fancy-pants modern stacks and tools on top of Hadoop, Spark and others. The author’s “someone probably solved your problem in the 80s” rings absolutely true.

                                                                                                                    Others call it Taco Bell programming.

                                                                                                                    1. 2

                                                                                                                      Taco Bell programming is amazing; I’m saddened that it has become quite esoteric. I wish this knowledge were more widespread in the industry.

                                                                                                                  2. 1

                                                                                                                    You must be very glad. 3 identical comments 😅

                                                                                                                    1. 1

                                                                                                                      Just a glitch. My mouse’s debounce doesn’t work properly, and lobste.rs doesn’t deduplicate requests, so when I click post it sometimes emits several identical requests, which the server turns into duplicate comments (even though they come from the same form).

                                                                                                                      There was a patch applied for this a year ago, but either it didn’t work or it never made it from the git repo to the live version of the site.