1. 6

    This looks great, let’s please replace PGP with it everywhere. :D

    1. 9

      Yes, let’s replace a system which has been tested and proven and worked on since the 1990s with a random 47-commit project someone just tossed up on GitHub. Because good encryption is easy.

      /s

      1. 9

        File encryption is in fact kind of easy, thanks to everything we learned in the last 30 years.

        1. 9

          Yes, actually.

          1. 3

            I guess even the author of PGP would be up for that: https://www.vice.com/en_us/article/vvbw9a/even-the-inventor-of-pgp-doesnt-use-pgp

            1. 3

              In a cryptography context, “since the 1990s” is basically derogatory. Old crypto projects are infamous for keeping awful insecure garbage around for compatibility, and this has been abused many many times (downgrading TLS to “export grade” primitives anyone?)

              1. 3

                I don’t see the point in sarcasm. PGP does many things and most of them are handled poorly by default. This is not a PGP replacement; it’s a tool with a single purpose: file encryption. It’s not for safe transfers, it’s not for mail. It has a mature spec, it’s designed and developed by people in the crypto community, and there are two reference implementations. It does one thing and does it well, which is everything PGP isn’t.

                1. 1

                  I think icefox’s comment was already being sarcastic

                  1. 5

                    Not necessarily, PGP is a trashfire

                    1. 2

                      Why do you say that?

                      1. 6

                        This should answer your question better than I ever will.

                        https://latacora.micro.blog/2019/07/16/the-pgp-problem.html

                        1. 1

                          Thanks

              1. 2

                Disclosure timeline:

                • 13th September 2019: We submitted the issue to product-security@apple.com
                • 18th September 2019: Apple asked us to resend the screenshots
                • 10th October 2019: Apple told us that they were planning to address this issue in a future update
                • 30th October 2019: Apple released version 12.10.2 of iTunes but did not fix the issue
                • We asked several times about this case but got no answer from Apple
                • 9th December 2019: We informed Apple that we would release a public post about this issue on 12th of December (90 days since the initial submission)
                • 11th December 2019: Apple released version 12.10.3 of iTunes but did not fix the issue
                • 12th December 2019: still no answer, post has been published

                Is this normal?

                1. 1

                  Unfortunately it feels like it’s getting more common.

                1. 2

                  These are great steps.

                  1. 1

                    I missed memory mapping. Smart on-disk structures with memory mapping (or seeks + reads) often work well when data is larger than memory.

                    1. 1

                      I feel like memory mapping is one way to implement chunking, or in some cases indexing, rather than a fundamental technique. You can’t just ignore the fact that you’re trying to do chunking, for example, just because you’re doing mmapping, or you’ll get terrible performance depending on your usage pattern.

                      1. 3

                        If your access pattern is linear, mmapping will let you write more idiomatic, easier to read, and (slightly) faster code than writing the same program using chunking. But yes, if your access pattern is not linear, you can run into terrible performance problems. Smart use of things like posix_madvise can help with this though.
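                        A minimal sketch of that hint using Python’s stdlib mmap module (the madvise method assumes Python 3.8+ and a platform that exposes MADV_SEQUENTIAL, hence the guard):

```python
# Sketch: scan a file linearly through a memory map, hinting the kernel
# that access will be sequential so it can read ahead aggressively.
import mmap

def count_newlines(path):
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # madvise()/MADV_SEQUENTIAL need Python 3.8+ and OS support
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            count, i = 0, m.find(b"\n")
            while i != -1:
                count += 1
                i = m.find(b"\n", i + 1)
            return count
```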

                        1. 1

                          FWIW, I have seen it claimed that repeatedly calling read() and processing chunks in a loop is a little faster on Linux than mmap()ing a big file to use it once linearly. I believe this is because entering the kernel via a syscall is faster than entering it via a page fault.

                          Don’t actually know this for sure though so I really ought to check some time tomorrow.

                          1. 2

                            You should be able to verify this for yourself using ripgrep. rg --no-mmap zqzqzqzq some-big-file and rg --mmap zqzqzqzq some-big-file will search without and with memory maps, respectively. Make sure you account for I/O caching (put some-big-file on a ramdisk, usually /tmp, to be sure).

                            In my experiments, memory mapping a large file to search it linearly gives a nice speed boost on Linux over incrementally reading the file. However, if you’re searching a lot of small files, then memory maps end up being quite a bit slower. My running hypothesis there is that it’s due to the overhead of memory mapping in the kernel.

                            I’ve also found this to be platform specific. The differences between memory mapping and incremental reading on macOS and Windows, for example, are not necessarily the same as on Linux.

                            1. 1

                              An experiment I just tried on Linux is looking at what mincore() says for files in cache: it looks to me like when I mmap() a file which is in cache, mincore() immediately returns 1 for every single page, so… maybe Linux is immediately giving my mapping a full set of backing pages from the filesystem cache right away and there are no page faults at all when I iterate it?

                        2. 1

                          Even though it could be seen as chunking (though that is not completely correct; if the OS has enough free memory, it can decide to keep the full file in memory), it is a very useful abstraction by itself. The nice thing is that you can often treat the memory-mapped data like any other array, so you can present a simpler API. E.g. when you have a large matrix or tensor, you can by and large look up cells, do slicing, etc. as if it were a normal array. You can even use your normal linear algebra packages to do things like matrix-vector multiplications.
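                          As a stdlib-only sketch of the “treat it like an array” point (numpy.memmap builds a full ndarray interface on the same mechanism), a plain mmap already supports indexing and slicing like a bytes object:

```python
# Sketch: a memory-mapped file can be indexed and sliced like an
# in-memory bytes object; pages are faulted in only as they are touched.
import mmap

def read_slice(path, start, stop):
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[start:stop]  # slicing a mmap returns bytes
```

numpy.memmap wraps the same idea to expose a real array (dtype, shape, slicing) over an on-disk file, which is what makes handing it to a linear algebra package work.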

                      1. 7

                        I also thought the article was a bit misleading. For example, it includes pledge as a “security framework” on the same level as SELinux or AppArmor, even though they work in fundamentally different ways (pledge is opt-in for each application). It also lists iptables as the only firewall option under Linux, even though iptables is just one frontend for the actual firewall, netfilter, and makes no mention of firewalld. It remarks about Docker under the containers section, but makes no mention of cgroups. To stoke even more controversy, it lists systemd as the only possible method for service control on Linux. Under event notification syscalls it mentions epoll but not its ancestors poll or select, which still work as far as I know. Finally, if ZFS is supported on Linux, why is “Native ZFS” relevant at all?

                        1. 7

                          It remarks about docker under the containers section, but no mention of cgroups.

                          Also:

                          Linux has been late to many things, such as containers, or copy on write file systems. Container-like technologies were developed back in the year 2000 on FreeBSD Jails, and later extended to Zones in Solaris.

                          On Linux: Virtuozzo - 2000, Linux VServer - 2001, OpenVZ 2005. (Of course, the idea of containers precedes work in Linux and BSD by decades.)

                          Moreover, they are just cherry-picking here. There are other technologies where Linux was definitely first, e.g. kernel-based hypervisors: KVM (Linux) - 2006 (mainlined in 2007), bhyve - introduced in FreeBSD 10.0, 2014.

                          (This is not a pro-Linux post, just pointing out that this article is incorrect in many places, and cherry-picks to make specific systems look worse.)

                          1. 2

                            kernel-based hypervisors: KVM (Linux) - 2006 (mainlined in 2007)

                            Or VM/370 made in the 1970’s with some open-source releases. Helped kernel developers with debugging, too. Still around.

                            Coincidentally, the high-security prototype was called KVM/370, for Kernelized [with a security kernel] VM/370. Here’s an old paper (pdf) on that one for anyone following that stuff. That puts securing hypervisors as early as 1978.

                          2. 2

                            Poll and select work just fine. Why wouldn’t they? :)

                          1. 10

                            Making parameters both positional and keyword by default is a bad idea (1. a positional parameter cannot be renamed because it may be used as a keyword 2. keyword parameters cannot be swapped because they may be used positionally). I am glad Julia does the right thing: a parameter can be either positional or keyword, but not both.

                            Ruby also messes up keyword arguments, but in a different way https://bugs.ruby-lang.org/issues/14183
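                            A small illustration of point 1, with a hypothetical function name (the `/` and `*` separators assume Python 3.8+):

```python
# With Python's default, every parameter is both positional and keyword,
# so both call styles below are legal -- and both the parameter names
# and their order become part of the public API.
def scale(value, factor):
    return value * factor

assert scale(2, 3) == 6
assert scale(value=2, factor=3) == 6  # renaming `value` breaks this caller

# Python 3.8's `/` and `*` let the author commit to one contract each:
def scale_strict(value, /, *, factor):
    return value * factor

assert scale_strict(2, factor=3) == 6  # the only accepted call shape
```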

                            1. 11

                              def f(a, b, /, c, d, *, e, f):

                              I work in quite a few different languages, and that looks completely bizarre to me. One of the strong points of Python originally was that it was often intuitively readable because code would often do the most obvious thing. This democratized code writing so more domain-specialists could write their own code.

                              A lot of recent Python changes seem targeted to dedicated programmers (type annotations, walrus operator, positional-only “/”) and not to those domain-specialists. My favorite thing about Python is that it let me pretty much draw a line between “Where I am” and “Solved problem” without a lot of fluff in the middle.

                              1. 7

                                It seems to me that the positional-only “/” was added as parity with “*”. You could always have a “splat” argument collector, such as

                                def f(a, b, *args)
                                

                                Any extra arguments are collected in args as a list. Keyword arguments can go after it, like this:

                                def f(a, b, *args, extension=".txt")
                                

                                Obviously these can’t be passed as positional arguments, like f(a, b, ".dat"), or it would get collected into args. So these are keyword-only arguments. If you don’t care about the extra arguments, and just want extension to be keyword-only and not passed accidentally, you can omit the args, like this:

                                def f(a, b, *, extension=".txt")
                                

                                So now there’s a way to express keyword-only arguments. But there is no way to express positional-only arguments. So the addition of the “/” separator is just bringing feature parity to function definitions. If it were defined as

                                def f(a, /, b, *, extension=".txt")
                                

                                then you couldn’t pass a as a keyword argument anymore. It’s just to make the feature more symmetric.
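                                Concretely, with that signature (Python 3.8+), the enforcement looks like this:

```python
# The `/` makes `a` positional-only; the `*` makes `extension` keyword-only.
def f(a, /, b, *, extension=".txt"):
    return (a, b, extension)

f(1, 2)                     # ok: a positional, b positional
f(1, b=2, extension=".md")  # ok: b and extension by keyword

# f(a=1, b=2)     -> TypeError: a cannot be passed by keyword
# f(1, 2, ".md")  -> TypeError: extension cannot be passed positionally
```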

                                1. 2

                                  I think the idea is that Python library writers can use this tooling to provide nicer packages for application writers.

                                  The kind of python code needed to write nice APIs can be a bit gnarly, and I think that the library/application divide can be pretty stark. Hopefully the community won’t go too deep on some of this stuff at the application layer.

                                  Well, the walrus operator is a pure “make it easier to write application code”, at least.

                                  A lot of these changes are also about reducing the amount of weird internal surprises you get that are really hard to debug (for example, Python 3 up until 3.7 would sometimes use the system locale for some default decoding, causing weird system-specific crashes reminiscent of Python 2; they fixed that up).

                                  Python is becoming more and more user friendly when you’re at a certain level of the stack, partly because there are more and more tools available for the tool writers themselves.

                                2. 5

                                  Sure, but it’s too late to change that in Python now. The change is that now it’s possible to specify positional-only parameters at all, which isn’t perfect, but is as far in the right direction as it’s possible to go without breaking things.

                                  1. 4

                                    Yup, I borrowed Julia’s scheme for https://www.oilshell.org/ . It was easy to implement and is very expressive. A lot of people liked my explanation here:

                                    https://news.ycombinator.com/item?id=21253729

                                  1. 1

                                    Out of all of these tools, BIC looks the coolest. I love tools that allow me to explore APIs like that, LINQPad being perhaps the biggest example that comes to mind (but for C# instead of C)

                                    1. 1

                                      I downloaded BIC today and was disappointed to find a bug pretty fast (it computes -1 % 3 = 2). I’ve filed a bug report, but I’ve been using cling for quite a while and I’ve been really happy with it. Usually I’m just using it to remind myself of little behaviors, but I end up whipping it out more often than I’d care to admit. It’s nice to have.
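                                      For context on why -1 % 3 = 2 is a bug in a C interpreter: C (since C99) defines % in terms of truncating division, so -1 % 3 is -1, whereas Python’s % pairs with floor division and yields 2. A quick check from Python, using math.fmod for the C-style behavior:

```python
# Same operator symbol, different sign conventions:
import math

assert -1 % 3 == 2               # Python: remainder takes the divisor's sign
assert math.fmod(-1, 3) == -1.0  # C-style: remainder takes the dividend's sign
```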