1. 3

    Interesting. But encountering an entire programme written by just one person is a rather rare occurrence nowadays, I think. Does it work on programmes authored by multiple persons, giving a list of all persons who contributed to a given programme? That might be quite interesting for copyright issues.

    1. 15

      Hey folks,

      Jon messaged me a day or two ago. I gave him the standard answer about these sorts of inquiries: I’m happy to run queries that don’t reveal personal info like IPs, browsing, and voting or create “worst-of” leaderboards celebrating most-downvoted users/comments/stories, that sort of thing. I can’t volunteer to write queries for people, but the schema is on GitHub and the logs are MySQL and nginx, so it’s straightforward to do.

      A couple years ago jcs ran some queries for me and I wanted to continue that, especially as the public stats that answered some popular questions have been gone for a while. It’s useful for transparency and because it’s just fun to investigate interesting questions. I’ve already run a few queries for folks in the chat room (the only one I can remember off the top of my head is how many total stories have been submitted; we passed 40k last month).

      I asked Jon to post publicly about this because it sounded like he had significantly more than one question he was idly curious about, to help spread awareness that I’ll run queries like this, and get the community’s thoughts on his queries and the general policy. I’ll add a note to the about page after this discussion.

      I’m going offline for a couple hours for a prior commitment before I’ll have a chance to run any of these, but it’ll leave plenty of time for a discussion to get started or folks to think up their own queries to post as comments.

      1. 3

        Wasn’t the concept of Differential Privacy developed to allow for exactly the purpose of querying databases containing personal data while maintaining as much privacy as possible? Maybe this could be employed?

        1. 3

          In this particular case I don’t think the counts are actually sensitive, so it’s unclear that applying DP is even necessary. But I’ll ping Frank McSherry who’s one of the primary proponents of DP in academia nowadays and see what he thinks :) Maybe with DP we could extract what is arguably more sensitive information (e.g., by reducing the bin widths).

          1. 1

            That seems doable, but I have the strong suspicion that if I wing it I’ll screw something up and leak personal info. So hopefully Frank can chime in with some good advice.

            1. 4

              I’m here! :D I’m writing up some text in more of a long-form “blog post” format, to try and explain what is what without the constraints of trying to fit everything in-line here. But, some high-level points:

              1. Operationally queries one and two are pretty easy to pull off with differential privacy (the “how many votes per article” and “how many votes per user” queries). I’ve got some code that does that, and depending on the scale you could even just use it, in principle (or if you only have a SQL interface to the logs, we may need to bang on them).

              2. The third query is possibly not complicated, depending on my understanding of it. My sed-fu is weak, but to the extent that the query asks only for the counts of pre-enumerable strings (e.g. POST /stories/X/upvote) it should be good. If the query needs to discover what strings are important (e.g. POST /stories/X/*) then there is a bit of a problem. It is still tractable, but perhaps more of a mess than you want to casually wade into.

              3. Probably the biggest question mark is about the privacy guarantee you want to provide. I understand that you have a relatively spartan privacy policy, which is all good, but clearly you have some interest in doing right by the users with respect to their sensitive data. The most casual privacy guarantee you can give is probably “mask the presence / absence of individual votes/views”, which would be “per log-record privacy”. You might want to provide a stronger guarantee of “mask the presence / absence of entire individuals”, which could have a substantial deleterious impact on the analyses; I’m not super-clear on which guarantee you would prefer, or even the best rhetorical path to take to try and discover which one you prefer.
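
To make the two guarantees above concrete, here is a minimal sketch of the standard Laplace mechanism applied to a "votes per story" histogram. It is not the code mentioned in the comment; the function names, the `(user, story)` input format, and the `max_votes_per_user` cap are all assumptions made for illustration. Counts are released only for a pre-enumerated list of stories, in line with point 2.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_vote_counts(votes, stories, epsilon, max_votes_per_user=None, rng=None):
    """Return {story: noisy count} for every story in `stories`.

    votes: iterable of (user, story) pairs (hypothetical log format).
    max_votes_per_user=None -> per-record privacy: one vote changes one
        count by 1, so the L1 sensitivity is 1.
    max_votes_per_user=k -> per-user privacy: only the first k votes of
        each user are kept, so one user changes the histogram by at most k.
    """
    if rng is None:
        rng = random.Random()
    if max_votes_per_user is None:
        kept, sensitivity = list(votes), 1
    else:
        seen = Counter()
        kept = []
        for user, story in votes:
            if seen[user] < max_votes_per_user:
                seen[user] += 1
                kept.append((user, story))
        sensitivity = max_votes_per_user
    counts = Counter(story for _, story in kept)
    scale = sensitivity / epsilon
    # Stories with zero votes get noise too, so the set of keys leaks nothing.
    return {s: counts.get(s, 0) + laplace_noise(scale, rng) for s in stories}
```

The trade-off described in point 3 shows up directly in `scale`: the per-user guarantee multiplies the noise by the cap and also discards votes beyond it, which is why it can hurt the analyses more than per-record privacy.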

              Anyhow, I’m typing things up right now and should have a post with example code, data, analyses, etc. pretty soon. At that point, it should be clearer to say “ah, well let’s just do X then” or “I didn’t realize Y; that’s fatal, unfortunately”.

              EDIT: I’ve put up a preliminary version of a post under the idea that info sooner rather than later is more helpful. I’m afraid I got pretty excited about the first two questions and didn’t really do much about the third. The short version of the post is that one could probably release information that leads to pretty accurate distributional information about the multiplicities of votes, by articles and by users, without all of the binning. That could be handy as (elsewhere in the thread) it looks like binning coarsely kills much of the information. Take a read and I’ll iterate on the clarity of the post too.

              1. 1

                To follow up briefly on this: yes, it would be useful to avoid the binning so that we could feed more data to whatever regression we end up using to approximate the underlying distribution.

          2. 2

            For those who are curious, I’ve started implementing the workload generator here. It currently mostly does random requests, but once I have the statistics I’ll plug them in and it should generate more representative traffic patterns. It does require a minor [patch](https://github.com/jonhoo/trawler/blob/master/lobsters.diff) to the upstream lobste.rs codebase, but that’s mostly to enable automation.

          1. 1

            I’m sorry, but this looks like copyright infringement to me if the author doesn’t have Nintendo’s consent to publish this.

            1. 6

              It’s reverse-engineered code, a legal gray area. Emulators would be in the same legal gray area if not for the precedent Sony vs. Bleem set.

              1. 4

                I’m a law student from Europe, specifically Germany, so I can’t say anything about the legal situation in the U.S.A. Maybe I should have clarified that. For Germany, emulators operate on the exemption for private copies (§ 53 German Copyright Act, and related § 44a for the ephemeral copy in RAM).

                This however does assume that you obtain your emulatable software yourself. It does not cover purchase of software ripped by anybody other than you. Specifically, § 53 of the German Copyright Act does not permit publishing anything you ripped. There are some unhealthy paragraphs (which I’d like to not be there) on the prohibition of DRM circumvention in the law as well. §§ 95a ff. forbid circumvention of DRM (making the private copy exemption pretty useless for DRM’ed content), culminating in a criminal law paragraph, § 108b, that penalises circumvention of DRM under certain conditions. I find it cynical that § 95b(1)(Nr.6)(a) specifically allows DRM circumvention under the premise that your private copy is on paper. That being said, I have no idea whether whatever Nintendo used or did not use on the Pokémon game cartridges counts as DRM or not.

                If you did not only rip the software, but also modified it, you are probably in breach of another paragraph as well, because § 69c(Nr.2) makes modification of computer software dependent on the consent of the rights owner (this is different from modification of all other kinds of copyright-protected works, where modification does not require consent, but only publishing of the modification). There might be some more relevant sections; all the above is what I thought of off the top of my head.

                The German Copyright Act is based for the most part on the EU Directive on the Information Society 2001/29/EC, which enables me to say that the situation is probably very similar in other EU member states.

                At least in Europe, I thus conclude that publishing software ripped from cartridges on the Internet is illegal. What about people downloading the software? That’s only illegal if this repository is “clearly illegal” (original wording § 53). Given my lengthy legal explanation above, I wouldn’t say it’s “clearly” illegal, so users are probably fine. OTOH, since I now gave these explanations, to anyone who made it this far in this post it may now be “clearly illegal”. So you must decide yourself now. The familiar “ALL THE WAREZ FOR FREE!!” site however is probably “clearly” illegal.

                1. 2

                  It’s not ripped/modified software though. It’s hand written code which used Pokémon Red as a reference. If you want insight into their reverse-engineering process, look at pokeruby. Right now pokeruby falls into the “clearly illegal” category (since it’s full of raw disassembly), but once it is finished, it will be all hand-written C code.

                  I’m not saying it’s legal, I’m just saying it’s a gray area.

                  1. 1

                    It’s hand written code which used Pokémon Red as a reference.

                    That’s interesting. I’m sorry that I didn’t immediately understand. In that case, the judicial outcome depends on what you mean by “reference”. The process here appears to have been that the author walked through all the machine code and then produced a programme that does exactly the same as the machine code he looked at. For that matter, he could have just written the programme in any other language as well.

                    Taking something as inspiration is of course not covered by copyright law in any way. If a programme is reproduced in all its instructions and structure however, I would qualify this as a copy. It’s an interesting issue about which I need to think more deeply. It is a question of the definition of “copy” then. And if it isn’t a copy, it might still be a “modification”. Both actions are reserved for the rights owner in case of computer programmes.

                    On a side note: I haven’t checked, but if the author uses the original Pokémon graphics, then we’re at a copyright infringement there more easily than with the code.

                    Edit: Decompilation is specifically regulated as well, and usually forbidden (§ 69e). :-)

                    1. 1

                      if the author uses the original Pokémon graphics, then we’re at a copyright infringement there more easily than with the code

                      I thought the Internet made this part of copyright law essentially meaningless? As far as images go, anyway. Sites like Serebii and Bulbapedia host these images, not to mention all of the screencaps and whatnot that are posted on Twitter/Reddit/4chan/whatever. It would be kind of weird to go after pokered for hosting those sprites when there are tons of other people/businesses who do the same thing.

                      1. 1

                        the judicial outcome depends on what you mean by “reference”. The process here appears to have been that the author walked through all the machine code and then produced a programme that does exactly the same as the machine code he looked at. For that matter, he could have just written the programme in any other language as well.

                        From what I can tell, a base ROM was never in the repository. It was user-supplied and used to build the entire pokered repo for a while, until all code and assets had been dumped into files that could be used to rebuild the ROM from just the pokered repo itself. See an early README mentioning base ROMs being required: https://github.com/pret/pokered/blob/c07a745e36cf3b3d07bbf7c2d3c897ddd5127200/README#L3-L16

                        The translation process was probably just using a tool to disassemble code and labeling pieces. This is most likely an act of “decompilation” as 2001/29/EC understands it. In all likelihood, the original was programmed in assembly as well; C was very rare in the Game Boy days. The author could not have “written the programme in any other language”: the few compilers that do exist for the Game Boy yield code that is unsuitable for the constraints of the Game Boy. Furthermore, if 1:1 identical binaries are your goal, you cannot just rewrite it in C when the original wasn’t; there’s no realistic way to get identical results.

                        pokeruby first disassembled Pokémon Ruby and then adds a pass that converts the result to C with the same goal, which is only possible because they have also unearthed the correct compiler.

                        And yes, by definition of having an identical ROM, there are all of the original assets, namely player-visible text, graphics, sound. They’re shipped as part of the repository.
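
As an illustration of what a "1:1 identical binary" goal means in practice, here is a minimal sketch that compares a rebuilt image against a reference by digest and reports the first differing offset. The helper names and the byte strings in the usage note are placeholders, not taken from the pokered build; whether that project checks its output exactly this way is an assumption.

```python
import hashlib


def same_image(built: bytes, reference: bytes) -> bool:
    """True only if both images are byte-for-byte identical."""
    return hashlib.sha1(built).hexdigest() == hashlib.sha1(reference).hexdigest()


def first_difference(built: bytes, reference: bytes):
    """Offset of the first differing byte, or None if the images match."""
    for offset, (a, b) in enumerate(zip(built, reference)):
        if a != b:
            return offset
    if len(built) != len(reference):
        # One image is a strict prefix of the other.
        return min(len(built), len(reference))
    return None
```

For example, `first_difference(b"\x00\x01\x02", b"\x00\xff\x02")` points at offset 1. A single different instruction choice by a compiler is enough to fail such a check, which is why the original toolchain matters so much here.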

                        (The judicial outcome is a total crapshoot anyway. Copyright law in the context of software makes for surprising decisions one after another. It’s simply unsuitable and doctrine in other countries has incessantly pointed it out, but due to international pressure from the United States of America, it happened anyway.)

                2. 3

                  The project has gotten rather large. It seems that, while the legal situation is indeed far from clear, Nintendo and The Pokémon Company International are leaving this repo (or any of the other github/pret efforts) alone.

                1. 2

                  I do know the distraction problem, but is it really required to solve it via a programme? Can’t you just pull the Ethernet cable and/or disconnect from wifi? After all, the main causes of distraction are notifications from all kinds of programs, and they are effectively removed by that. If you need the documentation for your programming language or library, there’s often a way to download it and read it offline, and some languages offer documentation via the command line (like Ruby’s ri). I often make use of this when travelling by train, where Internet tends to be wonky.

                  Side note: For distractionless writing of anything that is not source code, I have – not kidding – switched to a physical, mechanical typewriter. It’s a relief. No distraction, just you, the keys, and the paper. For source code that doesn’t work due to lack of special keys, but otherwise it’s great if you really want to concentrate on a specific topic and just write.

                  1. 2

                    The all or none approach is a bit more difficult when you have actual work that needs to happen on GitHub or some remote server.

                    Though I’ve been a bit more honest about the ratio of that kind of work. Most work can happen offline.

                    (I have a Pomera DM200 for typing up paragraphs in a concentrated way. Not perfect but good enough for my needs)

                    1. 1

                      can you recommend a good typewriter that’s old enough to be simple and durable, but also new enough to be easy to use?

                      1. 2

                        Um … any manual typewriter that still works? They’re not complicated to use …

                        1. 1

                          I have helped in courses on C programming, and when I explained the difference in newlines between Linux and Windows, I used to bring around a typewriter to make the students see what a “Carriage Return” really is. Of course, I left the typewriter available for playing around and from that I can definitely tell you that a typewriter is not self-explanatory anymore. By far the most common question I was asked was:

                          “How do I advance to a new line?”

                          It’s not obvious if you’re used to having an Enter key.
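
The newline difference mentioned above can also be shown in a few bytes. This is a small sketch in Python rather than the C used in the course; the sample strings and helper names are made up for illustration. On a typewriter, the carriage return (CR, 0x0D) moves back to the start of the line and the line feed (LF, 0x0A) advances the paper; Windows text files keep both, Linux keeps only LF.

```python
CR = b"\r"  # 0x0D, carriage return: back to column 0
LF = b"\n"  # 0x0A, line feed: advance one line

unix_text = b"first line" + LF + b"second line" + LF
windows_text = b"first line" + CR + LF + b"second line" + CR + LF


def to_unix(data: bytes) -> bytes:
    """Collapse every CR LF pair into a single LF."""
    return data.replace(CR + LF, LF)


def to_windows(data: bytes) -> bytes:
    """Normalise to LF first so existing CR LF pairs are not doubled."""
    return to_unix(data).replace(LF, CR + LF)
```

Normalising before expanding in `to_windows` is a deliberate choice: it makes the conversion safe to apply to text that already uses Windows line endings.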

                        2. 2

                          Any mechanical typewriter that was manufactured before 1970 should be good and durable. After that date quality appears to decline, and for electric/electronic typewriters it’s much more difficult to repair things if they break. I’m happily typing on a 1960s Olympia SM9, but really, take a look at eBay, your local antiques shop or similar and just buy one that looks nice to you. You probably shouldn’t start with a “Standard” (i.e. full-size) typewriter, because they’re very heavy and hard to sell again. Don’t worry if the ribbon is dried out; it’s easy to get replacement ribbons, e.g. on Amazon. If everything else works, the typewriter is probably fine.

                          Also, don’t go with extremely old models (pre 1920) if you want to actually use them and not just look at them.

                          If you handle your machine with care (never clean it with WD40!), it will last decades, as they have lasted already. Apart from my Olympia I have a completely functional 1930s Continental typewriter that I occasionally type on, but it requires much more pressure on the keys, which I find uncomfortable.

                          Lobsters isn’t a typewriter community (yet), so I think that should complete it. You might want to register at a typewriter forum like this one if you have further questions, as they can be answered there much better and by more competent people than me.

                      1. 2

                        There’s no need to update dynamic DNS via a script. BIND has native dynamic DNS capabilities by means of the RFC 2136 DNS UPDATE command that uses (symmetric) crypto so you can execute it from your local connection. Back in 2015, I wrote a blog post on that approach, but it’s in German.
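
For readers who have not used RFC 2136 updates, here is a rough sketch of what such an update looks like when driven through BIND's `nsupdate` tool. It only builds the command text; the server, zone, host name and address are hypothetical placeholders (the address is from a documentation range), and the setup in the blog post may differ.

```python
import ipaddress


def build_nsupdate_script(server, zone, hostname, address, ttl=300):
    """Build nsupdate input that replaces the address record of `hostname`."""
    ip = ipaddress.ip_address(address)  # raises ValueError on bad input
    rtype = "A" if ip.version == 4 else "AAAA"
    fqdn = hostname if hostname.endswith(".") else hostname + "."
    lines = [
        f"server {server}",
        f"zone {zone}",
        f"update delete {fqdn} {rtype}",  # drop the stale record
        f"update add {fqdn} {ttl} {rtype} {ip}",
        "send",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
script = build_nsupdate_script(
    "ns.example.org", "example.org", "home.example.org", "203.0.113.7"
)
```

The resulting text would typically be fed to `nsupdate -k <keyfile>` on standard input, where the key file holds the shared TSIG secret that the server is configured to accept for that zone.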

                        On the topic of running things from home: I’m very much a friend of it for control and privacy reasons, but there’s a major bummer for me. My ISP prohibits running servers from home in its ToS; you need to upgrade to a “business account” if you want to do that, and the business account comes with a static IP address anyway, so there’s not much reason to do a complex dynamic DNS setup. As a result, I currently run my website on a simple VPS.

                        1. 9

                          Hah, I was actually curious whether AST will make a move. Good to see he did.

                          Still, it’s sad that he doesn’t seem to care about ME.

                          1. 7

                            Whether he cares about ME is irrelevant here. By releasing the software under most (all?) free software and open source licenses, you forfeit the right to object even if the code is being used to trigger a WMD - with non-copyleft licenses you agree not to even see the changes to the code. That’s the beauty of liberal software licenses :^)

                            All that he had asked for is a bit of courtesy.

                            1. 4

                              AFAIK, this courtesy is actually required by BSD license, so it’s even worse, as Intel loses here on legal ground as well.

                              1. 5

                                No, it is not - hence the open letter. You are most likely confused by the original BSD License, which contained the so-called advertising clause.

                                1. 5

                                  Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

                                  http://git.minix3.org/index.cgi?p=minix.git;a=blob;f=LICENSE;h=a119efa5f44dc93086bc34e7c95f10ed55b6401f;hb=HEAD

                                  1. 9

                                    Correct. The license requires Intel to reproduce what’s mentioned in the parent comment. The distribution of Minix as part of the IME is a “redistribution in binary form” (i.e., compiled code). Intel could have placed the parts mentioned in the license into those small paper booklets that usually accompany hardware, but as far as I can see, they haven’t done so. That is, Intel is breaching the BSD license Minix is distributed under.

                                    There’s no clause in the BSD license to inform Mr. Tanenbaum about the use of the software, though. That’s something he may complain about as lack of courtesy, but it’s not a legal requirement.

                                    What’s the consequence of the license breach? I can only speak for German law, but the BSD license does not include an auto-termination clause like the GPL does, so the license grant remains in place for the moment. The copyright holder (according to the link above, this is Vrije Universiteit, Amsterdam) may demand compensation or acknowledgment (i.e. fulfillment of the contract). Given the scale of the breach (it’s used in countless units of Intel’s hardware, distributed all over the globe by now), it might even be able to revoke the license grant, effectively stopping Intel from selling any processor containing the then-unlicensed Minix. So, if you ever felt like the IME should be removed from this world, talk to the Amsterdam University and convince them to sue Intel over the BSD license breach.

                                    That’s just my understanding of the things, but I’m pretty confident it’s correct (I’m a law student).

                                    1. 3

                                      It takes special skill to break a BSD license, congrats Intel.

                                      1. 5

                                        Actually, they may have a secret contract with the University of Amsterdam that has different conditions. But that we don’t know.

                                        1. 2

                                          Judging from the text, doesn’t seem AST is aware of it.

                                          1. 2

                                            The University of Amsterdam (UvA) is not the Vrije Universiteit Amsterdam (VU). AST is a professor at the VU.

                                      2. 1

                                        I’ve read the license - thanks! :^)

                                        The software’s on their chip and they distribute the hardware so I’m not sure that actually applies - I’m not a lawyer, though.

                                        1. 5

                                          Are you saying that if you ship the product in hardware form, you don’t distribute software that it runs? I wonder why all those PC vendors were paying fees to Microsoft for so long.

                                          1. 2

                                            For the license - not the software

                                            1. 3

                                              Yes, software is licensed. It doesn’t mean that if you sell hardware running software, you can violate that software’s license.

                                          2. 3

                                            So, they distribute a binary form of the OS.

                                            1. 4

                                              This is the “tivoization” situation that the GPLv3 was specifically created to address (and the BSD licence was not specifically updated to address).

                                              1. 2

                                                No, it was created to address not being able to modify the version they ship. Hardware vendors shipping GPLv2 software still have to follow the license terms and release source code. It’s right in the article you linked to.

                                                BSD license says that binary distribution requires mentioning copyright license terms in the documentation, so Intel should follow it.

                                                1. 3

                                                  Documentation or other materials. Does including a CREDITS file in the firmware count? (For that matter, Intel only sells the chipset to other vendors, not end users, so maybe it’s in the manufacturer docs? Maybe they’re to blame for not providing notice?)

                                                  1. 3

                                                    You have a point with the manufacturers being in-between Intel and the end users that I didn’t see in my above comment, but the outcome is similar. Intel redistributes Minix to the manufacturers, which then redistribute it to the end-users. Assuming Intel properly acknowledges things in the manufacturer’s docs, it’d then be the manufacturers that were in breach of the BSD license. Makes suing more work because you need to sue all the manufacturers, but it’s still illegal to not include the acknowledgements the BSD license demands.

                                                    Edit:

                                                    Does including a CREDITS file in the firmware count?

                                                    No. “Acknowledging” is something that needs to be done in a way the person that receives the software can actually take notice of.

                                                    1. 2

                                                      The minix license doesn’t use the word “acknowledging” so that’s not relevant.

                                                      1. 2

                                                        You’re correct, my bad. But “reproduce the above copyright notice” etc. aims at the same. Any sensible interpretation of the BSD license’s wording has to come to the result that the receivers of the source code must be able to view those parts of the license text mentioned, because otherwise the clause would be worthless.

                                              2. 1

                                                If they don’t distribute that copyright notice (I can’t remember last seeing any documentation coming directly from Intel as I always buy pre-assembled hardware) and your reasoning is correct, then they ought to fix it and include it somewhere.

                                                However, the sub-thread started by @pkubaj is about being courteous, i.e. informing the original author about the fact that you are using their software - MINIX’s license does not have that requirement.

                                    2. 7

                                      I think he is just happy he has a large company using minix.

                                      1. 5

                                        Still, it’s sad that he doesn’t seem to care about ME.

                                        Or just refrains from fighting a losing battle? It’s not like governments would give up on spying on and controlling us all.

                                        1. 6

                                          Do you have a cohesive argument behind that or are you just being negative?

                                          First off, governments aren’t using IME for dragnet surveillance. They (almost certainly) have some 0days, but they aren’t going to burn them on low-value targets like you or me. They pose a giant risk to us because they’ll eventually be used in general-purpose malware, but the government wouldn’t actually fight much (or maybe at all, publicly) to keep IME.

                                          Second off, security engineering is a sub-branch of economics. Arguments of the form “the government can hack anyone, just give up” are worthless. Defenders currently have the opportunity to make attacking orders of magnitude more expensive, for very little cost. We’re not even close to any diminishing returns falloff when it comes to security expenditures. While it’s technically true that the government (or any other well-funded attacker) could probably own any given consumer device that exists right now, it might cost them millions of dollars to do it (and then they have only a few days/weeks to keep using the exploit).

                                          By just getting everyday people to adopt marginally better security practices, we can make dragnet surveillance infeasibly expensive and reduce damage from non-governmental sources. This is the primary goal for now. An important part of “marginally better security” is getting people to stop buying things that are intentionally backdoored.

                                          1. 2

                                            Do you have a cohesive argument behind that or are you just being negative?

                                            Behind what? The idea that governments won’t give up on spying on us? Well, it’s quite simple. Police states have happened all throughout history, governments really really want absolute power over us, and they’re free to work towards it in any way they can.. so they will.

                                            They (almost certainly) have some 0days, but they aren’t going to burn them on low-value targets like you or me.

                                            Sure, but do they even need 0days if they have everyone ME’d?

                                            They pose a giant risk to us because they’ll eventually be used in general-purpose malware

                                            Yeah, that’s a problem too!

                                            Defenders currently have the opportunity to make attacking orders of magnitude more expensive, for very little cost. [..] An important part of “marginally better security” is getting people to stop buying things that are intentionally backdoored

                                            If you mean using completely “libre” hardware and software, that’s just not feasible for anyone who wants to get shit done in the real world. You need the best tools for your job, and you need things to Just Work.

                                            By just getting everyday people to adopt marginally better security practices, we can make dragnet surveillance infeasibly expensive and reduce damage from non-governmental sources.

                                            “Just”? :) I’m not saying we should all give up, but it’s an uphill battle.

                                            For example, the blind masses are eagerly adopting Face ID, and pretty soon you won’t be able to get a high-end mobile phone without something like it.

                                            People are still happily adopting Google Fiber, without thinking about why a company like Google might want to enter the ISP business.

                                            And maybe most disgustingly and bafflingly of all, vast hordes of Useful Idiots are working hard to prevent the truth from spreading - either as a fun little hobby, or a full-time job.

                                          2. 4

                                            It reads to me like he just doesn’t want to admit that he’s wrong about the BSD license “providing the maximum amount of freedom to potential users”. Having a secret un-auditable, un-modifiable OS running at a deeper level than the OS you actually choose to run is the opposite of user freedom; it’s delusional to think this is a good thing from the perspective of the users.

                                            1. 2

                                              And the BSD code supported that by making their secret box more reliable and cheaper to develop.

                                            2. 3

                                              Oh, it’s still not lost. ME_cleaner is getting better, Google is getting into it with NERF, Coreboot works pretty well on many newish boards and on top of that, there’s Talos.

                                            3. 2

                                              He posted an update in which he says he doesn’t like IME.

                                            1. 4

                                              Your docs mention that on POSIX systems, paths might not be valid UTF-8 (or any single encoding), but it’s not clear to me what Pathie does in such a situation: are paths containing invalid UTF-8 inaccessible? Can you read them from the OS but not construct them yourself?

                                              Your docs also say that Windows uses UTF-16LE, which is not strictly true: in the same way that POSIX paths are a bucket of bytes and not necessarily valid UTF-8, Windows paths are a bucket of uint16_ts and not necessarily valid UTF-16 (in particular, they can have lone surrogates that do not form a surrogate pair, or values that are not assigned in the Unicode database). How does Pathie interact with such malformed paths?

                                              Lastly, macOS: as your documentation points out macOS does have an enforced filename encoding, but it also has an enforced normalisation (at least for HFS+ volumes). That means your application can create a string, create a file with that name, then readdir() the directory containing that file and none of the returned directory entries will byte-for-byte match the string you started with. Does that affect Pathie’s operation?

                                              1. 2

                                                Your docs mention that on POSIX systems, paths might not be valid UTF-8 (or any single encoding), but it’s not clear to me what Pathie does in such a situation: are paths containing invalid UTF-8 inaccessible?

                                                First off, Pathie does not assume POSIX paths are UTF-8, because that isn’t specified. Unless you compile Pathie with ASSUME_UTF8_ON_UNIX, it takes the encoding information from the environment via the nl_langinfo(3) function called with CODESET as the parameter (which is why you need to initialise your locale on Linux systems).
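
                                                For illustration only, the same locale query can be sketched in Python rather than in Pathie's C++. The helper name `current_codeset` and the fallback branch are assumptions of this sketch, not part of Pathie:

```python
import locale


def current_codeset() -> str:
    """Return the character encoding announced by the environment's locale."""
    try:
        # Counterpart of setlocale(LC_ALL, "") in C: adopt the user's locale.
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # Unsupported locale settings: stay in the default "C" locale.
        pass
    if hasattr(locale, "nl_langinfo"):
        # POSIX only: the same query as nl_langinfo(CODESET) in C.
        return locale.nl_langinfo(locale.CODESET)
    # Fallback for platforms without nl_langinfo (an assumption of this sketch).
    return locale.getpreferredencoding(False)


print(current_codeset())
```

                                                Without the setlocale step the process stays in the "C" locale, and the reported codeset is typically a plain ASCII one instead of what the user configured.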

                                                are paths containing invalid UTF-8 inaccessible? Can you read them from the OS but not construct them yourself?

                                                In the case of a path with invalid characters in the locale’s encoding (e.g., invalid UTF-8 on most modern Linuxes), you’ll get an exception when trying to read such a path from the filesystem, because iconv(3) fails with EILSEQ (which is transformed into a proper C++ exception by Pathie). You cannot construct paths containing invalid characters either, because you will receive the same exception. I’ll make this clearer in the docs.
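
                                                A rough Python analogue of that behaviour, assuming strict decoding stands in for iconv(3). `PathEncodingError` and `decode_path` are hypothetical names for this sketch, not Pathie's API:

```python
class PathEncodingError(ValueError):
    """Hypothetical exception type, standing in for Pathie's C++ exception."""


def decode_path(raw: bytes, codeset: str = "utf-8") -> str:
    """Strictly decode a raw path; reject bytes that are invalid in `codeset`."""
    try:
        return raw.decode(codeset, errors="strict")
    except UnicodeDecodeError as exc:
        # Comparable to iconv(3) failing with EILSEQ.
        raise PathEncodingError(f"invalid {codeset} in path {raw!r}") from exc


print(decode_path(b"/tmp/caf\xc3\xa9"))      # valid UTF-8
try:
    decode_path(b"/tmp/caf\xe9")             # Latin-1 byte, invalid as UTF-8
except PathEncodingError as err:
    print("rejected:", err)
```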

                                                Your docs also say that Windows uses UTF-16LE, which is not strictly true:

                                                Paths in a valid encoding are UTF-16LE. Broken path encodings may be anything, and that’s nothing one can make assumptions about. Again, you’ll receive an exception when you encounter them (because the underlying WideCharToMultiByte() function from the Win32API fails).

                                                (in particular, they can have lone surrogates that do not form a surrogate pair, or values that are not assigned in the Unicode database)

                                                I was not aware of that. Do you have a link with explanations, ideally on MSDN?

                                                Lastly, macOS: as your documentation points out macOS does have an enforced filename encoding, but it also has an enforced normalisation (at least for HFS+ volumes)

                                                macOS is not officially supported by Pathie (which is stated at the top of the README), simply because I don’t have a Mac to test on.

                                                That means your application can create a string, create a file with that name, then readdir() the directory containing that file and none of the returned directory entries will byte-for-byte match the string you started with. Does that affect Pathie’s operation?

                                                It shouldn’t affect Pathie’s operation itself. Pathie will simply pass through what the filesystem gives it; since on macOS no conversion of path encodings happens, these normalised sequences are handed through to the application that uses Pathie.
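
                                                To make the normalisation point concrete, here is a small Python sketch. It assumes standard NFD as an approximation of what HFS+ stores (HFS+ uses its own variant of decomposition), and `same_name` is a hypothetical helper:

```python
import unicodedata

requested = "caf\u00e9"                             # precomposed é (NFC)
stored = unicodedata.normalize("NFD", requested)    # e + combining acute accent


def same_name(a: str, b: str) -> bool:
    """Compare two file names after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(requested.encode("utf-8"))
print(stored.encode("utf-8"))
print(requested == stored)            # byte-for-byte comparison
print(same_name(requested, stored))   # comparison after normalisation
```

                                                An application that compares names it created against readdir() results would need something like the normalising comparison, since a pass-through library hands over the decomposed form unchanged.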

                                                Thanks for the feedback!

                                                1. 3

                                                  Do you have a link with explanations, ideally on MSDN?

                                                  Unfortunately, I can’t find a smoking-gun writeup on MSDN. However, in my searching, I did find:

                                                  • Scheme48 has a special OS String type, and motivates it saying “On Windows, unpaired UTF-16 surrogates are admissible in encodings, and no lossless text decoding for them exists.”
                                                  • Racket’s encoding conversion functions include special “platform-UTF-8” and “platform-UTF-16” encodings: “On Windows, the input can include UTF-16 code units that are unpaired surrogates…”
                                                  • Rust also includes a special OSString type: “On Windows, strings are often arbitrary sequences of non-zero 16-bit values, interpreted as UTF-16 when it is valid to do so.”
                                                  • I found the Rust ticket that introduced the OSString type, which includes a (Rust) test case. One of the Rust devs dug up an MSDN page that says “…the file system treats path and file names as an opaque sequence of WCHARs.”
                                                  • That issue also linked to a report of UTF-16-invalid filenames being found in the wild in somebody’s Recycle Bin.
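
                                                  As a quick illustration of what "not necessarily valid UTF-16" means, this Python sketch checks sequences of 16-bit code units for well-formedness. The helper name `is_valid_utf16` is an assumption of this sketch:

```python
import struct


def is_valid_utf16(units):
    """Return True if the 16-bit code units form well-formed UTF-16."""
    raw = struct.pack(f"<{len(units)}H", *units)   # little-endian uint16 values
    try:
        raw.decode("utf-16-le")                    # strict decoding
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf16([0x0041, 0x0042]))           # "AB"
print(is_valid_utf16([0xD83D, 0xDE00]))           # surrogate pair for U+1F600
print(is_valid_utf16([0x0041, 0xD800, 0x0042]))   # lone high surrogate
```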
                                                  1. 3

                                                    It’s a problem of enforcement. Rust internally uses WTF-8 as an internal encoding to fix that.

                                                    https://simonsapin.github.io/wtf-8/
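
                                                    The core idea can be approximated in Python with the "surrogatepass" error handler. This is only a sketch of the principle, not an implementation of WTF-8: unlike WTF-8, it would also encode a properly paired surrogate as two separate three-byte sequences. `to_bytes` and `from_bytes` are hypothetical names:

```python
def to_bytes(name: str) -> bytes:
    """Encode a string to UTF-8-like bytes, letting lone surrogates through."""
    return name.encode("utf-8", "surrogatepass")


def from_bytes(raw: bytes) -> str:
    """Inverse of to_bytes."""
    return raw.decode("utf-8", "surrogatepass")


name = "A\ud800B"   # contains a lone high surrogate

try:
    name.encode("utf-8")            # strict UTF-8 refuses the surrogate
except UnicodeEncodeError:
    print("strict UTF-8: rejected")

encoded = to_bytes(name)
print(encoded)
print(from_bytes(encoded) == name)  # lossless round trip
```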

                                                    1. 1

                                                      An interesting read, thank you for the pointer. I’ll see if I adapt Pathie accordingly, but until now it has done the job for me (and it’s mostly a library I use for my own projects).

                                                    2. 1

                                                      Thanks!

                                                1. 1

                                                  The one question I have about any cross-platform GUI library is whether it’s using the native platform APIs or doing its own rendering, usually by means of OpenGL (aka “immediate GUI”). I didn’t find an answer on the website. So far, I only know of exactly one real cross-platform C++ GUI library that uses the native APIs, which is wxWidgets. Most libraries I have encountered promise cross-platform and then use the “immediate GUI” approach, which will never look really native and is a battery killer in my experience.

                                                  1. 2

                                                    Looking at the source for the graphics class and the button class, looks like they’re just drawing their own widgets via Windows APIs that I’m not familiar with or Xlib (the X11 drawing library). Xlib is an interesting choice; most GUI libraries I’ve seen go for OpenGL, which gives you a lot more control, and isn’t slated to be replaced by Wayland. To your point, it might be friendlier on battery than an OpenGL implementation, since it’s not doing any drawing in immediate mode (and instead looks to be repainting into X11 buffers when needed).

                                                    Personally, I’ve given up trying to make the GUIs I work on feel native, especially since the GUIs I’m writing are usually for internal tools or for my own consumption. I default to using Dear ImGUI, which makes GUIs that are very utilitarian and look nothing like native, and is super super nice to use as a programmer.

                                                    1. 2

                                                      My solution is to just write a native UI for each platform you care about. For me, this basically means I’ll just use Windows, and just keep a rudimentary GTK# frontend working, primarily so logic can be split out from views when reasonable.

                                                      I’m a big stickler for things feeling like they belong to the platform. Otherwise, why not write a web app instead?

                                                      1. 1

                                                        I think it depends on the target audience; usually, I’m writing a GUI for a piece of C++ code that controls a robot or a geometry system or something, and the intended users are either me or the employees of a company I’m consulting for. Writing a web frontend for some chunk of native code is irritating – I either have to jam a web server into a C++ binary, or create wrappers for some scripting language – and using a native toolkit isn’t that important to me, especially because the applications I’m writing usually have some weird widgets in them anyway (2D sliders, or line graphs, or something).

                                                        In general, I think that writing a very simple UI can really aid in debugging complex control systems, and I just pick up the tool that makes it as easy as possible to do so. In my experience, that’s an immediate GUI, without callbacks or a dedicated event loop or anything.

                                                  1. 4

                                                    Some time ago, I wrote a C library to interact with the X and MS Windows clipboard in a cross-platform way without large dependencies like Gtk+. It only supports text, but at least Unicode text. I haven’t used it since then, though. From that experience, I can definitely say that the X clipboard system is ridiculously complex. Does anyone have a pointer for me on whether Wayland has improved on that?

                                                    1. 3

                                                      At one point I decided I wanted to write the Wayland equivalent of xclip, and I started looking into the documentation. Unfortunately for me, it turns out that Wayland will only let you read or write the clipboard in response to an input event, like a mouse-click or key-press. This is so that unscrupulous programs can’t lurk in the background and send all your clipboard-copies to the Russian mafia and cause your pastes to produce Viagra adverts. A noble goal, but it makes a hypothetical wayclip program much less usable.

                                                      Of course, in practice every Wayland desktop will have Xwayland installed for compatibility with existing X11 apps… which means that under Wayland, you keep using xclip and it works just fine.

                                                    1. 2

                                                      The suggestion in the comments to buy back broken FP1s for spare parts is a good one I hope they’d consider following up.

                                                      And hopefully a few enthusiasts in the community would be able to take up the reins on the KitKat port?

                                                      1. 9

                                                        I would not expect a community effort for software updates. The source for FP1 was never available.

                                                        Fairphone does not seem to understand open source communities well. They are experts on the (very difficult) hardware supply chain problem. But since keesj left the company, it looks from the outside like there is nobody left inside who actually groks open source :(

                                                        1. 2

                                                          And hopefully a few enthusiasts in the community would be able to take up the reins on the KitKat port?

                                                          Someone tried to bring Cyanogen with Android 5 (Lollipop) to the Fairphone 1, and failed. It’s the missing source code that’s the problem.

                                                        1. 6

                                                          As an owner of a Fairphone 1 which I still use daily, I feel a little angry indeed. I can absolutely understand their situation and after thinking about it, I don’t think they really had a choice. As has been pointed in this ridiculously long thread in the Fairphone forum, Fairphone never promised a sustainable phone. They promised a conflict-free phone (or at least one that was more fair than other phones). To that end, they have, at their time, succeeded in my opinion. Since then more companies have started to look on the fair side of phones (probably their biggest achievement), so for the future, they will have to do things differently in order to stand out.

                                                          While Fairphone has stopped development of the software for the Fairphone 1, there is a last alpha build available for Android 4.4. I installed it, and it mostly runs (no problems with the phone encryption, I was surprised). It’s good for daily use in my experience. I’m unsure as to how I will continue. I would have greatly appreciated it if they had offered the existing (now naturally angry) owners of Fairphone 1 devices a discount on their new devices. They don’t, and I don’t feel like spending such an amount of money again on a phone that’s probably not going to live longer than usual phones. Since I apparently got one of the last Fairphone 1 batteries from their store about 3 weeks ago, I do have some time to think about it; the phone is still fully intact.

                                                          1. 7

                                                            That’s great. GDC will then finally move into regular Linux distribution repositories I hope.

                                                            (still no D tag?)

                                                            1. 4

                                                              Eh, gdc has been in Debian (hence also Ubuntu) for a long time now.

                                                              1. 1

                                                                I stand corrected; thanks.

                                                            1. 5

                                                              It looks like D could be a really good language for most applications, in particular on linux. People seem to either keep C (unsafe, no abstraction) or python (slow and untyped)… At least D can be reasonably high level (like python) but still very performant. I’m just a bit pessimistic on the chances that languages that have been around for a while suddenly become popular.

                                                              1. 6

                                                                I’m just a bit pessimistic on the chances that languages that have been around for a while suddenly become popular.

                                                                ironic for an ocaml person to say that :)

                                                                1. 2

                                                                  s/ironic/realistic/ ;-) I love OCaml, but I doubt it will ever become popular. Maybe the Reason syntax (which is more C-like, something that can help a lot) will change that, but I will not hold my breath.

                                                                  1. 3

                                                                    oh :) i was thinking of the way ocaml has suddenly seen a spike in popularity over the last few years - it will never be C-level popular, but it definitely feels like it has a lot of momentum and community activity it didn’t have for a long time.

                                                                    1. 1

                                                                      Indeed, some factors made this possible (better tooling with merlin, the opam package manager, …). The community is active, and more people have joined it, but it still is small.

                                                                2. 3

                                                                  in particular on linux.

                                                                  I would love to see D as a viable alternative for Windows development as well, but since both dmd and ldc have a hard dependency on MSVC, I don’t see this coming soon; it makes cross-compilation from Linux difficult to impossible. GDC might be able to fill this hole, but it still is a one-man show and will only evolve very slowly (not to mention what happens if the maintainer loses interest). Also, I have been told GDC produces giant executables for small programs, but that might improve more quickly.

                                                                  For Linux, I think there are enough easily installable, modern alternatives to C/C++ that I don’t think that that’s a place where D could shine.

                                                                  1. 1

                                                                    I’m just a bit pessimistic on the chances that languages that have been around for a while suddenly become popular.

                                                                    If you model language usage as a logistic curve then this scenario is perfectly realisable.
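
                                                                    A minimal sketch of that model, with purely hypothetical parameters (the growth rate and the midpoint year are made up for illustration, not measured for any language):

```python
import math


def logistic(t, capacity=1.0, rate=1.0, midpoint=0.0):
    """Logistic curve: capacity / (1 + exp(-rate * (t - midpoint)))."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))


# Hypothetical scenario: adoption takes off around year 20.
for year in (0, 10, 20, 30):
    print(year, round(logistic(year, rate=0.5, midpoint=20), 3))
```

                                                                    With these numbers the curve stays close to zero for the first decade and then climbs steeply around the midpoint, which is the "around for a while, then suddenly popular" shape.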

                                                                  1. 2

                                                                    There seems to be quite some D stuff recently. Do we need a D tag?

                                                                    1. 1

                                                                      Yes! I was surprised to find no tag for D.

                                                                    1. 9

                                                                      D has two compilers, the fully free LLVM-based D compiler and the official reference compiler dmd, the backend of which used to be proprietary, only licensed for single users by Symantec.

                                                                      https://www.reddit.com/r/programming/comments/82cgp/new_release_of_the_d_programming_language_now/

                                                                      1. 10

                                                                        There’s also GDC.

                                                                        1. 4
                                                                          1. 1

                                                                            wow, hadn’t seen that before. nice project!

                                                                          2. 1

                                                                            Problem with GDC is that it’s a one-man show. Take a look at the commit log and the contributors graph:

                                                                            It’s great this person is placing so much effort into it, but I’d be careful before I use it for anything serious.

                                                                            1. 1

                                                                              Right!

                                                                              Fun coincidence: Yesterday I discovered that you had been involved in developing rstat.us and today you are writing me a comment here.

                                                                              1. 3

                                                                                Yup! Been watching Mastodon with… feels. I hope they can accomplish what I could not.

                                                                          1. 11

                                                                            The directive that is now codified as 2009/24/EC was a direct response to the German Federal Court’s (BGH) decision nicknamed “Inkasso-Programm”, where it ruled that software needs to reach the level of creativity usually required for any kind of work to be protected by German copyright law (“Schöpfungshöhe”). According to the Court, programs only rarely cross that creativity level and are thus usually public domain, unless unusually creative. The Member States and the Commission naturally found this unsatisfying, especially with the USA already protecting programs with their completely different copyright system. As a result, the mentioned directive was issued, whose primary purpose was to lower the creativity burden without ditching it completely. Germany implemented it later in §§ 69a ff. UrhG, so that programs now usually fall under copyright.

                                                                            When the copyleft effect of the GPL kicks in is highly debated, which the article does not really mention. There is no consensus about this in jurisprudence, and GPL cases are surprisingly rarely taken to court to clarify this particular issue. The ongoing case against VMware might give some opportunity for courts to clarify, but it is not over yet, and I suspect it will not cover all possible aspects. The case of dynamically linking a GPL'ed library is an especially hot topic.

                                                                            There is an ongoing debate as to whether copyright law is the correct place for software protection at all, because in Continental Europe, the concept of copyright is mainly focused on the author’s personal creativity and only sees the commercial use of works as a necessity for the author to make a living. The concept of copyright cannot be thought of without the author, who is connected to his work for all of his lifetime. If it weren’t for several international treaties mainly drafted by the USA, software protection law in Europe would probably have taken a very different path.

                                                                            1. 1

                                                                              Since someone mentioned Webkit, I’ll just stick the question in here. As a Gentoo user, I bother about compile times, and compiling WebkitGtk+ takes more than an hour on my system. Even Firefox compiles faster. If I were to develop an application that should render some simple HTML e.g. for help pages, which alternative embeddable renderers exist?

                                                                              1. 4

                                                                                librocket/dillo/netsurf

                                                                                librocket is pretty enormous but it got used for the UI in Warsow, dillo/netsurf are probably also enormous and I don’t know anything that uses them.

                                                                                1. 4

                                                                                  It’s simply not possible to write a small, clean browser implementation which conforms to current web standards. If you’re willing to stick to a buggy and incomplete subset of html, there are some options like netsurf and dillo.

                                                                                  Personally, I’d avoid HTML entirely if I could.

                                                                                  1. 1

                                                                                    If you’re not using JavaScript then consider NetSurf or Dillo.

                                                                                  1. 2

                                                                                    While I like the idea of slim browsers, the general consensus is to avoid browsers using ‘WebKit/GTK+’. OpenBSD now warns users when they use a browser that depends on www/webkit:

                                                                                    --- +webkit-2.4.11p1v1 -------------------
                                                                                    !!! WARNING: WebKitGTK+ 2.4 is known to have many security vulnerabilities that
                                                                                    !!! will NOT be fixed. Avoid browsing with it
                                                                                    

                                                                                    Here is a blog post that goes into detail about the issues around existing webkit based stuff.

                                                                                    1. 4

                                                                                      The above info is incorrect as per the post it links to. WebkitGtk+ 2.4 is old and insecure, since it’s a branch of WebkitGtk+ that has not been maintained for quite some time. Newer versions of WebkitGtk+ are properly maintained. Just take a look at the actual post you linked to.

                                                                                      1. 3

                                                                                        I think the key takeaway is that surf uses 2.4. I guess I should have been more explicit when I said “browsers using ‘WebKit/GTK+’.”, I assumed the context of the original article would be applied.

                                                                                    1. 8

                                                                                      I have tried surf and ran it for quite some days, but I had a number of problems with it.

                                                                                      • It is unstable. It crashed way too often for me to be used as my day-to-day browser.
                                                                                      • I know no reliable way to do adblocking with it. Just fiddling with /etc/hosts is not really enough; many pages look weird if you do that. Adblockers in Firefox have never shown this problem for me. And keeping /etc/hosts up-to-date is a pain.
                                                                                      • Enabling JavaScript for a page only on demand does not work. I want it off by default.
                                                                                      • surf rejects a number of SSL websites Firefox accepts for no obvious reason (especially bad with lets-encrypt sites). In contrast to what the article says, surf does support SSL, though. Just not in the stable version I have found.
                                                                                      • I have no idea how to replicate the functionality of a plugin I very much like, Flagfox. It displays a little country flag in Firefox’s URL bar depending on where it thinks the IP is from.
                                                                                      • Bookmarks. I know I can manage them with scripts, but until now I have been too lazy for that, since a proper script would allow me to search the bookmark list and then directly follow the link. I often find myself remembering the title of an article I read and bookmarked, but not the page. So it is required that the bookmark link is stored together with the title and can be selected by either title or URL. As said, possible, but I’m just too lazy for that.
                                                                                      • How to forbid 3rd party cookies? How to delete all cookies after quitting the last surf instance?
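
                                                                                      The bookmark script mentioned in the list above could look roughly like this. It is only a sketch: the one-entry-per-line "URL<TAB>title" format and the function names are assumptions, not something surf provides:

```python
def parse_bookmarks(text):
    """Parse lines of the assumed form 'URL<TAB>title' into (url, title) pairs."""
    entries = []
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        url, title = line.split("\t", 1)
        entries.append((url.strip(), title.strip()))
    return entries


def search(entries, query):
    """Return entries whose title or URL contains the query, ignoring case."""
    needle = query.lower()
    return [(u, t) for u, t in entries
            if needle in u.lower() or needle in t.lower()]


sample = (
    "https://example.org/wtf8\tThe WTF-8 encoding\n"
    "https://example.com/d\tD language news\n"
)
for url, title in search(parse_bookmarks(sample), "wtf"):
    print(title, "->", url)
```

                                                                                      The selected URL could then be handed to the browser, so a bookmark is found by either its title or its address.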

                                                                                      There were probably more points which I don’t remember anymore.

                                                                                      Now I’m back on Firefox and have turned on the start-search-by-typing option. This gives me the level of keyboard navigation I need – I can just type in the text of a link and Firefox will select it. There is a surprising number of useful keyboard shortcuts in Firefox that are a little bit hidden (for example, by typing ‘ [single quote] with the start-search-by-typing option enabled you search only the links of a page, very useful).

                                                                                      1. 2

                                                                                        The SSL woes are more than likely because Firefox caches sub-CAs it sees in the wild to handle all the badly configured webservers that do not serve the whole certificate chain when connecting.

                                                                                        I really wish browsers did not do this as it masks a problem that to the sysadmin running the site looks just like a temporary glitch in the matrix that they can ignore.

                                                                                        I hate the web.

                                                                                        1. 1

                                                                                          For adblocking, you could use http://git.codemadness.nl/surf-adblock/ with surf-webkit2.

                                                                                          1. 1

                                                                                            surf rejects a number of SSL websites Firefox accepts for no obvious reason (especially bad with lets-encrypt sites). In contrast to what the article says, surf does support SSL, though. Just not in the stable version I have found.

                                                                                            There are some issues with TLS, at least on my Fedora system (visiting the badssl.com dashboard). For example, no host matching, no check for expired certificate, etc.

                                                                                          1. 5

                                                                                            Although I don’t necessarily like GitHub, this interpretation of the ToS strikes me as rather unusual. Point by point:

                                                                                            Section D.7 requires the person uploading content to waive any and all attribution rights.

                                                                                            This is not true in this broad sense. §D.7 first asserts “You retain all moral rights to Content you upload”, which sets the general direction. The next sentence is what the author is hung up on: “However, you waive these rights”. Now continue reading and you see that the sentence continues: “and agree not to assert them against us,”. The further explanation in that sentence and the subsequent grant of rights paragraph make clear that you grant the waiver only to GitHub Inc. and not to anybody else.

                                                                                            It is true that you cannot waive rights you don’t own, though. The one true point in this critique is that uploading works not owned by you (or for which you don’t have the rights to grant waivers) is probably a bad idea now.

                                                                                            Next one:

                                                                                            Section D.5 requires the uploader to grant all other GitHub users… the right to “use, display and perform” the work

                                                                                            This is nonsense. It appears the author likes to tear sentences apart rather than read them completely. While his interpretation is possible, it would basically contradict the entire context of the snippet he quotes. Let me set this straight by counter-quoting with an elision:

                                                                                            If you set your pages and repositories to be viewed publicly, you grant each User of GitHub a nonexclusive, worldwide license to […] use, display and perform your Content […] solely on GitHub as permitted through GitHub’s functionality.

                                                                                            If read like that, it is clear that you grant these rights only for use on GitHub and not anywhere else. It merely makes formally legal what is already perceived as normal: if you upload something publicly to GitHub, you want people to be able to look at it and to use the “fork” button on it. Also please note that the clause explicitly refers only to public repositories.

                                                                                            Next one:

                                                                                            Anything requiring integrity of the author’s source

                                                                                            I’m not sure what the author is getting at here. I think he worries that GitHub may remove some content from your repository while leaving the rest intact, thereby making you an infringer of the LPPL (LaTeX Project Public License), which apparently forbids such partial removal (I haven’t checked the LPPL; I just assume this is true in the following). If GitHub doesn’t notify you of such a partial removal, you should be able to argue successfully that you didn’t know, which should free you of all liability beyond having to remove the content. If GitHub does notify you of the removal (almost certainly the case), you have enough time to remove the content completely. As a side note, in such a case GitHub itself would probably also violate the LPPL, as it distributes the partial repository.

                                                                                            Next one:

                                                                                            This means that repositories from people who last used GitHub before March 2017 are excluded.

                                                                                            Rather not. If you have an account and have not explicitly disagreed with the ToS, you’re in, regardless of whether or not you have used GitHub.

                                                                                            My conclusion: Much ado about nothing. Don’t upload content you don’t own (which you shouldn’t anyway), and you’re fine.

                                                                                            FYI: I’m a law student.