1. 5

    I have been self-hosting for years on a VPS with Postfix and Dovecot, SpamAssassin and OpenDKIM. I do it mainly for two reasons: full control over the process (I make extensive use of Sieve scripts) and learning how the e-mail ecosystem operates. And privacy, especially once I get to move the entire thing into my basement.

    The main obstacle I face is that my e-mail gets classified as spam by large providers (most notably Microsoft-based services, especially outlook.com) without any reason I could identify. I do have a proper PTR reverse DNS record, and I do have working SPF and DKIM. My IP is not blacklisted anywhere. I have come to the conclusion that there’s a policy at Microsoft that says that you’re spam if you’re not a large e-mail provider. For important e-mail, I always have to call or send a chat message to ensure the recipient checks their spam folder.

    1. 2

      I have come to the conclusion that there’s a policy at Microsoft that says that you’re spam if you’re not a large e-mail provider

      This is the sort of thing I always worry about when I contemplate self-hosting.

    1. 2

      If I thought that the advertisers who keep trying to buy space on my blog (for link spam mostly) actually read my blog I might consider doing this.

      1. 3

        I get e-mails like that too. I got one from Casper asking me to write a review. I think I gave them some outrageous number (like $5,000) and never heard back.

        Also the post you linked to from that post about e-mail and small business, I’ve got a pretty similar story:

        https://penguindreams.org/blog/how-google-and-microsoft-made-email-unreliable/

        There are services out there now like Sendgrid and Mailgun to at least help small businesses get mail out without it going to spam, or of course MailChimp for mailing lists. I should really do a part II to that post at some point.

        1. 2

          I run my own mail server and I have seen all the problems described in the post. From a German law perspective, the behaviour shown by Google and Microsoft probably qualifies as illegal under § 4 Nr. 4 UWG (Act against unfair competition, English translation). If anyone reading this runs an e-mail-based business in Germany, you should consider challenging them for the sake of the free e-mail exchange.

      1. 1

        Thank you for the planet. There seem to be about 100 blogs/feeds coming into the planet, but the planet RSS feed is just 100 items, most of which seem to come from just a couple of blogs that don’t have proper timestamps?

        1. 2

          Well spotted.. it wasn’t apparent yesterday but I just fixed an SSL problem and suddenly there are quite a few. I’ll remove any more I spot, but please, feel free to go crazy on pull requests :)

          edit: this is way more broken than I thought. Planet doesn’t seem to do anything about feeds that lack timestamps, which is surprising. Anyone got a recommendation for better software? The main value in the existing thing is the Travis setup and the list of feed URLs.

          edit: ok, I /think/ I’ve got it this time.. some bad settings in there, and squelching untimestamped feeds doesn’t happen after the first time they’re seen, so had to wipe the cache and start again
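
          For reference, the squelching I had in mind is simple in principle. Here is a hedged sketch in Python using only the stdlib, dropping RSS items that carry no pubDate before they reach the merged feed (the element names assume plain RSS 2.0; an Atom feed would check updated instead, and real planet software has to cope with many more feed dialects):

```python
import xml.etree.ElementTree as ET

def drop_untimestamped(rss_xml):
    """Remove <item> elements that lack a <pubDate>, so entries without
    timestamps can't flood the top of the merged planet feed."""
    root = ET.fromstring(rss_xml)
    channel = root.find('channel')
    for item in list(channel.findall('item')):
        if item.find('pubDate') is None:
            channel.remove(item)
    return ET.tostring(root, encoding='unicode')
```

          The important operational detail (the one that bit me) is that this has to be applied on every fetch, not only the first time a feed is seen, or else cached untimestamped entries stick around.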

          1. 1

            I’m tempted to write something better, or at least help improve what you have currently got working :)

            1. 1

              I once authored a planet generator named Uranus, but I don’t really maintain it anymore. It does have the advantage of not having any dependencies other than Ruby, though (no gems, just plain stdlib). There’s another planet generator named Pluto that is still maintained.

          1. 3

            Most of what I write is technical, but there’s just one problem… I usually blog in German. Anyway, here’s the URL:

            https://mg.guelker.eu/

            1. 8

              About analytics: You can do them on the server side by parsing your web logs! That used to be how everyone did it! Google Analytics popularized client side analytics using JavaScript around 2006 or so.

              Unfortunately I feel like a lot of the open source web analytics packages have atrophied from disuse. But I wrote some Python and R scripts to parse access.log and it works pretty well for my purposes.
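
              For my purposes a few dozen lines are enough. A minimal sketch of the idea (assuming the Combined Log Format that Apache and nginx emit by default; the regex only captures the fields a page-view count needs):

```python
import re
from collections import Counter

# Combined Log Format, e.g.:
# 1.2.3.4 - - [10/Jun/2018:13:55:36 +0000] "GET /blog/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def page_views(lines):
    """Count successful GET requests per path."""
    views = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group('method') == 'GET' and m.group('status') == '200':
            views[m.group('path')] += 1
    return views
```

              Anything fancier (referrers, user agents, bot filtering) is just more capture groups and more counters.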

              http://www.oilshell.org/ is basically what this article recommends, although I’m using both client-side and server-side analytics. I can probably get rid of the client-side stuff.

              related: http://bettermotherfuckingwebsite.com/ (I am a fan of narrow columns for readability)

              1. 4

                I agree. I used to use JAWStats, a PHP web app that parsed and displayed the AWStats-generated data files to provide visually appealing statistics, a lot like Google Analytics but entirely server side, with data originating from Apache/nginx log files.

                It’s a shame that it was last worked on in 2009. There was a fork called MAWStats but that hasn’t been updated in four years either :(

                For a while I self-hosted my feed reader and web analytics via paid-for apps, Mint and Fever by Shaun Inman, but those were abandoned in 2016. It seems like all good software ends up dead sooner or later.

                1. 3

                  Maybe the GDPR will give these projects a new lease of life.

                  They are much better for privacy-aware people.

                  1. 2

                    It’s been on my list of projects to attempt for a while, but my static site generator Tapestry takes up most of my spare time.

                2. 4

                  You want GoAccess. Maintained, and looks modern. Example. I’m using it and it has replaced AWStats for me completely.

                  1. 2

                    I currently use GoAccess myself; the only thing that would make the HTML reports better would be a calendar view showing visit counts per day.

                1. 7

                  As usual with decentralized systems, the main problem I had was discovering good feeds. One could find stuff if one knew what one was looking for, but most of the time these feeds only contain the first few lines of an article. And then again, there are other feeds that just post too much, making it impossible to keep up. Not everyone comes to RSS/Atom with a precomposed list of pages and websites they read.

                  These are the “social standards”, which I believe are just as important as the technical standards, and which a post like this one should also clarify.

                  1. 6

                    I agree. Finding good feeds is difficult indeed, but I believe that good content does spread by word of mouth at some point (it may even be word in social media, actually). Feeds that post too much are definitely a problem. Following RSS/Atom feeds of newspapers specifically defeats the purpose: nobody can manage this ridiculous number of posts, often barely categorised. I don’t have a solution for these at hand; this article suggests that the standard should be improved on this point. It might be a good idea to do so.

                    Excerpt feeds I don’t like, because they are hard to search using the feed reader’s search facilities. I however still prefer an excerpt feed over no feed at all, which is why the article mentions this as a possible compromise. The main reason for excerpt feeds appears to be to draw people into the site owner’s Google Analytics.

                    1. 3

                      As far as unmanageably large & diverse sites go, I seem to recall that at The Register you can/could run a search and then get an RSS feed for current and future results of that query. Combined with ways to filter on author etc., that worked a treat.

                    2. 2

                      the main problem I had was discovering good feeds

                      This is why my killer feature (originally of GOOG Reader and now of NewsBlur) is a friends/sharing system. The value of shared content is deeply rooted in discovery of new feeds.

                      feeds only contain the first few lines of an article

                      Modern feed readers generally make it easy to get full articles/stories without a context switch.

                      feeds that just post too much, making it impossible to keep up

                      Powerful filtering is another place where modern readers have innovated. I would definitely check them out, because these are solved problems.

                      1. 2

                        Can you recommend any specific readers?

                        1. 1

                          NewsBlur is pretty great. It’s a hosted service, rather than a local application, but that’s kind of necessary for the whole “sharing” thing.

                          1. 1

                            If you’re an emacs user: elfeed. It has pretty good filtering and each website can be tagged.

                            1. 1

                              I tried that for a while, but eventually I just couldn’t keep up. I never really have the time to read an article when I’m in Emacs, since usually I’m working on something.

                            2. 1

                              I have been quite pleased with NewsBlur. It has the added benefit of being open source, so if it were to disappear (cough, cough, GOOG Reader), it could easily be resurrected.

                              For the social aspect, of course, you might want to poll friends first to see what they are on.

                        1. 5

                          So many comments here, I’m a little overwhelmed. Thanks to everyone <3

                          Something that crossed my mind: maybe it could be possible to join the RSS/Atom efforts with the efforts in the area of decentralized social networks, like Mastodon? Forgive me, I’m not a Mastodon user (yet), but maybe there is some kind of possible integration… Maybe RSS/Atom feeds could be “followed” somehow?

                          1. 2

                            Working on it: https://getstream.io/blog/winds-2-0-its-time-to-revive-rss/ It’s not so easy though; it’s a vicious cycle: fewer people use RSS, so fewer publishers support RSS, so RSS tools degrade in quality, and so on.

                            You wouldn’t believe the number of if statements in the Winds codebase just to make RSS work (ish). The standard isn’t really much of a standard with everyone having small variations. Here’s an example, not all feeds implement the guid properly, so you end up with code like this: https://github.com/GetStream/Winds/blob/master/api/src/parsers/feed.js#L82
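
                            The workaround for the guid problem is conceptually simple. A hedged sketch in Python (Winds itself is JavaScript, and the dict keys here are illustrative): prefer the feed-supplied guid, and fall back to hashing link and title when it is missing:

```python
import hashlib

def entry_id(entry):
    """Prefer a feed-supplied guid; otherwise derive a stable fallback
    identifier from the link and title, since many feeds omit or
    misuse the guid element."""
    guid = entry.get('guid')
    if guid:
        return guid
    basis = (entry.get('link', '') + '\x00' + entry.get('title', '')).encode('utf-8')
    return hashlib.sha1(basis).hexdigest()
```

                            The hash stays stable across fetches, which is the point; it still breaks when a site edits a title, which is exactly the kind of edge case those if statements accumulate around.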

                            1. 1

                              Now, that looks like an interesting project. I have updated my SaveRSS page to include a link to Winds in the RSS clients section. You might also consider linking to the SaveRSS page for arguments on why to use RSS/Atom as a publisher.

                              Personally, the project isn’t for me, though. I’m a happy user of elfeed, but I can absolutely see how your project can benefit the RSS/Atom community.

                              1. 1

                                Dang, this bloatware is pushing 6k stars on GitHub already. Nothing like an RSS reader that combines Electron, Mongo, Algolia, Redis, machine learning (!), and Sendgrid.

                                1. 1

                                  The goal is to build an RSS-powered experience that people will actually want to use. The tech stack is based around the idea of having a large group of people being able to contribute. (We use Go & RocksDB for most of our tech, so it was a very conscious move to use Node & Mongo for Winds to foster wider community adoption.)

                                  1. 1

                                    Makes sense. Thanks for the gracious reply, I feel bad about my grumpy comment.

                              1. 2

                                RSS was a great concept (and appropriate for its time), but was designed by people who didn’t comprehend XML namespaces, instead forcing implementations (both generators and readers) to escape XML and/or HTML tags, which requires multiple passes for generating and parsing feeds - with an intermediate encoding/decoding step (Really Simple?). They purportedly addressed this in RSS 2.0, but if you have a look at their RSS 2.0 example, they still got it wrong, persisting a 1990’s understanding of the web. Although I still use it, I shake my head in disappointment every time I see RSS source. RSS 2.0 should really have been based on something that could be validated, such as XHTML.

                                At this point, it is probably way too late for a comeback, as:

                                1. Social media platforms like Twitter are commonly used as a substitute and have a large hegemony over content.
                                2. Browsers have given up on RSS in favor of their own peculiar readers.
                                3. Google, Microsoft, Yandex and whatever Yahoo is now are pushing for an entirely different system based on extracting information from HTML content via an ever-changing pseudo-ontology that lacks definitions and is inconsistently employed by every practitioner.

                                You could read the above points as things that RSS should be able to overcome. If RSS were indeed to make a comeback, I would hope that in a new “RSS 3.0” incarnation it would satisfy the following criteria:

                                1. Standard comes before implementation (e.g., utilize existing standards).
                                2. Validatable (e.g., employ XML namespaces and utilize an XSD for document validation).
                                3. Human readable (i.e., subset of XHTML or HTML, that can be consistently rendered as in any modern web browser)
                                4. Strict specification (use a well-defined structure with a minimal tag set that prevents multiple interpretations of the specification).

                                I’ll admit, I do not like JSON one bit, because it is antithetical to several, if not all, of the above criteria. However, since a JSON alternative is desired, I would recommend that it be directly based on an XML/HTML version that does satisfy the above criteria. Then a simple XSL (read “standardized”) stylesheet could be employed to generate the equivalent JSON version, satisfying both worlds.

                                1. 9

                                  Doesn’t Atom fulfill your RSS 3 criteria?

                                  1. 2

                                    they still got it wrong, persisting a 1990’s understanding of the web. Although I still use it, I shake my head in disappointment every time I see RSS source. RSS 2.0 should really have been based on something that could be validated, such as XHTML.

                                    Atom does fulfill your second list’s criteria, is often used today in place of RSS, and can even be validated. My article even says that if in doubt, use Atom.
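
                                    For illustration, the required skeleton really is tiny. A sketch with Python’s stdlib (RFC 4287 requires title, id and updated at feed level; everything else is optional, and a real feed would of course add entries):

```python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'

def minimal_atom(title, feed_id, updated):
    """Build a minimal valid Atom feed document: a properly namespaced
    <feed> with the three mandatory feed-level elements."""
    ET.register_namespace('', ATOM_NS)
    feed = ET.Element('{%s}feed' % ATOM_NS)
    for tag, text in (('title', title), ('id', feed_id), ('updated', updated)):
        el = ET.SubElement(feed, '{%s}%s' % (ATOM_NS, tag))
        el.text = text
    return ET.tostring(feed, encoding='unicode')
```

                                    Because the namespace is declared properly, the result can be checked against the RELAX NG schema from RFC 4287 with an ordinary validator, which is exactly the property the RSS 3.0 wish list above asks for.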

                                    Social media platforms like Twitter are commonly used as a substitute and have a large hegemony over content.

                                    The entire point of the site is to set something against this before it is too late. Today, there still are many sites providing feeds, and I do hope that this article will sustain that. To be clear, I don’t advocate leaving social media. All I ask in that article is to provide a feed additionally to your social media presence.

                                    Browsers have given up on RSS in favor of their own peculiar readers.

                                    I’ve actually never used Firefox’s RSS/Atom support, and I don’t believe that browsers are the correct target for RSS/Atom feeds. There are feed reader programs that deal specifically with feeds and they are still being maintained, so I don’t see browsers removing their feed support as problematic.

                                    Google, Microsoft, Yandex and whatever Yahoo is now are pushing for an entirely different system

                                    You listed yourself why it isn’t a real alternative.

                                  1. 4

                                    There’s also feed.json, which serves the same purpose but uses JSON instead of XML:

                                    https://jsonfeed.org

                                    1. 20

                                      In my opinion, jsonfeed is doing active harm. We need standardization, not fragmentation.

                                      1. 2

                                        Well, as long as people are just adding an additional feed (XML/RSS plus JSON), you can have two links in your headers. Over time, all readers will probably add support, and then it shouldn’t matter which format your feed is in. That’s kinda how we got to where we are today.

                                      2. 10

                                        How widespread is support for this in feed readers? RSS and Atom have very broad support among feed readers, so unless there’s a compelling reason, a working and widely supported standard shouldn’t be replaced just as a matter of taste.

                                      1. 2

                                        HTML email is problematic indeed, but entirely opting out of it is difficult. My favourite example is eBay; its automatic emails are just plain impossible to read without parsing the HTML.

                                        What bugs me on the discussion about Efail and HTML email is that it’s always an all-or-nothing discussion. Either you accept reality and live with HTML email or you reject it and go plain text. My opinion is that neither is the correct way. What we need is a new standard. A standard for formatted email that does not expose the difficulties and dangers of HTML email and allows more formatting than plain text. Before someone is going to mention this XKCD I want to point out that HTML email is not even a standard, which is part of the problem.

                                        I envision this new standard such that it allows things like this:

                                        • General markup – bold, italic, underline, etc.
                                        • Inline images based on image data sent with the email (not web images)
                                        • Letterhead with logo and legally required information; many companies try to abuse HTML email for this. If this is properly defined, then terminal clients can detect that information and simply not show it (e.g. omit the logo).

                                        The list is not complete, but there are certainly things that should never be on it. For example, this new standard should not allow embedding of remote resources, for privacy and security reasons. Tracking pixels, for example, should be impossible, and without the ability to “phone home”, attacks like Efail are not possible. Similarly, there’s no reason to allow script execution (like JavaScript) in e-mails.
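
                                        To illustrate the enforcement side, here is a minimal sketch in Python of how a reader for such a standard could validate a message: a strict whitelist where anything unknown rejects the whole message instead of being silently stripped. The element list is made up for the example, not a proposal:

```python
from html.parser import HTMLParser

# Illustrative whitelist: basic markup only, no attributes, hence no
# URLs, no remote resources, no scripts, no tracking pixels.
ALLOWED = {'b', 'i', 'u', 'em', 'strong', 'p', 'br'}

class StrictMailParser(HTMLParser):
    """Reject any element outside the whitelist (or carrying attributes)
    instead of silently stripping it, so senders can't smuggle anything in."""
    def __init__(self):
        super().__init__()
        self.ok = True
    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED or attrs:   # attributes could carry URLs
            self.ok = False

def is_acceptable(body):
    p = StrictMailParser()
    p.feed(body)
    return p.ok
```

                                        The reject-by-default stance is the point: a permissive “strip what we don’t like” sanitizer is exactly what made the Efail markup-splicing tricks possible.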

                                        RFC 3676 (format=flowed) never got widely adopted, and apparently not even Thunderbird gets it right. It also doesn’t address the problem of markup.
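
                                        The core rule of format=flowed is at least easy to state: a line ending in a space flows into the next one. A simplified sketch in Python (deliberately ignoring space-stuffing and quote depth, which is where real implementations tend to go wrong):

```python
def unflow(text):
    """Rejoin format=flowed (RFC 3676) paragraphs: a line whose last
    character is a space is 'flowed' into the following line; a line
    without a trailing space is fixed and ends the paragraph."""
    out = []
    buf = ''
    for line in text.split('\n'):
        buf += line
        if not line.endswith(' '):   # fixed line: paragraph ends here
            out.append(buf)
            buf = ''
    if buf:
        out.append(buf)
    return '\n'.join(out)
```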

                                        1. 3

                                          I don’t know if a new standard would be required, beyond just HTML. Nothing says that user agents must implement the whole HTML spec, including fetching remote resources, CSS, etc. Your stripped-down markup format could be implemented right now as a “stripped-down” HTML renderer. For example, I use Emacs and mu4e to read my email, which calls out to w3m to render the page as slightly prettified text (e.g. emphasis, underlines, etc. work; presumably using ANSI terminal escape codes). There’s no reason that’s limited to text though; I’d imagine it’s safe enough to render anything as long as (a) no external resources are fetched (everything must be included in the email, e.g. as MIME parts or data URIs) and (b) the result is inert (nothing clickable, nothing that interacts with external resources, etc.).

                                          From the sounds of it, this Efail problem would still pose a couple of problems, even if some new restricted markup format were used. Firstly, part of the trick seems to be a general problem with any delimited markup; e.g. one part contains <a href="... whilst another contains >. That would still be a problem with, say, [x](... and ) in markdown. Whilst it can be mitigated by escaping/quoting discipline, as the article mentions, that requires effort for every implementation. The other problem, exfiltration of decrypted data, seems to me like it would still be problematic for plain text. Even if we have an inert, non-clickable format, people will still want to visit URLs sent via email (e.g. password resets, etc.). Even if we show the entire URL, and force the user to copy/paste it by hand, it might not be obvious that decrypted data has been leaked. For example if it’s something machine-generated and nestled inside an innocuous looking parameter of an ‘unsubscribe’ URL. I’m not familiar enough with PGP, etc. to know how difficult it would be to outright forbid such things (e.g. forbidding mixed encrypted/non-encrypted messages entirely)

                                        1. 5

                                          Please note: this is not the final vote. The European Parliament in plenum will later have to vote on the directive as a whole. As of now, it’s neither in force nor even set to enter into force.

                                          Also please note that it’s not a law (i.e. a regulation in EU speak). It’s a directive. It requires member states to make laws in a certain fashion, and more often than not, there can be differences in each member state’s law based on the directive.

                                          And since apparently nobody cares to actually read what has been voted on, the voting document is available online on the EU parliament’s website. Art. 13 is mentioned under CA 14. Though to be honest, I find this document confusing. If someone has a consolidated version available somewhere, a link would be nice.

                                          1. 3

                                            Interesting. But encountering an entire programme written by just one person is a rather rare occurrence nowadays, I think. Does it work on programmes authored by multiple persons, giving a list of all the people who contributed to a given programme? That might be quite interesting for copyright issues.

                                            1. 15

                                              Hey folks,

                                              Jon messaged me a day or two ago. I gave him the standard answer about these sorts of inquiries: I’m happy to run queries that don’t reveal personal info like IPs, browsing, and voting or create “worst-of” leaderboards celebrating most-downvoted users/comments/stories, that sort of thing. I can’t volunteer to write queries for people, but the schema is on GitHub and the logs are MySQL and nginx, so it’s straightforward to do.

                                              A couple of years ago jcs ran some queries for me, and I wanted to continue that, especially as the public stats that answered some popular questions have been gone for a while. It’s useful for transparency and because it’s just fun to investigate interesting questions. I’ve already run a few queries for folks in the chat room (the only one I can remember off the top of my head is how many total stories have been submitted; we passed 40k last month).

                                              I asked Jon to post publicly about this because it sounded like he had significantly more than one question he was idly curious about, to help spread awareness that I’ll run queries like this, and get the community’s thoughts on his queries and the general policy. I’ll add a note to the about page after this discussion.

                                              I’m going offline for a couple hours for a prior commitment before I’ll have a chance to run any of these, but it’ll leave plenty of time for a discussion to get started or folks to think up their own queries to post as comments.

                                              1. 3

                                                Wasn’t the concept of Differential Privacy developed for exactly this purpose: allowing queries over databases containing personal data while maintaining as much privacy as possible? Maybe it could be employed here?

                                                1. 3

                                                  In this particular case I don’t think the counts are actually sensitive, so it’s unclear that applying DP is even necessary. But I’ll ping Frank McSherry who’s one of the primary proponents of DP in academia nowadays and see what he thinks :) Maybe with DP we could extract what is arguably more sensitive information (e.g., by reducing the bin widths).
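
                                                  For counting queries the mechanics are simple. A sketch in Python of the Laplace mechanism, the standard DP building block for counts (the epsilon here is illustrative, not a recommendation):

```python
import random

def laplace_noise(scale):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(true_count, epsilon=1.0):
    """A counting query has sensitivity 1 (one vote changes the result
    by at most 1), so adding Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy per log record."""
    return true_count + laplace_noise(1.0 / epsilon)
```

                                                  The subtlety is never the noise itself but the bookkeeping around it: what the unit of privacy is (a record, or a person) and how the budget is spent across repeated queries, which is exactly what is worth getting an expert opinion on.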

                                                  1. 1

                                                    That seems doable, but I have the strong suspicion that if I wing it I’ll screw something up and leak personal info. So hopefully Frank can chime in with some good advice.

                                                    1. 4

                                                      I’m here! :D I’m writing up some text in more of a long-form “blog post” format, to try and explain what is what without the constraints of trying to fit everything in-line here. But, some high-level points:

                                                      1. Operationally queries one and two are pretty easy to pull off with differential privacy (the “how many votes per article” and “how many votes per user” queries). I’ve got some code that does that, and depending on the scale you could even just use it, in principle (or if you only have a SQL interface to the logs, we may need to bang on them).

                                                      2. The third query is possibly not complicated, depending on my understanding of it. My sed-fu is weak, but to the extent that the query asks only for the counts of pre-enumerable strings (e.g. POST /stories/X/upvote) it should be good. If the query needs to discover what strings are important (e.g. POST /stories/X/*) then there is a bit of a problem. It is still tractable, but perhaps more of a mess than you want to casually wade into.

                                                      3. Probably the biggest question mark is the privacy guarantee you want to provide. I understand that you have a relatively spartan privacy policy, which is all good, but clearly you have some interest in doing right by the users with respect to their sensitive data. The most casual privacy guarantee you can give is probably “mask the presence/absence of individual votes/views”, which would be “per log-record privacy”. You might want to provide a stronger guarantee of “mask the presence/absence of entire individuals”, which could have a substantial deleterious impact on the analyses; I’m not super-clear on which guarantee you would prefer, or even on the best rhetorical path to take to discover which one you prefer.
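
                                                      To make the third query’s “pre-enumerable strings” requirement concrete, a Python sketch (the endpoint patterns here are hypothetical, not lobste.rs’ actual routes): the key property is that the set of counted strings is fixed before the sensitive log is ever read, so nothing is discovered from the data itself:

```python
import re
from collections import Counter

# Patterns are enumerated up front: we only count known endpoint shapes,
# never discover new strings from the (sensitive) log itself.
PATTERNS = {
    'upvote':  re.compile(r'"POST /stories/\S+/upvote '),
    'comment': re.compile(r'"POST /comments '),
}

def count_endpoints(lines):
    """Count log lines matching each pre-enumerated endpoint pattern."""
    counts = Counter()
    for line in lines:
        for name, pat in PATTERNS.items():
            if pat.search(line):
                counts[name] += 1
    return counts
```

                                                      A query shaped like POST /stories/X/* instead, where the X values must be discovered from the log, is the messier case described above.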

                                                      Anyhow, I’m typing things up right now and should have a post with example code, data, analyses, etc. pretty soon. At that point, it should be clearer to say “ah, well let’s just do X then” or “I didn’t realize Y; that’s fatal, unfortunately”.

                                                      EDIT: I’ve put up a preliminary version of a post under the idea that info sooner rather than later is more helpful. I’m afraid I got pretty excited about the first two questions and didn’t really do much about the third. The short version of the post is that one could probably release information that leads to pretty accurate distributional information about the multiplicities of votes, by articles and by users, without all of the binning. That could be handy as (elsewhere in the thread) it looks like binning coarsely kills much of the information. Take a read and I’ll iterate on the clarity of the post too.

                                                      1. 1

                                                        To follow up briefly on this: yes, it would be useful to avoid the binning so that we could feed more data to whatever regression we end up using to approximate the underlying distribution.

                                                  2. 2

                                                    For those who are curious, I’ve started implementing the workload generator here. It currently mostly does random requests, but once I have the statistics I’ll plug them in and it should generate more representative traffic patterns. It does require a minor [patch](https://github.com/jonhoo/trawler/blob/master/lobsters.diff) to the upstream lobste.rs codebase, but that’s mostly to enable automation.

                                                  1. 1

                                                    I’m sorry, but this looks like copyright infringement to me if the author doesn’t have Nintendo’s consent to publish this.

                                                    1. 6

                                                      It’s reverse-engineered code, a legal gray area. Emulators would be in the same legal gray area if not for the precedent Sony vs. Bleem set.

                                                      1. 4

                                                        I’m a law student from Europe, specifically Germany, so I can’t say anything about the legal situation in the U.S.A. Maybe I should have clarified that. For Germany, emulators operate under the exemption for private copies (§ 53 German Copyright Act, and the related § 44a for the ephemeral copy in RAM).

                                                        This however does assume that you obtain your emulatable software yourself. It does not cover purchase of software ripped by anybody other than you. Specifically, § 53 of the German Copyright Act does not permit publishing anything you ripped. There are some unhealthy paragraphs (which I’d like not to be there) on the prohibition of DRM circumvention in the law as well. §§ 95a ff. forbid circumvention of DRM (making the private-copy exemption pretty useless for DRM’ed content), culminating in a criminal-law paragraph, § 108b, that penalises circumvention of DRM under certain conditions. I find it cynical that § 95b(1)(Nr.6)(a) specifically allows DRM circumvention under the premise that your private copy is on paper. That being said, I have no idea whether whatever Nintendo used or did not use on the Pokémon game cartridges counts as DRM or not.

                                                        If you did not only rip the software but also modified it, you are probably in breach of another paragraph as well, because § 69c(Nr.2) makes modification of computer software dependent on the consent of the rights owner (this is different from modification of all other kinds of copyright-protected works, where modification does not require consent, but only publishing the modification does). There might be some more relevant sections; all of the above is what I thought of off the top of my head.

                                                        The German Copyright Act is for the most part based on the EU Information Society Directive 2001/29/EC, which enables me to say that the situation is probably very similar in other EU member states.

                                                        At least in Europe, I thus conclude that publishing software ripped from cartridges on the Internet is illegal. What about people downloading the software? That’s only illegal if the repository is “clearly illegal” (original wording of § 53). Given my lengthy legal explanation above, I wouldn’t say it’s “clearly” illegal, so users are probably fine. OTOH, since I have now given these explanations, to anyone who made it this far in this post it may now be “clearly illegal”. So you must decide for yourself. The familiar “ALL THE WAREZ FOR FREE!!” site, however, is probably “clearly” illegal.

                                                        1. 2

                                                          It’s not ripped/modified software though. It’s hand written code which used Pokémon Red as a reference. If you want insight into their reverse-engineering process, look at pokeruby. Right now pokeruby falls into the “clearly illegal” category (since it’s full of raw disassembly), but once it is finished, it will be all hand-written C code.

                                                          I’m not saying it’s legal, I’m just saying it’s a gray area.

                                                          1. 1

                                                            It’s hand written code which used Pokémon Red as a reference.

                                                            That’s interesting. I’m sorry that I didn’t immediately understand. In that case, the judicial outcome depends on what you mean by “reference”. The process here appears to have been that the author walked through all the machine code and then produced a programme that does exactly the same as the machine code he looked at. For that matter, he could just as well have written the programme in any other language.

                                                            Taking something as inspiration is of course not covered by copyright law in any way. If a programme is reproduced in all its instructions and structure however, I would qualify this as a copy. It’s an interesting issue about which I need to think more deeply. It is a question of the definition of “copy” then. And if it isn’t a copy, it might still be a “modification”. Both actions are reserved for the rights owner in case of computer programmes.

                                                            On a side note: I haven’t checked, but if the author uses the original Pokémon graphics, then we arrive at a copyright infringement much more easily than with the code.

                                                            Edit: Decompilation is specifically regulated too, and usually forbidden (§ 69e). :-)

                                                            1. 1

                                                              if the author uses the original Pokémon graphics, then we arrive at a copyright infringement much more easily than with the code

                                                              I thought the Internet made this part of copyright law essentially meaningless? As far as images go, anyway. Sites like Serebii and Bulbapedia host these images, not to mention all of the screencaps and whatnot that are posted on Twitter/Reddit/4chan/whatever. It would be kind of weird to go after pokered for hosting those sprites when there are tons of other people/businesses who do the same thing.

                                                              1. 1

                                                                the judicial outcome depends on what you mean by “reference”. The process here appears to have been that the author walked through all the machine code and then produced a programme that does exactly the same as the machine code he looked at. For that matter, he could just as well have written the programme in any other language.

                                                                From what I can tell, a base ROM was never in the repository. It was user-supplied and used to build the entire pokered repo for a while, until all code and assets had been dumped into files that could be used to rebuild the ROM from just the pokered repo itself. See an early README mentioning base ROMs being required: https://github.com/pret/pokered/blob/c07a745e36cf3b3d07bbf7c2d3c897ddd5127200/README#L3-L16

                                                                The translation process was probably just using a tool to disassemble the code and labelling the pieces. This is most likely an act of “decompilation” as 2001/29/EC understands it. In all likelihood, the original was programmed in assembly as well; C was very rare in the Game Boy days. The author could not have “written the programme in any other language”: the few compilers that do exist for the Game Boy yield code that is unsuitable for the platform’s constraints. Furthermore, if 1:1 identical binaries are your goal, you cannot just rewrite it in C when the original wasn’t; there’s no realistic way to get identical results.

                                                                pokeruby first disassembled Pokémon Ruby and then added a conversion pass to C with the same goal, which is only possible because they have also unearthed the correct compiler.

                                                                And yes, by definition of having an identical ROM, there are all of the original assets, namely player-visible text, graphics, sound. They’re shipped as part of the repository.

                                                                (The judicial outcome is a total crapshoot anyway. Copyright law in the context of software makes for surprising decisions one after another. It’s simply unsuitable, and legal doctrine in other countries has pointed this out incessantly, but due to international pressure from the United States of America, it happened anyway.)

                                                        2. 3

                                                          The project has gotten rather large. It seems that, while the legal situation is indeed far from clear, Nintendo and The Pokémon Company International are leaving this repo (and the other github/pret efforts) alone.

                                                        1. 2

                                                          I do know the distraction problem, but is it really necessary to solve it with a programme? Can’t you just pull the Ethernet cable and/or disconnect from wifi? After all, the main source of distraction is notifications from all kinds of programs, and they are effectively removed by that. If you need the documentation for your programming language or library, there’s often a way to download it and read it offline, and some languages offer documentation on the command line (like Ruby’s ri). I often make use of this when travelling by train, where Internet access tends to be wonky.

                                                          Side note: For distractionless writing of anything that is not source code, I have – not kidding – switched to a physical, mechanical typewriter. It’s a relief. No distraction, just you, the keys, and the paper. For source code that doesn’t work due to lack of special keys, but otherwise it’s great if you really want to concentrate on a specific topic and just write.

                                                          1. 2

                                                            The all or none approach is a bit more difficult when you have actual work that needs to happen on GitHub or some remote server.

                                                            Though I’ve been a bit more honest about the ratio of that kind of work. Most work can happen offline

                                                            (I have a Pomera DM200 for typing up paragraphs in a concentrated way. Not perfect but good enough for my needs)

                                                            1. 1

                                                              can you recommend a good typewriter that’s old enough to be simple and durable, but also new enough to be easy to use?

                                                              1. 2

                                                                Um … any manual typewriter that still works? They’re not complicated to use …

                                                                1. 1

                                                                  I have helped in courses on C programming, and when I explained the difference in newlines between Linux and Windows, I used to bring along a typewriter so the students could see what a “Carriage Return” really is. Of course, I left the typewriter available for playing around with, and from that I can definitely tell you that a typewriter is not self-explanatory anymore. By far the most common question I was asked was:

                                                                  “How do I advance to a new line?”

                                                                  It’s not obvious if you’re used to having an Enter key.

                                                                2. 2

                                                                  Any mechanical typewriter that was manufactured before 1970 should be good and durable; after that date quality appears to decline, and for electric/electronic typewriters it’s much more difficult to repair things if they break. I’m happily typing on a 1960s Olympia SM9, but really, take a look at eBay, your local antiques shop or similar and just buy one that looks nice to you. You probably shouldn’t start with a “Standard” (i.e. full-size) typewriter, because they’re very heavy and hard to sell again. Don’t worry if the ribbon is dried out; it’s easy to get replacement ribbons, e.g. on Amazon. If everything else works, the typewriter is probably fine.

                                                                  Also, don’t go with extremely old models (pre 1920) if you want to actually use them and not just look at them.

                                                                  If you handle your machine with care (never clean it with WD40!), it will last decades, as they already have. Apart from my Olympia I have a completely functional 1930s Continental typewriter that I occasionally type on, but it requires much more pressure on the keys, which I find uncomfortable.

                                                                  Lobsters isn’t a typewriter community (yet), so I’ll leave it at that. You might want to register at a typewriter forum like this one if you have further questions, as they can be answered there much better and by more competent people than me.

                                                              1. 2

                                                                There’s no need to update dynamic DNS via a script. BIND has native dynamic DNS capabilities by means of the RFC 2136 DNS UPDATE command, which uses (symmetric) crypto so you can execute it from your local connection. Back in 2015, I wrote a blog post on that approach, but it’s in German.
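                                                                For anyone who doesn’t read German, the RFC 2136 approach boils down to three steps: generate a TSIG key, allow that key to update the zone in BIND’s config, and send updates with nsupdate. A rough sketch follows; the key name, zone, server, and IP address are all placeholders, and `tsig-keygen` assumes a reasonably recent BIND 9:

```shell
# 1. Generate a TSIG key (a shared symmetric secret):
tsig-keygen -a hmac-sha256 home-ddns > home-ddns.key

# 2. On the BIND side, include the key and allow it to update the zone:
#      include "/etc/bind/home-ddns.key";
#      zone "dyn.example.org" {
#          type master;
#          file "dyn.example.org.zone";
#          update-policy { grant home-ddns zonesub ANY; };
#      };

# 3. On the client, replace the A record with the current public IP:
nsupdate -k home-ddns.key <<'EOF'
server ns1.example.org
zone dyn.example.org
update delete home.dyn.example.org. A
update add home.dyn.example.org. 300 A 203.0.113.42
send
EOF
```

The nice part is that the TSIG key authenticates the update cryptographically, so no web-based dynamic DNS service or custom script on the server is needed.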

                                                                On the topic of running things from home: I’m very much in favour of it for control and privacy reasons, but there’s a major bummer for me. My ISP prohibits running servers from home in its ToS; you need to upgrade to a “business account” if you want to do that, and the business account comes with a static IP address anyway, so there’s not much reason for a complex dynamic DNS setup. As a result, I currently run my website on a simple VPS.

                                                                1. 9

                                                                  Hah, I was actually curious whether AST will make a move. Good to see he did.

                                                                  Still, it’s sad that he doesn’t seem to care about ME.

                                                                  1. 7

                                                                    Whether he cares about ME is irrelevant here. By releasing the software under most (all?) free software and open source licenses, you forfeit the right to object even if the code is used to trigger a WMD; with non-copyleft licenses you accept that you may never even see the changes to the code. That’s the beauty of liberal software licenses :^)

                                                                    All that he had asked for is a bit of courtesy.

                                                                    1. 4

                                                                      AFAIK, this courtesy is actually required by the BSD license, so it’s even worse, as Intel loses on legal grounds here as well.

                                                                      1. 5

                                                                        No, it is not; hence the open letter. You are most likely thinking of the original BSD license, which contained the so-called advertising clause.

                                                                        1. 5

                                                                          Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

                                                                          http://git.minix3.org/index.cgi?p=minix.git;a=blob;f=LICENSE;h=a119efa5f44dc93086bc34e7c95f10ed55b6401f;hb=HEAD

                                                                          1. 9

                                                                            Correct. The license requires Intel to reproduce what’s mentioned in the parent comment. The distribution of Minix as part of the IME is a “redistribution in binary form” (i.e., compiled code). Intel could have placed the parts mentioned in the license into those small paper booklets that usually accompany hardware, but as far as I can see, they haven’t done so. That is, Intel is breaching the BSD license Minix is distributed under.

                                                                            There’s no clause in the BSD license to inform Mr. Tanenbaum about the use of the software, though. That’s something he may complain about as lack of courtesy, but it’s not a legal requirement.

                                                                            What’s the consequence of the license breach? I can only speak for German law, but the BSD license does not include an auto-termination clause like the GPL does, so the license grant remains in place for the moment. The copyright holder (according to the link above, this is the Vrije Universiteit, Amsterdam) may demand compensation or acknowledgement (i.e. fulfilment of the contract). Given the scale of the breach (Minix is used in countless units of Intel’s hardware, distributed all over the globe by now), it might even be able to revoke the license grant, effectively stopping Intel from selling any processor containing the then-unlicensed Minix. So, if you ever felt like the IME should be removed from this world, talk to the Vrije Universiteit and convince them to sue Intel over the BSD license breach.

                                                                            That’s just my understanding of the things, but I’m pretty confident it’s correct (I’m a law student).

                                                                            1. 3

                                                                              It takes special skill to break a BSD license, congrats Intel.

                                                                              1. 5

                                                                                Actually, they may have a secret contract with the University of Amsterdam that has different conditions. But that we don’t know.

                                                                                1. 2

                                                                                  Judging from the text, doesn’t seem AST is aware of it.

                                                                                  1. 2

                                                                                    University of Amsterdam (UvA) is not the Vrije University Amsterdam (VU). AST is a professor at VU.

                                                                              2. 1

                                                                                I’ve read the license - thanks! :^)

                                                                                The software’s on their chip and they distribute the hardware so I’m not sure that actually applies - I’m not a lawyer, though.

                                                                                1. 5

                                                                                  Are you saying that if you ship the product in hardware form, you don’t distribute software that it runs? I wonder why all those PC vendors were paying fees to Microsoft for so long.

                                                                                  1. 2

                                                                                    For the license - not the software

                                                                                    1. 3

                                                                                      Yes, software is licensed. It doesn’t mean that if you sell hardware running software, you can violate that software’s license.

                                                                                  2. 3

                                                                                    So, they distribute a binary form of the OS.

                                                                                    1. 4

                                                                                      This is the “tivoization” situation that the GPLv3 was specifically created to address (and the BSD licence was not specifically updated to address).

                                                                                      1. 2

                                                                                        No, it was created to address not being able to modify the version they ship. Hardware vendors shipping GPLv2 software still have to follow the license terms and release source code. It’s right in the article you linked to.

                                                                                        BSD license says that binary distribution requires mentioning copyright license terms in the documentation, so Intel should follow it.

                                                                                        1. 3

                                                                                          Documentation or other materials. Does including a CREDITS file in the firmware count? (For that matter, Intel only sells the chipset to other vendors, not end users, so maybe it’s in the manufacturer docs? Maybe they’re to blame for not providing notice?)

                                                                                          1. 3

                                                                                            You have a point with the manufacturers being in-between Intel and the end users that I didn’t see in my above comment, but the outcome is similar. Intel redistributes Minix to the manufacturers, which then redistribute it to the end-users. Assuming Intel properly acknowledges things in the manufacturer’s docs, it’d then be the manufacturers that were in breach of the BSD license. Makes suing more work because you need to sue all the manufacturers, but it’s still illegal to not include the acknowledgements the BSD license demands.

                                                                                            Edit:

                                                                                            Does including a CREDITS file in the firmware count?

                                                                                            No. “Acknowledging” is something that needs to be done in a way the person that receives the software can actually take notice of.

                                                                                            1. 2

                                                                                              The minix license doesn’t use the word “acknowledging” so that’s not relevant.

                                                                                              1. 2

                                                                                You’re correct, my bad. But “reproduce the above copyright notice” etc. aims at the same thing. Any sensible interpretation of the BSD license’s wording has to come to the conclusion that the recipients of the distribution must be able to view the parts of the license text mentioned, because otherwise the clause would be worthless.

                                                                                      2. 1

                                                                                        If they don’t distribute that copyright notice (I can’t remember last seeing any documentation coming directly from Intel as I always buy pre-assembled hardware) and your reasoning is correct, then they ought to fix it and include it somewhere.

                                                                                        However, the sub-thread started by @pkubaj is about being courteous, i.e. informing the original author about the fact that you are using their software - MINIX’s license does not have that requirement.

                                                                            2. 7

                                                                              I think he is just happy he has a large company using minix.

                                                                              1. 5

                                                                                Still, it’s sad that he doesn’t seem to care about ME.

                                                                                Or just refrains from fighting a losing battle? It’s not like governments would give up on spying on and controlling us all.

                                                                                1. 6

                                                                                  Do you have a cohesive argument behind that or are you just being negative?

                                                                                  First off, governments aren’t using IME for dragnet surveillance. They (almost certainly) have some 0days, but they aren’t going to burn them on low-value targets like you or me. They pose a giant risk to us because they’ll eventually be used in general-purpose malware, but the government wouldn’t actually fight much (or maybe at all, publicly) to keep IME.

                                                                                  Second off, security engineering is a sub-branch of economics. Arguments of the form “the government can hack anyone, just give up” are worthless. Defenders currently have the opportunity to make attacking orders of magnitude more expensive, for very little cost. We’re not even close to any diminishing returns falloff when it comes to security expenditures. While it’s technically true that the government (or any other well-funded attacker) could probably own any given consumer device that exists right now, it might cost them millions of dollars to do it (and then they have only a few days/weeks to keep using the exploit).

                                                                                  By just getting everyday people to adopt marginally better security practices, we can make dragnet surveillance infeasibly expensive and reduce damage from non-governmental sources. This is the primary goal for now. An important part of “marginally better security” is getting people to stop buying things that are intentionally backdoored.

                                                                                  1. 2

                                                                                    Do you have a cohesive argument behind that or are you just being negative?

                                                                                    Behind what? The idea that governments won’t give up on spying on us? Well, it’s quite simple. Police states have happened all throughout history, governments really, really want absolute power over us, and they’re free to work towards it in any way they can… so they will.

                                                                                    They (almost certainly) have some 0days, but they aren’t going to burn them on low-value targets like you or me.

                                                                                    Sure, but do they even need 0days if they have everyone ME’d?

                                                                                    They pose a giant risk to us because they’ll eventually be used in general-purpose malware

                                                                                    Yeah, that’s a problem too!

                                                                                    Defenders currently have the opportunity to make attacking orders of magnitude more expensive, for very little cost. [..] An important part of “marginally better security” is getting people to stop buying things that are intentionally backdoored

                                                                                    If you mean using completely “libre” hardware and software, that’s just not feasible for anyone who wants to get shit done in the real world. You need the best tools for your job, and you need things to Just Work.

                                                                                    By just getting everyday people do adopt marginally better security practices, we can make dragnet surveillance infeasibly expensive and reduce damage from non-governmental sources.

                                                                                    “Just”? :) I’m not saying we should all give up, but it’s an uphill battle.

                                                                                    For example, the blind masses are eagerly adopting Face ID, and pretty soon you won’t be able to get a high-end mobile phone without something like it.

                                                                                    People are still happily adopting Google Fiber, without thinking about why a company like Google might want to enter the ISP business.

                                                                                    And maybe most disgustingly and bafflingly of all, vast hordes of Useful Idiots are working hard to prevent the truth from spreading - either as a fun little hobby, or a full-time job.

                                                                                  2. 4

                                                                                    It reads to me like he just doesn’t want to admit that he’s wrong about the BSD license “providing the maximum amount of freedom to potential users”. Having a secret un-auditable, un-modifiable OS running at a deeper level than the OS you actually choose to run is the opposite of user freedom; it’s delusional to think this is a good thing from the perspective of the users.

                                                                                    1. 2

                                                                                      And the BSD code supported that by making their secret box more reliable and cheaper to develop.

                                                                                    2. 3

                                                                                      Oh, it’s still not lost. ME_cleaner is getting better, Google is getting into it with NERF, Coreboot works pretty well on many newish boards and on top of that, there’s Talos.

                                                                                    3. 2

                                                                                      He posted an update in which he says he doesn’t like IME.

                                                                                    1. 4

                                                                                      Your docs mention that on POSIX systems, paths might not be valid UTF-8 (or any single encoding), but it’s not clear to me what Pathie does in such a situation: are paths containing invalid UTF-8 inaccessible? Can you read them from the OS but not construct them yourself?

                                                                                      Your docs also say that Windows uses UTF-16LE, which is not strictly true: in the same way that POSIX paths are a bucket of bytes and not necessarily valid UTF-8, Windows paths are a bucket of uint16_ts and not necessarily valid UTF-16 (in particular, they can have lone surrogates that do not form a surrogate pair, or values that are not assigned in the Unicode database). How does Pathie interact with such malformed paths?

                                                                                      Lastly, macOS: as your documentation points out macOS does have an enforced filename encoding, but it also has an enforced normalisation (at least for HFS+ volumes). That means your application can create a string, create a file with that name, then readdir() the directory containing that file and none of the returned directory entries will byte-for-byte match the string you started with. Does that affect Pathie’s operation?

                                                                                      1. 2

                                                                                        Your docs mention that on POSIX systems, paths might not be valid UTF-8 (or any single encoding), but it’s not clear to me what Pathie does in such a situation: are paths containing invalid UTF-8 inaccessible?

                                                                                        First off, Pathie does not assume POSIX paths are UTF-8, because that isn’t specified. Unless you compile Pathie with ASSUME_UTF8_ON_UNIX, it takes the encoding information from the environment via the nl_langinfo(3) function called with CODESET as the parameter (which is why you need to initialise your locale on Linux systems).

                                                                                        are paths containing invalid UTF-8 inaccessible? Can you read them from the OS but not construct them yourself?

                                                                                        In the case of a path with invalid characters in the locale’s encoding (e.g., invalid UTF-8 on most modern Linuxes), you’ll get an exception when trying to read such a path from the filesystem, because iconv(3) fails with EILSEQ (which Pathie transforms into a proper C++ exception). Nor can you construct paths containing invalid characters yourself; you will receive the same exception. I’ll make this clearer in the docs.

                                                                                        Your docs also say that Windows uses UTF-16LE, which is not strictly true:

                                                                                        Paths in a valid encoding are UTF-16LE. Broken path encodings may be anything, and that is nothing one can make assumptions about. Again, you’ll receive an exception when you encounter them (because the underlying WideCharToMultiByte() function from the Win32 API fails).

                                                                                        (in particular, they can have lone surrogates that do not form a surrogate pair, or values that are not assigned in the Unicode database)

                                                                                        I was not aware of that. Do you have a link with explanations, ideally on MSDN?

                                                                                        Lastly, macOS: as your documentation points out macOS does have an enforced filename encoding, but it also has an enforced normalisation (at least for HFS+ volumes)

                                                                                        macOS is not officially supported by Pathie (which is stated at the top of the README), simply because I don’t have a Mac to test on.

                                                                                        That means your application can create a string, create a file with that name, then readdir() the directory containing that file and none of the returned directory entries will byte-for-byte match the string you started with. Does that affect Pathie’s operation?

                                                                                        It shouldn’t affect Pathie’s operation itself. Pathie will simply pass through what the filesystem gives it; since on macOS no conversion of path encodings happens, these normalised sequences are handed through to the application that uses Pathie.

                                                                                        Thanks for the feedback!

                                                                                        1. 3

                                                                                          Do you have a link with explanations, ideally on MSDN?

                                                                                          Unfortunately, I can’t find a smoking-gun writeup on MSDN. However, in my searching, I did find:

                                                                                          • Scheme48 has a special OS String type, and motivates it saying “On Windows, unpaired UTF-16 surrogates are admissible in encodings, and no lossless text decoding for them exists.”
                                                                                          • Racket’s encoding conversion functions include special “platform-UTF-8” and “platform-UTF-16” encodings: “On Windows, the input can include UTF-16 code units that are unpaired surrogates…”
                                                                                          • Rust also includes a special OsString type: “On Windows, strings are often arbitrary sequences of non-zero 16-bit values, interpreted as UTF-16 when it is valid to do so.”
                                                                                          • I found the Rust ticket that introduced the OsString type, which includes a (Rust) test case. One of the Rust devs dug up an MSDN page that says “…the file system treats path and file names as an opaque sequence of WCHARs.”
                                                                                          • That issue also linked to a report of UTF-16-invalid filenames being found in the wild in somebody’s Recycle Bin.
                                                                                          1. 3

                                                                                            It’s a problem of enforcement. Rust uses WTF-8 as its internal encoding to fix that.

                                                                                            https://simonsapin.github.io/wtf-8/

                                                                                            1. 1

                                                                                              An interesting read, thank you for the pointer. I’ll see if I can adapt Pathie accordingly, but until now it has done the job for me (and it’s mostly a library I use for my own projects).

                                                                                            2. 1

                                                                                              Thanks!

                                                                                        1. 1

                                                                                          The one question I have about any cross-platform GUI library is whether it uses the native platform APIs or does its own rendering, usually by means of OpenGL (a.k.a. an “immediate-mode GUI”). I didn’t find an answer on the website. So far, I know of exactly one real cross-platform C++ GUI library that uses the native APIs, which is wxWidgets. Most libraries I have encountered promise cross-platform support and then use the immediate-mode approach, which will never look really native and is a battery killer in my experience.

                                                                                          1. 2

                                                                                            Looking at the source for the graphics class and the button class, it looks like they’re drawing their own widgets, either via Windows APIs I’m not familiar with or via Xlib (the X11 drawing library). Xlib is an interesting choice; most GUI libraries I’ve seen go for OpenGL, which gives you a lot more control and isn’t slated to be replaced by Wayland. To your point, it might be friendlier on battery than an OpenGL implementation, since it’s not doing any drawing in immediate mode (and instead looks to be repainting into X11 buffers only when needed).

                                                                                            Personally, I’ve given up trying to make the GUIs I work on feel native, especially since the GUIs I’m writing are usually for internal tools or for my own consumption. I default to using Dear ImGui, which makes GUIs that are very utilitarian and look nothing like native, and which is super, super nice to use as a programmer.

                                                                                            1. 2

                                                                                              My solution is to write a native UI for each platform you care about. For me, this basically means I’ll target Windows and keep a rudimentary GTK# frontend working, primarily so logic can be split out from views where reasonable.

                                                                                              I’m a big stickler for things feeling like they belong to the platform. Otherwise, why not write a web app instead?

                                                                                              1. 1

                                                                                                I think it depends on the target audience; usually, I’m writing a GUI for a piece of C++ code that controls a robot or a geometry system or something, and the intended users are either me or the employees of a company I’m consulting for. Writing a web frontend for some chunk of native code is irritating – I either have to jam a web server into a C++ binary, or create wrappers for some scripting language – and using a native toolkit isn’t that important to me, especially because the applications I’m writing usually have some weird widgets in them anyway (2D sliders, or line graphs, or something).

                                                                                                In general, I think that writing a very simple UI can really aid in debugging complex control systems, and I just pick the tool that makes it as easy as possible to do so. In my experience, that’s an immediate-mode GUI, without callbacks or a dedicated event loop or anything.