1. 32

    exFAT implementations exist, but because of Microsoft’s patents they can’t be included by default in a lot of software.

    I’m not 100% opposed to software patents, but man, some of them should be expunged for being just too obvious. For example, this is a patent covering exFAT: Quick Filename Lookup Using Hash

    It literally describes a linear directory scan that uses a hash to avoid string comparisons. As in: hash a filename, read each directory entry’s stored hash, compare the hashes, and only on a hash match compare the full string. This is basic computer science. It’s like patenting addition. It’s bizarre.
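    For the curious, the claimed invention fits in a dozen lines. A rough sketch (the hash function and entry layout here are illustrative, not exFAT’s actual on-disk format or NameHash algorithm):

```python
# Sketch of the patented lookup: store a small hash next to each directory
# entry, compare hashes during the linear scan, and only do the expensive
# full string comparison when the hashes match.
# (Illustrative only -- not exFAT's actual format or hash function.)

def name_hash(name: str) -> int:
    """A cheap 16-bit hash over the uppercased filename."""
    h = 0
    for ch in name.upper():  # directory lookups are case-insensitive
        h = (((h << 1) | (h >> 15)) + ord(ch)) & 0xFFFF
    return h

class DirEntry:
    def __init__(self, name: str):
        self.name = name
        self.hash = name_hash(name)  # precomputed and stored with the entry

def lookup(entries, target: str):
    th = name_hash(target)
    for e in entries:
        if e.hash != th:
            continue                           # cheap reject, no string compare
        if e.name.upper() == target.upper():   # hashes matched, confirm the name
            return e
    return None
```

    That really is the whole idea: a linear scan where a stored hash short-circuits most of the string comparisons.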

    (Of course, the University of Texas has a trademark on the color orange and Ohio State has a trademark on the word “the” so who the(R) hell knows what’s going on anymore.)

    1. 21

      Trademarks, copyright, and patents are all totally distinct things. Trademark is probably honestly the least ridiculous one, because it’s restricted to, y’know, use in trade and must be actively used and protected. UT’s trademarked orange, for instance, means Texas A&M can’t market themselves with the same color orange, but it would have no applicability should the European wireless provider Orange decide to open operations in the US. Similarly, The™ Ohio State University just means you won’t be seeing ads for The Penn State University any time soon (though that’s certainly not the only, or best, reason).

      All of this should be fairly obvious since Ohio State has filed exactly no lawsuits against anyone audacious enough to use a definite article.

      Copyright, on the other hand, applies automatically and universally, but only to a specific creative work. Nobody can copyright the color orange, but I have copyright on this comment simply by virtue of exercising a modicum of creativity in writing it. If you copy it without my permission, you’re violating my copyright (though I’d have a hard time proving any damages, and I may be implicitly giving fairly broad permissions by posting it on lobste.rs). The biggest problems with copyright are that corporations can hold it and it lasts far too long (thanks and go fuck yourself, Disney).

      Patents are the big bad guys of IP law. Like trademark they don’t apply automatically and like trademark they’re relatively broad (they apply to “use of the invention”, to be construed however a court chooses to construe that), but unlike trademark and like copyright, they apply universally, and do not need to be used (thus enabling the existence of “patent trolls” who hold patents with neither ability nor intent to execute the inventions described therein). If UT had a patent on the color orange, they could successfully sue Orange S.A., Crayola, Tropicana, and the tiny outfit in Seattle that lined my backpack with orange corduroy when they made it. (IMO, most software patents make about this much sense.) Patents are supposed to cover “novel inventions”, but—as you note—software patents in particular underrun that benchmark constantly and egregiously.

      In my opinion, patents in general have not been performing their desired function of stimulating inventiveness for decades now and should be repealed altogether, but software patents are without question by far the worst offenders.

      (Not a lawyer, this comment is extremely US-centric, related disclaimers, etc.)

      1. 4

        There is a huge world outside the software bubble where the patent system (even with its well-publicized drawbacks) is crucial. For example, most material/physical-world engineering, where the system works largely as intended.

        1. 1

          Do you have any examples or reputable documentation? I don’t doubt you, and that’s well outside my experience, but I haven’t observed it—just stuff like Volvo subverting the intent of the patent system by freely licensing three-point seatbelts.

          1. 4

            What kind of documentation do you mean? There are gazillions of actual, meaningful patents that were granted, let their inventors recoup their R&D costs and earn a profit, and then expired. You don’t hear about those because “everything is normal and works as intended” never makes a headline.

            As an example, this one you certainly know about.

            1. 2

              My domain is largely software, but the IP lawyers I talk to who do patents in biotech are of the opinion that it’s a pretty negative system in that domain as well, benefitting the rich incumbents at the expense of everyone else.

        2. 4

          Does their commitment to including exFAT in Open Invention Network help with that situation?

          1. 1

            I mean…it shouldn’t have to. But yeah, probably.

          2. 4

            Sometimes I hope that IP law reaches such an apex of absurdity that all of it becomes unenforceable.

          1. 5

            I just started using Python in earnest this year and I was constantly infuriated by examples and modules that were Python 2 but had no indication until you started using them. Over and over I’d find a neat module that did exactly what I wanted but then I’d find out it was Python 2. Absolutely maddening.

            1. 1

              Can you give some examples? I found that years ago there was a lot of this, but many libraries I use now are Python3-only, and many of them even explicitly Python3.6+ (probably to use f-strings).

              1. 1

                These were very hardware-oriented modules for doing robotic related things. The primary one that I needed was an interface for iRobot’s open platform: https://github.com/pomeroyb/Create2Control. I gave up and made two separate scripts because I needed Bluetooth from Python 3 libraries.

                1. 2

                  I’ve had a very similar experience with various Raspberry Pi hardware add-ons. The vendors typically provide a python library to interface with the hardware, but it’s often not updated after it’s released. Try to build something with 2 or more bits of hardware and you find a horrible mess of mismatched python versions and mismatched dependency versions. Worst of all, you don’t find out until runtime whether two versions are incompatible.
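                  For what it’s worth, a library can at least make that failure loud instead of a cryptic SyntaxError halfway through an import. A minimal sketch of a runtime guard (the version threshold is illustrative; the packaging-level equivalent is `python_requires` in setup.py):

```python
import sys

MINIMUM = (3, 6)  # illustrative: whatever the library actually needs

def check_python(minimum=MINIMUM):
    """Raise a clear error when run on an unsupported interpreter,
    so users find out at import time rather than deep at runtime."""
    if sys.version_info[:2] < minimum:
        raise RuntimeError(
            "this module requires Python %d.%d+, found %d.%d"
            % (minimum + tuple(sys.version_info[:2])))

check_python()  # call once at import time
```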

              2. 1

                This year? I’m not disputing your experience, but that’s surprising to me since the community as a whole has been firmly on Python 3 for years now. http://py3readiness.org/ is all green, etc.

                1. 3

                  These were very hardware-oriented modules for doing robotic related things.

                  1. 2

                    I for one am not surprised. The long tail is pretty long still. We’ll be dealing with infrequently maintained domain specific Py2 codebases and docs for many years to come.

                    1. 1

                      I wish I’d known about the “Can I Use Python 3?” tool that is on the page @kbd linked to. That would have saved me some frustrating moments. https://github.com/brettcannon/caniusepython3

              1. 2

                I have long been frustrated and annoyed at the state of modern event stuff. I wish Meetup had really evolved from when it started (the WeWork acquisition seems to have done absolutely nothing useful for it), and I am bummed that Upcoming.org seems to have stalled back out again (https://github.com/upcoming/upcoming-www). I still use Meetup quite often because of its popularity, since it seems to be the only viable, easy-to-use alternative to Facebook events.

                This seems like a fun exercise, and I like that the IndieWeb has ideas for events/RSVP… but practically, this is pretty tedious. And my friends meeting up for a pint aren’t going to do this (hardly anyone I know in real life even has a website anymore).

                Has anyone built anything on top of Mastodon for events? Seems like a great pub/sub platform for it.

                1. 5

                  This is happening way too often.

                  1. 3

                    And could easily be avoided (in this case at least) with 2FA enabled on the account.

                    1. 2

                      And it’s going to accelerate, in all package registries. Every time one is discovered and publicized it will give ideas to new attackers.

                    1. 10

                      End of an era, I suppose. There’s no such thing as a healthy monoculture, just monocultures that haven’t found their blight yet.

                      1. 33

                        According to that Stack Overflow survey there are plenty of popular alternatives, such as “ZIP file back-ups”, “Copying and pasting files to network shares”, and “I don’t use version control”.

                        1. 18

                          Did the ever popular “Final FINAL Copy 2” make it in? That’s got to be the number one version control system, right?

                          1. 9

                            My first programming job was like that. I was working as a repair tech at a computer shop, at some point they needed someone to clean up the intranet, and that someone was me.

                            First thing I did was set up svn and get rid of all the “foo.orig”, “foo.orig2”, etc. directories. This was trickier than it might seem, as some of the projects were being served from those .orig2 dirs.

                            All was going well, and then half a year later the guy who had been doing this asked me if I knew what happened to the “cms.orig” directory. After I told him I had deleted it, he said he had been storing the company outings photos there for the last 15 years. By the time we discovered this, it was too late to recover from backup, so … all lost.

                            I still don’t understand why you would store pictures in a deeply nested subdir of an otherwise unused cms.orig … 🤨 From what I heard, he ditched svn and went back to his “system” after I left.

                        2. 13

                          Well, just to play devil’s advocate… Some things are so good exactly because they are ubiquitous. Like Unicode, for example. Or POSIX. They have their flaws for sure, but they made writing interoperable software much easier.

                          There are already other tools that work with the same repository format as Torvalds’ git. Maybe the git format becoming the standard repo format is a good thing after all. No one has to use the reference implementation if they prefer a different UI and abstractions.

                          1. 14

                            Maybe git format becoming the standard repo format is a good thing after all.

                            No, it’s definitely not. It doesn’t scale. The git “API” is literally the local filesystem. Microsoft has valiantly hacked the format into functioning at scale with VFS for Git, but the approach is totally bananas.

                            How does it work?

                            VFS for Git virtualizes the filesystem beneath your Git repository so that Git tools see what appears to be a normal repository when, in fact, the files are not actually present on disk. VFS for Git only downloads files as they are needed.

                            VFS for Git also manages Git’s internal state so that it only considers the files you have accessed, instead of having to examine every file in the repository. This ensures that operations like status and checkout are as fast as possible.

                            - vfsforgit.org

                            Microsoft had to implement an entire virtual filesystem that, through smoke and mirrors, tricks git to behave sanely. More details in this GVFS architecture overview.

                            1. 4

                              Isn’t the same true for Mercurial and every other DVCS in existence?

                              1. 17

                                No. Git’s remote repository API is nothing more than a specialized rsync implementation (git-send-pack and git-receive-pack).

                                Mercurial uses a semantic API for exchanging changes with the server. It doesn’t need local files in the same way git does. That opens up a lot of doors for scaling large repositories, because you can implement optimizations in the client, protocol, and server.

                                For git repos, where local filesystem operations are the protocol, there really is no alternative to Microsoft’s smoke and mirrors, virtualize the world approach. You’d have to just reimplement git, which defeats the point.
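                                The “filesystem operations are the protocol” point is concrete in git’s loose-object store: an object’s identity is just the SHA-1 of a small header plus its content, and its location on disk is a path derived from that hash. A simplified sketch (real git also zlib-compresses loose objects and packs older ones):

```python
import hashlib
import os.path

def blob_oid(data: bytes) -> str:
    """The object id git assigns a blob: SHA-1 over 'blob <size>\\0' + content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()

def loose_object_path(git_dir: str, oid: str) -> str:
    """Loose objects live at <git_dir>/objects/<first 2 hex chars>/<rest>."""
    return os.path.join(git_dir, "objects", oid[:2], oid[2:])

# The well-known id of the empty blob:
print(blob_oid(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

                                Fetch and push then just negotiate which of these objects each side is missing and stream them across, which is the “specialized rsync” part.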

                                1. 1

                                  Ah, that is interesting. Thanks for the information, I should look into the way mercurial actually works.

                                  1. 3

                                    If you’re curious about the actual on-disk formats (which should be irrelevant, as hg tries to compartmentalise them), you can read about Mercurial’s internals.

                              2. 4

                                I don’t see anything wrong with git using the local file system API.

                                There are multiple implementations of such file systems – Linux, FreeBSD, OS X, Minix, etc. git works fine on all those systems, and the code is portable AFAICT.

                                1. 8

                                  So, I personally love how git transparently exposes its internal data structures for direct manipulation by the user. It gives you tons of power and freedom. Back when I used git, I considered it just as much a development tool as my editor.

                                  But that transparency is problematic for scaling. To the point where you really do need to implement a virtual remote filesystem tailored for git to support huge repos. Whether you like git or not, that’s bananas.

                                  1. 5

                                    There’s nothing bananas about that: scaling is a feature and it’s not surprising that you need more code/engineering to scale. It would be surprising if you didn’t!

                                    To make a very close analogy, two companies I worked at used Perforce (the proprietary VCS). At one company we used it out of the box, and it worked great. Thousands of companies use Perforce like this, and Perforce is a very profitable company as a result.

                                    The second company (Google) also used Perforce out of the box. Then we needed to scale more, so we wrote a FUSE-based VFS (which I imagine the git VFS is very similar to). That doesn’t mean Perforce is “bananas”. It works for 99% of companies.

                                    It’s just designed for a certain scale, just like git is. Scale requires a lot of tradeoffs, often higher latency, decreased throughput, etc. git seems to have made all the right tradeoffs for its targeted design space. That it succeeded beyond the initial use case is a success story, not an indication of problems with its initial design.


                                    Also, I don’t see any evidence that Mercurial reached the same scale. It probably has different problems – you don’t really know until you try it. I heard some teams were working on scaling Mercurial quite a while ago [1], but I’m not sure what happened.

                                    [1] https://engineering.fb.com/core-data/scaling-mercurial-at-facebook/

                                    1. 4

                                      Then we needed to scale more, so we wrote a FUSE-based VFS

                                      I currently work at Google. CitC has nothing to do with Piper performance, it’s more about the utility of sharing your workspace, both between dev machines and tools (desktop, cloudtop, cider, critique), as well as blaze.

                                      (which I imagine the git VFS is very similar to).

                                      Not at all. The git “protocol” is filesystem operations. Microsoft made VFS for Git because they need to intercept filesystem operations to interface with the git toolchain. Perforce and Mercurial have actual remote APIs, git does not.

                                      That doesn’t mean Perforce is “bananas”. It works for 99% of companies.

                                      I don’t think Perforce is bananas. I don’t think git is bananas either. I specifically think “git format becoming the standard repo format” is NOT a good thing. The git toolchain and the repo format are inseparable, leading to Microsoft’s bananas implementation of a scalable git server. Clever and impressive, but bananas.

                                      1. 2

                                        What I’m reading from your comments is: “If only git had decoupled its repo format and push/pull protocol, then it would be more scalable”.

                                        I don’t think that’s true. You would just run into DIFFERENT scalability limits with different design decisions. For example: Perforce and Mercurial don’t share that design decision, as you say, but they still have scalability limits. Those designs just have different bottlenecks.

                                        Designing for scale you don’t have is an antipattern. If literally the only company that has to use a git VFS is Microsoft, then that’s a fantastic tradeoff!!!

                                        IMO Google’s dev tools are a great example of the tradeoff. They scale and solve unique problems, but they suffer for it: slow as molasses in the common case (speaking from my experience as someone who both worked on the dev tools team and was a user of those tools for 11 years).

                                        1. 2

                                          I don’t think that’s true. You would just run into DIFFERENT scalability limits with different design decisions.

                                          Git was probably strongly tied to the filesystem because it was made in 2005 (Pentium 4 era) for a lower-performance scenario by someone who understood the Linux filesystem better than high-performance, distributed applications. It worked for his and their purposes of managing their one project at their pace. Then, wider adoption and design inertia followed.

                                          It’s 2019. Deploying new capabilities that stay backwards compatible with the 2005 design requires ever higher, crazier effort and delivers less exciting results than better, more modern designs would.

                                          1. 1

                                            “If only git had decoupled its repo format and push/pull protocol, then it would be more scalable”.

                                            It would be easier to scale. When the simplest and easiest way to scale genuinely is implementing a client-side virtual filesystem to intercept actions performed by git clients, that’s bananas. To be clear, VFS for Git is more than a simple git-aware network filesystem; there’s some gnarly smoke-and-mirrors trickery to make it actually work. The git core code is so tightly coupled to the file format that there’s little else you could do, especially if you don’t want to break other tooling using libraries like libgit2 or jgit.

                                            Designing for scale you don’t have is an antipattern.

                                            Designing tightly coupled components with leaky abstractions is an antipattern. Mercurial supports Piper at Google through a plugin. Doing the same with git just isn’t possible, there’s no API boundary to work with.

                                    2. 3

                                      To the best of my knowledge it still relies on intricate knowledge of filesystem behaviour to avoid (most?) fsync calls — and the behaviour it expects is ext3’s (which is better at preserving operation order in case of a crash than most other filesystems).

                                      I actually had a hard poweroff during/just after commit corrupt a git repository.

                                      So, in a way, the API it actually expects is often not provided…

                                      1. 1

                                        Do you mean its reliance on atomic rename when managing refs? Or some other behavior?

                                        1. 3

                                            I would hope that atomic renames are actually expected from a fully-featured POSIX FS (as promised by the rename manual page, etc.).

                                          But Git also assumes some ordering of file content writes without using fsync.

                                          1. 2

                                            Renames are only atomic with respect to visibility in a running filesystem, not crash safety, though. So I guess it’s not surprising you’ve seen corruption on crash.

                                            I’ve been trying to find a way to run a small-scale HA Git server at work — as far as I can tell the only viable option is to replace the whole backend as a unit (e.g., GitHub DGit/Spokes, GitLab’s Gitaly plans). Both GitHub and GitLab started by replicating at a lower level (drbd and NFS, respectively), but moved on. I can say from experience that GlusterFS doesn’t provide whatever Git requires, either.

                                            1. 1

                                              There is at least some pressure to provide metadata atomicity on crash (you cannot work around needing that, and journalled filesystems have a natural way to provide it), but obviously data consistency is often the same as without rename. And indeed there are no persistency order guarantees.

                                                Systems like Monotone or Fossil outsource the consistency problem to SQLite — which uses carefully ordered fsync calls to make sure things are consistent — but Git prioritised speed over portability outside ext3. (Mercurial is also not doing fsync, though.)

                                              And if ext4 doesn’t provide what Git needs, of course GlusterFS won’t…

                              1. 2

                                Are there mobile apps for both Android and iOS? A big part of the appeal of Google Photos is the automatic upload from mobile.

                                1. 2

                                  I would even argue that’s the primary feature.

                                  1. 1

                                    Not for now, but the site is mobile-friendly. We are thinking about a way to do the syncing without hurting user privacy.

                                  1. 5

                                    What exactly is the architecture here? How exactly does it use blockchain tools beyond a marketing line?