1. 48
  2. 30

    Therefore, if after reading this post, you independently rediscover the algorithms presented here, that’s ok, but you must still license your “independent rediscovery” under the Gnu GPL-2.0 license, and cite the sources (for instance this post).

    This sounds legally void to me. GPL covers code, not algorithms. Algorithms aren’t copyrightable anywhere, and patentable only in some countries.

    1. 4

      It was even stranger this morning.

      Warning about licenses

      This blog post contains documentation about Pijul. Pijul is licensed under the Gnu GPL-2.0 or any later version at your convenience.

      Therefore, if after reading this post, you independently rediscover the algorithms presented here, that’s ok, but you must still license your “independent rediscovery” under the Gnu GPL-2.0 license, and cite the sources (for instance this post). This also applies if that rediscovery happens in the future, including in zero, one or more years.

      1. 3

        Seems this has been removed?

        1. 1

          There’s a dispute in the European legal sciences about the copyrightability of algorithms, but indeed, most people here agree that algorithms are not copyrightable. Those who disagree usually also think that code and algorithms are the same thing. For those without any background in IT, this difference is difficult to grasp.

          (I studied Law in Germany)

          1. 1

            While I agree with you as far as US law goes, are we really sure that the same is true of every country’s copyright law?

            I think I’ll just stay away from this one.

            1. 1

              That depends on the meaning of derived works. The algorithm is not described there independently of the code; it’s hosted on the same website, as documentation of the code.

              But nobody forces you to read that post.

              1. 10

                The GPL is only about code, as @dmbaturin said. The blog post itself is under the GFDL, which does not prevent me from reading the article and implementing the same algorithms. AFAIK this wouldn’t be considered a derivative work in any jurisdiction. So to me it also sounds void.

                1. 3

                  I would personally tend to agree with @dmbaturin on this one, although I am far from being a legal expert, and in any case the final word would belong to whichever court reviewed the case.

                  Licensing is hard @_@

              2. 18

                The intellectual property claims were removed, so I went ahead and read the article.

                Byte level changes seem too small when working with text files, but I’m hopeful that this problem is resolved and just not explained in the blogpost, considering this line: “Of course, this is just the minimal set of dependencies needed to make sense of the text edits. Hooks and scripts may add extra language-dependent dependencies based on semantics.”

                Consider for example if I have bytes “E3 81 81” (utf-8 for ぁ), one person changes this to bytes “E3 81 93” (utf-8 for こ) and another changes it to bytes “E4 81 81” (utf-8 for 䁁). I would not expect these two changes to merge to “E4 81 93” (utf-8 for 䁓).
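                To make that concrete, here is a minimal sketch of a naive position-wise three-way merge over those exact bytes. `merge_bytes` is a made-up helper for illustration, not anything taken from Pijul, and it assumes all three versions have the same length.

```python
def merge_bytes(base: bytes, ours: bytes, theirs: bytes) -> bytes:
    """Naive three-way merge, byte by byte (hypothetical helper, same-length inputs only)."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles same-length edits")
    merged = bytearray()
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree (or neither changed this byte)
            merged.append(o)
        elif o == b:      # only "theirs" changed this byte
            merged.append(t)
        elif t == b:      # only "ours" changed this byte
            merged.append(o)
        else:             # both changed the same byte differently
            raise ValueError("conflict")
    return bytes(merged)


base = bytes.fromhex("E38181")    # U+3041
ours = bytes.fromhex("E38193")    # U+3053
theirs = bytes.fromhex("E48181")  # U+4041
merged = merge_bytes(base, ours, theirs)
# The merge succeeds without a conflict and yields a fourth, unrelated character.
print(merged.hex(" ").upper(), f"U+{ord(merged.decode('utf-8')):04X}")
```

                Neither side touched the same byte, so a byte-granular merge sees nothing to complain about, even though both edits replaced the same character.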

                Even character-level changes feel a bit too fine-grained. If we start with “a = b”, and one person changes this to “a -= b” and another to “a = a-b”, we probably don’t want these changes to combine to “a -= a-b”.
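                The same thing can be sketched at character granularity. This uses Python’s `difflib.SequenceMatcher` to extract each side’s edits and applies both when they don’t overlap; `edits` and `merge_chars` are hypothetical names, and the overlap rule is a simplification I chose for the example.

```python
from difflib import SequenceMatcher


def edits(base: str, changed: str):
    """Return (start, end, replacement) edits that turn base into changed."""
    ops = SequenceMatcher(None, base, changed, autojunk=False).get_opcodes()
    return [(i1, i2, changed[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def merge_chars(base: str, ours: str, theirs: str) -> str:
    """Apply both sides' character edits to base, failing if any two edits touch."""
    all_edits = sorted(edits(base, ours) + edits(base, theirs))
    for (s1, e1, _), (s2, _e2, _) in zip(all_edits, all_edits[1:]):
        # Simplified rule: overlapping ranges or edits at the same position conflict.
        if s2 < e1 or s1 == s2:
            raise ValueError("conflict")
    out = base
    # Apply from the end so earlier offsets stay valid.
    for start, end, text in reversed(all_edits):
        out = out[:start] + text + out[end:]
    return out


print(merge_chars("a = b", "a -= b", "a = a-b"))
```

                The two insertions land at different offsets, so they merge cleanly into a statement neither author wrote.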

                Lines are very convenient because in practice they are “sufficiently granular” that changes with conflicting semantics usually conflict in the diff. I’m worried that moving to byte level changes would greatly increase the set of diffs that incorrectly merge.

                1. 12

                  Byte level changes seem too small when working with text files, but I’m hopeful that this problem is resolved and just not explained in the blogpost

                  Maybe that was poorly explained, but byte-level changes mean that you have maximum flexibility when creating a change, including taking the programming language into account, not that you should use that flexibility to merge UTF-8 characters with each other.

                  And indeed the current Pijul does diffs on lines.

                2. 9

                  Thanks for the in-depth explanations on what was going on @pmeunier, I can understand why it took you several days to write!

                  1. 4

                    Just to nitpick a bit, because this has been bothering me for a while: Conflict-free replicated datatypes aren’t a solution for conflicts.

                    Making a CRDT is easy. Whenever you have a conflict you hash both data sets and throw away the one with the lower hash. This may be stupid but it fulfills the criteria for a CRDT.
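                    A minimal sketch of that “stupid” merge, assuming whole-state replication and SHA-256 as the hash (both are my choices for the example, and `merge` is a hypothetical name). It only shows that the merge is commutative, associative and idempotent, which is what makes replicas converge.

```python
import hashlib
from itertools import product


def digest(state: bytes) -> bytes:
    return hashlib.sha256(state).digest()


def merge(a: bytes, b: bytes) -> bytes:
    """Keep the state with the higher hash; the other one is thrown away."""
    return a if digest(a) >= digest(b) else b


states = [b"alice's version", b"bob's version", b"carol's version"]
for a, b, c in product(states, repeat=3):
    assert merge(a, b) == merge(b, a)                      # commutative
    assert merge(merge(a, b), c) == merge(a, merge(b, c))  # associative
    assert merge(a, a) == a                                # idempotent
print("every replica ends up with the same state")
```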

                    The conflict between the authors’ intentions is still there; the only thing that is conflict-free is the state of the different copies of the data after all the changes have propagated. The conflict between the changes that were made is semantic in nature and probably has to be resolved at a language level and/or by a human.

                    Which is probably why Pijul ended up with counter-intuitive behavior by treating text as a graph and using that representation for conflict resolution.

                    1. 6

                      Just to nitpick a bit, because this has been bothering me for a while

                      You seem to be in agreement with the post. That section of the post just says “here is some related work, it’s called CRDT, and wasn’t enough to solve it”.

                      1. 6

                        I have worked on something similar (filesystem synchronization with a CRDT model) and I agree, people misunderstand CRDTs.

                        Behind the theory, the idea of CRDTs is that from the user’s point of view they are what we usually consider “data structures” plus rules that define how users can modify them and how the data will be merged in all cases. That means that:

                        • There is no case in which the system will say “I don’t know what to do” and have to ask for outside help synchronously. Adding something to the data structure that says “ask a human later” is fine.

                        • The end result will not depend on which device does the merge.

                        • Most importantly, those rules are built into the data structure and exposed to the user (they are part of the API).

                        Now the problem is to design a CRDT that does what the end user wants, and that’s a lot harder than just making a CRDT “that works”…
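                        As a rough illustration of the first two bullets, here is a sketch where the state is a grow-only set of candidate values. The merge never fails, does not depend on which device runs it, and a disagreement is simply kept in the data as “ask a human later”. The names and the file example are made up, and actually resolving the disagreement would need more machinery than this.

```python
def merge(a: frozenset, b: frozenset) -> frozenset:
    """Merge two replicas by keeping every candidate value either one has seen."""
    return a | b


def needs_human(state: frozenset) -> bool:
    """More than one candidate means the replicas disagreed at some point."""
    return len(state) > 1


phone = frozenset({"report-v2.txt"})
laptop = frozenset({"report-final.txt"})

merged_on_phone = merge(phone, laptop)
merged_on_laptop = merge(laptop, phone)

# Same result whichever device does the merge, and no synchronous question asked.
print(merged_on_phone == merged_on_laptop, needs_human(merged_on_phone))
```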

                        1. 1

                          I think the directed graph representation comes from the original paper as explained well for non-mathematicians here.

                          1. 2

                            It’s inspired by that, but does a lot more. The actual thing used by Pijul is explained in the blog post linked here as well (and your link comes from that person reading Pijul’s source code and asking us questions about it).

                        2. 4

                          “This situation, where Alice writes something in the middle of a paragraph p, while Bob deletes p in parallel.” So, Bob deletes ABC and Alice adds a D for ABDC. But then the next paragraph talks about the case that Bob only deletes C. And I don’t understand why that would cause a conflict at all.

                          Can somebody explain?

                          1. 6

                            A clarification first: “deleting the paragraph” means deleting at least one byte immediately before or immediately after, or both.

                            Now, there’s a design choice here: we could either treat this as a conflict or not, and it’s probably better to have false positives than false negatives for conflicts.

                            In the simple model described in the post (Pijul does something a bit more complicated), there is both an alive and a deleted edge to C, so is C alive or dead? Well, we can’t decide, so we call that a conflict.
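                            A toy version of that rule, as a sketch only: the edge layout below is an assumed reading of the simple model, not how Pijul stores anything, and `apply_change` and `status` are hypothetical names. Edges carry an alive or deleted label, and a vertex with both kinds of incoming edge is reported as a conflict.

```python
ALIVE, DELETED = "alive", "deleted"


def apply_change(graph: dict, change: dict) -> dict:
    """Return a copy of the graph with the change's edge labels applied."""
    new_graph = dict(graph)
    new_graph.update(change)
    return new_graph


def status(graph: dict, vertex: str) -> str:
    """Classify a vertex by the labels on its incoming edges."""
    labels = {label for (_src, dst), label in graph.items() if dst == vertex}
    if labels == {ALIVE, DELETED}:
        return "conflict"
    if labels == {DELETED}:
        return "deleted"
    if labels == {ALIVE}:
        return "alive"
    return "root"  # no incoming edges at all


base = {("A", "B"): ALIVE, ("B", "C"): ALIVE}
alice = {("B", "D"): ALIVE, ("D", "C"): ALIVE}  # Alice inserts D before C
bob = {("B", "C"): DELETED}                     # Bob deletes C

merged = apply_change(apply_change(base, alice), bob)
print(status(merged, "C"))
```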

                          2. 3

                            Anyway, it seems this new name has offended some people.

                            I think there’s an important difference between someone saying “this offends me!” and “just letting you know, the new name has unfortunate and probably unintended readings”.

                            1. 1

                              I read that and thought, oh no, I hope they weren’t referring to my comment on here.

                              I was not offended, and I think the worst it would do over here is induce giggles.

                              That might be something they want to avoid, or not; totally up to them.

                              1. 2

                                There were way worse comments than understandable giggles, and I don’t want to go through that again.