1. 7

  2. 4

    You can store the length of the tail in the node to avoid the two extra trips through the list (to compute length).

    I wouldn’t assume someone who uses malloc/new would worry too much about putting another ptrdiff_t length in the node, but if this gives you pause, consider this: Up to an obscene 256TiB, the 16-byte aligned node has 20 bits of “free space” in its pointer (4 at the bottom, and 16 at the top), and you can steal at least an extra 16 bits from the value, meaning you’ve basically got a “free” 36-bit integer with which to store the length of the tail. The bit-shifting might seem expensive, but when you’re pointer-chasing the cache-effects will almost certainly dominate, and a better than 3x-improvement in time is probably worth it.

    1. 1

      Sure, but then you lose O(1) insertion into the lists, and if you’re not using that then why are you using linked lists at all?

      (if the tails of some lists are shared then you may lose the ability to do insertion at all unless you use even more memory on bookkeeping)

      1. 1

        You lose O(1) insertion at the tail but you keep it at the head, so prepending items to the list is still O(1)

        1. 1

          Sure, but if you’re only adding items at one end you might as well use an array (well, I guess if insertion latency matters you might still prefer a list).

      2. 1

        You can store the length there. But there might be something more important to store in those bits. Which is part of the issue with the linked post: it doesn’t contextualise the problem sufficiently to take a holistic approach to optimising it.

      3. 1

        Namely, that the two lists, once they intersect, have the same elements. That’s an important assumption, and it’s what allows the approach outlined above to work. If the lists just intersected in one element, and then went their separate ways

        I can’t see how this, i.e. going separate ways, would be possible. A node has a single pointer to the next element. So the tails of the lists have to be identical if the two lists intersect.

        1. 3

          I agree. The author of the post seems to think, wrongly, that it’s the values stored in the nodes that need to be compared for equality, not the node pointers themselves. But the original problem statement clearly shows that they’re looking for the list structures joining.

          The term “intersection” is IMHO wrong here; it would have been better to describe the two lists as “merging”.

          A more interesting and useful way to describe this problem is to consider the links to be “parent” pointers in tree structures. The question is then, given two nodes, to find their common ancestor, or else show that they’re not in the same tree.

          1. 2

            It is perhaps an unexpected definition of “intersection” that is being used here; You should read the linked problem-statement which may make this clearer: They aren’t considering the use case of [a,b] and [a] perhaps because they are thinking of blockchains or git histories or something like that, but for whatever reason it isn’t given as a possibility the program needs to deal with.

            The author might also not be aware that “intersection” means something very different to some (many? most?) people, and that might be contributing to your confusion.

          2. 1

            Two ideas come to mind for this problem; the first is to turn the list into a xor list so that once you know the length of the lists, you can traverse backwards until there’s a difference. (which you’d have to fix afterwards, but you can do that as you traverse backwards along the nodes). This works assuming my understanding of “intersection” is what others have discussed below; the point where the two lists share a tail.

            My other thought is how you can use tying the knot to do this in Haskell… but I can’t quite work out how to do it in O(1) space. Functions that feel useful:

            lengths [] [] l r         = (l-r)
            lengths [] (_:ys) l r     = lengths [] ys l    (r+1)
            lengths (_:xs) [] l r     = lengths xs [] (l+1) r
            lengths (x:xs) (y:ys) l r = lengths xs ys (l+1) (r+1)
            dropLonger xs ys = let dif = lengths xs ys 0 0 in (drop dif xs, drop (-dif) ys)
            restAreEqual = foldr (\(l,r) (h:rest) -> (l == r && h) : h : rest) [True] 
            ghci> lengths "123" "ABCDEF" 0 0
            ghci> dropLonger "ABCDEF" "123"
            ghci> restAreEqual $ zip [1,2,3,4,5] [1,2,7,4,5]