1. 15
  1.  

  2. 6

    Out of curiosity:

    Say I need a ordered list of things. Multiple actors (threads, coroutines, whatever) will hold references to things and the list, and will add new entries either before or after the entry they hold. The insertions must be fast and avoid memory copy. A different type of actor will periodically read the list, so order must be maintained. It has to all be in memory, and in process, because latency must be minimum.

    What would the rust-native solution for this?

    Update: just to clarify, I don’t have a actual problem that matches to this constraints, I just tried to come up with something that would be optimally solved with linked lists. Just a thought experiment =)

    1. 6

      Could you describe a problem that leads to this sort of set of requirements? My instinct upon hearing this is to dig deeper and see if the problem admits an entirely different kind of solution.

      1. 2

        Oh, there is no problem, I just tried to come up with a set of constraints that would be best suited to be solved with a linked list.

        1. 2

          Each actor has a local sorted vector. The reader thread iterates over them for a total ordered view.

          1. 1

            Why three?

            1. 2

              Oh, I said three because of the three items in multiple actors. One vector per thread. I’ll edit to clarify.

          2. 2

            At that point you’re just begging the question. ‘Given this problem that’s been specifically designed to cater for solution X and only solution X, what other solutions are there?”

            1. 1

              Well, it’s not only solvable with linked lists, people came up with several interesting possibilities.

              I didn’t want to “make linked lists win”, I just wanted to see if they could be beaten on their on turf, so to speak, and looks like they can, at least sometimes.

        2. 4

          How many things? How large are they? Are they comparable for order or is this insertion order only?

          Avoiding a memory copy implies chasing pointers which implies latency.

          Echoing @justinpombrio, what’s the problem we’re solving?

          1. 4

            I’d personally use the im crate and use its immutable vectors, and have the threads send new vectors to each other on update. Having multiple writers holding references to the same structure is kinda cursed no matter what you’re doing.

            1. 1

              I was thinking something similar. You could even combine this with arc-swap so you could have more precise control over the latency of receiving updates to that data structure.

            2. 4

              The most idiomatic Rust data structure for linked lists is std::collections::LinkedList.

              Alternatively, if your question is about alternatives to linked lists in general, I’ve had success writing code involving multiple concurrent sequenced writers using (1), a ring buffer of pointers, (2) a deque with indexes (instead of pointers), and (3) a hash map of vectors. That’s a much bigger problem space than what the article’s topic, which is (IMO mostly just griping) about people writing libraries that the author thinks aren’t necessary.

              1. 5

                std::LinkedList doesn’t have cursors on stable (and doesn’t have multiple cursors on nightly), so it doesn’t solve the problem.

              2. 4

                I don’t know how to solve this in concurrent case, but for a single-threaded case one example that comes to mind is the mutable flavor of rust-analyzer syntax trees (for refactors mutable api,where you hold several “cursors” into the tree and mutate around them, is most natural). We just do linked list there: https://github.com/rust-analyzer/rowan/blob/master/src/sll.rs.

                1. 3

                  Even in this obviously-contrived scenario to favor a linked list, I wonder whether this “periodic low-latency” access won’t actually be relatively costly and high latency due to pointer chasing and randomly touching cache lines belonging to different cores. Those atomic pointer swaps will be chatty, and you’ll be paying alloc per object.

                  In modern CPUs there is ridiculously large cost difference between local CPU work and a pipeline stall due to touching cross-core caches and going all the way to the RAM. This is why we have hyperthreading — the RAM-waiting core is stalled for so long that it has time to switch to another thread, do some work there, and switch back before it hears back from RAM.

                  I’d try each worker thread append to a non-sorted vector sized appropriately for the cache, then periodically sort it, and send it away to the reader. This amortizes allocations, and is super cache-friendly. The reader then can use merge sort to combine multiple vectors (or iterate without copying). This way data remains owned by one thread most of the time, first sort pass will be multi-threaded but also happen locally in each thread’s cache, and the reader will have mostly-sorted mostly-contiguous snapshot of the data all for itself.

                  1. 1

                    You might not believe me, but I swear cache did cross my mind while I was writing my, indeed, very and purposely contrived example XD

                    All these answers are almost making me wonder where linked lists are good for anything anymore, really.

                    1. 4

                      I am fairly confident that liked list are the most overused data structure out there, but it doesn’t mean that they are entirely useless.

                      Off the top of my head:

                    2. 1

                      You can have ‘fat’ linked lists analogous to cdr coding wherein each node has a fixed capacity greater than 1 (and a variable occupancy).

                  2. 2

                    It’s true that deleting or adding elements in the middle involves a lot of copying, but your algorithm is O(n) even with the single-cursor list libraries, because it must first walk the cursor to the desired element.

                    I’d love to see some benchmarks for this cuz just pointing at the Big O notation assumes a lot. For example, if I have 10k items in an abstract sequence and need to insert an item at 5k, how slow is it to walk the linked list and insert vs slicing and concatenating a vec or vecdeque?

                      1. 2

                        Excellent, I’ll read it!

                        EDIT: looks like the linked list is faster until you get above 16k elements. Seems roughly correct to me.

                        I wonder if the size of the element is relevant here. If you’re adding a struct, will the copying be slower overall? I don’t know enough about how copying actually works to guess.

                        1. 2

                          Ah, I think there’s one funny bug in the benchmark: Rust’s vec uses 2 as a growth factor, so printing out at 2**k is correlated with reallocs. Updated the benchmark to print out cummulative times for powers of ten instead.

                          1. 1

                            Ah yeah, good call. Thanks for the update. That’s a pretty dramatic difference to the previous numbers.

                        2. 1

                          Decided to play around with this. Using –release, here’s your given code on my machine:

                          total vec  184.52ms
                          total list 4.74s
                          

                          I tried using a struct (Rectangle example struct with an extra String and usize) as the object being inserted, and it was somehow worse for the linked list? Reading the code for split_off, I’m surprised that it’s so much slower with a struct than when using a usize. I wonder what I’m misunderstanding.

                          total vec  2.40s
                          total list 13.28s
                          

                          I then experimented with the unstable “linked_list_cursor” feature and using a mutable cursor (and mutable pos) outside of the loop. Here’s where a cursor shows why to use it.

                                  // relevant change
                                  if i > 0 && i % 2 == 0 {
                                      cursor.move_next();
                                  }
                                  cursor.insert_after(struct);
                          
                          total vec  1.97s
                          total list 6.17ms
                          

                          Thanks for the prompt to dive into this stuff!

                        3. 3

                          This quote is surprising to me but I presume that it’s due to the fact that cursors are borrowed in Rust? In other languages, linked lists are often useful for object that must exist in multiple collections at the same time. Even if you are using lists for all of the collections, they’re usually O(1) to find because they are at the front of some list and then you need to remove them from another list. For example, threads may be in a circular linked list per process, where you need to be able to cheaply insert or remove them but travers only on some very slow paths. They may also be in a sleep queue or a scheduler queue, where you typically just push or pop. There’s no need to ever traverse the lists anywhere near a fast path, but you do need to remove thins quickly.

                        4. 1

                          Linked lists are typically the first data structure introduced to computer science students, invariant of domain or language or whatever else. It’s at least a little funny to see them challenged on the basis of usefulness or safety or etc.

                          1. 3

                            It’s funny, but also worth doing. Linked lists might have been competitive on 80s hardware, but they’re extremely rarely the best tool for the job on modern hardware.

                            1. 2

                              Linked lists are abstract data structures, same as hash maps or binary trees or whatever else. Whether or not they’re the best tool for the job isn’t a function of the hardware on which they run, it’s a function of the problem they’re used to solve, right?

                              1. 3

                                No, because constants matter and hardware changes constants. In particular, cost of arithmetics to memory load and store changed considerably through history of hardware, to detriment of linked lists.

                          2. 1

                            What you actually want if you really want a “C style” linked list, esp. in an embedded setting, is intrusive-collections

                            1. 1

                              There is a whole rant I could have about how the whole software and computing community is pathologically neophilic. Often we seem to actively resist reusing ideas, let alone code; and are ignorant and dismissive of what has gone before.

                              Reusing ideas is not the same as reusing code, so that’s not a fair comparison. Mindlessly reusing code leads to npm-style dependency hell.