1. 44
  1.  

  2. 9

    This has been discussed before, but I was wondering if anything has changed.

    Several difficulties and problems were raised … have they been addressed?

    Does anyone use it?

    1. 10

      I’ve been there for close to 2 years, and have tried to build my own SSB server from scratch (in a non-JS language). Feel free to ask any questions. For starters:

      • The low level protocol (transport encryption, RPC and discovery) is very well documented.

      • The application level protocol has almost no documentation, and what’s there is outdated. You really have to resort to reverse engineering behaviour from existing applications, or reading others’ code.

      • The replication/gossip mechanism is very inefficient, which leads to clients (especially mobile ones) spending a lot of time on the initial sync. There’s a newer gossip protocol which fixes some of these problems, but it has zero documentation, and there’s only one implementation (in JS). There are no plans to port it to other languages since there are a lot of tricky edge cases in there.

      • Yes, the JSON encoding was a mistake. There’s a new format using CBOR, but it’s still some way off in terms of mainstream usage in the network.

      • There are questionable decisions at the application level. For example, anyone can assign you new profile pictures or visible names, which can lead, and has led, to bullying/name-calling.

      In terms of community, it’s mostly tech-centric, most discussions are either about SSB itself, or related protocols. The overall spirit is positive, focusing on sustainable living, gardening, off-grid, etc.

      However, the community is very small. This can get tiring, considering that most clients will show any replies to threads you follow at the top of your timeline (you will see the same 3 to 5 people all the time).

      1. 4

        I’ve also built a partial SSB implementation, in C++. I found a C port of the CLI tool (in an obscure Git repo hosted in SSB), which helped immeasurably with the secure handshake and packet codec. I used Couchbase Lite [of which I’m the architect] as the database. This proved a lot faster than the JS data store, but I still found that pulling all the subscribed content from one pub (on which I’m following a handful of accounts) resulted in a 600MB database. It would have helped if the protocol had an option to pull only messages back to a certain date.

        I’d love to know more about the new protocol, but not if the only way is to decipher a JS codebase.

        It’s a shame they’re so JS-centric. That’s a serious problem for iOS, which has security restrictions that disallow JITs outside of a web browser. Not to mention embedded systems. (And on a personal level I dislike doing serious programming in JS; it feels like sculpting in Jell-O.)

        1. 3

          There are two C implementations of the low level protocol: one for the secret handshake and one for the boxstream (transport encryption). There are also some integration tests that you can run against your implementation to validate that everything works.

          As for the new replication protocol: the new CBOR format includes “off-chain” contents, which means that the actual log only contains a hash of the post content. This should make initial sync much faster, since clients only fetch the chain of hashes, without downloading anything else.

          Messages can also be downloaded out of order, so you only download what you want, if you have the hash for it. As with most things, though, the only implementation is in JS.
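
          A rough sketch of the "off-chain" idea described above, not the actual SSB format. The entry layout, the field names, and the helpers `make_entry`, `entry_id`, `verify_chain` and `verify_content` are all hypothetical, and sorted-key JSON stands in for the CBOR encoding. It shows that the chain of hashes can be checked without any post content, and that a single post fetched later can be checked on its own.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_entry(previous, sequence, content: bytes) -> dict:
    """Build a log entry that stores only the hash of its content."""
    return {"previous": previous, "sequence": sequence,
            "content_hash": sha256_hex(content)}


def entry_id(entry: dict) -> str:
    # Hypothetical canonical form: sorted keys, no extra whitespace.
    encoded = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return sha256_hex(encoded.encode("utf-8"))


def verify_chain(entries) -> bool:
    """Check the hash links without having any post content."""
    previous = None
    for expected_seq, entry in enumerate(entries, start=1):
        if entry["sequence"] != expected_seq or entry["previous"] != previous:
            return False
        previous = entry_id(entry)
    return True


def verify_content(entry: dict, content: bytes) -> bool:
    """Check a blob fetched later (in any order) against its log entry."""
    return sha256_hex(content) == entry["content_hash"]


# Build a three-entry log; the post bodies are kept outside the log.
posts = [b"hello", b"second post", b"third post"]
log, prev = [], None
for seq, body in enumerate(posts, start=1):
    entry = make_entry(prev, seq, body)
    log.append(entry)
    prev = entry_id(entry)

chain_ok = verify_chain(log)                     # no content needed
third_ok = verify_content(log[2], posts[2])      # fetched out of order
tampered_ok = verify_content(log[2], b"forged")  # does not match the hash
```

          In this sketch a client could sync and verify `log` first and only ask peers for the bodies it actually wants to display.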

          As for the database, I planned to use SQLite, but never got far enough to test that. I’m unconvinced that the log is a good abstraction for the kind of apps that SSB is used for right now (social media). There are future plans to plug more applications on top of the log replication, but that’s in the long term, while the current use-cases are suffering from it.

          Edit: wanted to say, though, that for me the biggest block when developing an SSB implementation is the lack of documentation w.r.t. the application-level protocol, and being forced to develop everything on top of the log abstraction. The JSON formatting can be painful, but it can be solved by forking some JSON library and making some changes (hacky, but it works).

        2. 1

          Yes, the JSON encoding was a mistake.

          Was JSON not fast enough?

          1. 3

            It’s not about JSON per se, but more about how message signing works. In SSB, every post is represented as a JSON blob signed with your private key (and verified against your public key), and it expects other clients to validate this, as well as produce valid JSON messages.

            The spec goes over the requirements of a valid message, as well as the steps to compute a valid signature. Unfortunately, it assumes things like key order (which the official JSON spec doesn’t say anything about), indentation, spacing, etc. (This all goes back to how the V8 engine implements JSON.stringify().) This adds a lot of complexity when implementing SSB in another language, as most JSON libraries won’t care about specific formatting when printing, and the key order requirement in particular makes it quite complicated.

            All in all, it’s not the end of the world, but it adds enough friction to make SSB pretty dependent on the “blessed” JavaScript implementation.
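
            To make the formatting problem concrete, here is a small sketch. The message fields and values are made up, and a SHA-256 digest stands in for the real signature step. `json.dumps(..., indent=2)` only approximates what `JSON.stringify(x, null, 2)` prints for simple values like these; it is not a drop-in replacement.

```python
import hashlib
import json

# Hypothetical message; field names and values are for illustration only.
message = {
    "author": "@alice",
    "sequence": 1,
    "content": {"type": "post", "text": "hi"},
}


def digest(text: str) -> str:
    """Stand-in for signing: hash the exact serialized text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Three serializations of the same data.
pretty = json.dumps(message, indent=2, ensure_ascii=False)
compact = json.dumps(message, separators=(",", ":"), ensure_ascii=False)
reordered = json.dumps(message, indent=2, sort_keys=True, ensure_ascii=False)

# All three parse back to the same object...
same_data = json.loads(pretty) == json.loads(compact) == json.loads(reordered)

# ...but each one produces a different digest, so a verifier that
# re-serializes with different spacing or key order rejects the message.
distinct_digests = len({digest(pretty), digest(compact), digest(reordered)})
```

            Any implementation therefore has to reproduce the signer’s whitespace and key order byte for byte, which is the friction described above.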

        3. 5

          I use it regularly. Why..? After being a heavy social media user on the usual platforms, I’ve pretty much removed myself and don’t participate, but Scuttlebutt is the exception because it’s a fun place to be. Nothing more, nothing less.

          1. 2

            I’m one of the core developers. Happy to answer any questions.

            1. 2

              What difficulties and problems were raised?

              1. 11

                There are two big obstacles I’ve seen that seem specific to Scuttlebutt:

                • There’s no easy way to share identities across devices. If you want to use Scuttlebutt on your computer and your phone, they’ll be separate accounts.

                • The protocol is built around an append-only log, which I’m not convinced is a good principle for any social network. Inadvertent mistakes are forever (e.g., I post an unboxing photo that has my unredacted invoice visible; I paste an embarrassing link by accident).

                  It also seems like you could grief pub servers (the Scuttlebutt “hub nodes” that federate content more widely). What happens if someone posts a bunch of illegal content to the pub? As I understand it, all the pub users will pull that content down. You might be able to blacklist certain posts in your client, and you can block users, but their content is still on your device. (Bitcoin has faced similar problems.)

                1. 2

                  Your objection to log storage is valid, but there are ways around it. The data format could have ways to redact posts out of the log while leaving the overall integrity intact; in fact all revisions of a post other than the latest one could be redacted.

                  Of course the redactions need to be propagated, and there’s no guarantee every copy will be redacted, but that’s an intrinsic problem with most P2P protocols, since distributed caching/replication is so important for availability.

                  1. 1

                    Good points.

                    Also, ironically, this is where Facebook could have had a chance to differentiate themselves, but they chose to go in almost the exact opposite direction:

                    • “With federated networking you are trusting each and every host that your host ever federated with to delete a post. With us, once you click delete it is gone. Worldwide. Same with sharing: if you share something with your close friends it stays there. With a federated network it depends on every host around the globe implementing the rules correctly and sticking to them.”

                    • Fortunately, IMO, Facebook messed up massively early on and now everyone in tech knows they are fundamentally untrustworthy.

                  2. 8

                    The main problem I saw when I looked into it was that it was a single program rather than a well-defined protocol that anyone in the ecosystem could implement.

                    This might have changed by now, but (for instance) there were aspects baked into the protocol that fundamentally prevented you from building a compatible client unless you used a JSON serializer with the exact same behavior as node.js, because the cryptographic identity of a post was based on a checksum of the output of that particular serializer rather than some inherent property of the post. An easy mistake to make, but one with far-reaching consequences.

                    1. 6

                      That’s my issue as well. It relies heavily on the interaction of a few dozen NodeJS repos. Different frontends all rely on the same backend code, making it not-that-diverse.

                      Also, while the protocol itself is well documented and designed, there are some obvious shortcomings. The protocol relies on hashes of pretty-printed JSON. The last time I checked for documentation on HOW to pretty-print that JSON, it was documented as “like V8 does it”. Let me tell you: it’s REALLY hard to format JSON the way V8’s JSON.stringify(x, null, 2) does, especially floating-point numbers.

                      Now this could easily be fixed by changing the protocol from hash-of-pretty-printed-json-subobject to hash-of-blob ({"data": "{\"foo\": 42}", "hash": ...} vs. {"data": {"foo": 42}, "hash": ...}). But you can’t do that without breaking compatibility. Worse, relying on hash-chains, you need to implement the old behaviour to be able to verify old hash chains.
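
                      A minimal sketch of the hash-of-blob variant, assuming SHA-256 and an envelope with the `data` and `hash` keys from the example above. The `wrap` and `verify` helpers are hypothetical and this is not how SSB currently works. The point is that the verifier hashes the received string as-is and never has to reproduce another implementation’s formatting.

```python
import hashlib
import json


def wrap(payload: dict) -> dict:
    """Serialize once, then hash the exact text that will be sent."""
    blob = json.dumps(payload)
    return {"data": blob,
            "hash": hashlib.sha256(blob.encode("utf-8")).hexdigest()}


def verify(envelope: dict) -> bool:
    """Hash the received string as-is; no re-serialization involved."""
    actual = hashlib.sha256(envelope["data"].encode("utf-8")).hexdigest()
    return actual == envelope["hash"]


# A blob as some other implementation might have written it, with
# unusual spacing and a number in exponent notation.
foreign_blob = '{"foo":   42,"bar":1e3}'
envelope = {"data": foreign_blob,
            "hash": hashlib.sha256(foreign_blob.encode("utf-8")).hexdigest()}

blob_ok = verify(envelope)

# Parsing and re-serializing locally yields different text (Python prints
# 1e3 as 1000.0 and normalizes the spacing), so a scheme that hashes the
# re-serialized object would reject this same message.
reserialized = json.dumps(json.loads(foreign_blob))
roundtrip_differs = reserialized != foreign_blob
```

                      The application still parses `data` to use it, but parsing is the easy direction: every JSON library accepts any valid formatting.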

                      1. 4

                        That’s my top issue too. Which is sad, because there’s so much prior art on canonicalizing JSON, going back to 2010 or so.

                        The spec is fine, but limited. It doesn’t cover all of the interactions between peers; there are plenty of messages, and properties in the schema, that aren’t documented.

                        A lot of the discussion and information about the protocol takes place on Scuttlebutt itself, meaning it’s (AFAICT) invisible to search engines, and accessible over HTTP only through some flaky gateways that often time out.

                        The main client is (like everything else) written in JS, so it’s a big Electron app, and in my experience very slow and resource hungry. I only joined two pubs and followed a handful of people and topics, but every time I fire up the app, it starts a frenzy of downloading and database indexing that lasts a long time. (They use a custom log-based DB engine written in JS.)

                        Part of the slowness must be because when you join a pub you’re implicitly following every user of that pub, and I believe all of the IDs they’re following. So there’s kind of an explosion of data that it pulls in indiscriminately to replicate the social graph and content.
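
                        A toy model of that explosion, assuming replication walks the follow graph out to a fixed number of hops. The hop limit, the `feeds_to_replicate` helper and the graph below are all made up for illustration and are not taken from any SSB client.

```python
from collections import deque


def feeds_to_replicate(follows: dict, start: str, max_hops: int) -> set:
    """Breadth-first walk of the follow graph, up to max_hops from start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        feed, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for followed in follows.get(feed, ()):
            if followed not in seen:
                seen.add(followed)
                queue.append((followed, hops + 1))
    return seen - {start}


# Hypothetical graph: "me" follows one pub, the pub follows its users,
# and each user follows a few more feeds.
follows = {
    "me": ["pub"],
    "pub": ["u1", "u2", "u3"],
    "u1": ["a", "b"],
    "u2": ["b", "c"],
    "u3": ["d"],
}

one_hop = feeds_to_replicate(follows, "me", 1)
two_hops = feeds_to_replicate(follows, "me", 2)
three_hops = feeds_to_replicate(follows, "me", 3)
```

                        Even in this tiny graph the set grows from 1 feed to 4 to 8 as the hop limit increases; with real pubs following hundreds of users, each extra hop multiplies the data pulled in.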

                      2. 4

                        Reading the previous discussions shows a degree of frustration and scepticism. I’m sure it’s changed, and many of the questions will have been addressed, but the previous discussions are unconvincing.

                        Here are some links to some of them … it’s worth reading them in context:

                        https://lobste.rs/s/rgce6h/manyverse_mobile_scuttlebutt_client

                        https://lobste.rs/s/hncoad/secure_scuttlebutt

                        https://lobste.rs/s/l9vqm4/scuttlebutt_protocol_guide

                        https://lobste.rs/s/xe2z2r/scuttlebutt_off_grid_social_network

                      3. 1

                        Despite the fact it’s in a ‘junkyard’, I believe this issue remains unresolved, which I believe effectively means that:

                        1. Scuttlebutt is not cross-platform and only works on x86
                        2. It’s difficult to implement scuttlebutt libraries and clients in other languages

                        Limiting development to people who like nodejs, and limiting usage to x86 devices (when it seems like the sort of concept that should work well with mobile devices) massively reduces its appeal.

                        I would be happy to find that I’m wrong.

                        1. 3

                          You’re off on this one. I’m running SSB on ARM64 just fine, and many pubs are actually just Raspberry Pis in some closet.

                          It is still difficult to implement SSB in other languages, mostly because of the amount of work rather than technical challenges. The mistakes of the past are well understood at this point, even if not fixed. At the moment there are 2 Rust-based implementations and one based in Go. IIRC there are also implementations in Elixir and Haskell, but I am not sure how far along they are. I’ve toyed with a client mixing C and Lua (just a toy prototype, but it worked).

                          1. 2

                            It definitely didn’t work on arm last time I tried, so it’s good to hear they’re making some progress. It was the prototype Haskell implementation which pointed to that issue as a blocker: it looks like it hasn’t been updated since, so probably doesn’t work.

                            1. 2

                              I know it is not what you’re looking for but the JS implementation works fine under Linux/ARM64, which is also how the mobile apps are running.

                              My daily driver is a Surface Pro X with an ARM64 CPU. I’ve run go-ssb as a native ARM32 binary and the Electron-based client apps under Win32 x86 32-bit emulation. The reason for it is just that, for the love of all that is sacred, I can’t find out how to build the native nodejs modules as ARM64 binaries.

                            2. 1

                              Could you link to the Haskell implementation? I’d be interested in working on that!

                        2. 4

                          I would love to hear about all the non-JS implementations and their status.

                          1. 2

                            The Rust and Go implementations are coming along very nicely, and there’s an iOS app that uses the Go implementation internally. I’ve also been experimenting with a lightweight implementation in Python, but it’s not really ‘done’.

                          2. 3

                            This is the first time I’ve heard of this, but this puts me off:

                            The first time you join the network there will be a lot to download and process. For real, this “initial syncing” process can take up to an hour and use a couple of gigabytes.

                            It’s reminiscent of running a bitcoin node…

                            1. 1

                              True, but “have every post from all of the people you follow” is impossible unless you download all of their posts. You aren’t downloading the entire network, just the people whose content you want to download.

                              1. 1

                                Can you only keep the most recent (say) 6 weeks worth of activity? That’d be workable.

                                1. 1

                                  Yes, but the reference implementation doesn’t do that. Like Git, it’s absolutely possible to keep shallow clones without violating any part of the protocol.
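
                                  A sketch of what such pruning could look like on the client side, assuming each stored message carries a sequence number and a timestamp. The `prune_feed` helper, the message layout and the dates are hypothetical; the reference implementation does not do this.

```python
from datetime import datetime, timedelta, timezone


def prune_feed(messages, now, keep=timedelta(weeks=6)):
    """Keep messages newer than the cutoff, or at least the newest one.

    `messages` is a list of dicts sorted by ascending "sequence".
    """
    if not messages:
        return []
    cutoff = now - keep
    recent = [m for m in messages if m["timestamp"] >= cutoff]
    # Always retain the latest message so there is a point to resume from.
    return recent or [messages[-1]]


# Hypothetical feed with made-up dates.
now = datetime(2020, 6, 1, tzinfo=timezone.utc)
feed = [
    {"sequence": 1, "timestamp": now - timedelta(weeks=30)},
    {"sequence": 2, "timestamp": now - timedelta(weeks=10)},
    {"sequence": 3, "timestamp": now - timedelta(weeks=4)},
    {"sequence": 4, "timestamp": now - timedelta(days=1)},
]

kept = prune_feed(feed, now)
kept_sequences = [m["sequence"] for m in kept]

# Replication can resume by asking peers for everything after this sequence.
resume_after = kept[-1]["sequence"]
```

                                  The trade-off in this sketch is that a pruned client cannot re-verify the dropped part of the chain or serve it to other peers.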

                            2. 1

                              I like the idea. I tried Manyverse the other day on my phone but it was a bit laggy (and heavy on the battery, I guess), for example when I follow someone.
