1. 62
    1. 16

      I’m unconvinced by the section on relays. Surely relays function as something of a central point of failure, who can ban whatever users they wish, either because of legal pressures (not a lawyer, but pretty sure it’s illegal in most countries to distribute libel, pirated media or CSAM) or simply because the people maintaining the relay wish not to be complicit in the distribution of certain content. (I’m not saying such bans are a bad thing - in fact I’d view a relay that refused to do so with a great deal of suspicion! But ultimately moderation is subjective, and it’s likely that the relay operators will make the wrong call, at least sometimes.)

      And because full-network relays are expensive to run, we end up with a situation where you have, at best, a choice of a few large entities as your moderator. And once we get past there being one relay operated by BlueSky Inc, all the “usability fun” of federation creeps back in, with users needing to be aware of - or configuring - the relay(s) their various AppFeeds consume.

      On the PDS side, it’s not clear to me if they intend for there to be a standardized way for relays to find PDSes (or vice versa). But this certainly doesn’t exist yet: BlueSky Inc’s relay has just started accepting posts from third-party PDSes, but the process for signing up involves asking nicely on a Discord server(!) and a limit of 10 accounts per non-BlueSky PDS.

      1. 1

        On the PDS side, it’s not clear to me if they intend for there to be a standardized way for relays to find PDSes (or vice versa).

        From the docs:

        Each PDS serves a stream of all of the activity on the repos it is responsible for. From there, relays aggregate the streams of any PDS who requests it into a single unified stream.

        https://docs.bsky.app/docs/advanced-guides/firehose
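A toy model of the aggregation described in that quote, assuming each PDS stream is already ordered by a `time` field. The event shape and the `relay_seq` counter are illustrative assumptions, not the real firehose format; a real relay consumes live streams rather than finished lists.

```python
import heapq
from typing import Iterable, Iterator


def aggregate(*pds_streams: Iterable[dict]) -> Iterator[dict]:
    """Merge per-PDS event streams by timestamp and number them relay-side."""
    merged = heapq.merge(*pds_streams, key=lambda event: event["time"])
    for seq, event in enumerate(merged, start=1):
        # Copy the event so the PDS's own record is left untouched.
        yield {**event, "relay_seq": seq}


# Hypothetical streams from two PDSes.
pds_a = [
    {"time": 1, "repo": "alice", "op": "create"},
    {"time": 4, "repo": "alice", "op": "delete"},
]
pds_b = [{"time": 2, "repo": "bob", "op": "create"}]

firehose = list(aggregate(pds_a, pds_b))
```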

        And once we get past there being one relay operated by BlueSky Inc, all the “usability fun” of federation creeps back in, with users needing to be aware of - or configuring - the relay(s) their various AppFeeds consume.

        Not sure if you’re referring to AppViews or Feeds, but at least in the case of the latter:

        1. All that’s needed is for the Feed provider to choose a less restrictive relay. Which if you were providing, say, a piracy related feed, would be very likely.
        2. Subscribing to a Feed has similar UX to playlists. Find it from the owner’s profile. Or find it from search. Regardless, it’s a button click.
        1. 6

          From there, relays aggregate the streams of any PDS who requests it into a single unified stream.

          Right, so the onus is on the PDS to find all the relays they care about and ask the relay nicely - although it seems like there’s at least provision for “ask nicely” to be a self-serve API.

          All that’s needed is for the Feed provider to choose a less restrictive relay.

          But that “less restrictive relay” will be missing posts from PDSs who only bother to ask for inclusion on the official bsky.network relay, no? The problem isn’t how difficult it is to subscribe to a feed - sounds like that’s fine - the problem is the inherent complexity in the fact that each feed might be looking at a different relay. So a user could be subscribed to the “posts from users I follow, chronologically” feed, the “posts from users I follow in the past 24 hours, sorted by likes” feed, and the “posts from users I follow, sorted by some nebulous algorithm” feed, and all these feeds would have slightly different sets of posts because some of the user’s friends are on a PDS that hasn’t kept on top of every new relay that pops up.

          …or in practice everyone just uses the bsky.network relay, because it’s too complicated/expensive to stand up a new relay that gets a useful subset of the network, and it isn’t really decentralized at all.

          1. 3

            There’s not a 1-1 requirement of feed to relay, as far as I know. A Feed provider could combine the less restrictive relay feed, and the more restrictive one, and present both. Deduplicating is easy since the same post looks the same no matter which relay it’s coming from.
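A minimal sketch of that deduplication, assuming every post carries a stable identifier (a hypothetical `uri` field here) that is the same whichever relay delivered it. The relay contents and field names are made up for illustration.

```python
from typing import Iterable


def merge_relay_feeds(*feeds: Iterable[dict]) -> list[dict]:
    """Combine posts from several relays, keeping the first copy of each URI."""
    seen: set[str] = set()
    merged: list[dict] = []
    for feed in feeds:
        for post in feed:
            if post["uri"] not in seen:
                seen.add(post["uri"])
                merged.append(post)
    return merged


# Hypothetical data: each relay is missing a post the other one has.
restrictive = [
    {"uri": "at://alice/post/1", "text": "hello"},
    {"uri": "at://bob/post/7", "text": "hi"},
]
permissive = [
    {"uri": "at://alice/post/1", "text": "hello"},
    {"uri": "at://carol/post/3", "text": "ahoy"},
]

combined = merge_relay_feeds(restrictive, permissive)
```

The merged feed is only as complete as the union of its upstreams, so this addresses duplicates but not posts that no subscribed relay carries.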

            …or in practice everyone just uses the bsky.network relay, because it’s too complicated/expensive to stand up a new relay that gets a useful subset of the network, and it isn’t really decentralized at all.

            This is certainly a possibility! Time will tell.

            1. 2

              Right, so the onus is on the PDS to find all the relays they care about and ask the relay nicely

              You only need to advertise to one. The other relays can just watch each other for new PDSes. com.atproto.sync.* doesn’t provide for any way for a PDS to distinguish one crawler from another.
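A rough sketch of how one relay might learn about PDSes by watching a peer, assuming the peer's stream exposes the DIDs it has seen and that a DID can be resolved to its PDS host. The `did_to_pds` dictionary stands in for real DID resolution, and every name below is hypothetical.

```python
def discover_new_hosts(
    known_hosts: set[str],
    peer_event_dids: list[str],
    did_to_pds: dict[str, str],
) -> set[str]:
    """Return PDS hosts seen via a peer relay that we are not crawling yet."""
    new_hosts: set[str] = set()
    for did in peer_event_dids:
        host = did_to_pds.get(did)  # stand-in for resolving the DID document
        if host is not None and host not in known_hosts:
            new_hosts.add(host)
    return new_hosts


# Hypothetical inputs.
known = {"pds.one.example"}
dids_seen_on_peer = ["did:example:alice", "did:example:bob", "did:example:zed"]
resolver = {
    "did:example:alice": "pds.one.example",
    "did:example:bob": "pds.two.example",
}

to_crawl = discover_new_hosts(known, dids_seen_on_peer, resolver)
```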

              the fact that each feed might be looking at a different relay

              That’s a good point. A user may be unconfident in how complete their upstreams are, and there may be no easy way to check.

          2. 1

            [a bit tangential] With my limited knowledge, I thought ActivityPub / Mastodon suffers from scaling issues. Why can’t they use something like relays?

            1. 3

              ActivityPub does use a similar technique; for receiving, it’s called “shared inboxes”, where all users hosted on a single domain can receive posts at the same time. For sending, it’s called “relays”, which allow one server to become aware of all public posts from one or more other servers, no matter the following relationships there. Both have existed for a while.
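A small sketch of the receiving side, assuming follower actors are available as dictionaries with the standard `inbox` property and the optional `endpoints.sharedInbox` property. The actor data is invented for illustration.

```python
def delivery_targets(followers: list[dict]) -> set[str]:
    """Pick one endpoint per follower, preferring the server-wide shared inbox."""
    targets: set[str] = set()
    for actor in followers:
        shared = actor.get("endpoints", {}).get("sharedInbox")
        # Fall back to the personal inbox when no shared inbox is advertised.
        targets.add(shared or actor["inbox"])
    return targets


# Hypothetical followers: two on the same server, one elsewhere.
followers = [
    {"inbox": "https://big.example/users/a/inbox",
     "endpoints": {"sharedInbox": "https://big.example/inbox"}},
    {"inbox": "https://big.example/users/b/inbox",
     "endpoints": {"sharedInbox": "https://big.example/inbox"}},
    {"inbox": "https://small.example/users/c/inbox"},
]

targets = delivery_targets(followers)
```

Three followers collapse into two deliveries here, which is where the saving comes from on servers with many local followers.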

              1. 1

                I see. Why do I hear people saying that ActivityPub cannot scale? Is that a general consensus?

                1. 4

                  I certainly don’t believe that’s the case. I run an ActivityPub server that federates with >1500 other instances on a $12 DigitalOcean droplet, and it only costs that much because I run some bots and such alongside it.

          3. 9

            Minor nit, @steveklabnik: “It is real, meaningful account portability, and that is radically different from any similar service running today.”

            One of the protocols Jay reviewed before being put in charge of bluesky was Peergos [0], and one of the key features of peergos is full, automatic account portability, maintaining data, links, friends and identity. There are also a lot of similarities between our PKI which is like a merkleised key transparency log and did:plc.

            [0] https://book.peergos.org

            1. 3

              Ah thank you! I had never heard of Peergos before. I’ll amend the post with a reference to this later today.

            2. 6

              This is very good. I’ve been wanting a tech overview in this level of detail for a while now. I’m suddenly very curious what other distributed services might sit well on top of ATP - will there be a ForgeFed competitor?

              It also is currently 100% public, there are no private messages or similar. The reasons for this is that achieving private things in a federated system is very tricky, and they would rather get it right than ship something with serious caveats.

              This is very wise, wiser than Mastodon rushing ahead and doing it anyway IMO. (GNU Social previously avoided such a feature too.)

              There was only one thing in the post I thought was really missing, and that’s how custody of private keys worked, especially if your PDS goes down. I found the answer in the official guide.

              Each DID document publishes two public keys: a signing key and a recovery key. […] The signing key is entrusted to the PDS so that it can manage the user’s data, but the recovery key is saved by the user, e.g. as a paper key. This makes it possible for the user to update their account to a new PDS without the original host’s help.
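A toy model of that custody split, with no real cryptography: keys are plain strings and "authorization" is a string comparison, which is an assumption made purely for illustration. Real did:plc updates are signed operations, and all names below are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DidDocument:
    did: str
    signing_key: str   # entrusted to the PDS
    recovery_key: str  # kept by the user, e.g. as a paper key
    pds_endpoint: str


def move_account(
    doc: DidDocument, new_endpoint: str, new_signing_key: str, authorized_by: str
) -> DidDocument:
    """Point the DID at a new PDS if the request comes from a listed key."""
    if authorized_by not in (doc.signing_key, doc.recovery_key):
        raise PermissionError("update not authorized by a key in the DID document")
    return replace(doc, pds_endpoint=new_endpoint, signing_key=new_signing_key)


doc = DidDocument(
    did="did:example:alice",
    signing_key="old-pds-key",
    recovery_key="paper-key",
    pds_endpoint="https://old-pds.example",
)

# The old PDS is gone, so the user authorizes the move with the recovery key.
moved = move_account(
    doc, "https://new-pds.example", "new-pds-key", authorized_by="paper-key"
)
```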

              1. 3

                I’m suddenly very curious what other distributed services might sit well on top of ATP - will there be a ForgeFed competitor?

                I personally am interested in developing a Lexicon that serves a similar purpose as RSS. Maybe forums. We’ll see. I am enthusiastic about this stuff but also tired.

                1. 2

                  I’m not sure I see how it would be less bandwidth or server load than RSS. There are peer-to-peer systems which can amortize bandwidth costs, particularly Bittorrent, Freenet, and Veilid; they all have the feature that nearby peers can stochastically provide popular content-addressed objects as-is with a nickname lookup. For BlueSky or Nostr, it’s not clear whether that would actually be an overall savings, or merely a transfer of responsibilities from the original RSS publisher to the relay operators.

                  1. 1

                    I’m not saying I believe it to be a good idea, only that I am interested in playing around with it :)

              2. 3

                What’s bluesky’s plan to make a profit?

                1. 4

                  Here’s the first thing they’ve done: https://bsky.social/about/blog/7-05-2023-business-plan

                  They’ve described it as a “services-led business model,” and said they won’t be doing ads. We’ll see!

                  1. 3

                    Seems promising. Although I don’t know how you can convince a bunch of VC investors to give you a lot of money unless they’re expecting some kind of outsized return on their money. The investors at least are probably banking on the fact that these “business model ethics” are squishy and can be revised later.

                    I think we would have a lot of prediction power if we knew who has controlling shares in the company.

                    1. 1

                      Yeah, I hear you. On one hand, they’re a public benefit corporation, but on the other hand, I know that doesn’t mean that much.

                      1. 2

                        Anyway, thanks for the writeup. It seems like a worthy experiment. Maybe a carefully designed technology can resist possible centralization attempts.

                2. 3

                  For me, this document raises the “Pixiv problem”. That is to say, even assuming every PDS is okay with hosting any content they are legally permitted to, some content is legal in some jurisdictions and not others.

                  More specifically, from reading this and the ATProto docs, my understanding is that PDSes host (that is, both store and serve) a user’s repository (that is, their posts). Based on this section:

                  If your PDS goes down, and you want to migrate to a new one, there’s a way to backfill the contents of the PDS from the network itself, and inform the network that your PDS has moved. It is real, meaningful account portability, and that is radically different from any similar service running today.

                  It seems to me that, in accepting a new user, a PDS accepts responsibility for hosting everything that user has ever posted, both in legal terms and in terms of storage space and bandwidth. What if that user and their previous PDS were in a jurisdiction where, say, lolicon is legal, but the new PDS is not? This is the basic reason that most ActivityPub implementations don’t port posts upon receiving the Move activity. How does BlueSky handle this case?

                  Edit: I also think that this line about the did:plc scheme is a bit disingenuous:

                  I personally see this as an example of pragmatically shipping something, others see it as a nefarious plot. You’ll have to decide for yourself.

                  I think the opinion of most folks who are bsky-shy is neither of these; or, rather, it’s an example of pragmatically shipping something and then acting like its replacement is already here. If we are going to evaluate BlueSky on how DIDs might work in the future, we also have to agree to evaluate other distributed solutions on their promises rather than their realities.

                  1. 3

                    This is the basic reason that most ActivityPub implementations don’t port posts upon receiving the Move activity.

                    I think this is a stretch. Object identifiers in ActivityPub are either null or “publicly dereferencable URIs, such as HTTPS URIs, with their authority belonging to that of their originating server” (0).

                    While some implementations (like honk) readily permit importing posts from data exports, these can’t fully assume the old posts’ identities and are unable to migrate likes/boosts/replies from other servers. Notably, ActivityPub’s assumption of invariant object identifiers prevented the recently shut down queer.af instance from simply adopting a new domain outwith the .af TLD (1).
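A short sketch of why an imported post cannot keep its old identity, assuming identifiers are HTTPS URIs whose authority is the originating server, as in the quoted rule. The hostnames and paths are invented.

```python
from urllib.parse import urlsplit


def id_belongs_to(object_id: str, server_host: str) -> bool:
    """True if the object ID is an HTTPS URI under the given server's authority."""
    parts = urlsplit(object_id)
    return parts.scheme == "https" and parts.hostname == server_host


old_post_id = "https://old.example/users/alice/statuses/1"

# The original server can vouch for the ID; the importing server cannot,
# so it has to mint a fresh ID and loses inbound likes/boosts/replies.
owned_by_old = id_belongs_to(old_post_id, "old.example")
owned_by_new = id_belongs_to(old_post_id, "new.example")
```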

                    What if that user and their previous PDS were in a jurisdiction where, say, lolicon is legal, but the new PDS is not? […] How does BlueSky handle this case?

                    Bluesky encodes attachments to posts as a blob Type (2): these aren’t directly stored in user repositories, instead just a reference to the blob’s CID is used (3).

                    Where illegal content is uploaded to a PDS as a blob, the PDS can refuse to serve the blob without otherwise manipulating the user’s repository (4). Blobs can be taken down by PDS admins through the com.atproto.admin.updateSubjectStatus XRPC method, eventually landing in the ModerationService’s takedownBlob.

                    tl;dr: new PDS is empowered to enforce local laws by refusing to serve problematic blobs (or, if needed, the entire repository).

                    For better (~queer.af type situations) or worse (~lolicon), moving to a third PDS would allow a user to restore blobs that have been taken down by the second PDS by reuploading them bit-for-bit (from a backup, archive, …) while preserving the CID.
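A toy illustration of the blob behaviour described above, assuming a plain SHA-256 hex digest as a stand-in for a real CID. `ToyPds` and its methods are hypothetical and are not the atproto admin API.

```python
import hashlib
from typing import Optional


def content_id(data: bytes) -> str:
    """Stand-in for a CID: identical bytes always give the same identifier."""
    return hashlib.sha256(data).hexdigest()


class ToyPds:
    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}
        self._taken_down: set[str] = set()

    def upload(self, data: bytes) -> str:
        cid = content_id(data)
        self._blobs[cid] = data
        return cid

    def take_down(self, cid: str) -> None:
        # The repository record that references the CID is left untouched.
        self._taken_down.add(cid)

    def serve(self, cid: str) -> Optional[bytes]:
        if cid in self._taken_down:
            return None
        return self._blobs.get(cid)


image = b"\x89PNG...example bytes"
second_pds, third_pds = ToyPds(), ToyPds()

cid = second_pds.upload(image)
record = {"text": "look at this", "blob_cid": cid}  # reference only, no bytes
second_pds.take_down(cid)

# Re-uploading the same bytes elsewhere yields the same identifier.
restored_cid = third_pds.upload(image)
```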

                    1. 3

                      I think this response conflates identity with content, which is common in these discussions, but I want to be sure we separate these things. First of all, it is absolutely true that ActivityPub relies on DNS as its identity system. I personally think we can take better advantage of that [1], but that’s a core aspect of the network. If your instance loses ownership over its domain, your identity is lost, unless you can issue a Move activity before that happens.

                      In the case of queer.af, many users did Move to other instances, preserving most or all of their social graph automatically. In theory - and that’s the realm we’re operating in, because nobody has yet migrated the entire userbase of a PDS under adverse conditions - Erin could have spun up a new instance at queeraf.othertld, generated profiles and accounts for all the users from queer.af who hadn’t yet Moved, and issued Move activities for their accounts, moving everyone there and preserving their identities. In this way, ActivityPub and ATProto provide a similar level of protection for identity and social graphs.
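A sketch of what such a Move could look like, using the ActivityStreams `Move` type with `actor`, `object` and `target`. The acceptance check against the target's `alsoKnownAs` list reflects my understanding of how Mastodon guards against hijacked moves and should be read as an assumption; the URLs are invented.

```python
def build_move(old_actor: str, new_actor: str) -> dict:
    """Build a Move activity announcing that old_actor is now new_actor."""
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Move",
        "actor": old_actor,
        "object": old_actor,
        "target": new_actor,
    }


def accept_move(activity: dict, target_profile: dict) -> bool:
    """Accept only if the new account links back to the old one."""
    return (
        activity.get("type") == "Move"
        and activity.get("actor") == activity.get("object")
        and activity["object"] in target_profile.get("alsoKnownAs", [])
    )


old = "https://old.example/users/alice"
new = "https://new.example/users/alice"
move = build_move(old, new)

linked_profile = {"id": new, "alsoKnownAs": [old]}
unlinked_profile = {"id": new}
```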

                      Where they differ is in trust. In ATProto, your DID is tied to either the DNS system or BlueSky-the-company (and, in future, maybe to key material you create and own). In ActivityPub, your identity is tied to the instance you’re using. ActivityPub defends you better against a single company being able to destroy your identity; ATProto defends you better against a compromised or destroyed PDS.

                      So much for identity; on to content. You assert:

                      While some implementations (like honk) readily permit importing posts from data exports, these can’t fully assume the old posts’ identities and are unable to migrate likes/boosts/replies from other servers.

                      This is true, but I don’t think it’s really… that important? There are two cases where this matters: metrics and links.

                      Let’s discuss links first. In both ATProto and ActivityPub-as-implemented, links are mostly tied to a particular service, and can’t move around if that service goes down. In ATProto, that’s the Application, [3] whereas in ActivityPub, the data’s owner (like the PDS) and the canonical web view into it are bundled. [2] In both cases, if the service fronting the links goes down, the links become ~impossible to access, as time tends towards infinity; neither design really solves this problem.

                      For metrics, it’s certainly true that ATProto solves this problem where ActivityPub doesn’t. I think this is an example of a tradeoff that ActivityPub makes fairly intentionally; it’s not possible to reconstruct the like count of a post from observing the network, because that would make likes public, which is not acceptable in AP’s privacy model. I prefer AP’s solution; other people prefer ATProto’s solution; that’s fine.
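A minimal sketch of the tradeoff, assuming likes are public records that any observer of the network can collect. `app.bsky.feed.like` is the like collection's name; the flattened record shape (`author`, `subject`) is a simplification for illustration.

```python
from collections import Counter


def count_likes(records: list[dict]) -> Counter:
    """Tally likes per post, counting each (author, post) pair once."""
    unique_likes = {
        (record["author"], record["subject"])
        for record in records
        if record["collection"] == "app.bsky.feed.like"
    }
    return Counter(subject for _, subject in unique_likes)


# Hypothetical records observed on the network.
observed = [
    {"collection": "app.bsky.feed.like", "author": "bob", "subject": "at://alice/post/1"},
    {"collection": "app.bsky.feed.like", "author": "carol", "subject": "at://alice/post/1"},
    {"collection": "app.bsky.feed.like", "author": "bob", "subject": "at://alice/post/1"},
    {"collection": "app.bsky.feed.post", "author": "bob", "subject": "at://bob/post/9"},
]

likes = count_likes(observed)
```

The same data that lets a new host rebuild the count also tells any observer who liked what, which is the privacy cost mentioned above.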

                      But there’s another point, and it’s the main thing I don’t think the ATProto folks have really thought through. There’s actually no technical reason you couldn’t reconstitute posts as you move an account from one ActivityPub server to the next; as you say, honk permits this. My assertion in the GP was that the reason Mastodon, Akkoma, and GoToSocial don’t do this is because it makes moderation hard. Your response was:

                      Where illegal content is uploaded to a PDS as a blob, the PDS can refuse to serve the blob without otherwise manipulating the user’s repository (4). Blobs can be taken down by PDS admins through the com.atproto.admin.updateSubjectStatus XRPC method, eventually landing in the ModerationService’s takedownBlob.

                      This is, of course, true of ActivityPub as well. Taking down posts, or just media from posts, is not difficult on existing ActivityPub servers. One could imagine building automated workflows to do this during account content moves, even. But that is difficult; either it’s a time-consuming and soul-draining manual process of checking possibly thousands of posts for content that is illegal or against the {instance, PDS’s} rules, or an extremely error-prone automated one with user frustration and possible legal consequences if your automation fails.

                      At a protocol functionality level, the only difference between ActivityPub and ATProto here is that ATProto preserves the ability to reconstitute post metrics across PDS moves, which is, frankly, a pretty small win, in my opinion, for the resulting privacy issues, as well as putting the vast majority of the userbase’s “distributed” identities (as well as the permission to stand up your own PDS!) in the hands of a venture capital-backed startup.

                      1: https://github.com/mastodon/mastodon/issues/24760 2: That said, even in ActivityPub, it’s possible to access a post on the web through a third party; In the queer.af case, many queer.af posts remain cached on other instances, and looking them up by their previous unique ID (their URL) works fine, just as it would in ATProto. 3: I anticipate a protestation that the Application link would still work if the PDS goes down or moves. That’s true, but it only matters if we imagine that Application servers are more stable, over all, than PDSes. Why would that be the case?

                      1. 2

                        the reason Mastodon, Akkoma, and GoToSocial don’t do this

                        Another reason is that they’re generally hard-coded around a “post arrives, send post out” workflow - backfilling would require logic around “if this post is backdated[1], do not send it out”. It’s not impossible[2] but it does require a whole bunch of “who can do this?”, “what are the constraints?” etc. considerations.

                        [1] Which the MastoAPI doesn’t provide support for anyway. [2] I locally-patched my Akkoma to do this when backfilling 10 years of Twitter bot content.
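A bare-bones sketch of the "if this post is backdated, do not send it out" check, assuming a fixed cutoff between a post's claimed publication time and the time the server received it. The one-hour threshold and the function name are arbitrary choices for illustration, not taken from any existing server.

```python
from datetime import datetime, timedelta, timezone

# Assumed cutoff: anything older than this on arrival counts as a backfill.
BACKDATE_THRESHOLD = timedelta(hours=1)


def should_federate(published: datetime, received: datetime) -> bool:
    """Send a post out only if it was published close to when it arrived."""
    return received - published <= BACKDATE_THRESHOLD


now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
fresh_post = now - timedelta(minutes=5)
imported_post = datetime(2014, 6, 1, 9, 30, tzinfo=timezone.utc)

send_fresh = should_federate(fresh_post, now)
send_imported = should_federate(imported_post, now)
```

This covers only the delivery decision; the "who can do this?" and "what are the constraints?" questions would still need their own policy.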

                        1. 2

                          That’s true. As you say, though, this is not an inherent limitation, just something most people don’t really care to change, since they don’t want to import posts anyway.

                    2. 2

                      I understand this perspective, but I think it only really works if you see the current state as an end point, and not a start. I haven’t heard the devs literally ever invoke Pixiv, but the Japanese contingent on BlueSky is large. They’ve demonstrated that they understand issues like revenge porn, CSAM, and aren’t interested in trying to make such content available forever. This stuff isn’t simple in a federated system.

                      I also think that this line about the did:plc scheme is a bit disingenuous:

                      I don’t intend to be disingenuous. I hear you that like, a lot of folks who are bsky positive tend to think about it aspirationally, that is, what it could be, and not what it is today. I think that early on in life, that stance makes sense, but also appreciate that “wait and see” isn’t good enough for everyone. That’s fine.

                      To me, when I think about systems, I want to know what futures they make possible, and which they make impossible. This specific decision doesn’t preclude other possibilities, if you dislike how did:plc works, and that’s important, to me at least. And I have seen the team act in repeatedly good ways, and so I tend to view them charitably, whereas some people are just inherently suspicious. It’s up to them to prove that our good faith is well founded. All I meant was that it’s fine with me if you don’t want to hear it and prefer to evaluate it more harshly.

                      1. 12

                        I think you know I respect you a lot. The way you describe ATProto and the things you see in its future are really exciting, and I want to share your optimistic vision. But to me, it feels like a lot of people - yourself included - are taking an optimistic view of ATProto despite having had a historically pessimistic view of, especially, ActivityPub.

                        ActivityPub has problems, but it also does a lot of really amazing things without any money and while being dramatically less centralized than BlueSky is in practice. I wish we could see what ActivityPub would look like if cutting-edge developers like Tobi were supported by an 8 million dollar seed round, rather than a bunch of nerds giving them $60 a month. I wish we could see what Mastodon itself could look like if it had 8 million dollars to start out with rather than, at a stretch, two million Euros ever.

                        In particular, BlueSky is marketing itself in an inherently disingenuous way. They published a paper about ATProto being “Usable Decentralized Social Media” before it was meaningfully decentralized; it still isn’t, really, since it’s not possible to run your own PDS without Bluesky-the-company’s blessing, or your own Relay. That makes me pretty skeptical that the BlueSky team is going to do a lot of work to make it truly decentralized, and I freely admit, a little bitter. We’ve been doing decentralized social media in the {StatusNet, OStatus, ActivityPub, …} network for, depending on how you figure it, a decade; why does this half-baked Twitter clone get the spotlight?

                        There are many problems with ActivityPub; but for all its faults, ActivityPub was born decentralized. It has dozens of server implementations, of which perhaps a dozen are usable in production, and hundreds of clients. I myself use a non-Fediverse ActivityPub based network, as well as having an account on the wider Fediverse, using the same software to the same ends but without providing Mastodon gGmbH anything, even content. That’s simply not possible with ATProto and its related software right now. I also don’t see anything in the way of implementation efforts for third-party PDS and relay software (though it’s possible that they just exist outside my bubble.)

                        The OP perpetuates the same issue elsewhere, too:

                        There’s no “BlueSky server,” there’s just servers running atproto distributing messages to each other, both BlueSky messages and whatever other messages from whatever other applications people create.

                        As far as I understand, that’s not true, unless we also say there is “no Facebook server” because facebook.com is a distributed system, or that there was no WhatsApp server when they used XMPP internally. There is a BlueSky server - it’s the Relay and the PDSes they own, and they control access completely.

                        This [feeds] to me is one of the killer features of BlueSky over other microblogging tools: total user choice. If I want to make my own algorithm, I can do so. And I can share them easily with others. If you use BlueSky, you can visit any of those feeds and follow them too.

                        Total user choice - as long as you’re not choosing to run a PDS Bluesky-the-company doesn’t want to federate with. Even beyond that, it’s hard for me to imagine that in a future with, say, thousands of PDSes and hundreds or dozens of Relays, no Application blocks a Relay, and no Relay blocks a PDS. If the response is that this is solved by combining the feeds of multiple Relays, well, that’s entirely possible in other microblogging tools, and it’s not widely implemented in ATProto yet either, because it doesn’t need to be, because… it’s all run by one company.

                        If what you really want is total user choice, I don’t think we can stop at federation; I think we have to build truly P2P social media. ActivityPub ain’t that, but neither is Bluesky.

                        1. 3

                          I do, and I appreciate the length. I feel we’re going back and forth here, so I’ll probably drop this thread after this, but I do think there’s one part here that’s illustrative of this dichotomy we’re on the opposite ends of:

                          it still isn’t, really, since it’s not possible to run your own PDS without Bluesky-the-company’s blessing, or your own Relay.

                          I can see that. From my perspective:

                          1. The team says “here’s our design for federation, it’ll come, but it’s difficult.”
                          2. The team refactors the internal codebase to run a database per user, in preparation for real federation
                          3. The team runs their own servers, tests out federation.
                          4. The team opens up federation, with the above caveats.

                          We are here. Which leads you to say

                          That makes me pretty skeptical that the BlueSky team is going to do a lot of work to make it truly decentralized

                          But to me, what I see, is (these are their direct words): “Here, in this first phase of federation, you can file a request for your self-hosted PDS to be crawled and added to the production federated network. This is an early access phase as federation rolls out. In the next phase, you will not need to file a request through Bluesky.” I also see a team who has consistently promised to move towards openness. One of their mottos is “the company is a future adversary.” They are executing a plan, and continue to execute on that plan.

                          So, that’s why I’m not with you here: it feels like you’re reading bad faith into a slow and thoughtful rollout process of a core feature of the entire project.

                          That said, I do think your point about future vs present is apt, and I’m going to be thinking about it for a while. For me though, the present of Mastodon is not fit for purpose, and that’s maybe why I am so forward looking.

                          (oh, one last thing: “I also don’t see anything in the way of implementation efforts for third-party PDS and relay software (though it’s possible that they just exist outside my bubble.)” I don’t yet either; I think that’s because the API isn’t considered to be set in stone just yet. I personally want to reimplement everything, but don’t know if I have the energy or time, and so I don’t want to begin until things are more settled. I suspect many others are in the same boat.)

                          1. 3

                            Yeah, I don’t intend to make this a further back and forth. I appreciate you reading and responding to my wall of nonsense. I genuinely do hope you’re right, but I deeply fear that you’re wrong and that we’ll be stuck with a centralized service with decentralized trappings.

                            I would also encourage you to take that forward-looking perspective into your personal evaluations of non-Mastodon ActivityPub software. There is a lot of work going on in that space, in a lot of different directions.

                          2. 2

                            I wish we could see what ActivityPub would look like if cutting-edge developers like Tobi were supported by an 8 million dollar seed round, rather than a bunch of nerds giving them $60 a month. I wish we could see what Mastodon itself could look like if it had 8 million dollars to start out with rather than, at a stretch, two million Euros ever.

                            I generally agree with your logic but I think this part overstates the value of money. I think 1 personally-motivated programmer beats 10 money-motivated programmers in productivity and quality of output the majority of the time. I think anyone who is personally-motivated to work on non-centralized micro-blogging platforms would be happy with a modest pay, so long as they can live in comfort and with security.

                            I think we have to build truly P2P social media.

                            What do you mean exactly by P2P social media? I know your intent is something non-federated but what does that look like in practice? Do you mean every user has a private key and the client is all you need to disseminate messages?

                            1. 4

                              I think anyone who is personally-motivated to work on non-centralized micro-blogging platforms would be happy with a modest pay, so long as they can live in comfort and with security.

                              Yep, absolutely agreed. I can give you names of about a dozen people I’d hire if Nora’s Acme ActivityPub LLC got funding today, and I’d bet you anything that we’d eclipse BlueSky in a year or less.

                              What do you mean exactly by P2P social media?

                              Honestly? I don’t know. I’ve written a bit about it in the past, but the core of my design indecision is that most people - even the kind of people who are willing to self-host an ActivityPub server - don’t want to do PKI, and can’t be relied upon not to fuck it up. I say this as someone who semi-regularly uses PGP-encrypted email.

                              I do have a vision for a truly P2P design that would probably work pretty well if the RSA fairy came along and solved PKI for us, though.

                      2. 2

                        Happy to see more research in federated apps and protocols. I’m really curious about that part.

                        Is there a site with a list of the various servers (relays?) that you can signup on?

                        1. 2

                          How likely is it that a company will use atproto when ActivityPub seems to be gaining momentum with Threads adopting it?

                          Not currently. Federating your PDS was only turned on a few days ago, and they’re doing it slowly. To quote, “Account migration is enabled, but we recommend that you do not migrate your main account yet and treat this as an experimental phase!”

                          Things will get there, but it’s still extremely early.