1. 26
  1. 20

    Inadequate search is my top misfeature of Fediverse. With virtually non-existent discoverability within Fediverse I just don’t have a good reason to use it.

    People getting upset that the stuff they post might get discovered is strange to me. The toots are out there. If one’s afraid someone might go through the whole timeline they shouldn’t have posted it in the first place. If someone’s out there to get you, your indignation won’t stop them. And if it’s possible to do it manually, it can be automated in the browser. At the moment Fediverse cannot both propagate your content and protect it from being read. I mean, people are not deleting their old toots because someone might read them but still get upset when someone reads them.

    If that level of privacy is what they’re looking for they have to accept that Fediverse is not it. Then they might start moving towards a more private solution. Until then no amount of negative emotions will get them any closer. Denying reality will not give them what they want.

    1. 7

      With virtually non-existent discoverability within Fediverse

      What does discovery look like to you on other platforms? How do you use twitter (or other platforms) to find new people/content?

      The toots are out there. If one’s afraid someone might go through the whole timeline they shouldn’t have posted it in the first place.

      There’s a difference between “here’s this thing and if you happen across it organically that’s cool” and “here’s this thing, I want you to slurp it up into your indexing service”, and that’s where they’re coming from. I am of two minds: I think overall people do want better ways to find new people to follow, but there’s another desire for only wanting to be found organically. And it turns out most of the long time users want organic interactions.

      If that level of privacy is what they’re looking for they have to accept that Fediverse is not it. […] Denying reality will not give them what they want.

      If anyone is denying reality I think it’s everybody creating the indexers. They have found the level of privacy they need, and claiming the fediverse doesn’t provide it when it clearly and explicitly does is just wrong. Just because the protocol itself doesn’t provide explicit protections you’re talking about doesn’t mean the fediverse and the people inhabiting it haven’t solved these problems through non technical means.

      To use everyone’s favorite quote, “your scientists were so preoccupied with whether they could, they didn’t stop to think if they should”. The communities that have found a home in the fediverse are simply asking all these developers writing crawlers and indexers to ask if they should.

      1. 9

        How do you use twitter (or other platforms) to find new people/content?

        On Twitter I can search for anything and quickly find people who talk about the thing I searched for. Note that Twitter provides full text search, and near real time too. So I can search not only for my favourite programming language but also for current events. The latter is especially useful because I’ve found more than a single-digit number of people who have expertise in areas I don’t, areas whose names I didn’t even know and so couldn’t have used to find them “organically”.

        Tangentially, isn’t coming by a tweet/toot in search more valuable? People are actively looking for a specific thing so their interest is probably higher than randomly stumbling upon a random toot/tweet. Anyway…

        On Mastodon, though… I can only search local instance and tags only. Not all current events or other things one might want to search for have a tag. Not all interesting people are on my instance.

        Just because the protocol itself doesn’t provide explicit protections you’re talking about doesn’t mean the fediverse and the people inhabiting it haven’t solved these problems through non technical means.

        Well… Did they? As OP pointed out people just can’t know for sure what any particular instance does. Like, if I set up my private instance and follow a bunch of people I’ll have all their toots in my db. It’s all one SQL query away and no one would even know. I’d argue that “asking nicely” not to be indexed is not a viable solution.

        The communities that have found a home in the fediverse are simply asking all these developers writing crawlers and indexers to ask if they should.

        And what would happen when any given developer would answer that they, in fact, should? Current reality of Fediverse doesn’t quite have an answer to that, does it?

        1. 2

          Tangentially, isn’t coming by a tweet/toot in search more valuable?

          I mean I think that is what people want. You find a profile by seeing something that someone else boosted, or by it being in your local or federated timelines, meaning someone in your orbit is following them, or follows someone who boosted them. The ideal (I think) being that the content filters through the network in an organic way, and not through a centralized index.

          I can only search local instance and tags only. Not all current events or other things one might want to search for have a tag. Not all interesting people are on my instance.

          This is definitely true, but I think people would argue they don’t want to be your news source for current events. If you want news, find a news account you like, or go to news websites.

          To be clear I say “people would” and similar vague noncommittal things because I am still trying to figure out my own opinions on the topic, though I do currently lean a lot towards the side of “you need to make your fancy new product explicitly opt-in”.

          Well… Did they? As OP pointed out people just can’t know for sure what any particular instance does. Like, if I set up my private instance and follow a bunch of people I’ll have all their toots in my db.

          Sure, and the entire network doesn’t operate without that. But the concern is not you individually doing those things, it’s you doing that at a mass scale with the sole intent of indexing the network. In the case of Searchtodon I think the point where the line was crossed is when it went from being “i downloaded these files to my mac and can use spotlight to search them on my own machine” to being “i created a server which will download those files for you to search, and the indexes are colocated with other users’ data”.

          And yes, you can of course do a ton of things without people knowing, and possibly do more nefarious things like sentiment analysis and account correlation, and there wouldn’t be any easy way to know. But given the community feedback, projects like this are clearly not welcome. A lot of people are fiercely opposed to this. And if you were to do something like this and keep it secret, and the greater community found out, I’m not positive anyone but a FAANG company could weather it.

          And what would happen when any given developer would answer that they, in fact, should? Current reality of Fediverse doesn’t quite have an answer to that, does it?

          I mean? Is the community reaction not actually a sign that you’re probably not going to be able to say “yes we should do this”? That same reaction is proof that they actually do have an answer: “we will fight you tooth and nail on this, until your project is untenable or until we perish”.

          1. 5

            Sure, and the entire network doesn’t operate without that. But the concern is not you individually doing those things, it’s you doing that at a mass scale with the sole intent of indexing the network. In the case of Searchtodon I think the point where the line was crossed is when it went from being “i downloaded these files to my mac and can use spotlight to search them on my own machine” to being “i created a server which will download those files for you to search, and the indexes are colocated with other users’ data”.

            … colocated with other users’ already publicly available data. I still don’t see what the hoopla is since there isn’t any difference.

            Is the community reaction not actually a sign […]

            You mention the “community” a lot in your posts—why do you consider only those who oppose fediverse search to be a part of it? What about the rest of us fedizens who actually want it? Why is their reaction the only one to be considered?

            1. 3

              I mean their reaction is the one I would give weight to because they’re the ones saying “I don’t want you doing this with my data”, while you’re asking “what about my desire to have their data used this way?”

              Yes I am saying the community a lot because from my perspective the overwhelming majority of people do not want this. The numbers on the searchtodon post are impressive for the fediverse, but I saw a ton of posts from instance admins speaking out against it, so yeah, I think it’s fair to say the community in this case, because that is my view of the network.

              And sure fine do what you want, just be prepared to be instance blocked, and make sure your instance admin knows you’re going to use a service like this, because if people who don’t want you using their data like this find out that you’re using their data, they are going to instance block you. I’m not saying it’s right or wrong, and it is something the network is prepared to accept and designed for, I’m just saying this is a known consequence, and a lot of instance admins aren’t willing to be cut off from that chunk of the network for that reason.

              At the end of the day, you’re free to do what you want with the fediverse and things published in the activitypub format. A ton of fascists have set up their own mastodon (or other ap) servers, and they have also been disconnected from the greater network because nobody wants to deal with them. The software is open source and the API is public and well defined; do what you want, but be ready to be locked out of parts of the network for it.

              This is why I keep saying people are trying to get around the social problem with a technical solution. Yes, you can do all these things. But here are the known consequences of doing it. And no technical solution can get around that.

              1. 3

                Yes I am saying the community a lot because from my perspective the overwhelming majority of people do not want this. The numbers on the searchtodon post are impressive for the fediverse, but I saw a ton of posts from instance admins speaking out against it, so yeah, I think it’s fair to say the community in this case, because that is my view of the network.

                This is not really how it works, for two reasons:

                1. Feedback has a tendency to be largely negative regardless of the actual average feelings of people, largely because it’s only those who have strong negative feelings about something who will be moved to make sufficient noise to get noticed. People who think a search function would be mildly useful might never post about it; people who think it would be really useful might make a single post. People who think it’s evil and must be fought to the bitter end will dedicate hours or days or weeks to nonstop angry posting about it. I had to unfollow someone I knew and liked IRL because they had basically turned my timeline into a sewer with the constant angry boosts and angry posts and angry threads over the search thing.
                2. The Fediverse is often more of an echo chamber than other social networks, precisely because it only shows you things from people you’ve chosen to follow and/or actively seek out. Which means that if many people in your social circle are against a thing, all it tells you is that many people in your social circle are against the thing. It tells you nothing whatsoever about what the broader “Fediverse community” thinks or feels.

                And sure fine do what you want, just be prepared to be instance blocked, and make sure your instance admin knows you’re going to use a service like this, because if people who don’t want you using their data like this find out that you’re using their data, they are going to instance block you.

                Having read quite a bit of the search drama, and also being an occasional reader of #fediblock and having seen some earlier kerfuffles like the CISA thing, I think the likelier outcome is that a relatively small number of instances are going to increasingly isolate themselves through their own aggressive defederation policies, helped along by other instance admins just getting tired of dealing with them.

                At any rate, I do not think you have sufficiently established that you do or can speak on behalf of some sort of “fediverse community”, or a majority or plurality thereof, so please stop doing so.

        2. 9

          There’s a difference between “here’s this thing and if you happen across it organically that’s cool” and “here’s this thing, I want you to slurp it up into your indexing service”, and that’s where they’re coming from

          I’d say that’s the difference between the public and unlisted post scopes.

          Just because the protocol itself doesn’t provide explicit protections you’re talking about

          But it gives guidelines, with the post scopes as above.
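
          To make that concrete, here is a rough sketch of how an indexer could honour the scopes. It assumes the convention Mastodon uses as far as I understand it: a public post lists the ActivityStreams Public collection in `to`, while an unlisted post lists it only in `cc`. The function names `post_scope` and `may_index` are made up for illustration, and the sketch only handles the full Public IRI, not compact forms like `as:Public`.

          ```python
          # Hypothetical helper: classify a post by where the Public collection
          # appears in its addressing. Not taken from any existing indexer.
          PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


          def _as_list(value):
              """Addressing fields may be absent, a single string, or a list."""
              if value is None:
                  return []
              if isinstance(value, str):
                  return [value]
              return list(value)


          def post_scope(obj):
              """Return 'public', 'unlisted', or 'restricted' for a post object."""
              to = _as_list(obj.get("to"))
              cc = _as_list(obj.get("cc"))
              if PUBLIC in to:
                  return "public"
              if PUBLIC in cc:
                  return "unlisted"
              return "restricted"


          def may_index(obj):
              """Assumed policy: only index posts explicitly addressed to Public."""
              return post_scope(obj) == "public"


          # Example objects with made-up actor URLs.
          followers = "https://example.social/users/alice/followers"
          public_post = {"to": [PUBLIC], "cc": [followers]}
          unlisted_post = {"to": [followers], "cc": [PUBLIC]}
          followers_only_post = {"to": [followers]}
          ```

          With that policy, `may_index(public_post)` is true while the unlisted and followers-only posts are skipped. It is still only a guideline: nothing in the protocol forces a crawler to apply a check like this.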

          For what it’s worth, this was the most reasonable search engine that came out from all of the twitter immigrants. And yet some extreme people still bullied it off. It will only take one developer who doesn’t have the best intentions to stand their ground with a way more aggressive indexing strategy than the one Searchtodon used to ruin it for everyone.

          1. 2

            Oh I fully agree this is the most reasonable system that’s been introduced so far, but it’s equally clear that even the way this was handled is unacceptable to the community.

            I do however strongly disagree with your framing of “bullied it off”. The greater fediverse community sees tools like this as an existential threat, they’ve been very clear about it. And yet similar scraping/indexing projects continue to pop up without really talking through their idea with the community, or figuring out how to work with them. Instead they’ve all been announced as “hey, here’s this tool that’s going to slurp up your data, you’re welcome”.

            And to be clear, I really do think searchtodon did a good deal of homework on figuring out how the community would feel, and did look into the criticisms of past attempts and tried to address their concerns. But they also didn’t spread their idea out and solicit feedback until it was already done.

            As for “one developer who [will …] stand their ground”, I don’t think any of them are truly going to be prepared for the mountain of legal paperwork they’re going to encounter if they don’t back down. Namely, the same people blasting these services in posts right now, and who are covered by GDPR, CCPA, and any privacy laws going into effect in several other states too, are going to be filing data access requests and data deletion requests, and will be applying legal pressure as well. It’s going to take a fairly large company to be able to weather that, and at that point, only a handful would be willing to keep going rather than fold and tear down the service.

            1. 7

              As for “one developer who [will …] stand their ground”, I don’t think any of them are truly going to be prepared for the mountain of legal paperwork they’re going to encounter if they don’t back down. Namely, the same people blasting these services in posts right now, and who are covered by GDPR, CCPA, and any privacy laws going into effect in several other states too, are going to be filing data access requests and data deletion requests, and will be applying legal pressure as well.

              I think it is far more likely that this would turn into asymmetric warfare – Fediverse search and discovery has enough money behind it that somebody’s going to handle the initial “storm” of GDPR/CCPA/etc. attempts and come out the other side with a product. Meanwhile, if people do try to weaponize such things against a search/discovery service, it opens the door for it to be weaponized in the other direction against the, frankly, mostly under-resourced “indie” instance admins who have been most strongly against search/discovery features. They are the ones who are overwhelmingly more likely to fold, along with their instances, when legal paperwork starts coming in from angry strangers.

              So as righteous and tempting as it sounds, I think the net effect would be the opposite of what’s desired.

              Also:

              The greater fediverse community sees tools like this as an existential threat, they’ve been very clear about it.

              I have seen a number of instance admins who treat it this way, and who claim to speak on behalf of some large majority of all Fediverse admins/users. I have not yet seen evidence that they actually do speak on behalf of such a majority, or that “the greater fediverse community” is an accurate label for them.

              1. 2

                Fediverse search and discovery has enough money behind it that somebody’s going to handle the initial “storm” of GDPR/CCPA/etc. attempts and come out the other side with a product

                does it? from who? and what strategy do they have for monetizing this product with an increasingly hostile user base?

                This whole “search has backing” line is never able to answer the question of who’s doing this and why are they doing it. Because so far it’s been entirely people who recently left twitter without any corporate backing. Clearly the problem is not a lack of resources if some larger entity wanted to do this, because multiple developers have shown this is something they can hack together in a couple weeks in their spare time.

                Hell, if search has backing why haven’t they gone to any of the larger instances and tried to work with them specifically with the goal of using their data as a starting point?

                There seems to be this view, especially from people in the tech industry, that just because twitter did something one way, it is inevitable that the fediverse will do it too. But the problem is that twitter did a lot of things simply because they were trying to figure out how to make money at VC required return rates. This is not a problem most server administrators have. There is no forcing function of “we need to show more people more content because it means we can serve more ads”.

                1. 8

                  does it? from who? and what strategy do they have for monetizing this product with an increasingly hostile user base?

                  Several people who’ve attempted to build “thoughtful” search/discovery have been clear that they are aware of efforts to build much less “thoughtful” versions, at least some of which are alleged to already be indexing, if not yet publicly offering search functionality.

                  Also I would ask who is the “increasingly hostile user base”? What I saw of the last round of this was basically a lot of “normies” who were like “oh cool, search would be a useful thing to have”, and a handful of instance admins who were prepared to harass and abuse anyone right off the internet for even daring to suggest building such a thing. I don’t doubt that those admins sincerely hold their beliefs, but I do very much strongly doubt that they speak for a majority, or even a significant minority, of Fediverse users. And mostly when I’ve tried to explain the conflict to people who weren’t familiar with it I’ve been bombarded with questions about “Wait, so they want to post publicly, but also not have it show up publicly?” Which seems to be the way the aforesaid “normies” view the whole thing.

                  There seems to be this view, especially from people in the tech industry, that just because twitter did something one way, it is inevitable that the fediverse will do it too.

                  One of the things people will use any social media platform for is talking about current events and other topics that interest them. One of the things people will want, as part of that, is the ability to find, connect, and interact with others who are talking about the same events and topics. The expected user-experience solution to that is a search box into which you can put your terms and get back relevant results. Twitter did not invent that, nor is Twitter the only social media platform nor service in general ever to have such search functionality, and there is nothing whatsoever Twitter-specific or “Twitter did it that way” about wanting to have a search box. And Mastodon at least already has limited search functionality; what’s missing is the ability to usefully search beyond one’s local instance for arbitrary terms (rather than for specific handles or hashtags, which currently can be searched for).

                  So, effectively, trying to prevent search and discovery from happening is saying “Attention all social humans! Stop being human and social! By Order Of The Admins!” This is not going to work; somebody’s going to build it, and right now everybody who tries to build it thoughtfully is being harassed into oblivion, so unfortunately that means it’s going to be built by non-thoughtful people who don’t care about being harassed over it.

              2. 6

                I am not sure where you find this greater fediverse community. The bubble I’m in seemed absolutely fine with searchtodon. Even the ones that were really angry at previous attempts essentially summed up their thoughts as “ah, this seems reasonable, why not?”. I honestly feel like this is a loud minority of people bullying others off. It happened maybe 10 times already, even before twitter’s shenanigans. I don’t think this is healthy for the network and only drives curious people who want to improve it away.

                I honestly don’t know what searchtodon could’ve changed from its initial plan that wouldn’t have made it completely useless. There are just some people who don’t want search, period, and they react violently to anyone trying to work on it. I’ve seen threats against developers’ lives from them on some previous occasions. I don’t know what else to call it but bullying.

                As for developers needing to fight legal threats: they can just release the source code. There are plenty of actors with semi-malicious intentions that would run such software, enough that trying to take them all down would be foolish to attempt. After one goes down, another one would pop up. So honestly, I’d rather have a reasonable search that some don’t like, than a malicious one that everyone hates.

            2. 8

              You may find the berrypicking paper to be a good starting point for understanding search and indexing as tools for humans doing research.

              If anyone is denying reality I think it’s everybody creating the indexers.

              I don’t know if you recall the AOL era of the early 90s. At first, the idea was that all content would be curated, organized, and available via keyword search. (A keyword is kind of like a Mastodon hashtag.) However, the advent of full-text indexing led to today’s modern reputation-ranked full-text search interface. With the hindsight of history, I think that you are exactly backwards: if anybody is denying reality, it’s the folks publishing their data publicly and then politely asking not to be indexed.

              1. 3

                I mean look, heed the warning of every index-related project that’s come before, or don’t. But be prepared for a wave of people to do everything in their power to stop you and limit you. You’re approaching this as a technical problem to be solved, when it’s not, it’s a social problem, and the society in question has said “this is unacceptable, we will not permit this”.

                If this problem truly was only technical, we wouldn’t have seen multiple fediverse index related projects collapse due to community pressure this week, let alone over the past several years.

                1. 6

                  If record labels, scientific journals, video-game publishers, and even governments cannot stop proliferation of (meta)data, then I am genuinely unable to understand what structural differences protect the Fediverse from similar proliferation. Information tends to be a free good simply because it is not scarce, and attempts at artificial scarcity of data are incongruous with information theory.