1. 25

I have never used RSS and I don’t have it set on my website.

However, after reading comments on lobste.rs over a few weeks I got convinced that RSS is necessary. So I dug around to learn about it, saw some examples of both RSS and Atom, but still some questions remain.

  1. How to handle updates? My posts are in a perpetual “work in progress” state. That is to say, unlike blogs, they sometimes get updated. But after updating an article should I also update the RSS feed? If so - how? I noticed that RSS doesn’t have any “updated” field, but Atom does.
  2. Should I trim the feed down to “top 20” or so posts, or keep all the history in?
  3. Is it at all viable to maintain the feed manually?
  4. Finally, are there any unwritten social conventions about maintaining a feed?

Sorry if these seem like trivial questions to some. But answers are not easy to find.

  1.  

  2. 10

    First of all, I’d recommend Atom over RSS. That being said…

    1. […]

    AFAIK it depends on the client. It might update the article based on the ID or just keep the first client.

    1. […]

    Depends on your taste, I don’t do it, but in case of too large feeds it might be worth considering if you don’t want your feed to load too long on the client side.

    1. […]

    It’s possible, but if I may, I’d recommend an AWK script I wrote, and use to maintain two personal feeds.

    1. […]

    I think due to its history there are too many different ways to do stuff, so conventions are quite hard to enforce. There are good things.

    1. 4

      Yeah, the main advice I have is to not say “RSS” when you mean “RSS and Atom”, and also to avoid RSS entirely because it’s just really kind of haphazardly designed compared to Atom.

      1. 1

        AFAIK it depends on the client. It might update the article based on the ID or just keep the first client.

        In thunderbird it shows all revisions.

      2. 5
        1. I change the contents without bumping the date, Ministry of Truth-style (this mostly just falls out of what Jekyll does, rather than a conscious choice). I don’t think the specs* say what software should do if this happens, but I assume it will either silently backdate the changes (which is what I truly want) or it will completely ignore it (which is a decent enough choice, too). Sometimes it might result in odd inconsistencies within an app where different pages show different versions of the text, but I’ve never seen one actually crash because of this, and since I’ve only been correcting typos, I don’t really care. If you’re making drastic changes to a post, you need to do something different.

        2. You definitely should limit the size of the feed. The Atom spec has support for “pagination,” but most readers don’t do anything with it, and most sites don’t use it. You should also try to make sure your web server includes an ETag or a Last-Modified date (if this is a static file, then it will, but if you have something like a PHP script generating it, then you’ll need to do it yourself). Automated software downloads this thing over and over again; you should do everything you can to save bandwidth.

        3. I’ve literally never tried to maintain an RSS feed manually. I’ve used everything from hand-crafted Perl to XSLT to Jekyll to WordPress to generate it. I never wanted to potentially forget to copy stuff from the HTML to the RSS feed.

        4. As is typical, your best bet is “do whatever WordPress does.” I’ve seen a feed reader crash in the presence of an unknown tag in a separate namespace (I was developing a feed to interoperate with identi.ca, and was testing the feed in other readers; turns out there was no way to produce a feed that simultaneously worked in DogCatcher while also having an @mention-able name in the OStatus-verse). When developing a web page, there are about four browsers that you really need to test with, and you’ve got 99.999999999% of the users covered. This is not the case with RSS; one of the great upsides of being a properly declarative format is that it gets used for a variety of purposes and applications, all of which have their own quirks and bugs.

        One thing I would do, if I were you, is try to make your RSS feed “event-centric” instead of “page-centric”. RSS is a replacement for mailing lists, not a replacement for web pages. If you’re building something like Wikipedia, then your RSS feed should probably be a changelog instead of just an index of your articles.

        * There is more than one spec. Sorry.
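        To make the ETag / Last-Modified point from item 2 concrete, here is a minimal sketch of what a script-generated feed would have to do itself. The function names (`feed_validators`, `respond`) and the plain-dict headers are assumptions for illustration, not any framework’s API, and it only handles a single `If-None-Match` value (no `If-Modified-Since`, no lists of ETags).

```python
import hashlib
from email.utils import formatdate


def feed_validators(body: bytes, mtime: float) -> dict:
    """Build ETag and Last-Modified headers for a feed body (hypothetical helper)."""
    # A content hash changes exactly when the feed bytes change.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {"ETag": etag, "Last-Modified": formatdate(mtime, usegmt=True)}


def respond(body: bytes, mtime: float, request_headers: dict):
    """Return (status, headers, payload); send 304 with no payload if the client's copy is current."""
    headers = feed_validators(body, mtime)
    if request_headers.get("If-None-Match") == headers["ETag"]:
        return 304, headers, b""
    return 200, headers, body
```

        A reader that polls with the ETag it saw last time gets an empty 304 until the feed bytes actually change, which is where the bandwidth saving comes from.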

        1. 1

          Appreciate the response. But I do think I am getting conflicting advice, especially between you (make RSS event centric, limit the size) and @twee who says I should put entire content within the RSS element.

          Are these two different “schools of thought” or am I missing something still?

          1. 4

            Well, there are two fairly commonly used expansions of the RSS acronym - Really Simple Syndication and Rich Site Summary.

            Some sites obviously fit a particular definition more than others. As @notriddle says, trying to store the full text of each page of a wiki-type site is probably going to end in disaster. If you write articles, however, I would always recommend putting the full text in, as that’s generally what people who use RSS want; a syndication of the sites they are interested in. Now some people might just open each article in a browser window, but others want the uniformity that a single reader provides.

            I suppose they could be two different “schools of thought”, but I reckon that really they’re coming from the same perspective; as was also mentioned above, RSS is used for a variety of purposes and applications, and trying to appeal to as many of those purposes and applications as possible is maybe sensible.

            Disclaimer: I’m biased, and I get quite sad when a site I otherwise like doesn’t have a full text feed or is incriminating in another way.

            1. 1

              Are these two different “schools of thought” or am I missing something still?

              Not so much “schools of thought” as just two good things being in tension.

              Automated software is going to poll your feed, and whenever your feed changes, it’s going to re-download the whole thing. This means you should avoid making your feed gratuitously large (compression, minification, and caching should all be enabled, and since you probably don’t want to make everyone re-download your entire website every time you make a new post, you should also limit the number of posts in your feed). RSS readers also do a fairly poor job of surfacing updates, and while that’s what you actually want for typo-fixes, if you’re making a substantial change to an existing page then you probably want it to show up as a new item. Making a new item will not only surface the change better, but also give you a chance to summarize what changed instead of making me re-read the whole thing to try to figure it out.

              On the other hand, if someone’s in their feed reader consuming your content, they shouldn’t be forced to switch to their web browser. Your subscribers probably prefer their feed reader’s interface over your web page’s, and even if they don’t, nobody enjoys having to make a bunch of unnecessary clicks. I actually agree with @twee that you should include entire posts rather than just excerpts or headlines. The practice that they’re complaining about is one that news websites use to try to increase ad impressions, while I’m thinking of the sort of “adapting” that comes from using blog tooling for a website that isn’t really a blog.

              1. 1

                Yep, I agree with pretty much all of this. Well said.

                1. 1

                  Thanks a lot for these answers. It is all coming together. I have a last few (I think) remaining questions:

                  1. Can I mix whole-article entries and event-style entries within the same feed? - i.e. put whole articles in the element when they appear, but only use event notifications to broadcast significant updates?

                  2. I’ve read that atom elements have to have unique international IDs. And they seem to be linked to URLs. But in the case of event entries, how would someone generate those IDs?

                  1. 3

                    Can I mix whole-article entries and event-style entries within the same feed? - i.e. put whole articles in the element when they appear, but only use event notifications to broadcast significant updates?

                    In principle, I think you could, and it might even be something that could be done well, but it might conflict with people’s expectations for a feed. My inclination might be to (if possible) treat these as two separate feeds if you want to offer both for the same set of content - one for new articles and one for deltas to articles.

                    As an example, at SparkFun Electronics, we wound up offering several feeds for products:

                    • One for new products populated as they were added to the storefront
                    • One for changes to existing products
                    • One for comments on a particular product

                    This kind of thing lets subscribers pick and choose for their use-case and can serve different audiences. You might have one customer who wants to see all the shiny new stuff, another who’s keeping an eye out for price drops, a reseller who wants to know any time a product they carry changes, or an engineer who wants to track discussion about one of their designs. Similarly, you might have some readers who just want to see when you post something new and others who want the full firehose of all your edits.
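                    On the second question (IDs for event entries): an Atom ID only has to be a unique, permanent IRI, and it does not have to be a URL that resolves. One option, sketched below, is the `tag:` URI scheme from RFC 4151, with the update number appended for event-style entries. The domain, the `/posts/…/updates/…` path layout, and the function names are made up for illustration.

```python
from datetime import date


def tag_uri(domain: str, minted: date, specific: str) -> str:
    """Build a tag: URI (RFC 4151) of the form tag:authority,date:specific."""
    return f"tag:{domain},{minted.isoformat()}:{specific}"


def update_entry_id(domain: str, minted: date, slug: str, revision: int) -> str:
    """ID for an 'article was updated' entry; the path layout is an assumption."""
    return tag_uri(domain, minted, f"/posts/{slug}/updates/{revision}")


# The article itself keeps one ID forever; each announced update gets its own.
article_id = tag_uri("example.com", date(2020, 3, 1), "/posts/rss-questions")
first_update_id = update_entry_id("example.com", date(2020, 3, 1), "rss-questions", 1)
```

                    The important part is that an ID, once published, never changes, even if the article’s URL later does.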

                    1. 2

                      I’m usually a huge proponent of “put the whole text in the feed” but that’s mostly nice for blog posts that don’t get a ton of edits. If it’s more like “a notification that this page was updated” like it sounded to be your case, then I’d like a small feed better, and not the whole text. If I’m reading this from oldest to newest I’d get all the (outdated) versions first.

              2. 5

                I agree with much of the general consensus. You should be using Atom instead of RSS1 - it’s a more modern and cleaner specification.

                1. If your posts are eternally work in progress, you could probably perform small updates without changing anything, and use the updated field in Atom if something significant changes.
                2. That’s your choice. Most feed readers download all articles they receive, so it’s not a huge issue for people using hosted solutions or desktop programs; it only possibly becomes an issue if someone moves to a new computer and has to download everything again, although even then it’s likely that they’ll have read all posts older than the most recent 10 or 20. Depending on the client, adding more posts will just make the request take longer (as it’s a bigger file). Some people just prefer to keep everything accessible via RSS.
                3. Yes, in the same way it’s possible to maintain the rest of your website manually. You probably don’t want to, though.
                4. The main thing is please, put the full text in. Headlines are bad. Excerpts are worse. Just put the full text in, which is counter-intuitive to a lot of the desire to track people more, as RSS doesn’t help with that, but the people who use RSS will appreciate you so much more2 3. Another thing for me is sometimes people alter their feed, and it makes the feed reader go funny (for example, showing all the articles again). In this case, it’s polite just to put out another post (even just to the feed) to apologise for the inconvenience, just as you would if you accidentally sent a spam email to a bunch of people. Oh, and make it easy to find the feed - the best way to do this is with a link to the feed, but if you’re unwilling to do that (please do that), use a predictable URL; the ones I try are /feed, /rss, /atom, /feed.xml, /feed.rss, /feed.atom, /atom.xml, /rss.xml, /index.xml, usually by hand (a script would probably make my life easier). It’s so much nicer just to have the link there for me - if I like your site enough to go to that effort, appreciate me ;)
                1. 4

                  a script would probably make my life easier

                  The OP should include an alternate link in their HTML <head>. Good feed readers will allow you to just paste the link to the home page itself, and will auto-discover the feed URL from there.
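                  For anyone curious what that auto-discovery amounts to, here is a rough sketch using only Python’s standard library. It looks for `<link rel="alternate">` tags with an Atom or RSS MIME type; the class and function names are mine, and real readers are likely more forgiving about malformed markup than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate" type="..."> tags."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rels and mime in FEED_TYPES and attr.get("href"):
            # Resolve relative hrefs against the page the link was found on.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def find_feeds(html: str, base_url: str) -> list:
    """Return feed URLs in document order (a hypothetical helper)."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds
```

                  On the site’s side, the only thing needed is one such `<link>` tag in the `<head>` of each page.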

                  1. 1

                    I totally forgot that this feature existed - that can definitely go as a best practice. Thanks!

                    1. 1

                      My preferred reader, rss2email, cannot, remarkably.

                      1. 1

                        As @twee already said, this seems like a perfect opportunity for a really simple script.

                        https://github.com/notriddle/find-feed

                        1. 1

                          Yes indeed, thank you. My surprise that rss2email doesn’t already is precisely because it would be quick work to add it, and it’s pretty old software, so why nobody has yet is a mystery. I may yet do it myself.

                    2. 1

                      Thank you for the elaborate response.

                      Everything rings true to me except for the one part where you suggest putting whole text into the feed.

                      Is this really the standard way of doing it? If RSS readers really wanted this functionality why wouldn’t they just download the whole article automatically from the link? Also I am not even very sure what format I would have to use, should I put everything as html, and what about pictures?

                      1. 3

                        It really seems to depend on the site generator and the individual. Most of the feeds I am subscribed to (uploaded to http://ix.io/2hEU if you want to check individual feeds for yourself) have full text feeds. In my first post I linked to two blog posts by individuals explaining why they personally use full-text HTML.

                        The content tag in atom can have the attribute type="html", and then standard HTML can be used, and will be parsed by the feed reader just as if it were HTML in a browser, so your images will look fine. I think the existence of a content tag as well as a summary tag gives a good idea as to what the Atom standard is happy with, although I haven’t checked the standard recently so am not entirely sure on the specifics of this. RSS can also take HTML through the use of CDATA or something, but as I recommend Atom I can’t advise you anymore with regards to this.
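                        As a rough illustration of the `type="html"` approach, here is a sketch that builds a single Atom entry with Python’s `xml.etree.ElementTree`. The function name, the example ID and the URLs are placeholders I made up; the point is only that the HTML goes in as escaped text inside `<content type="html">`.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom_entry(entry_id, title, updated, url, html_body):
    """Serialize one Atom entry whose content is full-text HTML (sketch only)."""
    ET.register_namespace("", ATOM_NS)  # emit Atom as the default namespace
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", href=url)
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", type="html")
    # ElementTree escapes <, > and & on output, which is what type="html" expects.
    content.text = html_body
    return ET.tostring(entry, encoding="unicode")


print(atom_entry(
    "tag:example.com,2020-03-01:/posts/rss",
    "On RSS",
    "2020-03-01T12:00:00Z",
    "https://example.com/posts/rss",
    '<p>Hello <img src="https://example.com/a.png"></p>',
))
```

                        Images are just `<img>` tags in that HTML, so they should use absolute URLs to be safe.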

                        Some feed readers do download the whole article automatically, if they think it isn’t already there (I use newsboat, which does this, and I know that quiterss does too). The problem is, a lot of websites have content on each page which isn’t the article (adverts, a header, archive listing), and some don’t even have the content in the HTML (javascript apps as websites). This makes it pretty difficult to get the content without a full browser engine, and what’s the point in that when it’s trivial to just stick the article content in the RSS feed?

                        My opinion (which ultimately is just my opinion) is this: people who like having the full text in the feed appreciate it being there. People who don’t use that functionality probably won’t mind the added weight of it being there, as it’s not a huge amount anyway (and I say this as someone who routinely uses veeery slow connections) and they’ll likely lose the gains the moment they open one of the articles as an actual website. As Aral says in his article, the more ways your content can be propagated, the more people will read it who want or need to read it. If you take pride in the content, having people read it is definitely more important than the amount of impressions you get on your site.

                        Just my 5 cents or so :)

                    3. 3
                      1. Personally, I don’t update feeds when I update posts. I sometimes update posts for a little spelling mistake I’ve noticed, or something similar, so I don’t want people to perceive I’m spamming them.

                      2. Most people tend to limit their feeds to the most recent 10 or 20 posts. It’s rare to see an RSS feed that contains all the content on the site.

                      3. I wouldn’t say so, no. I’ve used PolitePol in the past and had a lot of success with it. https://politepol.com/en/

                      4. Not that I know of. As long as your feed is valid, you should be fine. You can check its validity here - https://validator.w3.org/feed/

                      Good luck!

                      1. 2

                        Appreciate the validity check, will be very handy.

                        1. 1

                          ad 2) I’m not sure if it’s my feed readers, but I have the impression I am not missing posts, so I don’t think it’s necessarily a good idea to limit it.

                        2. 2

                          I had some notes on RSS based on my experience with the new NetNewsWire app on macOS, which got me back on the RSS train and which, being a rather young project, has still only implemented the basics, making it a good app to test that your feed gets the basics right.

                          1. Make sure you include a <link> to your feed on all pages of the website (including the homepage, not just a blog section); if you have more than one link, put the “all posts” feed (or whichever one people are most likely to want to subscribe to) first, as NNW, and presumably other apps, grabs the first feed in its “insert URL, receive feed” feature.

                          2. I’ve seen various subtle problems with artisanal feeds and the younger class of static site generators: images with relative srcs work in the browser but not the reader, IDs and timestamps for posts are either missing or change on each fresh build.
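                          For the relative-src problem in item 2, one possible fix is to rewrite URLs to absolute ones before the HTML goes into the feed. This is a deliberately naive sketch: the regex only handles double-quoted `src` and `href` attributes, and the function name and URLs are made up. A proper HTML parser would be more robust.

```python
import re
from urllib.parse import urljoin

# Naive pattern: double-quoted src="..." or href="..." attributes only.
SRC_OR_HREF = re.compile(r'\b(src|href)="([^"]*)"')


def absolutize(html_body: str, page_url: str) -> str:
    """Rewrite relative src/href values against the page's own URL."""
    def fix(match):
        return f'{match.group(1)}="{urljoin(page_url, match.group(2))}"'
    return SRC_OR_HREF.sub(fix, html_body)


print(absolutize('<img src="img/a.png">', "https://example.com/posts/rss/"))
```

                          The ID and timestamp problem is a separate one: those should be stored with the post (front matter, a database row, etc.) rather than derived from the build time.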

                          1. 2

                            Just like each blog is kind of its own island, there are many answers to all of this and in the end you need to do what suits your taste.

                            My feed contains full articles because I realized that I get quite frustrated when I see an entry in my feed reader and, when I go to read it, it is simply a blurb to drive me back to the original site. I don’t want my readers feeling frustrated like that. Even though I appreciate views on the site, I appreciate people reading my content more, and whatever I can do to make that experience more comfortable, the better. I also have some scripts to parse my own RSS for sending webmentions and to aid cross-posting to SSB. To be able to do that effectively, I need the full content in the RSS.

                            I prefer RSS over Atom but that is a personal choice. I was programming blog tools when RSS was popularized by UserLand and Dave Winer. I remember when Google/Pyra introduced Atom and what a mess of a spec that felt to me. I really dislike Atom. I still generate both Atom and RSS for my blog though and the reader is free to use whatever format they prefer.

                            Depending on how much content you have, it is better to limit your feed to the recent $X posts. My feed at the moment contains everything but this is a new blog and I don’t have much content. The whole feed is ~200k, if you add compression this becomes quite small, so I’m not worried at the moment. As it grows I might switch to recent 20.

                            I don’t think it is viable to maintain RSS/Atom manually. It is better to use scripts. Maintaining the same content in two different locations might lead to out-of-sync problems quite quickly. The RSS spec is dead simple, so it is easy to write a small script to produce it.
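                            To give an idea of how small such a script can be, here is a sketch of an RSS 2.0 generator in Python that also trims to the most recent posts. The shape of the `posts` list (dicts with `title`, `url`, `published`, `html`) is an assumption of mine, not anything standard, and it is not the script I actually use.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(site_title, site_url, posts, limit=20):
    """Build an RSS 2.0 document from the newest `limit` posts (sketch only)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = site_title

    # Newest first, then cut the list down to the configured size.
    newest = sorted(posts, key=lambda p: p["published"], reverse=True)[:limit]
    for post in newest:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        ET.SubElement(item, "guid").text = post["url"]  # permalink doubles as the ID
        ET.SubElement(item, "pubDate").text = format_datetime(post["published"])
        ET.SubElement(item, "description").text = post["html"]  # full text, escaped on output
    return ET.tostring(rss, encoding="unicode")


example_posts = [
    {
        "title": "Hello",
        "url": "https://example.com/hello",
        "published": datetime(2020, 3, 1, 12, 0, tzinfo=timezone.utc),
        "html": "<p>First post.</p>",
    }
]
print(build_rss("Example blog", "https://example.com/", example_posts, limit=20))
```

                            Because the feed is generated from the same source as the pages, the two cannot drift apart.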

                            As for social conventions, in my subjective experience on this I’ve noticed that it mimics cultural norms in the real world. Which means that different pockets of bloggers use different norms. I guess this comes from bloggers talking to each other in their circles, communities and coalescing towards some common practices, but those are not universal and different groups have different practices. An example is the conversation here between full feed / top 20, or full entry vs blurb.

                            The good thing is that you’re in control over your own website, so you can experiment with this and change your mind. You’re not being forced into some practice or format by a SaaS beyond your control. Play with RSS/Atom, see which one is best for you. Experiment with full entry vs blurb, see how it affects readership.

                            I think that the most important thing you can do beyond providing a feed is checking out IndieWeb and adding Webmentions, that alone will make your blogging much more fun and rewarding.

                            1. 1
                              1. I recommend trimming to a really low number; I use 5. That way, if you accidentally cause a raft of posts to bump their publication date (easy to do with software updates or server migrations; remember, e.g., that git doesn’t store file timestamps), any downstream aggregators (planet sites like Planet Debian) are protected from being spammed.
                              1. 1
                                1. Whether or not you have a syndication solution, making a weekly/monthly post called “Updates to old articles” would probably be a good idea. Furthermore, if your blog is generated from a git repo, git forges like GitHub, GitLab, and Sourcehut provide RSS feeds for commit logs; you could advertise this feature for people who want every single update.
                                2. Rather than trimming, make feeds easy to filter; use keywords in article titles that make it easy for users to filter, either with a regex/grep or with their eyeballs.
                                3. See the other comments.
                                4. Lots of people don’t like it when they have to click a link to view the “full article” because part of the benefit of feeds is having articles stored offline in plaintext, displayed correctly without JS/CSS/frames.
                                1. 1

                                  I’ve figured it’s just easier to provide both RSS and Atom and let the readers choose. Some libraries would generate you both with almost no extra effort, e.g. I’m using python-feedgen

                                  1. Similarly, my posts are also sort of a work in progress.

                                    So far, I’ve been only including the post in the feeds when it’s ‘released’. In the future, I might include RSS updates for major changes, but haven’t bothered yet.

                                  2. I’ve heard some readers reject a feed if it is too large, so I’m limiting mine to 512 KB (about 10 posts). Although I’m planning to add a separate ‘full’ feed too.

                                  3. Not sure why you would want that? There’s too much overhead involved, like markup and escaping.