1. 10

  2. 2

    I’m assuming that Hacker News is still using continuations? If so, the id for each vote is actually an identifier for a garbage-collected suspended computation (i.e. one that will expire at some point in the future). It’s surprising to me that a site as large as HN can still work this way… Or are they still just using one (beefy) server?

    1. 3

      They are not using continuations now, at least not for paging (the URLs look something like this: https://news.ycombinator.com/news?p=2). They migrated because, well, you guessed it, the number of continuations to store overwhelmed the server. Still, the idea of using them was cool… I mean in theory :)

      1. 4

        Yeah! The ids look like actual ids now, too! Did they finally concede and start using an RDBMS, as well? Is it even still written in Arc? :)

        1. 1

          Last time I checked (months ago) it was still using the file system instead of a proper database. Arc will be the last component to be replaced there, knowing how PG likes Lisp :)

          Still, it’s kind of neat that it works on “internet scale” given these… unusual design choices :)

          Extra gem for people reading comments here: https://github.com/wting/hackernews

          1. 1

            The reason I am curious is mostly because they were hiring Rails people a while back. I'm not sure whether that was specifically for news.yc or for other tooling. A traditional web application like that (as opposed to the continuation-passing one) lends itself to a certain style of dev.

            That being said, the HTML still references the “op,” which was parlance for the “route” in the old Arc dump.
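
For anyone who hasn't seen the continuation style this sub-thread is talking about, here is a rough sketch of the idea: every link gets an opaque id that points at a stored closure, and the closures expire after a while. This is not the Arc code. `ContinuationStore`, `register`, `invoke`, and the TTL-based sweep are all made-up names and assumptions for illustration.

```python
import itertools
import time


class ContinuationStore:
    """Toy table of suspended computations keyed by an opaque id (hypothetical)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock            # injectable so expiry can be tested
        self._table = {}              # id -> (deadline, closure)
        self._ids = itertools.count(1)

    def register(self, thunk):
        """Store a zero-argument closure and return the id to embed in a link."""
        fnid = f"k{next(self._ids)}"
        self._table[fnid] = (self.clock() + self.ttl, thunk)
        return fnid

    def sweep(self):
        """Drop every closure whose deadline has passed (the 'garbage collection')."""
        now = self.clock()
        expired = [k for k, (deadline, _) in self._table.items() if deadline <= now]
        for k in expired:
            del self._table[k]

    def invoke(self, fnid):
        """Run the closure behind a link once; stale or reused ids get an error."""
        self.sweep()
        entry = self._table.pop(fnid, None)
        if entry is None:
            return "Unknown or expired link."
        return entry[1]()
```

The table grows with every link on every rendered page, which matches the complaint above about the number of stored continuations. A plain `?p=2` URL needs no server-side state at all.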

    2. 2

      Funny, I just recently wrote a scraper for Hacker News, as well as Lobsters here and a few other sites. Goal was to build a local database of everything I’ve ever written across the internet. The HN API was decent, but in addition to being read-only, it was also missing a bunch of data from the normal user posts page. But of course, a few things were easier to read via the API, so I ended up doing both and combining the results. Seemed straightforward enough, though I had to grab the login cookie to put in the script to be able to access the logged-in page info.

      1. 1

        Anyone want to type in his login cookie :-)
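
The "do both and combine the results" step from the scraper comment could look roughly like the sketch below. It assumes both sources have already been fetched and turned into lists of dicts sharing an `id` key. `merge_records` and the field names in the comments are hypothetical, not the commenter's actual script or the API's schema.

```python
def merge_records(api_items, scraped_items):
    """Combine records from two sources, keyed by "id".

    Scraped records form the base; values from the API overwrite them,
    except where the API value is None (treated here as "missing").
    """
    merged = {}
    for item in scraped_items:
        merged[item["id"]] = dict(item)  # copy so the inputs are not mutated
    for item in api_items:
        combined = merged.setdefault(item["id"], {})
        # Only take API fields that actually carry a value.
        combined.update({k: v for k, v in item.items() if v is not None})
    # Sort by id so the output order does not depend on input order.
    return sorted(merged.values(), key=lambda record: record["id"])
```

Letting the API win on conflicts is just one choice. If the logged-in page is the more complete source, swapping the two loops would give the opposite priority.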