I’m assuming that Hacker News is still using continuations? If so, the id for each vote is actually an identifier for a garbage-collected (i.e. something that will expire in the future) suspended computation. It’s surprising to me that a site as large as HN can still work this way… Or are they still just using one (beefy) server?
They are not using continuations now, at least not for paging (the URLs are something like this: https://news.ycombinator.com/news?p=2). They migrated because, well you guessed it, the number of continuations to store overwhelmed the server. Well, the idea of using them was cool… I mean in theory :)
Yeah! The ids look like actual ids now, too! Did they finally concede and start using an RDBMS, as well? Is it even still written in Arc? :)
Last time I checked (months ago) it was still using the file system instead of a proper database. Arc will be the last component to be replaced there, knowing how PG likes Lisp :)
Still, it’s kind of neat that it works on “internet scale” given these… unusual design choices :)
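For anyone who hasn't seen the continuation-style approach mentioned above, here is a rough sketch of the general idea in Python: each link gets an opaque id that maps to a stored closure, and entries expire after a while. This is not HN's actual code; `ClosureTable`, `register`, `invoke`, `sweep`, and the TTL are all made-up names for illustration. It also shows why the table grows with every link rendered until old entries are collected.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class ClosureTable:
    """Hypothetical store mapping opaque ids to suspended computations."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._table: Dict[str, Tuple[float, Callable[[], str]]] = {}

    def register(self, fn: Callable[[], str]) -> str:
        """Store a closure and return the opaque id to embed in a link."""
        fnid = secrets.token_urlsafe(8)
        self._table[fnid] = (self.clock() + self.ttl, fn)
        return fnid

    def invoke(self, fnid: str) -> Optional[str]:
        """Run the closure behind an id, or return None if unknown or expired."""
        entry = self._table.get(fnid)
        if entry is None or entry[0] <= self.clock():
            self._table.pop(fnid, None)
            return None
        return entry[1]()

    def sweep(self) -> int:
        """Drop expired entries and return how many were removed."""
        now = self.clock()
        dead = [key for key, (expires, _) in self._table.items() if expires <= now]
        for key in dead:
            del self._table[key]
        return len(dead)

    def __len__(self) -> int:
        return len(self._table)
```

Every rendered vote or "next page" link would call `register`, so memory use scales with page views rather than with the amount of content. A plain `?p=2` URL needs no server-side state at all.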
Extra gem for people reading comments here: https://github.com/wting/hackernews
The reason I am curious is mostly because they were hiring Rails people a while back – not sure if it was specifically for news.yc or for other tooling. Such a traditional web application (as opposed to the continuation-passing one) lends itself to a certain style of dev.
That being said, the HTML still references the “op,” which was parlance for the “route” in the old Arc dump.
Funny, I just recently wrote a scraper for Hacker News, as well as Lobsters here and a few other sites. Goal was to build a local database of everything I’ve ever written across the internet. The HN API was decent, but in addition to being read-only, it was also missing a bunch of data from the normal user posts page. But of course, a few things were easier to read via the API, so I ended up doing both and combining the results. Seemed straightforward enough, though I had to grab the login cookie to put in the script to be able to access the logged-in page info.
Anyone want to type in their login cookie? :-)
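In case it helps anyone trying the same "API plus logged-in scrape" approach described above, here is a minimal sketch of the combining step. It assumes the public Firebase item endpoint for the API side; the scraped field names (such as `score`), the helper names, and the rule that API values win on conflicts are my own assumptions, not what the original scraper did. HTML parsing is left out on purpose.

```python
import json
import urllib.request
from typing import Any, Dict, Iterable, Optional

API_ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_api_item(item_id: int) -> Optional[Dict[str, Any]]:
    """Fetch one item from the read-only API (None if the API returns null)."""
    with urllib.request.urlopen(API_ITEM_URL.format(id=item_id), timeout=10) as resp:
        return json.load(resp)


def fetch_logged_in_page(url: str, cookie_header: str) -> str:
    """Fetch a page as a logged-in user by sending a raw Cookie header."""
    req = urllib.request.Request(url, headers={"Cookie": cookie_header})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def merge_records(api_items: Iterable[Dict[str, Any]],
                  scraped_items: Iterable[Dict[str, Any]]) -> Dict[int, Dict[str, Any]]:
    """Combine both sources by item id.

    Scraped fields fill in whatever the API lacks; non-null API fields
    overwrite scraped ones (an assumed policy, swap it if you trust the page more).
    """
    merged: Dict[int, Dict[str, Any]] = {}
    for rec in scraped_items:
        merged[rec["id"]] = dict(rec)
    for rec in api_items:
        fields = {k: v for k, v in rec.items() if v is not None}
        merged.setdefault(rec["id"], {}).update(fields)
    return merged
```

Reading the cookie from an environment variable instead of pasting it into the script keeps it out of version control, which also answers the "type in your login cookie" joke.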