There are often cases where comments with a poor score (say, 1 point) invite high-quality replies that receive many upvotes.
These threads are not shown at the top of the story’s comments page because their “root comment” has a low score.
An example, as of 2022-07-22 17:00 UTC, is https://lobste.rs/s/ekvqcf/random_wallpaper_with_just_bash_systemd#c_vop9bt:
Because the root comment has only 1 point, it remains at the bottom of the page and drags down with it the good reply with 12 points and all the other upvoted comments.
Could an alternative ranking system be put in place, where the threads and subthreads are not sorted by the score of their root comment, but by the sum of all the scores of their children?
(Maybe taking into account only children with >1 point, to avoid giving importance to superficial back-and-forth discussions.)
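To make the idea concrete, here is a minimal sketch of the sum-based score, including the optional ">1 point" filter for children. The `Comment` class, `thread_score` and `min_child_score` are hypothetical names for illustration only; this is not how Lobsters stores or ranks comments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    # Hypothetical in-memory model: a score plus nested replies.
    score: int
    replies: List["Comment"] = field(default_factory=list)


def thread_score(root: Comment, min_child_score: int = 2) -> int:
    """Root score plus the scores of all descendants scoring at least min_child_score."""
    total = root.score
    stack = list(root.replies)
    while stack:
        comment = stack.pop()
        if comment.score >= min_child_score:
            total += comment.score
        # Keep descending even below a low-scored reply, so good
        # grandchildren still count.
        stack.extend(comment.replies)
    return total
```

With a 1-point root, a 12-point reply, and a 5-point reply underneath that, the thread would score 18 instead of 1, while any 1-point replies in between are ignored.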
Maybe take the max, instead of the sum? That way it’s still comparing the score of individual comments. A highly-upvoted child would rank higher than a less-upvoted toplevel comment, but long threads wouldn’t get any bonus.
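Roughly, the max variant could look like the sketch below. It assumes a thread is represented as a nested `(score, replies)` tuple, which is a made-up structure for illustration and not anything from the Lobsters codebase.

```python
def max_score(comment):
    """Highest individual score found anywhere in the thread."""
    score, replies = comment
    return max([score] + [max_score(reply) for reply in replies])


# A 1-point root with a 12-point reply ranks like a 12-point comment.
buried_gem = (1, [(12, [(3, [])]), (2, [])])

# A long back-and-forth of 1-point comments gets no bonus for its length.
long_chatter = (1, [(1, [(1, [(1, [])])])])
```

Here `max_score(buried_gem)` is 12 and `max_score(long_chatter)` stays at 1.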
For better or worse, the deeper you go into threads the less likely it is that the discussion is still about the original submission. Your proposal, or the max variant presented elsewhere, I fear would unduly reward flamebait and off-topic discussion.
Discussions that stray from the main topic do exist, are a nuisance, and surely should not be rewarded with more attention. However, such sub-threads usually exist as a long string of short comments with no or very few upvotes. (I wonder if the data confirms this or if it is just my impression.)
For this reason I believe that a) not counting comments with score==1 or b) @dpercy’s max variant would avoid rewarding such discussions.
I’d just like to point out that the assumption here is that upvotes are a metric of quality. On lobste.rs, as in any other social context, upvotes/support are a measure of popularity.
There are many instances where the two are correlated, and some important instances where they are not.
I fully agree: upvotes are a metric of popularity and not of quality. But this proposal does not assume that.
Regardless of what upvotes represent, currently threads are sorted in descending order of (upvotes, time). All that this proposal suggests is to change that to (sum(upvotes of root-and-children), time).
The only assumption here is: if ranking a thread by upvotes of its root comment is considered good (and at the moment it is), then ranking by the sum of the upvotes of the whole thread is better.
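As a small illustration of the change in sort key, here is a sketch with made-up data. The dict fields and the choice that a newer `posted` value wins ties are assumptions for the example, not the site's actual schema or tie-breaking rule.

```python
# Hypothetical threads: root score, scores of all children, and a timestamp.
threads = [
    {"id": "a", "root": 5, "children": [1, 1], "posted": 100},
    {"id": "b", "root": 1, "children": [12, 3], "posted": 200},
    {"id": "c", "root": 5, "children": [], "posted": 300},
]


def current_key(thread):
    # (upvotes, time): only the root comment's score matters.
    return (thread["root"], thread["posted"])


def proposed_key(thread):
    # (sum(upvotes of root-and-children), time)
    return (thread["root"] + sum(thread["children"]), thread["posted"])


current = [t["id"] for t in sorted(threads, key=current_key, reverse=True)]
proposed = [t["id"] for t in sorted(threads, key=proposed_key, reverse=True)]
```

With this data, `current` is `["c", "a", "b"]` and `proposed` is `["b", "a", "c"]`: thread "b", with its 12-point reply, moves from last to first.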
Tangential to your effort here, on a personal note, if I find a discussion interesting I read all the posts regardless of ranking. On the discussions I find worthwhile I find that post quality is unrelated to upvotes.
On some discussions I find the top voted comment has devolved into a long tail of niche discussion, often acrimonious, which is not as useful as later posts.
Worst to pick out of the noise are replies to replies which are great but buried in not so great comments.
Ranking by max or sum upvotes optimizes for photogenic smackdowns of bad opinions.
Maybe disaggregate “agreement upvotes” and “contribution upvotes”?
I feel kind of honored to be taken as an example, so please let me share my personal thoughts (TL;DR below):
It is impossible to find the definite ranking system, but let me elaborate by using 3 extremes to see where the current and proposed systems fail:
It all depends on personal preference though, but I personally would like more controversial topics to be on top and less controversial ones to be secondary. The reason for that is that you otherwise might end up with echo-chambers, given users are encouraged to preach to the choir for a high karma score. As a tangent, I would prefer two karmas for each user: A “paragon-karma” and “renegade-karma” whose sum would be an “influence” score (in the end up- or downvotes both reflect a certain influence) and whose difference would be a “non-controversiality-score” (a low difference reflects controversiality, a high difference the opposite). All in all, maybe we should move away completely from the myriad of downvote-options on lobsters and simply have an “agree” and “disagree”. If something is spam, you can report it, if something is incorrect, you can write a response rectifying it (which would then be subject to a vote of agreement and disagreement, depending on how well you state your case).
The proposal to add replies into the weight would not change the ranking of controversial topics (1), but would weigh comments preaching to the choir (2) even more heavily, because everyone replying would also aim to get some karma. It would solve the presented case (3) though. Case (4) would also be favoured.
The proposal by @dpercy to take the max would not change (1)’s ranking and would probably not affect (2), but would help (3) as well. (4) is also positively affected.
TL;DR, here’s my proposal so more highly-controversial topics (those that are interesting) are more favoured: Consider root comment and replies as equals and only take influence (sum of up- and downvotes) as a score.
This benefits (1), which in the current form end up at the bottom, which makes zero sense. It also benefits (2) and (4), which is a forced compromise, given the score does not really show how “original” a comment is (this is why Reddit probably introduced awards to allow users to weigh very good posts). By taking all replies into account, it also, of course, benefits (3), which makes sense to push up.
Replies with no votes increase the influence of the thread by 1, indeed, but isn’t the purpose of the score to show what the hivemind thinks? By scoring them as zero (which was proposed here), you effectively value an elaborated response lower than a simple upvote of the original comment. Instead, why not simply “fold” long subthreads so they don’t take up too much space?
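To illustrate the influence score from the TL;DR, here is a rough sketch. It assumes a thread is a flat list of `(upvotes, downvotes)` pairs with the root first, and that a reply nobody voted on counts as `(1, 0)`. Both are assumptions for the example rather than how votes are actually stored.

```python
def influence(thread):
    """Sum of up- and downvotes over the root and all replies, treated as equals."""
    return sum(up + down for up, down in thread)


def net_score(thread):
    """Plain upvotes minus downvotes over the same comments, for comparison."""
    return sum(up - down for up, down in thread)


# Controversial root (10 up, 9 down) with one unvoted reply.
controversial = [(10, 9), (1, 0)]

# Agreeable root (8 up, 0 down) with one unvoted reply.
choir = [(8, 0), (1, 0)]
```

By net score the controversial thread gets 2 and the agreeable one 9, but by influence they get 20 and 9, so the controversial thread would move to the top.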
As another tangent: By weighing root and children, this might encourage people to respond to “deep” threads instead of writing a new post.
Let’s see what @pushcx decides in the end. :)
I really like this idea but have no idea what its implementation would represent in terms of additional load to the server.
I’ve moved most of the vote into the db, but not all. If calculated_confidence finished its move into the db, almost any algorithm would be faster than the current implementation.