Two Hobbyists Made One of This Year’s Best Video Games (Chants of Sennaar)

nanoUFO@sh.itjust.works · edit-2 9 months ago


FooBarrington@lemmy.world · 9 months ago

I suck at math, but if the mean is sufficiently over the “positive” threshold, and there’s a low standard deviation across reviews, wouldn’t this have the problem I describe?

Since Steam reviews are only positive or negative, not on a point scale, I’m not sure how this problem would come to pass. The distribution of reviews around the mean is expected to be similar for the 10/10 game and the 7/10 game you describe, and since the review system itself is boolean, the result is not distorted.

The more certain people are about the quality of good games, the less relevant the ratio becomes, which is perhaps the opposite of what you would want.

Why does the ratio become less relevant the more certain people are about the quality of good games? Again, the review is only positive or negative, no actual review number assigned. In which cases do you expect the ratio to drift away from the actual useful information?

bogdugg@sh.itjust.works · edit-2 9 months ago

Why does the ratio become less relevant the more certain people are about the quality of good games? Again, the review is only positive or negative, no actual review number assigned. In which cases do you expect the ratio to drift away from the actual useful information?

It’s the lack of a review number, combined with varying certainty, that makes the ratio bad information for judgment calls about quality. If people are certain the game is a 7/10, that could produce a better score than being less certain about an 8/10, because the wider distribution (less certainty) could put more reviews below the positive/negative threshold.

The following reviews: 6/10, 6.5/10, 7/10, 7.5/10, 8/10 will produce a 100% rating. More certain, less useful.

The following reviews: 4/10, 6/10, 8/10, 10/10, 10/10 will produce an 80% rating. Less certain, more useful.
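To make the arithmetic in these two examples explicit, here is a minimal sketch. It assumes each reviewer has a hidden score out of 10 and clicks "positive" when that score is at or above a cut-off. The cut-off of 5.0 and the names `summarize`, `clustered`, and `polarized` are hypothetical, chosen only for illustration; Steam exposes no such scores.

```python
from statistics import mean, pstdev

# Assumption: a hidden score at or above this cut-off becomes a thumbs-up.
POSITIVE_THRESHOLD = 5.0


def summarize(scores, threshold=POSITIVE_THRESHOLD):
    """Return the percent of positive reviews plus the mean and spread of the hidden scores."""
    positive = sum(1 for score in scores if score >= threshold)
    return {
        "percent_positive": 100 * positive / len(scores),
        "mean": mean(scores),
        "spread": pstdev(scores),  # population standard deviation
    }


# The two sets of hypothetical scores from the comment above.
clustered = [6, 6.5, 7, 7.5, 8]
polarized = [4, 6, 8, 10, 10]

print(summarize(clustered)["percent_positive"])  # 100.0
print(summarize(polarized)["percent_positive"])  # 80.0
```

Under that assumption the clustered set gets the higher percentage even though the polarized set has the higher mean score, which is the mismatch being described.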

It’s only consistent if you assume all games follow the same distribution, which is not how reviews work in my opinion. There are many websites which do surface score information, and they follow wildly different distribution patterns depending on many different factors.

Again, it is useful for predicting whether you’ll like it, but bad for predicting how much you’ll like it.

FooBarrington@lemmy.world · 9 months ago

It’s the lack of a review number, combined with varying certainty, that makes the ratio bad information for judgment calls about quality. If people are certain the game is a 7/10, that could produce a better score than being less certain about an 8/10, because the wider distribution (less certainty) could put more reviews below the positive/negative threshold.

But what do you mean by “more certain” and “less certain”? Again, Steam doesn’t have reviews beyond boolean values.

The following reviews: 6/10, 6.5/10, 7/10, 7.5/10, 8/10 will produce a 100% rating. More certain, less useful.

And since Steam doesn’t have point-based reviews, the 100% rating is fully correct, as presumably each of these reviewers gave a positive review.

The following reviews: 4/10, 6/10, 8/10, 10/10, 10/10 will produce an 80% rating. Less certain, more useful.

How is it “less certain”?

It’s only consistent if you assume all games follow the same distribution, which is not how reviews work in my opinion.

Do you have statistical analyses that show this assumption to be wrong?

bogdugg@sh.itjust.works · 9 months ago

My entire argument stems from the idea that you can ascertain quality from these ratings, which I am refuting. The rating is “correct” in that it is measuring something, and as long as people keep in mind what that something is, there is no problem. But this article, for example, uses the flood of positive reviews to make the case that it is one of the best of the year, which I believe is faulty reasoning.

What I meant by “certain” is that the reviews are more clumped together, so there’s more agreement among different people about the quality of the product. (Again, this assumes a score: even though one isn’t present, presumably you could attach one to each review.) If you don’t agree that games can be more or less polarizing, you won’t agree with this point unless I back it up with data, which I’m not going to spend time doing. You could go through Rotten Tomatoes and compare Critic Score with RT Score, since they surface both values, and see how closely the two track on different parts of the spectrum.
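As a rough illustration of the "clumped together" idea, one could model the hidden scores as normally distributed and ask what share lands above the positive/negative cut-off. Everything here is an assumption made for the sketch: the normal model, the cut-off of 5.0, the example means and spreads, and the helper name `positive_share`. None of it is measured from real reviews.

```python
from statistics import NormalDist

# Assumption: scores at or above this cut-off become positive reviews.
POSITIVE_THRESHOLD = 5.0


def positive_share(mean_score, spread, threshold=POSITIVE_THRESHOLD):
    """Share of reviews expected to be positive if scores follow Normal(mean_score, spread)."""
    return 1.0 - NormalDist(mu=mean_score, sigma=spread).cdf(threshold)


# A tightly agreed-upon 7/10 versus a polarizing 8/10 (both hypothetical).
agreed_seven = positive_share(mean_score=7.0, spread=0.5)
polarizing_eight = positive_share(mean_score=8.0, spread=3.0)

print(f"agreed 7/10:     {agreed_seven:.1%} positive")
print(f"polarizing 8/10: {polarizing_eight:.1%} positive")
```

With these made-up numbers the 7/10 game ends up with the higher positive ratio, because almost none of its narrow distribution falls below the cut-off. A different model or cut-off would change the sizes, so treat this as a picture of the argument rather than evidence for it.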