You all know this, but I wrote it anyway...
Hmm...sounds like you all don't "know" it after all. Let me explain.
I don't think the system is broken at all. It's not like every game is getting review bombed. It's normally only the ones that are giving people problems at release, although I do agree that many gamers are impatient and sometimes leave a review much too soon in those cases. Still, those reviews can generally give you an idea of what issues a game might currently have.
True. Not every game gets review-bombed, but some do for reasons that have little to do with the quality of the game. Take the initial user scores for Hogwarts Legacy as an example: even reading the reviews did not seem to suffice, because without having played the game some of them looked legitimate. You would need to know the reason why it was being review-bombed. In statistics there is the fundamental concept of N, the number of observations. When N is low, you cannot say much because the uncertainty is high. As N increases, so does the certainty around central estimates (the simple average in this case). Over time N naturally increases and, as you say, so does the quality of user reviews, because people have actually played the game. Hence the system is broken in the sense that you cannot base much on initial scores, but you can read the user reviews and find out what may be the issue with the game. None of the sites I know of take N into account when they calculate average scores.
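To make the point about N concrete, here is a minimal sketch in Python. It is only an illustration, not how any review site actually computes its scores: the function name `mean_with_interval` and the score lists are made up, and the interval uses a simple normal approximation (z = 1.96 for roughly 95%), which is itself shaky for very small N.

```python
import math
import statistics


def mean_with_interval(scores, z=1.96):
    """Return (mean, half_width) of an approximate 95% interval for the mean.

    Hypothetical helper for illustration; uses a normal approximation.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate the spread")
    mean = statistics.fmean(scores)
    # The standard error of the mean shrinks with the square root of N.
    sem = statistics.stdev(scores) / math.sqrt(n)
    return mean, z * sem


# Made-up data: the same 0-or-10 voting pattern, shortly after launch and later.
early = [0, 10, 0, 10, 10]
later = early * 40  # 200 scores with the same pattern

for label, scores in (("early", early), ("later", later)):
    mean, half = mean_with_interval(scores)
    print(f"{label}: N={len(scores)}, mean={mean:.1f} +/- {half:.1f}")
```

Both samples have the same average, but the early one comes with a far wider interval, which is exactly why an initial user score on its own says so little.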
I wouldn't say broken; it's coarse because of its binary system, and prone to be review-bombed because it's customers.
This is not true. A binary system can be as effective as, and sometimes even more effective than, a scaled system with more bins. Scaled systems (0-10, for example) work poorly for user reviews because of this:
Though if you think about it, a big portion of the voters in places like Metacritic can't seem capable of perceiving any nuance anyway and any game they vote for is either a 0 or 10.
Hence, given sufficient N, a binary system is perhaps better for user reviews than a scaled system. One should note that the binary system nowhere attempts to find the best game ever, or even to compare games. All it asks is "would you recommend this game?". Hence one should not compare games based on the score. I think Steam does it nicely by binning the scores into those words. That's a statistically feasible way of doing it and easy for users to understand. What some users seem not to understand, however, is that the system is not designed for comparison.
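For a binary "recommend / don't recommend" system, one standard way to account for N is the Wilson score interval for a proportion. The sketch below is my own illustration with made-up counts; I am not claiming that Steam or any other site computes its labels this way, and `wilson_interval` is just a name I picked.

```python
import math


def wilson_interval(positive, total, z=1.96):
    """Wilson score interval (about 95% for z=1.96) for the share of positive votes."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= positive <= total:
        raise ValueError("positive must be between 0 and total")
    p = positive / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Made-up counts: both games are "90% positive", but with very different N.
for positive, total in ((9, 10), (900, 1000)):
    low, high = wilson_interval(positive, total)
    print(f"{positive}/{total} positive: plausible range {low:.2f} to {high:.2f}")
```

With 10 votes the plausible range is very wide, while with 1000 votes it is narrow. That is also why coarse word bins are a sensible presentation: they do not pretend to more precision than the data supports.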
One must read a number of comments to understand what is going on, but this could be said of any other system after all.
This. One must read.
Purely looking at the scores or the summary terms (e.g. overwhelmingly positive, mostly negative, etc.) does not tell you why the score is what it is. Reading helps, and our brains are good at compiling information. Far better than a single number.
I don't entirely agree that professional reviewers give a better indication. The information is better formatted, like the pros and cons we often see at the end, but it's also necessary to read a number of them to overcome the bias. They also tend to rush their analysis to publish earlier, and to be more forgiving of the issues.
As we have discussed before, one should not base much on a single score, but when one gathers all the professional review scores available online, as OpenCritic and Metacritic do, one gets an indication. Still, scores gathered this way are not designed for comparison. Hence, say, 86 is not always better than 74. It depends on the context (the type of game, how many similar games have been released recently, etc.). Also, scores above 90 typically indicate that the game is somehow evolutionary (using your term), not necessarily a game I personally would like. We come to reading again. One has to read. The scores are just scores and can be misleading. Hence my comment "the system is broken". Still, a game getting a critic score of 65 indicates that there's something wrong with it.
All the above assuming that a score/review should represent the quality of a game.