Re: On The Subject of Ratings

#81

eric_river Wrote:
Entitled Wrote: Why can't individual ratings simply be as public
though it doesn't bother me that ratings aren't public
i do put my name behind the ratings that i stick
i only rate what i review, though they do get downvotes
but everybody has a different style to float their boats
peoeating

You, my dear poet, are not a troll and not part of the problem. peolove

Re: On The Subject of Ratings

#82

Thedude3445 Wrote: One issue with trying to take reading time into account is that not everyone reads while logged in, and many people read on other websites then rate on Royal Road. For a while on my newest book I actually had more favorites than ratings, because so many readers from SpaceBattles, I assume, came to follow and put it on their favorites list without rating it yet. Surely there's ways to lessen the impact of trolls or make it harder to do something against them, but I still have no idea what it is.

I understand, but I sincerely doubt that "many people" is really the case; it's more likely the exception. But anyway, it apparently cannot be done.

Re: On The Subject of Ratings

#83
Wow, talk about a blast from the past. Making me feel like a cub again.


Since this post has come back from the grave, I figured I might as well pop in and provide some data for you math lovers out there. Back when I made this thread in late 2019, Delve was ranked on Best Rated in the low single digits (not exactly sure of the ranking, but it peaked at 3rd earlier that year) with an average score of 4.66 according to the analytics. Today, it has fallen to rank 161 with an average score of 4.522.


16,330 followers, 5,314 ratings, 348 reviews
32.5% of followers left ratings*
2.1% of followers left reviews*
*Not quite because of unfollows, but whatever
Stars | Count (Ratings+Reviews)
------|------------------------
  5   | 3669
 4.5  |  768
  4   |  509
 3.5  |  222
  3   |  142
 2.5  |   96
  2   |   91
 1.5  |   44
  1   |   29
 0.5  |   92




Doing the average myself, that’s 4.5219, so pretty much exactly what the fiction page displays. No apparent difference between ratings and reviews on the average, though apparently there is a difference when it comes to determining rank.


Comparing positive votes (defined here as 3 and up) to total votes, that’s 5310/5662, or 4.689 on a 5-point scale. Should be a decent approximation of the rating under a thumbs-up/thumbs-down system. Whether that would place it higher or lower in the rankings, I have no idea.
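For anyone who wants to re-run the arithmetic, a quick Python sketch using the counts from the table above (note the positive-vote total works out to 5310, so the "4.689 on a 5-point scale" figure checks out):

```python
# Star counts from the table above (stars -> number of ratings+reviews).
counts = {5.0: 3669, 4.5: 768, 4.0: 509, 3.5: 222, 3.0: 142,
          2.5: 96, 2.0: 91, 1.5: 44, 1.0: 29, 0.5: 92}

total = sum(counts.values())                              # 5662 votes
average = sum(s * n for s, n in counts.items()) / total   # weighted mean
positive = sum(n for s, n in counts.items() if s >= 3.0)  # 3 stars and up
thumbs_equivalent = 5 * positive / total                  # thumbs-up proxy

print(f"total={total}, average={average:.4f}, "
      f"positive={positive}, thumbs={thumbs_equivalent:.3f}")
# -> total=5662, average=4.5219, positive=5310, thumbs=4.689
```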


Anyway, no further commentary from me. It’s bad for my liver.


PS: Still not a bear.

Re: On The Subject of Ratings

#84

SenescentSoul Wrote: Wow, talk about a blast from the past. Making me feel like a cub again.


Since this post has come back from the grave, I figured I might as well pop in and provide some data for you math lovers out there. Back when I made this thread in late 2019, Delve was ranked on Best Rated in the low single digits (not exactly sure of the ranking, but it peaked at 3rd earlier that year) with an average score of 4.66 according to the analytics. Today, it has fallen to rank 161 with an average score of 4.522.

[... table and calculations snipped; see the post above ...]

Anyway, no further commentary from me. It’s bad for my liver.


Thank you for the data. In my opinion, your data actually confirms that there was a large number of tactical 0.5 ratings in your case as well, many more than can be explained statistically. But since you apparently, for whatever reason, already had enough higher ratings by the time these additional 0.5 ratings were given, they had no impact and didn't harm your story.

Please hear me out (and no answer from you is required - your liver is more important!).

Looking at your data, the number of ratings goes down roughly exponentially all the way to 1 star, until there is an anomalous surge at 0.5. The count of 0.5 ratings is three to four times higher than it should be if it followed the regular statistical trend, even if we include people who would otherwise have given 0 stars, were that possible.

With your story's large overall number of ratings, this anomalous difference is statistically highly significant (i.e., highly unlikely to be chance). It looks like (again, with a very high level of statistical significance) your story got roughly 70 additional ratings of 0.5 stars, on top of the around 20 that would have been expected from a regular statistical distribution, according to the numbers in your table.
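To illustrate the kind of estimate I mean: fit a log-linear (exponential) trend to the counts from 1 to 4.5 stars and extrapolate down to 0.5. This is only a sketch; excluding the 5-star bucket (its own well-known bulge) and the simple least-squares fit are my choices, nothing RR actually does:

```python
import math

# Star counts from SenescentSoul's table; the 0.5-star bucket is held out
# as the value under test, and the 5-star bucket is excluded as an outlier.
counts = {1.0: 29, 1.5: 44, 2.0: 91, 2.5: 96,
          3.0: 142, 3.5: 222, 4.0: 509, 4.5: 768}

# Least-squares fit of log(count) = a + b * stars.
xs = list(counts)
ys = [math.log(counts[x]) for x in xs]
xbar = sum(xs) / len(xs)
ybar = sum(ys) / len(ys)
b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
     / sum((x - xbar) ** 2 for x in xs))
a = ybar - b * xbar

expected_half_star = math.exp(a + b * 0.5)  # extrapolate trend to 0.5 stars
observed_half_star = 92

print(f"expected ~{expected_half_star:.0f}, observed {observed_half_star}")
```

The extrapolation lands near 18, so the observed 92 is an excess of roughly 70 ratings, in line with the estimate above.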

Now, it would be interesting to know at what point in time these additional 0.5 ratings were made. If they were distributed evenly over the time since the publication of your story, and such a distribution happens to every story, then there must be another explanation for the unexpectedly high number of 0.5 ratings. But if they were given while the story was trending or otherwise visible on the front or second pages of RR, then there's something fishy going on, which can more or less only be explained by the existence of around 70 tactical raters. If at that time your story already had enough higher ratings, these 70 artificial additional ratings wouldn't have had a disproportionate impact. But if that hadn't been the case (i.e., not yet enough regular, representative, organic ratings, which is often the situation for young, budding first-time authors), then the impact would have been dramatic, and your story wouldn't have been able to recover from it later, no matter what.

In your story's case, the tactical raters, from their point of view, were too late and thus "lost" the window of time during which they could have bombed your story. Maybe that even reduced their number, because they gave up early after seeing they no longer had any impact. That would mean these 70 might otherwise have been even more.

Small initial differences here on RR have huge effects later on, and once a story has been bombed, nothing can be done (besides maybe canceling, or not counting, these artificial additional 0.5 ratings).

So, I guess there are around 70 to maybe 100+ tactical raters. They bring a lot of misery to first-time authors, and cost readers good stories. As a consequence, they also have a large detrimental effect on RR. These comparatively very few tactical raters wield a staggering, completely disproportionate power compared to the rest of the readers, and thus have a huge, totally disproportionate negative impact on everyone else.

One simple solution for RR could be checking at what time a suspicious, statistically unexplainable number of 0.5 ratings occurs. This is a simple formula, maybe in combination with recording which users made these ratings. Then, if there are indications of such additional, artificially inflated 0.5 ratings, don't take them into consideration when calculating and displaying the ratings for everyone else. To make it a bit more effective, show the uncorrected ratings to the identified tactical raters. Then they would have less incentive to switch to giving 1 star instead of 0.5 (the number of 1-star ratings does not appear artificially inflated according to your table), unless they are organized. If they are, or become, organized... then the "normalization" (aka correction) formula will have to be a bit more complex, but the tactical raters will be made impotent.
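The timing check could look something like this. A minimal sketch, assuming RR can count a story's 0.5-star ratings per week; the baseline rate, weekly windows, and threshold here are illustrative assumptions, not RR's actual parameters:

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam): the chance of seeing k or more
    0.5-star ratings in a window if they arrive at baseline rate lam."""
    return 1.0 - sum(math.exp(-lam) * lam ** i / math.factorial(i)
                     for i in range(k))

def flag_suspicious_weeks(weekly_half_stars, baseline_rate, alpha=1e-4):
    """Return indices of weeks whose 0.5-star count is statistically
    unexplainable under the story's baseline rate."""
    return [week for week, k in enumerate(weekly_half_stars)
            if poisson_tail(k, baseline_rate) < alpha]

# Hypothetical story: ~0.5 organic half-star ratings per week,
# then a burst of 6 while visible on Trending.
weeks = [0, 1, 0, 6, 0, 1]
print(flag_suspicious_weeks(weeks, baseline_rate=0.5))  # -> [3]
```

Ratings landing in flagged windows could then be excluded from the displayed average, or at least queued for human review.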

I can help with the formula if RR is interested. It's fairly simple, and would help so many people...

Re: On The Subject of Ratings

#86

eric_river Wrote:
is there now a limit to how many one can do?
if you spammed a hundred half-stars, would mods detect you?
peoconfused


I hope they would,
but if not, we could
😳
try to force them to
by making the bombing too,
even for them, extreme to
ignore, for all stories, too.
🤯
Let me quickly spin up
a python script to put up
🐍
on a bot net cluster to try
seeing what say the mod guy.
🙈🙉🙊
And don't you worry, I won't
really do this, unless they don't
😜
react somewhat at least meaningfully
within what can be expected realistically.
🙏
Sorry for the bad rhymes, 
but these are trying times. 
❤️ 


Re: On The Subject of Ratings

#87

eric_river Wrote:
is there now a limit to how many one can do?
if you spammed a hundred half-stars, would mods detect you?
peoconfused


Just saying (I have no, absolutely no, intent of doing anything like that), a mere two 0.5 ratings on Short river songs would, at the moment, make the story invisible and chanceless forever. Four would do it for Hero's song. These young stories are fragile. Seventy 0.5 ratings at this point would give them overall scores of 0.7 and 1.3, respectively.
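That fragility is plain weighted-average arithmetic. A quick sketch; the current rating counts I plug in for the young story are hypothetical, not the real numbers for either story:

```python
def bombed_average(current_avg, num_ratings, bombs, bomb_value=0.5):
    """Overall score after `bombs` additional low ratings land."""
    total = current_avg * num_ratings + bomb_value * bombs
    return total / (num_ratings + bombs)

# Hypothetical young story: 3 ratings averaging 5.0 stars.
print(round(bombed_average(5.0, 3, 2), 2))    # two bombs drag it to 3.2
print(round(bombed_average(5.0, 3, 70), 2))   # seventy bombs: ~0.68

# For comparison, an established story barely moves:
print(round(bombed_average(4.52, 5662, 70), 2))   # ~4.47
```

The same 70 bombs that bury a three-rating story barely dent one with thousands of ratings, which is the whole asymmetry.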

Re: On The Subject of Ratings

#88

Entitled Wrote: I understand, but I sincerely doubt that "many people" is really the case. This would likely rather be the exception. But anyway, it apparently cannot be done.


Beware of Chicken only exploded because of offsite readers (also on SpaceBattles) who came en masse to rate and review the story once it started to be posted on Royal Road. That's a whole lot of people who were probably brand-new RR users and hopefully came to enjoy other stories on the site as well. It's only an exception because, strangely, not very many authors crosspost their stories even when it's helpful.

Anyway, yeah, doesn't really matter. What does matter is that young authors get their lives crushed by jerks, and bullies can drag a story down with one 0.5 with much more power than a bunch of people can lift that same story up with 5s, due to the heavy skew towards 5. One 0.5 is enough to take a story down by something like 0.2 when it has fewer than 100 total ratings.

One stupid idea: What if authors could turn off ratings or reviews for a story? It'd remove them from the site ranking list, but it would allow peace of mind for the authors who really don't want them.

Re: On The Subject of Ratings

#89
Huh. That explains it. I was wondering for a while how Beware of Chicken became so successful so rapidly and out of nowhere, even if it is in a good genre and well written for the site. Everything makes sense now.

Also, I’m still in the "system is fine" camp: dealing with bombs when getting on Trending is just par for the course, the five-star system is basically expected because of how online reviews are usually formatted, and it doesn’t actually make a difference for the math on the whole. The only actual issue I see is the psychological one, where a creator gets mentally tied to the algorithm, like on all social media. That is only a problem of how the numbers are visibly displayed by default, and of how hard it is for most people to just ignore them. The numbers shouldn’t change; people’s ability to see them (specifically the feedback that can have detrimental outcomes) might need to change to a better optimum, maybe: hiding the numbers by default, or by choice, in specific psychologically tactical ways.

Re: On The Subject of Ratings

#90

Entitled Wrote: With your story's large overall number of ratings, this anomalous difference is statistically highly significant (i.e., highly probable). It looks like (again, with a very high level of statistical significance) your story got more or less an additional 70 ratings of 0.5 stars, i.e., in addition to the around 20 that would have been expected if it were a regular statistical distribution, according to the numbers in your table.


This is false. The statistical distribution of any rating system has a strong bias towards both ends of the scale. It's not expected to be linear all the way to the bottom. This is not an RR thing. Even if it did follow a "proper" statistical distribution, it'd have to be a normal distribution centered at the middle of the scale, but rating systems naturally bias towards the two ends due to psychological and social factors.



Thedude3445 Wrote: Beware of Chicken only exploded because of offsite readers (also on SpaceBattles) who came en masse to rate and review the story once it started to be posted on Royal Road.

This is also untrue. BoC was posted on DB and RR at roughly the same time, and had no link to RR to drive traffic here, nor is it nearly as successful there.


Thedude3445 Wrote: One stupid idea: What if authors could turn off ratings or reviews for a story? It'd remove them from the site ranking list, but it would allow a peace of mind for the authors who really don't want it.
This has been raised before, and most authors didn't like the idea at all. It'd also be very unfair towards readers, and it would have to remove you from virtually every single list on the website.



Endless Wrote: Hiding the numbers by default or by choice in specific psychologically tactical ways and stuff.
We already offer a way to hide them. I can't do any better for those who obsess over it; the lists ultimately have to include the average.

Re: On The Subject of Ratings

#92
kanadaj Wrote: This is false. The statistical distribution of any rating system has a strong bias towards both ends of the scale. It's not expected to be linear all the way to the bottom. This is not an RR thing. Even if it did follow a "proper" statistical distribution, it'd have to be a normal distribution centered at the middle of the scale, but rating systems naturally bias towards the two ends due to psychological and social factors.

I didn't talk about a linear distribution, nor would a normal distribution with a bulge in the center make sense. I agree that the distribution can legitimately be like a "U", especially in cases where opinions are very divided, e.g., for ideological (political or religious...) reasons.

But for most stories that have lots of readers, the distribution will most likely rise exponentially (more readers = more higher ratings) starting from 1 star, yet there will be unexpectedly many 0.5-star ratings at the left end of the curve.

I know this is a problem of many star-based rating systems, but that doesn't mean nothing can be done. Rotten Tomatoes' 2019 decision to allow ratings only from users who bought tickets through them reduced the problem tremendously. In their case it has Hollywood-level financial consequences. In RR's case, the budget will not be as high (I guess I can safely assume), but the psychological fallout for creators is the same or even worse.

The issue is the sensitivity, the precise timing, and the specific bias against new authors. A new story from an established author with many readers from previous stories is always immune to the problem. Exceptions among first-time authors, like BoC or AH etc., do not disprove this. They were lucky, and of course very good. But I see long stories with 4.0x ratings and few readers whose quality IMO surpasses stories with 4.6x ratings. What happened there? If what you wrote were true, there couldn't be any bias against new authors.

Many sites allow researchers or even the interested public to analyze their anonymized data, including IMDB and rotten tomatoes. There's a lot of research and even competitions for best algorithms in the field. Maybe allowing independent scrutiny could help. Then we wouldn't have to speculate so much, and everyone could make informed proposals based on the data. Something has to be done.

And what about the idea of making the ratings public, like for reviews, as has been proposed before? Are there any arguments against that? The system is already anonymous anyway, so what's holding this up? And even as a reader, I have no way to find what stories I myself have already rated (without having submitted a written review).

Having said that, I would like to express my sincere and overwhelming gratitude to the creators of Royal Road, and to all authors for their writing and sharing their stories. From my experience Royal Road is in so many ways far superior to other similar sites. One of the aspects is the kindness and respect. Let's make it even better, together!

Re: On The Subject of Ratings

#93

Entitled Wrote: I didn't talk about a linear distribution, nor would a normal distribution with a bulge in the center make sense. I agree that the distribution can legitimately be like a "U", especially in cases where opinions are very divided, e.g., for ideological (political or religious...) reasons.
The J curve shape is also present in any 5 or 10 point rating system because extreme opinions tend to dominate these systems. There is evidence of this on Amazon (it's typical for 1-star counts to be 1.5-2x the number of 2 stars), IMDB (just look at Goodfellas (1990) - User ratings - IMDb or The Shawshank Redemption (1994) - User ratings - IMDb, two of the highest rated movies on there), and virtually any other site with a 5 or 10 point rating system.



Entitled Wrote: Rotten tomatoes' 2019 decision to allow ratings only from users who bought tickets


They do not in fact require you to buy tickets; they separate ratings between those who are verified and those who aren't, but that's all. And this is simply not an option for us: there is no financial background to fictions here, and contrary to what you might think, it's at minimum an order of magnitude harder to tell whether a user has read something than it is to tell whether a movie ticket is valid and has been used. Defining what constitutes a read is very hard, correlating sparse data is even harder, and storing and processing this kind of data is a million-dollar problem.

Just so you understand, we are talking in the magnitude of millions to tens of millions of data points per day, depending on what exactly is stored. For Rotten Tomatoes, that's equivalent to verifying somewhere between the same number of tickets and 10-100x as many - this is because they simply have to check whether a ticket is legitimate; a simple lookup. On the other hand, we'd have to correlate every single page view and page event from a given user and use heuristics to decide whether it's a legitimate read, so a single review corresponds to hundreds of events in the data, rather than just a single ticket.


Entitled Wrote: Many sites allow researchers or even the interested public to analyze their anonymized data, including IMDB and rotten tomatoes.
Let me present evidence for why this is a terrible idea. The TL;DR is that such public datasets actually allow you to identify the users and constitute a potentially severe privacy breach.



Entitled Wrote: And what about the idea to make the ratings public, like for reviews, as has been purposes before?
Okay, so let's suppose for a moment that all those ratings were suddenly public, including the tens of thousands of ratings by regular authors who have actually read and disliked your book. Suddenly, we'd be looking at a war between these authors, you, and your entire fan base.


There is a reason election voting is private and confidential. In the privacy of the booth, you can express your honest opinion without public scrutiny and social pressure.

Reducing the number of 0.5s won't make things any better, and realistically this isn't possible anyway. The J curve will remain, and your attempt to flatten it would have dire consequences to the broad community as well as the already messy rating system of the website. The problem of ratings is not that we have too many 0.5s, it's that we have too many 5s.

The more you reduce the remaining 0.5s, the more the ones that remain will hurt, and they will hurt. At the end of the day, you aren't fixing the problem, you are just shifting it elsewhere.


Entitled Wrote: And even as a reader, I have no way to find what stories I myself already rated (without having submitted a written review).


Let me introduce you to My Reviews | Royal Road. Contrary to the name, it also includes ratings.

Re: On The Subject of Ratings

#94
kanadaj Wrote: ...


First of all, thanks. This is a lot to process, and I'm not sure I can add anything relevant. I'll keep thinking about it, but I accept it for now.

kanadaj Wrote: Let me introduce you to  My Reviews | Royal Road. Contrary to the name, it also includes ratings.

Oh, ok. I didn't know this, it's exactly what I was looking for. Awesome, and thanks a lot! Also the links to all my comments etc. Very useful.

I didn't find any link from my profile pages to that page. I'm on the mobile interface, so maybe that's the reason? I tried desktop mode on my mobile, but then everything becomes very small. I still couldn't find it. Maybe I'm missing something obvious? Thanks!
DrakanThinking

Re: On The Subject of Ratings

#95
So...  there's a whole lot of talk in this thread about numbers and bears.

What I don't understand is why no one is talking about couches.  

See, the way I see it, if the comfortability rating of the couch is at a 7 or higher on a 10 scale, it's going to have a positive impact on reader and writer lives that equates to an increase of anywhere from 1.2 to 2.4 on the happiness survey. With such an increase in base happiness, you'll see writers write more and readers become more receptive to new stories. I'd estimate it correlates to anywhere from a 5% to a 35% increase in writing speed, as well as a 45% increase in new readers. People want to spend time on comfortable couches, and the choices of couch activities are reading or watching tv.

Thus, the problem isn't the rating system, but a deficiency in quality furniture.  

Too many people are constrained by rising furniture costs, or by choosing appearance over comfort.

This is a socioeconomic problem that must be resolved immediately!

Re: On The Subject of Ratings

#97
I have written reviews for 2 stories (well, 3, but only 2 are useful here). One is Delve, and I gave it 1.5 stars. One is Skill Trainer, and I gave it 1 star. I will give ratings to stories I really enjoy or really hate, but I am only motivated to write reviews for specific stories. Stories, typically, that betrayed me: I liked them a lot until I hated them. I've given 0.5-star ratings, I think; I can't recall for sure. Definitely done 5-star ratings.

People are discussing the supposedly too-high number of 0.5-star ratings. I think those people are wrong. For 4.5 versus 5, there are definitely stories that I loved but that had one minor thing that annoyed me: 4.5 stars. But what is the feel or experience of a story worth 1 star? What does that rating describe? I can honestly say I have seen *way* more 0.5-star stories than 1-1.5 star stories.

In fact, the two stories I reviewed are great examples. In the case of Delve, the story made a promise that I was really excited about, and then it broke that promise in an extreme way. A key problem is also weekly releases. Weekly releases magnify negatives, especially for pacing. If Delve were a book or even a finished web serial, I could grind through the promise-breaking parts quickly. Or I would see how long that section of the story was and quit. As a weekly release, Delve creates a sort of Skinner Box whirlpool that is difficult to escape mentally. "Maybe the story will get back to what I like." Just waiting out the slow part. But the slow part takes 2 years or w/e, and you can't know that ahead of time. Massive thanks to a few authors who, when I asked in DMs, gave me a broad plot/pacing spoiler that let me stop reading their story knowing it wouldn't end up working for me.

In the case of Skill Trainer, my primary problem was a character arc moment that I absolutely despised. Even if the author had pulled off a fantastic characterization (which I dispute), that particular character moment was a massive no. And it came pretty near the end of the story. Also, the story had a shock ending, possibly because of all the anger over that plot point, which I wasn't alone in hating.

But essentially, the point is that a 1 or 1.5 rating means something like: several major things I hated, but the story was well written from a grammar perspective or something. Just not a common situation for stories on this site. Whereas 0.5 means: I'm only rating it this because you don't let us rate 0 or negative.