Comparing Amazon product ratings

JenniferMurphy

Well-known Member
The product ratings on Amazon can be helpful, but they can also be misleading. For anyone unfamiliar with the Amazon rating system, it works like this. Any Amazon customer can submit a rating on any product on a scale of 1 to 5 stars. Amazon then displays an average rating for each product. If, for example, a product has 3 5-star ratings, 2 3-star ratings, and 1 1-star rating (3 0 2 0 1), it's overall (average) rating would be 3.67 (= (3*5 + 2*3 + 1*1) / 6 = 22 / 6).

It is common for a product with just a few ratings to have a higher rating than a similar product with 10x or even 100x as many ratings. Clearly, if the ratings are equal, say 4.5, and one has just 5 ratings and the other has 500, I will have much more confidence in the one with 500 ratings. But what if the one with 5 ratings has an average rating of 4.8 and the one with 500 has an average rating of 4.6. Now which one should I feel more confident about? One has a higher average rating, but it is based on 1/100th of the data.

There ought to be a statistical measure that will help me compare them and decide.

Here is some sample data that I have made up for 4 hypothetical products. Amazon doesn't show the actual number of ratings in each category, just the percentages. So I have 2 rows for each of the 4 products. The first row contains the hypothetical counts of the actual ratings. The second row shows the percentages, which is what Amazon displays.

Column D has the product name (A, B, C, D). Columns E-I contain the actual tallies. Col J has the number of ratings, Col K has the average rating, and Col L has the Amazon rating (which is rounded to 1 decimal place). Col M has the Statistical Rating that I would like help calculating.

Product A, for example, is on rows 4 & 5. It only has 3 ratings so far and they are all 5s (3 0 0 0 0). Possibly this is a brand new product and these ratings are from three members of the manufacturer's family. In any case, the mean is 5.00 (K4) and the rating, which Amazon rounds to 1 decimal place, is 5.0 (L4).

Product B is a little more interesting. It has 30 ratings -- 28 5s, 1 4, and 1 1 -- for a mean of 4.83 (K6), which Amazon rounds to 4.8 (L6). Products C has 10 times as many ratings as B and D 10 times as many as C. Since the manufacturer has a limited number of family members, the average ratings are dropping.

 C/R D E F G H I J K L M 3 Product 5 4 3 2 1 # Mean A-Rtg S-Rtg 4 A 3 0 0 0 0 3 5.00 5.0 ??? 5 100% 0% 0% 0% 0% 6 B 28 1 0 0 1 30 4.83 4.8 ??? 7 93% 6% 1% 0% 3% 8 C 240 37 8 5 10 300 4.64 4.6 ??? 9 80% 12% 3% 2% 3% 10 D 2,010 650 120 80 140 3,000 4.44 4.4 ??? 11 67% 22% 4% 3% 5%

<tbody>
</tbody>

I would like help calculating some measure of the ratings of these products that I can use to determine which one really is the highest rated. I was thinking that some sort of confidence interval on the mean might work. I could take the lower bound or perhaps the value that is 20% from the lower to the upper bound or something like that.

Can someone help me with the correct statistical tool to compare them?

Excel Facts

Highlight Duplicates
Home, Conditional Formatting, Highlight Cells, Duplicate records, OK to add pink formatting to any duplicates in selected range.

Eric W

MrExcel MVP
Interesting question, and one that I've pondered about myself. Your idea of using a confidence interval sounds promising. Using the instructions from this site:
How to Compute a Confidence Interval in 5 Easy Steps: MeasuringU

I created the Confidence interval from your table:

CDEFGHIJKLMNOPQ
2C/R
3Product54321#MeanA-RtgS-RtgStd DevStd ErrorMargin of errorConfidence interval
4A30000355???0005 - 5
5100%0%0%0%0%
6B281001304.83333334.8???0.746640.1363170.2726344.6 - 5.1
793%6%1%0%3%
8C2403785103004.644.6???0.8867660.0511970.1023954.5 - 4.7
980%12%3%2%3%
10D2,0106501208014030004.43666674.4???1.0229060.0186760.0373514.4 - 4.5
1167%22%4%3%5%

<tbody>
</tbody>
Sheet2

Worksheet Formulas
CellFormula
J4=SUM(E4:I4)
K4=SUMPRODUCT(E\$3:I\$3,E4:I4)/J4
L4=ROUND(K4,1)
O4=N4/SQRT(J4)
P4=O4*2
Q4=ROUND(K4-P4,1)&" - "&ROUND(K4+P4,1)

<tbody>
</tbody>

<tbody>
</tbody>

Array Formulas
CellFormula
N4{=STDEV.S(IF(ROW(INDIRECT("1:"&J4))<=E4,5,IF(ROW(INDIRECT("1:"&J4))<=E4+F4,4,IF(ROW(INDIRECT("1:"&J4))<=E4+F4+G4,3,IF(ROW(INDIRECT("1:"&J4))<=E4+F4+G4+H4,2,1)))))}

<tbody>
</tbody>
Entered with Ctrl+Shift+Enter. If entered correctly, Excel will surround with curly braces {}.
Note: Do not try and enter the {} manually yourself

<tbody>
</tbody>

It does provide some value, but product A still has far too few ratings to rely on, despite the Margin of Error = 0.

As as aside, I try to get an idea of the product by reading some reviews from each level (1-5). I ignore any that have a 1 simply because it never arrived, but I try to look to see if there is a valid complaint, something that comes up more than once, and something that matters to me. "Color not the same as the picture" I tend to ignore, "wire is flimsy and broke the second day" I pay attention to. And I often discount 5s because of the reasons you say, that they are planted.

Because of issues like this, you may want to create a formula that returns "insufficient info" if the number of reviews is below a certain number. And throws out the top 5% and bottom 5% of reviews to hopefully get rid of the planted reviews and the reviews that are too low for non-product related reasons.

Hope this helps.

Last edited:

Replies
0
Views
93
Replies
3
Views
149
Replies
4
Views
114
Replies
0
Views
161
Replies
5
Views
284

1,190,798
Messages
5,982,974
Members
439,810
Latest member
phobo3s

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back