The five-star review system has long been a cornerstone of digital platforms, from app stores to e-commerce sites. But as Terry Godier, creator of the excellent RSS reader Current, recently discovered, the system is fundamentally flawed. Godier noted that many apps receive four-star reviews accompanied by glowing praise like "This is my favorite app!" or "Gamechanger!" Yet those same 4-star ratings pull down the overall average of any app rated above 4.0, and anything below five stars tends to be perceived as a failure. This paradox, exhibit 472,304 in the long-running saga of broken review systems, raises uncomfortable questions about how we evaluate products in the digital age.
The Problem with 4-Star Reviews
Godier's observation strikes at the heart of the issue: a 4-star rating, in the current climate, is effectively a negative review. When users see an app with a 4.2 average, they often hesitate to download it, assuming something must be wrong. Yet the comments attached to those 4-star reviews often express pure delight. One user might write, "This app changed my life! 4 stars because nothing is perfect," while another says, "I use it every day. 4 stars." The net effect is that apps that are genuinely beloved are punished by the very system meant to showcase them.
This phenomenon is not new. Researchers have documented the so-called "J-shaped distribution" of online ratings, where products tend to receive mostly 5-star reviews, a smaller cluster of 1-star reviews, and comparatively few ratings in between. But in recent years, the middle ground has become even more problematic. Many users now treat the 5-star rating as a binary: either it's perfect (5 stars) or it's not. And because no product is truly perfect, even the best tools get marked down to 4 stars.
Historical Context: From Amazon to the App Store
The 5-star system was popularized online by websites like Amazon in the late 1990s, where it was intended to give customers a quick glance at product quality. Early on, it worked reasonably well: a 4-star item was generally considered very good, and 3-star meant average. But over time, user behavior shifted. As the internet became more polarized, so did ratings. By the time Apple launched the App Store in 2008, a 5-star rating was already becoming the only acceptable score for a worthwhile product.
Today, a 4.5-star app is considered mediocre in many categories. Developers fret over their average rating, often begging users for 5-star reviews to compensate for the inevitable 1-star rants. Some apps resort to dark patterns, such as showing the rating prompt only to users who have opened the app multiple times, hoping to catch them in a positive mood. Others offer in-app rewards for reviews, though this violates many store policies. The system has become so gamified that honest feedback is nearly impossible to glean from star ratings alone.
The Psychological Impact on Developers
For independent developers like Terry Godier, the broken review system is more than an annoyance; it's a career threat. When an app has only a handful of ratings, a few 4-star reviews can drop its average from 4.8 to 4.5, making it appear inferior to competitors who have aggressively solicited 5-star ratings. Godier's app, Current, is a modern RSS reader with a clean interface and robust features. It deserves high praise, yet the system punishes it for not being "perfect."
This creates a perverse incentive: developers must either pressure users for 5-star ratings or accept that their app's rating will slowly decline, even if every single user loves it. The App Store's algorithm compounds the problem by favoring apps with higher ratings in search results, meaning a single 4-star review can have cascading effects on visibility and downloads. Small developers, lacking the marketing budgets of larger companies, are hit hardest.
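To make the arithmetic behind a drop from 4.8 to 4.5 concrete, here is a minimal Python sketch. It assumes the store displays a plain arithmetic mean of all ratings, which real stores may not do (they can weight or round differently). The function name `four_star_reviews_needed` is hypothetical and used only for illustration.

```python
from fractions import Fraction
import math


def four_star_reviews_needed(current_avg: str, review_count: int, target_avg: str) -> int:
    """Smallest number of new 4-star reviews that brings a plain
    arithmetic mean from current_avg down to target_avg or lower.

    Averages are passed as strings (e.g. "4.8") so they can be parsed
    exactly, avoiding floating-point rounding surprises.
    """
    current = Fraction(current_avg)
    target = Fraction(target_avg)
    if review_count <= 0:
        raise ValueError("review_count must be positive")
    if not 4 < target < current:
        raise ValueError("need 4 < target_avg < current_avg")
    # Solve (current * n + 4 * k) / (n + k) <= target for k:
    #   k >= n * (current - target) / (target - 4)
    needed = review_count * (current - target) / (target - 4)
    return math.ceil(needed)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        k = four_star_reviews_needed("4.8", n, "4.5")
        print(f"{n} existing ratings: {k} new 4-star reviews")
```

Under this assumption, the number of 4-star reviews required grows in proportion to the existing review count, so the effect is sharpest for small apps with few ratings.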
What Users Actually Mean by Star Ratings
Studies have shown that users interpret star ratings inconsistently. For some, 5 stars means "exceptional and flawless." For others, 5 stars merely means "I like it." A 4-star rating might mean "very good but not perfect" or "I ran into one minor bug." Yet the system treats all 4-star ratings identically, collapsing nuanced opinions into a single number. This flattening of feedback is at the root of the problem.
Moreover, the system encourages a binary mindset: either you love it (5 stars) or you hate it (1 star). The middle ratings become a kind of no-man's-land, where articulate, thoughtful feedback goes to die. A well-written 3-star review that explains both strengths and weaknesses is far more useful than a knee-jerk 5-star review that says "great app!", yet the numeric system values the latter more.
Potential Solutions and Alternatives
Some platforms have experimented with alternatives. Netflix famously abandoned its 5-star system in favor of a simple thumbs up/thumbs down, recognizing that nuanced ratings were not being used effectively. However, that system has its own drawbacks—it loses granularity and makes recommendations less precise. Others have tried 10-point scales, binary upvote/downvote systems, or entirely text-based reviews without numerical scores.
A more promising approach might be to display not just the average but also the distribution of ratings. If users can see that an app's 4.7 average is made up of 70% 5-star and 30% 4-star reviews, they can interpret the data more intelligently. Alternatively, platforms could weight reviews by recency or by verified usage. Another idea is to let the written review temper the star rating: for example, a 4-star rating could be displayed as 4.5 if the written review is positive, or the comment could override the star value to some extent.
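As a rough illustration of two of these ideas, the Python sketch below computes a rating distribution alongside the average, and a recency-weighted average. The function names, the exponential decay, and the 180-day half-life are all assumptions chosen for the example, not how any store is known to work. For a 70/30 split of 5-star and 4-star reviews, the plain mean works out to 4.7.

```python
from collections import Counter


def rating_summary(ratings):
    """Return (mean, shares) for a list of integer star ratings 1..5.

    shares maps each star value (5 down to 1) to its fraction of all ratings.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    counts = Counter(ratings)
    total = len(ratings)
    mean = sum(ratings) / total
    shares = {stars: counts.get(stars, 0) / total for stars in range(5, 0, -1)}
    return mean, shares


def recency_weighted_mean(rated, half_life_days=180.0):
    """Weighted mean of (stars, age_days) pairs.

    Assumption: a review's weight halves every half_life_days.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for stars, age_days in rated:
        weight = 0.5 ** (age_days / half_life_days)
        weighted_sum += stars * weight
        weight_total += weight
    if weight_total == 0:
        raise ValueError("rated must not be empty")
    return weighted_sum / weight_total


if __name__ == "__main__":
    mean, shares = rating_summary([5] * 7 + [4] * 3)
    print(f"average: {mean:.1f}")
    for stars, share in shares.items():
        print(f"{stars} stars: {share:.0%}")
```

Showing the shares next to the mean lets a reader see that an average like this contains no ratings below 4 stars at all.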
Yet none of these solutions addresses the underlying cultural shift. Until users stop equating 4 stars with failure, any numeric system will be broken. Education and norm-setting could help: platforms could show icons or labels (like "Loved it" for 5 stars, "Liked it" for 4 stars) to re-anchor user expectations. But such changes are slow to propagate, and legacy review systems resist overhaul.
The Case of Current: A Microcosm of a Macro Problem
Terry Godier's experience with Current serves as a perfect case study. He built a high-quality app that solves a real need—ad-free, customizable RSS reading in an era of algorithmic feeds. Early reviews were enthusiastic, but soon the 4-star reviews began to pile up. Each one, by itself, was benign. But collectively, they dragged his average from 4.9 to 4.6, making Current look inferior to apps with more aggressive 5-star solicitation.
Godier took to social media to vent, and his thread went viral. Many developers nodded in agreement, sharing their own horror stories. One developer noted that a user left a 1-star review because the app didn't support a niche feature, despite the description clearly stating its limitations. Another shared how a 2-star review was accompanied by a comment that said "Great app, but I got confused by the settings." The system has no mechanism to account for user error or unreasonable expectations.
The irony is that app stores want high-quality feedback. They tout their review systems as a way for users to make informed decisions. But in practice, the systems discourage nuanced feedback and reward simple, binary reactions. The very structure of the 5-star scale, with its implied hierarchy, leads to the distortion that Godier experienced.
Broader Implications for Consumer Choice
Beyond the app ecosystem, broken review systems affect all online purchasing. A hotel with a 4.0 rating may be excellent, yet it can easily be overlooked in favor of a hotel with a 4.3 rating that is actually worse. Restaurants, electronics, and books all suffer from the same inflation. The result is a race to the top of the scale, where anything less than 5 stars is suspect.
This phenomenon also creates opportunities for manipulation. Sellers can pay for fake 5-star reviews to boost their average, while competitors might leave fake 1-star reviews to drag theirs down. The arms race continues, and the honest voices get drowned out. The Federal Trade Commission and other regulators have begun to crack down on fake reviews, but the fundamental design of the numeric rating system remains intact and vulnerable.
Some researchers have proposed a completely star-free future, where products are evaluated via checklists or attribute-based ratings (e.g., "Does this app respect your privacy?" or "Is this hotel clean?"). Such systems would capture multidimensional quality without reducing it to a single number. However, they are more complex to implement and require more effort from users, which many platforms are unwilling to impose.
The Emotional Toll on Creators
For creators like Godier, the broken system takes an emotional toll. Seeing a 4-star review that says "Best app ever!" can be deeply confusing. Did the user mean to give 5 stars? Did they think 4 stars was better? Should the developer reach out to ask them to change it? Many developers obsess over ratings, checking their store listing multiple times a day, watching the number tick down. This anxiety is not healthy, and it stems from a system that is fundamentally misaligned.
Godier has said he tries not to let it bother him, but he admits it's hard. A single 4-star review can ruin his morning, even though intellectually he knows the app is wonderful. The system has gamified his emotions, turning his passion project into a source of stress. He is not alone; countless indie developers report similar feelings. The five-star system, originally designed to help consumers, now harms creators.
What Can Be Done?
Platforms must take responsibility. Apple, Google, Amazon, and others have the power to redesign their review systems to better serve users and developers. They could start by removing the 5-star scale in favor of a simpler thumbs up/down, as Netflix did. Or they could implement a sliding scale that asks users to rate specific attributes (e.g., usability, design, value) and then combine those into a refined score. Another option is to show only written reviews by default, with the star rating hidden or de-emphasized.
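The attribute-based option can be sketched in a few lines of Python. This is only one possible way to combine attribute ratings, a weighted mean. The function name `combined_score` and the example weights are hypothetical, and the attribute names simply reuse the examples above.

```python
def combined_score(attribute_ratings, weights=None):
    """Combine per-attribute ratings (1..5) into one score via a weighted mean.

    attribute_ratings: dict such as {"usability": 5, "design": 4, "value": 3}
    weights: optional dict with one weight per attribute; equal weights
             are assumed when omitted.
    """
    if not attribute_ratings:
        raise ValueError("attribute_ratings must not be empty")
    if weights is None:
        weights = {name: 1.0 for name in attribute_ratings}
    total_weight = sum(weights[name] for name in attribute_ratings)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(
        rating * weights[name] for name, rating in attribute_ratings.items()
    )
    return weighted_sum / total_weight


if __name__ == "__main__":
    ratings = {"usability": 5, "design": 4, "value": 3}
    print(combined_score(ratings))
    print(combined_score(ratings, {"usability": 2, "design": 1, "value": 1}))
```

A design like this keeps the per-attribute ratings available for display, so the combined number does not have to be the only signal a shopper sees.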
But change is slow. App stores are massive ecosystems with billions of existing ratings. Any modification risks confusion or backlash. Still, the status quo is untenable. As Godier noted, "the five-star review system is broken, exhibit 472,304." Each broken review experience adds to a growing pile of evidence that the system is not just flawed but actively harmful. It misrepresents quality, stresses creators, and misleads consumers.
In the meantime, users can help by being more thoughtful about how they rate. Before leaving a 4-star review, consider whether the app actually deserves 5 stars in your honest estimation. If you love it, give it 5 stars. If you have minor issues, describe them in a detailed review instead of letting the star do the talking. And if you ever see a 4-star review with a glowing comment, consider asking the reviewer why they didn't give 5 stars. They might not realize the impact.
The problem will likely persist until platforms decide to fix it. Until then, app developers like Terry Godier will continue to watch their averages decline, one well-meaning 4-star review at a time. And the rest of us will keep wondering why such great products can't seem to get the ratings they deserve.
Source: The Verge News