TL;DR: RICE scores and weighted matrices often turn gut feelings into pseudo-math. Jason Cohen's Binstack cuts through this: decide what "material impact" actually means (be harsh), rank what matters most to your business right now, then eliminate everything that doesn't hit your top priorities. Simple, defensible, and you stop wasting hours debating if something should score 7.2 or 7.8.
You know these prioritization meetings.
The spreadsheet comes out. Features in rows, dimensions in columns. Someone's carefully calculated RICE scores, ICE scores, weighted priorities. Everything multiplied, divided, summed up to two decimal places. Looks rigorous as hell.
And then you spend three hours debating whether "impact" should be scored 2.5 or 3. Whether this feature reaches 10,000 users or 15,000. Whether we're 70% confident or 80% confident.
What are we even doing?
Because here's the problem with these frameworks: we're treating subjective judgments as if they're precise numbers. You can't multiply "user impact" by "strategic alignment" and get something objectively meaningful. That's not how any of this works. But we do it anyway because the alternative is admitting we're making judgment calls, and judgment calls feel messy.
So we hide behind the equations.
The RICE Problem (and the ICE problem, and the WSJF problem…)
RICE wants you to estimate Reach, Impact, Confidence, and Effort. Multiply the first three, divide by effort. Great.
But what does "high impact" even mean? Is that a 3 or a 2.5? Compared to what baseline? Your PM scores it differently than I would. And that confidence percentage? Those aren't real probabilities based on historical data. They're vibes. "I feel pretty good about this" becomes 80% and goes into a formula like it means something.
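To make that concrete, here's a toy sketch (Python, with invented numbers) of the same feature scored by two people. Nothing about the feature changes between the two runs, only gut feel, and the "priority" swings by about a quarter.

```python
# RICE = (Reach * Impact * Confidence) / Effort
# Invented numbers: the only difference between the two runs
# is one person's gut feel about Impact and Confidence.

def rice(reach, impact, confidence, effort):
    return (reach * impact * confidence) / effort

# Your PM's scoring of the feature...
print(rice(reach=10_000, impact=3.0, confidence=0.8, effort=4))  # -> 6000.0

# ...and yours. Slightly less optimistic, same feature.
print(rice(reach=10_000, impact=2.5, confidence=0.7, effort=4))  # -> 4375.0

# A ~27% swing in "priority" from nothing but vibes.
```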
ICE scoring drops the Reach component entirely. So you end up prioritizing features that might help three power users really well over things that would help your entire user base moderately well. That can't be right.
WSJF from the SAFe world wants you to calculate "cost of delay." Cool. What's the dollar cost of delaying a feature that improves user satisfaction but doesn't directly tie to revenue? I'll wait.
You end up guessing. And then treating your guesses like data.
Weighted Scoring Is Even Worse
This one drives me up a wall. You assign weights to different criteria. "Revenue potential" gets 40%, "strategic fit" gets 30%, "customer satisfaction" gets 30%.
Who decided those percentages?
Usually whoever talks the loudest in the room. Or whoever has the most political capital. Or you reverse-engineer the weights to justify a decision you already made. Change those weights by 5% and suddenly a different feature wins.
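Here's a toy demonstration of how fragile that is (Python, scores and weights invented for illustration). Move five points from satisfaction to revenue and a different feature "wins."

```python
# Two hypothetical features, scored 1-10 on three criteria (invented numbers).
features = {
    "Feature A": {"revenue": 9, "strategic_fit": 5, "satisfaction": 2},
    "Feature B": {"revenue": 3, "strategic_fit": 7, "satisfaction": 9},
}

def weighted_score(scores, weights):
    return sum(scores[criterion] * weight for criterion, weight in weights.items())

# The weights someone argued for in the meeting...
weights_v1 = {"revenue": 0.40, "strategic_fit": 0.30, "satisfaction": 0.30}
# ...and the same weights after nudging revenue up by five points.
weights_v2 = {"revenue": 0.45, "strategic_fit": 0.30, "satisfaction": 0.25}

for name, weights in [("v1", weights_v1), ("v2", weights_v2)]:
    scores = {f: round(weighted_score(features[f], weights), 2) for f in features}
    winner = max(features, key=lambda f: weighted_score(features[f], weights))
    print(name, scores, "-> winner:", winner)

# v1 {'Feature A': 5.7, 'Feature B': 6.0} -> winner: Feature B
# v2 {'Feature A': 6.05, 'Feature B': 5.7} -> winner: Feature A
```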
And here's the thing about weights: you're implicitly saying "$200K in revenue is exactly equal to 8 points of customer satisfaction improvement." Say that out loud. Does that make any sense? Can you actually trade those things like they're the same unit?
No. You can't add health points to speed points in a video game and get something meaningful. But we do the corporate equivalent every sprint planning.
Then I Found Jason Cohen's Thing
A few years back, Jason Cohen wrote this piece about what he calls "Binstack." (You can read the whole thing at longform.asmartbear.com/maximized-decision/ and you should, because his explanation beats most of the product management content you'll find.)
His core insight: rubrics with weighted scoring are "largely noise."
Stop trying to quantify things that can't be quantified. Stop adding incomparable numbers together. Stop multiplying arbitrary weights.
Instead, get honest about two things: what actually moves the needle (binary materiality), and what matters most right now (stack-ranked priorities).
That's it. That's the framework.
How This Actually Works
First, list out your attributes. Revenue growth, retention, product quality, whatever matters for your decision.
Then define what "material" means for each one. And Cohen's bar here is high. Not "might improve revenue" but "will visibly move the revenue curve." Not "better UX" but "reduces support tickets by 25%." Not "more competitive" but "sales will add it to their pitch deck."
If you can't articulate a measurable change that you'd actually notice in your metrics? It's not material.
This part is brutal. Most ideas don't clear this bar. Which is the point, really. Better to admit an idea is incremental than to give it an impact score of 2.3 and pretend that means something.
Now you've got binary check marks. Feature A materially improves revenue? Yes or no. Not "7 out of 10." Just yes or no.
Next step is where it gets interesting: stack-rank your attributes by what matters most to your business RIGHT NOW. Not weighted percentages. Just ordered priorities. Revenue comes before engagement. Engagement comes before technical debt. Or whatever your actual strategic priorities are.
Then eliminate features. Start with your #1 priority. Cross out everything that doesn't materially impact it. Move to #2 if you still have multiple options. Keep going until one remains.
Done.
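If you want to see how little machinery this takes, here's a minimal sketch in Python. The feature names and the yes/no materiality calls are hypothetical; the judgments are the real work, and the loop is almost nothing, which is kind of the point.

```python
# Hypothetical features with binary materiality calls per attribute.
# The True/False judgments are the hard part; the elimination is trivial.
features = {
    "Self-serve upgrades": {"revenue": True,  "retention": False, "quality": True},
    "Onboarding revamp":   {"revenue": False, "retention": True,  "quality": True},
    "Usage-based alerts":  {"revenue": True,  "retention": True,  "quality": False},
}

# Stack-ranked priorities for RIGHT NOW -- an ordered list, not weights.
priorities = ["revenue", "retention", "quality"]

candidates = list(features)
for attribute in priorities:
    # Keep only the candidates that materially move this attribute.
    survivors = [f for f in candidates if features[f][attribute]]
    if not survivors:
        continue  # nothing clears this bar; skip it rather than eliminate everything
    candidates = survivors
    if len(candidates) == 1:
        break     # one feature left: decision made

print(candidates)  # ['Usage-based alerts']
```

If more than one feature survives every priority, you're into the tiebreaker Cohen suggests (more on that below): build the one the team is more excited about.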
Why This Feels Different
The elimination is fast. You're not debating decimal places. You're not negotiating weight percentages. You made the hard decisions up front (what's material? what's most important?) and the rest follows from those.
And you can actually explain your choice: "We picked Feature A because it's the only thing that materially drives revenue, which is our top priority this quarter."
No spreadsheet gymnastics required.
What if nothing survives your materiality threshold? Then Cohen's right: "your problem wasn't one of prioritization after all, but rather of not having ideas worth prioritizing." Go generate better ideas.
That's harsh. But it's honest.
The Part People Find Challenging
This framework forces you to admit you're making tradeoffs.
Weighted scoring lets you pretend you're optimizing everything simultaneously. "This feature scores well across ALL dimensions!" But you're not optimizing everything. You're choosing. Revenue over retention. Growth over profitability. New features over technical debt.
Stack-ranking makes that choice explicit. Some people find that uncomfortable. They want the framework to make the decision for them.
It won't. No framework will.
What it WILL do is make you honest about the decision you're already making. Stop hiding it behind formulas.
Where This Breaks Down
If you've got genuinely comparable options after elimination? Cohen says pick whichever one is more fun to build. I'm not kidding. When impact is truly equal, motivation matters. Your team will do better work on something they're excited about.
Is that rigorous? Not really. But at least it's honest about what's happening.
And your stack-ranking should change over time. Early-stage company burning through runway? Revenue might be #1. Mature product with stable revenue? Maybe retention moves to the top. Your priorities shift. The framework shifts with them.
Look
This isn't some revolutionary insight. Cohen figured this out years ago. It's just disciplined thinking about what actually matters, stripped of the complexity that makes us feel better about subjective calls.
Call it First Principle Decisioning if you want. Call it Binstack. Call it "stop wasting time in three-hour prioritization meetings."
The point is the same. Your RICE score isn't revealing objective truth. It's obscuring judgment. The weighted matrix isn't making decisions objective. It's making politics less visible.
Maybe just… stop pretending?
Define what material impact looks like. Rank what matters most. Pick the thing that does both.
Reference: Jason Cohen, "Binstack" (longform.asmartbear.com/maximized-decision/), for the original, more detailed breakdown of why rubrics fail and what to do instead.