Impressive Is Not Enough: Julie Belião on What It Really Takes to Turn AI into Product Value
Q1. You’ve spent over 15 years working across AI and multilingual technology, often joining organizations before the direction was fully clear. In your experience, what is the most common mistake teams make when they assume they have an AI capability ready to become a product — and how do you recognize that mistake before it becomes expensive?
The problem is when capability gets confused with usefulness: a team builds something elegant, technically impressive, genuinely powerful, and nobody stops to ask whether it actually helps anyone or addresses a real problem. The capability exists; the demo works; everyone gets excited; and that excitement starts doing the work that evidence should be doing.
Something else is happening right now, and I do not think we name it enough. As AI tooling makes some technical tasks faster and cheaper, more engineers and scientists are moving into responsibilities that used to sit with product, design, strategy, and operations. Technical fluency helps, of course. But it does not automatically become product judgment. Building something fast is not the same as knowing what to build, for whom, and whether it will survive contact with a real user, a real market, or a real legal constraint. Those are different skills. They are learned differently. And they do not appear automatically just because the cost of shipping has dropped.
If you have not been trained to think about the user first, about what the actual problem is, about compliance, about cultural context, about ethics, you will build impressive things fast. But then what? By the time that becomes clear, a lot has already been spent. Not just money. Belief. Roadmap. Reputation. Those are harder to recover.
Q2. There is a lot of pressure on teams right now to ship AI features quickly and demonstrate value. But speed and real product value don’t always point in the same direction. How do you think about the tension between moving fast and taking the time to genuinely understand whether what you’ve built actually solves something meaningful for users?
Speed is not the problem. Confusion about what speed is for is the problem.
There is a real difference between moving fast to learn and moving fast to look like you are moving. Right now a lot of teams, and honestly also a lot of leaders, are under enormous pressure to show visible “AI” progress. So they ship features, assistants, copilots, workflow layers, etc. Which can be useful. But useful for what, exactly?
If speed helps you surface a real problem faster, invalidate a weak assumption before you overspend, or understand whether users will actually change behavior, then it is doing its job. If it mainly produces demos, internal excitement, and something to put in the board update, then it is a more efficient way to build the wrong thing.
One thing I do not think gets named enough right now: early curiosity is not evidence of value. People click because it is new. Because it sounds capable. Because leadership is watching and nobody wants to be the person who ignored the future. None of that means the thing matters. Real value shows up later and more quietly. Repetition. Retention. Fewer manual steps. Less friction. Better decisions. Those signals take longer to arrive and they are less exciting in a review meeting, which is precisely why teams stop looking for them.
There is also something people avoid saying plainly. Once you move into user-facing and market-facing work, speed without judgment gets expensive fast. Not in an abstract way. In a very concrete way: users absorb the cost of your speed, through broken workflows, lost trust, or data they did not expect to expose. That is not just a product quality problem. At some point it becomes an ethical one.
So yes, move fast. But be precise about what you are trying to accelerate. The strongest teams I have worked with are not the ones that shipped the most. They are the ones that learned fastest, cut fastest, and were least embarrassed to say that an exciting direction was not ready yet or no longer the right one.
Q3. At Mozilla.ai you helped navigate a transition from a research-first initiative to a more market-facing organization — a journey many AI teams are on right now. What does that transition actually feel like from the inside, and what are the organizational and cultural shifts that teams tend to underestimate or get wrong?
From the inside, it feels much less linear than it looks on paper. And considerably less comfortable.
The “tidy” version of the story is: start with research, find what is promising, turn the best ideas into products. What actually happens is that the shift requires a real change in operating logic, and not everyone in the organization experiences that the same way, or wants it.
People who are strongest at early exploration tend to thrive on autonomy and open problem space. That is often exactly what you want in the early stage. But market-facing work asks for something different at the same time. Prioritization. Repetition. Accountability. Tolerance for constraints that come from outside, not from inside. And suddenly, you are not just navigating what is technically possible or intellectually interesting. You are dealing with users who are honest in ways internal teams are not, buyers with real budgets and real alternatives, compliance requirements that do not bend, and a constant pressure to cut things that are clever but not valuable enough.
That is where many organizations struggle. Not because people are resistant. Because they underestimate how different those muscles are. Strong research instincts do not automatically translate into strong product judgment, strong go-to-market instincts, or strong operating discipline. And the cultural gap is bigger than most leaders anticipate. You cannot run a market-facing organization on research-mode governance. The management style, the meeting rhythms, the success metrics, the funding logic: all of it needs to shift. Most organizations say they want product and market outcomes while still being culturally optimized for exploration. That tension does not resolve by itself.
One thing I would add, because I think it often gets underweighted: rigor is not the enemy of good thinking. At the point where you are building for real users and real customers, rigor is a form of respect. For time. For focus. For the user's actual problems. Becoming more merciless about ideas that are technically elegant but not valuable enough is not a loss. It is what the phase requires.
Q4. “AI-ready” data, “production-ready” models, and “market-ready” products are all different things — but teams often treat them as if they’re the same milestone. Where do you see organizations most commonly confuse internal technical progress with actual readiness to deliver value, and what does a more honest assessment look like?
Internal progress is easier to measure, easier to present, and more flattering to the team that produced it. That is why it keeps getting confused with external readiness.
You can point to dataset coverage, model quality, benchmark gains, latency, inference cost. That is real work and I am not dismissing it. But none of it tells you whether people will adopt the product, trust it enough to rely on it, change their behavior because of it, or keep using it once the novelty wears off.
The confusion usually happens at the transitions. Data is declared ready because it has been cleaned well enough for development. The model is ready because it performs under technical evaluation. And from there, people start speaking as if the product is nearly there, because the momentum feels real and the demos are getting smoother. But product readiness is a different test entirely. It introduces usability, trust, governance, recoverability when things go wrong, workflow fit, procurement friction, and the very blunt question of whether anyone cares enough to pay, commit, or actually change habits.
What makes machine intelligence specifically tricky here is that the outputs can look more capable than the underlying system actually is. The demo looking good is not evidence that the product is ready. It is evidence that one path through the system works, under favorable conditions, with someone who knows how to use it. That is a meaningful gap.
A more honest assessment means treating these as genuinely separate gates. Technical readiness needs technical evidence. Operational readiness needs to be proven in context. Market readiness needs user behavior, adoption, trust, and some form of external pull. Not internal confidence. Not stakeholder enthusiasm. Not a convincing, smooth demo.
If what you mostly have is strong lab performance and a compelling story, you have technical progress. That may be real and valuable. It is not the same as readiness. Teams save themselves a lot of pain when they say that clearly and early, before the investment has already made the conversation politically difficult.
Q5. Looking at the broader landscape of AI products today, where do you see the biggest gap between what teams think they are building and what users or the market actually needs — and what would you tell a product or strategy leader who suspects their team might be building in the wrong direction but isn’t sure how to course-correct?
One of the largest gaps right now is between building what is technically impressive and building what is genuinely useful and worth coming back to.
A lot of teams think they are building intelligence. Users are mostly looking for less friction. Less wasted time. Fewer manual steps. Better decisions with more confidence. The ability to do something they genuinely could not do before. Many AI products still ask users to adapt to the system, tolerate inconsistency, verify too much, and accept unclear limits. That works for experimentation. It is not a stable basis for a product.
There is another gap I see often and that people are less willing to name. Some teams are not actually building for a market. They are building for a mirror. For investors, for peers, for internal conviction, for the image of being at the frontier. The market is less sentimental than that.
We are also living through a strange moment where some of the people with the deepest design, product, market, regulatory, and operating judgment are being treated as more replaceable than they really are, simply because AI makes parts of execution faster. And some of what is being lost in that process is not replaceable by a model. The judgment about what to build and for whom. The ability to read a room, a market, a regulatory signal. The instinct that comes from having been wrong before in expensive ways. All that is not nostalgia; it is a real capability gap that will show up, just later and less visibly than a headcount reduction.
If a leader suspects the team is going in the wrong direction, I would start by stripping away the internal narrative and asking a few uncomfortable questions. What painful, specific problem are we solving? For whom? What behavior would tell us this matters? What would make users come back without being nudged? What would make a buyer prioritize this over something else? What would users genuinely miss if we removed it tomorrow?
I would also look carefully at where the enthusiasm is coming from. Is there real pull from outside, or mostly energy from inside? Is the value clear in the user’s own terms, or only after the team explains it? Is the product removing friction, or relocating it?
Course-correcting does not always mean throwing the work away. Sometimes the capability is real, but the user, the workflow, or the commercial model is wrong. Sometimes what looked like an end-user product is actually infrastructure. Getting to that answer requires honesty and some willingness to disappoint yourself, your boss, or your investors. It is harder than it sounds when the team is talented and internal belief is high.
Q6. Anything else you wish to add?
The last thing I would add is about the framing itself. I find the term “artificial intelligence” both overused and slightly misleading.
Human intelligence is not one thing. It takes many shapes: logical, emotional, social, creative, embodied. We do not treat it as a single capability, and I am not convinced we should talk about machine intelligence that way either.
What the public conversation currently calls “AI” is also narrower than it sounds. A lot of the narrative today is centered on large language models, for understandable reasons. They are remarkable systems and they have become the dominant interface through which many people now encounter AI. But they are still one paradigm among others, not the whole field. World models, neuromorphic computing, and other specialized approaches are also advancing, with different strengths, constraints, and use cases.
That matters because organizations making workforce or architecture decisions as if the current LLM moment were the whole story may be overfitting to a single phase of the field. Replacing experienced human judgment with a uniform layer of generative AI fluency does not automatically make an organization more intelligent. It can also make it more brittle.
You may not want an army of vibe coders. You probably want people who understand systems, who have seen different technology cycles, who know what breaks and why, and who can adapt when the next paradigm arrives. Because it will.
—————————————————

Julie Belião holds a PhD in Computer Science with a background in NLP and linguistics, and has spent over 15 years working across AI strategy, product, go-to-market, and organizational transformation at global technology companies including Google, Unbabel, Defined.ai, and Mozilla.ai. She has worked across languages and cultures, and her thinking on machine intelligence is shaped as much by that range as by the technical work. Most of her career has been spent in moments when technical potential still needed to be translated into real products, real adoption, and commercial reality. She has rarely had the luxury of building for a single language, culture, or demographic, which makes questions of diversity and bias in AI less abstract and more operational for her.