AI Personalization · 8 min read

Conversational Commerce Comes of Age: Beyond the Chatbot Hype

After years of underwhelming chatbot experiences, AI-powered conversational commerce is finally delivering on its promise. Here's what's changed, and what the numbers actually say.

Topics

personalisation generative-ai consumer-behaviour uk-retail

Sarah Chen

Senior Editor

28 April 2026

Walk through a market in Bangkok and you'll see this constantly: someone on their phone, mid-conversation with a LINE Official Account, sorting out an order, asking about sizing, complaining about a delivery. Not reluctantly, not as a fallback. Just shopping, in the way shopping works now.

I've been watching this behaviour for four years. So when UK retailers started saying their AI assistants were "finally working," I paid attention. Because the gap between what conversational commerce promised and what it delivered has been embarrassing for long enough.

The gap is closing. Not because chatbots got friendlier. Because the underlying technology changed.

What Actually Shifted

The old model was retrieval-based. A customer asked something, the system matched keywords to pre-written answers, and if the match failed you got a polite dead end. The frustration of 2018 to 2022 was structural: no amount of better writing could fix a system that literally could not answer questions it hadn't been trained to handle.
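To make that structural limit concrete, here is a minimal sketch of a keyword-matching bot of the kind described above. The canned answers, the keywords, and the fallback wording are all invented for illustration; they are not taken from any real retailer's system.

```python
import re

# Hypothetical canned answers keyed by trigger keywords (illustrative only).
CANNED_ANSWERS = {
    ("return", "refund"): "You can start a return from your account page.",
    ("delivery", "shipping"): "Delivery options are shown at checkout.",
    ("size", "sizing"): "Our size guide is on every product page.",
}

FALLBACK = "Sorry, I didn't understand that. Please contact customer service."


def keyword_bot(message: str) -> str:
    """Return the first canned answer whose keyword appears in the message."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for keywords, answer in CANNED_ANSWERS.items():
        if words & set(keywords):
            return answer
    # No keyword matched: this is the "polite dead end".
    return FALLBACK


print(keyword_bot("How do I get a refund?"))
print(keyword_bot("I need something for my friend's wedding, a garden party in July"))
```

The second message contains no trigger keyword, so the bot falls through to the fallback however well the canned answers are written.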

Large language models broke that constraint. The shift to generative responses means the AI is reasoning about what you said, not just pattern-matching it. That sounds subtle. In practice, it means a chatbot can handle "I need something for my friend's wedding, she's having a garden party in July and I hate looking too formal" without routing you to a dead end.

Analysis of over 1,200 production LLM deployments by ZenML found that the most effective systems combine all three: generative models for complex and open-ended queries, traditional ML for structured lookups, and deterministic rules where precision matters. Which is exactly what mature conversational commerce architectures look like.

What a Modern Fashion Chatbot Actually Does

Here's a composite illustration of how these conversations go now, based on published deployment patterns from UK fashion retailers:

Customer: "Hi, I need something for a wedding in Italy in June. Not the bride, just a guest. Budget around £200 but flexible for the right thing."

AI: "How exciting! A June wedding in Italy. A few questions to help me find the right outfit: do you have a preference for a dress, or would you consider a jumpsuit or separates? And do you know if it's a daytime or evening ceremony?"

Customer: "Evening I think. Open to anything really, I'm terrible at choosing."

AI: "That actually gives us more options to explore! For an evening wedding in Italy, I'd suggest something that balances elegance with the warm weather. Three directions we could go:

A flowing midi dress in a rich colour (this emerald green number is very on-trend right now)
A sophisticated wide-leg jumpsuit in navy (comfortable for a long evening)
A terracotta linen two-piece you could rewear separately afterwards

Would any of these appeal as a starting point?"

The conversation goes on through colour preferences, body-shape considerations, accessories. The basket at the end of it is fuller than the one a browse session would have produced.

That kind of exchange was not possible with a decision-tree bot. It is not magic, either. It requires careful architecture.

What the Numbers Actually Say

There's a version of the business case for conversational AI that circulates in vendor presentations with very precise-looking tables: conversion rate up from 2.1% to 4.8%, average order value up from £67 to £89, returns down from 24% to 18%. No source cited.

I don't think those figures are usable.

The UK ecommerce average conversion rate benchmarks at around 3.4%, not 2.1%, so the baseline is already wrong. The specific uplifts have no verified provenance.
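For a sense of scale, the relative changes implied by those unsourced vendor figures can be worked out directly. This is plain arithmetic on the numbers quoted above, not a validation of them; the function name is my own.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


# Figures as quoted in the vendor tables (no source cited).
claims = {
    "conversion rate (%)": (2.1, 4.8),
    "average order value (GBP)": (67, 89),
    "return rate (%)": (24, 18),
}

for metric, (before, after) in claims.items():
    print(f"{metric}: {relative_change(before, after):+.1f}%")
```

Read this way, the conversion claim amounts to more than doubling the rate, which is a large effect to assert without provenance.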

What the research that does have sources actually says:

Rep AI's analysis of over 17 million shopping sessions found that returning customers who use AI chat during a session spend 25% more than returning customers who don't. That is an AOV signal, not a total conversion figure, and it comes from vendor data, but 17 million sessions is not nothing.

The same dataset shows a 58% success rate for chatbot-handled returns and cancellations. Which suggests the most reliable gains right now are in the operational layer, reducing contact-centre load on straightforward transactional requests, rather than the exploratory shopping experience that the demo transcripts like to highlight.

The market is clearly growing. Estimates for the global conversational commerce sector in 2025 range from $8.8B to $17.2B depending on how broadly you draw the scope, with growth rates consistently in the 14–22% CAGR range across multiple research firms. The variance is methodological; the direction is not disputed.

How the Deployments That Work Are Built

Across the practitioner literature and the retailer announcements worth paying attention to, a few things are consistent.

Hybrid architecture. Nobody sensible is running everything through a foundation model. The LLM handles open-ended conversation; structured product search handles inventory queries; business rules handle pricing, stock policy, and anything where you cannot afford hallucination. These layers need to be deliberately joined, not assumed to talk to each other automatically.
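A rough sketch of what "deliberately joined" can mean in practice: a router that sends each message to one of the three layers. Everything here is hypothetical, including the catalogue, the prices, the keyword-based routing, and the `llm_stub` placeholder that stands in for a real model call.

```python
# Hypothetical catalogue and prices; a real system would query inventory services.
CATALOGUE = {"emerald midi dress": 4, "navy jumpsuit": 0}
PRICES_GBP = {"emerald midi dress": 149.0, "navy jumpsuit": 120.0}


def stock_lookup(product: str) -> str:
    """Structured layer: exact inventory lookup, no generation involved."""
    units = CATALOGUE[product]
    return f"{product}: {units} in stock." if units else f"{product}: out of stock."


def price_rule(product: str) -> str:
    """Deterministic layer: prices come from a table, never from the model."""
    return f"{product} costs £{PRICES_GBP[product]:.2f}."


def llm_stub(message: str) -> str:
    """Placeholder for a generative model call (assumption, not a real client)."""
    return f"[generative reply to: {message}]"


def route(message: str) -> tuple[str, str]:
    """Return (layer_name, reply) for a customer message."""
    text = message.lower()
    product = next((p for p in CATALOGUE if p in text), None)
    if product and any(w in text for w in ("price", "cost", "how much")):
        return "rules", price_rule(product)
    if product and any(w in text for w in ("stock", "available")):
        return "search", stock_lookup(product)
    # Anything open-ended goes to the generative layer.
    return "llm", llm_stub(message)


print(route("How much is the emerald midi dress?"))
print(route("Is the navy jumpsuit in stock?"))
print(route("I need something for a wedding in Italy in June"))
```

Production routers typically use an intent classifier rather than keyword checks; the point of the sketch is only that price and stock answers never pass through the generative layer.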

M&S has been vocal about the scale of AI investment: equipping 11,000 colleagues with generative AI tools as of early 2026, alongside customer-facing AI search. The detail of how the consumer-facing layer is actually architected hasn't been published. John Lewis has been similarly active on chatbot deployment for customer queries. Neither has published results at the level of detail the vendor case studies claim to offer.

Guardrails as infrastructure, not afterthought. Responsible deployment means: clear disclosure that customers are talking to AI; defined limits on topics the AI will engage with; automatic escalation paths for complaints, sensitive queries, or anything the model is uncertain about; and monitoring for off-brand or factually incorrect responses. The practitioner consensus is that the most reliable guardrails are implemented in code, not in prompts. Prompt-level guardrails can be argued around.
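As an illustration of guardrails living in code rather than in the prompt, the sketch below wraps a drafted model reply in checks that run regardless of what the model was told. The topic list, escalation triggers, confidence threshold, and wording are all assumptions made for the example.

```python
from dataclasses import dataclass

AI_DISCLOSURE = "You're chatting with an AI assistant."
BLOCKED_TOPICS = ("medical", "legal advice", "politics")  # hypothetical list
ESCALATION_TRIGGERS = ("complaint", "speak to a human")   # hypothetical list
MIN_CONFIDENCE = 0.6                                      # hypothetical threshold


@dataclass
class Decision:
    action: str  # "reply", "refuse" or "escalate"
    text: str


def apply_guardrails(user_message: str, draft_reply: str,
                     confidence: float, first_turn: bool) -> Decision:
    """Decide what actually reaches the customer, independent of the prompt."""
    text = user_message.lower()
    # Complaints, explicit requests for a person, or low model confidence escalate.
    if any(t in text for t in ESCALATION_TRIGGERS) or confidence < MIN_CONFIDENCE:
        return Decision("escalate", "I'm passing you to a colleague who can help.")
    # Out-of-scope topics are refused before the draft is ever shown.
    if any(t in text for t in BLOCKED_TOPICS):
        return Decision("refuse", "I can only help with shopping questions.")
    # Disclosure is prepended in code on the first turn, not left to the model.
    reply = f"{AI_DISCLOSURE} {draft_reply}" if first_turn else draft_reply
    return Decision("reply", reply)


print(apply_guardrails("I want to make a complaint", "draft", 0.9, False))
print(apply_guardrails("Any dresses for June?", "Here are three options.", 0.9, True))
```

Because these checks sit outside the model, a customer cannot talk their way past them the way they might with instructions that live only in a prompt.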

Continuous improvement as a commitment, not a launch feature. The deployments that plateau are the ones that treated go-live as the finish line. The ones that compound are running regular reviews of failed or low-rated conversations, testing response variants, and using conversation data to improve product cataloguing and search.

None of this is complicated in principle. It is quite involved in practice, particularly for retailers who didn't build their product data infrastructure with LLM consumption in mind.

The Challenges That Haven't Gone Away

Product visualisation is still limited conversationally. Describing a sofa or a pair of shoes in text, even very good text, has constraints that images don't. Rich media integrations help but add architectural complexity.

Complex configuration products (custom furniture, made-to-measure clothing, anything with many interdependent options) become laborious to specify through conversation. There's a reason the transcript examples tend to feature fashion, not flat-pack.

Cultural and linguistic nuance is a real limitation. Models trained primarily on standard English can misfire with customers using regional UK dialects, or with international customers whose English doesn't match training patterns. This matters more for UK retailers than the vendor presentations tend to acknowledge.

And some shoppers just prefer to browse on their own. The assisted-shopping model works well for customers who have approximate intent but need help narrowing down. It works less well for the browsing-with-no-particular-goal session that is still a significant share of retail traffic.

Where This Goes Next

The frontier, commercially speaking, is proactive engagement. AI that initiates conversation based on detected signals rather than waiting to be asked.

Rep AI's research puts the cart-abandonment recovery rate for proactive AI chat at 35%, well above the 5–15% that traditional exit-intent methods achieve. The mechanism is timing: intervening before the cart is abandoned, not chasing it afterwards.

"I noticed you've been browsing summer dresses. Would it help if I showed you some new arrivals that match what you've been looking at?" Done well, that feels like good service. Done badly, it feels like surveillance. The line is fine, and calibrating it requires more data and more iteration than most teams budget for at launch.

I've been watching SEA retailers run proactive chat for years. It's been standard practice on LINE OA in Thailand and on Shopee Live across the region for longer than most UK e-commerce teams have been thinking about it. The pattern that works is immediacy and specificity: respond to what the shopper is actually doing, not a generic "can I help?". The pattern that fails is frequency: messaging everyone who pauses on any page.
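The two patterns can be expressed as a small trigger rule: prompt only when there is a specific signal to respond to, and cap how often any one session is prompted. The thresholds and the `Session` structure below are hypothetical, chosen only to show the shape of the logic.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MIN_VIEWS = 3    # hypothetical: views of one category before prompting
MAX_PROMPTS = 1  # hypothetical: proactive prompts allowed per session


@dataclass
class Session:
    category_views: Dict[str, int] = field(default_factory=dict)
    prompts_sent: int = 0


def record_view(session: Session, category: str) -> None:
    """Count one product-page view in the given category."""
    session.category_views[category] = session.category_views.get(category, 0) + 1


def proactive_prompt(session: Session) -> Optional[str]:
    """Return a specific prompt, or None if the session should be left alone."""
    if session.prompts_sent >= MAX_PROMPTS or not session.category_views:
        return None  # frequency cap reached, or nothing observed yet
    category, views = max(session.category_views.items(), key=lambda kv: kv[1])
    if views < MIN_VIEWS:
        return None  # no clear signal, so no generic "can I help?"
    session.prompts_sent += 1
    return (f"I noticed you've been browsing {category}. "
            "Would it help if I showed you some new arrivals?")


s = Session()
for _ in range(3):
    record_view(s, "summer dresses")
print(proactive_prompt(s))  # a category-specific message
print(proactive_prompt(s))  # None: the cap stops a second prompt
```

Where those thresholds should sit is exactly the calibration question: too low and every pause triggers a message, too high and the prompt arrives after the shopper has left.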

The technology is no longer the constraint. The craft is.

Seventy-six per cent of online retailers have deployed or are planning to deploy chatbot capability. That stat probably includes a lot of rule-based systems that will feel obsolete quickly. The question isn't whether to invest in conversational commerce. It's whether you're building something that compounds over time, or something that looks fine at launch and slowly irritates people.
