AI for Reducing Cart Abandonment and Returns
Two numbers define the leakiest parts of every ecommerce funnel.

First: 70% of shopping carts are abandoned before checkout. That's 7 out of 10 people who wanted something enough to add it to their basket, then walked away.

Second: 17% of all retail sales are returned. In fashion, that number climbs to 24-30%. Returns cost US retailers $890 billion in 2024. Over half of Gen Z shoppers routinely "bracket" — ordering multiple sizes with the explicit plan to send most of them back.

These aren't just conversion problems. They're prediction problems. The data that tells you a shopper is about to abandon is already in your logs. The data that tells you a product will be returned is already in your order history. You just need to ask the right questions before it happens, not after.

The upside is well documented: AI-powered approaches reduce cart abandonment by 18% on average, retailers using AI-powered fit prediction see size-related returns drop by 27%, and proactive chatbot interventions recover 35% of carts that would otherwise be lost.

This post builds two predictive systems from scratch: an abandonment risk scorer that identifies shoppers about to leave (and intervenes), and a return risk predictor that flags sizing issues and high-return products before they ship. Both are implemented in Python for the ML models and Ruby on Rails/Solidus for the ecommerce integration. No client-specific examples — just the patterns, the maths, and the code.
Part 1: Predicting Cart Abandonment
Why People Abandon (and Why It's Predictable)
The top reasons shoppers abandon carts are well-documented: 48% cite unexpected extra costs (shipping, tax), 24% are forced to create an account, 22% find delivery too slow, 18% have concerns about return policies, 17% find the checkout too complicated, and 13% can't find their preferred payment method.
But here's what makes this an ML problem rather than a UX problem: different shoppers abandon for different reasons, and you can predict which reason applies to which shopper based on their behaviour. A shopper who repeatedly toggles between payment methods is having a payment friction problem. A shopper who navigates away to check competitor prices is having a price confidence problem. A shopper who lingers on the delivery information page is having a shipping cost or timeline problem.
The behavioural signals are already in your server logs and analytics events. You just need a model that recognises the patterns.
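To make that concrete, here is a minimal sketch of collapsing a raw analytics event stream into per-session features with pandas. The event names and schema here are assumptions; substitute whatever your own tracker emits.

```python
import pandas as pd

# Hypothetical raw event log: one row per analytics event.
# The column names and event vocabulary are assumptions.
events = pd.DataFrame([
    {'session_id': 's1', 'event': 'page_view',      'page': 'product'},
    {'session_id': 's1', 'event': 'page_view',      'page': 'delivery'},
    {'session_id': 's1', 'event': 'payment_switch', 'page': 'payment'},
    {'session_id': 's1', 'event': 'payment_switch', 'page': 'payment'},
    {'session_id': 's2', 'event': 'page_view',      'page': 'product'},
    {'session_id': 's2', 'event': 'item_removed',   'page': 'cart'},
])

def session_features(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse raw events into one feature row per session."""
    return df.groupby('session_id').agg(
        pages_viewed=('event', lambda e: (e == 'page_view').sum()),
        payment_method_switches=('event',
                                 lambda e: (e == 'payment_switch').sum()),
        items_removed=('event', lambda e: (e == 'item_removed').sum()),
    ).reset_index()

features = session_features(events)
print(features)
```

The same aggregation runs equally well as a nightly batch job over historical logs (to build training data) or over a single in-flight session (to build the live scoring payload).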
Python: Abandonment Risk Scorer
This gradient boosting classifier predicts the probability that a current session will end in abandonment:
import pandas as pd
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    classification_report, roc_auc_score, precision_recall_curve
)
import joblib


def train_abandonment_model(sessions_csv: str):
    """
    Train an abandonment risk model.

    Sessions CSV features (per checkout session):
    - session_id, user_id, timestamp
    - cart_value, cart_item_count
    - time_on_checkout_seconds
    - payment_method_switches (how many times they changed)
    - items_removed_during_checkout
    - pages_viewed_before_checkout
    - is_returning_customer
    - previous_abandonment_count
    - device_type (mobile=1, desktop=0)
    - hour_of_day
    - delivery_page_time_seconds
    - promo_code_attempted (did they try a code)
    - promo_code_valid (did the code work)
    - competitor_tab_switches (if detectable via visibility API)
    - abandoned (target: 1 = abandoned, 0 = completed)
    """
    df = pd.read_csv(sessions_csv)

    # Engineer additional features
    df['avg_time_per_item'] = (
        df['time_on_checkout_seconds'] /
        df['cart_item_count'].clip(lower=1)
    )
    df['hesitation_score'] = (
        df['payment_method_switches'] +
        df['items_removed_during_checkout'] * 2
    )
    df['promo_frustration'] = (
        (df['promo_code_attempted'] == 1) &
        (df['promo_code_valid'] == 0)
    ).astype(int)

    feature_cols = [
        'cart_value', 'cart_item_count',
        'time_on_checkout_seconds', 'payment_method_switches',
        'items_removed_during_checkout',
        'pages_viewed_before_checkout',
        'is_returning_customer', 'previous_abandonment_count',
        'device_type', 'hour_of_day',
        'delivery_page_time_seconds',
        'promo_frustration',
        'avg_time_per_item', 'hesitation_score'
    ]
    X = df[feature_cols]
    y = df['abandoned']

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    model = GradientBoostingClassifier(
        n_estimators=200,
        max_depth=5,
        learning_rate=0.1,
        subsample=0.8,
        min_samples_leaf=20
    )
    model.fit(X_train, y_train)

    # Evaluate
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)[:, 1]
    print("Classification Report:")
    print(classification_report(y_test, y_pred))
    print(f"AUC-ROC: {roc_auc_score(y_test, y_proba):.3f}")

    # Feature importance
    importance = sorted(
        zip(feature_cols, model.feature_importances_),
        key=lambda x: x[1], reverse=True
    )
    print("\nFeature Importance:")
    for feat, imp in importance:
        print(f"  {feat}: {imp:.3f}")

    # Find optimal threshold for intervention.
    # We want high recall (catch most abandoners) with
    # acceptable precision (don't annoy completers).
    precision, recall, thresholds = precision_recall_curve(
        y_test, y_proba
    )
    f1_scores = 2 * (precision * recall) / (precision + recall + 1e-8)
    # precision/recall have one more entry than thresholds,
    # so drop the final point before taking the argmax
    optimal_idx = np.argmax(f1_scores[:-1])
    optimal_threshold = thresholds[optimal_idx]
    print(f"\nOptimal intervention threshold: {optimal_threshold:.3f}")

    joblib.dump(model, 'abandonment_model.pkl')
    return model, optimal_threshold


def score_live_session(model, session_features: dict,
                       threshold: float) -> dict:
    """Score a live checkout session for abandonment risk."""
    features = pd.DataFrame([session_features])
    # Align column order with the order used at training time
    features = features.reindex(columns=model.feature_names_in_)
    probability = model.predict_proba(features)[0][1]

    return {
        'abandonment_probability': round(probability, 3),
        'risk_level': (
            'high' if probability > threshold else
            'medium' if probability > threshold * 0.7 else
            'low'
        ),
        'should_intervene': probability > threshold,
        'suggested_intervention': suggest_intervention(
            session_features, probability
        )
    }


def suggest_intervention(features: dict, probability: float) -> str:
    """Suggest the right intervention based on behaviour signals."""
    if features.get('promo_frustration', 0):
        return 'offer_alternative_discount'
    if features.get('delivery_page_time_seconds', 0) > 30:
        return 'highlight_delivery_options'
    if features.get('payment_method_switches', 0) > 2:
        return 'show_payment_help'
    if features.get('is_returning_customer', 0):
        return 'gentle_reminder'  # Don't discount loyal customers
    if features.get('cart_value', 0) > 100:
        return 'offer_free_shipping'
    return 'exit_intent_popup'
The feature importance output from this model is illuminating. Across most ecommerce datasets, the top predictors tend to be: previous_abandonment_count (serial abandoners are the strongest signal), hesitation_score (adding and removing items, switching payment methods), device_type (mobile abandons at 85% vs desktop at 70%), and time_on_checkout_seconds (too long means friction, too short means browsing).
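Before wiring any of this into production, it's worth smoke-testing the pipeline end to end on synthetic data to confirm the model picks up the signal you expect. This sketch uses a toy assumption (hesitation alone drives abandonment) purely to verify the plumbing:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)
n = 2000

# Synthetic sessions; in this toy world hesitation drives abandonment
df = pd.DataFrame({
    'cart_value': rng.uniform(10, 300, n),
    'hesitation_score': rng.poisson(2, n),
    'time_on_checkout_seconds': rng.uniform(30, 600, n),
})
logit = 0.9 * df['hesitation_score'] - 2.0
df['abandoned'] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

cols = ['cart_value', 'hesitation_score', 'time_on_checkout_seconds']
model = GradientBoostingClassifier(n_estimators=50, max_depth=3)
model.fit(df[cols], df['abandoned'])

# A hesitant session should score higher risk than a decisive one
hesitant = pd.DataFrame([{'cart_value': 80, 'hesitation_score': 8,
                          'time_on_checkout_seconds': 300}])
decisive = pd.DataFrame([{'cart_value': 80, 'hesitation_score': 0,
                          'time_on_checkout_seconds': 300}])
p_hesitant = model.predict_proba(hesitant)[0][1]
p_decisive = model.predict_proba(decisive)[0][1]
print(p_hesitant, p_decisive)
```

If the ordering check fails on your real data, the feature engineering (not the model) is usually the first place to look.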
Solidus/Rails: The Intervention Engine
The Python model scores sessions. The Rails side decides what to show and when:
module CartRecovery
  class InterventionEngine
    INTERVENTION_MAP = {
      offer_alternative_discount: {
        type: :popup,
        template: 'checkout/interventions/discount_offer',
        delay_seconds: 0,
        requires_consent: false
      },
      highlight_delivery_options: {
        type: :inline,
        template: 'checkout/interventions/delivery_highlight',
        delay_seconds: 0,
        requires_consent: false
      },
      show_payment_help: {
        type: :chat,
        message: 'Need help with payment? We accept...',
        delay_seconds: 5,
        requires_consent: false
      },
      gentle_reminder: {
        type: :email,
        template: 'cart_recovery/gentle_reminder',
        delay_seconds: 3600, # 1 hour after abandonment
        requires_consent: true
      },
      offer_free_shipping: {
        type: :banner,
        template: 'checkout/interventions/free_shipping',
        delay_seconds: 0,
        requires_consent: false
      },
      exit_intent_popup: {
        type: :exit_intent,
        template: 'checkout/interventions/exit_intent',
        delay_seconds: 0,
        requires_consent: false
      }
    }.freeze

    def evaluate_and_intervene(session:, order:)
      # Build features from live session
      features = SessionFeatureBuilder.build(session, order)

      # Score with Python model
      risk = PythonBridge.score_abandonment(features)
      return unless risk[:should_intervene]

      # Check we haven't already intervened this session
      return if session.intervention_shown?

      # Select and record intervention
      intervention = INTERVENTION_MAP[risk[:suggested_intervention].to_sym]
      record_intervention(session, risk, intervention)
      intervention
    end
  end

  class SessionFeatureBuilder
    def self.build(session, order)
      user = order.user

      {
        cart_value: order.total.to_f,
        cart_item_count: order.line_items.count,
        time_on_checkout_seconds: session.checkout_duration_seconds,
        payment_method_switches: session.payment_switches_count,
        items_removed_during_checkout: session.removals_during_checkout,
        pages_viewed_before_checkout: session.page_count,
        is_returning_customer: user&.orders&.complete&.any? ? 1 : 0,
        previous_abandonment_count: user ? abandoned_count(user) : 0,
        device_type: session.mobile? ? 1 : 0,
        hour_of_day: Time.current.hour,
        delivery_page_time_seconds: session.time_on_page(:delivery),
        promo_frustration: session.failed_promo_attempt? ? 1 : 0,
        avg_time_per_item: avg_time_per_item(session, order),
        hesitation_score: hesitation_score(session)
      }
    end
  end
end
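`PythonBridge` above is assumed to wrap a call into the Python scoring service (it is not shown in full here). Whatever transport you choose, the handler on the Python side only needs to decode the feature payload, align feature order with training, and return the risk verdict. A framework-agnostic sketch of that contract, with a `DummyModel` standing in for the trained classifier:

```python
import json

class DummyModel:
    """Stand-in for the trained GradientBoostingClassifier."""
    feature_names_in_ = ['cart_value', 'hesitation_score']

    def predict_proba(self, rows):
        # Toy scoring: more hesitation means higher abandonment risk
        return [[1 - min(r[1] / 10, 1), min(r[1] / 10, 1)] for r in rows]

def handle_score_request(model, body: str, threshold: float = 0.5) -> str:
    """Decode the JSON feature payload, align feature order, score."""
    features = json.loads(body)
    # Order matters: rebuild the row in the model's training order,
    # not the order the keys happen to arrive in
    row = [features[name] for name in model.feature_names_in_]
    probability = model.predict_proba([row])[0][1]
    return json.dumps({
        'abandonment_probability': round(probability, 3),
        'should_intervene': probability > threshold,
    })

payload = json.dumps({'hesitation_score': 7, 'cart_value': 120.0})
print(handle_score_request(DummyModel(), payload))
```

Wrapping `handle_score_request` in Flask, FastAPI, or a job queue is then a few lines; the important part is that the JSON keys and the model's feature order are reconciled in exactly one place.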
Post-Abandonment Recovery Sequencing
When a cart is abandoned despite intervention, the recovery email sequence should be timed and personalised based on the model's signals:
module CartRecovery
  class EmailSequencer
    SEQUENCES = {
      price_sensitive: [
        { delay: 1.hour, template: :reminder_with_urgency },
        { delay: 24.hours, template: :discount_offer_5pct },
        { delay: 72.hours, template: :last_chance_10pct }
      ],
      delivery_concerned: [
        { delay: 1.hour, template: :delivery_reassurance },
        { delay: 24.hours, template: :express_delivery_offer }
      ],
      returning_customer: [
        { delay: 2.hours, template: :gentle_nudge },
        { delay: 48.hours, template: :product_back_in_stock }
      ],
      default: [
        { delay: 1.hour, template: :simple_reminder },
        { delay: 24.hours, template: :social_proof },
        { delay: 72.hours, template: :small_incentive }
      ]
    }.freeze

    def schedule_recovery(order:, abandonment_context:)
      return unless order.user&.email_consent?

      sequence_key = determine_sequence(abandonment_context)
      sequence = SEQUENCES[sequence_key]

      sequence.each_with_index do |step, index|
        RecoveryEmailJob.set(wait: step[:delay]).perform_later(
          order_id: order.id,
          template: step[:template],
          sequence_position: index + 1,
          sequence_key: sequence_key
        )
      end
    end

    private

    def determine_sequence(context)
      if context[:promo_frustration] == 1
        :price_sensitive
      elsif context[:delivery_page_time_seconds].to_i > 30
        :delivery_concerned
      elsif context[:is_returning_customer] == 1
        :returning_customer
      else
        :default
      end
    end
  end
end
Part 2: Predicting and Preventing Returns
Why Returns Happen (and Which Ones Are Preventable)
Not all returns are equal. Some are healthy — a customer buying a gift that doesn't fit is a natural part of commerce. But many returns are preventable if you catch the signals early enough.
Fit and sizing issues cause roughly 70% of fashion returns. This is the biggest lever. If you can help customers order the right size the first time, you eliminate the majority of preventable returns.
Expectation mismatches ("it looked different online") cause another significant chunk. Better product data helps here — and connects directly to the GEO optimisation work in the product discovery post.
Bracketing — deliberately ordering multiple sizes to try at home — is rising, with over 51% of Gen Z shoppers admitting to it. This is harder to prevent but can be detected and mitigated.
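Detecting bracketing at order-placement time is a simple grouping exercise. A minimal sketch, assuming line items arrive as (product_id, size) pairs:

```python
from collections import defaultdict

def detect_bracketing(line_items):
    """
    Flag products ordered in more than one size within a single order.
    line_items: iterable of (product_id, size) tuples (assumed shape).
    Returns the set of product_ids being bracketed.
    """
    sizes_by_product = defaultdict(set)
    for product_id, size in line_items:
        sizes_by_product[product_id].add(size)
    return {pid for pid, sizes in sizes_by_product.items()
            if len(sizes) > 1}

order = [(101, 'M'), (101, 'L'), (202, 'S')]
print(detect_bracketing(order))  # product 101 ordered in M and L
```

Once flagged, you can respond gently at checkout (a fit-confidence nudge or free-exchange reminder) rather than punitively after the fact.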
Python: Return Risk Prediction Model
This model predicts the probability that a specific order will be returned, scored at the point of order placement:
import pandas as pd
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, roc_auc_score
import joblib


def train_return_risk_model(orders_csv: str):
    """
    Train a return risk model.

    Orders CSV features (per order line item):
    - order_id, user_id, product_id, variant_id
    - size_ordered, most_common_size_for_user
    - size_deviation (ordered size - usual size in numeric)
    - product_category, product_subcategory
    - product_avg_return_rate (historical return rate for this product)
    - product_review_mentions_sizing_issue (count)
    - customer_return_rate (this customer's historical return rate)
    - customer_order_count
    - order_value, order_item_count
    - same_product_multiple_sizes (bracketing indicator: 1/0)
    - days_since_last_order
    - is_sale_item
    - payment_method (BNPL tends to correlate with higher returns)
    - was_returned (target: 1 = returned, 0 = kept)
    """
    df = pd.read_csv(orders_csv)

    # Engineer features
    df['is_bracketing'] = df['same_product_multiple_sizes']
    df['size_risk'] = df['size_deviation'].abs()
    df['high_return_product'] = (
        df['product_avg_return_rate'] > 0.25
    ).astype(int)
    df['high_return_customer'] = (
        df['customer_return_rate'] > 0.30
    ).astype(int)
    df['review_sizing_flag'] = (
        df['product_review_mentions_sizing_issue'] > 3
    ).astype(int)
    df['bnpl_payment'] = (
        df['payment_method'] == 'bnpl'
    ).astype(int)

    feature_cols = [
        'size_risk', 'product_avg_return_rate',
        'customer_return_rate', 'customer_order_count',
        'order_value', 'order_item_count',
        'is_bracketing', 'high_return_product',
        'high_return_customer', 'review_sizing_flag',
        'days_since_last_order', 'is_sale_item',
        'bnpl_payment'
    ]
    X = df[feature_cols]
    y = df['was_returned']

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    model = GradientBoostingClassifier(
        n_estimators=200,
        max_depth=5,
        learning_rate=0.1,
        subsample=0.8
    )
    model.fit(X_train, y_train)

    y_proba = model.predict_proba(X_test)[:, 1]
    print(f"AUC-ROC: {roc_auc_score(y_test, y_proba):.3f}")
    print(classification_report(y_test, model.predict(X_test)))

    importance = sorted(
        zip(feature_cols, model.feature_importances_),
        key=lambda x: x[1], reverse=True
    )
    print("\nReturn Risk Feature Importance:")
    for feat, imp in importance:
        print(f"  {feat}: {imp:.3f}")

    joblib.dump(model, 'return_risk_model.pkl')
    return model
The top features across most fashion datasets: customer_return_rate (serial returners are the strongest signal), is_bracketing (multiple sizes of the same product is a near-certain partial return), size_risk (ordering an unusual size for this customer), and product_avg_return_rate (some products just have inherent sizing issues).
Python: Collaborative Filtering for Size Recommendation
This is where it gets genuinely interesting. The same collaborative filtering that powers "customers who bought X also bought Y" can predict the right size:
import pandas as pd
import numpy as np
from sklearn.neighbors import NearestNeighbors


def build_size_recommender(purchase_history_csv: str):
    """
    Build a size recommender using collaborative filtering.

    Purchase history CSV:
    - user_id, product_id, size_ordered, was_returned,
      kept_size (size they kept if they exchanged)
    - user_height_cm, user_weight_kg (if available from profile)
    - user_typical_size_tops, user_typical_size_bottoms
      (numerically encoded, e.g. XS=1 ... XL=5)
    """
    df = pd.read_csv(purchase_history_csv)

    # Build a "success" dataset: orders that were NOT returned
    kept = df[df['was_returned'] == 0].copy()

    # For each product, build a profile of who kept which size
    def recommend_size(product_id: int,
                       user_profile: dict,
                       n_neighbors: int = 20) -> dict:
        """Recommend a size for a user based on similar users."""
        product_data = kept[kept['product_id'] == product_id]

        if len(product_data) < 10:
            return {'recommendation': None,
                    'confidence': 'insufficient_data'}

        # Build feature matrix from users who bought this product
        profile_features = ['user_height_cm', 'user_weight_kg',
                            'user_typical_size_tops',
                            'user_typical_size_bottoms']

        # Filter to users with profile data
        with_profile = product_data.dropna(subset=profile_features)

        if len(with_profile) < 5:
            # Fall back to most popular kept size
            most_common = (
                product_data['size_ordered']
                .value_counts()
                .index[0]
            )
            return {'recommendation': most_common,
                    'confidence': 'low',
                    'method': 'popularity'}

        # Note: these features are on different scales (cm, kg, size
        # codes); standardising them first usually improves neighbour
        # quality
        X = with_profile[profile_features].values
        user_vector = np.array([[user_profile.get(f, 0)
                                 for f in profile_features]])

        # Find similar users
        nn = NearestNeighbors(n_neighbors=min(n_neighbors, len(X)))
        nn.fit(X)
        distances, indices = nn.kneighbors(user_vector)

        # Weight by inverse distance
        similar_purchases = with_profile.iloc[indices[0]]
        weights = 1 / (distances[0] + 1e-6)

        # Weighted vote for size
        size_votes = {}
        for size, weight in zip(
            similar_purchases['size_ordered'], weights
        ):
            size_votes[size] = size_votes.get(size, 0) + weight

        recommended = max(size_votes, key=size_votes.get)
        total_weight = sum(size_votes.values())
        confidence = size_votes[recommended] / total_weight

        return {
            'recommendation': recommended,
            'confidence': (
                'high' if confidence > 0.6
                else 'medium' if confidence > 0.4
                else 'low'
            ),
            'confidence_score': round(confidence, 3),
            'method': 'collaborative_filtering',
            'similar_users_count': len(similar_purchases),
            'size_distribution': {
                k: round(v / total_weight, 2)
                for k, v in sorted(size_votes.items())
            }
        }

    return recommend_size
The beauty of this approach: it gets smarter with every purchase. Each successful (non-returned) order teaches the model what size works for what body type. Over time, the recommendations become highly accurate — retailers using similar approaches report size-related return reductions of 27%.
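The inverse-distance weighted vote at the heart of `recommend_size` can be illustrated in isolation with synthetic numbers:

```python
import numpy as np

def weighted_size_vote(sizes, distances):
    """
    Inverse-distance weighted vote, as used inside recommend_size.
    sizes: size label kept by each similar user
    distances: that user's distance from the shopper's profile
    Returns (winning size, share of the total vote it received).
    """
    weights = 1 / (np.asarray(distances, dtype=float) + 1e-6)
    votes = {}
    for size, w in zip(sizes, weights):
        votes[size] = votes.get(size, 0.0) + w
    best = max(votes, key=votes.get)
    return best, votes[best] / sum(votes.values())

# Three near neighbours kept M; one distant neighbour kept L
size, confidence = weighted_size_vote(
    ['M', 'M', 'M', 'L'], [1.0, 1.5, 2.0, 10.0]
)
print(size, round(confidence, 2))
```

The distant L voter barely moves the result, which is exactly the behaviour you want: outlier body profiles shouldn't drag the recommendation.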
Solidus/Rails: The Return Prevention Pipeline
Tie the Python models into your Solidus checkout flow:
module ReturnPrevention
  class Pipeline
    def evaluate_order(order)
      results = order.line_items.map do |item|
        risk = score_return_risk(item, order)
        size_rec = check_sizing(item, order.user)

        LineItemRisk.new(
          line_item: item,
          return_probability: risk[:probability],
          risk_level: risk[:risk_level],
          size_recommendation: size_rec,
          interventions: determine_interventions(risk, size_rec)
        )
      end

      OrderRiskAssessment.new(
        order: order,
        line_item_risks: results,
        overall_risk: aggregate_risk(results),
        bracketing_detected: detect_bracketing(order)
      )
    end

    private

    def check_sizing(line_item, user)
      return nil unless line_item.variant.option_values
                                 .any? { |ov| ov.option_type.name == 'size' }

      ordered_size = line_item.variant.option_values
                              .find { |ov| ov.option_type.name == 'size' }
                              &.name

      recommendation = PythonBridge.recommend_size(
        product_id: line_item.product.id,
        user_profile: build_user_profile(user)
      )

      if recommendation[:recommendation] &&
         recommendation[:recommendation] != ordered_size &&
         recommendation[:confidence] != 'low'
        {
          ordered: ordered_size,
          recommended: recommendation[:recommendation],
          confidence: recommendation[:confidence],
          mismatch: true,
          message: sizing_message(ordered_size, recommendation)
        }
      else
        { ordered: ordered_size, mismatch: false }
      end
    end

    def determine_interventions(risk, size_rec)
      interventions = []

      if size_rec&.dig(:mismatch)
        interventions << {
          type: :size_suggestion,
          priority: :high,
          message: size_rec[:message]
        }
      end

      if risk[:probability] > 0.6
        interventions << {
          type: :fit_confirmation,
          priority: :medium,
          message: 'Check our size guide for this item — '\
                   'it runs differently to similar products'
        }
      end

      if risk[:bracketing_signal]
        interventions << {
          type: :exchange_nudge,
          priority: :low,
          message: 'Not sure about sizing? Our free exchange '\
                   'policy means you can swap sizes easily'
        }
      end

      interventions
    end

    def detect_bracketing(order)
      # Same product ordered in more than one variant (e.g. two sizes)
      order.line_items
           .group_by { |li| li.product.id }
           .any? { |_, items| items.map(&:variant_id).uniq.length > 1 }
    end
  end
end
Product-Level Return Analytics
Beyond individual order scoring, aggregate return data at the product level to fix systemic issues:
module ReturnPrevention
  class ProductAnalytics
    def analyse(product)
      returns = Spree::ReturnItem
                .joins(return_authorization: :order)
                .joins(:inventory_unit)
                .where(inventory_units: {
                         variant_id: product.variant_ids
                       })

      total_sold = product.line_items
                          .joins(:order)
                          .where(orders: { state: 'complete' })
                          .sum(:quantity)

      return_rate = total_sold.positive? ?
                      (returns.count.to_f / total_sold * 100).round(1) : 0

      {
        return_rate: return_rate,
        total_sold: total_sold,
        total_returned: returns.count,
        return_reasons: reason_breakdown(returns),
        sizing_analysis: sizing_analysis(product, returns),
        review_sentiment: review_sizing_sentiment(product),
        recommendations: generate_recommendations(
          return_rate, returns, product
        )
      }
    end

    private

    def sizing_analysis(product, returns)
      size_option_type_id = Spree::OptionType.find_by(name: 'size')&.id

      size_returns = returns.joins(
        inventory_unit: { variant: :option_values }
      ).where(
        option_values: { option_type_id: size_option_type_id }
      ).group('spree_option_values.name').count

      size_sales = product.line_items
                          .joins(variant: :option_values)
                          .where(option_values: {
                                   option_type_id: size_option_type_id
                                 })
                          .group('spree_option_values.name')
                          .sum(:quantity)

      size_sales.map do |size, sold|
        returned = size_returns[size] || 0
        rate = sold.positive? ? (returned.to_f / sold * 100).round(1) : 0

        {
          size: size,
          sold: sold,
          returned: returned,
          return_rate: rate,
          flag: rate > 30 ? :investigate : :normal
        }
      end
    end

    def generate_recommendations(return_rate, returns, product)
      recs = []

      if return_rate > 25
        recs << 'High return rate. Review product description '\
                'and imagery for expectation mismatches.'
      end

      sizing = sizing_analysis(product, returns)
      problem_sizes = sizing.select { |s| s[:flag] == :investigate }
      if problem_sizes.any?
        sizes = problem_sizes.map { |s| s[:size] }.join(', ')
        recs << "Sizes #{sizes} have return rates above 30%. "\
                "Review size guide accuracy for these sizes."
      end

      reasons = reason_breakdown(returns)
      if reasons['too_small'].to_i > reasons['too_large'].to_i * 2
        recs << 'Product consistently runs small. Consider '\
                'adding "runs small" note to description or '\
                'adjusting size chart.'
      end

      recs
    end
  end
end
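Before committing to the Rails implementation, the size-level breakdown is easy to prototype in pandas against an export of your sales and returns data. Here the numbers are synthetic, and the 30% investigation threshold mirrors the one used above:

```python
import pandas as pd

# Synthetic per-size sales and returns for one product
sales = pd.DataFrame({
    'size': ['S', 'M', 'L'],
    'sold': [120, 300, 90],
    'returned': [18, 40, 35],
})
sales['return_rate'] = (sales['returned'] / sales['sold'] * 100).round(1)
sales['flag'] = sales['return_rate'].apply(
    lambda r: 'investigate' if r > 30 else 'normal'
)
print(sales)
```

In this example size L returns at roughly 39%, so it gets flagged while S and M do not; on real data a flag like this usually points at a size-chart inaccuracy for that specific size rather than a problem with the whole product.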
The Economics of Prevention
Let's make the business case concrete with round numbers.
A Solidus store doing £1M monthly revenue with a 70% cart abandonment rate is losing approximately £2.33M in potential revenue every month (the £1M represents only 30% of the value that enters checkout). Reducing abandonment by just five percentage points (from 70% to 65%) recovers roughly £166,000 in monthly revenue. Even if your interventions capture only a fifth of that opportunity, that's still £33,000/month from a model that costs almost nothing to run once trained.
On the returns side: a store with a 25% return rate on £1M revenue processes £250,000 in returned merchandise monthly, or £3M a year. On top of the refunded revenue, each return costs £10-15 in shipping, handling, inspection, and restocking, not counting the margin loss on items that can't be resold at full price. Reducing the return rate by three percentage points (from 25% to 22%) keeps roughly £30,000 of merchandise a month from coming back, and eliminates the per-return processing cost on every one of those avoided returns.
These aren't speculative numbers. They're basic arithmetic applied to industry-standard rates. The ML models don't need to be perfect — they need to be better than doing nothing.
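The abandonment arithmetic is easy to reproduce for your own numbers. This helper (a hypothetical name, not part of the models above) implements the worked example from the text:

```python
def abandonment_economics(monthly_revenue: float,
                          abandonment_rate: float,
                          target_rate: float) -> dict:
    """Revenue locked up in abandoned carts, and the value of
    shaving the abandonment rate down to target_rate."""
    completion = 1 - abandonment_rate
    # Completed revenue is only `completion` share of initiated value
    initiated_value = monthly_revenue / completion
    lost = initiated_value - monthly_revenue
    recovered = initiated_value * (1 - target_rate) - monthly_revenue
    return {
        'checkout_value_initiated': round(initiated_value),
        'lost_to_abandonment': round(lost),
        'recovered_by_reduction': round(recovered),
    }

# The worked example from the text: GBP 1M monthly, 70% -> 65%
print(abandonment_economics(1_000_000, 0.70, 0.65))
```

Plugging in your own revenue and rates gives a defensible ceiling for what an intervention programme is worth before you build anything.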
Getting Started
- Instrument your checkout. Track every interaction: page times, payment switches, item removals, promo code attempts, delivery page engagement. These are your training features.
- Export your return data. You need: order, product, size, return reason, customer history. Tag returns with structured reason codes, not free-text.
- Train the abandonment model first. It's simpler, the feedback loop is faster (you see results in days, not weeks), and the revenue impact is immediate.
- Add size recommendation once you have enough return data. You need at least a few hundred returns per product category to train meaningful size models.
- Close the feedback loop. Every prevented abandonment and every prevented return feeds back into the model, making it more accurate over time. This is the compounding advantage of ML — it gets better the more data it sees.