Building GDPR-Compliant AI Features in Your SaaS
Here's a story that'll sound familiar if you're building AI features into a SaaS product. You've got your machine learning model working. The recommendation engine is producing solid results. The prediction accuracy looks good. The team is excited. Someone says "let's ship it next sprint." Then someone else asks: "Have we done the DPIA?" Silence. "What about the DPA clauses? Do our customer contracts cover AI processing?" More silence. "And the privacy by design documentation, is that up to date?" And suddenly your two-week sprint turns into a two-month compliance project, because nobody thought about GDPR until the feature was already built.
I've been through this cycle more times than I'd like to admit. Across [GrowCentric.ai](https://growcentric.ai) (marketing optimisation SaaS), [Stint.co](https://stint.co) (marketing dashboard with email campaigns and real-time reporting), [Regios.at](https://regios.at) (regional Austrian platform), and [Auto-Prammer.at](https://auto-prammer.at) (automotive marketplace on Solidus), I've had to figure out what GDPR compliance actually looks like in code, not just in legal documents.
The thing is, GDPR compliance for AI features isn't as terrifying as it sounds. Most of the complexity comes from not knowing what's required and when. Once you understand the three pillars — Data Processing Agreements, Privacy by Design, and Data Protection Impact Assessments — you can build compliance into your development workflow so naturally that it doesn't slow you down. It actually speeds you up, because you're not retrofitting compliance onto finished features.
This post is the walkthrough I wish I'd had when I started. No legal jargon. Actual code. Real decisions from real products.
Pillar 1: Data Processing Agreements That Actually Cover AI
Let's start with the contractual foundation. GDPR Article 28 requires a written agreement between data controllers and processors that specifies exactly how personal data will be processed. When your SaaS product adds AI features, your existing DPA almost certainly doesn't cover what's happening.
Here's the problem: a traditional DPA says things like "the processor shall process personal data only on documented instructions from the controller for the purposes of providing the Service." That's fine for a standard SaaS application that stores and retrieves data. But when AI enters the picture, the processing is fundamentally different. Data isn't just being stored and retrieved; it's being analysed, patterns are being extracted, predictions are being generated, and potentially, models are being trained.
I learned this the hard way when building GrowCentric.ai. The platform optimises marketing campaigns using machine learning. When a client connects their campaign data, the system analyses conversion patterns, audience behaviours, and budget effectiveness. That analysis involves processing personal data in ways that go well beyond "store it and show it back."
What Your AI-Specific DPA Needs
Article 28 mandates eight specific elements in every DPA. For AI features, each one needs an AI-specific interpretation:
Processing scope and purpose. Don't just say "providing the Service." Specify: inference (applying a trained model to customer data to generate predictions), analysis (extracting patterns from customer data for reporting), feature extraction (deriving analytical features from personal data), and if applicable, model training (using customer data to improve models). These are distinct processing activities with different risk profiles.
Sub-processor chains. AI features often introduce sub-processors your customers don't expect. If GrowCentric.ai uses a cloud GPU provider for model inference, that's a sub-processor. If Stint.co's insight generation calls a third-party NLP service to analyse email engagement patterns, that's a sub-processor. If Auto-Prammer.at's recommendation engine runs on a separate ML infrastructure stack, that's a sub-processor. Your DPA must list every sub-processor in the AI chain, with their purpose and data access scope.
Model training restrictions. This is the clause most AI SaaS products get wrong. If your system learns from customer data, your DPA must explicitly state whether customer data is used for model training, what type of training (fine-tuning a per-customer model vs contributing to a shared model), how data is isolated between customers, and whether customers can opt out of contributing training data. At GrowCentric.ai, the DPA explicitly states that client campaign data is used for client-specific model optimisation only. No client's data ever trains models used for other clients. This is a contractual commitment backed by the architectural decisions we covered in the machine unlearning post.
Data residency for AI infrastructure. AI workloads sometimes run on different infrastructure than the main application. If your Rails app runs on EU servers but your ML models are served from US-based GPUs, you've got a cross-border transfer issue. Your DPA needs to specify where training happens, where inference happens, and where model artefacts are stored.
Automated decision-making. GDPR Article 22 gives individuals the right not to be subject to decisions based solely on automated processing that produce legal or similarly significant effects. If your AI feature makes decisions that significantly affect individuals, your DPA must address how human oversight is implemented and how individuals can challenge automated decisions.
Here's the DPA clause structure I use:
module DPACompliance
class AIProcessingSchedule
# Annex to the DPA: AI Processing Schedule
# This documents every AI processing activity
# and maps it to GDPR requirements
def self.generate_for(product:)
case product
when :growcentric
{
processing_activities: [
{
name: 'Campaign performance prediction',
purpose: 'Generating budget allocation recommendations',
data_categories: ['campaign metrics', 'audience segments', 'conversion data'],
personal_data_involved: true,
anonymisation_applied: true,
anonymisation_method: 'Cohort aggregation, minimum 50 users per cohort',
legal_basis: 'Legitimate interest (Article 6(1)(f))',
automated_decision: false,
sub_processors: ['Hetzner Cloud (EU, inference hosting)'],
data_residency: 'EU only (Frankfurt)',
retention: '90 days for raw data, model artefacts indefinite',
training_use: 'Client-specific models only, no cross-client training'
},
{
name: 'Audience segmentation analysis',
purpose: 'Identifying behavioural clusters for targeting',
data_categories: ['browsing behaviour', 'engagement signals', 'demographic cohorts'],
personal_data_involved: true,
anonymisation_applied: true,
anonymisation_method: 'K-anonymity (k=10), session-level aggregation',
legal_basis: 'Legitimate interest (Article 6(1)(f))',
automated_decision: false,
sub_processors: ['Hetzner Cloud (EU, inference hosting)'],
data_residency: 'EU only (Frankfurt)',
retention: '30 days for raw features, segments indefinite',
training_use: 'Aggregated patterns only, no individual-level training'
}
],
model_training_policy: 'Client data is aggregated and anonymised before model training. No individual personal data enters any training pipeline. Models are client-specific and not shared across accounts.',
human_oversight: 'All AI recommendations require human review before execution. No automated campaign changes without explicit client approval.',
data_subject_rights: 'Data subjects may request access, rectification, or erasure. Erasure requests are processed within 30 days including assessment of model impact per our erasure architecture.'
}
when :stint
{
processing_activities: [
{
name: 'Email engagement insight generation',
purpose: 'Generating real-time reporting and actionable insights',
data_categories: ['email open/click events', 'send timestamps', 'audience segments'],
personal_data_involved: true,
anonymisation_applied: true,
anonymisation_method: 'Segment-level aggregation, minimum 100 recipients per segment',
legal_basis: 'Contract performance (Article 6(1)(b))',
automated_decision: false,
sub_processors: ['Hetzner Cloud (EU, hosting)'],
data_residency: 'EU only',
retention: '12 months for engagement data, insights indefinite',
training_use: 'Aggregate patterns for insight models, no individual-level training'
},
{
name: 'Application form processing',
purpose: 'Processing and routing submitted application forms',
data_categories: ['applicant personal data', 'form responses'],
personal_data_involved: true,
anonymisation_applied: false,
legal_basis: 'Contract performance (Article 6(1)(b))',
automated_decision: false,
sub_processors: [],
data_residency: 'EU only',
retention: 'As specified by controller, default 24 months',
training_use: 'None - application data is never used for model training'
}
]
}
when :regios
{
processing_activities: [
{
name: 'Local business search ranking',
purpose: 'Personalising search results for regional relevance',
data_categories: ['search queries', 'location data', 'interaction signals'],
personal_data_involved: true,
anonymisation_applied: true,
anonymisation_method: 'Session-level anonymisation, k-anonymity (k=10)',
legal_basis: 'Legitimate interest (Article 6(1)(f))',
automated_decision: false,
sub_processors: [],
data_residency: 'Austria',
retention: '30 days for search logs, ranking signals indefinite',
training_use: 'Anonymous session patterns only'
}
]
}
when :auto_prammer
{
processing_activities: [
{
name: 'Vehicle recommendation engine',
purpose: 'Suggesting relevant vehicle listings to users',
data_categories: ['browsing history', 'search filters', 'interaction events'],
personal_data_involved: true,
anonymisation_applied: true,
anonymisation_method: 'Anonymous session hashes, category-level aggregation',
legal_basis: 'Legitimate interest (Article 6(1)(f))',
automated_decision: false,
sub_processors: [],
data_residency: 'Austria',
retention: '90 days for interaction data, model artefacts indefinite',
training_use: 'Anonymised interaction matrices only'
},
{
name: 'Dynamic pricing model',
purpose: 'Suggesting competitive prices based on market conditions',
data_categories: ['listing attributes', 'market prices', 'regional demand signals'],
personal_data_involved: false,
legal_basis: 'Legitimate interest (Article 6(1)(f))',
automated_decision: true,
human_oversight_mechanism: 'Tiered autonomy - price suggestions require seller approval, maximum 8% daily change, minimum 15% margin guardrail',
sub_processors: [],
data_residency: 'Austria',
retention: 'Market data indefinite, no personal data involved',
training_use: 'Market-level features only, no personal data'
}
]
}
end
end
end
end
This structure gives you a machine-readable specification of every AI processing activity, which you can attach as an annex to your DPA. It also serves as the foundation for your DPIA and your Records of Processing Activities (ROPA).
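To keep that annex from drifting out of date, I like to render the schedule to a file that lives in version control next to the code it describes. Here's a minimal sketch; the AnnexExporter class and the compliance/ path are illustrative rather than part of any of the products:
module DPACompliance
  class AnnexExporter
    # Writes the AI Processing Schedule for a product to a YAML file that can be
    # attached to the DPA as an annex and reused as the seed for ROPA entries.
    def self.export(product:)
      schedule = AIProcessingSchedule.generate_for(product: product)
      raise ArgumentError, "No AI processing schedule defined for #{product}" if schedule.nil?

      path = Rails.root.join('compliance', "#{product}_ai_processing_annex.yml")
      File.write(path, schedule.deep_stringify_keys.to_yaml)
      path
    end
  end
end

# DPACompliance::AnnexExporter.export(product: :growcentric)
# => Pathname for compliance/growcentric_ai_processing_annex.yml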
The Sub-Processor Problem
Article 28 requires the controller's prior authorisation (specific or general, in writing) before you engage sub-processors, and that every sub-processor is bound by the same data protection obligations you've committed to. For AI features, this creates a chain that's longer than most SaaS products expect.
Here's the sub-processor chain for GrowCentric.ai's AI features:
- Application hosting (Hetzner Cloud, Germany) — stores application data
- ML inference hosting (Hetzner Cloud, Germany) — runs model predictions
- Monitoring (self-hosted) — tracks model performance
Every link in this chain needs its own DPA with you, and your DPA with your customer needs to list them all. I maintain a public sub-processor list for each product and a notification system for changes:
class SubProcessorRegistry
def self.notify_controllers_of_change(product:, change:)
# Article 28(2): notify controllers of intended sub-processor changes
# Give 30-day objection period
affected_controllers = Controller.where(product: product, active: true)
affected_controllers.find_each do |controller|
SubProcessorChangeNotification.create!(
controller: controller,
change_type: change[:type], # :addition, :removal, :modification
sub_processor_name: change[:name],
purpose: change[:purpose],
data_access_scope: change[:data_scope],
data_residency: change[:residency],
effective_date: 30.days.from_now,
objection_deadline: 30.days.from_now,
notified_at: Time.current
)
SubProcessorNotificationMailer
.change_notification(controller, change)
.deliver_later
end
end
end
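Announcing an intended change then looks like this. The provider name and scope below are invented for illustration; only the shape of the change hash matters:
SubProcessorRegistry.notify_controllers_of_change(
  product: :growcentric,
  change: {
    type: :addition,
    name: 'Example GPU Cloud GmbH',   # illustrative, not a real engagement
    purpose: 'Model inference hosting',
    data_scope: 'Pseudonymised feature vectors only',
    residency: 'EU (Frankfurt)'
  }
)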
Pillar 2: Privacy by Design That Actually Works in Code
Article 25 requires data protection "by design and by default." That sounds abstract, but it translates into seven concrete architectural decisions you need to make before writing any AI feature code.
And this matters financially. Sambla Group was fined 950,000 euros in 2025 specifically for Article 25 violations — they didn't build data protection into their credit brokerage platform from the outset. Meta was fined 251 million euros partly for failing to implement privacy by design and default in their token system. The Spanish DPA fined a bank 3 million euros for Article 25 violations in their access control systems.
Here's how I implement each principle:
Decision 1: Purpose Limitation at the Model Level
Every AI model needs a documented purpose, and the model should be architecturally constrained to that purpose. A recommendation engine shouldn't be repurposable as a profiling tool.
module PrivacyByDesign
# Raised when prohibited data reaches a purpose-bound model
class PrivacyViolationError < StandardError; end
class PurposeBoundModel
# Each model declares its purpose and is constrained to it
PERMITTED_PURPOSES = {
vehicle_recommendation: {
description: 'Suggesting relevant vehicle listings based on browsing patterns',
permitted_inputs: [:category_preferences, :price_range, :region, :session_signals],
prohibited_inputs: [:name, :email, :phone, :address, :financial_data],
output_type: :ranked_listing_ids,
automated_decision: false
},
email_send_optimisation: {
description: 'Predicting optimal send times for email campaigns',
permitted_inputs: [:segment_id, :historical_open_rates, :day_of_week, :time_bucket],
prohibited_inputs: [:individual_email, :name, :browsing_history],
output_type: :recommended_send_time,
automated_decision: false
},
campaign_budget_allocation: {
description: 'Recommending budget distribution across marketing channels',
permitted_inputs: [:channel_metrics, :historical_roas, :budget_constraints, :seasonal_index],
prohibited_inputs: [:individual_user_data, :personal_demographics],
output_type: :budget_recommendation,
automated_decision: false
}
}.freeze
def initialize(purpose:)
raise "Unknown purpose: #{purpose}" unless PERMITTED_PURPOSES.key?(purpose)
@purpose = purpose
@config = PERMITTED_PURPOSES[purpose]
end
def validate_inputs(input_data)
input_keys = input_data.keys.map(&:to_sym)
# Reject any prohibited inputs
violations = input_keys & @config[:prohibited_inputs]
if violations.any?
Rails.logger.warn("PbD violation: prohibited inputs #{violations} for purpose #{@purpose}")
raise PrivacyViolationError, "Inputs #{violations} are prohibited for purpose #{@purpose}"
end
# Warn on unexpected inputs
unexpected = input_keys - @config[:permitted_inputs]
if unexpected.any?
Rails.logger.warn("PbD warning: unexpected inputs #{unexpected} for purpose #{@purpose}")
end
true
end
end
end
This enforces purpose limitation at the code level. The model literally cannot receive data it's not supposed to process. If someone accidentally passes an email address to the recommendation engine, the system raises an error before any processing happens.
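In practice the check sits directly in front of inference. Here's a minimal sketch of how a recommendation path might use it; RecommendationModel is a placeholder for whatever actually serves predictions:
class VehicleRecommendationService
  def recommend(session_features)
    model = PrivacyByDesign::PurposeBoundModel.new(purpose: :vehicle_recommendation)

    # Raises PrivacyViolationError before any prohibited data reaches the model
    model.validate_inputs(session_features)

    RecommendationModel.predict(session_features)
  end
end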
Decision 2: Data Minimisation in Feature Engineering
The feature engineering pipeline is where most privacy violations happen in AI systems. A developer pulls in "every column that might be useful" and suddenly the model is processing data it doesn't need.
module PrivacyByDesign
class MinimisedFeatureExtractor
# Only extract features that are necessary for the stated purpose
# Document why each feature is needed
FEATURE_REGISTRY = {
vehicle_recommendation: {
category_affinity: {
source: :browsing_sessions,
extraction: :category_view_counts,
minimisation: 'Aggregated to category level, no individual listing tracking',
necessity: 'Core signal for content-based filtering'
},
price_sensitivity: {
source: :search_filters,
extraction: :price_range_histogram,
minimisation: 'Bucketed into 5 ranges, no exact values',
necessity: 'Prevents irrelevant price-range recommendations'
},
regional_preference: {
source: :search_filters,
extraction: :region_code,
minimisation: 'Federal state level only, no precise location',
necessity: 'Delivery distance affects purchase likelihood'
}
}
}.freeze
def extract_features(purpose:, raw_data:)
registry = FEATURE_REGISTRY.fetch(purpose)
features = {}
registry.each do |feature_name, config|
features[feature_name] = send(config[:extraction], raw_data)
end
# Log what was extracted and what was discarded
FeatureExtractionAudit.create!(
purpose: purpose,
features_extracted: features.keys,
raw_fields_available: raw_data.keys,
raw_fields_discarded: raw_data.keys - features.keys,
minimisation_applied: registry.transform_values { |c| c[:minimisation] },
extracted_at: Time.current
)
features
end
end
end
Decision 3: Pseudonymisation by Default
Every piece of personal data that enters your AI pipeline should be pseudonymised before it reaches any model. The EDPB's January 2025 guidelines on pseudonymisation clarify that this means replacing identifiers with tokens, storing the mapping separately, and restricting access to the mapping.
module PrivacyByDesign
class Pseudonymiser
# EDPB Guidelines 01/2025 compliant pseudonymisation
# Key stored separately from pseudonymised data
# Access to mapping requires explicit authorisation
def pseudonymise_for_training(records, purpose:)
mapping_id = SecureRandom.uuid
pseudonymised = records.map do |record|
pseudo_id = generate_pseudonym(record[:user_id], mapping_id)
record.except(:user_id, :email, :name, :ip_address)
.merge(pseudo_id: pseudo_id)
end
# Store mapping separately with restricted access
PseudonymMapping.create!(
mapping_id: mapping_id,
purpose: purpose,
created_at: Time.current,
expires_at: 90.days.from_now,
access_level: 'dpo_only',
record_count: records.size
# Actual mapping encrypted and stored in separate database
)
{ data: pseudonymised, mapping_id: mapping_id }
end
private
def generate_pseudonym(user_id, mapping_id)
# HMAC-based pseudonym: deterministic within a mapping,
# but different across mappings
OpenSSL::HMAC.hexdigest(
'SHA256',
mapping_key(mapping_id),
user_id.to_s
)
end
def mapping_key(mapping_id)
# Key stored in separate, access-controlled key management system
KeyManagement.retrieve("pseudo_mapping_#{mapping_id}")
end
end
end
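Calling it from a training pipeline then looks roughly like this; engagement_records and TrainingDatasetBuilder are placeholders for whatever assembles your training data:
result = PrivacyByDesign::Pseudonymiser.new.pseudonymise_for_training(
  engagement_records,                 # array of hashes containing :user_id, :email, etc.
  purpose: 'email_send_optimisation'
)

# Only the pseudonymised rows go near the model; the mapping_id is kept
# for the DPO-controlled re-identification path.
TrainingDatasetBuilder.build(rows: result[:data], mapping_id: result[:mapping_id])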
Decision 4: Consent-Aware Data Pipelines
Your data pipeline needs to know what each user has consented to. If someone consented to basic analytics but not AI-powered personalisation, the pipeline must respect that distinction.
module PrivacyByDesign
class ConsentAwarePipeline
PROCESSING_PURPOSES = {
essential_service: { consent_required: false, legal_basis: 'contract_performance' },
basic_analytics: { consent_required: false, legal_basis: 'legitimate_interest' },
ai_recommendations: { consent_required: true, legal_basis: 'consent' },
ai_personalisation: { consent_required: true, legal_basis: 'consent' },
email_engagement_insights: { consent_required: false, legal_basis: 'legitimate_interest' },
model_training_contribution: { consent_required: true, legal_basis: 'consent' }
}.freeze
def process_for(user:, purpose:, data:)
config = PROCESSING_PURPOSES.fetch(purpose)
if config[:consent_required]
consent = ConsentRecord.current_for(user: user, purpose: purpose)
unless consent&.granted?
Rails.logger.info("Skipping #{purpose} for user #{user.id}: no consent")
return nil
end
end
# Record the processing in the audit trail
ProcessingRecord.create!(
user_id: user.id,
purpose: purpose,
legal_basis: config[:legal_basis],
consent_id: consent&.id,
processed_at: Time.current,
data_categories: data.keys
)
yield data
end
end
end
At Stint.co, where we send emails to large audiences, this is crucial. The marketing dashboard's insight generation relies on engagement data, but whether that data feeds into AI-powered analytics depends on what each recipient has consented to. The pipeline checks consent before every processing step, not just at data collection.
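Wired into an insight job, every AI step goes through that consent gate. A sketch, assuming EngagementInsightGenerator and engagement_data_for stand in for whatever produces the actual insight:
pipeline = PrivacyByDesign::ConsentAwarePipeline.new

recipients.each do |recipient|
  data = engagement_data_for(recipient)   # hypothetical helper returning this recipient's events

  pipeline.process_for(user: recipient, purpose: :ai_personalisation, data: data) do |permitted|
    # This block only runs when a current consent record exists for the purpose
    EngagementInsightGenerator.generate(permitted)
  end
end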
Decision 5: Automated Retention Enforcement
module PrivacyByDesign
class RetentionEnforcer
# Automatically enforce data retention policies
# Different retention periods for different data types
RETENTION_POLICIES = {
raw_interaction_data: 90.days,
email_engagement_events: 12.months,
search_query_logs: 30.days,
training_features: 6.months,
model_artefacts: :indefinite, # anonymised, no personal data
pseudonym_mappings: 90.days,
consent_records: 5.years, # legal requirement
erasure_documentation: 5.years, # legal requirement
audit_logs: 3.years
}.freeze
def enforce!
RETENTION_POLICIES.each do |data_type, retention|
next if retention == :indefinite
expired_count = data_class_for(data_type)
.where('created_at < ?', retention.ago)
.delete_all
Rails.logger.info("Retention: purged #{expired_count} #{data_type} records older than #{retention.inspect}")
end
end
private
# Placeholder mapping from retention keys to ActiveRecord models. PseudonymMapping
# and ConsentRecord appear elsewhere in this post; the other class names are
# illustrative - swap in your own models.
def data_class_for(data_type)
{
raw_interaction_data: InteractionEvent,
email_engagement_events: EmailEngagementEvent,
search_query_logs: SearchQueryLog,
training_features: TrainingFeature,
pseudonym_mappings: PseudonymMapping,
consent_records: ConsentRecord,
erasure_documentation: ErasureRecord,
audit_logs: AuditLog
}.fetch(data_type)
end
end
end
I run this as a daily Sidekiq job across every product. It's one of those things that sounds obvious but most teams forget: if you said you'd retain data for 90 days, you need something that actually deletes it after 90 days.
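The job wrapper is deliberately boring; the point is that it runs every day without anyone remembering to run it. A sketch, assuming sidekiq-cron for the scheduling (swap in whatever scheduler you use):
class RetentionEnforcementJob
  include Sidekiq::Job

  def perform
    PrivacyByDesign::RetentionEnforcer.new.enforce!
  end
end

# Scheduled nightly, e.g. with sidekiq-cron:
# Sidekiq::Cron::Job.create(
#   name: 'retention-enforcement-daily',
#   cron: '0 3 * * *',
#   class: 'RetentionEnforcementJob'
# )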
Decisions 6 and 7: Right-Respecting Inference and Transparent Logging
These connect directly to the audit logging and tiered autonomy architecture I covered in the EU AI Act post and the agentic AI post. The AIAuditLoggable module records every AI decision. The TieredAutonomyController ensures human oversight. Together, they satisfy both Article 22 (automated decision-making rights) and Article 25 (privacy by design).
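I won't repeat those implementations here, but the useful mental model is that every inference produces a durable record a human can later interrogate. The field names below are my own illustration of that shape, not the exact schema from those posts:
# Illustrative only: one audit row per AI decision
AIDecisionAudit.create!(
  feature: 'vehicle_recommendation',
  model_version: 'vr-2025-01',
  input_digest: Digest::SHA256.hexdigest(features.to_json),  # features: the validated model inputs
  output: { listing_ids: [1201, 884, 977], scores: [0.91, 0.87, 0.8] },
  confidence: 0.91,
  human_reviewed: false,
  decided_at: Time.current
)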
Pillar 3: DPIAs That Don't Sit in a Drawer
A Data Protection Impact Assessment is required under Article 35 when processing is "likely to result in a high risk to the rights and freedoms of natural persons." The EDPB lists nine criteria that indicate high-risk processing. Hit two or more and you need a DPIA.
For AI features, you'll almost always trigger at least two of them: systematic evaluation or scoring of individuals, automated decision-making with legal or similarly significant effects, innovative use of new technologies, or large-scale processing.
But here's the thing most developers get wrong about DPIAs: they treat them as a one-off document written by legal and then forgotten. A useful DPIA is a living technical artefact that sits alongside your code and gets updated whenever your processing changes.
Here's how I structure them:
class DPIAGenerator
def generate(ai_feature:)
{
metadata: {
feature_name: ai_feature.name,
product: ai_feature.product,
version: ai_feature.version,
assessed_by: Current.user.email,
assessed_at: Time.current,
review_due: 6.months.from_now,
status: 'draft'
},
section_1_description: {
processing_description: ai_feature.processing_description,
data_categories: ai_feature.data_categories,
data_subjects: ai_feature.data_subject_categories,
data_volume: ai_feature.estimated_data_volume,
processing_purpose: ai_feature.stated_purpose,
technology_description: ai_feature.technology_stack
},
section_2_necessity_and_proportionality: {
legal_basis: ai_feature.legal_basis,
purpose_limitation: ai_feature.purpose_limitation_assessment,
data_minimisation: ai_feature.minimisation_evidence,
storage_limitation: ai_feature.retention_policy,
data_subject_rights: {
access: ai_feature.access_mechanism,
rectification: ai_feature.rectification_mechanism,
erasure: ai_feature.erasure_mechanism,
portability: ai_feature.portability_mechanism,
objection: ai_feature.objection_mechanism,
automated_decision_safeguards: ai_feature.article_22_safeguards
}
},
section_3_risk_assessment: assess_risks(ai_feature),
section_4_mitigation_measures: ai_feature.mitigation_measures,
section_5_consultation: {
dpo_consulted: true,
dpo_opinion: nil, # To be filled by DPO
supervisory_authority_consultation_required: false,
stakeholders_consulted: []
}
}
end
private
def assess_risks(feature)
risks = []
# Assess against EDPB's nine criteria
risks << assess_evaluation_scoring(feature)
risks << assess_automated_decisions(feature)
risks << assess_systematic_monitoring(feature)
risks << assess_sensitive_data(feature)
risks << assess_large_scale(feature)
risks << assess_dataset_matching(feature)
risks << assess_vulnerable_subjects(feature)
risks << assess_innovative_technology(feature)
risks << assess_preventing_rights(feature)
triggered_criteria = risks.select { |r| r[:triggered] }
{
criteria_assessed: risks,
criteria_triggered: triggered_criteria.size,
dpia_required: triggered_criteria.size >= 2,
overall_risk_level: calculate_risk_level(triggered_criteria),
residual_risk_after_mitigation: nil # To be assessed after section 4
}
end
end
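Each assess_* method returns a small hash saying whether a criterion is triggered and why. One of them might look like this; the 10,000 threshold is my own illustrative cut-off, not a number from the EDPB guidelines:
def assess_large_scale(feature)
  {
    criterion: 'Large-scale processing',
    triggered: feature.estimated_data_volume > 10_000,   # illustrative threshold
    rationale: "Roughly #{feature.estimated_data_volume} records processed per month",
    possible_mitigations: ['aggregation before processing', 'sampling', 'shorter retention']
  }
end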
What a Real DPIA Looks Like
Let me walk through the DPIA for GrowCentric.ai's audience segmentation feature.
What it does: Analyses campaign data to identify behavioural clusters for audience targeting. Groups users by engagement patterns, purchase signals, and browsing behaviour to recommend which audience segments a client should target with specific campaigns.
EDPB criteria triggered: Evaluation/scoring (yes, we're profiling audience segments), innovative technology (ML-powered segmentation), large-scale processing (thousands of campaign data points). That's three criteria, so a DPIA is required.
Risks identified: Risk of discriminatory targeting (if segments correlate with protected characteristics), risk of opaque profiling (individuals don't know they've been segmented), risk of data leakage (segments could reveal sensitive information about constituent individuals).
Mitigations implemented: Segments are built from anonymised, aggregated data only (minimum 50 users per cohort). Protected characteristics are excluded from feature extraction. Segment definitions are transparent and auditable. Clients can review and override any segment before use. Data provenance tracking records exactly what data informed each segment.
Residual risk: Low, because no individual personal data enters the segmentation pipeline. The aggregation and anonymisation architecture means the AI feature processes statistical patterns, not personal profiles.
That's a useful DPIA. It's not a 50-page legal document. It's a technical assessment that maps directly to code-level decisions.
The DPIA Lifecycle
A DPIA is a living document, so I store each one in version control alongside the feature it assesses and trigger a review whenever the processing changes:
class DPIALifecycleManager
def check_for_review_triggers(ai_feature:)
dpia = DPIA.current_for(feature: ai_feature)
triggers = []
# Time-based review
triggers << 'Scheduled review due' if dpia.review_due <= Date.current
# Processing change triggers
triggers << 'New data category added' if feature_added_data_category?(ai_feature, dpia)
triggers << 'New sub-processor' if feature_added_sub_processor?(ai_feature, dpia)
triggers << 'Scale increase' if feature_scale_increased?(ai_feature, dpia)
triggers << 'New automated decision' if feature_added_auto_decision?(ai_feature, dpia)
triggers << 'Model architecture change' if feature_model_changed?(ai_feature, dpia)
if triggers.any?
DPIAReviewRequest.create!(
dpia: dpia,
triggers: triggers,
requested_at: Time.current
)
# Notify DPO and feature owner
DPIAReviewNotifier.notify(dpia: dpia, triggers: triggers)
end
triggers
end
end
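The check runs on a schedule rather than waiting for someone to remember the DPIA exists. A sketch, assuming an AIFeature model acts as the registry of live AI features:
class DPIAReviewSweepJob
  include Sidekiq::Job

  # Weekly sweep over every live AI feature
  def perform
    manager = DPIALifecycleManager.new
    AIFeature.where(status: 'live').find_each do |feature|
      manager.check_for_review_triggers(ai_feature: feature)
    end
  end
end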
How This All Connects: The Compliance Architecture
These three pillars aren't separate workstreams. They're layers of one architecture:
The DPA defines what processing is permitted and how it's documented. The Privacy by Design architecture enforces those permissions in code. The DPIA assesses whether the implementation actually works and identifies remaining risks.
And they connect to everything else I've written about. The machine unlearning post covers how the erasure handler works when someone exercises their Article 17 rights. The EU AI Act post covers how the AI Act's transparency and risk management requirements overlap with DPIA obligations. The CRA post covers the security requirements for the infrastructure these features run on.
The same patterns apply across every product. GrowCentric.ai uses them for marketing optimisation. Stint.co uses them for email engagement analytics and application form processing. Regios.at uses them for regional search and content recommendations. Auto-Prammer.at uses them for vehicle recommendations and dynamic pricing on Solidus. Different features, same compliance architecture.
The 10-Point Checklist
Before shipping any AI feature, run through this:
- DPA updated? Does your customer agreement cover this specific AI processing activity?
- Sub-processors listed? Are all AI infrastructure providers documented and notified?
- Purpose documented? Is the model constrained to a single, stated purpose?
- Data minimised? Are you processing only the data the model actually needs?
- Pseudonymisation applied? Is personal data replaced with tokens before reaching the model?
- Consent checked? Does the pipeline respect per-user consent preferences?
- Retention enforced? Will the data be automatically deleted when the retention period expires?
- Audit trail active? Is every AI decision logged with inputs, outputs, and confidence scores?
- DPIA completed? Have you assessed the risks and documented mitigations?
- Human oversight implemented? Can a human review and override every AI decision?
If you can answer yes to all ten, you're in good shape. Not perfect — GDPR compliance is never "done" — but significantly ahead of most AI SaaS products.
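You can go one step further and make the checklist executable. A rough sketch: it assumes each AI feature record can answer the ten questions as boolean methods (dpa_updated?, dpia_completed? and so on), which is an assumption about your data model rather than a requirement:
class AIFeatureComplianceGate
  CHECKS = %i[
    dpa_updated sub_processors_listed purpose_documented data_minimised
    pseudonymisation_applied consent_checked retention_enforced
    audit_trail_active dpia_completed human_oversight_implemented
  ].freeze

  def self.verify!(feature)
    failures = CHECKS.reject { |check| feature.public_send("#{check}?") }
    return true if failures.empty?

    raise "#{feature.name} is not ready to ship: #{failures.join(', ')}"
  end
end
Run it in CI or as a deploy hook and the compliance conversation happens before the feature ships, not after.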
The Bottom Line
GDPR compliance for AI features comes down to three documents and seven architectural decisions. The documents are your DPA, your Privacy by Design documentation, and your DPIA. The decisions are purpose limitation, data minimisation, pseudonymisation, consent awareness, retention enforcement, rights-respecting inference, and transparent logging.
None of this requires a law degree. It requires an engineering mindset applied to legal requirements. Write the compliance into your code, not into a document that sits in a drawer.
Because here's the enforcement reality: cumulative GDPR fines now exceed 6.7 billion euros. Supervisory authorities are actively investigating SaaS companies of all sizes. Article 25 violations alone carry fines of up to 10 million euros or 2% of global annual turnover. And with the EU AI Act adding overlapping requirements from August 2026, the compliance bar is only going up.
The good news: if you build these patterns once, they work everywhere. Same architecture, every product, every feature, every market.