Vibe Coding: The Hidden Cost of AI Built Architectures
There is a new pattern emerging in software development. Someone has an idea, fires up ChatGPT or Claude, and starts building. The AI writes the code. The prototype works. The founder is thrilled. Six months later, the system is on fire, the database is melting, and a proper engineer is brought in to deliver the bad news: the entire thing needs to be rebuilt from scratch.
I call this vibe coding. It is building software based on vibes rather than architectural understanding. The AI can write syntactically correct code, but it cannot make the strategic decisions that determine whether a system will scale, perform, or even survive contact with real users.
This post is a systematic breakdown of where vibe coded architectures fail, which frameworks and languages create which problems, and how to recognise the warning signs before you have invested years into a doomed codebase.
The Fundamental Problem: AI Writes Code, Not Architecture
Let me be clear about something: AI coding assistants are genuinely useful. I use them daily. They accelerate development, reduce boilerplate, and help explore unfamiliar APIs.
But there is a difference between writing code and designing systems.
Writing code is translating logic into syntax. AI is excellent at this.
Designing systems requires:
- Understanding how data flows and grows over time
- Anticipating bottlenecks before they occur
- Making tradeoffs between competing concerns
- Knowing what you do not know
AI cannot do this. It does not know your business. It does not know your scale. It does not know that your "simple user table" will have 10 million rows in 18 months. It optimises for the immediate request, not the long term architecture.
The result is code that works today and fails catastrophically tomorrow.
Database Architecture: Where Vibe Coding Dies First
The database is where architectural debt accumulates fastest. A bad schema decision made in week one becomes a million row migration nightmare in year two.
The N+1 Query Problem
This is the most common performance killer in vibe coded applications. It happens when you fetch a list of items, then make a separate database query for each item.
```python
# The vibe coded version (N+1 queries)
orders = Order.objects.all()  # 1 query
for order in orders:
    print(order.customer.name)  # N queries (one per order)
# 1000 orders = 1001 database queries

# The correct version (2 queries)
orders = Order.objects.select_related('customer').all()
for order in orders:
    print(order.customer.name)  # No additional queries
# 1000 orders = 2 database queries
```
The vibe coded version works perfectly with 10 orders in development. It brings the database to its knees with 10,000 orders in production.
AI writes N+1 queries constantly because each individual piece of code is correct. The AI does not see the loop. It does not understand that order.customer triggers a query. It just writes what you asked for.
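The query explosion is easy to observe with nothing but the standard library. This `sqlite3` sketch (hypothetical `orders`/`customers` tables) uses a trace callback to count every statement the database actually receives:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
queries = []
conn.set_trace_callback(queries.append)  # record every statement SQLite executes

conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1), (2, 2), (3, 1);
""")
queries.clear()

# N+1 style: one query for the list, then one per row for the customer.
for (customer_id,) in conn.execute("SELECT customer_id FROM orders").fetchall():
    conn.execute("SELECT name FROM customers WHERE id = ?", (customer_id,)).fetchone()
print(len(queries))  # 4 statements for 3 orders; 1000 orders would be 1001

queries.clear()
# JOIN style: a single query fetches everything.
conn.execute(
    "SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchall()
print(len(queries))  # 1 statement regardless of row count
```

The counter grows linearly with row count in the first version and stays constant in the second, which is exactly the difference between the two Django snippets above.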
Missing Indices
Every WHERE clause, every ORDER BY, every JOIN condition needs an index. Without indices, the database scans every row in the table.
```sql
-- Query that runs 1000x per second
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
```
Without an index on (user_id, status), this query scans the entire orders table. At 100,000 rows, it is slow. At 10 million rows, it is a production incident.
AI rarely adds indices because migrations that "just add a table" are simpler. The performance impact is invisible until scale.
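You can check whether a query will use an index with the database's plan explainer before it becomes an incident. Here is a stdlib `sqlite3` sketch of that same `(user_id, status)` query; PostgreSQL's `EXPLAIN` gives the equivalent answer:

```python
import sqlite3

# In-memory sketch of the orders table from the example above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)")

query = "SELECT * FROM orders WHERE user_id = 123 AND status = 'pending'"

# Without an index, the planner reports a full table scan ("SCAN orders").
before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]
print(before)

# With a composite index on (user_id, status), it can seek directly.
conn.execute("CREATE INDEX idx_orders_user_status ON orders (user_id, status)")
after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]
print(after)  # the plan now mentions idx_orders_user_status
```

Making this check a habit (or a CI step) catches missing indices at 100 rows instead of 10 million.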
Wrong Database Choice
Vibe coding gravitates toward SQLite (it is the default) or PostgreSQL (the AI knows it). But the choice should depend on access patterns:
| Access Pattern | Right Choice | Wrong Choice |
|---|---|---|
| Relational data with complex joins | PostgreSQL, MySQL | MongoDB |
| Document storage, schema flexibility | MongoDB | PostgreSQL with JSONB everywhere |
| High write throughput, time series | TimescaleDB, ClickHouse | PostgreSQL |
| Session storage, caching | Redis | PostgreSQL |
| Full text search | Elasticsearch, Meilisearch | LIKE '%term%' on PostgreSQL |
| Graph relationships | Neo4j | Recursive CTEs on PostgreSQL |
I have seen vibe coded projects store time series data in PostgreSQL with a new row per data point, then wonder why queries take 30 seconds. I have seen projects store highly relational data in MongoDB, then spend months fighting with $lookup aggregations that a SQL JOIN would handle in milliseconds.
The AI picks whatever database you mention first, or defaults to the most common. It does not ask about your access patterns.
Schema Design That Cannot Evolve
Vibe coded schemas are designed for today's requirements, not tomorrow's.
The problem: You start with a users table with an address column (a string). Then you need to support multiple addresses. Then you need to validate addresses. Then you need to geocode them. Then you need to support international formats.
The vibe coded solution adds columns: address_line_1, address_line_2, address_city, address_country, secondary_address_line_1...
The correct solution creates an addresses table from day one with a foreign key relationship.
```sql
-- Vibe coded (week 1)
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    address TEXT
);

-- Vibe coded (week 52, after 47 migrations)
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    address_line_1 TEXT,
    address_line_2 TEXT,
    address_city TEXT,
    address_country TEXT,
    address_postcode TEXT,
    billing_address_line_1 TEXT,
    billing_address_line_2 TEXT
    -- ... 15 more address columns
);

-- Properly architected (week 1)
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255)
);

CREATE TABLE addresses (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id),
    address_type VARCHAR(50), -- 'primary', 'billing', 'shipping'
    line_1 TEXT,
    line_2 TEXT,
    city TEXT,
    country TEXT,
    postcode TEXT,
    latitude DECIMAL(10, 8),
    longitude DECIMAL(11, 8)
);
```
The properly architected version supports unlimited addresses, arbitrary types, and geocoding from day one. The vibe coded version requires a migration for every new requirement.
Choosing the Right Tool: Language Selection
Before frameworks, let us talk about languages. The language you choose determines:
- Your hiring pool
- Your library ecosystem
- Your performance ceiling
- Your deployment complexity
Python: The Vibe Coder's Default
Strengths:
- Massive ecosystem, especially for data science and ML
- Low barrier to entry
- Excellent AI/ML libraries (the AI knows Python best)
- Readable syntax
Weaknesses:
- Slow. The Global Interpreter Lock (GIL) limits true parallelism.
- Dynamic typing creates runtime errors that static languages catch at compile time
- Dependency management is a nightmare (pip, pipenv, poetry, conda all solve different problems badly)
- Memory hungry
When to use Python:
- Data pipelines and ML workloads
- Prototyping and MVPs
- Glue code between systems
- When your team already knows Python
When NOT to use Python:
- High throughput APIs (consider Go, Rust, or Elixir)
- CPU bound workloads without NumPy/Pandas optimisations
- Large teams where type safety prevents bugs
- When you need sub-millisecond latency
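The GIL point is easy to demonstrate. For pure-Python CPU-bound work, a thread pool buys you nothing, while a process pool sidesteps the GIL at the cost of extra memory. A minimal sketch (timings vary by machine, and the experimental free-threaded 3.13 build changes this picture):

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n: int) -> int:
    # Pure-Python arithmetic: the GIL is held for the whole computation.
    return sum(i * i for i in range(n))

def run_with(executor_cls, n: int = 1_000_000, workers: int = 4) -> float:
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        list(pool.map(cpu_bound, [n] * workers))
    return time.perf_counter() - start

if __name__ == "__main__":
    # With threads, the GIL serialises the work: four workers take roughly
    # as long as running the four jobs one after another.
    thread_time = run_with(ThreadPoolExecutor)
    # With processes, each worker has its own interpreter and its own GIL,
    # so the jobs can genuinely run in parallel (given spare cores).
    process_time = run_with(ProcessPoolExecutor)
    print(f"threads: {thread_time:.2f}s  processes: {process_time:.2f}s")
```

This is also why NumPy and Pandas escape the penalty: their hot loops run in C with the GIL released.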
JavaScript/TypeScript: The Full Stack Trap
Strengths:
- Same language frontend and backend
- Massive npm ecosystem
- TypeScript adds type safety
- Event loop handles I/O concurrency well
Weaknesses:
- npm dependency hell (node_modules folder from hell)
- Callback/async complexity
- Weak standard library (everything is a dependency)
- Runtime type coercion creates subtle bugs
When to use JS/TS:
- Frontend (you have no choice)
- Real time applications (WebSockets, chat)
- When your team is frontend heavy
- Serverless functions (cold start times are good)
Ruby: The Productivity Champion
Strengths:
- Rails is incredibly productive for CRUD applications
- Convention over configuration reduces decisions
- Mature ecosystem with solved problems
- Testing culture is strong
Weaknesses:
- Slower than compiled languages
- Smaller hiring pool than Python/JS
- Metaprogramming can create unmaintainable magic
When to use Ruby:
- Web applications with complex business logic
- Startups that need to move fast
- E-commerce (Solidus, Spree)
- When developer happiness matters
Go: The Pragmatic Choice for Scale
Strengths:
- Compiled, fast, efficient
- Built in concurrency (goroutines)
- Simple language (easy to read others' code)
- Excellent for microservices and APIs
Weaknesses:
- Verbose (no generics until recently, limited abstractions)
- Error handling is tedious
- Less expressive than Ruby or Python
When to use Go:
- High throughput APIs
- Microservices
- DevOps tools (Docker, Kubernetes are written in Go)
- When you need to scale horizontally
Elixir: The Scalability Dark Horse
Strengths:
- Built on Erlang VM (battle tested for telecom scale)
- True concurrency with lightweight processes
- Phoenix framework is Rails-like productivity with better performance
- LiveView eliminates frontend JS for many use cases
Weaknesses:
- Small ecosystem compared to Python/JS
- Functional programming learning curve
- Smaller hiring pool
When to use Elixir:
- Real time features (chat, notifications, live updates)
- High concurrency requirements
- When you need Phoenix LiveView
- IoT and embedded systems
Rust: When Performance Is Non-Negotiable
Strengths:
- C-level performance with memory safety
- No garbage collector (predictable latency)
- Excellent type system
Weaknesses:
- Steep learning curve (borrow checker)
- Slower development velocity
- Overkill for most web applications
When to use Rust:
- Systems programming
- Performance critical services
- WebAssembly
- When you cannot afford GC pauses
Framework Analysis: The Good, The Bad, and The Ugly
Now let us get specific. These are the frameworks I see vibe coded projects reach for, and where they break.
Streamlit: The Prototype That Should Not Have Gone to Production
What it is: A Python library for building data apps with minimal code. Write Python, get a web UI.
The appeal: You can build a working dashboard in 50 lines of code. AI loves generating Streamlit apps because they are simple.
Where it breaks:
- No persistent state. Every interaction reruns the entire script. This is fine for a demo. It is catastrophic for a production app.
```python
# This runs on EVERY interaction
import streamlit as st
import pandas as pd

# This expensive query runs every time you click a button
df = pd.read_sql("SELECT * FROM huge_table", connection)  # 30 second query
st.dataframe(df)

if st.button("Do something"):
    # The query ran AGAIN just to handle this click
    process(df)
```
- No URL routing. You cannot link to a specific page or state. Everything is one page with widgets that modify global state.
- No authentication built in. You need third party libraries or Streamlit Cloud's auth, which locks you into their platform.
- Cannot handle concurrent users well. Every user session reruns the script and holds its own data in memory. 100 concurrent users means 100 live sessions, each with its own copy of that DataFrame, and a server on fire.
- No separation of concerns. Business logic, data access, and presentation are all mixed in one script. Impossible to test, impossible to refactor.
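Streamlit's own mitigation is the `st.cache_data` decorator, which memoises expensive calls across reruns. The underlying idea is plain memoisation; here is a stdlib-only sketch of why it helps, with a counter standing in for the expensive database call:

```python
import functools

calls = {"count": 0}  # counts how often the "expensive query" actually runs

@functools.lru_cache(maxsize=1)
def load_data(query: str) -> list:
    calls["count"] += 1  # stand-in for the 30-second pd.read_sql call
    return [1, 2, 3]

# Simulate three top-to-bottom script reruns (one per user interaction).
for _ in range(3):
    df = load_data("SELECT * FROM huge_table")

print(calls["count"])  # 1: the expensive load ran once, not three times
```

Caching softens the rerun cost, but it does not fix the architecture: state, routing, and concurrency problems remain.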
The ceiling: 10 concurrent users doing light data exploration. Beyond that, you need a real application.
Real example: A client built their entire analytics platform in Streamlit. 50 dashboards, 200 users. The server needed 64GB RAM just to run. Queries reran on every interaction. The rebuild into a proper FastAPI + React application took 6 months.
Gradio: Even More Limited Than Streamlit
What it is: A library for building ML model demos. Upload an image, get a prediction.
The appeal: You can wrap any Python function in a UI with one decorator.
Where it breaks:
Everything that breaks Streamlit, plus:
- Designed for demos, not applications. There is no concept of users, sessions, or persistent data.
- Limited UI components. You get inputs and outputs. That is it. No complex layouts, no custom styling, no interactive components.
- No state management. Each request is independent. You cannot build multi-step workflows.
- Scaling is "run more copies." There is no horizontal scaling story beyond load balancers pointing at replicas.
The ceiling: A model demo with a few concurrent users. Anything beyond that is misuse.
Real example: A startup built their "AI platform" as a collection of Gradio interfaces behind a proxy. When they needed user accounts, payment processing, and saved results, they discovered Gradio has no answer for any of these. Full rebuild.
Plotly Dash: Streamlit's Older Sibling
What it is: A framework for building analytical web applications in Python.
The appeal: More structured than Streamlit. Callback based architecture. Better for complex dashboards.
Where it breaks:
- Callback hell. Complex applications become a web of callbacks that are impossible to trace.
```python
@app.callback(
    Output('graph-1', 'figure'),
    Output('graph-2', 'figure'),
    Output('table-1', 'data'),
    Input('dropdown-1', 'value'),
    Input('dropdown-2', 'value'),
    Input('date-picker', 'start_date'),
    Input('date-picker', 'end_date'),
    State('store-1', 'data'),
    State('store-2', 'data'),
)
def update_everything(v1, v2, start, end, s1, s2):
    # Which output changed? All of them? Some of them?
    # Good luck debugging this at 50 callbacks.
    pass
```
- Performance degrades with complexity. Each interaction triggers callback chains. Complex dashboards become slow.
- Limited to dashboards. There is no authentication, no user management, no payment processing. It is a data visualisation tool being asked to be an application.
- Python sending JSON to React. The architecture means you are building a React app with Python syntax. You get the limitations of both.
The ceiling: Internal dashboards with <50 concurrent users. Production applications need more.
Real example: A fintech built their client portal in Dash. It worked until they needed real time updates (Dash uses polling), custom components (requires React knowledge), and sub-second latency (Dash cannot deliver). Migration to Next.js took 4 months.
Flask: The Microframework That Needed a Macro
What it is: A minimal Python web framework. You get routing and not much else.
The appeal: Total control. No opinions. Add only what you need.
Where it breaks:
- Everything is your problem. No ORM (add SQLAlchemy), no migrations (add Alembic), no authentication (add Flask-Login), no admin (add Flask-Admin), no forms (add WTForms). Each addition is a dependency with its own learning curve and compatibility concerns.
- No convention. Every Flask project is structured differently. Onboarding new developers takes longer.
- Sync by default. Flask is synchronous. If you need async (WebSockets, long-running requests), you need workarounds.
- Global application state. Flask uses global variables (`current_app`, `g`, `request`). This works until you need to test, or run multiple applications, or understand where state comes from.
The ceiling: Flask CAN scale to large applications, and plenty of big production systems have been built on it, but it requires discipline that vibe coding does not provide. Without experienced developers making consistent architectural decisions, Flask projects become unmaintainable.
Real example: A startup built their MVP in Flask because it was "simple." 18 months later, they had 15 different patterns for database access, 3 different authentication mechanisms, and no tests because the global state made testing impossible. The Flask to Django migration took 8 months.
Django: The Batteries Included Monolith
What it is: A full featured Python web framework. ORM, migrations, auth, admin, forms, all built in.
The appeal: Decisions are made for you. There is one way to do most things. Huge ecosystem.
Where it breaks:
- The ORM is a leaky abstraction. Django's ORM is convenient until you need performance. Then you discover `select_related`, `prefetch_related`, `defer`, `only`, and `annotate`, and realise you are fighting the ORM instead of using it.

```python
# This innocent loop can trigger thousands of queries
for order in Order.objects.all():
    print(order.customer.name)        # N+1!
    for item in order.items.all():    # N*M+1!!
        print(item.product.name)      # N*M*P+1!!!

# The fix: fetch the related rows up front
for order in Order.objects.select_related('customer').prefetch_related('items__product'):
    ...
```
- Synchronous by default. Django 4.0+ has async views, but the ORM's async support is still limited. If you need fully async database access, you need workarounds.
- Monolithic architecture fights microservices. Django wants to be one big application. Splitting into microservices requires fighting the framework.
- Heavy. Django applications use more memory and start slower than lighter frameworks. This matters for serverless and edge deployment.
- Magic can bite. Django's "convention over configuration" means things happen implicitly. When they break, debugging requires understanding the magic.
The ceiling: Django can scale to very large applications (Instagram runs on Django). But it requires understanding the ORM pitfalls, caching strategies, and async limitations. Vibe coded Django projects hit walls around 1M daily users without significant optimisation.
Real example: An e-commerce platform built with vibe coded Django worked until Black Friday. The ORM generated 10,000 queries per page load. The site went down. Emergency caching, query optimisation, and CDN implementation took 3 weeks of crisis mode.
FastAPI: The API Framework Being Asked to Do Everything
What it is: A modern Python framework for building APIs. Type hints, automatic OpenAPI docs, async support.
The appeal: Fast, modern, excellent developer experience. AI loves generating FastAPI code.
Where it breaks:
- It is an API framework, not an application framework. There is no ORM, no migrations, no admin, no authentication, no frontend. You are assembling everything yourself.
- Async is not free. FastAPI is async by default, which is great until you call synchronous code (most Python libraries). Then you need `run_in_executor` or you block the event loop.
```python
# This looks fine but blocks the event loop
@app.get("/users")
async def get_users():
    # Synchronous database call blocks EVERYTHING
    return db.query(User).all()  # Wrong!

# Correct but verbose: push the sync call onto a thread pool
@app.get("/users")
async def get_users():
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, lambda: db.query(User).all())

# Simplest fix: declare the endpoint with plain def and
# FastAPI runs it in a thread pool for you
@app.get("/users")
def get_users():
    return db.query(User).all()
```
- Dependency injection complexity. FastAPI's dependency injection is powerful but creates complex dependency graphs that are hard to test and debug.
- No convention for application structure. Like Flask, every FastAPI project is structured differently. There is no "Rails way" to follow.
The ceiling: FastAPI is excellent for APIs. It is not a full stack framework. Projects that start with "FastAPI + React" end up with two codebases to maintain, no shared authentication, and CORS nightmares.
Real example: A startup built their "platform" as a FastAPI backend + React frontend. Six months in, they had duplicate validation logic in both codebases, three different authentication mechanisms (JWT for API, sessions for admin, API keys for partners), and deployment required coordinating two repos. They considered rewriting in Django or Rails for the unified experience.
Reflex: The New Kid with Promise
What it is: A Python framework for building full stack web apps. Write Python, get a React frontend.
The appeal: No JavaScript required. Full stack in one language. Hot reload. Modern stack.
Where it breaks:
- Immature ecosystem. Reflex is new. The component library is limited. Edge cases are undocumented. You will hit walls.
- Compilation step. Reflex compiles Python to React. When the compilation breaks, debugging requires understanding both Python AND React internals.
- Performance overhead. Every state change round trips to the Python backend. For highly interactive UIs, this latency adds up.
- Lock-in. Reflex is a unique architecture. If you outgrow it, there is no migration path. You rebuild from scratch.
- Limited deployment options. Reflex requires a Python backend always running. No static site generation, no edge deployment, no serverless.
The ceiling: Internal tools and MVPs where development speed matters more than performance. Production applications with 10K+ users need careful evaluation.
Real example: Too early for failure stories. But the architecture (Python backend for every interaction) will struggle with latency-sensitive, high-concurrency applications.
Rapid Development Frameworks That Actually Scale
These are the frameworks where "move fast" does not mean "rebuild later."
Ruby on Rails: The Original Rapid Development Framework
What it is: A full stack Ruby framework. Convention over configuration. Batteries included.
Why it works:
- 15+ years of production lessons. Rails has been scaled to millions of users (Shopify, GitHub, Airbnb). The pitfalls are known. The solutions exist.
- Convention means consistency. Every Rails app is structured the same way. New developers onboard faster. Code reviews are easier.
- ActiveRecord is mature. The ORM has every optimisation tool you need. N+1 query detection is a solved problem (the Bullet gem). Complex queries are possible.
- Full stack by default. Auth (Devise), admin (ActiveAdmin), background jobs (Sidekiq), email (ActionMailer), all integrate seamlessly.
- The Rails Way guides decisions. When you are stuck, there is usually a "Rails way" to follow. Less decision fatigue, fewer architectural mistakes.
Where Rails struggles:
- CPU bound workloads (Ruby is slow)
- Sub-millisecond latency requirements
- Microservices (Rails wants to be a monolith)
- Teams that do not embrace convention
Scaling path: Caching (Rails has excellent caching primitives), read replicas, background jobs, and eventually service extraction for hot paths.
Phoenix (Elixir): Rails with Erlang's Power
What it is: A full stack Elixir framework. Rails-like productivity on the Erlang VM.
Why it works:
- True concurrency. Erlang processes are lightweight (you can run millions of them). No GIL. No event loop complexity.
- Real time by default. Phoenix Channels handle WebSockets elegantly. LiveView eliminates frontend JavaScript for most use cases.
- Fault tolerance. The Erlang "let it crash" philosophy means failures are isolated. One bad request does not take down the server.
- Performance. Phoenix handles more concurrent connections per server than Rails or Django. WhatsApp scaled to 2 million connections per server with Erlang.
Where Phoenix struggles:
- Smaller ecosystem than Rails or Django
- Functional programming learning curve
- Fewer developers to hire
Scaling path: Often unnecessary. Phoenix handles scale that would require multiple Rails or Django servers with a single instance. When you do need to scale, Elixir's distribution primitives make clustering straightforward.
Laravel (PHP): The Modern PHP Framework
What it is: A full stack PHP framework. Elegant syntax, comprehensive tooling.
Why it works:
- PHP is everywhere. Shared hosting, VPS, managed hosting all support PHP. Deployment is trivial.
- Excellent documentation. Laravel's docs are a model for the industry.
- Comprehensive ecosystem. Auth (Laravel Breeze), admin (Nova), queues (Laravel Horizon), real time (Laravel Echo). Everything works together.
- Modern PHP. PHP 8.x is a capable language. Type hints, attributes, JIT compilation. It is not 2005 anymore.
Where Laravel struggles:
- PHP's fundamental weirdness (see below)
- Performance ceiling lower than Go, Rust, or Elixir
- Stigma (deserved or not, PHP carries baggage)
Why PHP Is the Worst Offender
I saved this for last because PHP deserves special attention. It is the most common language for vibe coded projects because AI has seen so much PHP, and because it "works" immediately.
But PHP has fundamental design problems that compound over time.
The Language Design Problems
Inconsistent standard library:
```php
// Is the needle or haystack first? It depends!
strpos($haystack, $needle);       // haystack, needle
array_search($needle, $haystack); // needle, haystack
in_array($needle, $haystack);     // needle, haystack
strstr($haystack, $needle);       // haystack, needle
```
Every function is a memory test. The AI writes it wrong half the time. You write it wrong half the time. You only discover the bug in production.
Weak typing creates silent failures:
```php
// All of these are "equal" under PHP 7's loose comparison
// (PHP 8 fixed some string/number cases, but == remains a minefield)
0 == "0"      // true
0 == ""       // true in PHP 7, false since PHP 8
"0" == ""     // false (wait, what?)
null == false // true
0 == null     // true
0 == false    // true

// This leads to actual security vulnerabilities
if ($password == $user_input) { // WRONG: use ===
    // Attacker can bypass with type juggling
}
```
Silent error handling:
```php
// The @ operator silently suppresses the error instead of failing loudly
$result = @some_function_that_errors();

// This requires manual checking that nobody does
$file = fopen('missing.txt', 'r');
if ($file === false) {
    // Handle error... or don't, nobody checks
}
```
Global state everywhere:
```php
// $_GET, $_POST, $_SESSION, $_COOKIE are all global
// This makes testing nearly impossible
function get_user() {
    return User::find($_SESSION['user_id']); // Global!
}
```
The Ecosystem Problems
Composer came late. Python had pip, Ruby had gems, Node had npm. PHP did not get Composer until 2012, seventeen years after the language launched. Millions of PHP applications were built with manual include statements, copied and pasted vendor code, and no dependency management. Legacy codebases cannot adopt modern practices.
No concurrency model. PHP is request-response. Each request is a new process. There is no built in async, no WebSockets without extensions, no background processing without external tools. Every concurrent problem requires leaving PHP.
Framework fragmentation. Laravel, Symfony, CodeIgniter, Yii, CakePHP, Zend/Laminas. Each has different conventions, different ORMs, different patterns. Moving between frameworks means relearning everything.
The Cultural Problems
"It works" is the bar. PHP's low barrier to entry means many PHP developers never learned software engineering. The codebase "works," so why refactor? Why test? Why document?
Tutorial driven development. PHP has more tutorials than any other language. Many are outdated, insecure, or teach bad practices. AI trained on this corpus perpetuates the problems.
WordPress. 40% of the web runs WordPress. WordPress is PHP. WordPress is also the source of infinite security vulnerabilities, plugin conflicts, and performance problems. The association drags down the entire ecosystem.
When PHP Is Acceptable
I am not saying never use PHP. Use PHP when:
- You are extending WordPress, Drupal, or Magento (you have no choice)
- Your team already knows PHP well
- You are using Laravel with modern PHP practices
- Deployment constraints require PHP (shared hosting)
Do not use PHP when:
- Starting a new project with language choice freedom
- Building real time features
- Performance matters
- You want to attract senior engineers (PHP stigma is real in hiring)
The Vibe Coding Checklist: Signs You Are Heading for Trouble
Before you build, or if you are evaluating a codebase, check for these warning signs:
Database Warning Signs
- No foreign keys defined
- No indices beyond primary keys
- All columns are VARCHAR(255) or TEXT
- No migrations (schema changes are manual SQL)
- ORM used for everything, raw SQL never
- SQLite in production
Architecture Warning Signs
- All logic in route handlers (no service layer)
- No background job processing (everything synchronous)
- Secrets in code (not environment variables)
- No caching layer
- Monolithic function files (1000+ line files)
- No error handling (or silent error suppression)
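On the secrets point, the fix is cheap: read configuration from the environment and refuse to boot when something required is missing. A minimal fail-fast sketch (names are illustrative):

```python
import os

def require_env(name: str) -> str:
    """Fail at boot, not mid-request, when a required secret is missing."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"required environment variable {name} is not set")
    return value

# Dev-only fallbacks so this sketch runs standalone; real deployments set
# these outside the codebase (container env, systemd unit, secret manager).
os.environ.setdefault("DATABASE_URL", "postgres://localhost/app_dev")
os.environ.setdefault("SECRET_KEY", "dev-only-never-commit-real-keys")

DATABASE_URL = require_env("DATABASE_URL")
SECRET_KEY = require_env("SECRET_KEY")
```

A startup crash with a clear message beats a half-configured app quietly running with a hardcoded key.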
Framework Warning Signs
- Streamlit/Gradio for production applications
- Flask without SQLAlchemy/Alembic
- FastAPI trying to be a full stack framework
- Any framework from a "Build X in 10 Minutes" tutorial
Testing Warning Signs
- No tests
- Tests only pass on the original developer's machine
- "Testing" means clicking through the UI
- No CI/CD pipeline
Deployment Warning Signs
- FTP deployment
- SSH and `git pull` deployment
- "It works on my machine"
- No staging environment
- No rollback capability
How to Recover from Vibe Coding
If you recognise your project in this post, you have options:
Option 1: Gradual Refactoring
When it works: The codebase is small, tests exist (or can be added), and the framework choice is not fundamentally wrong.
Approach:
- Add tests for existing functionality
- Extract business logic into services
- Fix database schema incrementally
- Add proper error handling
- Improve one subsystem at a time
Option 2: Strangler Pattern
When it works: The system is large enough that full rewrites are prohibitive, but modules can be extracted.
Approach:
- Put the legacy system behind an API gateway
- Extract one feature to a new service
- Route traffic to the new service
- Repeat until legacy system is empty
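The heart of the strangler pattern is a routing decision at the gateway: extracted path prefixes go to the new service, everything else stays with the legacy app. A deliberately minimal sketch (hostnames and prefixes are hypothetical; in practice this lives in nginx, Envoy, or an application-level proxy):

```python
# Hypothetical backends behind the gateway.
LEGACY_BACKEND = "http://legacy-app.internal"
NEW_BACKEND = "http://orders-service.internal"

# Grows one prefix at a time as features are extracted from the monolith.
EXTRACTED_PREFIXES = ["/orders", "/invoices"]

def route(path: str) -> str:
    """Decide which backend serves this request path."""
    if any(path == p or path.startswith(p + "/") for p in EXTRACTED_PREFIXES):
        return NEW_BACKEND
    return LEGACY_BACKEND

print(route("/orders/42"))  # new service
print(route("/users/7"))    # legacy, until /users is extracted
```

When `EXTRACTED_PREFIXES` covers every route, the legacy backend is empty and can be switched off.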
Option 3: Full Rewrite
When it works: The existing system is fundamentally broken (wrong database choice, wrong language choice, no tests, unmaintainable code) AND the team understands why the first attempt failed.
Warning: Rewrites fail more often than they succeed. Only rewrite if you:
- Understand what went wrong the first time
- Have experienced architects making decisions
- Have written specifications before writing code
- Have budget and time for the rewrite to fail and try again
Conclusion: Architecture Is Not Automatable
AI coding assistants are powerful tools. They accelerate development, reduce boilerplate, and help explore unfamiliar territory.
But they cannot replace architectural understanding. They cannot ask clarifying questions about scale. They cannot anticipate bottlenecks. They cannot make the tradeoffs that determine success or failure.
The most expensive code is code that needs to be rewritten. The cheapest architectural decision is the one made before the first line is written.
If you are building something meant to last:
- Spend time on architecture before coding. Whiteboard the data model. Identify access patterns. Plan for 10x your expected scale.
- Choose boring technology. Rails, Django, Phoenix, and Laravel have known scaling paths. Novel frameworks have unknown failure modes.
- Hire experience early. One senior engineer making architectural decisions saves 10 junior engineers fixing them later.
- Test relentlessly. Tests are the only way to refactor safely.
- Use AI for implementation, not design. Let AI write the code that implements your architecture. Do not let AI design the architecture.
The vibe coded prototype that works today becomes the legacy system that blocks progress tomorrow. Build for tomorrow.
I have rebuilt vibe coded projects across e-commerce, fintech, and SaaS. The patterns are consistent: wrong framework choice, missing database indices, no separation of concerns, and the belief that "it works" means "it scales." If you are inheriting a codebase and need an honest assessment of what is salvageable, or if you are starting fresh and want to get the architecture right from day one, I can help. Let us talk before the technical debt compounds.