If you asked a junior software engineer to add a checkout button and they immediately started typing code without checking whether the payment service even existed, you’d stop them. You’d ask them to step away from the keyboard and think through the system first.
Yet, most of us allow our AI tools to do exactly that.
The problem isn’t capability — it’s discipline. To solve this, I’ve been experimenting with a modular ai/roles/ directory that mirrors the roles inside a high-performing engineering team.
Why Separate Roles?
Modern engineering teams naturally divide work into distinct stages: Analysis, Design, and Implementation. While many AI tools now offer ‘Plan’ or ‘Agent’ modes to mimic this flow, they can still collapse analysis, design, and implementation into a single pass unless you hard-code the workflow for your team. That speed is impressive, but without specific guidance it leads to architectural drift and expensive rework.
By defining explicit roles as Markdown files within your repository, you create operational guardrails. You aren’t just relying on an IDE’s default ‘Plan Mode’; you are providing the specific persona and policy for that plan to follow.
Each file represents a distinct responsibility, allowing the team to extend the AI’s behaviour simply by adding new perspectives:
- ARCHITECT.md: Focuses on system design and high-level logic. This is the ‘Plan Mode’ brain, ensuring new features align with your existing patterns.
- DEVELOPER.md: Focuses on production-grade implementation. This is the ‘Edit Mode’ implementer, ensuring clean code and linting compliance.
A Note on Composition: You can scale this by adding any number of roles — perhaps a SECURITY.md or a QA.md. However, the value lies in how they work together. The goal isn't to create isolated silos, but a cohesive review board where each role's output informs the next.
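Mechanically, the wiring can be as simple as loading the chosen role file and prepending it as the system prompt. Here is a minimal sketch; the `build_messages` helper and the directory layout are hypothetical conveniences, not part of any particular tool's API:

```python
from pathlib import Path

def build_messages(roles_dir: Path, role_name: str, request: str) -> list[dict]:
    # Load the persona definition (e.g. ai/roles/ARCHITECT.md)
    # and inject it as the system prompt ahead of the user's request
    persona = (roles_dir / f"{role_name}.md").read_text()
    return [
        {"role": "system", "content": persona},
        {"role": "user", "content": request},
    ]
```

Because the role lives in the repository rather than in an IDE setting, it is versioned, reviewable, and shared by everyone on the team.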
The Planning Role
In this mode, the AI focuses exclusively on system design; implementation is intentionally deferred. Crucially, I have embedded a boot sequence into this role. This ensures the architect reads the ai/context/ directory before responding to a request. To bridge the gap between abstract logic and reality, we also require a mermaid.js diagram to visually audit the proposed flow.
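The boot sequence itself is simple: gather every file in ai/context/ into a single grounding document before the Architect answers. A minimal sketch, assuming the context files are plain Markdown (the `boot_context` helper is illustrative, not a real API):

```python
from pathlib import Path

def boot_context(context_dir: Path) -> str:
    # Read every context file in a stable order so the Architect is
    # grounded in the shared facts before it produces a proposal
    sections = [
        f"## {path.name}\n{path.read_text()}"
        for path in sorted(context_dir.glob("*.md"))
    ]
    return "\n\n".join(sections)
```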
The Execution Role
Once the design and diagrams have been reviewed and approved, we switch to the developer. This mindset is execution-focused. It doesn't freelance new patterns; it adheres to the approved blueprint and documented boundaries.
A Work in Progress: These role definitions are early-stage blueprints from my weekend research. Over the coming weeks, I plan to evolve both the Architect and Developer roles — refining how they communicate, handle complex refactors, and maintain the integrity of our collective intelligence.
The Role-Based Mental Model
```mermaid
graph LR
    %% Knowledge Base
    Memory[("📂 ai/context/<br/>(Collective Intelligence)")]

    %% User Input
    User((Software Engineer)) -- "1. Request" --> Architect

    %% Design Phase
    subgraph Design ["Phase 1: Design"]
        Architect["<b>Architect Role</b><br/>(Audit & Plan)"]
    end

    %% Human Gate
    Architect -- "2. Proposal & Diagrams" --> Approval{<b>Human<br/>Review</b>}

    %% Implementation Phase
    subgraph Implementation ["Phase 2: Execution"]
        Developer["<b>Developer Role</b><br/>(Code & Patch)"]
    end

    %% The Flow
    Approval -- "3. Approve" --> Developer
    Memory -.->|Grounding| Architect
    Developer -->|4. Writes| Code[("📦 src/<br/>(Code)")]
    Developer -- "5. Context Patch" --> Memory

    %% Styling
    style User fill:#f9f,stroke:#333,stroke-width:2px
    style Approval fill:#fff,stroke:#6200ee,stroke-width:2px
    style Architect fill:#e1f5fe,stroke:#01579b
    style Developer fill:#e8f5e9,stroke:#2e7d32
    style Memory fill:#f0f7ff,stroke:#0052cc
    style Code fill:#fff7e6,stroke:#ffa940
```
The Workflow in Practice: The Refund Demo
Let’s look at how this works in a simple payment processing demo for a new Refund feature.
Step 1: The Architectural Phase
User
Architect, we need to implement a refund feature.
AI
Refund Feature — Architectural Proposal
1. Problem Statement
The system needs the ability to refund payments that have reached the Succeeded state. Per the domain model, a Refund entity exists with the invariant that a refund cannot exceed amount_captured. We must design a flow that:
- Validates the refund amount against what was captured (supporting partial and full refunds).
- Coordinates with Stripe's Refund API.
- Handles async webhook confirmation from Stripe.
- Respects all architectural boundaries (service layer access, no PII logging, integer-only monetary values).
2. Technical Proposal
| Concern | Decision |
|---|---|
| New Entity | Refund — linked to a PaymentIntent via FK. Fields: id, payment_intent_id, amount (integer, pence/cents), reason, status, stripe_refund_id, trace_id, created_at. |
| State Machine | Refund: Created → Pending → Succeeded / Failed |
| Invariant Enforcement | PaymentService.create_refund() sums all non-failed refunds for the parent PaymentIntent and rejects if sum + requested > amount_captured. |
| Stripe Integration | PaymentService calls stripe.Refund.create() synchronously to get the initial status, then transitions to Pending. |
| Webhook Handling | A Celery task processes refund.updated / refund.succeeded / refund.failed events from Stripe, advancing the Refund state machine. |
| Idempotency | Each Refund carries a unique trace_id (UUID4), passed as Stripe's idempotency_key. |
| Monetary Values | All amounts stored and transmitted as integers. No floats anywhere in the pipeline. |
| Notifications | On terminal state (Succeeded / Failed), a Celery task dispatches a confirmation email (PII redacted at service boundary). |
| Access Pattern | Views call PaymentService.create_refund() — no direct DB access from the view layer. |
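The core invariant in the table above can be expressed framework-free. This is a sketch of the enforcement logic only, deliberately stripped of the Django ORM, locking, and the Stripe call; names like `create_refund` mirror the proposal but are illustrative:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class Refund:
    amount: int  # integer minor units (pence/cents), never floats
    status: str = "Created"
    # trace_id doubles as Stripe's idempotency_key in the real service
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))

class RefundAmountExceedsCaptured(Exception):
    pass

def create_refund(amount_captured: int, existing: list[Refund], requested: int) -> Refund:
    # Sum every non-failed refund; reject if the new request would over-refund
    already_refunded = sum(r.amount for r in existing if r.status != "Failed")
    if already_refunded + requested > amount_captured:
        raise RefundAmountExceedsCaptured(
            f"{already_refunded + requested} exceeds {amount_captured}"
        )
    return Refund(amount=requested)
```

Note that failed refunds are excluded from the sum, so a failed attempt does not permanently consume refundable balance.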
3. Logic Flow
```mermaid
sequenceDiagram
    participant V as RefundView
    participant PS as PaymentService
    participant DB as PostgreSQL
    participant S as Stripe API
    participant C as Celery Worker

    V->>PS: create_refund(payment_intent_id, amount, reason)
    PS->>DB: Fetch PaymentIntent (status=Succeeded)
    PS->>DB: SUM(refunds.amount) WHERE status != Failed
    PS->>PS: Assert sum + amount ≤ amount_captured
    PS->>DB: INSERT Refund (status=Created, trace_id=uuid4)
    PS->>S: stripe.Refund.create(amount, idempotency_key=trace_id)
    S-->>PS: Refund object (pending)
    PS->>DB: UPDATE Refund (status=Pending, stripe_refund_id)
    PS-->>V: RefundResponse(id, status=Pending)

    Note over S,C: Async webhook flow
    S->>C: refund.succeeded / refund.failed webhook
    C->>DB: UPDATE Refund (status=Succeeded|Failed)
    C->>C: Dispatch confirmation email (PII redacted)
```
4. Context Patch
The following updates to ai/context/DOMAIN_MODEL.md are proposed to formalise the Refund entity:
```diff
 Entity: Refund
 Rule: Refund cannot exceed amount_captured.
+
+State Machine:
+  - Created → Pending → Succeeded/Failed
+
+Invariant: Must possess a unique trace_id (idempotency key) before Stripe transmission.
+Invariant: SUM of non-failed refunds for a PaymentIntent must not exceed amount_captured.
+Association: Refund belongs to PaymentIntent. PaymentIntent status must be 'Succeeded' to allow refund creation.
```
No changes required to BOUNDARIES.md or SYSTEM_FACTS.md — the existing constraints already cover the design.
5. Flags & Risks
| # | Item |
|---|---|
| 1 | Stripe webhook signing must be verified to prevent forged refund state transitions. |
| 2 | Race condition on concurrent refund requests for the same PaymentIntent — enforce with SELECT ... FOR UPDATE on the PaymentIntent row during invariant check. |
| 3 | The domain model states Succeeded is terminal and cannot revert for PaymentIntent. This design does not revert the PaymentIntent status on refund — it stays Succeeded while the child Refund entity tracks the money movement. |
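On flag #1: in production you would verify signatures with the official stripe library's stripe.Webhook.construct_event, which also enforces a timestamp tolerance. For illustration, Stripe's documented v1 scheme is an HMAC-SHA256 over "{timestamp}.{payload}" keyed with the endpoint secret, which a dependency-free sketch can reproduce:

```python
import hashlib
import hmac

def verify_stripe_signature(payload: bytes, timestamp: str, v1_signature: str, secret: str) -> bool:
    # Stripe signs the string "{timestamp}.{raw request body}" with
    # HMAC-SHA256 using the webhook endpoint secret
    signed_payload = f"{timestamp}.".encode() + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels
    return hmac.compare_digest(expected, v1_signature)
```

A forged or tampered payload fails the comparison, so no refund state transition can be triggered by an unauthenticated POST.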
Next step: If approved, I'll hand over to the Developer role for implementation and context patching.
Step 2: The Implementation Phase
User
Approved. Developer, implement the service.
AI
Implementation complete. Here's the summary of files created:
| File | Purpose |
|---|---|
| payments/models.py | PaymentIntent and Refund Django models with state machines, BigIntegerField for monetary values, UUID trace_id |
| payments/exceptions.py | Domain-specific exceptions: PaymentIntentNotFound, PaymentIntentNotRefundable, RefundAmountExceedsCaptured, RefundGatewayError |
| payments/services.py | PaymentService.create_refund() with SELECT FOR UPDATE race-condition protection, invariant enforcement, Stripe call with idempotency key; process_refund_webhook() for terminal state transitions |
| payments/tasks.py | Celery tasks: process_refund_webhook (retries 3x) and send_refund_notification |
| payments/views.py | CreateRefundView (POST, delegates to service layer) and StripeRefundWebhookView (signature verification, dispatches to Celery) |
| payments/urls.py | Routes: /refunds/ and /webhooks/stripe/refund/ |
| payments/tests.py | 13 unit tests covering full/partial refund, over-refund rejection, non-succeeded rejection, Stripe failure handling, failed-refund exclusion from sum, idempotency key propagation, and webhook state transitions |
Key design decisions applied:
- No floats — all monetary fields use BigIntegerField
- No comments — descriptive names per style guide
- Service layer — views never touch the DB directly
- SELECT ... FOR UPDATE — prevents race conditions on concurrent refund requests
- Idempotency — trace_id passed as Stripe's idempotency_key
- Webhook signing — stripe.Webhook.construct_event verifies signatures
- Terminal state guard — webhook handler won't revert Succeeded/Failed refunds
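The terminal state guard boils down to a transition table with no exits from the terminal states. A minimal sketch (the `advance` helper name is illustrative):

```python
# Allowed transitions for the Refund state machine; terminal states have no exits
TRANSITIONS = {
    "Created": {"Pending"},
    "Pending": {"Succeeded", "Failed"},
    "Succeeded": set(),  # terminal
    "Failed": set(),     # terminal
}

def advance(current: str, target: str) -> str:
    # Called by the webhook handler; a late or replayed event
    # cannot move a refund out of a terminal state
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```

This matters because Stripe may deliver webhooks out of order or more than once; the guard makes the handler safe to replay.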
The Context Patch: Keeping Memory Alive
A particularly useful outcome of this weekend’s research was the Context Patch. In most AI workflows, knowledge is ephemeral; once the chat window is closed, the reasoning behind a decision vanishes. This framework treats knowledge as a live asset instead.
At the end of a task, the AI identifies what has changed — perhaps a new state was added to a lifecycle or a service boundary was shifted — and proposes a Context Patch for the ai/context/ directory.
This creates a simple feedback loop:
1. Interaction: You define a feature or bug.
2. Implementation: The AI roles execute the work.
3. Capture: The AI suggests updates to keep your context files in sync with the latest changes.
By surfacing these updates automatically, the AI helps prevent the usual documentation decay that happens in fast-moving projects. Knowledge evolves alongside the code, rather than being left behind in a forgotten chat history.
Join the Conversation
I’m sharing this research as a series on LinkedIn to gather feedback:
- Do you separate "Design" from "Coding" when working with AI?
- Would having your AI generate a sequence diagram first help you spot logic errors?
- How are you ensuring your AI's memory stays updated as your code evolves?
- What’s the one "STOP rule" you’d enforce on AI-generated code in your team? (e.g., “no new dependencies”, “no direct DB access from views”, “no PII in logs”, “must include tests”)
Next Step: Deep Domain Expertise
With memory and roles in place, the next challenge is standards. In the next post, we’ll look at how to embed specific engineering standards and observability patterns into the framework, ensuring the AI produces production-ready solutions by default.