Part 2: Separating Architecture from Implementation

If you asked a junior software engineer to add a checkout button and they immediately started typing code without checking whether the payment service even existed, you’d stop them. You’d ask them to step away from the keyboard and think through the system first.

Yet, most of us allow our AI tools to do exactly that.

The problem isn’t capability — it’s discipline. To solve this, I’ve been experimenting with a modular ai/roles/ directory that mirrors the roles inside a high-performing engineering team.

Why Separate Roles?

Modern engineering teams naturally divide work into distinct stages: Analysis, Design, and Implementation. While many AI tools now offer ‘Plan’ or ‘Agent’ modes to mimic this flow, they can still compress analysis + design + implementation unless you hard-code the workflow for your team. That speed is impressive, but without specific guidance, it leads to architectural drift and expensive rework.

By defining explicit roles as Markdown files within your repository, you create operational guardrails. You aren’t just relying on an IDE’s default ‘Plan Mode’; you are providing the specific persona and policy for that plan to follow.

Each file represents a distinct responsibility, allowing the team to extend the AI’s behaviour simply by adding new perspectives:

ARCHITECT.md: Focuses on system design and high-level logic. This is the ‘Plan Mode’ brain, ensuring new features align with your existing patterns.
DEVELOPER.md: Focuses on production-grade implementation. This is the ‘Edit Mode’ implementer, ensuring clean code and linting compliance.

A Note on Composition: You can scale this by adding any number of roles — perhaps a SECURITY.md or a QA.md. However, the value lies in how they work together. The goal isn't to create isolated silos, but a cohesive review board where each role's output informs the next.

The Planning Role

In this mode, the AI focuses exclusively on system design; implementation is intentionally deferred. Crucially, I have embedded a boot sequence into this role. This ensures the architect reads the ai/context/ directory before responding to a request. To bridge the gap between abstract logic and reality, we also require a mermaid.js diagram to visually audit the proposed flow.

ai/roles/ARCHITECT.md

Role: Senior Software Architect

Boot Sequence:
  - Before responding, you MUST read all files in the `ai/context/` directory.
  - Distil the Technical Stack, Domain Rules, and Boundaries.
  - If the user's request contradicts a file in `ai/context/`, flag it immediately.

Objectives:
  - High-Level Analysis: Evaluate the request across system boundaries.
  - Context Validation: Ensure the solution respects the Collective Intelligence.
  - Planning: Propose architecture and logic flow before implementation.

Required Output:
  - Problem Statement & Technical Proposal.
  - Visual Logic: A Mermaid.js sequence or flow diagram of the solution.
  - Context Patch: Proposed updates for `ai/context/` files if the system state changes.

Operational Constraint:
  - Do not generate implementation code in this stage.

The Execution Role

Once the design and diagrams have been reviewed and approved, we switch to the developer. This mindset is execution-focused. It doesn't freelance new patterns; it adheres to the approved blueprint and documented boundaries.

ai/roles/DEVELOPER.md

Role: Senior Software Engineer

Objectives:
  - Produce production-grade implementation.
  - Follow the approved architectural proposal.
  - Maintain clean structure and predictable imports.

Constraints:
  - Adhere to the engineering standards defined in `ai/standards/`.
  - Adhere to rules defined in `ai/context/BOUNDARIES.md`.
  - Prioritise idempotency and error handling.
  - Update documentation via a final Context Patch.
  - Do not introduce new patterns, dependencies, or architectural changes.
    If needed, STOP and return to the Architect role with a proposal + rationale.
  - If implementation reveals a contradiction with `ai/context/`, STOP and raise it
    before writing code.

A Work in Progress: These role definitions are early-stage blueprints from my weekend research. Over the coming weeks, I plan to evolve both the Architect and Developer roles — refining how they communicate, handle complex refactors, and maintain the integrity of our collective intelligence.

The Role-Based Mental Model

graph LR
    %% Knowledge Base
    Memory[("📂 ai/context/<br/>(Collective Intelligence)")]

    %% User Input
    User((Software Engineer)) -- "1. Request" --> Architect

    %% Design Phase
    subgraph Design ["Phase 1: Design"]
        Architect["<b>Architect Role</b><br/>(Audit & Plan)"]
    end

    %% Human Gate
    Architect -- "2. Proposal & Diagrams" --> Approval{<b>Human<br/>Review</b>}

    %% Implementation Phase
    subgraph Implementation ["Phase 2: Execution"]
        Developer["<b>Developer Role</b><br/>(Code & Patch)"]
    end

    %% The Flow
    Approval -- "3. Approve" --> Developer
    Memory -.->|Grounding| Architect
    Developer -->|4. Writes| Code[("📦 src/<br/>(Code)")]
    Developer -- "5. Context Patch" --> Memory

    %% Styling
    style User fill:#f9f,stroke:#333,stroke-width:2px
    style Approval fill:#fff,stroke:#6200ee,stroke-width:2px
    style Architect fill:#e1f5fe,stroke:#01579b
    style Developer fill:#e8f5e9,stroke:#2e7d32
    style Memory fill:#f0f7ff,stroke:#0052cc
    style Code fill:#fff7e6,stroke:#ffa940

The Workflow in Practice: The Refund Demo

Let’s look at how this works in a simple payment processing demo for a new Refund feature.

Step 1: The Architectural Phase

User

Architect, we need to implement a refund feature.

Refund Feature — Architectural Proposal

1. Problem Statement

The system needs the ability to refund payments that have reached the Succeeded state. Per the domain model, a Refund entity exists with the invariant that a refund cannot exceed amount_captured. We must design a flow that:

Validates the refund amount against what was captured (supporting partial and full refunds).
Coordinates with Stripe's Refund API.
Handles async webhook confirmation from Stripe.
Respects all architectural boundaries (service layer access, no PII logging, integer-only monetary values).

2. Technical Proposal

Concern	Decision
New Entity	`Refund` — linked to a `PaymentIntent` via FK. Fields: `id`, `payment_intent_id`, `amount` (integer, pence/cents), `reason`, `status`, `stripe_refund_id`, `trace_id`, `created_at`.
State Machine	`Refund`: `Created → Pending → Succeeded / Failed`
Invariant Enforcement	`PaymentService.create_refund()` sums all non-failed refunds for the parent `PaymentIntent` and rejects if `sum + requested > amount_captured`.
Stripe Integration	`PaymentService` calls `stripe.Refund.create()` synchronously to get the initial status, then transitions to `Pending`.
Webhook Handling	A Celery task processes `refund.updated` / `refund.succeeded` / `refund.failed` events from Stripe, advancing the `Refund` state machine.
Idempotency	Each `Refund` carries a unique `trace_id` (UUID4), passed as Stripe's `idempotency_key`.
Monetary Values	All amounts stored and transmitted as integers. No floats anywhere in the pipeline.
Notifications	On terminal state (`Succeeded` / `Failed`), a Celery task dispatches a confirmation email (PII redacted at service boundary).
Access Pattern	Views call `PaymentService.create_refund()` — no direct DB access from the view layer.

3. Logic Flow

sequenceDiagram
    participant V as RefundView
    participant PS as PaymentService
    participant DB as PostgreSQL
    participant S as Stripe API
    participant C as Celery Worker

    V->>PS: create_refund(payment_intent_id, amount, reason)
    PS->>DB: Fetch PaymentIntent (status=Succeeded)
    PS->>DB: SUM(refunds.amount) WHERE status != Failed
    PS->>PS: Assert sum + amount ≤ amount_captured
    PS->>DB: INSERT Refund (status=Created, trace_id=uuid4)
    PS->>S: stripe.Refund.create(amount, idempotency_key=trace_id)
    S-->>PS: Refund object (pending)
    PS->>DB: UPDATE Refund (status=Pending, stripe_refund_id)
    PS-->>V: RefundResponse(id, status=Pending)

    Note over S,C: Async webhook flow
    S->>C: refund.succeeded / refund.failed webhook
    C->>DB: UPDATE Refund (status=Succeeded|Failed)
    C->>C: Dispatch confirmation email (PII redacted)

4. Context Patch

The following updates to ai/context/DOMAIN_MODEL.md are proposed to formalise the Refund entity:

 Entity: Refund
 Rule: Refund cannot exceed amount_captured.
+
+State Machine:
+ - Created → Pending → Succeeded/Failed
+
+Invariant: Must possess a unique trace_id (idempotency key) before Stripe transmission.
+Invariant: SUM of non-failed refunds for a PaymentIntent must not exceed amount_captured.
+Association: Refund belongs to PaymentIntent. PaymentIntent status must be 'Succeeded' to allow refund creation.

No changes required to BOUNDARIES.md or SYSTEM_FACTS.md — the existing constraints already cover the design.

5. Flags & Risks

#	Item
1	Stripe webhook signing must be verified to prevent forged refund state transitions.
2	Race condition on concurrent refund requests for the same `PaymentIntent` — enforce with `SELECT ... FOR UPDATE` on the `PaymentIntent` row during invariant check.
3	The domain model states `Succeeded` is terminal and cannot revert for `PaymentIntent`. This design does not revert the `PaymentIntent` status on refund — it stays `Succeeded` while the child `Refund` entity tracks the money movement.

Next step: If approved, I'll hand over to the Developer role for implementation and context patching.

Step 2: The Implementation Phase

User

Approved. Developer, implement the service.

Implementation complete. Here's the summary of files created:

File	Purpose
`payments/models.py`	`PaymentIntent` and `Refund` Django models with state machines, `BigIntegerField` for monetary values, UUID `trace_id`
`payments/exceptions.py`	Domain-specific exceptions: `PaymentIntentNotFound`, `PaymentIntentNotRefundable`, `RefundAmountExceedsCaptured`, `RefundGatewayError`
`payments/services.py`	`PaymentService.create_refund()` with `SELECT FOR UPDATE` race-condition protection, invariant enforcement, Stripe call with idempotency key; `process_refund_webhook()` for terminal state transitions
`payments/tasks.py`	Celery tasks: `process_refund_webhook` (retries 3x) and `send_refund_notification`
`payments/views.py`	`CreateRefundView` (POST, delegates to service layer) and `StripeRefundWebhookView` (signature verification, dispatches to Celery)
`payments/urls.py`	Routes: `/refunds/` and `/webhooks/stripe/refund/`
`payments/tests.py`	13 unit tests covering full/partial refund, over-refund rejection, non-succeeded rejection, Stripe failure handling, failed-refund exclusion from sum, idempotency key propagation, and webhook state transitions

Key design decisions applied:

No floats — all monetary fields use BigIntegerField
No comments — descriptive names per style guide
Service layer — views never touch the DB directly
SELECT ... FOR UPDATE — prevents race conditions on concurrent refund requests
Idempotency — trace_id passed as Stripe idempotency_key
Webhook signing — stripe.Webhook.construct_event verifies signatures
Terminal state guard — webhook handler won't revert Succeeded/Failed refunds

The Context Patch: Keeping Memory Alive

A particularly useful outcome of this weekend’s research was the Context Patch. In most AI workflows, knowledge is ephemeral; once the chat window is closed, the reasoning behind a decision vanishes. This framework treats knowledge as a live asset instead.

At the end of a task, the AI identifies what has changed — perhaps a new state was added to a lifecycle or a service boundary was shifted — and proposes a Context Patch for the ai/context/ directory.

This creates a simple feedback loop:

Interaction: You define a feature or bug.
Implementation: The AI roles execute the work.
Capture: The AI suggests updates to keep your context files in sync with the latest changes.

By surfacing these updates automatically, the AI helps prevent the usual documentation decay that happens in fast-moving projects. Knowledge evolves alongside the code, rather than being left behind in a forgotten chat history.

Join the Conversation

I’m sharing this research as a series on LinkedIn to gather feedback:

Do you separate "Design" from "Coding" when working with AI?
Would having your AI generate a sequence diagram first help you spot logic errors?
How are you ensuring your AI's memory stays updated as your code evolves?
What’s the one "STOP rule" you’d enforce on AI-generated code in your team? (e.g., “no new dependencies”, “no direct DB access from views”, “no PII in logs”, “must include tests”)

Next Step: Deep Domain Expertise

With memory and roles in place, the next challenge is standards. In the next post, we’ll look at how to embed specific engineering standards and observability patterns into the framework, ensuring the AI produces production-ready solutions by default.