One of the biggest frustrations when working with AI is the "Average Code" problem. Because LLMs are trained on vast public datasets, their default behaviour is to produce the most statistically common solution — not necessarily the one aligned with your specific architecture.
In a high-stakes environment like a payment system, ‘average’ simply isn’t good enough. We don’t just need Python; we need our team’s Python — code that reflects established programming practices, observability safeguards, and project-specific invariants.
The aim of this research is to reduce this generic drift by providing the AI with a structured Source of Truth within the ai/ directory. By defining our standards, we ensure the AI's output is production-aligned by default.
The ai/ Directory: What Lives Where?
To avoid mixing concerns, I split the repository guidance into three layers:
- ai/context/ — system facts, domain truth, and hard boundaries (what must not be violated)
- ai/roles/ — operating procedure (how the AI behaves during planning vs implementation)
- ai/standards/ — engineering standards (how code should look and be engineered)
In practice, roles and standards are reusable across repos — context is the part that’s project-specific.
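One possible layout for this split looks like the tree below. The standards files match the hierarchy discussed later; the context and role file names are illustrative.

```
ai/
├── context/
│   ├── SYSTEM_FACTS.md          # infrastructure & integration truths
│   └── DOMAIN_MODEL.md          # business entities & invariants
├── roles/
│   ├── architect.md             # planning behaviour
│   └── developer.md             # implementation behaviour
└── standards/
    ├── languages/python.md
    ├── frameworks/django.md
    └── project/observability.md
```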
Defining the Standard
To move beyond generic output, we need a persistent quality reference that the AI can't ignore. Instead of repeating style rules in every prompt, we define them clearly within the repository.
When the developer role reads these files, the output shifts from generic to standard-aligned. The AI no longer has to guess our preferences; the expectations are built into its environment.
Scaling with a Standards Hierarchy
As a project grows, a single file can become a bottleneck. To keep the AI focused, we can break our standards down into a modular directory structure. This allows the AI to load only the specific context required for the task at hand.
Note: This is a simplified demo for the purpose of this research. In a production environment, these files would be significantly more comprehensive.
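Selective loading can be sketched in a few lines. The mapping and helper below are hypothetical (the file paths mirror the demo hierarchy; real projects would define their own), but they show the core idea: resolve only the standards files a task actually needs.

```python
from pathlib import Path

# Hypothetical mapping from task tags to the standards files that apply.
# Paths mirror the demo hierarchy shown in this post.
STANDARDS_MAP = {
    "python": "ai/standards/languages/python.md",
    "django": "ai/standards/frameworks/django.md",
    "observability": "ai/standards/project/observability.md",
}


def select_standards(task_tags: list[str]) -> list[str]:
    """Return only the standards files relevant to the task at hand."""
    return [STANDARDS_MAP[tag] for tag in task_tags if tag in STANDARDS_MAP]


def build_context(task_tags: list[str], root: Path = Path(".")) -> str:
    """Concatenate the selected files into one context block for the prompt."""
    parts = []
    for rel in select_standards(task_tags):
        path = root / rel
        if path.exists():
            parts.append(f"## {rel}\n{path.read_text()}")
    return "\n\n".join(parts)
```

A refund task tagged `["python", "django"]` would pull in the language and framework files while leaving unrelated standards out of the context window.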
Visualising the Standards Filter
By introducing a modular hierarchy of standards, we stop the AI from drifting into "Internet Averages". Instead, we force it to pass every decision through a series of project-specific filters.
```mermaid
graph TD
    %% Input
    Task((Generic Task)) --> AI{AI Specialist Mindset}

    %% The Standards Filter Stack
    subgraph Standards["📂 ai/standards/"]
        direction TB
        S["📂 ai/standards/"]
        Lang["📄 languages/python.md\n(Type Hints, Async Rules)"]
        Frame["📄 frameworks/django.md\n(Service Layer, Atomic Transactions)"]
        Project["📄 project/observability.md\n(Logging, Trace IDs)"]
        S --> Lang
        S --> Frame
        S --> Project
    end

    %% Grounding (use a real node, not the subgraph label)
    S -.->|Injects Constraints| AI

    %% Output
    AI -->|High-Confidence| Result["📦 Standard-Aligned Code\n(Production-Ready)"]

    %% Styling
    style Task fill:#f5f5f5,stroke:#9e9e9e
    style AI fill:#fff,stroke:#333,stroke-width:2px
    style Standards fill:#f0f7ff,stroke:#0052cc,stroke-width:2px
    style Result fill:#fff7e6,stroke:#ffa940,stroke-width:2px
```
The Difference: From Generic to Standard-Aligned
The impact of these files isn't just a 'nice to have' — it fundamentally alters the AI’s decision-making process.
Without this context, an AI model acts as a generalist. When asked to build a payment feature, it might suggest putting logic in a Django View because that is what most tutorials on the internet do. It's correct code, but it's the wrong architecture for a scalable system.
By introducing the Engineering Standards, we shift the AI into a specialist mindset. It no longer offers the most statistically likely answer; it offers the answer that is compliant with your repository.
Generic AI Output:

User
Implement the process_refund service.

AI
```python
def refund(id, amt):
    # Just creates a record, no safety or type hints
    r = Refund.objects.create(intent_id=id, amount=amt)
    return r
```

Standard-Aligned Output:
User
Developer, implement the process_refund service.

AI
```python
import uuid

import stripe
from django.db import transaction
from django.db.models import Sum

# Project-level imports (models, exceptions, logger) omitted for brevity.


@transaction.atomic
def create_refund(payment_intent_id: int, amount: int) -> Refund:
    # 1. Race-condition protection (from Engineering Standards)
    intent = PaymentIntent.objects.select_for_update().get(id=payment_intent_id)

    # 2. Invariant enforcement (from Domain Model)
    already_refunded = Refund.objects.filter(payment_intent=intent).exclude(
        status="FAILED"
    ).aggregate(total=Sum("amount"))["total"] or 0
    if already_refunded + amount > intent.amount_captured:
        raise RefundAmountExceedsCaptured("Refund limit reached.")

    # 3. Observability & Idempotency (from System Facts)
    trace_id = uuid.uuid4()
    logger.info("Initiating refund", extra={"trace_id": str(trace_id)})
    stripe.Refund.create(
        amount=amount,
        idempotency_key=str(trace_id)
    )
    # ...
```

The difference is not model intelligence — it is contextual guidance.
This creates three immediate benefits:
- Consistency: Every feature looks like it was written by the same senior engineer.
- Reduced Review Time: You spend less time correcting basic architectural mistakes.
- Architectural Guardrails: The AI becomes self-policing, flagging when a request might lead to a security risk.
Enforcing Business Invariants
Beyond technical style, we must also define the Invariants — the immutable business rules that govern the system's integrity. These are the critical constraints that, if violated, could lead to financial or data disasters.
In our refund demo, the invariant is critical: A refund must never exceed the original amount_captured. By documenting this as a primary rule in ai/context/DOMAIN_MODEL.md, we change the workflow:
- Proactive Detection: The architect role is now trained to cross-reference every new proposal against these invariants.
- The Planning Gate: If a user asks for a feature that inadvertently allows an over-refund, the architect catches the logic error during the planning phase — long before the developer role even starts writing code.
- Knowledge Persistence: This rule doesn't just live in a developer's mind; it is a permanent fixture of the AI's environment.
The shift is fundamental: We are moving from catching bugs at runtime to catching architectural violations at the design stage. It ensures the AI acts as an automated validator for your business logic.
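The invariant from the demo can also be expressed as a pure, framework-free check, which makes it easy to unit-test and to reference from ai/context/DOMAIN_MODEL.md. This is a minimal sketch; the function name and integer minor-unit amounts (e.g. cents) are assumptions.

```python
class RefundAmountExceedsCaptured(Exception):
    """Raised when a refund would push the total past amount_captured."""


def validate_refund(
    amount_captured: int, already_refunded: int, refund_amount: int
) -> None:
    """Enforce the invariant: total refunds must never exceed amount_captured.

    All amounts are in minor units (e.g. cents).
    """
    if refund_amount <= 0:
        raise ValueError("Refund amount must be positive.")
    if already_refunded + refund_amount > amount_captured:
        raise RefundAmountExceedsCaptured("Refund limit reached.")
```

A refund of 5,000 against 10,000 captured with 4,000 already refunded passes; 7,000 in the same state raises RefundAmountExceedsCaptured, exactly the failure mode the architect role should catch at planning time.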
Why This Matters
Traditionally, architecture decisions live in wikis that are easy to ignore. When standards become machine-readable context within your repo, those decisions begin influencing implementation directly. Architecture stops being passive documentation and becomes an active participant in the development workflow.
Join the Conversation
I’m sharing this research as a series on LinkedIn to gather feedback:
- How do you document invariants today — in code, Jira, or a wiki?
- Have you seen your AI drift into “internet-average” patterns for your stack?
- If you could encode one standard in ai/standards tomorrow, what would it be?
- What’s one invariant you’ve had to re-explain more than once to an AI?
Next Step: Preserving Momentum
We now have memory, roles, and standards. In the next post, we’ll see how this all comes together to create a Jira-to-Code Workflow for turning requirements into production-ready code while maintaining state and continuity over long tasks.