| Dave | 4 min read

Planning AI Features When the Tech Changes Every Week

AI capabilities shift constantly, but your product requirements don't have to. Separate the "what" from the "how" to write stable AI feature specs.

The Moving Target Problem

Last year I watched a team spend three weeks writing a detailed spec for an AI summarization feature. The spec included the model they’d use (GPT-4), the prompt architecture, the token budget, the fallback logic for rate limits, and the caching strategy.

Two weeks into development, a new model dropped that was faster, cheaper, and better at summarization. The team rewrote half the spec. A month later, the API pricing changed. More rewrites.

They spent more time updating the spec than building the feature.

This is the unique challenge of planning AI features. The underlying technology moves so fast that any spec tied to a specific model, API, or framework risks being outdated before you ship.

The Mistake: Specifying the “How” Too Early

When teams write requirements for AI features, they naturally gravitate toward implementation details. It makes sense. The model choice feels like the most important decision. The prompt design feels like the core of the feature.

But model choice and prompt design are implementation details. They’re the “how.” And the “how” is the part that keeps changing.

Here’s what a typical AI feature spec looks like:

“Use GPT-4 to summarize customer support tickets into 2-3 sentences. Prompt should include the full ticket text plus the last 5 messages. Cache responses for 24 hours. Cost target: under £0.02 per summary.”

Every part of this spec is coupled to a specific model’s capabilities and pricing. When the model changes, the prompt strategy, the cost assumptions, and the caching logic all change with it.

Separate the “What” From the “How”

The fix is straightforward: write your requirements at the problem level, not the implementation level.

The “what” is stable. Users have a problem. They need a solution. There are constraints around quality, speed, and cost. Those things don’t change when a new model launches.

The “how” is volatile. Which model, which prompt structure, which API, which framework. Those are engineering decisions that should stay flexible.

Here’s the same feature spec rewritten at the problem level:

User problem: Support agents spend 3-4 minutes reading each ticket before responding. They need a quick summary to triage faster.

Success criteria: Agents can accurately triage a ticket within 30 seconds of opening it. Summary captures the core issue and customer sentiment.

Quality bar: Summaries should be accurate enough that agents don’t need to read the full ticket for 80% of cases. Factual errors are unacceptable.

Constraints: Latency under 2 seconds. Cost under £0.03 per ticket. Must work for tickets up to 10,000 words.

Not specified: Model choice, prompt architecture, caching strategy. Those are implementation details for the engineering team to determine based on current options.

What Stays Stable, What Stays Flexible

Think of your AI feature spec in two layers:

The stable layer (write this down, commit to it):

  • What user problem does this solve?
  • Who is the user and what’s their workflow?
  • What does a good output look like? (Include examples.)
  • What does a bad output look like? (Include failure cases.)
  • What are the hard constraints? (Latency, cost, accuracy thresholds.)
  • How will we measure success?

The flexible layer (document decisions, expect to revisit):

  • Model selection
  • Prompt design and chain-of-thought architecture
  • Fine-tuning vs. few-shot vs. zero-shot approach
  • Specific API provider
  • Caching and optimization strategy

The stable layer is your product requirement. The flexible layer is your technical approach. Keep them in separate documents, or at least separate sections, so updating one doesn’t require rewriting the other.
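To make the split concrete, here is a minimal sketch of the two layers as separate structures. All names and values are illustrative (loosely based on the summarization example above), not a prescribed schema:

```python
from dataclasses import dataclass

# Stable layer: the product requirement. Commit to this; it rarely changes.
@dataclass(frozen=True)
class SummaryRequirement:
    user_problem: str
    max_latency_seconds: float
    max_cost_per_ticket: float    # in pounds
    min_triage_accuracy: float    # fraction of tickets triaged without a full read

# Flexible layer: the technical approach. Expect to revisit as models change.
@dataclass
class SummaryImplementation:
    model: str
    prompt_template: str
    cache_ttl_hours: int

requirement = SummaryRequirement(
    user_problem="Agents need a quick summary to triage tickets faster",
    max_latency_seconds=2.0,
    max_cost_per_ticket=0.03,
    min_triage_accuracy=0.8,
)

implementation = SummaryImplementation(
    model="example-model-v1",   # hypothetical model name; swap freely
    prompt_template="Summarize this support ticket: {ticket_text}",
    cache_ttl_hours=24,
)
```

Freezing the requirement while leaving the implementation mutable mirrors the point: a model swap rewrites the second object and leaves the first untouched.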

Practical Advice for AI Product Specs

Write evaluation criteria early. Before you pick a model, define what “good enough” looks like with concrete examples. This becomes your benchmark for testing any implementation, current or future.

Include example inputs and expected outputs. Five or ten real examples of what goes in and what should come out. These are worth more than paragraphs of description, and they survive any technology change.
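A golden set like this can double as an automated benchmark. Below is one possible sketch: the examples, keyword checks, and the identity "summarizer" are all made up for illustration; a real evaluation would use actual tickets and a stricter accuracy check:

```python
# Hypothetical golden examples: real tickets in, expected core issues out.
EXAMPLES = [
    {"input": "Customer cannot login after password reset; sees error 403.",
     "expected_keywords": ["login", "password reset"]},
    {"input": "Billing problem: customer charged twice for the March invoice.",
     "expected_keywords": ["billing", "charged twice"]},
]

def passes(summary: str, expected_keywords: list[str]) -> bool:
    """Crude check: does the summary mention every part of the core issue?"""
    text = summary.lower()
    return all(kw in text for kw in expected_keywords)

def evaluate(summarize) -> float:
    """Run any summarizer over the golden set and return its pass rate."""
    hits = sum(passes(summarize(ex["input"]), ex["expected_keywords"])
               for ex in EXAMPLES)
    return hits / len(EXAMPLES)

# Works against any implementation, current or future.
# The identity function trivially keeps every keyword, so it scores 1.0:
print(evaluate(lambda ticket: ticket))
```

Because `evaluate` only depends on the examples, the same benchmark survives every model swap.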

Set cost and latency budgets, not model mandates. “Under 2 seconds and under £0.03” is a better requirement than “use GPT-4 Turbo.” The first lets your team pick the best available option. The second locks them into today’s landscape.

Plan for the model to change. Build abstraction into the requirement itself. “The summarization service should be model-agnostic” is a valid product requirement if you expect to swap models as better options emerge.
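One way to honor that requirement in code is a thin interface between the product and whatever model is current. This is a sketch under assumed names (`Summarizer`, `TruncatingSummarizer`, `triage_view` are all invented for illustration):

```python
from typing import Protocol

class Summarizer(Protocol):
    """Model-agnostic interface: the product depends on this,
    not on any vendor SDK."""
    def summarize(self, ticket_text: str) -> str: ...

class TruncatingSummarizer:
    """Trivial stand-in; a real implementation would call
    whichever model currently meets the budget."""
    def summarize(self, ticket_text: str) -> str:
        return ticket_text[:200]

def triage_view(summarizer: Summarizer, ticket_text: str) -> str:
    # Call sites see only the interface, so swapping models
    # touches one class, not the whole product.
    return summarizer.summarize(ticket_text)

print(triage_view(TruncatingSummarizer(),
                  "Customer reports a duplicate charge on their invoice."))
```

Swapping models then means adding one new class that satisfies `Summarizer`, and nothing downstream changes.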

The technology will keep moving. Your understanding of the user’s problem shouldn’t have to move with it.

Stop writing PRDs from scratch

Try Projan free for 14 days. Beta users get 50% off for life.

Start Free Trial