Method

Five-step commodity validation sequence

Most AI use cases in financial services can be addressed with commodity models and prompt engineering. This process establishes whether a use case is truly in that category — or whether proprietary investment is warranted.

Step 01

Use case framing

Define the task precisely: input, required output, quality threshold, latency requirement, and acceptable error rate. Ambiguous use cases cannot be evaluated fairly against any model.
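The framing above can be captured as a simple structured record. This is an illustrative sketch, not a prescribed format — the field names and the `kyc-document-summary` example are assumptions for demonstration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UseCaseSpec:
    """Precise framing of a single AI use case (Step 01)."""
    name: str
    input_description: str    # what the model receives
    output_description: str   # what the model must produce
    quality_threshold: float  # minimum acceptable quality score, 0.0-1.0
    max_latency_ms: int       # hard latency requirement
    max_error_rate: float     # acceptable fraction of failed outputs

# Hypothetical example framing
kyc_summary = UseCaseSpec(
    name="kyc-document-summary",
    input_description="OCR text of a scanned KYC onboarding document",
    output_description="Five-bullet structured risk summary",
    quality_threshold=0.90,
    max_latency_ms=2000,
    max_error_rate=0.02,
)
```

Writing the spec down in this form forces the ambiguity out: if any field cannot be filled in, the use case is not yet ready for Step 02.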

Step 02

Commodity model benchmark

Run the use case against available commodity models with baseline prompting. Measure output quality against the defined threshold using representative real-world inputs.
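A minimal benchmark harness might look like the following. The model and scorer here are toy stand-ins (a trivial function and exact-match scoring) purely to show the shape of the loop; in practice `model_fn` would call the commodity model under test and `score_fn` would apply the use case's quality rubric:

```python
from statistics import mean

def benchmark(model_fn, samples, score_fn, threshold):
    """Score a model's outputs over representative samples (Step 02).

    model_fn: callable taking an input string, returning an output string
    samples:  list of {"input": ..., "expected": ...} dicts
    score_fn: callable (output, expected) -> score in [0.0, 1.0]
    """
    scores = [score_fn(model_fn(s["input"]), s["expected"]) for s in samples]
    mean_score = mean(scores)
    return {
        "mean_score": mean_score,
        "meets_threshold": mean_score >= threshold,
        "n_samples": len(scores),
    }

# Toy stand-ins for demonstration only
samples = [
    {"input": "2+2", "expected": "4"},
    {"input": "3+3", "expected": "6"},
]
result = benchmark(
    model_fn=lambda x: str(eval(x)),                    # stand-in "model"
    samples=samples,
    score_fn=lambda out, exp: 1.0 if out == exp else 0.0,
    threshold=0.90,
)
```

The key design point is that the threshold comes from the Step 01 spec, not from the benchmark — the harness only reports whether the defined bar was met.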

Step 03

Prompt engineering ceiling test

Attempt to close the quality gap through structured prompting, few-shot examples, and chain-of-thought techniques. Establish the practical ceiling achievable without model customization.
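The ceiling test is, mechanically, a loop over prompt variants that keeps a per-variant log and the best observed score. The sketch below uses canned scores in place of real model runs — the variant names and numbers are illustrative assumptions:

```python
def ceiling_test(variants, run_fn, score_fn):
    """Establish the practical quality ceiling across prompt variants (Step 03).

    variants: list of (name, prompt_template) pairs
    run_fn:   callable (prompt_template) -> model output
    score_fn: callable (output) -> score in [0.0, 1.0]
    Returns the best-scoring entry and the full iteration log.
    """
    log = []
    for name, template in variants:
        log.append({"variant": name, "score": score_fn(run_fn(template))})
    best = max(log, key=lambda entry: entry["score"])
    return best, log

# Canned scores standing in for real benchmark runs
canned = {"baseline": 0.72, "few-shot": 0.81, "chain-of-thought": 0.84}
best, log = ceiling_test(
    variants=[(name, name) for name in canned],
    run_fn=lambda template: template,   # stand-in: echoes the variant name
    score_fn=lambda out: canned[out],   # stand-in: looks up a canned score
)
```

The log doubles as the prompt engineering log listed under Outputs: every variant, its score, and the delta per technique fall out of the loop for free.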

Step 04

Gap and risk analysis

Quantify the remaining quality gap and assess whether it represents a material business risk — or an acceptable trade-off given the time and cost of custom builds.
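Quantifying the gap is simple arithmetic; the judgment is in the materiality cut-off. In this sketch the 0.05 cut-off is an assumed placeholder — in practice it would be set by the business owner for the use case:

```python
def gap_analysis(ceiling_score, threshold, material_gap=0.05):
    """Quantify the remaining quality gap (Step 04).

    ceiling_score: best score achieved in the Step 03 ceiling test
    threshold:     required quality from the Step 01 spec
    material_gap:  assumed cut-off above which the gap is a material risk
    """
    gap = max(0.0, threshold - ceiling_score)
    return {
        "gap": round(gap, 4),
        "meets_threshold": gap == 0.0,
        "material_risk": gap > material_gap,
    }

# e.g. ceiling of 0.84 against a 0.90 threshold
analysis = gap_analysis(ceiling_score=0.84, threshold=0.90)
```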

Step 05

Proceed or escalate decision

Make a documented decision: deploy the commodity solution, continue with prompt optimization, or escalate to proprietary model investment with a defined justification.
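The three outcomes can be expressed as an explicit decision rule, which is useful for keeping the logic auditable. The branch order below is one reasonable reading of the process, not a mandated policy:

```python
def decide(gap, material_risk, ceiling_reached):
    """Map Step 04 findings onto the three Step 05 outcomes.

    gap:             remaining quality gap (0.0 means threshold met)
    material_risk:   whether the gap is a material business risk
    ceiling_reached: whether prompt optimization has been exhausted
    """
    if gap == 0.0:
        return "deploy-commodity"
    if not ceiling_reached:
        return "continue-prompt-optimization"
    if material_risk:
        return "escalate-to-proprietary"
    return "deploy-commodity"  # gap accepted as a trade-off
```

Whatever branch fires, the inputs to the function are exactly the evidence that belongs in the documented justification.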

Outputs

Artifacts produced by the process

Model evaluation report

Benchmark results for each commodity model tested, measured against the use case's defined quality threshold.


  • Model and prompt configuration tested
  • Quality scores across representative samples
  • Failure mode categorization

Prompt engineering log

Documentation of optimization attempts, results, and the practical ceiling achieved.

  • Prompt variants and iteration history
  • Performance delta per technique
  • Remaining quality gap quantified

Build vs. deploy recommendation

Documented decision with supporting evidence for proceeding with the commodity solution or escalating to a custom build.

  • Risk assessment of quality gap
  • Cost-benefit framing for custom build
  • Decision owner sign-off

Deployment specification

If proceeding with commodity: configuration, monitoring plan, and escalation triggers.

  • Prompt and model version locked
  • Performance monitoring thresholds
  • Trigger criteria for future re-evaluation
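A deployment specification of this kind can be kept as a small, version-controlled config. Every name and number below is an assumed placeholder showing the shape such a spec might take:

```python
deployment_spec = {
    # Locked configuration (Step 05 outcome: deploy commodity)
    "model": "example-commodity-model-v1",   # assumed model identifier
    "prompt_version": "v3-chain-of-thought",
    # Monitoring thresholds derived from the Step 01 spec
    "monitoring": {
        "min_rolling_quality": 0.88,   # alert below this rolling mean
        "max_error_rate": 0.02,
        "max_p95_latency_ms": 2000,
    },
    # Trigger criteria for re-running the validation sequence
    "reevaluation_triggers": [
        "rolling quality below minimum for 3 consecutive days",
        "provider deprecates or silently upgrades the model",
        "business raises the use case quality threshold",
    ],
}
```

Locking prompt and model versions together matters because commodity providers can change model behavior underneath an unchanged API, which is exactly what the re-evaluation triggers are there to catch.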

Engagement Cadence

How the process runs in practice

Typical timeline: 1–2 weeks

  • Days 1–3: use case framing and commodity model benchmark setup
  • Days 4–7: benchmark execution and prompt engineering optimization
  • Days 8–10: gap analysis, decision documentation, and deployment or escalation planning

Output: a clear, evidence-based decision on whether commodity models are sufficient — and if not, exactly what the custom build would need to achieve to be worth the investment.