
Automatic AI Model Selection


Every task gets the right model. Every model earns its place through continuous evaluation.

20+ AI Models Brokered
Routed automatically per task
Flat Pricing
Unlimited Usage: Pay one price regardless of volume

GreyMatter brokers across 20+ AI models and uses automated evaluation to route each task to the best-performing model for that specific job. The routing pipeline evaluates every request against three criteria (cost, speed, and accuracy) and, within milliseconds, dispatches it to the best-suited model.

How It Works

The Multi-Stage Routing Pipeline

Every request flows through a multi-stage pipeline that decides, in milliseconds, which model handles it. The pipeline is a hybrid system by design: deterministic checks handle anything with a clear right answer, a lightweight classifier handles fuzzy semantic judgments, and a scoring function combines them. This keeps every routing decision debuggable and auditable post-hoc, which matters when each decision has security consequences.

01
Request Parsing

The pipeline separates the user's instruction from attachments and tenant constraints: data residency requirements, allowed providers, budget ceilings.

02
Hard Constraint Elimination

A deterministic config lookup removes models that cannot serve the request. Filters include context-window length, tool-call support, structured-output capability, vision/file support, and region rules. This is fast, binary, and non-negotiable.
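The elimination step described above can be sketched as a pure filter over a capability registry. This is a minimal illustration under stated assumptions: the `ModelSpec` and `Request` fields, the model names, and the numbers are hypothetical and do not reflect GreyMatter's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelSpec:
    # Hypothetical capability record; field names are assumptions.
    name: str
    context_window: int
    supports_tools: bool
    supports_structured_output: bool
    supports_vision: bool
    regions: frozenset

@dataclass(frozen=True)
class Request:
    # Hypothetical parsed request; only the fields the filter needs.
    prompt_tokens: int
    needs_tools: bool = False
    needs_structured_output: bool = False
    needs_vision: bool = False
    region: str = "us"

def eligible(model: ModelSpec, req: Request) -> bool:
    """Binary check: a model either can serve the request or it cannot."""
    return (
        req.prompt_tokens <= model.context_window
        and (model.supports_tools or not req.needs_tools)
        and (model.supports_structured_output or not req.needs_structured_output)
        and (model.supports_vision or not req.needs_vision)
        and req.region in model.regions
    )

def eliminate(models, req):
    """Return only the models that pass every hard constraint."""
    return [m for m in models if eligible(m, req)]

# Illustrative registry with made-up entries.
MODELS = [
    ModelSpec("small-a", 16_000, True, True, False, frozenset({"us", "eu"})),
    ModelSpec("large-b", 200_000, True, True, True, frozenset({"us"})),
]

req = Request(prompt_tokens=50_000, needs_tools=True, region="us")
survivors = [m.name for m in eliminate(MODELS, req)]
print(survivors)  # ['large-b']: small-a's context window is too short
```

Because the check involves no scoring or learned component, the same request always yields the same survivor set, which is what makes this stage easy to audit.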

03
Agent Profile Matching

Each of GreyMatter's agents carries a precomputed profile: agent type, tool families it uses, expected output format, and preferred provider family. This narrows the candidate pool before any semantic analysis begins.
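As a rough sketch of profile matching, the snippet below narrows a candidate pool using a precomputed agent profile. The `AgentProfile` fields mirror the attributes listed above, but the catalog layout, the model names, and the choice to list preferred provider families first (rather than exclude the others) are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentProfile:
    # Hypothetical precomputed profile for one agent.
    agent_type: str
    tool_families: frozenset
    output_format: str
    preferred_provider_families: tuple

# Hypothetical catalog: model -> (provider family, tool families, output formats)
CATALOG = {
    "alpha-mini": ("alpha", {"search", "siem"}, {"json", "text"}),
    "alpha-pro": ("alpha", {"search", "siem", "code"}, {"json", "text"}),
    "beta-large": ("beta", {"search"}, {"text"}),
}

def match_profile(profile, candidates):
    """Keep candidates that cover the agent's tools and output format,
    listing models from preferred provider families first."""
    fits = [
        name for name in candidates
        if profile.tool_families <= CATALOG[name][1]
        and profile.output_format in CATALOG[name][2]
    ]
    preferred = [n for n in fits if CATALOG[n][0] in profile.preferred_provider_families]
    others = [n for n in fits if n not in preferred]
    return preferred + others

profile = AgentProfile("ioc_extraction", frozenset({"siem"}), "json", ("alpha",))
pool = match_profile(profile, list(CATALOG))
print(pool)  # ['alpha-mini', 'alpha-pro']: beta-large lacks the siem tool family
```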

04
Semantic Classification

A lightweight, specialized classifier reads the user's instruction (not the full prompt) and predicts:

Task type: Extraction, summarization, code generation, open-ended reasoning, narrative analysis
Complexity profile: Reasoning depth required, domain knowledge required, constraint density

05
Scoring

Each model that survives the constraint filters is ranked on a weighted trade-off across expected quality, cost, and latency. The weighting shifts with the task: high-volume extraction prioritizes cost and speed, while complex incident correlation prioritizes quality, even at higher cost.

The highest-scoring model for this specific request wins.
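One way to picture a per-task weighted trade-off is a linear score over normalized quality, cost, and latency. The sketch below is an assumption-laden illustration: the linear form, the weight values, and the candidate numbers are all hypothetical, and the source does not publish the actual scoring formula.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    quality: float   # expected quality, 0..1, higher is better
    cost: float      # normalized cost, 0..1, lower is better
    latency: float   # normalized latency, 0..1, lower is better

# Hypothetical per-task weights as (quality, cost, latency); each row sums to 1.
WEIGHTS = {
    "extraction": (0.2, 0.5, 0.3),
    "incident_correlation": (0.8, 0.1, 0.1),
}

def score(c: Candidate, task_type: str) -> float:
    """Weighted sum; cost and latency are inverted so that higher is better."""
    wq, wc, wl = WEIGHTS[task_type]
    return wq * c.quality + wc * (1 - c.cost) + wl * (1 - c.latency)

def pick(candidates, task_type):
    """Return the highest-scoring candidate for this task type."""
    return max(candidates, key=lambda c: score(c, task_type))

CANDIDATES = [
    Candidate("light", quality=0.70, cost=0.10, latency=0.10),
    Candidate("premium", quality=0.95, cost=0.90, latency=0.60),
]

print(pick(CANDIDATES, "extraction").name)            # light
print(pick(CANDIDATES, "incident_correlation").name)  # premium
```

With these illustrative numbers, the same two candidates swap places purely because the weights change with the task type, which is the behavior the text describes.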
06
Dispatch

The top-scoring model receives the request. The routing decision, the model selected, and the eventual outcome are all logged for continuous improvement.
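A routing log entry of the kind described here could look like the record below. The field names and the JSON layout are hypothetical; the point is only that the decision, the selected model, and the later outcome live in one auditable record.

```python
import json
import time
import uuid

def make_decision_record(request_id, task_type, scores, selected):
    """Build an audit record for one routing decision."""
    return {
        "decision_id": str(uuid.uuid4()),
        "request_id": request_id,
        "task_type": task_type,
        "scores": scores,             # every candidate's score, not just the winner's
        "selected_model": selected,
        "decided_at": time.time(),
        "outcome": None,              # filled in once the result is known
    }

def record_outcome(record, outcome):
    """Return a copy of the record with the eventual outcome attached."""
    return {**record, "outcome": outcome}

scores = {"light": 0.86, "premium": 0.36}   # illustrative values
selected = max(scores, key=scores.get)
rec = make_decision_record("req-123", "extraction", scores, selected)
rec = record_outcome(rec, "true_positive")
print(json.dumps(rec, sort_keys=True))
```

Keeping the losing candidates' scores in the record is a design choice in this sketch: it lets a reviewer see why a model won as well as which one did.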

Model Evaluation & Promotion Pipeline

Every model earns its routing position through standardized evaluation. No model enters production routing without meeting predefined evaluation criteria.

Fingerprinting via Probe Prompts

Each model is evaluated against a standardized set of test prompts spanning security-relevant task domains. Results produce a performance fingerprint—a numerical feature vector representing where the model excels and where it falls short. For example, some models score high on structured extraction but produce loose reasoning, while others handle ambiguous narrative analysis well but generate verbose output. The fingerprint captures these profiles quantitatively.
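To make the idea of a numerical fingerprint concrete, the sketch below averages per-domain probe scores into a fixed-order feature vector. The domain list, the 0 to 1 score scale, and the use of a simple mean are assumptions; the source does not specify how the vector is actually computed.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical probe domains, in a fixed order so vectors are comparable.
DOMAINS = ("extraction", "summarization", "code_generation", "reasoning")

def fingerprint(probe_results):
    """Average per-domain probe scores into one feature vector.

    probe_results: iterable of (domain, score) pairs with scores in 0..1.
    Domains with no probes get 0.0.
    """
    by_domain = defaultdict(list)
    for domain, s in probe_results:
        by_domain[domain].append(s)
    return [round(mean(by_domain[d]), 3) if by_domain[d] else 0.0 for d in DOMAINS]

results = [
    ("extraction", 0.9),
    ("extraction", 0.8),
    ("summarization", 0.7),
    ("reasoning", 0.4),
]
print(fingerprint(results))  # strong on extraction, weaker on reasoning
```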

Shadow Traffic

After fingerprinting, candidate models run in shadow against production traffic—processing real requests in parallel with the currently routed model. Automated scoring compares outputs against the incumbent's results on the same requests, validating that fingerprint-predicted performance holds under real workload conditions.
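The shadow pattern can be sketched in a few lines: the incumbent's answer is what the caller receives, while the candidate's answer is only scored and logged. The function names, the toy "models", and the scorer below are hypothetical stand-ins, not GreyMatter's implementation.

```python
def handle_with_shadow(request, incumbent, candidate, score_fn, log):
    """Serve the incumbent's answer; score the candidate's answer on the side."""
    live = incumbent(request)
    shadow = candidate(request)  # computed for comparison only, never returned
    log.append((score_fn(request, live), score_fn(request, shadow)))
    return live

def summarize(log):
    """Count how often the candidate beat or trailed the incumbent."""
    wins = sum(1 for inc, cand in log if cand > inc)
    losses = sum(1 for inc, cand in log if cand < inc)
    return {"requests": len(log), "candidate_wins": wins, "candidate_losses": losses}

# Toy stand-ins: "models" are plain functions, and the scorer rewards shorter output.
incumbent = lambda req: req.upper() + "!!!"
candidate = lambda req: req.upper()
score_fn = lambda req, out: -len(out)

log = []
reply = handle_with_shadow("alert 42", incumbent, candidate, score_fn, log)
print(reply)           # ALERT 42!!!
print(summarize(log))  # {'requests': 1, 'candidate_wins': 1, 'candidate_losses': 0}
```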

Scoreboard & Promotion

Results populate a scoreboard that ranks models per task type across cost, speed, and accuracy. When a candidate consistently outperforms the incumbent on a task type, an operator reviews the results and promotes the candidate into active routing.
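A per-task-type scoreboard might be represented as below. For brevity this sketch collapses cost, speed, and accuracy into a single composite score per row, which is a simplifying assumption; the model names and numbers are made up. The output only flags task types for operator review and does not promote anything itself.

```python
from collections import defaultdict

def build_scoreboard(results):
    """Rank models per task type from (task_type, model, composite_score) rows."""
    board = defaultdict(list)
    for task_type, model, s in results:
        board[task_type].append((model, s))
    return {t: sorted(rows, key=lambda r: r[1], reverse=True) for t, rows in board.items()}

def promotion_candidates(board, incumbents):
    """Task types whose top-ranked model is not the current incumbent."""
    return {t: rows[0][0] for t, rows in board.items() if rows[0][0] != incumbents.get(t)}

rows = [
    ("extraction", "light", 0.86),
    ("extraction", "new-mini", 0.90),
    ("reasoning", "premium", 0.81),
    ("reasoning", "new-mini", 0.55),
]
incumbents = {"extraction": "light", "reasoning": "premium"}

board = build_scoreboard(rows)
flagged = promotion_candidates(board, incumbents)
print(flagged)  # {'extraction': 'new-mini'}
```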

Ground Truth Feedback

Security work has a structural advantage for AI evaluation: ground truth exists. Every alert eventually resolves to true positive, false positive, or escalation. Every detection fires or doesn't. These outcomes feed directly back into model scoring, so each model's performance is measured against actual security results.
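The feedback loop can be illustrated with a small tracker that compares each model's predicted verdict with the alert's final resolution. The class name, the label strings, and the use of plain accuracy as the metric are assumptions made for this sketch.

```python
from collections import Counter

# The three resolutions named in the text; the label spellings are assumptions.
RESOLUTIONS = {"true_positive", "false_positive", "escalation"}

class OutcomeTracker:
    """Running per-model accuracy against final alert resolutions."""

    def __init__(self):
        self.correct = Counter()
        self.total = Counter()

    def record(self, model, predicted, resolved):
        if resolved not in RESOLUTIONS:
            raise ValueError(f"unknown resolution: {resolved}")
        self.total[model] += 1
        if predicted == resolved:
            self.correct[model] += 1

    def accuracy(self, model):
        n = self.total[model]
        return self.correct[model] / n if n else None

tracker = OutcomeTracker()
tracker.record("light", "false_positive", "false_positive")
tracker.record("light", "true_positive", "true_positive")
tracker.record("light", "false_positive", "escalation")
tracker.record("light", "true_positive", "true_positive")
print(tracker.accuracy("light"))  # 0.75
```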

Automatic Model Adoption and Failover

The architecture treats models as configuration, not code. Adding a new model to the evaluation pipeline requires no re-engineering: it enters fingerprinting as soon as it becomes available, progresses through shadow traffic, and is promoted into routing when it earns its position on the scoreboard.

What this means operationally:

When a stronger or more cost-efficient model emerges from any provider, GreyMatter evaluates it independently.
No re-procurement cycle, integration management, or configuration changes pushed to customers.
The defense layer improves continuously alongside AI model capabilities.

Mid-operation failover: If a model degrades or goes down during an active workflow, the pipeline automatically dispatches to the next-best model for that task type. Workflows continue without interruption or restart.
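Failover to the next-best model can be sketched as a loop over a ranked list. The exception type, the function names, and the toy `fake_call` are hypothetical; a production system would also need timeouts and degradation detection, which this sketch omits.

```python
class ModelUnavailable(Exception):
    """Raised by the (hypothetical) model client when a model cannot respond."""

def dispatch_with_failover(request, ranked_models, call):
    """Try models in ranked order and fall through to the next one on failure."""
    errors = []
    for model in ranked_models:
        try:
            return model, call(model, request)
        except ModelUnavailable as exc:
            errors.append((model, str(exc)))
    raise RuntimeError(f"all models failed: {errors}")

def fake_call(model, request):
    # Toy client: pretend the top-ranked model is down.
    if model == "premium":
        raise ModelUnavailable("timeout")
    return f"{model}:{request}"

used, result = dispatch_with_failover("correlate", ["premium", "backup"], fake_call)
print(used, result)  # backup backup:correlate
```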

Scenario: AI Model Broker + Detection Engineering Teammate

GreyMatter's six Agentic Teammates (IR, detection engineering, threat hunting, threat intel, IT, and OT) each decompose jobs into hundreds of single-task agents. Every one of those agents executes through the model broker.

Consider a detection engineering request. The Teammate breaks it into component tasks: one agent writes the detection logic while another validates coverage against known attack patterns. Each of those agents may route to a different model—the logic-writing agent to a model strong at structured code generation, the validation agent to a model strong at reasoning over pattern sets.

Flat Pricing: How Model Selection Makes It Possible

Roughly three-quarters of tasks completed by GreyMatter resolve on lightweight, inexpensive models. High-volume, well-bounded work (log normalization, event summaries, IOC extraction, field mapping) produces accurate results without frontier-model compute. The remaining 25%—complex reasoning, incident correlation, ambiguous narrative extraction—routes to premium models where the need for accuracy justifies the cost.

This per-task cost control is what makes flat, unlimited-usage pricing viable. Customers pay one price regardless of volume because GreyMatter manages model economics internally rather than passing variable AI costs through.
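The economics can be illustrated with simple blended-cost arithmetic. Only the three-quarters share comes from the text above; the per-task unit costs below are placeholder numbers in arbitrary units, chosen purely to show the shape of the calculation.

```python
def blended_cost(light_share, light_cost, premium_cost):
    """Average cost per task when a share of tasks runs on lightweight models."""
    return light_share * light_cost + (1 - light_share) * premium_cost

# Placeholder unit costs (arbitrary units, NOT real prices).
LIGHT_COST = 1.0
PREMIUM_COST = 20.0

routed = blended_cost(0.75, LIGHT_COST, PREMIUM_COST)
all_premium = blended_cost(0.0, LIGHT_COST, PREMIUM_COST)
print(routed)       # 5.75
print(all_premium)  # 20.0
```

Under these placeholder costs, routing three-quarters of tasks to lightweight models brings the average cost per task well below the all-premium baseline, which is the mechanism the text credits for making flat pricing workable.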

The pricing model is a direct consequence of routing architecture:

| | Single-Model Platforms | Customer-Choice Platforms | GreyMatter’s Approach |
| --- | --- | --- | --- |
| Model architecture | One fixed model for all tasks. | Multiple models; customer selects. | 20+ models; automatic per-task routing. |
| Pricing model | Per token / per query / per investigation. | Per token / per query (across chosen model). | Flat, unlimited usage. |
| Cost as usage scales | Linear increase with volume. | Linear increase (customer absorbs optimization burden). | Flat. Routing absorbs cost optimization internally. |
| When a better model emerges | Re-procurement or vendor dependency. | Customer re-evaluates, reconfigures, re-procures. | Automatic evaluation and promotion; no customer action. |
| Per-task optimization | None. Same model regardless of task complexity. | Manual. Customer must understand task-to-model fit. | Automatic. Scoring function matches task complexity to model capability. |
| Accuracy under task diversity | Degrades. One model handles everything from triage to complex reasoning. | Depends on customer's configuration skill. | Maintained. Each task type routes to its strongest model. |
| Cost visibility | Unpredictable; scales with investigation volume. | Unpredictable; customer manages cost-quality tradeoffs. | Predictable; one price regardless of volume. |

At a Glance

| Attribute | Detail |
| --- | --- |
| Routing architecture | Hybrid (deterministic constraints + semantic classifier + scoring function). |
| Scoring formula | Weighted trade-off across quality, cost, and latency (per-task weighting). |
| Routing overhead | Milliseconds end-to-end. |
| Failover | Automatic, mid-operation, to next-best model per task type. |
| Ground truth signal | Alert resolutions (TP/FP/escalation) feed back into model scoring. |
| Explainability | Every routing decision logged and auditable post-hoc. |
