Research direction

Meta Models

Per-category predictors trained on AIR outputs.

Researching

Meta models are category-specific surrogate models trained on structured experiment data from AIR. They turn accumulated experimental signal (architecture choices, runtime functions, training recipes, and their measured outcomes) into a predictive layer that estimates what is likely to work before the full sweep budget is committed. In effect, this is Bayesian optimization grounded in measured neural architecture search (NAS) data rather than in theoretical priors.

Predictive search policy

Meta models do not replace experiments — they reshape how experiments are prioritized. When AIR has hundreds of structured experiment records mapping configurations to performance, meta models learn which regions of the search space are likely to yield gains and which are likely to waste budget. The result is a search policy that becomes sharper with every sweep cycle.
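As a rough illustration of how a surrogate can reorder a sweep queue, the sketch below ranks candidate configurations by predicted score plus an uncertainty bonus. It is not AIR's implementation: the records, the feature encoding (depth, width), the nearest-neighbor surrogate, and the names `predict` and `rank_candidates` are all assumptions chosen to keep the example small.

```python
import math
from statistics import mean, pstdev

# Hypothetical experiment records: (depth, width in hundreds) -> measured score.
# The numbers are illustrative only, not AIR results.
RECORDS = [
    ((4, 2.0), 0.61),
    ((8, 2.0), 0.68),
    ((8, 4.0), 0.74),
    ((12, 4.0), 0.77),
    ((16, 8.0), 0.72),
]

def predict(config, records, k=3):
    """Return (mean, spread) of the k nearest recorded outcomes."""
    nearest = sorted(records, key=lambda r: math.dist(config, r[0]))[:k]
    scores = [score for _, score in nearest]
    return mean(scores), pstdev(scores)

def rank_candidates(candidates, records, explore=1.0):
    """Order candidates by predicted mean plus an exploration bonus."""
    def ucb(config):
        mu, sigma = predict(config, records)
        return mu + explore * sigma
    return sorted(candidates, key=ucb, reverse=True)

candidates = [(4, 2.0), (12, 4.0), (16, 8.0)]
ranked = rank_candidates(candidates, RECORDS)
```

Every candidate stays in the queue. The surrogate only changes the order in which experiments are run, which matches the point that predictions reprioritize experiments rather than replace them.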

Architecture predictor

Layer count, width, and structural choices

Predicts which architecture families are likely to deliver the best quality-to-budget tradeoff for a specific problem category. It starts from model depth, width, attention configuration, and embedding strategy, the structural decisions that dominate performance.

Function predictor

Runtime primitives and operator combinations

Scores proposed activation functions, normalizations, and operator combinations so low-signal variants are deprioritized before expensive runs. This is where AIR's runtime function invention feeds forward — new functions are scored before being swept.
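One way to picture this deprioritization step is a simple split of proposed functions into "sweep now" and "deferred" based on predicted score. The function names, scores, threshold, and budget below are hypothetical placeholders, and `split_queue` is an illustrative helper rather than part of AIR.

```python
# Hypothetical predicted scores for proposed runtime functions (illustrative values).
predicted = {
    "gelu_variant_a": 0.71,
    "gated_norm_b": 0.64,
    "relu_squared_c": 0.42,
    "tanh_mix_d": 0.18,
}

def split_queue(scores, min_score=0.4, budget=2):
    """Return (sweep_now, deferred).

    sweep_now holds up to `budget` names whose predicted score is at least
    `min_score`, best first. Everything else is deferred, not discarded.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    eligible = [name for name in ordered if scores[name] >= min_score]
    sweep_now = eligible[:budget]
    deferred = [name for name in ordered if name not in sweep_now]
    return sweep_now, deferred

sweep_now, deferred = split_queue(predicted)
```

Keeping deferred variants in a separate list, instead of dropping them, lets a later sweep revisit them if the predictor's scores change as more data arrives.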

Technique predictor

Optimization and training recipe effectiveness

Estimates whether training recipes — learning rate schedules, warmdown strategies, quantization schemes, regularization methods — are likely to help or hurt under current constraints. Built from the same structured logs that drive AIR's steering decisions.

Meta models progress through three stages as data accumulates: Gaussian processes for early uncertainty-aware exploration, gradient-boosted trees for handling discrete architectural features, and MLP ensembles for capturing non-linear interactions at scale.
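The staged progression could be expressed as a small selection rule keyed on how many structured records are available. The source does not state where the transitions happen, so the thresholds `gp_limit` and `tree_limit` below are assumptions, as is the function name `choose_surrogate`.

```python
def choose_surrogate(num_records, gp_limit=200, tree_limit=5000):
    """Pick a surrogate family from the number of structured records.

    The default limits are placeholder assumptions, not documented values.
    """
    if num_records < gp_limit:
        # Little data: favor uncertainty-aware exploration.
        return "gaussian_process"
    if num_records < tree_limit:
        # More data with many discrete architectural features.
        return "gradient_boosted_trees"
    # Large data: capture non-linear interactions at scale.
    return "mlp_ensemble"

stages = [choose_surrogate(n) for n in (50, 1000, 20000)]
```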

The current scope is deliberately constrained to specific model families and expands only while calibration quality holds.
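A common way to check calibration is to compare how often measured outcomes fall inside the predictor's stated intervals against the nominal rate. The sketch below assumes each prediction is a (mean, standard deviation) pair and uses a normal-approximation 90% interval. The data, the tolerance, and the helper names `interval_coverage` and `may_expand_scope` are hypothetical.

```python
def interval_coverage(predictions, outcomes, z=1.645):
    """Fraction of outcomes within mean +/- z * std.

    z = 1.645 corresponds to roughly a 90% interval under a normal assumption.
    """
    hits = sum(
        1
        for (mu, sigma), y in zip(predictions, outcomes)
        if abs(y - mu) <= z * sigma
    )
    return hits / len(outcomes)

def may_expand_scope(coverage, nominal=0.90, tolerance=0.05):
    """Treat calibration as holding if coverage is close to the nominal rate."""
    return abs(coverage - nominal) <= tolerance

# Illustrative held-out predictions and measured outcomes (not real results).
predictions = [(0.70, 0.02)] * 10
outcomes = [0.70, 0.71, 0.69, 0.72, 0.68, 0.73, 0.67, 0.70, 0.71, 0.80]

coverage = interval_coverage(predictions, outcomes)
expand = may_expand_scope(coverage)
```

Coverage well below the nominal rate would suggest overconfident predictions, and coverage well above it would suggest intervals that are too wide to guide prioritization. Either case would argue against widening the scope under this rule.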
