Publications

Cost-Aware Contrastive Routing for LLMs

NeurIPS 2025 Spotlight

CSCR embeds both prompts and LLMs into a shared space using fast logit or perplexity fingerprints. A cost‑banded InfoNCE loss trains the space to balance quality against cost. It generalizes to unseen models and out‑of‑distribution prompts.

Paper, Code
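
To make the CSCR summary above concrete, here is a minimal sketch of the general idea, not the paper's implementation. It uses the standard (unbanded) InfoNCE loss rather than the cost-banded variant, and it assumes the positive for a prompt is the cheapest model that answered it correctly. The names `info_nce`, `cheapest_correct`, and `route`, the toy vectors, and the toy costs are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(prompt_vec, model_vecs, positive, temperature=0.1):
    """Standard InfoNCE for one prompt: negative log-softmax of the
    temperature-scaled similarities, evaluated at the positive model."""
    logits = [cosine(prompt_vec, m) / temperature for m in model_vecs]
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[positive]


def cheapest_correct(correct, costs):
    """Assumed positive-selection rule: the cheapest model that was correct."""
    candidates = [i for i, ok in enumerate(correct) if ok]
    return min(candidates, key=lambda i: costs[i]) if candidates else None


def route(prompt_vec, model_vecs):
    """Route a prompt to the model whose embedding is most similar to it."""
    return max(range(len(model_vecs)),
               key=lambda i: cosine(prompt_vec, model_vecs[i]))


# Toy data (hypothetical): three models embedded in a 2-D shared space.
model_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
costs = [1.0, 5.0, 20.0]          # relative cost per query
prompt_vec = [0.9, 0.1]           # embedding of one prompt
correct = [True, False, True]     # which models answered this prompt correctly

positive = cheapest_correct(correct, costs)
loss = info_nce(prompt_vec, model_vecs, positive)
chosen = route(prompt_vec, model_vecs)
```

Minimizing this loss pulls a prompt toward the cheapest adequate model and pushes it away from the others, so that at inference time routing reduces to a nearest-neighbor lookup in the shared space.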