Limshift Insights
4 findings on your simulated workloads
Shift 42% of gpt-4o workloads to gpt-4o-mini
Token-size analysis suggests that about 42% of gpt-4o-2024-08-06 workloads are simple completions that gpt-4o-mini handles within a 0.5% quality delta. Intelligent routing reclaims the cost difference.
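A minimal sketch of what token-size routing could look like. The 500-token cutoff and the function names are hypothetical illustrations, not Limshift's actual routing logic; only the two model names come from the finding above.

```python
from typing import List, Tuple

# Assumed cutoff for a "simple" completion; tune it against your own traffic.
SIMPLE_CALL_TOKEN_LIMIT = 500

def choose_model(prompt_tokens: int, expected_output_tokens: int) -> str:
    """Send small calls to the cheaper model, everything else to gpt-4o."""
    total = prompt_tokens + expected_output_tokens
    if total <= SIMPLE_CALL_TOKEN_LIMIT:
        return "gpt-4o-mini"
    return "gpt-4o-2024-08-06"

def share_routed_to_mini(calls: List[Tuple[int, int]]) -> float:
    """Fraction of (prompt_tokens, output_tokens) calls that would move to gpt-4o-mini."""
    if not calls:
        return 0.0
    moved = sum(1 for p, o in calls if choose_model(p, o) == "gpt-4o-mini")
    return moved / len(calls)
```

Running `share_routed_to_mini` over a sample of recent calls gives a rough check on how much of a workload a given cutoff would shift.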
claude-opus-4-7 in wrkspc_main: rising usage overhead
Long-running workflows are re-processing ~28% of historical payload per turn. HISTORY reduces the repeated processing while preserving workflow continuity — about 18% of usage on this workload is recoverable.
Premium model usage spike on o1 in proj_Research
A reasoning-tier model is being used for prompts that fail validation 23% of the time, doubling effective usage. Intelligent routing sends validation-prone prompts through a guardrail first and reserves premium tiers for what needs them.
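The guardrail-first idea can be sketched as follows. This is an illustration under assumptions: `route_with_guardrail`, `is_valid`, and `call_premium` are hypothetical names, and the validator stands in for whatever check your prompts currently fail.

```python
from typing import Callable, Optional

def route_with_guardrail(
    prompt: str,
    is_valid: Callable[[str], bool],
    call_premium: Callable[[str], str],
) -> Optional[str]:
    """Run a cheap validation check before spending premium-tier usage."""
    if not is_valid(prompt):
        # Rejected up front: the premium model is never called for this prompt.
        return None
    return call_premium(prompt)
```

Prompts that fail the check return `None` without reaching the premium tier, so the caller can fix or drop them before any reasoning-tier usage is incurred.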
wrkspc_research: spend grew 67% week-over-week ($78 → $130)
This scope is approaching its limit. LIMIT thresholds are currently configured globally; per-scope alerts ship in v0.2, at which point a per-scope usage alert can catch the next spike before it lands.
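For reference, the week-over-week figure and a per-scope alert check can be reproduced with a short sketch. The function names, the 80% alert threshold, and the $150 scope limit are hypothetical; only the $78 and $130 spend values come from the finding above.

```python
def week_over_week_growth(previous: float, current: float) -> float:
    """Relative change in spend from one week to the next."""
    if previous <= 0:
        raise ValueError("previous spend must be positive")
    return (current - previous) / previous

def should_alert(scope_spend: float, scope_limit: float, threshold: float = 0.8) -> bool:
    """True once spend reaches the given fraction of the scope's limit."""
    return scope_spend >= threshold * scope_limit

growth = week_over_week_growth(78, 130)
print(f"{growth:.0%}")  # prints 67%

# Hypothetical $150 limit for wrkspc_research, alerting at 80% of it.
print(should_alert(130, 150))  # prints True
```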