FlashLabs Inc. is pleased to announce that the “Claude Opus 4.8 API,” which provides access to Anthropic’s latest model, is now available through “OrcaRouter,” the LLM routing gateway provided by its partner Continuum AI. Claude Opus 4.8 is a top-of-the-line coding model with a 1M-token context window and a maximum output of 128K tokens, delivering exceptional performance in agent workflows and long-running autonomous tasks.
Background and Objective
In AI development, LLM usage fees have become a new cost that keeps rising as a product grows, posing a challenge for companies. The conventional approach of “handing everything over to high-performance models” means paying high unit prices even for tasks that do not inherently require such models, such as extraction, formatting, and classification, so AI costs keep ballooning. “Manual routing on the application side,” by contrast, relies on if/else statements to manage model names and cost limits; the rules become obsolete with every new model release, and the maintenance burden falls on the development team.
What really matters is the difficulty of the prompt itself. Many tasks do not require frontier models, and expensive models are worthwhile only for genuinely difficult inference. OrcaRouter assesses the difficulty of each prompt and automatically routes difficult inference to frontier models and routine tasks to high-performance open models, reducing LLM spending by approximately 40% while maintaining quality.
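The idea of routing by prompt difficulty can be illustrated with a minimal Python sketch. This is not OrcaRouter's actual algorithm, which the announcement does not describe; the keyword heuristic, the threshold, and the names `estimate_difficulty`, `route`, `FRONTIER_MODEL`, and `OPEN_MODEL` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical model identifiers, used only for illustration.
FRONTIER_MODEL = "frontier-model"
OPEN_MODEL = "open-model"

# Hypothetical keyword hints; a production router would more likely
# use a trained classifier than a fixed word list.
ROUTINE_HINTS = ("extract", "format", "classify")
HARD_HINTS = ("refactor", "prove", "debug", "design")


@dataclass
class Route:
    model: str
    difficulty: float


def estimate_difficulty(prompt: str) -> float:
    """Return a crude difficulty score clamped to [0, 1]."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    score = 0.5
    score += 0.2 * sum(hint in words for hint in HARD_HINTS)
    score -= 0.2 * sum(hint in words for hint in ROUTINE_HINTS)
    return max(0.0, min(1.0, score))


def route(prompt: str, threshold: float = 0.6) -> Route:
    """Send prompts at or above the threshold to the frontier model."""
    difficulty = estimate_difficulty(prompt)
    model = FRONTIER_MODEL if difficulty >= threshold else OPEN_MODEL
    return Route(model=model, difficulty=difficulty)


if __name__ == "__main__":
    for prompt in (
        "Extract the invoice number and format it as JSON",
        "Debug and refactor this concurrency module",
    ):
        print(route(prompt).model, "<-", prompt)
```

In this sketch the application calls a single `route` function instead of hard-coding model names in if/else branches, so supporting a new model means changing the routing policy in one place rather than every call site.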
With the Claude Opus 4.8 API now available, Japanese companies can combine top-tier coding performance with OrcaRouter’s cost optimization features.
FlashLabs will support the widespread adoption of OrcaRouter in the Japanese market through its exclusive distribution partnership with Continuum AI. The company plans to continue expanding OrcaRouter’s functionality, including adding new models, improving routing algorithms, and enhancing guardrail features.
SOURCE: PRTimes


