Process massive datasets with Llama-3.1-Nemotron-Ultra-253B-v1, featuring a 128K context window for long-document analysis. This model offers pricing of $0.60/1M input and $1.80/1M output tokens, native tool calling support, and open weights. Access Llama-3.1-Nemotron-Ultra-253B-v1 via the Nebius Token Factory API with up to 4K output tokens per response.
Llama-3.1-Nemotron-Ultra-253B-v1, as served by Nebius Token Factory, costs $0.60 per 1M input tokens and $1.80 per 1M output tokens. Cached reads cost $0.06 per 1M tokens. Cache writes cost $0.75 per 1M tokens.
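To make the per-token rates concrete, here is a minimal sketch of a cost estimator in Python. The four prices come from the listing above; the function name `estimate_cost`, its parameters, and the assumption that each token is billed in exactly one of the four categories are illustrative only and are not taken from Nebius documentation.

```python
from decimal import Decimal

# USD per 1M tokens, as quoted in the listing above.
PRICE_PER_MILLION = {
    "input": Decimal("0.60"),
    "output": Decimal("1.80"),
    "cache_read": Decimal("0.06"),
    "cache_write": Decimal("0.75"),
}
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens=0, output_tokens=0,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Return the estimated request cost in USD as a Decimal.

    Hypothetical helper: assumes each token is counted in exactly one
    category (an assumption about billing, not a documented rule).
    """
    counts = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    # Multiply each count by its rate, then scale from per-1M to per-token.
    total = sum(PRICE_PER_MILLION[kind] * n for kind, n in counts.items())
    return total / MILLION


if __name__ == "__main__":
    # A long-document request: 100K input tokens, 4K output tokens.
    print(estimate_cost(input_tokens=100_000, output_tokens=4_000))
```

Under these assumptions, a request with 100,000 input tokens and 4,000 output tokens comes to $0.06 + $0.0072 = $0.0672. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.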