AI Model APIs

The complete platform for comparing AI models. Find pricing, capabilities, and the perfect model for your use case.

contact@aimodelapis.com


© 2026 AI Model APIs. All rights reserved.


NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 API

About

NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 is an open-weights, text-only model optimized for complex logic and chain-of-thought reasoning, which makes it a candidate for building reasoning agents. It is priced at $0.60 per 1M input tokens and $1.80 per 1M output tokens. The model is available through the Kilo Gateway API with up to 26K output tokens per response.

Capabilities

Input Modalities: text
Output Modalities: text
Reasoning: Yes
Structured Output: No
Tool Use: No
Weights: Open
Temperature: Adjustable
Attachment: Not Supported

Limits
Context Window: 131K tokens
Input Limit: 131K tokens
Max Output: 26K tokens
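
The limits above can be checked before a request is sent. The following is a minimal sketch, not the gateway's own logic: it assumes the rounded figures on this page (131K and 26K) can be read as 131,000 and 26,000 tokens, and that input and output tokens share the context window. `planRequest` is a hypothetical helper name.

```typescript
// Rounded figures from the listing; the exact token counts are not given there.
const CONTEXT_WINDOW = 131_000; // assumption: "131K" read as 131,000
const MAX_OUTPUT = 26_000; // assumption: "26K" read as 26,000

interface RequestPlan {
  maxTokens: number; // output budget after clamping to the model's cap
  fits: boolean; // whether input plus output stays inside the context window
}

// Hypothetical helper: clamps the requested output size and checks the total.
// Assumes input and output tokens count against the same context window.
function planRequest(inputTokens: number, requestedOutput: number): RequestPlan {
  const maxTokens = Math.min(requestedOutput, MAX_OUTPUT);
  const fits = inputTokens + maxTokens <= CONTEXT_WINDOW;
  return { maxTokens, fits };
}
```

Under these assumptions, a 100,000-token prompt that asks for 50,000 output tokens is clamped to 26,000 and still fits, while a 120,000-token prompt with the full output budget does not.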

Pricing

Standard (per 1M tokens)

Input: $0.60
Output: $1.80
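
As a worked example of the per-token arithmetic, the sketch below estimates the cost of a single request from the listed rates. `estimateCostUsd` is a hypothetical helper, and the result ignores any fees or discounts not shown on this page.

```typescript
// Listed standard rates, in USD per 1M tokens.
const INPUT_USD_PER_MILLION = 0.6;
const OUTPUT_USD_PER_MILLION = 1.8;

// Hypothetical helper: cost = tokens / 1,000,000 * rate, summed over input and output.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// 100,000 input tokens and 20,000 output tokens:
// 0.1 * $0.60 + 0.02 * $1.80, roughly $0.096 (floating-point rounding may apply).
const exampleCost = estimateCostUsd(100_000, 20_000);
```

Output tokens cost three times as much as input tokens here, so long reasoning traces tend to dominate the bill.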
Frequently Asked Questions

NVIDIA: Llama 3.1 Nemotron Ultra 253B v1 by Kilo Gateway costs $0.60 per 1M input tokens and $1.80 per 1M output tokens.

Model Details
ID: nvidia/llama-3.1-nemotron-ultra-253b-v1
Provider: Kilo Gateway
Family
Release Date: Jul 1, 2024
Knowledge Cutoff: N/A
API Integration
NPM Package: @ai-sdk/openai-compatible
Environment Variables: KILO_API_KEY
API Base URL: https://api.kilo.ai/api/gateway
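
The listed package name suggests an OpenAI-compatible interface, so a request might look like the sketch below. This is an illustration under assumptions, not documented Kilo Gateway behavior: the `/chat/completions` path, Bearer authentication, and the `max_tokens` field follow the common OpenAI-style convention and are not confirmed by this page. `buildChatRequest` is a hypothetical helper. Only the base URL and model ID come from the listing.

```typescript
// Values copied from the listing.
const BASE_URL = "https://api.kilo.ai/api/gateway";
const MODEL_ID = "nvidia/llama-3.1-nemotron-ultra-253b-v1";

interface ChatRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper. Assumes the OpenAI-style /chat/completions path,
// Bearer auth, and a max_tokens field; none of these are confirmed by the listing.
function buildChatRequest(apiKey: string, prompt: string, maxTokens = 1024): ChatRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: MODEL_ID,
      messages: [{ role: "user", content: prompt }],
      max_tokens: maxTokens,
    }),
  };
}
```

In practice the key would be read from the `KILO_API_KEY` environment variable and the request sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Because the listing marks Tool Use and Structured Output as unsupported, the sketch sends only plain chat messages.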