AI Model APIs

The complete platform for comparing AI models. Find pricing, capabilities, and the perfect model for your use case.

contact@aimodelapis.com

© 2026 AI Model APIs. All rights reserved.


Qwen 3 Embedding 4B API

About

Qwen 3 Embedding 4B is an open-weights text embedding model from the Qwen family, available via the Inference API. It accepts text input with a 32K-token context window, and the listing reports a maximum output of 2K tokens. Pricing is $0.01 per 1M input tokens; output tokens are free.

Capabilities

Input Modalities

text

Output Modalities

text
Reasoning
No
Structured Output
No
Tool Use
No
Weights
Open
Temperature
Fixed
Attachment
Not Supported
Limits
Context Window
32K tokens

Input Limit
32K tokens

Max Output
2K tokens
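
Inputs longer than the 32K-token input limit need to be split before they are sent. The following is a minimal sketch of fixed-size chunking over an already tokenized input. It assumes "32K" means 32,000 tokens (the listing does not say whether it is 32,000 or 32,768), and `splitToFit` is a hypothetical helper name, not part of any SDK. Tokenization itself is out of scope here.

```typescript
// Assumption: "32K" is read conservatively as 32,000 tokens.
const INPUT_LIMIT_TOKENS = 32_000;

// Hypothetical helper: split a token sequence into consecutive chunks,
// each no longer than maxTokens, so every chunk fits the input limit.
function splitToFit<T>(tokens: T[], maxTokens: number = INPUT_LIMIT_TOKENS): T[][] {
  if (!Number.isInteger(maxTokens) || maxTokens <= 0) {
    throw new RangeError("maxTokens must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let start = 0; start < tokens.length; start += maxTokens) {
    chunks.push(tokens.slice(start, start + maxTokens));
  }
  return chunks;
}
```

Fixed-size splitting is the simplest option; splitting on sentence or paragraph boundaries usually gives more meaningful embeddings per chunk, at the cost of extra logic.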

Pricing

Standard (per 1M tokens)

Input
$0.01
Output
Free
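
To make the per-million pricing concrete, the sketch below estimates the cost of a job from its token counts. The two rates come from the listing above; `estimateCostUsd` is a hypothetical helper name, and actual billing may round or aggregate differently.

```typescript
// Rates from the listing: USD per 1M tokens.
const INPUT_PRICE_PER_MILLION_USD = 0.01;
const OUTPUT_PRICE_PER_MILLION_USD = 0; // listed as "Free"

// Hypothetical helper: estimated cost in USD for the given token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number = 0): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("token counts must be non-negative");
  }
  const weighted =
    inputTokens * INPUT_PRICE_PER_MILLION_USD +
    outputTokens * OUTPUT_PRICE_PER_MILLION_USD;
  return weighted / 1_000_000;
}
```

For example, embedding a corpus of 250 million input tokens would come to roughly $2.50 at the listed rate.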
Frequently Asked Questions

Qwen 3 Embedding 4B on Inference costs $0.01 per 1M input tokens; output tokens are free.

Model Details
ID
qwen/qwen3-embedding-4b
Provider
Inference
Family
qwen
Release Date
Jan 1, 2025
Knowledge Cutoff
Dec 1, 2024
API Integration
NPM Package
@ai-sdk/openai-compatible
Environment Variables
INFERENCE_API_KEY
API Base URL
https://inference.net/v1
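
The integration details above can be combined into a request. This is a minimal sketch using plain `fetch` rather than the listed NPM package, so it has no dependencies. The base URL and model ID are taken from the listing. The `/embeddings` path, the Bearer token header, and the response shape are assumptions based on the OpenAI-style convention that an "openai-compatible" provider would normally follow; check them against the provider's documentation. `buildEmbeddingRequest` and `embedTexts` are hypothetical helper names.

```typescript
// Values taken from the listing.
const BASE_URL = "https://inference.net/v1";
const MODEL_ID = "qwen/qwen3-embedding-4b";

interface EmbeddingRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build the HTTP request without sending it.
// Assumption: OpenAI-style POST {baseURL}/embeddings with a Bearer token.
function buildEmbeddingRequest(input: string | string[], apiKey: string): EmbeddingRequest {
  return {
    url: `${BASE_URL}/embeddings`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: MODEL_ID, input }),
    },
  };
}

// Send the request and return one embedding vector per input text.
// Assumption: the response has the OpenAI-style shape { data: [{ embedding }] }.
async function embedTexts(texts: string[], apiKey: string): Promise<number[][]> {
  const { url, init } = buildEmbeddingRequest(texts, apiKey);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Embedding request failed: HTTP ${res.status}`);
  }
  const json = (await res.json()) as { data: { embedding: number[] }[] };
  return json.data.map((item) => item.embedding);
}
```

In a Node.js program the key would typically be read from the environment variable named in the listing, for example `embedTexts(["hello world"], process.env.INFERENCE_API_KEY ?? "")`.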