Integrate multimodal vision capabilities using qwen3-vl:235b-instruct, designed to process both text and images seamlessly. This model delivers available completely free of charge, native tool calling support, open weights architecture. Access qwen3-vl:235b-instruct via the Ollama Cloud API with up to 131K output tokens.
Tokens
Tokens
Tokens
qwen3-vl:235b-instruct by Ollama Cloud costs Free per 1M input tokens and Free per 1M output tokens.