Add multimodal vision capabilities with Qwen2.5-VL 7B Instruct, a model that processes both text and images. It is priced at $0.29 per 1M input tokens and $0.72 per 1M output tokens, supports native tool calling, and has open weights. Access Qwen2.5-VL 7B Instruct via the Alibaba (China) API with up to 8K output tokens.
Qwen2.5-VL 7B Instruct by Alibaba (China) costs $0.29 per 1M input tokens and $0.72 per 1M output tokens.
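To make the per-token rates concrete, here is a minimal Python sketch that estimates the cost of a request from its token counts. The function name `estimate_cost_usd` and the constants are illustrative, not part of any provider SDK. It assumes you already know the input and output token counts; how an image is converted into input tokens is not covered here and may vary.

```python
from decimal import Decimal

# Listed rates in USD per 1M tokens, taken from the pricing stated above.
INPUT_USD_PER_MILLION = Decimal("0.29")
OUTPUT_USD_PER_MILLION = Decimal("0.72")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Scale each token count by its rate, then convert from per-million units.
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Example: a 10,000-token prompt (text plus image tokens) and a 1,000-token reply.
    print(estimate_cost_usd(10_000, 1_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many calls.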