Qwen-VL
Multimodal vision-language model by Alibaba for image understanding.
About
Qwen-VL is a multimodal vision-language model from Alibaba Cloud that takes images alongside text prompts and supports visual question answering, captioning, OCR, and grounding. A Qwen-VL-Chat variant is tuned for dialogue. The open-weight models come in multiple sizes with weights on Hugging Face, and the hosted Qwen-VL-Plus and Qwen-VL-Max versions extend the family. The open models are listed under the Apache 2.0 license; terms can differ between releases and sizes, so confirm the license on each model card.
Details
- Category: Large Language Models (LLMs)
- Price: Free
- Platform: Local/Desktop
- Difficulty: Intermediate (3/5)
- License: Apache-2.0
- Minimum VRAM: 8 GB
- Added: Apr 3, 2026
Related Tools
Open-weight models by Google in 2B, 9B, and 27B sizes with strong performance.
Open-weight LLM by Meta in 8B and 70B sizes with strong general capabilities.
Open-weight LLM family by Alibaba with strong multilingual and coding abilities.
Small language model by Microsoft in 3.8B size with strong benchmark performance.
Open-access 176B parameter multilingual LLM by BigScience supporting 46 languages.
Retrieval-augmented generation optimized LLM by Cohere with 128K context.