What Is Qwen AI? Alibaba's Open-Weight Model Explained
When Alibaba published its first open-weight language model in August 2023, the AI community took note. When the Qwen AI family crossed 90,000 derivative models on ModelScope, it became impossible to ignore. Qwen AI is Alibaba Cloud's family of open-weight and proprietary language models, engineered to cover the full spectrum from edge inference on consumer hardware to frontier-tier agentic coding in the cloud. The flagship Qwen3.7-Max places in the global top 5 on the Artificial Analysis Intelligence Index. Most models under 35 billion parameters ship under Apache 2.0, which means you can fine-tune and redistribute them commercially without royalties.
What is Qwen AI in one sentence? Qwen is Alibaba Cloud's AI model family, ranging from free open-weight models you can run on a laptop to a frontier-grade API that outscores Claude Opus on coding benchmarks at a fraction of the cost.
Who Built Qwen? (Alibaba Cloud)
Qwen is an Alibaba Cloud product, built by the Tongyi Large Model Business Unit. The division's chief AI architect is Zhou Jingren, who oversees model research and deployment strategy. The name "Qwen" is the commercial brand; the internal designation is "Tongyi Qianwen," which translates roughly to "understanding all questions."
In March 2026, Alibaba consolidated its AI research under a new entity called Alibaba Token Hub, which brought together model training, inference infrastructure, and developer tooling under one organizational umbrella. Alibaba CEO Eddie Wu has been the executive sponsor of the AI push since the company restructured its operations.
The Qwen project started quietly. A beta version shipped in April 2023, limited to enterprise partners on the Alibaba Cloud platform. By August 2023, Alibaba released Qwen-7B as an open-weight model. It was one of the first billion-plus-parameter models from a Chinese lab to ship downloadable weights under a permissive license, and that decision set the tone for everything that followed.
Understanding the naming convention helps: the "Qwen" prefix is the brand, the number is the generation (Qwen3.7 is the seventh sub-release of the third generation), and for MoE models the suffix format is [total params]B-A[active params]B. So Qwen3.6-35B-A3B has 35 billion total parameters but only 3 billion active per token. Building models at this scale is a team effort: Alibaba reports hundreds of researchers across its Singapore and mainland China research hubs.
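The suffix convention is regular enough to parse mechanically. The sketch below is an illustration only: the QwenName type and parse_qwen_name function are hypothetical names, and the pattern covers just the size-suffixed names quoted in this article (branded names such as Qwen3.7-Max are deliberately rejected).

```python
import re
from typing import NamedTuple, Optional


class QwenName(NamedTuple):
    """Hypothetical container for the parts of a size-suffixed model name."""
    generation: str
    total_b: Optional[float]   # total parameters, in billions
    active_b: Optional[float]  # active parameters per token (MoE only)


# Qwen<generation>[-<total>B][-A<active>B], e.g. Qwen3.6-35B-A3B
_PATTERN = re.compile(
    r"^Qwen(?P<gen>\d+(?:\.\d+)?)"
    r"(?:-(?P<total>\d+(?:\.\d+)?)B)?"
    r"(?:-A(?P<active>\d+(?:\.\d+)?)B)?$"
)


def parse_qwen_name(name: str) -> QwenName:
    """Split a model name into generation, total and active parameter counts."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized model name: {name!r}")
    total = float(m["total"]) if m["total"] else None
    active = float(m["active"]) if m["active"] else None
    return QwenName(m["gen"], total, active)


parsed = parse_qwen_name("Qwen3.6-35B-A3B")
print(parsed)  # QwenName(generation='3.6', total_b=35.0, active_b=3.0)
```

A name without the -A suffix leaves active_b as None, which under this convention would indicate a dense model where every parameter is active.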
Qwen AI is ultimately owned by Alibaba Group, the Chinese technology company listed on the New York and Hong Kong stock exchanges. The AI model division operates as part of Alibaba Cloud, the group's cloud computing arm.
The Qwen AI Model Family Explained
The Qwen model family spans eight generations as of May 2026, ranging from 0.5-billion-parameter edge models to the approximately 1 trillion parameter Qwen3.7-Max. Each generation introduced architectural advances that moved Qwen away from being a derivative of Western model research and toward a distinct technical approach.
The current flagship is Qwen3.7-Max, a proprietary model available through the Alibaba Cloud API. It uses a Hybrid Gated DeltaNet architecture, a design that mixes linear attention layers and standard full attention layers in a 3:1 ratio. This is not a standard Transformer architecture. Linear attention scales more efficiently for long sequences, which is part of why Qwen3.7-Max handles a 1 million token context window (roughly 750,000 words, or a large codebase) without the prohibitive memory costs that affect dense transformer models at the same scale.
Open-weight models currently top out at Qwen3.5-397B-A17B, which has 397 billion total parameters but only 17 billion active per token (this is Mixture of Experts: the model activates a different subset of parameters for each token, keeping runtime cost proportional to the 17B active count, not the 397B total). Independent evaluations place it in the GPT-5.2 tier. It runs in 4-bit quantization on a Mac Studio with 256GB of RAM.
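The 256GB figure can be sanity-checked with back-of-envelope arithmetic. The sketch below counts weight storage only, which is a simplifying assumption: it ignores the KV cache, activations, and the per-block scale factors that real 4-bit formats add, and it treats 1 GB as 10^9 bytes. The function name is hypothetical.

```python
def weight_memory_gb(total_params_billion: float, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_total = total_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


total_b, active_b = 397, 17  # figures quoted for Qwen3.5-397B-A17B

mem_4bit = weight_memory_gb(total_b, 4)    # about 198.5 GB
mem_16bit = weight_memory_gb(total_b, 16)  # about 794 GB
active_fraction = active_b / total_b       # share of weights used per token

print(f"4-bit weights:  {mem_4bit:.1f} GB")
print(f"16-bit weights: {mem_16bit:.1f} GB")
print(f"active per token: {active_fraction:.1%} of all parameters")
```

Under these assumptions the 4-bit weights come to just under 200 GB, which leaves headroom within 256GB, while the 16-bit weights would not fit. Note that all 397B parameters must still be resident in memory; the 17B active count reduces compute per token, not storage.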
For developers who need something smaller, Qwen3.6-35B-A3B is the current local-deployment sweet spot. At 35 billion total / 3 billion active parameters, it reaches 20-25 tokens per second on a single RTX 4090. It ships under Apache 2.0 and handles text, images, and video natively.
The Qwen3 generation introduced the Hybrid Gated DeltaNet attention mechanism across the entire family, with three linear attention layers for every full attention layer. Linear attention computes in linear time relative to sequence length (versus quadratic time for standard attention), which enables the long context windows that make Qwen models practical for large codebases and document analysis. The tradeoff is that linear attention can lose recall on certain retrieval tasks, which is why every fourth layer remains a full attention layer.
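To make the scaling argument concrete, here is a toy cost model. It is a sketch under stated assumptions, not Qwen's implementation: a linear attention layer is charged n units of work for a sequence of n tokens, a full attention layer n squared, and constant factors such as head dimension are ignored. The 8-layer stack and both function names are hypothetical.

```python
def layer_pattern(num_layers: int, linear_per_full: int = 3) -> list[str]:
    """Lay out layers so every (linear_per_full + 1)-th layer is full attention."""
    period = linear_per_full + 1
    return ["full" if (i + 1) % period == 0 else "linear" for i in range(num_layers)]


def relative_attention_cost(seq_len: int, pattern: list[str]) -> float:
    """Toy cost of the hybrid stack relative to an all-full-attention stack.

    Assumes a linear layer costs seq_len units and a full layer seq_len ** 2.
    """
    hybrid = sum(seq_len ** 2 if kind == "full" else seq_len for kind in pattern)
    dense = len(pattern) * seq_len ** 2
    return hybrid / dense


pattern = layer_pattern(8)  # hypothetical 8-layer stack
ratio = relative_attention_cost(1_000_000, pattern)

print(pattern)
print(f"relative attention cost at 1M tokens: {ratio:.4f}")
```

In this model the ratio approaches one quarter as the sequence grows, because the remaining full attention layers dominate the cost. The hybrid design reduces the quadratic term rather than eliminating it.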
How Qwen AI Benchmarks Against GPT and Claude
Qwen3.7-Max leads two public benchmarks as of May 2026. These are not self-reported scores. They come from third-party leaderboards that accept model submissions and verify results against held-out test sets.
On SWE-Bench Pro (the harder version of the standard SWE-Bench that evaluates models against 2,294 real GitHub issues from production open-source projects), Qwen3-235B-A22B scores 60.6%. A score of 60.6% means the model autonomously resolves roughly 6 in 10 real bug reports without human help. Claude Opus 4.8 scores 57.3% on the same benchmark. That 3.3-point gap on a benchmark designed to simulate real software engineering work is meaningful.
On Terminal-Bench 2.0, which simulates a software engineer working in a sandboxed terminal environment with a 5-hour timeout and access to shell commands, file I/O, and the internet, Qwen3-235B-A22B scores 69.7%, the top result on the public leaderboard at time of writing.
On math reasoning, Qwen3-235B-A22B scores 90.2% on MATH-500 and holds a CodeForces ELO of 2,056. On the Artificial Analysis Intelligence Index v4.0, Qwen3.7-Max scores 56.6, placing it in the global top 5. See our Qwen vs DeepSeek comparison for a full head-to-head on benchmarks and pricing.
Open Source or Proprietary? Qwen Licensing Explained
Qwen's licensing is tiered by model size. Models at or below 35 billion parameters ship under Apache 2.0, one of the most permissive widely used open-source licenses. That means you can download the weights, fine-tune on proprietary data, build products, and distribute commercially without paying Alibaba anything or negotiating a separate agreement.
Models at 72 billion parameters and above use the Tongyi Qianwen License, which places additional restrictions on commercial deployment and redistribution. If you are building a SaaS product on a 72B+ model, or distributing a fine-tune to downstream customers, the license text governs. Alibaba publishes the full terms on each model's Hugging Face card.
The practical split: Qwen3.6-35B-A3B (35B total, 3B active MoE) and Qwen3.5-397B-A17B are both Apache 2.0, unusually permissive for models operating at near-frontier capability. The API-only Qwen3.7-Max is a proprietary hosted model; its weights are not available for download.
One nuance: "open-weight" is not the same as "open source." Qwen AI releases the weights, but training pipelines, full architecture specifications, and training data are not always disclosed. Apache 2.0 gives you freedom over the weights themselves; it does not provide a reproducible training pipeline.
The Qwen Ecosystem
One indicator that a model family has reached mainstream adoption: the derivative model count on ModelScope. Qwen has crossed 90,000, including fine-tunes, quantizations, domain-adapted variants, and multilingual models built by the community on top of the base weights. That puts it alongside Llama in open-weight ecosystem depth.
Language support is another dimension where Qwen separates from most Western-developed models. Official support extends to 201 languages, reflecting Alibaba's emphasis on markets where English-first models leave significant gaps. Chinese, Arabic, Korean, and Southeast Asian languages receive dedicated tuning alongside major European languages.
On the protocol side, Qwen supports MCP (Model Context Protocol) natively: tool-calling integrations built for Claude or other MCP-compatible systems work without modification. Qwen also implements the Anthropic API protocol, so swapping Claude for Qwen3.7-Max in a compatible coding agent requires changing three environment variables and nothing else.
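As a rough illustration of what that swap might look like, the sketch below builds the environment for a child process. Everything specific here is an assumption: the article does not name the three variables, so the names shown are the ones Claude Code conventionally reads; the base URL and model identifier are placeholders, not real Alibaba Cloud values; and QWEN_API_KEY and qwen_env_overrides are hypothetical names.

```python
import os


def qwen_env_overrides(base_url: str, api_key: str, model: str) -> dict[str, str]:
    """Return the three overrides assumed to redirect an Anthropic-protocol client."""
    return {
        "ANTHROPIC_BASE_URL": base_url,   # assumed variable name
        "ANTHROPIC_AUTH_TOKEN": api_key,  # assumed variable name
        "ANTHROPIC_MODEL": model,         # assumed variable name
    }


overrides = qwen_env_overrides(
    base_url="https://example.invalid/anthropic-compatible",  # placeholder endpoint
    api_key=os.environ.get("QWEN_API_KEY", "<your-key>"),      # hypothetical variable
    model="qwen3.7-max",                                       # placeholder model id
)

# Merge over the current environment; pass child_env to subprocess.run(..., env=child_env)
child_env = {**os.environ, **overrides}
print(sorted(overrides))
```

Check the provider's documentation for the actual endpoint and model identifier before relying on this pattern.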
Qwen Code: The Terminal Coding Agent
Qwen Code is Alibaba's terminal-native coding agent, positioned as a direct alternative to Claude Code. It operates as a CLI tool, accepts natural language task descriptions, and runs multi-step coding sessions autonomously, reading files, writing tests, running shell commands, and iterating on failures without human input between steps.
The benchmarks reflect the capability. On Terminal-Bench 2.0, Qwen3-235B-A22B scores 69.7%, the top result globally as of this writing. On SWE-Bench Pro (which evaluates agents against real GitHub issues), the same model reaches 60.6%, compared to Claude Opus 4.8 at 57.3%.
Because Qwen implements the Anthropic API protocol natively, MCP server configurations and tool-calling patterns that work in Claude Code also work with Qwen Code. Developers running Claude-based pipelines can evaluate Qwen as a drop-in replacement at roughly one-sixth the API cost of Claude Opus.
Who Should Use Qwen?
Qwen's range, from edge-deployable 0.8B models to a frontier trillion-parameter API, means different users get genuinely different things from it. The right entry point depends on your infrastructure, budget, and latency requirements.
Limitations to Know Before You Commit
Qwen's pricing and open-weight availability make a strong case. Three structural limitations apply regardless of which model or tier you choose.
Frequently Asked Questions
Qwen, Tongyi Qianwen, Qwen Code, and Alibaba Cloud are trademarks of Alibaba Group Holding Limited. All product names, logos, and brand identifiers are the property of their respective owners. Tech Jacks Solutions has no commercial relationship with Alibaba Cloud. This article is editorially independent.