Qwen2.5 32B
accounts/fireworks/models/qwen2p5-32b
Qwen2.5 is a series of decoder-only language models developed by the Qwen team at Alibaba Cloud, available in 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B sizes, each in base and instruct variants.
Qwen2.5 32B can be fine-tuned on your data to improve response quality. Fireworks uses low-rank adaptation (LoRA) to train an adapter that can be served efficiently at inference time.
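The idea behind LoRA is to freeze the pretrained weight matrix and train only a small low-rank update on top of it. A minimal NumPy sketch of the technique (the dimensions and initialization scale here are illustrative, not Qwen2.5's actual configuration):

```python
import numpy as np

# Toy layer dimensions; real transformer layers are much larger.
d_out, d_in, rank = 64, 64, 4

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight

# Trainable low-rank factors. B starts at zero, so the adapted layer
# initially behaves exactly like the base model.
A = rng.standard_normal((rank, d_in)) * 0.01  # down-projection
B = np.zeros((d_out, rank))                   # up-projection

def adapted_forward(x):
    # Effective weight is W + B @ A; only A and B are trained.
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapter contributes nothing yet.
assert np.allclose(adapted_forward(x), W @ x)

# Far fewer trainable parameters than full fine-tuning:
print(f"trainable: {A.size + B.size} vs {W.size} for full fine-tuning")
```

Because only the small factors are trained, many LoRA adapters can share one copy of the base weights at serving time, which is what makes this approach efficient to deploy.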
On-demand deployments let you run Qwen2.5 32B on dedicated GPUs using Fireworks' high-performance serving stack, with high reliability and no rate limits.
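Once deployed, the model can be queried through Fireworks' OpenAI-compatible chat completions endpoint using the model identifier above. A hedged sketch using only the standard library (the endpoint URL and payload shape follow the OpenAI chat format; consult the Fireworks API documentation for the authoritative parameters):

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible endpoint; verify against the Fireworks docs.
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"

payload = {
    "model": "accounts/fireworks/models/qwen2p5-32b",
    "messages": [
        {"role": "user", "content": "Summarize LoRA in one sentence."}
    ],
    "max_tokens": 128,
}

api_key = os.environ.get("FIREWORKS_API_KEY")
if api_key:  # only send the request when an API key is configured
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, existing OpenAI client libraries can also be pointed at it by overriding the base URL.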