Llama 4 Maverick Instruct (Basic)

accounts/fireworks/models/llama4-maverick-instruct-basic

Meta Llama

LLM

Vision

Serverless

The Llama 4 collection consists of natively multimodal AI models that enable text and multimodal experiences. These models use a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.

Features

Serverless API
Llama 4 Maverick Instruct (Basic) is available via Fireworks' serverless API, where you pay per token. There are several ways to call the Fireworks API, including Fireworks' Python client, the REST API, and OpenAI's Python client.
On-demand Deployments
On-demand deployments let you run Llama 4 Maverick Instruct (Basic) on dedicated GPUs using Fireworks' high-performance serving stack, with high reliability and no rate limits.
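
Of the calling options listed above, the OpenAI Python client route can be sketched roughly as follows. This is a minimal sketch, not the documented integration: it assumes the third-party `openai` package is installed, that the base URL is the REST endpoint used elsewhere on this page with `/chat/completions` removed, and that the key is stored in an environment variable named `FIREWORKS_API_KEY` (a hypothetical name). `build_messages` and `ask` are illustrative helpers.

```python
import os

# Assumption: the REST endpoint on this page minus the "/chat/completions" suffix.
BASE_URL = "https://api.fireworks.ai/inference/v1"
MODEL = "accounts/fireworks/models/llama4-maverick-instruct-basic"


def build_messages(prompt: str) -> list:
    """Wrap a plain-text prompt in the chat message format."""
    return [{"role": "user", "content": prompt}]


def ask(prompt: str) -> str:
    """Send one text prompt through the OpenAI client and return the reply text."""
    # Imported here so the module still loads when the package is missing.
    from openai import OpenAI  # third-party: pip install openai

    # FIREWORKS_API_KEY is a hypothetical variable name; use your own secret store.
    client = OpenAI(base_url=BASE_URL, api_key=os.environ["FIREWORKS_API_KEY"])
    completion = client.chat.completions.create(
        model=MODEL,
        messages=build_messages(prompt),
        max_tokens=256,
    )
    return completion.choices[0].message.content
```

Calling `ask("Summarize mixture-of-experts in one sentence.")` would then issue a single pay-per-token request against the serverless model.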

Available Serverless

Run queries immediately, pay only for usage

$0.22 / $0.88 per 1M tokens (input / output)
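
To make the per-token pricing concrete, the following sketch estimates the cost of a single request from its token counts. It assumes the two listed figures are USD prices for input and output tokens respectively, as the "(input/output)" label suggests, and it ignores anything not expressed as a token count. `estimate_cost` is a hypothetical helper, not part of any Fireworks client.

```python
from decimal import Decimal

# Serverless prices listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.22")
OUTPUT_PRICE_PER_M = Decimal("0.88")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Hypothetical request: 12,000 prompt tokens and 800 completion tokens.
    print(f"${estimate_cost(12_000, 800):.6f}")
```

For the hypothetical request in the example, the arithmetic is (12,000 × 0.22 + 800 × 0.88) / 1,000,000 = $0.003344.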

import requests
import json

# Chat completions endpoint of the Fireworks REST API.
url = "https://api.fireworks.ai/inference/v1/chat/completions"

# Request body: sampling settings plus one user message that combines
# a text prompt with an image URL.
payload = {
  "model": "accounts/fireworks/models/llama4-maverick-instruct-basic",
  "max_tokens": 131072,
  "top_p": 1,
  "top_k": 40,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "temperature": 0.6,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Can you describe this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://images.unsplash.com/photo-1582538885592-e70a5d7ab3d3?ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D&auto=format&fit=crop&w=1770&q=80"
          }
        }
      ]
    }
  ]
}

# Replace <API_KEY> with your Fireworks API key.
headers = {
  "Accept": "application/json",
  "Content-Type": "application/json",
  "Authorization": "Bearer <API_KEY>"
}

# Send the request and print the raw JSON response body.
response = requests.post(url, headers=headers, data=json.dumps(payload))
print(response.text)

Metadata

State: Ready
Created on: 4/5/2025
Kind: Base model
Provider: Meta Llama

Specification

Calibrated

Mixture-of-Experts

Parameters: 400B

Supported Functionality

Fine-tuning: Not supported
Serverless: Supported
Serverless LoRA: Not supported
Context Length: 1M tokens
Function Calling: Supported
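
Since function calling is listed as supported, a request that offers the model a tool might look roughly like the sketch below. The page does not show the request format, so this assumes the chat completions endpoint accepts an OpenAI-style `tools` field and returns OpenAI-style `tool_calls`. The `get_weather` tool and both helper functions are hypothetical and exist only for illustration.

```python
import json

MODEL = "accounts/fireworks/models/llama4-maverick-instruct-basic"

# Hypothetical tool definition in the OpenAI-style JSON Schema layout.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_payload(question: str) -> dict:
    """Build a chat completions request body that offers one tool."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": question}],
        "tools": [GET_WEATHER_TOOL],  # assumed field name (OpenAI-style)
    }


def extract_tool_calls(response_body: dict) -> list:
    """Return (name, arguments) pairs from an OpenAI-style response body."""
    message = response_body["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # In the OpenAI-style schema, arguments arrive as a JSON-encoded string.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```

The payload can be posted in the same way as the image request shown earlier on this page. `extract_tool_calls` returns an empty list when the model answers in plain text instead of requesting a tool.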
