Fireworks.ai

Lightning-fast generative AI inference

Experience the world's fastest Generative AI inference platform. Use a state-of-the-art, open-source model or fine-tune and deploy your own at no additional cost, with Fireworks.ai.

Query Large Language Models with Fireworks.ai

It is easy to get started with the Fireworks.ai LLM API. You can use the Python client (installable from PyPI):
import fireworks.client

# Authenticate with your Fireworks.ai API key.
fireworks.client.api_key = "your-key"

# Request a 16-token completion from the Llama 2 7B base model.
completion = fireworks.client.Completion.create(
    "accounts/fireworks/models/llama-v2-7b",
    "Once upon a time",
    max_tokens=16,
)
Or call the API directly with curl:
curl --request POST \
  --url https://api.fireworks.ai/inference/v1/chat/completions \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer YOUR_TOKEN_HERE' \
  --data '{
    "messages": [
      {
        "role": "user",
        "content": "hello there!"
      }
    ],
    "model": "accounts/fireworks/models/llama-v2-7b-chat"
  }'
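You can also call the same chat completions endpoint from Python. The sketch below mirrors the curl call above using the third-party requests package; the endpoint, model name, and message payload are taken from that example, while reading the API key from a FIREWORKS_API_KEY environment variable is an assumption made for illustration.

import os
import requests

# POST to the chat completions endpoint, mirroring the curl example above.
# Assumes the API key is stored in the FIREWORKS_API_KEY environment variable.
url = "https://api.fireworks.ai/inference/v1/chat/completions"
headers = {
    "Accept": "application/json",
    "Authorization": f"Bearer {os.environ['FIREWORKS_API_KEY']}",
}
payload = {
    "model": "accounts/fireworks/models/llama-v2-7b-chat",
    "messages": [{"role": "user", "content": "hello there!"}],
}

response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()

# The reply follows the OpenAI-compatible schema: the assistant message
# is in the first element of "choices".
print(response.json()["choices"][0]["message"]["content"])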

Fireworks Supported Models

Capybara 34B

34B LLM model from NousResearch, based on Yi-34B

Yi 6B

6B LLM model from 01.ai

LLaVA v1.5 13B

A multi-modal model that takes both image and text as input.

Segmind Stable Diffusion 1B (SSD-1B)

Image generation model. Distilled from Stable Diffusion XL 1.0 and 50% smaller.

Mistral 7B Instruct

Mistral-7B model fine-tuned for conversation

Stable Diffusion XL

An image generation model produced by Stability AI.

Llama 2 13B code instruct

An instruction-tuned version of Llama 2 13B, optimized for code generation.

Llama 2 34B Code Llama instruct

Code Llama 34B, optimized for code generation.

Llama 2 7B Chat

A fine-tuned version of Llama 2 7B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF); it performs comparably to ChatGPT according to human evaluations.

Llama 2 13B Chat

A fine-tuned version of Llama 2 13B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF); it performs comparably to ChatGPT according to human evaluations.

Llama 2 70B Chat

A fine-tuned version of Llama 2 70B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF); it performs comparably to ChatGPT according to human evaluations.

StarCoder 7B int8

A 7B parameter model trained on 80+ programming languages from The Stack (v1.2), using Multi Query Attention and the Fill-in-the-Middle objective.

StarCoder 15.5B int8

A 15.5B parameter model trained on 80+ programming languages from The Stack (v1.2), using Multi Query Attention and the Fill-in-the-Middle objective; a Fill-in-the-Middle usage sketch follows below.
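Because the StarCoder models are trained with the Fill-in-the-Middle objective, they can fill a gap between a known prefix and suffix instead of only continuing a prompt. The sketch below reuses the Python client from the quick-start above; the <fim_prefix>/<fim_suffix>/<fim_middle> sentinel tokens are the ones used in StarCoder's training, while the model path "accounts/fireworks/models/starcoder-7b" is an assumption, so check the model page for the exact identifier.

import fireworks.client

fireworks.client.api_key = "your-key"

# Fill-in-the-Middle: give the model the code before and after a gap and
# let it generate the missing middle, using StarCoder's sentinel tokens.
prefix = "def fibonacci(n):\n    a, b = 0, 1\n    "
suffix = "\n    return a"
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# NOTE: the model path below is an assumed placeholder; check the Fireworks
# model catalog for the exact StarCoder identifier.
completion = fireworks.client.Completion.create(
    "accounts/fireworks/models/starcoder-7b",
    fim_prompt,
    max_tokens=64,
)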

falcon-7b

A 7B parameter causal decoder-only model developed by TII, trained on 1,000B tokens of RefinedWeb enhanced with curated corpora. It is optimized for inference and outperforms comparable open models of its size.

Community Open Source Add-ons

We will also support uploading your own add-ons to the platform so that you can share them with everyone. Here are a few example add-ons from the open-source community.

beomi-llama-2-ko-7b

Community add-on for Korean from Beomi

linksoul-chinese-llama-2-7b

Community add-on for Chinese and English from LinkSoul

flagalpha-llama2-chinese-7b-chat

Community add-on for Chinese and English from FlagAlpha

traditional-chinese-qlora-llama2

Community add-on for Traditional Chinese

Llama 2 13B French

Fine-tuned meta-llama/Llama-2-13b-chat-hf to answer French questions in French.

Chinese Llama 2 LoRA 7B

The LoRA version of Chinese-Llama-2, based on Llama-2-7b-hf.

Bleat

Bleat allows you to enable function calling in LLaMA 2 in a similar fashion to OpenAI's implementation for ChatGPT.

Llama2 13B Guanaco QLoRA GGML

A fine-tuned Llama 2 13B model using the OpenAssistant dataset.

llama2-7b-summarize

Community add-on for summarization.
