Fireworks AI

Inference, Optimization, US
Total Models: 234
Free Models: 229
Paid Models: 5

Fast inference platform

🆓 Free Models (229)

Model | ID | Context | Capabilities
OpenChat 3.5 0106 | openchat-3p5-0106-7b | - | -
Llama 4 Maverick Instruct (Basic) | llama4-maverick-instruct-basic | - | vision, function_calling
Cogito v1 Preview Llama 3B | cogito-v1-preview-llama-3b | - | function_calling
Cogito v1 Preview Llama 8B | cogito-v1-preview-llama-8b | - | function_calling
Cogito v1 Preview Qwen 14B | cogito-v1-preview-qwen-14b | - | function_calling
Cogito v1 Preview Qwen 32B | cogito-v1-preview-qwen-32b | - | function_calling
Cogito v1 Preview Llama 70B | cogito-v1-preview-llama-70b | - | function_calling
Gemma 3 27B Instruct | gemma-3-27b-it | - | -
Qwen3 30B-A3B | qwen3-30b-a3b | - | function_calling
Qwen3 14B | qwen3-14b | - | function_calling
Qwen3 0.6B | qwen3-0p6b | - | function_calling
Qwen3 32B | qwen3-32b | - | function_calling
Qwen3 4B | qwen3-4b | - | function_calling
Qwen3 1.7B | qwen3-1p7b | - | function_calling
DeepSeek Prover V2 | deepseek-prover-v2 | - | -
Qwen2.5 1.5B Instruct | qwen2p5-1p5b-instruct | - | -
Devstral-Small-2505 | devstral-small-2505 | - | -
Dobby Mini Unhinged Plus Llama 3.1 8B | dobby-mini-unhinged-plus-llama-3-1-8b | - | -
DeepSeek R1 0528 Distill Qwen3 8B | deepseek-r1-0528-distill-qwen3-8b | - | function_calling
InternVL3 8B | internvl3-8b | - | vision
InternVL3 38B | internvl3-38b | - | vision
InternVL3 78B | internvl3-78b | - | vision
Rolm OCR | rolm-ocr | - | vision
MiniMax-M1-80k | minimax-m1-80k | - | -
ERNIE-4.5-21B-A3B-PT | ernie-4p5-21b-a3b-pt | - | -
ERNIE-4.5-300B-A47B-PT | ernie-4p5-300b-a47b-pt | - | -
Kimi K2 Instruct | kimi-k2-instruct | - | function_calling
Qwen3 30B A3B Instruct 2507 | qwen3-30b-a3b-instruct-2507 | - | -
GLM-4.5 | glm-4p5 | - | function_calling
Qwen3 30B A3B Thinking 2507 | qwen3-30b-a3b-thinking-2507 | - | function_calling
GLM-4.5-Air | glm-4p5-air | - | function_calling
GLM-4.5V | glm-4p5v | - | vision, function_calling
Qwen3 Coder 480B Instruct BF16 | qwen3-coder-480b-instruct-bf16 | - | -
Qwen3 Next 80B A3B Instruct | qwen3-next-80b-a3b-instruct | - | -
Qwen3 Next 80B A3B Thinking | qwen3-next-80b-a3b-thinking | - | -
NVIDIA Nemotron Nano 12B v2 | nvidia-nemotron-nano-12b-v2 | - | function_calling
NVIDIA Nemotron Nano 9B v2 | nvidia-nemotron-nano-9b-v2 | - | function_calling
Qwen 3 4B Instruct 2507 | qwen3-4b-instruct-2507 | - | -
NVIDIA Nemotron Nano 2 VL | nemotron-nano-v2-12b-vl | - | vision
OpenAI gpt-oss-safeguard-20b | gpt-oss-safeguard-20b | - | function_calling
OpenAI gpt-oss-safeguard-120b | gpt-oss-safeguard-120b | - | function_calling
KAT Dev 72B Exp | kat-dev-72b-exp | - | function_calling
KAT Dev 32B | kat-dev-32b | - | -
FARE-20B | fare-20b | - | -
KAT Coder | kat-coder | - | -
Ministral 3 3B Instruct 2512 | ministral-3-3b-instruct-2512 | - | vision, function_calling
Ministral 3 8B Instruct 2512 | ministral-3-8b-instruct-2512 | - | vision, function_calling
Ministral 3 14B Instruct 2512 | ministral-3-14b-instruct-2512 | - | vision, function_calling
Mistral Large 3 675B Instruct 2512 | mistral-large-3-fp8 | - | vision, function_calling
Qwen3-VL-8B-Instruct | qwen3-vl-8b-instruct | - | vision
Llama 2 13B Chat | llama-v2-13b-chat | - | -
Llama 2 13B | llama-v2-13b | - | -
Llama 2 7B Chat | llama-v2-7b-chat | - | -
Llama 2 7B | llama-v2-7b | - | -
Mistral 7B Instruct V0.1 | mistral-7b-instruct-4k | - | -
Mistral 7B | mistral-7b | - | -
Zephyr 7B Beta | zephyr-7b-beta | - | -
Mixtral 8x7B | mixtral-8x7b | 32K | chat, function_calling
Devstral Small 2 24B Instruct 2512 | devstral-small-2-24b-instruct-2512 | - | vision, function_calling
Seed OSS 36B Instruct | seed-oss-36b-instruct | - | function_calling
Gemma 3 4B Instruct | gemma-3-4b-it | - | -
Gemma 3 12B Instruct | gemma-3-12b-it | - | -
Qwen3 Omni 30B A3B Instruct | qwen3-omni-30b-a3b-instruct | - | vision, function_calling
Molmo2-4B | molmo2-4b | - | vision
Molmo2-8B | molmo2-8b | - | vision
GLM-4.7 Flash | glm-4p7-flash | - | -
Deepseek V3 03-24 | deepseek-v3-0324 | - | function_calling
Qwen2.5-VL 32B Instruct | qwen2p5-vl-32b-instruct | - | vision
Qwen3 8B | qwen3-8b | - | function_calling
DeepSeek V3.1 Terminus | deepseek-v3p1-terminus | - | function_calling
Qwen3 VL 30B A3B Instruct | qwen3-vl-30b-a3b-instruct | - | vision, function_calling
Qwen3 VL 30B A3B Thinking | qwen3-vl-30b-a3b-thinking | - | vision, function_calling
MiniMax-M2 | minimax-m2 | - | function_calling
Cogito 671B v2.1 | cogito-671b-v2-p1 | - | -
DeepSeek R1 (Fast) | deepseek-r1 | - | -
Qwen3 Coder 30B A3B Instruct | qwen3-coder-30b-a3b-instruct | - | -
Deepseek R1 05/28 | deepseek-r1-0528 | - | function_calling
Llama 3.3 70B Instruct | llama-v3p3-70b-instruct | - | -
Qwen3 235B A22B Instruct 2507 | qwen3-235b-a22b-instruct-2507 | - | function_calling
Qwen3 Coder 480B A35B Instruct | qwen3-coder-480b-a35b-instruct | - | function_calling
Qwen3 235B A22B Thinking 2507 | qwen3-235b-a22b-thinking-2507 | - | -
GLM-4.6 | glm-4p6 | - | function_calling
Kimi K2 Thinking | kimi-k2-thinking | - | function_calling
Qwen3 235B A22B | qwen3-235b-a22b | - | function_calling
OpenAI gpt-oss-20b | gpt-oss-20b | - | -
OpenAI gpt-oss-120b | gpt-oss-120b | - | function_calling
DeepSeek V3.1 | deepseek-v3p1 | - | function_calling
Kimi K2 Instruct 0905 | kimi-k2-instruct-0905 | - | function_calling
Qwen3 VL 235B A22B Instruct | qwen3-vl-235b-a22b-instruct | - | vision, function_calling
Qwen3 VL 235B A22B Thinking | qwen3-vl-235b-a22b-thinking | - | vision, function_calling
Deepseek v3.2 | deepseek-v3p2 | - | function_calling
GLM-4.7 | glm-4p7 | - | function_calling
MiniMax-M2.1 | minimax-m2p1 | - | function_calling
Kimi K2.5 | kimi-k2p5 | - | vision, function_calling
Llama Guard v3 1B | llama-guard-3-1b | - | -
Qwen2.5-Coder 7B Instruct | qwen2p5-coder-7b-instruct | - | -
Qwen2.5-Coder 7B | qwen2p5-coder-7b | - | -
Qwen2.5-Coder 1.5B | qwen2p5-coder-1p5b | - | -
Qwen2.5-Coder 1.5B Instruct | qwen2p5-coder-1p5b-instruct | - | -
Qwen2.5 72B Instruct | qwen2p5-72b-instruct | - | function_calling
Qwen2.5 72B | qwen2p5-72b | - | -
Qwen2.5 32B Instruct | qwen2p5-32b-instruct | - | -
Qwen2.5 32B | qwen2p5-32b | - | -
Qwen2.5 14B Instruct | qwen2p5-14b-instruct | - | -
Qwen2.5 14B | qwen2p5-14b | - | -
Qwen2.5 7B | qwen2p5-7b | - | -
Qwen2.5 7B Instruct | qwen2p5-7b-instruct | - | -
Llama 3.1 70B Instruct 1B | llama-v3p1-70b-instruct-1b | - | -
Qwen2.5-Math 72B Instruct | qwen2p5-math-72b-instruct | - | -
Llama 3.1 Nemotron 70B | llama-v3p1-nemotron-70b-instruct | - | -
FLUX.1 [schnell] | flux-1-schnell | - | -
DeepSeek V2 Lite Chat | deepseek-v2-lite-chat | - | -
Llama Guard 3 8B | llama-guard-3-8b | - | -
Qwen2.5-Coder 0.5B Instruct | qwen2p5-coder-0p5b-instruct | - | -
Qwen2.5-Coder 3B Instruct | qwen2p5-coder-3b-instruct | - | -
Qwen2.5-Coder 14B Instruct | qwen2p5-coder-14b-instruct | - | -
Qwen2.5-Coder 14B | qwen2p5-coder-14b | - | -
Qwen2.5-Coder 0.5B | qwen2p5-coder-0p5b | - | -
Qwen2.5-Coder 3B | qwen2p5-coder-3b | - | -
Qwen2.5-Coder 32B Instruct 64k | qwen2p5-coder-32b-instruct-64k | - | -
Qwen2.5-Coder 32B Instruct 32K RoPE | qwen2p5-coder-32b-instruct-32k-rope | - | -
Qwen2.5-Coder 32B Instruct 128K | qwen2p5-coder-32b-instruct-128k | - | -
Qwen2.5-Coder 32B Instruct | qwen2p5-coder-32b-instruct | - | -
NVIDIA Nemotron Nano 3 30B A3B | nemotron-nano-3-30b-a3b | - | function_calling
Qwen2.5-Coder 32B | qwen2p5-coder-32b | - | -
Llama 3 8B | llama-v3-8b | - | -
Qwen QWQ 32B Preview | qwen-qwq-32b-preview | - | -
Qwen2-VL 2B Instruct | qwen2-vl-2b-instruct | - | vision
Qwen2-VL 7B Instruct | qwen2-vl-7b-instruct | - | vision
Qwen2-VL 72B Instruct | qwen2-vl-72b-instruct | - | vision
Firesearch OCR V6 | firesearch-ocr-v6 | - | vision
Qwen2.5 0.5B Instruct | qwen2p5-0p5b-instruct | - | -
DeepSeek V3 | deepseek-v3 | 64K | chat, function_calling
Code Llama 70B Python | code-llama-70b-python | - | -
Nous Capybara 7B V1.9 | nous-capybara-7b-v1p9 | - | -
Gemma 7B | gemma-7b | - | -
Nous Hermes Llama2 70B | nous-hermes-llama2-70b | - | -
Phind CodeLlama 34B Python v1 | phind-code-llama-34b-python-v1 | - | -
Nous Hermes 2 Mixtral 8x7B DPO | nous-hermes-2-mixtral-8x7b-dpo | - | -
Phind CodeLlama 34B v2 | phind-code-llama-34b-v2 | - | -
Phind CodeLlama 34B v1 | phind-code-llama-34b-v1 | - | -
Snorkel Mistral PairRM DPO | snorkel-mistral-7b-pairrm-dpo | - | -
Pythia 12B | pythia-12b | - | -
Hermes 2 Pro Mistral 7B | hermes-2-pro-mistral-7b | - | function_calling
DeepSeek Coder 7B Base | deepseek-coder-7b-base | - | -
Mistral 7B v0.2 | mistral-7b-v0p2 | - | -
Mixtral 8x22B | mixtral-8x22b | 65K | chat, function_calling
Qwen3 VL 32B Instruct | qwen3-vl-32b-instruct | - | vision
Mixtral MoE 8x7B Instruct | mixtral-8x7b-instruct | - | -
Llama Guard 7B | llamaguard-7b | - | -
Llama 2 70B | llama-v2-70b | - | -
Mixtral MoE 8x7B Instruct (HF version) | mixtral-8x7b-instruct-hf | - | -
MythoMax L2 13B | mythomax-l2-13b | - | -
FireFunction V1 | firefunction-v1 | - | function_calling
Gemma 7B Instruct | gemma-7b-it | - | -
Mistral 7B OpenOrca | openorca-7b | - | -
Qwen1.5 72B Chat | qwen1p5-72b-chat | - | -
DeepSeek Coder 33B Instruct | deepseek-coder-33b-instruct | - | -
DeepSeek Coder 7B Instruct v1.5 | deepseek-coder-7b-instruct-v1p5 | - | -
DeepSeek Coder 7B Base v1.5 | deepseek-coder-7b-base-v1p5 | - | -
Chronos Hermes 13B v2 | chronos-hermes-13b-v2 | - | -
Nous Hermes Llama2 13B | nous-hermes-llama2-13b | - | -
Mistral 7B Instruct v0.2 | mistral-7b-instruct-v0p2 | - | -
Toppy M 7B | toppy-m-7b | - | -
Nous Hermes Llama2 7B | nous-hermes-llama2-7b | - | -
Dolphin 2.6 Mixtral 8x7b | dolphin-2p6-mixtral-8x7b | - | -
OpenHermes 2 Mistral 7B | openhermes-2-mistral-7b | - | -
OpenHermes 2.5 Mistral 7B | openhermes-2p5-mistral-7b | - | -
Code Llama 7B | code-llama-7b | - | -
Code Llama 7B Instruct | code-llama-7b-instruct | - | -
Code Llama 13B | code-llama-13b | - | -
Code Llama 13B Python | code-llama-13b-python | - | -
Code Llama 13B Instruct | code-llama-13b-instruct | - | -
Code Llama 34B Python | code-llama-34b-python | - | -
Code Llama 34B | code-llama-34b | - | -
Code Llama 70B | code-llama-70b | - | -
Code Llama 34B Instruct | code-llama-34b-instruct | - | -
Code Llama 70B Instruct | code-llama-70b-instruct | - | -
Mixtral MoE 8x22B Instruct | mixtral-8x22b-instruct | - | function_calling
Llama 3 70B Instruct | llama-v3-70b-instruct | - | -
Llama 3 8B Instruct | llama-v3-8b-instruct | - | -
Llama Guard v2 8B | llama-guard-2-8b | - | -
Llama 3 8B Instruct (HF version) | llama-v3-8b-instruct-hf | - | -
Llama 3 70B Instruct (HF version) | llama-v3-70b-instruct-hf | - | -
Gemma 2B Instruct | gemma-2b-it | - | -
Phi-3 Mini 128k Instruct | phi-3-mini-128k-instruct | - | -
Phi-3.5 Vision Instruct | phi-3-vision-128k-instruct | - | vision
Mistral 7B Instruct v0.3 | mistral-7b-instruct-v3 | - | function_calling
Qwen2 72B Instruct | qwen2-72b-instruct | - | -
Qwen2 7B Instruct | qwen2-7b-instruct | - | -
Dolphin 2.9.2 Qwen2 72B | dolphin-2-9-2-qwen2-72b | - | -
DeepSeek Coder 1.3B Base | deepseek-coder-1b-base | - | -
CodeQwen 1.5 7B | code-qwen-1p5-7b | - | -
CodeGemma 2B | codegemma-2b | - | -
CodeGemma 7B | codegemma-7b | - | -
Gemma 2 9B Instruct | gemma2-9b-it | - | -
DeepSeek Coder V2 Lite Instruct | deepseek-coder-v2-lite-instruct | - | -
DeepSeek Coder V2 Lite Base | deepseek-coder-v2-lite-base | - | -
DeepSeek Coder V2 Instruct | deepseek-coder-v2-instruct | - | -
Llama 3.1 70B Instruct | llama-v3p1-70b-instruct | - | function_calling
Mistral Nemo Base 2407 | mistral-nemo-base-2407 | - | -
Llama 3.1 405B Instruct | llama-v3p1-405b-instruct | - | function_calling
Mistral Nemo Instruct 2407 | mistral-nemo-instruct-2407 | - | -
Llama 3.1 8B Instruct | llama-v3p1-8b-instruct | - | -
FireFunction V2 | firefunction-v2 | - | function_calling
Llama 3.1 405B Instruct Long | llama-v3p1-405b-instruct-long | - | -
DeepSeek V2.5 | deepseek-v2p5 | - | -
Llama 3.2 1B Instruct | llama-v3p2-1b-instruct | - | -
Llama 3.2 3B Instruct | llama-v3p2-3b-instruct | - | -
Qwen2.5 7B | qwen-v2p5-7b | - | -
Qwen2.5 14B Instruct | qwen-v2p5-14b-instruct | - | -
Llama 3.2 90B Vision Instruct | llama-v3p2-90b-vision-instruct | - | vision
Llama 3.2 11B Vision Instruct | llama-v3p2-11b-vision-instruct | - | vision
Llama 3.2 1B | llama-v3p2-1b | - | -
Llama 3.2 3B | llama-v3p2-3b | - | -
DeepSeek R1 Distill Llama 70B | deepseek-r1-distill-llama-70b | - | -
DeepSeek R1 Distill Qwen 32B | deepseek-r1-distill-qwen-32b | - | -
DeepSeek R1 Distill Qwen 1.5B | deepseek-r1-distill-qwen-1p5b | - | -
DeepSeek R1 Distill Qwen 7B | deepseek-r1-distill-qwen-7b | - | -
DeepSeek R1 Distill Llama 8B | deepseek-r1-distill-llama-8b | - | -
DeepSeek R1 Distill Qwen 14B | deepseek-r1-distill-qwen-14b | - | -
Mistral Small 24B Instruct 2501 | mistral-small-24b-instruct-2501 | - | -
Dobby-Unhinged-Llama-3.3-70B | dobby-unhinged-llama-3-3-70b-new | - | -
QWQ 32B | qwq-32b | - | -
DeepSeek R1 (Basic) | deepseek-r1-basic | - | -
Qwen2.5-VL 3B Instruct | qwen2p5-vl-3b-instruct | - | vision
Qwen2.5-VL 7B Instruct | qwen2p5-vl-7b-instruct | - | vision
Qwen2.5-VL 72B Instruct | qwen2p5-vl-72b-instruct | - | vision
Llama 4 Scout Instruct (Basic) | llama4-scout-instruct-basic | - | vision, function_calling

💰 Paid Models (5)

Model | ID | Input/1M | Output/1M | Context | Capabilities
Llama 3.1 8B | llama-3-1-8b | $0.200 | $0.200 | 131K | chat, function_calling
Llama 4 Maverick | llama-4-maverick | $0.220 | $0.880 | 1.0M | chat, function_calling
Qwen3 235B | qwen3-235b | $0.220 | $0.880 | 256K | chat, function_calling
Llama 3.3 70B | llama-3-3-70b | $0.900 | $0.900 | 131K | chat, function_calling
Llama 3.1 405B | llama-3-1-405b | $3.00 | $3.00 | 131K | chat, function_calling
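
The Input/1M and Output/1M prices for the paid models can be turned into a rough per-request cost estimate. The sketch below is illustrative only: it assumes the prices are US dollars per one million tokens, exactly as listed on this page, and the names `PRICES_PER_MILLION` and `estimate_cost` are hypothetical helpers, not part of any Fireworks API. Actual billing may differ.

```python
# Prices in USD per 1M tokens (input, output), copied from the paid-models
# listing on this page. Assumption: "Input/1M" means per one million tokens.
PRICES_PER_MILLION = {
    "llama-3-1-8b": (0.20, 0.20),
    "llama-4-maverick": (0.22, 0.88),
    "qwen3-235b": (0.22, 0.88),
    "llama-3-3-70b": (0.90, 0.90),
    "llama-3-1-405b": (3.00, 3.00),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model_id]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # 10,000 prompt tokens and 2,000 completion tokens on Llama 4 Maverick.
    cost = estimate_cost("llama-4-maverick", 10_000, 2_000)
    print(f"${cost:.5f}")  # prints "$0.00396"
```

Because input and output tokens are priced separately for some models, long completions on Llama 4 Maverick or Qwen3 235B cost four times as much per token as the prompt does.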