LightweightMistral

Ministral 3 14B

Name: Ministral 3 14B
Availability: InStock
Author: Mistral

Ministral 3 14B is Mistral's lightweight instruction-following model with tool calling capabilities and a 128K token context window.

Context 128K

Tier Lightweight

Tools Supported

Input from

$0.100 / 1M tokens

across 3 providers

Compare Prices

API Pricing

Cheapest on Amazon AWS — 43% below avg

Provider	Input / 1M	Output / 1M	Cached / 1M	Speed	TTFT	Updated
Amazon AWSBatch	$0.100	$0.100	-	85.9 t/s	431ms	7/13/2026
OpenRouter	$0.200	$0.200	$0.020	85.9 t/s	431ms	7/13/2026
Together AI	$0.200	$0.200	-	85.9 t/s	431ms	7/13/2026
Amazon AWS	$0.200	$0.200	-	85.9 t/s	431ms	7/13/2026

Prices updated daily. Last check: Jul 13, 2026

Performance & Benchmarks

Source: Artificial Analysis →

Intelligence

11.1 / 100

Coding

14.4 / 100

Math

30.0 / 100

Output Speed

85.9 t/s

Latency (TTFT)

431ms

Reasoning & Knowledge

MMLU-Pro69.3%
GPQA Diamond57.2%
Humanity's Last Exam4.6%

Coding

LiveCodeBench35.1%
SciCode23.6%

Math

AIME 202530.0%

Agentic & Tool Use

Terminal-Bench Hard4.5%
Terminal-Bench v2.19.7%
τ²-bench27.2%
τ-bench Banking6.6%

Instruction & Long Context

IFBench32.0%
Long-Context Reasoning22.0%

Benchmarks measured Jul 2026. Scores are independent evaluations, not vendor-reported.

Model Details

General

Creator: Mistral
Family: Ministral
Tier: Lightweight
Context Window: 128K
Modalities: Text

Capabilities

Tool Calling: Yes
Open Source: No
Subtypes: Chat Completion
Aliases: ministral-3-14b-instruct-2512, ministral-14b-2512, Ministral 14B 3.0, Ministral 14B 3

Strengths & Limitations

Strengths

Tool calling support enables integration with external APIs and services
128K token context window handles substantial document processing
Fast inference speed at 297.22 tokens per second output
Low latency with 293ms time to first token
Lightweight architecture reduces computational requirements
Instruction-following design optimized for chat and assistant tasks
Part of Mistral's established model ecosystem

Limitations

Text-only modalities - no image, audio, or video input support
Proprietary model with no open-source weights available
Smaller parameter count may limit complex reasoning compared to frontier models
Newer model family with less extensive real-world testing than established alternatives

Key Features

•128K token context window

•Tool calling with structured output

•Chat completion interface

•Streaming response support

•Fast inference at 297+ tokens per second

•Low-latency response initiation

•API-based access

•Instruction-following optimization

About Ministral 3 14B

Ministral 3 14B is a lightweight model in Mistral's Ministral family, designed for efficient instruction-following tasks. As part of Mistral's model lineup, it occupies the lightweight tier below the company's larger frontier models, offering a balance of capability and computational efficiency. The model features a 128K token context window and supports tool calling functionality, enabling it to interact with external APIs and services. With text-only input and output capabilities, it processes at approximately 297 tokens per second with a time to first token of 293 milliseconds. The model is proprietary and available through API access rather than open-source distribution. Ministral 3 14B serves applications requiring moderate complexity reasoning and automation without the computational overhead of larger models. Its tool calling capabilities make it suitable for building agents and automated workflows, while its lightweight architecture enables cost-effective deployment for high-volume applications.

Common Use Cases

Ministral 3 14B is well-suited for applications requiring efficient automation and moderate reasoning capabilities. Its tool calling functionality makes it effective for building chatbots that need to query databases, retrieve real-time information, or interact with business systems. The 128K context window enables document analysis, customer support with conversation history, and content summarization tasks. Its lightweight architecture and fast inference make it appropriate for high-volume deployments like automated content moderation, simple coding assistance, and structured data extraction where cost efficiency is important but sophisticated reasoning is not required.

Frequently Asked Questions

How much does Ministral 3 14B cost per million tokens?

Ministral 3 14B pricing varies by provider and pricing type (standard vs batch). Check the pricing table above for current rates across all providers.

What is Ministral 3 14B best used for?

Ministral 3 14B excels at instruction-following tasks that require tool integration, such as building automated agents, processing documents within its 128K context window, and creating chatbots that need to query external systems. Its lightweight design makes it cost-effective for high-volume applications where moderate reasoning capability is sufficient.

How does Ministral 3 14B compare to other Mistral models?

Ministral 3 14B sits in the lightweight tier of Mistral's model family, offering faster inference and lower computational requirements than Mistral's larger models. While it may have less sophisticated reasoning capabilities than flagship Mistral models, it provides tool calling functionality and a substantial 128K context window for efficient automation tasks.