DeepSeek: DeepSeek V3.1


DeepSeek V3.1 is a large-scale hybrid reasoning model with 671B total parameters (37B activated per token) that can operate in either “thinking” or “non-thinking” mode depending on the prompt template. Building on the DeepSeek-V3 base, it introduces a two-phase long-context training process supporting contexts up to 128K tokens, and it leverages FP8 microscaling for more efficient inference. Users control the reasoning behavior directly through a boolean flag in the chat template.
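
As an illustration of the mode toggle, the sketch below builds prompts for both modes from the same conversation via the Hugging Face chat template. The `thinking` keyword follows the model card's published usage and is an assumption here; other serving stacks may expose the switch differently.

```python
# Minimal sketch of switching between thinking and non-thinking modes,
# assuming the DeepSeek-V3.1 chat template accepts a `thinking` boolean
# (as shown on the Hugging Face model card). Adjust to your serving setup.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")

messages = [{"role": "user", "content": "Explain FP8 microscaling in one paragraph."}]

# thinking=True renders the reasoning ("thinking") prompt; thinking=False
# renders the standard prompt from the same weights and conversation.
thinking_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, thinking=True
)
plain_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, thinking=False
)

print(thinking_prompt)
print(plain_prompt)
```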

The model enhances tool use, code generation, and reasoning efficiency, delivering performance on par with DeepSeek-R1 on challenging benchmarks while offering faster response times. With support for structured tool calling, code agents, and search agents, DeepSeek-V3.1 is well-suited for research, programming, and agent-driven workflows. As the successor to DeepSeek-V3-0324, it demonstrates strong performance across a wide range of tasks.
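
For structured tool calling, a hedged sketch against the OpenAI-compatible DeepSeek API is shown below. The base URL and the `deepseek-chat` model name follow the public API documentation; the `get_weather` tool is a hypothetical example introduced here for illustration, not something defined on this page.

```python
# Sketch of structured tool calling via the OpenAI-compatible DeepSeek API.
# The `get_weather` tool is hypothetical and exists only for illustration.
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, not a real function
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="deepseek-chat",  # non-thinking mode; "deepseek-reasoner" selects thinking mode
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
)

# The model either answers directly or returns a tool call with JSON arguments
# that your code executes before sending the result back in a follow-up turn.
print(response.choices[0].message)
```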



Creator: DeepSeek
Release Date: August 2025
License: MIT
Context Window: 128,000 tokens
Image Input Support: No
Open Source (Weights): Yes
Parameters: 671B total (685B including the multi-token prediction module), 37B active at inference time
Model Weights: deepseek-ai/DeepSeek-V3.1 on Hugging Face
