DeepSeek-LLM
model, paper: First general-purpose 67B model, outperforming Llama 2 at the time, released with a technical report on scaling open-source language models with a long-term vision.
Outputs 2
DeepSeek-LLM
model: First general-purpose 67B model, outperforming Llama 2 at the time.
Architecture: Dense
Parameters: 67B
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
paper: Technical report on scaling open-source language models with a long-term vision.
arXiv: 2401.02954