Timeline | Lab Index

Date	Name	Lab	Type	Stars	Downloads	Citations	Params	Active	Tokens	Context	Intel	Open	Questions	Tasks
2026-06-30	★ LongCat-2.0	Meituan	model				1.6T	48B		1M
2026-06-30	Meituan Releases LongCat-2.0 (1.6T MoE) — Trained Entirely on Domestic Chinese Chips, No NVIDIA Reuters	Meituan	news
2026-06-29	Smooth Scaling Laws Hide Stepwise Token Learning	Xiaohongshu	paper
2026-06-22	AudioCALM: Continuous Autoregressive Language Modeling for Universal Audio Generation	Alibaba	paper
2026-06-22	Reinforcement Learning Towards Broadly and Persistently Beneficial Models	OpenAI	paper
2026-06-22	SpaceX Inks Compute Deal with Reflection AI for Colossus 2 (Up to ~$6.3B Through 2029) TechCrunch	SpaceX	news
2026-06-18	Laguna M.1 Released as Open Weights (Apache 2.0); Base and Post-Trained Checkpoints on Hugging Face Hugging Face	Poolside	news
2026-06-17	Xiaohongshu plans for Hong Kong IPO by year-end, targets US$70b valuation The Standard (via WSJ)	Xiaohongshu	news
2026-06-16	SpaceX Agrees to Acquire Cursor for $60B All-Stock, Days After IPO (Pending Regulatory Approval) TechCrunch	SpaceX	news
2026-06-16	Z.ai GLM-5.2 Tops the AA Intelligence Index as the Highest-Scoring Open Model (51); MIT Open Weights Released Crypto Briefing	Z.ai	news
2026-06-15	HCLTech to Buy 10.5% Stake in Sarvam AI for ~$150M (₹14.27B), Valuing It at $1.5B — Series B First Close ($234M Raised), Unicorn Status Reuters	Sarvam	news
2026-06-13	★ GLM-5.2	Z.ai	model				753B			1M	51
2026-06-12	Kimi Code CLI	Moonshot AI	library
2026-06-12	★ Kimi K2.7-Code	Moonshot AI	model				1T	32B		262.14K
2026-06-11	MiniMax Sparse Attention (MSA)	MiniMax	paper
2026-06-11	NexAU (Agent Universe)	Nex-AGI	library
2026-06-11	Zonos2 (ZONOS2)	Zyphra	model
2026-06-11	SpaceX Prices Largest IPO Ever at $135/Share; Trades on Nasdaq as SPCX TechCrunch	SpaceX	news
2026-06-10	MiMoCode	Xiaomi	library
2026-06-09	★ Claude Fable 5	Anthropic	model								60	11/100
2026-06-09	★ Claude Mythos 5	Anthropic	model									11/100
2026-06-09	North Mini Code	Cohere	model				30B	3B		262.14K	21
2026-06-09	★ DiffusionGemma	Google	model				25.2B	3.8B		262.14K
2026-06-09	★ Nex-N2	Nex-AGI	model				397B	17B
2026-06-09	Cohere Releases North Mini Code — Its First Developer-Focused Model, a 30B-A3B Apache-2.0 Agentic Coder Cohere	Cohere	news
2026-06-08	OpenAI Confidentially Files for IPO (Last Valued at $852B), One Week After Anthropic TechCrunch	OpenAI	news
2026-06-08	Xiaomi MiMo-V2.5-Pro-UltraSpeed Breaks 1000 Tokens/s on a 1T-Parameter Model (with TileRT) Xiaomi MiMo	Xiaomi	news
2026-06-05	DaX (大象)	Alibaba	paper
2026-06-05	★ Chronos-2	Amazon	model	5.45K	12.42M	1	120M			8.19K
2026-06-04	BioMysteryBench	Anthropic	eval										99
2026-06-04	OPI-Struc (STELLA)	BAAI	dataset			1
2026-06-04	★ Nemotron 3 Ultra	NVIDIA	model	1.35K			550B	55B		1M	38
2026-06-04	Nemotron 3.5 Content Safety	NVIDIA	model		5.91K		4B
2026-06-04	SciCore-Omics	OpenBMB	model	8	238		8B
2026-06-04	dots.tts 2	Xiaohongshu	model paper				2B
2026-06-03	Gemma 4 12B	Google	model		816.16K		11.95B			262.14K
2026-06-03	ChartNet	IBM	dataset
2026-06-03	OfficeComprehensionBenchmark	Microsoft	eval	3									1.04K	2
2026-06-03	RHELM	Microsoft	eval										1.3K	7
2026-06-02	MAI-Code-1-Flash	Microsoft	model				137B			262.14K
2026-06-02	★ MAI-Thinking-1	Microsoft	model				1T	35B	30T	262.14K
2026-06-02	Zamba2-VL (Vision-Language)	Zyphra	model				7B
2026-06-02	Build 2026: Seven MAI Models Launched — MAI-Thinking-1 (1T/35B Reasoning), MAI-Code-1-Flash, Multimodal Stack Refresh Microsoft AI	Microsoft	news
2026-06-02	Zamba2-VL Released: Hybrid SSM Vision-Language Models (1.2B / 2.7B / 7B) Zyphra	Zyphra	news
2026-06-01	★ MiniMax-M3	MiniMax	model				428B	23B		1.05M	44
2026-06-01	Anthropic Confidentially Files for IPO at ~$965B Valuation, First Among AI Labs Fortune	Anthropic	news
2026-06-01	Nemotron 3 Ultra Announced at Computex Taipei — 550B/55B MoE, AAII 48, Ships June 4 on HuggingFace Artificial Analysis	NVIDIA	news
2026-05-31	HakushoBench	NII	eval	3									2.05K
2026-05-29	SchGen	Microsoft	model		15		20B			13.31K
2026-05-29	Universal Audio Tokenizer	Tencent	model	4
2026-05-29	Trinity Is Moving to OpenMDW-1.1 Arcee AI Blog	Arcee	news
2026-05-28	★ Claude Opus 4.8	Anthropic	model								56	11/100
2026-05-28	Ultra-FineWeb-L3	OpenBMB	dataset
2026-05-28	★ Step-3.7-Flash	StepFun	model		50.19K		198B	11B		262.14K
2026-05-28	Autonomous Agentic Data Engineering	Tencent	paper
2026-05-28	ByteDance Developing Custom CPU Chips to Support AI Rollout; Pursuing Both Arm and RISC-V Tracks Reuters	ByteDance	news
2026-05-28	Microsoft to Unveil Homegrown Coding Model + Image / Reasoning / Speech / Transcription Suite at Build 2026 The Information	Microsoft	news
2026-05-28	Mistral Chases AI Superintelligence to Counter U.S. Dominance WSJ	Mistral	news
2026-05-28	SKT Launches A.Biz Cowork Internal AI Agent (Beta) and AXMS 1.5 Platform Upgrade Seoul Economic Daily	SK Telecom	news
2026-05-27	Sci-Base	PJLab	dataset
2026-05-26	Granite Guardian 4.1	IBM	model		1.17K		8B
2026-05-26	The MiniMax-M2 Series: Technical Report	MiniMax	paper
2026-05-26	LocateAnything-3B	NVIDIA	model		131.79K		3B
2026-05-26	DeepMind CEO Demis Hassabis: Humanity Has 'a Few Years' to Prepare for AGI; 'Foothills of the Singularity' Axios	Google	news
2026-05-25	MiniCPM5-1B	OpenBMB	model		137.34K		1.08B			131.07K
2026-05-24	Granite Switch 4.1	IBM	model	76	1.1K		30B			131.07K
2026-05-23	ScaleAcross Explorer	Meta	paper
2026-05-23	Nemotron-Labs Diffusion	NVIDIA	model		12.46K		14B
2026-05-23	BitCPM-CANN	OpenBMB	model	9.42K	7.17K		8B
2026-05-22	★ Intern-S2-Preview	PJLab	model	809	6.73K		36B			131.07K
2026-05-22	Liang Wenfeng Commits to AGI Mission Over Near-term Commercialization as ¥70B (~$10B) Round Advances Bloomberg	DeepSeek	news
2026-05-21	Hunyuan Model Matrix Refresh: TurboS, T1, T1-Vision, and Hunyuan Voice (Top-8 Chatbot Arena, +50% T1-Vision Speed) KrASIA	Tencent	news
2026-05-21	Tencent Open-Sources Hy-MT2 Translation Family (1.8B / 7B / 30B-A3B) + IFMTBench Tencent Hunyuan	Tencent	news
2026-05-20	★ Qwen3.7-Max-Preview	Alibaba	model							1M	46
2026-05-20	★ Command A+	Cohere	model		113.99K		218B	25B		128K	29	39/100
2026-05-20	Lens	Microsoft	model	234	3.93K		3.8B
2026-05-20	Qwen3.7-Max-Preview Unveiled at Alibaba Cloud Summit — AA Intelligence Index 57 (#1 Among Chinese Labs) Qwen	Alibaba	news
2026-05-20	Cohere Releases Command A+ — First Full Apache-2.0 Open Model with Lossless 4-bit Quantization and Native Citations VentureBeat	Cohere	news
2026-05-20	Kakao Partners with Google DeepMind on SynthID for Kanana Models — First Asian Firm to Adopt Seoul Economic Daily	Kakao	news
2026-05-20	Q1 FY27 Earnings: $81.6B Revenue (+85% YoY), $80B Buyback Authorized, Dividend Raised 25x to $0.25 NVIDIA	NVIDIA	news
2026-05-19	Antigravity 2.0	Google	model
2026-05-19	★ Gemini 3.5 Flash	Google	model							1M	50	6/100
2026-05-19	Gemini Omni	Google	model
2026-05-19	Anthropic Hires Andrej Karpathy to Pre-training Team; Will Lead Sub-team Using Claude to Accelerate Pretraining Research CNBC	Anthropic	news
2026-05-19	Hitachi Deploys Claude to ~290,000 Employees; Embeds in Lumada 3.0 / HMAX for Critical Infrastructure Hitachi	Anthropic	news
2026-05-19	I/O 2026: Gemini 3.5 Flash (AAII 55), Gemini Omni Video Generator, Antigravity 2.0, Gemini Spark, AI Ultra Repriced $200/mo Google	Google	news
2026-05-19	Four years after ChatGPT, Xiaohongshu's AI restraint gives way to urgency KrASIA	Xiaohongshu	news
2026-05-18	First Vera CPU Deliveries to Anthropic, OpenAI, SpaceXAI, and Oracle NVIDIA	NVIDIA	news
2026-05-16	Full Attention Strikes Back (RTPurbo)	Alibaba	paper
2026-05-15	Grok Build Launches — Agentic Coding CLI Competing with Claude Code, Codex, and Antigravity Engadget	SpaceX	news
2026-05-14	TWN: Think When Needed	Alibaba	paper
2026-05-14	Realtime Voice API GA + gpt-realtime-2 Family (3 New Audio Models) OpenAI	OpenAI	news
2026-05-14	HCLTech to Anchor $300M Sarvam Round at $1.5B; Bessemer +$50M; NVIDIA, Prosperity7 Participating Outlook Business	Sarvam	news
2026-05-14	SKT × Korean Defense Ministry Sign MOU on Applying Sovereign AI Foundation Model to Defense SK Telecom	SK Telecom	news
2026-05-14	SpaceXAI Division Bleeding Researchers Since Merger; 11+ to Meta, 7+ to Thinking Machines TechCrunch	SpaceX	news
2026-05-13	Granite Embedding Multilingual R2	IBM	paper
2026-05-13	NexRL	Nex-AGI	library
2026-05-12	Kuaishou Plans to Spin Off Kling AI Video Unit at $20B Valuation; Tencent in Talks for $2B Pre-IPO Round The Information	Kuaishou	news
2026-05-11	MiniCPM-V 4.6	OpenBMB	model		615.51K		1.3B			262.14K	7
2026-05-11	DeepSeek First External Funding Round Reportedly Near Close at $45–50B Valuation, Led by China's 'Big Fund III' SCMP	DeepSeek	news
2026-05-11	Zyphra Announces 15 MW of AMD Instinct MI355X GPU Capacity for Zyphra Cloud Memeburn	Zyphra	news
2026-05-09	★ ERNIE 5.1	Baidu	model
2026-05-09	Step-Audio-R1.1 (Realtime) Tops Big Bench Audio at 96.4%, Surpassing Grok Voice Agent Artificial Analysis	StepFun	news
2026-05-07	Cola DLM	ByteDance	paper
2026-05-07	AI Co-Mathematician	Google	paper
2026-05-07	ZAYA1-74B-Preview	Zyphra	model				74B	4B	15T	262.14K
2026-05-07	OMAI Compute Cluster Goes Live — $152M NSF + Blackwell-Ultra Infrastructure for Open Science AI Ai2	Ai2	news
2026-05-07	Kakao Announces Kanana 2.5 — 150B Agent-Focused LLM at Q1 Earnings Call Korea Herald	Kakao	news
2026-05-07	Kimi Chatbot Maker Moonshot AI Valued at $20 Billion in Meituan-Led Round Bloomberg	Meituan	news
2026-05-07	Kimi Chatbot Maker Moonshot AI Valued at $20 Billion in Meituan-Led Round Bloomberg	Moonshot AI	news
2026-05-07	ZAYA1-74B-Preview: Scaling Pretraining on AMD (74B/4B MoE) Zyphra	Zyphra	news
2026-05-06	★ ZAYA1-8B	Zyphra	model				8B	700M
2026-05-06	DeepSeek in Talks for First-Ever Outside Round at $45B; Tencent + Big Fund III in Lead Group TechCrunch	DeepSeek	news
2026-05-05	TRIBE v2 (Brain Activity Foundation Model)	Meta	paper			1
2026-05-05	iOS 27 to Let Users Swap in Claude, Gemini, and Others as Default Apple Intelligence Model Bloomberg	Apple	news
2026-05-05	GPT-5.5 Instant Becomes Default ChatGPT Model; 52.5% Fewer Hallucinated Claims vs 5.3 Instant OpenAI	OpenAI	news
2026-05-04	Horizon Length in LLM Agent Training	Microsoft	paper
2026-05-04	Zyphra Launches Zyphra Cloud & Zyphra Inference — Serverless Inference for Open Models, AMD-First Zyphra	Zyphra	news
2026-05-03	Korea's National Growth Fund and SIF Approve KRW 560B (~$400M) Direct Equity in Upstage — First Software Co. Recipient Seoul Economic Daily	Upstage	news
2026-05-01	Huawei's AI Chip Gains Ground as DeepSeek and Others Shift Away from Nvidia Financial Times	Huawei	news
2026-04-30	OlmPool: Cracks in the Foundation	Ai2	paper	7
2026-04-30	SenseTime Is Running Its New Model on Chinese Chips WIRED	SenseTime	news
2026-04-30	xAI Launches Grok 4.3 with Improved Agentic Performance and Lower Pricing Artificial Analysis	SpaceX	news
2026-04-29	★ Granite 4.1	IBM	model	152	619.44K		30B		15T	512K	9	61/100
2026-04-29	★ Mistral Medium 3.5	Mistral	model		400.07K		128B			256K	30	33/100
2026-04-29	Granite 4.1 Released: 3B/8B/30B Dense Models, 512K Context, 8B Matches Prior 32B MoE IBM Research	IBM	news
2026-04-28	★ Laguna M.1	Poolside	model				225.8B	23.4B	30T	262.14K
2026-04-28	Laguna XS.2	Poolside	model				33.4B	3B	30T	262.14K
2026-04-28	★ MiMo-V2.5-Pro	Xiaomi	model		72.22K		1.02T	42B		1M	42	39/100
2026-04-28	Poolside Launches Laguna XS.2 (Open, Apache 2.0) and Laguna M.1 Agentic Coding Models Poolside	Poolside	news
2026-04-24	★ DeepSeek-V4 3	DeepSeek	model paper		6.84M		1.6T	49B	33T	1M	44	50/100
2026-04-24	Cohere Completes Merger with Germany's Aleph Alpha, Creating Transatlantic AI Champion Financial Times	Cohere	news
2026-04-24	DeepSeek-V4 Released: 1.6T/49B MoE, First Frontier Model Trained Entirely on Huawei Ascend 950PR, MIT License DeepSeek	DeepSeek	news
2026-04-23	Sapiens2	Meta	model				5B
2026-04-23	★ GPT-5.5	OpenAI	model				~9.7T			1M	55	6/100
2026-04-23	★ Hy3 Preview	Tencent	model	373	82.33K		295B	21B		256K	34	33/100
2026-04-22	★ Qwen3.6 Open-Weight Models 2	Alibaba	model	3.54K	10.41M		35B	3B		262.14K
2026-04-22	★ LLaDA 2.0-Uni	Ant Group	model		7.22K		16B
2026-04-22	Tencent and Alibaba in Talks to Invest in DeepSeek at $20B+ Valuation — First External Funding Bloomberg	DeepSeek	news
2026-04-22	Cloud Next '26: TPU 8t/8i Announced, Deep Research Max Agents, Chrome Auto Browse, Thinking Machines Lab Multi-Billion Deal 9to5Google	Google	news
2026-04-21	SpaceX Strikes Deal for Right to Acquire Cursor for $60B Bloomberg	SpaceX	news
2026-04-20	BAR: Branch-Adapt-Route	Ai2	paper	150
2026-04-20	★ Kimi K2.6	Moonshot AI	model				1T	32B		262.14K	43	33/100
2026-04-20	Qwen3.6-Max-Preview: Alibaba's Most Powerful Model, #1 on Six Coding Benchmarks (AA Intelligence Index 52) Qwen	Alibaba	news
2026-04-20	Amazon Invests $5B More (Total $13B); Anthropic Commits $100B+ AWS Spend Over 10 Years, Secures Up to 5GW Compute Anthropic	Anthropic	news
2026-04-17	★ Grok 4.3	SpaceX	model								38
2026-04-16	★ Claude Opus 4.7	Anthropic	model				~4T				54	11/100
2026-04-16	LeapAlign: Post-Training Flow Matching Models at Any Generation Step	ByteDance	paper
2026-04-16	Prefill-as-a-Service: Cross-Datacenter KVCache for Next-Generation Models	Moonshot AI	paper
2026-04-16	Claude Opus 4.7 Released Anthropic	Anthropic	news
2026-04-16	ByteDance Recruits DeepSeek R1 Lead Author Daya Guo for Seed Agent Team SCMP	ByteDance	news
2026-04-16	DeepSeek R1 Lead Author Daya Guo Joins ByteDance Seed Amid Intensifying AI Talent War SCMP	DeepSeek	news
2026-04-16	DeepSeek V4 Imminent — 1T-Parameter MoE to Run Solely on Huawei Ascend 950PR Chips Dataconomy	DeepSeek	news
2026-04-16	How France's Mistral Built a $14 Billion AI Empire by Not Being American Forbes	Mistral	news
2026-04-16	GPT-Rosalind Launched for Life Sciences Drug Discovery OpenAI	OpenAI	news
2026-04-15	Revenue Run Rate Hits $30B; VCs Offer Up to $800B Valuation Axios	Anthropic	news
2026-04-15	Upstage Becomes Korea's First Generative AI Unicorn with $126M Series C Seoul Economic Daily	Upstage	news
2026-04-14	Lightning OPD: Efficient Post-Training for Large Reasoning Models	NVIDIA	paper
2026-04-14	Gemini Robotics-ER 1.6 Launched; Boston Dynamics Partnership for Industrial AI Boston Dynamics	Google	news
2026-04-14	NAACP Sues xAI Over Memphis Colossus Data Center Pollution CNBC	SpaceX	news
2026-04-13	OpenAI Touts Amazon Alliance, Says Microsoft Has 'Limited Our Ability' to Reach Enterprise CNBC	OpenAI	news
2026-04-13	StepFun Unwinding Offshore Structure to Pave Way for HK IPO at Up to $10B Reuters	StepFun	news
2026-04-12	SoftBank/NEC/Honda/Sony Form JV for Trillion-Parameter Physical AI Model; $6.3B Government Backing Nikkei Asia	SB Intuitions	news
2026-04-10	Nexus: Common Minima for Better Generalization	ByteDance	paper
2026-04-10	Alibaba Token Hub Created: 5 AI Units Consolidated Under CEO Eddie Wu; RMB 380B ($53B) 3-Year Commitment SCMP	Alibaba	news
2026-04-10	Cohere in Advanced Merger Talks with Germany's Aleph Alpha Reuters	Cohere	news
2026-04-10	SK Telecom Partners with Rebellions and Arm for Sovereign AI Inference Infrastructure Rebellions	SK Telecom	news
2026-04-10	xAI Spending Pushed SpaceX to Nearly $5B Loss; CFO Anthony Armstrong Departs The Information	SpaceX	news
2026-04-09	Metis: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models	Alibaba	paper
2026-04-09	HiFloat4 Format for LLM Pre-training on Ascend NPUs	Huawei	paper
2026-04-09	★ EXAONE 4.5	LG	model				33B			256K
2026-04-09	Efficient RL Training for LLMs with Experience Replay	Meta	paper
2026-04-09	EXAONE 4.5 Released — LG's First Open-Weight Vision-Language Model Korea Herald	LG	news
2026-04-09	Naver Shuts Down Clova X Chatbot; Pivots to Vertical AI Integrated into Search, Shopping, Finance Seoul Economic Daily	Naver	news
2026-04-08	★ Muse Spark	Meta	model							260K	43
2026-04-08	Muse Spark Unveiled — First Model from Superintelligence Labs (Proprietary) Bloomberg	Meta	news
2026-04-08	Zhipu Hikes Prices Again as China AI Monetization Wave Quickens Bloomberg	Z.ai	news
2026-04-07	★ Harrier	Microsoft	model		387.22K		27B (max)
2026-04-07	★ GLM-5.1	Z.ai	model	3.39K	123.47K		754B				40	44/100
2026-04-07	Claude Mythos Withheld from Public Release; Project Glasswing Cybersecurity Consortium Launched with Apple and Google Fortune	Anthropic	news
2026-04-07	Ascend 950PR AI Chip in Production; 750K Units Planned for 2026; Alibaba, ByteDance, Tencent Place Massive Orders TrendForce	Huawei	news
2026-04-07	GLM-5.1 Open-Source Release Scores #3 on Code Arena (1530 Elo); Stock Surges 19% BuildFastWithAI	Z.ai	news
2026-04-06	AI Agent Traps	Google	paper
2026-04-06	MedGemma 1.5	Google	model	1.51K	416.35K		4B
2026-04-06	Multi-GW Compute Partnership Expansion with Google Cloud and Broadcom TechCrunch	Anthropic	news
2026-04-06	NVIDIA Acquires SchedMD (Slurm Workload Manager); Draws Regulatory Scrutiny Reuters	NVIDIA	news
2026-04-03	Microsoft Announces $10B Japan AI Infrastructure Investment (2026-2029) WSJ	Microsoft	news
2026-04-03	TII Launches Falcon Perception — 600M-Parameter Open Multimodal Model for Grounding and Segmentation TII	TII	news
2026-04-03	Xiaomi Reveals MiMo-V2-Pro (1T Parameters), Approaching GPT-5.2 / Opus 4.6 Performance VentureBeat	Xiaomi	news
2026-04-02	★ Qwen 3.6-Plus	Alibaba	model							1M
2026-04-02	★ Trinity Large Thinking	Arcee	model		10.66K					512K	24	44/100
2026-04-02	★ Gemma 4	Google	model				31B	4B (max)		256K	29	39/100
2026-04-02	★ MAI Multimodal Stack (Transcribe / Voice / Image)	Microsoft	model
2026-04-02	SWE-HERO	NVIDIA	paper
2026-04-02	Alibaba Unveils Third Closed-Source AI Model in Focus on Profit Bloomberg	Alibaba	news
2026-04-02	Arcee's New Open-Source Trinity Large Thinking Is the Rare Powerful U.S.-Made Model VentureBeat	Arcee	news
2026-04-02	Gemma 4 Open Models Released Google Developers Blog	Google	news
2026-04-02	Poolside's $2B Series C Collapses; CoreWeave Exits 2GW Texas Data Center (Project Horizon) DataCenterDynamics	Poolside	news
2026-04-02	Sarvam AI Nearing $300-350M Raise at $1.5B Valuation Led by Bessemer with Nvidia and Amazon Bloomberg	Sarvam	news
2026-04-01	Simple Self-Distillation for Code Generation	Apple	paper
2026-04-01	Scaling Reasoning Tokens via RL and Parallel Thinking	ByteDance	paper
2026-04-01	Procedural Knowledge at Scale Improves Reasoning	Meta	paper
2026-04-01	Speech LLMs as Contextual Reasoning Transcribers	Microsoft	paper
2026-04-01	★ GLM-5V-Turbo	Z.ai	model				744B	40B	28.5T	202.75K	34
2026-04-01	Moonshot AI Raising $1B at $18B Valuation; Working with CICC and Goldman Sachs on HK IPO Bloomberg	Moonshot AI	news
2026-03-31	Think-Anywhere	Alibaba	paper	55
2026-03-31	ASI-Evolve	SII	paper	726
2026-03-31	OpenAI Closes $122B Round at $852B Valuation OpenAI	OpenAI	news
2026-03-31	Zhipu's Losses Climb 60% After Chinese AI Rivalry Worsens Bloomberg	Z.ai	news
2026-03-30	Mistral AI Raises $830M in Debt to Set Up a Data Center Near Paris TechCrunch	Mistral	news
2026-03-28	★ daVinci-LLM	SII	model	154			3B
2026-03-28	★ Falcon Perception	TII	model	715	12.87K		600M
2026-03-28	DeepSeek Before V4: Culture, Organization, and Liang Wenfeng's Unique Goals (English summary) LatePost (晚点)	DeepSeek	news
2026-03-26	Cohere Transcribe	Cohere	model		551.93K		2B
2026-03-26	Intern-S1-Pro	PJLab	model				1T	22B
2026-03-26	China's Moonshot AI Seeks Listing in Hong Kong Under Heightened Scrutiny WSJ	Moonshot AI	news
2026-03-25	★ LongCat-Next	Meituan	model	438	2.18K		74B	3B
2026-03-25	Alibaba Launches AI Model Task Force; Top Researcher Resigns The Information	Alibaba	news
2026-03-25	MiniMax-M2.7, GLM-5 at 1/3 Cost Latent Space	MiniMax	news
2026-03-24	DeepSeek's Latest Job Postings Highlight Pivot to Agentic AI Bloomberg	DeepSeek	news
2026-03-23	SkillRouter	Alibaba	paper
2026-03-23	Felis	ByteDance	paper
2026-03-20	LongCat-Flash-Prover	Meituan	model	85	64		560B	27B
2026-03-19	dots.mocr	Xiaohongshu	model				3B
2026-03-18	Path-Constrained Mixture-of-Experts	Apple	paper
2026-03-18	Qianfan-OCR	Baidu	model		174.67K		4B
2026-03-18	★ MiniMax-M2.7	MiniMax	model								38	22/100
2026-03-18	MiMo-V2-Omni	Xiaomi	model								35
2026-03-18	★ MiMo-V2-Pro	Xiaomi	model				1T	42B		1M	40
2026-03-18	MiMo-V2-TTS	Xiaomi	model
2026-03-18	Chinese AI Developer Zhipu to Create New Unit for Product Development The Information	Z.ai	news
2026-03-17	PRISM: Demystifying Retention and Interaction in Mid-Training	IBM	paper
2026-03-17	★ Pre-training LLM without Learning Rate Decay Enhances Supervised Fine-Tuning	SB Intuitions	paper
2026-03-16	Mixture-of-Depths Attention (MoDA)	ByteDance	paper	267
2026-03-16	★ Mistral Small 4	Mistral	model		45.71K		119B	6.5B		256K	12	39/100
2026-03-16	Attention Residuals 2	Moonshot AI	paper library	3.3K
2026-03-16	CUBE: A Standard for Unifying Agent Benchmarks	ServiceNow	paper
2026-03-15	Scientific Judge 2	Baidu	paper dataset	405
2026-03-13	OpenSWE / daVinci-Env	SII	dataset	187
2026-03-12	RoboBrain-Dex	BAAI	model	41
2026-03-12	IndexShare (IndexCache): Cross-Layer Index Reuse for Sparse Attention	Z.ai	paper
2026-03-11	★ Nemotron 3 Super	NVIDIA	model		745.73K		120B	12B		1M	25	83/100
2026-03-10	★ Exclusive Self Attention	Apple	paper
2026-03-10	Ai2 CEO Ali Farhadi Steps Down; Microsoft Hires Key Researchers GeekWire	Ai2	news
2026-03-09	Anthropic Sues Trump Admin Over Pentagon AI Blacklist CNBC	Anthropic	news
2026-03-09	OpenAI Acquires Promptfoo for AI Agent Security OpenAI	OpenAI	news
2026-03-08	Scalable Training of MoE Models with Megatron Core	NVIDIA	paper
2026-03-06	★ Sarvam-105B	Sarvam	model		12.57K		105B	10.3B		128K	12	39/100
2026-03-06	★ Sarvam-30B	Sarvam	model				30B	2.4B		32K	7	39/100
2026-03-05	★ GPT-5.4	OpenAI	model				~2.2T			1M	51	6/100
2026-03-05	GPT-5.4 Released with 1M Token Context OpenAI	OpenAI	news
2026-03-04	RIVER	PJLab	dataset	10
2026-03-01	★ OLMo Hybrid	Ai2	model		48.9K		7B		6T
2026-03-01	★ LLM-jp-4	NII	model		4.22K		32B	3.8B	11.7T	65.54K
2026-02-28	AnyTouch2 / ToucHD 2	BAAI	dataset paper
2026-02-25	MaxClaw	MiniMax	library
2026-02-25	ZUNA (EEG Foundation Model)	Zyphra	model				380M
2026-02-25	Tencent-Backed AI Startup StepFun Is Said to Plan Hong Kong IPO Bloomberg	StepFun	news
2026-02-19	★ Gemini 3.1 Pro	Google	model							1M	46	6/100
2026-02-19	Gemini 3.1 Pro Released, Ties #1 on AA Intelligence Index Google DeepMind	Google	news
2026-02-17	OLMix	Ai2	paper
2026-02-17	Tiny Aya	Cohere	model		1.79K		3.35B
2026-02-17	★ Grok-4.20	SpaceX	model							2M	37
2026-02-17	Mercury 2 Released: Diffusion LLM with AA Index 33 at 1000 tok/s Inception Labs	Inception Labs	news
2026-02-16	★ Qwen3.5 5	Alibaba	model	3.54K			397B	17B		1M	34	39/100
2026-02-16	WebWorld	Alibaba	model	39	1.26K		32B
2026-02-16	★ Ling 2.5	Ant Group	model		215		1T			1M
2026-02-16	ZoomBench	Ant Group	dataset	155
2026-02-15	Optimal Batch Size Scheduling via Functional Scaling Laws	Meituan	paper
2026-02-14	★ Doubao-Seed-2.0	ByteDance	model
2026-02-14	Doubao-Seed-2.0 Family Launched (Pro / Lite / Mini / Code) TechNode	ByteDance	news
2026-02-13	Cohere's $240M Year Sets Stage for IPO TechCrunch	Cohere	news
2026-02-12	★ MiniMax-M2.5	MiniMax	model	585	589.2K		229B				34	28/100
2026-02-12	GEBench	StepFun	dataset	54
2026-02-12	FireRed-Image-Edit	Xiaohongshu	model
2026-02-12	Xiaomi-Robotics-0 2	Xiaomi	model paper				4.7B
2026-02-12	Anthropic Raises $30B Series G at $380B Valuation Anthropic	Anthropic	news
2026-02-11	Ming-Flash-Omni-2.0	Ant Group	model		2.66K
2026-02-11	MiniCPM-SALA	OpenBMB	model	9.42K	6.33K					1M
2026-02-11	★ Step-3.5-Flash 3	StepFun	model paper dataset	2.08K	325.86K		196B	11B		256K	26
2026-02-11	★ GLM-5 3	Z.ai	model paper	3.39K	102.79K		744B	44B	28.5T		40	50/100
2026-02-11	Slime: Asynchronous RL for Agentic Tasks	Z.ai	library	6.07K
2026-02-09	Protenix	ByteDance	model	1.94K
2026-02-09	InternAgent-1.5	PJLab	paper
2026-02-08	★ Data Darwinism / Darwin Corpora	SII	dataset	154
2026-02-07	Seedance 2.0	ByteDance	model
2026-02-07	FireRed-OpenStoryline	Xiaohongshu	library
2026-02-06	Baichuan-M3 2	Baichuan	paper model	245	1.84K		235B
2026-02-05	★ Claude Opus 4.6	Anthropic	model				~5.3T			1M	44	11/100
2026-02-05	★ Kling 3.0 2	Kuaishou	model paper
2026-02-05	Claude Opus 4.6 Released with 1M Context Anthropic	Anthropic	news
2026-02-04	RationaleRM	Alibaba	dataset
2026-02-03	★ MiniCPM-o 4.5	OpenBMB	model	25.58K	203.83K
2026-02-02	WAXAL: African Language Speech Corpus	Google	dataset
2026-02-02	★ Kimi K2.5 2	Moonshot AI	model paper	2.02K	1.64M			32B			38	33/100
2026-02-02	daVinci-Agency	SII	model		9
2026-02-02	SpaceX Acquires xAI at $1.25T Combined Valuation Fortune	SpaceX	news
2026-01-30	Keel: Post-LayerNorm Is Back	ByteDance	paper
2026-01-29	SenseNova-MARS 3	SenseTime	model paper dataset	114	74		32B (max)
2026-01-28	★ Trinity Large	Arcee	model		523		398B	13B	17T	512K
2026-01-28	Trinity Mini / Nano	Arcee	model		14.85K		26B	3B		131K
2026-01-28	ACE-Step-1.5	StepFun	model	10.97K
2026-01-27	★ K2 Think V2	MBZUAI	model		1.28K		70B			262.14K	17	89/100
2026-01-27	LongCat-Flash-Lite 2	Meituan	model paper		2.08K						17	44/100
2026-01-27	Mistral AI Surges Revenue 20-Fold to Over $400 Million ARR MLQ	Mistral	news
2026-01-27	Tencent Bets Its AI Future on 28-Year-Old From OpenAI Caixin	Tencent	news
2026-01-26	DeepPlanning	Alibaba	dataset
2026-01-26	★ daVinci-Dev	SII	model		28
2026-01-26	★ Solar Pro 3	Upstage	model				102B	12B		128K
2026-01-23	★ LongCat-Flash-Thinking-2601 2	Meituan	model paper	254	7.66K		560B	27B
2026-01-22	★ ERNIE 5.0 2	Baidu	model paper			2	2.4T				22
2026-01-22	EvoCUA	Meituan	library	323	4.12K
2026-01-21	CorpusQA: A 10 Million Token Benchmark for Corpus-Level Analysis and Reasoning	Alibaba	eval
2026-01-20	★ Yuan 3.0 Ultra	Inspur	model	236	35		1T	68.8B
2026-01-20	Step-3-VL-10B 2	StepFun	model paper	406	484.26K		10B
2026-01-15	Tao Qin Elected 2025 ACM Fellow ACM	ZGCA	news
2026-01-12	Engram: Conditional Memory via Scalable Lookup	DeepSeek	paper
2026-01-12	Alphabet Hits $4T Market Cap CNBC	Google	news
2026-01-11	Distributional Clarity: The Hidden Driver of RL-Friendliness in Large Language Models	Baidu	paper
2026-01-11	★ Solar Open 100B	Upstage	model		3.87K		102B	12B	19.7T	131.07K
2026-01-09	PaCoRe: Learning to Scale Test-Time Compute	StepFun	paper	334	110
2026-01-09	Zhipu and MiniMax IPO ChinaTalk	MiniMax	news
2026-01-09	Zhipu and MiniMax IPO ChinaTalk	Z.ai	news
2026-01-06	xAI Raises $20B Series E at $230B Valuation CNBC	SpaceX	news
2026-01-05	Yuan 3.0 Flash	Inspur	model	187	8		40B	3.7B
2026-01-05	★ K-EXAONE	LG	model				236B	23B			25	28/100
2026-01-05	★ HyperCLOVA X SEED Omni	Naver	model		648		8B
2026-01-05	★ Falcon-H1R	TII	model	9	5.76K		7B			256K	10	44/100
2026-01-03	★ HyperCLOVA X SEED Think	Naver	model		156.09K		32B			128K	17	31/100
2026-01-01	FlashInfer-python-paddle	Baidu	library
2026-01-01	Agentar-Z-100K	Z.ai	dataset
2025-12-31	FineWeb-Mask	ByteDance	dataset
2025-12-31	mHC: Manifold-Constrained Hyper-Connections	DeepSeek	paper
2025-12-31	OpenOneRec	Kuaishou	library	812	33
2025-12-30	SeedFold	ByteDance	paper
2025-12-30	LongCat ZigZag Attention	Meituan	paper	8	42
2025-12-27	RollArt: Disaggregated Multi-Task Agentic RL Training at Scale	Alibaba	paper
2025-12-27	★ A.X K1	SK Telecom	model	30	33.17K		519B	33B	10T	131K
2025-12-23	★ MiniMax-M2.1	MiniMax	model	544	10.19K		229B				31	28/100
2025-12-23	VIBE & OctoCodingBench	MiniMax	dataset
2025-12-23	Step-DeepResearch	StepFun	library	561
2025-12-23	Zhipu AI's Rise from Tsinghua Lab Pandaily	Z.ai	news
2025-12-22	SekoTalk / Seko 2.0	SenseTime	model	43
2025-12-22	★ GLM-4.7	Z.ai	model								34	44/100
2025-12-19	★ Kanana-2	Kakao	model		150		30B	3B		32K
2025-12-19	Kakao Open-Sources Kanana-2 Model Optimized for Agentic AI Korea Times	Kakao	news
2025-12-18	★ Seed1.8	ByteDance	model	218
2025-12-18	EXAONE Path 2.5	LG	paper
2025-12-18	Towards Scalable Pre-training of Visual Tokenizers	MiniMax	paper	490
2025-12-18	HY-Motion 1.0	Tencent	paper
2025-12-18	Seed1.8 Released as a Generalized Agentic Model ByteDance Seed	ByteDance	news
2025-12-17	Peter DeSantis to Lead Unified AGI Org; Rohit Prasad Departing CNBC	Amazon	news
2025-12-17	Tencent restructures AI operations, promotes high-profile recruit to chief AI scientist SCMP	Tencent	news
2025-12-16	★ Molmo 2	Ai2	model	641		1	8B (max)
2025-12-16	★ MiMo-V2-Flash 2	Xiaomi	model paper	1.33K	70.62K		309B	15B	27T		33
2025-12-16	MOPD (Multi-Teacher On-Policy Distillation)	Xiaomi	library	1.33K
2025-12-15	★ Nemotron 3 Nano	NVIDIA	model		2.18M		30B	3.5B	25T	1M	18	83/100
2025-12-15	NVIDIA in Advanced Talks to Acquire AI21 Labs for $2-3B SiliconANGLE	AI21 Labs	news
2025-12-10	★ LLaDA 2 2	Ant Group	model paper	430	9.38K
2025-12-09	★ JAIS 2	MBZUAI	model		2.5K		70B		2.6T	8.19K
2025-12-08	LongCat-Image 3	Meituan	model paper	695	47.59K		6B
2025-12-06	★ K2-V2 (LLM360)	MBZUAI	model		183		70B
2025-12-05	NEO (Native VLM Architecture) 2	SenseTime	model paper	825		1	9B (max)
2025-12-05	★ Hunyuan 2.0	Tencent	model				406B	32B		256K
2025-12-04	Nex-N1: Agentic Models via Large-Scale Environment Construction	Nex-AGI	paper
2025-12-02	★ Amazon Nova 2	Amazon	model							1M	22	11/100
2025-12-02	★ Mistral Large 3	Mistral	model		1.98K		675B	41B		256K	16	39/100
2025-12-02	Nova 2 Model Family and Nova Act GA at re:Invent 2025 TechCrunch	Amazon	news
2025-12-02	Anthropic Acquires Bun, Claude Code Hits $1B ARR Anthropic	Anthropic	news
2025-12-01	Ministral 3	Mistral	model		782.22K		14B (max)				6
2025-12-01	John Giannandrea to Retire; Amar Subramanya Named VP of AI Apple	Apple	news
2025-11-30	gelab-zero (STEP-GUI)	StepFun	library	2.19K	2.21K
2025-11-28	★ LFM2 (Liquid Foundation Models 2)	Liquid AI	model				24B	2.3B			5	28/100
2025-11-27	DeepSeek-Math-V2 2	DeepSeek	model dataset	1.59K	384
2025-11-24	★ Claude 4.5 Opus	Anthropic	model				~3.4T			200K	35	11/100
2025-11-24	HunyuanOCR	Tencent	model	1.65K	347.7K		1B
2025-11-20	★ OLMo 3	Ai2	model		10.31K		32B		5.9T	65.54K	8	89/100
2025-11-20	AICC: A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser	PJLab	dataset
2025-11-20	HunyuanVideo-1.5	Tencent	model	4.47K	2.34K
2025-11-20	MiMo-Embodied: X-Embodied Foundation Model	Xiaomi	paper		1.12K
2025-11-19	LPLB (Linear-Programming Load Balancer)	DeepSeek	library	505
2025-11-19	Step-Audio-R1	StepFun	model	673	222		33B
2025-11-19	Yann LeCun Departs Meta to Found AMI Labs CNBC	Meta	news
2025-11-17	SenseNova-SI (Spatial Intelligence) 3	SenseTime	model paper dataset	271			8B (max)
2025-11-15	★ Doubao Seed Code	ByteDance	model							256K	26	11/100
2025-11-15	Doubao Seed Code (Reasoning Coder) Hits AA Intelligence Index 34 Artificial Analysis	ByteDance	news
2025-11-14	Miloco (Xiaomi Local Copilot)	Xiaomi	library	2.61K
2025-11-13	M100 Chip	Baidu	announcement
2025-11-12	★ AlphaProof	Google	paper
2025-11-12	Interview: Ant Group's Open Model Ambitions Interconnects	Ant Group	news
2025-11-10	kosong	Moonshot AI	library	520
2025-11-06	InfinityStar	ByteDance	model
2025-11-06	Step-Audio-EditX	StepFun	model	929	43.36K		3B
2025-11-05	SoftBank and SB Intuitions launch Sarashina API for enterprise access to Japanese LLM SoftBank	SB Intuitions	news
2025-11-03	★ LongCat-Flash-Omni 2	Meituan	model paper	492	62		560B	27B
2025-11-01	LightX2V	SenseTime	library	2.36K
2025-11-01	Inception Labs Raises $56M Seed from Menlo, Andrew Ng, Karpathy Inception Labs	Inception Labs	news
2025-10-31	GATE	LG	paper
2025-10-30	★ Emu3.5	BAAI	model	1.52K	907
2025-10-30	Kimi Linear 2	Moonshot AI	model paper	1.4K			48B	3B
2025-10-29	★ Ouro	ByteDance	model		87.71K		2.6B		7.7T
2025-10-28	ODesign	BAAI	model	311
2025-10-28	URSA (Uniform Discrete Diffusion)	BAAI	model
2025-10-28	Parallel Loop Transformer	ByteDance	paper
2025-10-28	OpenAI Completes For-Profit PBC Restructuring OpenAI	OpenAI	news
2025-10-27	★ MiniMax-M2	MiniMax	model	2.6K	127.97K		230B	10B			28	28/100
2025-10-27	CoKE: Context as the Key to Biomolecular Understanding	PJLab	paper	18
2025-10-27	JanusCoder	PJLab	model	80	47
2025-10-27	Hunyuan Mirror	Tencent	paper	1.14K	2.99K	1
2025-10-25	LongCat-Video 3	Meituan	model paper	4.26K	3.21K		13.6B
2025-10-24	KAT-Coder 2	Kuaishou	model paper					72B
2025-10-23	Anthropic to Expand Google Cloud TPU Use to 1M+ TPUs Anthropic	Anthropic	news
2025-10-22	Seed3D 1.0	ByteDance	model
2025-10-20	DeepSeek-OCR / OCR-2	DeepSeek	model	23.27K	2.35M
2025-10-17	LongCat-Audio-Codec	Meituan	paper	301
2025-10-16	MorphoBench	ZGCA	paper	13
2025-10-15	★ Granite 4.0	IBM	model				32B	9B			5	56/100
2025-10-15	InteractiveOmni 2	SenseTime	model paper	8			8B (max)
2025-10-15	Granite 4.0: Hybrid Mamba Architecture, First ISO 42001 Certified Open Models IBM	IBM	news
2025-10-14	Rex-Omni 2	IDEA Lab	model paper	1.44K	33.78K		3B
2025-10-14	Zhipu AI Breaks US Chip Reliance With First Major Model Trained on Huawei Stack SCMP	Z.ai	news
2025-10-13	RITE: Reinforcement Learning for Tool-Integrated Interleaved Thinking	Meituan	paper
2025-10-09	★ Ling 2.0 / Ling-1T 2	Ant Group	model paper		3.35K		1T	50B			10	44/100
2025-10-01	R-HORIZON-Websearch	Meituan	dataset	26
2025-10-01	GDPval	OpenAI	eval			1								1320
2025-10-01	IBM Research Names Jay Gambetta as Director; Dario Gil to DOE IBM	IBM	news
2025-09-30	★ GLM-4.6	Z.ai	model				355B				23	44/100
2025-09-29	Ring 4	Ant Group	model paper	258	38.2K		1T	63B		262.14K
2025-09-29	★ DeepSeek-V3.2 2	DeepSeek	model paper		3.59M	2	685B	37B			33
2025-09-28	HunyuanImage-3.0 2	Tencent	model paper	3.12K	2.61K			13B
2025-09-26	Qwen3Guard	Alibaba	model	465
2025-09-25	Expanding Reasoning Potential (CoTP)	Meituan	paper
2025-09-24	LRM-Eval / ROME	BAAI	dataset	5
2025-09-23	ByteWrist	ByteDance	model
2025-09-23	★ LongCat-Flash-Thinking 2	Meituan	model paper	285	103		560B	27B
2025-09-23	Symphony-MoE	PCL	paper
2025-09-22	BGE-Reasoner	BAAI	model	31	959
2025-09-22	ScaleCUA	PJLab	model	1.11K	58
2025-09-18	Seedream 4.0	ByteDance	model			1
2025-09-17	★ AToken	Apple	paper	140
2025-09-16	Shanghai launches innovation institute to bridge AI research and industry Shanghai Municipal Government	SII	news
2025-09-15	checkpoint-engine	Moonshot AI	library	963
2025-09-08	★ PLaMo 2	PFN	model		34.05K		31B		2T	32K
2025-09-05	Klear 3	Kuaishou	model paper	82	58	1	46B	2.5B
2025-09-05	★ MiniCPM4.1 2	OpenBMB	model paper	9.42K	49.98K		8B
2025-09-02	Baichuan-M2 2	Baichuan	paper model	212	938	1	32B
2025-09-02	★ Apertus	Swiss AI	model		161.34K	2	70B		15T	65.54K	2	89/100
2025-09-01	VeOmni	ByteDance	library	2K
2025-09-01	★ LongCat-Flash-Chat 2	Meituan	model paper	1.34K	81.37K	1	560B	27B		128K
2025-09-01	Hunyuan-MT	Tencent	model	710	56.53K		30B	3B
2025-09-01	RLinf	ZGCA	library
2025-09-01	TwinBrainVLA	ZGCA	paper
2025-09-01	Mistral AI Raises EUR 2B at EUR 12B Valuation Mistral AI	Mistral	news
2025-08-28	HyperOS 3	Xiaomi	announcement
2025-08-26	★ MiniCPM-V 4.5 2	OpenBMB	model paper	25.58K	93.36K
2025-08-25	GEPO	PCL	paper
2025-08-25	InternVL 3.5	PJLab	model	10.06K		3	241B (max)	28B (max)
2025-08-23	HunyuanVideo-Foley	Tencent	paper
2025-08-21	Fin-PRM: Process Reward Model for Financial Reasoning	Alibaba	paper	502
2025-08-21	Waver	ByteDance	model	938
2025-08-21	★ DeepSeek-V3.1	DeepSeek	model								21
2025-08-21	Intern-S1	PJLab	model				241B	28B
2025-08-20	Seed-OSS-36B	ByteDance	model	885	36.95K		36B		12T	512K	18	44/100
2025-08-20	Nemotron Nano V2	NVIDIA	model		15.37K		12B			128K
2025-08-20	Seed-OSS-36B Released as Apache-2.0 Open-Weight Model VentureBeat	ByteDance	news
2025-08-15	PXDesign	ByteDance	model	229
2025-08-15	Physical Autoregressive Model (PAR)	PCL	paper
2025-08-14	NextStep-1 2	StepFun	model paper	688	62		14B
2025-08-14	Hunyuan-GameCraft 1.0	Tencent	model	723	76
2025-08-14	Cohere Raises $500M at $6.8B Valuation Cohere	Cohere	news
2025-08-14	Cohere Hires Long-Time Meta Research Head Joelle Pineau as Chief AI Officer TechCrunch	Cohere	news
2025-08-12	Mistral Medium 3.1	Mistral	model							128K	15	11/100
2025-08-12	InternBootcamp	PJLab	library	349
2025-08-11	GLM-4.5V	Z.ai	model	2.33K	167.01K		106B	12B		64K
2025-08-07	CANN	Huawei	library
2025-08-07	★ GPT-5	OpenAI	model				~4.1T			400K	36	6/100
2025-08-07	TMA-Adaptive FP8 Grouped GEMM	PJLab	paper	25
2025-08-06	ACAVCaps	Xiaomi	dataset	424
2025-08-05	OmniScale	ByteDance	paper	2K
2025-08-05	Seed Diffusion	ByteDance	model
2025-08-05	dots.vlm1	Xiaohongshu	model
2025-08-01	Qwen-Image 2	Alibaba	model	7.98K	173.33K		20B
2025-08-01	MegaDFT	ZGCA	paper
2025-08-01	Ai2 and UW Awarded $152M from NSF and NVIDIA for Open Scientific AI GeekWire	Ai2	news
2025-07-31	Seed-Prover	ByteDance	model	433
2025-07-30	dots.ocr	Xiaohongshu	model				3B
2025-07-29	Libra-Bench & PIE_bench	Meituan	dataset
2025-07-28	MixGRPO	Tencent	paper	1.14K
2025-07-28	★ GLM-4.5 2	Z.ai	model paper			2	355B				19
2025-07-27	SenseNova V6.5	SenseTime	model
2025-07-27	StepFun-Prover-Preview	StepFun	model	35	60		32B
2025-07-27	HunyuanWorld 3	Tencent	model	2.85K	615
2025-07-25	★ Step-3 2	StepFun	model paper	453	144.09K		321B	38B
2025-07-24	A.X 3.1	SK Telecom	model	13	305		34B
2025-07-24	SoftBank Corp. to Build the World's Largest AI Computing Infrastructure Using NVIDIA DGX SuperPOD with NVIDIA Blackwell GPUs SB Intuitions	SB Intuitions	news
2025-07-23	Towards Greater Leverage: Scaling Laws for Efficient MoE	Ant Group	paper
2025-07-23	ASI-Arch	SII	paper	726
2025-07-22	Qwen-Code	Alibaba	library	25.07K
2025-07-22	Qwen3-Coder 2	Alibaba	model	16.61K	1M		480B (max)	35B (max)			14	44/100
2025-07-22	Seed-X Series	ByteDance	model	172	130		7B
2025-07-22	Reka Raises $110M Series B at $1B Valuation Reka	Reka	news
2025-07-17	Agentar-DeepFinance-100K	Ant Group	dataset	35
2025-07-17	Apple Intelligence Foundation Models Tech Report 2025 Apple	Apple	news
2025-07-14	★ EXAONE 4.0	LG	model		33.89K		32B				6	28/100
2025-07-12	Scaling Laws for Optimal Data Mixtures	Apple	paper
2025-07-11	★ Kimi K2 4	Moonshot AI	model paper	10.84K	2.71M	5	1T	32B		256K	19	44/100
2025-07-10	FlexOlmo	Ai2	model	150	1.4K		33B
2025-07-10	KAT (Kwai-AutoThink) 2	Kuaishou	paper model		55			72B
2025-07-09	EXAONE Path 2.0	LG	paper
2025-07-09	★ Grok-4	SpaceX	model				~3.2T			256K	33	6/100
2025-07-09	Grok-4 Released with Native Tool Use and Reasoning xAI	SpaceX	news
2025-07-07	POLAR	PJLab	paper	167
2025-07-05	How to Train Your LLM Web Agent	ServiceNow	paper
2025-07-03	★ IFBench	Ai2	eval	140		1							300	58
2025-07-03	★ A.X 4.0	SK Telecom	model	158	653
2025-07-01	CodePRM: Execution Feedback-enhanced Process Reward Model for Code Generation	Huawei	paper
2025-07-01	Voxtral	Mistral	model		337.47K
2025-07-01	★ Solar Pro 2	Upstage	model				31B			64K	8	11/100
2025-06-30	★ openPangu	Huawei	announcement
2025-06-30	Meta Superintelligence Labs Created; Wang Named Chief AI Officer CNBC	Meta	news
2025-06-27	★ HyperCLOVA X THINK	Naver	model							128K
2025-06-27	Hunyuan-A13B 2	Tencent	model paper	816	46.54K		80B	13B		256K
2025-06-26	Kwai Keye-VL 3	Kuaishou	model paper	785	192.8K		31B	3B		262.14K
2025-06-25	OctoThinker	SII	model	188			8B (max)
2025-06-24	Video-XL-2	BAAI	model		54
2025-06-17	★ Mercury (Diffusion LLM)	Inception Labs	model			1				128K	25
2025-06-16	SciSage / SurveyScope	BAAI	library
2025-06-16	★ MiniMax-M1 2	MiniMax	model paper	3.15K	855		456B	45.9B		1M
2025-06-15	★ AI-Driven Agentic Design Platform for Tumor Immunotherapy Drugs	ZGCA	announcement
2025-06-15	ZGCA & ZGCI Unveil AI-Driven Tumor Immunotherapy Drug Design Platform Zhongguancun Academy	ZGCA	news
2025-06-13	Scientists' First Exam	PJLab	eval										830	66
2025-06-12	Seed-1.6 (AdaCoT)	ByteDance	model							256K
2025-06-12	★ Magistral	Mistral	model		38.49K	2	24B (max)				12	50/100
2025-06-12	Predictable Scale Part II: Farseer	StepFun	paper
2025-06-12	Seed-1.6 Introduces Adaptive Chain-of-Thought (AdaCoT) ByteDance Seed	ByteDance	news
2025-06-11	FlagEvalMM	BAAI	library	106
2025-06-10	Seedance 1.0	ByteDance	model
2025-06-09	RedNote joins AI race with its own open-source model that it says bests Alibaba, DeepSeek SCMP	Xiaohongshu	news
2025-06-07	★ The Illusion of Thinking	Apple	paper
2025-06-06	RoboBrain 2.0 2	BAAI	model	1.09K	573
2025-06-06	★ MiniCPM4 2	OpenBMB	model paper	9.42K	11.73K		8B
2025-06-06	Ultra-FineWeb	OpenBMB	dataset		30
2025-06-06	★ dots.llm1 2	Xiaohongshu	model paper				142B	14B	11.2T	32.77K
2025-06-05	RoboRefer / RefSpatial	BAAI	model	263
2025-06-04	MiMo-VL 2	Xiaomi	model paper	643	3.93K		7B
2025-06-04	Chinese social media app Xiaohongshu's $26 billion valuation bolsters GSR fund Bloomberg	Xiaohongshu	news
2025-06-01	HumanSense Benchmark	Ant Group	dataset
2025-06-01	BrowseComp & WideSearch	Moonshot AI	dataset
2025-06-01	kimi-agent-sdk	Moonshot AI	library	487
2025-06-01	kimi-cli	Moonshot AI	library	8.94K
2025-06-01	Kimi-Dev 2	Moonshot AI	model paper	1.23K	2.67K		72B
2025-06-01	Kimi-Researcher	Moonshot AI	model	81
2025-06-01	walle	Moonshot AI	library	21
2025-06-01	AgentCPM Series 3	OpenBMB	paper	800	1.67K
2025-06-01	A.X Encoder	SK Telecom	model		2.68K
2025-06-01	CF-Div2-Stepfun	StepFun	dataset
2025-06-01	SteptronOss	StepFun	library	575
2025-06-01	MiMo-Audio 2	Xiaomi	model paper				7B
2025-06-01	BrowseComp	Z.ai	dataset
2025-06-01	KTransformers	Z.ai	library	17.26K
2025-05-30	AReaL	Ant Group	library	5.29K
2025-05-28	Ming-Omni	Ant Group	model	656	42
2025-05-28	★ DeepSeek-R1-0528	DeepSeek	model				671B	37B
2025-05-28	Pangu Embedded	Huawei	paper		118
2025-05-28	★ Skywork Open Reasoner 1	Skywork	model	743			32B
2025-05-27	Pangu Pro MoE	Huawei	paper		133	1
2025-05-27	HunyuanVideo-Avatar	Tencent	paper
2025-05-26	SynLogic 2	MiniMax	paper dataset	203	121
2025-05-23	One RL to See Them All: Visual Triple Unified RL	MiniMax	paper	333
2025-05-22	★ Claude 4	Anthropic	model				~1.4T			1M	25	11/100
2025-05-22	XRing O1	Xiaomi	announcement
2025-05-21	Devstral 2	Mistral	model				123B			256K	15
2025-05-21	★ Falcon-H1	TII	model	118	11.02K		34B			256K
2025-05-20	BAGEL	ByteDance	model	6K	820		14B
2025-05-17	Video-SafetyBench	BAAI	eval	12									2.26K
2025-05-17	Model Merging in Pre-training of LLMs	ByteDance	paper
2025-05-15	BGE-Code-v1	BAAI	model	11.8K	4.84K
2025-05-15	Apriel-Nemotron-15B Reasoning Model with NVIDIA ServiceNow	ServiceNow	news
2025-05-14	★ AlphaEvolve	Google	paper			6
2025-05-12	Seed1.5-VL	ByteDance	model	1.58K			200B	20B
2025-05-12	MiniMax-Speech: Intrinsic Zero-Shot TTS	MiniMax	paper
2025-05-12	Step1X-3D: High-Fidelity Textured 3D Assets	StepFun	model	870
2025-05-10	Gated Attention for Large Language Models	Alibaba	paper
2025-05-08	Seed-Coder-8B	ByteDance	model	755			8B			65.54K
2025-05-07	DeerFlow	ByteDance	library	70.89K
2025-05-07	★ Pangu Ultra MoE 2	Huawei	model paper		16		718B	39B
2025-05-07	HunyuanCustom	Tencent	paper	1.22K		1
2025-05-06	CCI 4.0	BAAI	dataset
2025-05-06	OpenSeek	BAAI	model	261	4
2025-05-06	RoboOS 2	BAAI	library	575
2025-05-02	★ MiMo (Reasoning) 2	Xiaomi	model paper		91.71K	1	7B
2025-05-01	AWorld	Ant Group	library	1.2K
2025-05-01	★ Kanana 1.5	Kakao	model		89		15.7B	3B		32K
2025-04-30	★ Amazon Nova Premier	Amazon	model				~470B			1M	13	11/100
2025-04-30	DeepSeek-Prover-V2	DeepSeek	model	1.27K	633		671B
2025-04-30	Nova Premier Launched as Amazon's Most Capable AI Model TechCrunch	Amazon	news
2025-04-30	Phi-4 Reasoning Models Released with Chain-of-Thought Microsoft	Microsoft	news
2025-04-29	★ Qwen3 9	Alibaba	model paper	27.3K		74	1T	22B (max)			13
2025-04-29	First LlamaCon Developer Conference Meta AI	Meta	news
2025-04-25	PolyMath	Alibaba	dataset	44
2025-04-25	Kimi-Audio 2	Moonshot AI	model paper	4.65K	80.22K		7B
2025-04-24	Step1X-Edit	StepFun	model	2.22K	110
2025-04-22	TTRL: Test-Time Reinforcement Learning	PJLab	paper
2025-04-19	SRPO: Staged History-Resampling Policy Optimization	Kuaishou	paper
2025-04-17	Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping 3	NVIDIA	paper dataset			1
2025-04-16	★ o3	OpenAI	model				~3T			200K	30	6/100
2025-04-15	DataDecide	Ai2	paper
2025-04-15	ReTool: Reinforcement Learning for Strategic Tool Use in LLMs	ByteDance	paper			2
2025-04-15	★ Kling 2.0 1	Kuaishou	model
2025-04-15	Kimina-Prover 2	Moonshot AI	model paper	370	717
2025-04-15	miniF2F-test (Rectified)	Moonshot AI	dataset	370
2025-04-15	★ Apriel	ServiceNow	model	3	24		15B (max)		4.5T
2025-04-15	Step-R1-V-Mini	StepFun	model
2025-04-15	ZR1-1.5B	Zyphra	model		1.5K		1.5B
2025-04-15	Apriel-5B: ServiceNow's First Open SLM ServiceNow	ServiceNow	news
2025-04-14	InternVL3	PJLab	model	10.06K		5	78B (max)
2025-04-12	SenseNova V6	SenseTime	model				600B
2025-04-10	★ Scaling Laws for Native Multimodal Models	Apple	paper
2025-04-10	★ Seed1.5-Thinking: Advancing Superb Reasoning Models with RL	ByteDance	paper			1
2025-04-10	★ Pangu Ultra 2	Huawei	model paper	77			135B
2025-04-10	Kimi-VL 2	Moonshot AI	model paper	1.2K	123.09K	1	16B	2.8B
2025-04-08	Amazon Nova Sonic	Amazon	model
2025-04-08	Dream 7B	Huawei	model	1.25K			7B
2025-04-08	★ Skywork R1V Series	Skywork	model		41		38B
2025-04-08	Nova Sonic Speech-to-Speech Model Launched on Bedrock AWS	Amazon	news
2025-04-07	BaichuanMed-OCR	Baichuan	model		169		72B (max)
2025-04-05	★ Llama 4	Meta	model				400B	17B		1M	14	28/100
2025-04-05	Llama 4 Scout and Maverick Released (First MoE, Multimodal) Meta AI	Meta	news
2025-04-04	★ Nemotron-H	NVIDIA	model		58.35K		56B		20T
2025-04-03	DeepSeek-GRM: Inference-Time Scaling for Generalist Reward Modeling	DeepSeek	paper			1
2025-04-01	MiniMax Speech Series	MiniMax	model
2025-03-31	Amazon Nova Act	Amazon	model	909
2025-03-31	Amazon Unveils Nova Act, an AI Agent That Controls a Web Browser TechCrunch	Amazon	news
2025-03-30	★ ToRL: Scaling Tool-Integrated RL	SII	paper	349		1
2025-03-28	Doubao-Deep-Thinking	ByteDance	model
2025-03-27	OpenComplex 2	BAAI	model	269
2025-03-26	Qwen2.5-Omni-7B	Alibaba	model	4.02K	777.76K		7B
2025-03-25	★ Gemini 2.5 Pro	Google	model				~1.2T			1M	27	6/100
2025-03-24	SimpleRL-Zoo: Investigating and Taming Zero RL for Open Base Models	ByteDance, Meituan	paper
2025-03-21	Hunyuan-T1	Tencent	model
2025-03-18	Sable	BAAI	model	2
2025-03-18	★ Llama-Nemotron (Nano/Super/Ultra)	NVIDIA	model		123.53K	1				128K	9	53/100
2025-03-18	HaploVL	Tencent	model	65
2025-03-17	★ EXAONE Deep	LG	model		1.4K		32B
2025-03-16	★ ERNIE 4.5	Baidu	model	7.72K			424B (max)	47B (max)			9	56/100
2025-03-16	ERNIE X1	Baidu	model
2025-03-12	★ Gemma 3	Google	model		1.42M	40	27B			128K
2025-03-10	Seedream 2.0	ByteDance	paper
2025-03-10	Reka Flash 3 Released (Open-Weight, 21B) Reka	Reka	news
2025-03-07	★ Ling 2	Ant Group	paper model	258	38.63K	1	16B
2025-03-06	QwQ-32B	Alibaba	model		52.45K		32B
2025-03-06	BGE-VL 2	BAAI	model dataset	11.8K	4.55K
2025-03-06	Predictable Scale Part I: Step Law	StepFun	paper			1
2025-03-04	CogView-4	Z.ai	model	1.1K
2025-03-03	★ Aya Vision	Cohere	model		160.27K		32B
2025-03-01	★ Command A	Cohere	model		2.28K		111B			256K	8	33/100
2025-03-01	★ Sarashina2.2 3	SB Intuitions	model		12.98K		4B
2025-02-28	3FS (Fire-Flyer File System)	DeepSeek	library	9.96K
2025-02-28	Smallpond	DeepSeek	library	4.96K
2025-02-28	Image-01	MiniMax	model
2025-02-27	RoboBrain	BAAI	model	553	132
2025-02-27	UniTok	ByteDance	paper	526
2025-02-27	DualPipe	DeepSeek	library	2.96K
2025-02-27	EPLB (Expert Parallelism Load Balancer)	DeepSeek	library	1.39K
2025-02-27	Hunyuan Turbo S 2	Tencent	model paper				560B	56B
2025-02-26	DeepGEMM	DeepSeek	library	7.36K
2025-02-26	BIG-Bench Extra Hard (BBEH)	Google	eval										4.52K	23
2025-02-26	★ Kanana	Kakao	model	280		1	32.5B		3T
2025-02-26	Granite 3.2: Multimodal Vision and Chain-of-Thought Reasoning IBM	IBM	news
2025-02-25	DeepEP	DeepSeek	library	9.71K
2025-02-24	★ Claude Code	Anthropic	library	131.53K
2025-02-24	Baichuan-Audio 2	Baichuan	paper model	222	103		10B
2025-02-24	FlashMLA	DeepSeek	library	12.7K
2025-02-24	Reasoning with Latent Thoughts: On the Power of Looped Transformers	Google	paper			1
2025-02-24	Muon Optimizer 2	Moonshot AI	paper library			1
2025-02-24	Topic Over Source: The Key to Effective Data Mixing for LLM Pre-training	PJLab	paper			1
2025-02-22	Moonlight-3B/16B	Moonshot AI	model	1.49K	32.57K	1	16B (max)
2025-02-20	★ SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines	ByteDance	eval	188		4							26.53K	285
2025-02-19	Qwen2.5-VL	Alibaba	model			52
2025-02-19	FlexTok	Apple	paper	319
2025-02-18	MoBA: Mixture of Block Attention for Long-Context LLMs	Moonshot AI	paper	2.13K		2
2025-02-18	Hunyuan-Large-Vision	Tencent	model				389B	52B
2025-02-17	Mistral Saba	Mistral	model				24B			32K
2025-02-17	OpenDWM / MaskGWM 2	SenseTime	library paper	398
2025-02-17	★ Grok-3	SpaceX	model				~2.1T			1M	18
2025-02-17	Step-Audio / Step-Audio2	StepFun	model	27
2025-02-17	Grok-3 Launched, Trained on 200K GPU Colossus Cluster xAI	SpaceX	news
2025-02-16	AdaGC: Improving Training Stability for Large Language Model Pretraining	Baidu	paper
2025-02-16	NSA: Native Sparse Attention	DeepSeek	paper			2
2025-02-15	1bit-Merging 1	Huawei	paper
2025-02-14	WebOrganizer	Ai2	paper
2025-02-14	★ LLaDA 2	Ant Group	model	3.82K	2.27K	5
2025-02-14	Step-Video-T2V 2	StepFun	model paper	3.19K			300B
2025-02-12	Wu Yonghui Joins ByteDance as Head of Seed Basic Research SCMP	ByteDance	news
2025-02-11	★ Nature Language Model (NatureLM)	Microsoft	model	46		1	46.7B	13B
2025-02-05	★ Scaling Laws for Upcycling Mixture-of-Experts Language Models	SB Intuitions	paper	9
2025-02-05	★ LIMO	SII	paper		54	3
2025-02-04	OpenAI and Kakao to Jointly Develop AI Products for South Korea CNBC	Kakao	news
2025-02-01	ModernBERT-Ja	SB Intuitions	model		44.81K		310M
2025-01-30	MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding	PJLab	paper
2025-01-30	MedXpertQA	PJLab	eval										4.46K
2025-01-26	Baichuan-Omni-1.5 2	Baichuan	paper model	191	1.84K		7B
2025-01-24	Baichuan-M1 3	Baichuan	paper model		821	5	14.5B
2025-01-24	FireRedASR	Xiaohongshu	model				8.3B (max)
2025-01-23	UltraRAG	OpenBMB	library	5.58K
2025-01-22	★ Doubao-1.5-Pro	ByteDance	model				200B	20B		256K
2025-01-22	UI-TARS	ByteDance	library	10.9K	621.09K	5
2025-01-22	★ DeepSeek-R1	DeepSeek	model		5.35M		671B	37B			20	50/100
2025-01-22	Revisit Self-Debugging with Self-Generated Tests for Code Generation	Meituan	paper			1
2025-01-21	Hunyuan3D 2.0 3	Tencent	model paper	13.93K	75.95K	6
2025-01-21	Stargate Project: $500B AI Infrastructure Initiative OpenAI	OpenAI	news
2025-01-20	★ Kimi k1.5 2	Moonshot AI	model paper	3.47K		11
2025-01-17	ComplexFuncBench	Z.ai	dataset	180
2025-01-15	★ InternLM3	PJLab	model	7.22K
2025-01-14	★ MiniMax-01 3	MiniMax	model paper	3.43K	172.55K	1	456B			4M
2025-01-14	★ MiniCPM-o 2.6	OpenBMB	model	25.58K	386.43K		8B
2025-01-10	GThinker	PCL	model
2025-01-09	WanJuan 3.0 (WanJuan-SiLu)	PJLab	dataset
2025-01-07	★ Cosmos	NVIDIA	model			9
2025-01-03	AgentRefine	Meituan	paper
2025-01-01	Document Parse	Upstage	library
2024-12-31	★ OLMo 2	Ai2	model		3.72K	2	32B		6T	4.1K
2024-12-26	★ DeepSeek-V3 3	DeepSeek	model paper			227	671B	37B			10
2024-12-25	QVQ	Alibaba	model
2024-12-24	★ LLM-jp-3 (172B)	NII	model		20		172B		2.1T	4.1K
2024-12-23	Baichuan4-Finance 2	Baichuan	paper model			1
2024-12-18	NOVA (Non-quantized Video Autoregressive)	BAAI	model	651
2024-12-13	DeepSeek-VL2	DeepSeek	model	5.3K	2.68K	22
2024-12-13	Liquid AI Raises $250M Series A Led by AMD Liquid AI	Liquid AI	news
2024-12-13	Profile: Shanghai AI Lab: Driving both AI safety and development MERICS	PJLab	news
2024-12-12	★ Phi-4	Microsoft	model		801.85K	24	14B				5	50/100
2024-12-12	Phi-4 Released: 14B SLM Specializing in Complex Reasoning Microsoft Research	Microsoft	news
2024-12-09	ProcessBench	Alibaba	dataset			1
2024-12-06	★ Aya Expanse	Cohere	model		48.1K	3	32B			128K
2024-12-06	★ EXAONE 3.5	LG	model		10.25K	2	32B
2024-12-06	Densing Law of LLMs	OpenBMB	paper			2
2024-12-06	InternVL 2.5	PJLab	model	10.06K	349	13	78B (max)
2024-12-05	Language Model Ladders	Ai2	paper
2024-12-05	Infinity & InfinityStar	ByteDance	model	1.57K		1
2024-12-05	Liquid: Scalable Multi-modal Generation	ByteDance	model	643		1
2024-12-05	Divot	Tencent	model	87
2024-12-05	Moto	Tencent	paper	177
2024-12-04	GenCast	Google	paper
2024-12-04	RedStone	Microsoft	dataset	161
2024-12-03	Amazon Nova	Amazon	model			2	~90B			300K	8	11/100
2024-12-03	AWS Trainium2 (Trn2 / Trn2 UltraServer)	Amazon	announcement
2024-12-03	HunyuanVideo	Tencent	model	12.19K		6	13B
2024-12-03	SEED-Voken	Tencent	paper	1.01K
2024-12-03	GLM-4-Voice: End-to-End Spoken Chatbot	Z.ai	model			2
2024-12-03	AWS Trainium2 Chips Generally Available; Trainium3 Previewed TechCrunch	Amazon	news
2024-12-01	★ Falcon 3	TII	model				10B
2024-11-25	Model Context Protocol (MCP)	Anthropic	library
2024-11-22	★ Tülu 3	Ai2	model	3.75K	2.6K	7
2024-11-22	★ Zamba2 (Hybrid SSM/Transformer Suite)	Zyphra	model	193	110	3	7.4B		3T
2024-11-22	Amazon Doubles Anthropic Investment to $8 Billion CNBC	Amazon	news
2024-11-21	★ AIMv2	Apple	paper	1.42K		2
2024-11-21	DINO-X 2	IDEA Lab	model paper	1.39K		4
2024-11-20	Hymba	NVIDIA	paper		279	2
2024-11-19	Aquila-VL-2B	BAAI	model		49
2024-11-15	MARS (Make vAriance Reduction Shine)	ByteDance	paper	721
2024-11-08	★ Sarashina2-8x70B	SB Intuitions	model		12
2024-11-08	SB Intuitions releases 460B-parameter Japanese LLM Sarashina2-8x70B for academia and industry SB Intuitions	SB Intuitions	news
2024-11-04	★ Hunyuan-Large 2	Tencent	model paper	1.59K	482	6	389B	52B	7T	256K
2024-11-04	Hunyuan3D 1.0	Tencent	model	3.48K
2024-11-01	SimpleQA	OpenAI	eval			12							4.33K
2024-11-01	InternThinker	PJLab	model
2024-10-29	Agentforce Platform Launched for Enterprise AI Agents Salesforce	Salesforce	news
2024-10-28	AutoGLM 2	Z.ai	model paper			1
2024-10-28	Zhongguancun Institute of Artificial Intelligence Established ZGCI	ZGCA	news
2024-10-24	Infinity-MM	BAAI	dataset
2024-10-24	MotionCLR 2	IDEA Lab	library paper	17
2024-10-24	★ Skywork-Reward 2	Skywork	model		37
2024-10-22	OmniGen	BAAI	model	4.33K	17	1
2024-10-17	Janus 4	DeepSeek	model paper dataset	17.75K	10.89K	11
2024-10-15	Zyda-2	Zyphra	dataset
2024-10-11	Baichuan-Omni 2	Baichuan	paper model	273			7B
2024-10-09	MLE-bench	OpenAI	eval	1.57K		9								75
2024-10-09	★ PLaMo-100B	PFN	model		118		100B		2T
2024-10-09	Demis Hassabis & John Jumper Awarded Nobel Prize in Chemistry Google DeepMind	Google	news
2024-10-07	★ Falcon Mamba	TII	model		123.63K	3	7.27B		5.8T	8.19K
2024-10-02	★ Llama-3.1-Nemotron-70B	NVIDIA	model		58	4				128K
2024-10-02	Poolside Raises $500M Series B at ~$3B Valuation, Led by Bain Capital Ventures Crunchbase News	Poolside	news
2024-10-01	TxT360	MBZUAI	dataset	22
2024-09-27	★ Emu3 2	BAAI	paper	2.42K		4
2024-09-25	★ Molmo	Ai2	model	913	2.17K	8
2024-09-23	MobileUI Dataset	Xiaomi	dataset	79
2024-09-23	MobileVLM	Xiaomi	model	79
2024-09-19	★ Qwen2.5 3	Alibaba	model paper	27.3K		74	72B (max)		18T		10
2024-09-18	Qwen2-VL	Alibaba	model		27.42K	76
2024-09-18	Qwen2.5-Coder 2	Alibaba	model paper	16.61K		37	480B (max)				7
2024-09-18	Qwen2.5-Math 2	Alibaba	model paper	1.08K		12	72B (max)
2024-09-12	★ o1	OpenAI	model				~3.5T			200K	23	6/100
2024-09-11	★ Pixtral 12B	Mistral	model		4.18K	6	12B			128K
2024-09-05	★ AdEMAMix Optimizer	Apple	paper
2024-09-05	★ DeepSeek-V2.5	DeepSeek	model		7.71K
2024-09-05	★ MiniCPM3-4B	OpenBMB	model	9.42K	40.57K		4B
2024-09-05	Open-MAGVIT2	Tencent	library	1.01K
2024-09-05	FireRedTTS	Xiaohongshu	model
2024-09-05	Silvio Savarese Named to TIME 100 Most Influential in AI Salesforce	Salesforce	news
2024-09-03	★ OLMoE	Ai2	model	1.03K	103.29K	7	6.9B	1.3B	5.13T	4.1K
2024-08-31	Hailuo AI (Video-01 / 2.3)	MiniMax	model
2024-08-29	CogVLM2	Z.ai	model			6
2024-08-28	Auxiliary-Loss-Free Load Balancing Strategy	DeepSeek	paper			6
2024-08-26	Fire-Flyer AI-HPC: Cost-Effective Software-Hardware Co-Design	DeepSeek	paper
2024-08-21	Minitron	NVIDIA	paper	380		3
2024-08-21	★ Sarashina2	SB Intuitions	model		913		70B		2.1T	4.1K
2024-08-12	CogVideoX: Text-to-Video Diffusion Models	Z.ai	model	12.78K		16
2024-08-07	★ EXAONE 3.0	LG	model		41.67K	2	7.8B
2024-08-05	MiniCPM-V 2.6	OpenBMB	model	25.58K		23	8B
2024-08-01	EXAONEPath 1.0	LG	paper			1
2024-08-01	MiniMax Music Series	MiniMax	model
2024-07-29	★ Apple Foundation Models (AFM)	Apple	paper			4
2024-07-29	MindSearch	PJLab	library	6.87K		2
2024-07-24	★ Mistral Large 2	Mistral	model		6.26K		123B			128K
2024-07-23	★ Llama 3.1	Meta	model		213.13K		405B		15.6T	128K	9	39/100
2024-07-22	RazorAttention: KV Cache Compression Through Retrieval Heads	Huawei	paper
2024-07-20	Consent in Crisis: The Rapid Decline of the AI Data Commons	Cohere	paper			11
2024-07-20	★ Falcon 2	TII	model		4.48K	2	11B		5.5T	8.19K
2024-07-18	Mistral NeMo	Mistral	model		32.02K		12B			128K
2024-07-16	Codestral Mamba	Mistral	model		47.98K		7.3B
2024-07-11	EchoMimicV2 & V3	Ant Group	paper	4.25K		2
2024-07-11	Skywork-Math	Skywork	paper
2024-07-06	Kolors 2	Kuaishou	model paper	4.61K	401
2024-07-05	★ SenseNova 5.5	SenseTime	model
2024-07-05	Vimi	SenseTime	model
2024-07-04	★ LLM-jp (v1/v2)	NII	model			5	13B
2024-07-04	★ Step-2	StepFun	model				1T
2024-07-03	LivePortrait 2	Kuaishou	library paper	18.53K	7.37K	9
2024-07-03	★ InternLM2.5	PJLab	model	7.22K	106.78K					1M
2024-07-02	InternVL 2.0	PJLab	model	10.06K	305.47K		108B (max)
2024-07-01	QPlanner	LG	paper
2024-07-01	Mathstral 7B	Mistral	model		19.29K		7B
2024-07-01	MMLongBench-Doc	PJLab	dataset	147		1
2024-06-28	ERNIE 4.0 Turbo	Baidu	model
2024-06-26	Zhang Hongjiang, founder of BAAI: 'AI systems should never be able to deceive humans' Financial Times	BAAI	news
2024-06-24	Mooncake 2	Moonshot AI	paper dataset	5.55K		13
2024-06-24	★ Large Vocabulary Size Improves Large Language Models	SB Intuitions	paper			1
2024-06-20	★ Claude 3.5 Sonnet	Anthropic	model							200K	10	11/100
2024-06-17	AquilaMed-RL	BAAI	model		24	1
2024-06-17	DeepSeek-Coder-V2 2	DeepSeek	model paper	6.83K	4.01K	48					5
2024-06-17	★ Nemotron-4 340B	NVIDIA	model		543	7	340B		9T	4.1K
2024-06-14	MASt3R	Naver	paper	2.99K		1
2024-06-12	SciRIFF	Ai2	dataset			1
2024-06-11	Dasheng 3	Xiaomi	paper model	424			7B
2024-06-10	LlamaGen	ByteDance	model	1.95K		5
2024-06-10	Apple Intelligence Introduced at WWDC 2024 Apple	Apple	news
2024-06-06	★ Qwen2 2	Alibaba	model paper	27.3K		49	72B (max)	14B (max)			6
2024-06-06	★ Kling 2	Kuaishou	model paper
2024-06-05	GLM-4V	Z.ai	model	2.33K
2024-06-04	Seed-TTS 2	ByteDance	model paper	1.56K		4
2024-06-03	★ Skywork-MoE	Skywork	model	140	556	3	146B	22B
2024-06-01	agentUniverse	Ant Group	library	2.27K		1
2024-06-01	FlagScale	BAAI	library	518
2024-06-01	Yuan Embedding	Inspur	model		3.53K
2024-05-30	MotionLLM 2	IDEA Lab	model paper	386		6
2024-05-29	★ Codestral	Mistral	model		13.02K		22B			32K
2024-05-28	Yuan 2.0-M32	Inspur	model	195	1.68K	1	40B	3.7B
2024-05-27	RLAIF-V	OpenBMB	paper	455	22	3
2024-05-23	DeepSeek-Prover	DeepSeek	model	577	118	6
2024-05-22	★ Baichuan 4	Baichuan	model
2024-05-21	★ Scaling Monosemanticity	Anthropic	paper
2024-05-16	★ Grounding DINO 1.5 2	IDEA Lab	model paper	1.12K		11
2024-05-15	ByteFF	ByteDance	model	84
2024-05-14	Piccolo2 Embedding Model 2	SenseTime	model paper	145	43	1
2024-05-14	Hunyuan-DiT 2	Tencent	model paper	4.29K		1	1.5B
2024-05-13	★ GPT-4o	OpenAI	model				~720B			128K	11	6/100
2024-05-13	Plot2Code	Tencent	dataset	24		1
2024-05-08	AlphaFold 3	Google	paper
2024-05-07	★ DeepSeek-V2 2	DeepSeek	model paper	5.01K	5.35K	102	236B	21B			4
2024-05-07	★ Granite Code	IBM	model	1.25K		10	34B		4.5T
2024-04-26	llm-jp-corpus	NII	dataset	47		4
2024-04-25	InternVL 1.5	PJLab	model	10.06K	9.81K	16	26B
2024-04-25	ShareGPT-4o	PJLab	dataset			16
2024-04-24	★ SenseNova 5.0	SenseTime	model							200K
2024-04-22	OpenELM	Apple	model			2	3B
2024-04-22	SEED-X	Tencent	model	558		1
2024-04-18	★ Reka Core, Flash, and Edge	Reka	model		125	3	67B			128K	4	39/100
2024-04-17	★ ABAB 6 / 6.5	MiniMax	model
2024-04-17	★ Mixtral 8x22B	Mistral	model		4.66K		141B	39B		64K
2024-04-12	★ MiniCPM-V 4	OpenBMB	model paper	25.58K	163.09K	23	9B
2024-04-11	MiniCPM-V 2.0	OpenBMB	model		67.98K		2.8B
2024-04-03	VAR (Visual Autoregressive Modeling)	ByteDance	model	8.7K		7
2024-04-02	★ HyperCLOVA X	Naver	model			6
2024-04-01	RULER: What's the Real Context Size of Your LLM?	NVIDIA	eval			11								13
2024-03-28	★ Jamba	AI21 Labs	model		515	41	398B	94B		256K	5	22/100
2024-03-28	Dataverse	Upstage	library	564
2024-03-28	sDPO	Upstage	paper			2
2024-03-28	Jamba: First Production-Grade SSM-Transformer Hybrid Released AI21 Labs	AI21 Labs	news
2024-03-23	★ Step-1	StepFun	model				130B
2024-03-23	Step-1V / 1.5V / 2V	StepFun	model
2024-03-23	Understanding Emergent Abilities from the Loss Perspective	Z.ai	paper			1
2024-03-19	MergeKit	Arcee	library	7.13K		3
2024-03-17	★ Grok-1	SpaceX	model				314B	78B		8.19K
2024-03-12	★ Command R / R+	Cohere	model		33.71K		104B			128K
2024-03-11	Unraveling the Mystery of Scaling Laws: Part I	Meituan	paper
2024-03-08	DeepSeek-VL	DeepSeek	model	4.13K	8.76K	45
2024-03-08	CogView3	Z.ai	model		144
2024-03-04	★ Claude 3	Anthropic	model							200K	12	11/100
2024-03-01	Kimi 2M	Moonshot AI	model							2M
2024-02-28	WanJuan 2.0 (WanJuan-CC)	PJLab	dataset
2024-02-27	BioT5+	Microsoft	paper			4
2024-02-23	MegaScale	ByteDance	library			24
2024-02-21	SDXL-Lightning	ByteDance	model		72.67K	6
2024-02-21	★ Gemma	Google	model		26.93K	225	7B
2024-02-15	★ Gemini 1.5 Pro	Google	model			282				1M	10	6/100
2024-02-15	SAMformer 2	Huawei	model paper	190		1
2024-02-15	★ Sora	OpenAI	model
2024-02-07	★ Moirai	Salesforce	model	1.52K	109.96K	32	311M
2024-02-06	★ SenseNova 4.0	SenseTime	model
2024-02-05	★ BGE-M3 2	BAAI	model paper	11.8K	28.87M	52
2024-02-05	DeepSeek-Math 2	DeepSeek	model paper	3.32K		69
2024-02-04	★ Qwen1.5	Alibaba	model				110B (max)
2024-02-01	★ OLMo	Ai2	model	6.53K	2.4K	9	7B		2.46T	2.05K
2024-02-01	★ Aya 101	Cohere	model		8.81K	12	13B
2024-02-01	★ MiniCPM 3	OpenBMB	model paper	9.42K	3.73K	20	2B (max)
2024-01-31	★ Dolma	Ai2	dataset	1.51K		9
2024-01-30	YOLO-World	Tencent	paper	6.4K		28
2024-01-29	★ Baichuan 3	Baichuan	model
2024-01-23	★ InternLM2 2	PJLab	model paper	7.22K	19.21K	27
2024-01-20	TFLOP	Upstage	paper	51
2024-01-19	Depth Anything	ByteDance	model	8.26K		22
2024-01-17	★ AlphaGeometry	Google	paper
2024-01-17	★ GLM-4	Z.ai	model	7.07K
2024-01-15	SciGLM / SciInstruct	Z.ai	paper			4
2024-01-11	★ DeepSeek-MoE 2	DeepSeek	model paper		18.65K	16	16B	2.8B
2024-01-09	Lightning Linear Attention	Ant Group	paper			2
2024-01-09	Baichuan-NPC	Baichuan	model
2024-01-04	LLaMA Pro	Tencent	model	513	1.37K	1
2024-01-01	VSAG	Ant Group	library	482
2024-01-01	FlagAI	BAAI	library	3.87K
2023-12-28	PanGu-pi 3	Huawei	model paper	3.16K		2	7B (max)
2023-12-28	★ Spike No More: Stabilizing the Pre-training of Large Language Models	SB Intuitions	paper			2
2023-12-23	★ SOLAR 10.7B	Upstage	model		51.95K	7	10.7B				6
2023-12-22	GraphCast	Google	paper	6.67K		170
2023-12-21	DUSt3R	Naver	paper	7.19K		3
2023-12-21	★ InternVL: Scaling up Vision Foundation Models	PJLab	model	10.06K		17	6B
2023-12-20	★ Emu2	BAAI	model	1.77K	564	7
2023-12-16	Paloma	Ai2	dataset
2023-12-11	★ Mixtral 8x7B	Mistral	model		60.57K	120	46.7B	12.9B		32K
2023-12-06	★ Gemini 1.0	Google	model			809				32K	5	6/100
2023-12-05	★ MLX	Apple	library	26.82K
2023-12-05	Lenna 2	Meituan	model paper	87		1
2023-12-05	ReasonDet	Meituan	dataset	87		1
2023-11-29	★ DeepSeek-LLM 2	DeepSeek	model paper	7.03K	1.57K	87	67B
2023-11-29	GNoME (Materials Discovery)	Google	paper	1.19K
2023-11-28	★ Falcon (7B / 40B / 180B)	TII	model		31	115	180B		3.5T	2.05K
2023-11-27	MagicAnimate & Make Pixels Dance	ByteDance	model	10.91K		7
2023-11-27	Yuan 2.0	Inspur	model	688			102.6B (max)
2023-11-27	UniRepLKNet	Tencent	model	1.07K		34
2023-11-22	T-Rex 3	IDEA Lab	model paper	2.68K		8
2023-11-20	★ GPQA: Graduate-Level Google-Proof Q&A	Anthropic, Ai2	eval	510		21							198
2023-11-14	Qwen2-Audio 2	Alibaba	model paper	2.08K	1.48K	27	7B (max)
2023-11-06	CogVLM	Z.ai	model	6.74K		77
2023-11-02	DeepSeek-Coder 2	DeepSeek	model paper	23.66K	6.35K	107	33B (max)
2023-10-30	★ Skywork-13B	Skywork	model		729	11	13B		3.2T	4.1K
2023-10-25	DiQAD	Baidu	dataset	1
2023-10-19	KwaiYiiMath 2	Kuaishou	model paper
2023-10-17	★ ERNIE 4.0	Baidu	model
2023-10-17	★ BitNet	Microsoft	paper			63
2023-10-13	VideoCrafter	Tencent	library	5.06K
2023-10-12	★ Aquila2	BAAI	model	445	36		34B (max)
2023-10-09	★ Kimi-v1	Moonshot AI	model							200K
2023-10-05	★ MathCoder	PJLab	paper	339	30	4
2023-10-04	SEED / SEED-LLaMA	Tencent	model	641		18
2023-10-01	DALL-E 3	OpenAI	model
2023-09-29	ToRA: Tool-Integrated Reasoning Agent	Microsoft	paper	1.12K		21
2023-09-28	PLaMo-13B	PFN	model		162		13B			4.1K
2023-09-27	★ Mistral 7B	Mistral	model	10.81K	479.84K	288	7.3B			32K
2023-09-26	InternLM-XComposer	PJLab	model	2.92K	13.08K	31
2023-09-25	Qwen-Agent	Alibaba	library	16.51K
2023-09-25	qwen.cpp	Alibaba	library	627
2023-09-21	★ PengCheng-Mind	PCL	model	6	4		200B		1.5T
2023-09-08	Ant Financial LLM	Ant Group	model
2023-09-08	CodeFuse	Ant Group	model
2023-09-08	Fin-Eval	Ant Group	dataset
2023-09-07	★ Hunyuan-LLM	Tencent	model
2023-09-06	★ Baichuan 2 3	Baichuan	paper model	4.1K	210.77K	125	13B		2.6T
2023-09-01	OpenSPG & OpenAGL	Ant Group	library	2.12K
2023-09-01	XTuner	PJLab	library	5.15K
2023-08-31	★ ERNIE 3.5	Baidu	model
2023-08-31	Belebele	Meta	eval			1							109.8K	122
2023-08-30	★ JAIS	MBZUAI	model		1.06K	23	30B		1.63T
2023-08-29	LongBench	Z.ai	eval	1.19K		10							4.75K	21
2023-08-24	Qwen-VL	Alibaba	model		136.35K	139
2023-08-24	Code Llama	Meta	model			398	70B
2023-08-22	Lagent & AgentLego	PJLab	library	2.26K
2023-08-21	WanJuan 1.0 Corpus	PJLab	dataset			8
2023-08-20	ViT-Lens	Tencent	paper	190		1
2023-08-18	KwaiYii	Kuaishou	model				175B
2023-08-15	★ Aquila 2	BAAI	model paper	445	1.06K		33B (max)
2023-08-11	MiLM-6B	Xiaomi	model	458			6B
2023-08-08	Baichuan-53B	Baichuan	model				53B
2023-08-04	SoftBank launches an OpenAI for Japan: SB Intuitions, building LLMs and generative AI in Japanese TechCrunch	SB Intuitions	news
2023-08-03	★ Qwen 3	Alibaba	model paper	21.27K	18.12K	89	72B (max)
2023-08-02	★ BGE Text Embeddings	BAAI	model	11.8K	13.9M
2023-08-01	FlagEmbedding & C-MTEB 2	BAAI	library dataset	11.8K		70
2023-08-01	★ ABAB 5 / 5.5	MiniMax	model
2023-07-31	ToolLLM: Facilitating LLMs to Master 16000+ APIs	OpenBMB	paper	5.66K		69
2023-07-30	SEED-Bench	Tencent	eval	364		58							19.24K	12
2023-07-19	★ EXAONE 2.0	LG	model
2023-07-18	★ Llama 2	Meta	model		11	2.62K	70B		2T	4.1K	3
2023-07-16	ChatDev 2	OpenBMB	paper library	33.36K		69
2023-07-13	InternVid	PJLab	dataset	2.28K		32
2023-07-11	★ Emu	BAAI	model	1.77K		29
2023-07-11	Baichuan-13B	Baichuan	model	2.93K	14.72K		13B
2023-07-07	★ SenseNova 2.0 Upgrade	SenseTime	model
2023-07-06	★ InternLM-1.0	PJLab	model	7.22K	1.68K		104B
2023-07-05	PanGu-Weather 2	Huawei	model paper	1.36K		125
2023-07-01	InternEvo	PJLab	library	420
2023-07-01	OpenCompass	PJLab	library	7.08K
2023-06-25	ChatGLM2 / ChatGLM3	Z.ai	model	13.68K	124.79K		6B
2023-06-23	MME (Multimodal Evaluation)	BAAI	eval	17.87K									2.37K	14
2023-06-20	★ Phi-1 ("Textbooks Are All You Need")	Microsoft	model			99	1.3B
2023-06-20	UniAD	SenseTime, PJLab	paper	4.64K		1
2023-06-15	★ Baichuan-7B	Baichuan	model	5.65K	148.39K		7B		1.2T
2023-06-14	WebGLM	Z.ai	paper	1.6K		2
2023-06-12	detrex 2	IDEA Lab	library paper	2.29K		15
2023-06-07	AlphaDev	Google	paper
2023-06-01	DB-GPT	Ant Group	library	18.97K
2023-06-01	DLRover	Ant Group	library	1.66K
2023-06-01	LMDeploy	PJLab	library	7.9K
2023-06-01	★ RefinedWeb	TII	dataset			156
2023-05-31	Let's Verify Step by Step	OpenAI	paper	2.14K		30
2023-05-30	GPT4Tools	Tencent	paper	770		33
2023-05-29	Mix-of-Show	Tencent	paper	431		30
2023-05-27	CPM-Bee	OpenBMB	model	2.41K		3	10B
2023-05-22	★ Grouped Query Attention (GQA)	Google	paper			27
2023-05-20	PengCheng-Nebula	PCL	announcement
2023-05-17	★ Ziya LLM 4	IDEA Lab	model	4.13K	1.83K
2023-05-04	★ StarCoder	ServiceNow	model		22.54K	192	15.5B		1T	8.19K
2023-04-20	UltraChat & UltraFeedback	OpenBMB	dataset	2.86K
2023-04-11	SenseChat / SenseNova Launch	SenseTime	model
2023-04-11	SenseMirage	SenseTime	model				10B (max)
2023-04-10	Stable-DINO 2	IDEA Lab	library paper	242		5
2023-04-06	Grounded SAM 3	IDEA Lab	library paper	17.63K		90
2023-04-05	★ Segment Anything (SAM)	Meta	model	54.33K		538
2023-04-01	BMTools	OpenBMB	library	2.77K
2023-03-27	EVA-CLIP	BAAI	model	2.68K		80
2023-03-27	Qianfan Platform	Baidu	announcement
2023-03-20	★ PanGu-Sigma 2	Huawei	model paper			7	1.1T		329B
2023-03-14	OpenSeeD 2	IDEA Lab	library paper	759		2
2023-03-14	★ GPT-4	OpenAI	model				~666B			128K	7	6/100
2023-03-14	★ ChatGLM-6B	Z.ai	model	41.05K	1.28K	176	6B
2023-03-09	★ Grounding DINO 2	IDEA Lab	model paper	10.25K	2.22M	245
2023-02-27	★ LLaMA	Meta	model			3.9K	65B
2023-01-30	★ BLIP-2	Salesforce	model		581.31K	914
2023-01-23	Microsoft Extends Multibillion-Dollar OpenAI Partnership Microsoft	Microsoft	news
2023-01-01	FlagEvaluation	BAAI	library	13
2022-12-22	Tune-A-Video	Tencent	paper	4.37K		27
2022-12-15	★ Constitutional AI	Anthropic	paper			306
2022-12-06	InternVideo / InternVideo2	PJLab	model	2.28K		93
2022-12-05	Painter	BAAI	model	2.6K		10
2022-11-30	★ Speculative Decoding	Google	paper			34
2022-11-12	AltCLIP & AltDiffusion	BAAI	model	3.87K	111.27K	10
2022-11-10	InternImage 2	PJLab	model paper	2.83K		41
2022-11-02	Chinese CLIP	Alibaba	model	5.93K		53
2022-11-02	Taiyi 3	IDEA Lab	model paper		419	2
2022-10-06	ByteTransformer	ByteDance	library	479		1
2022-09-30	CodeGeeX 2	Z.ai	model paper	8.79K		49
2022-09-21	★ Whisper	OpenAI	model	102.39K	5.05M	1.16K	1.55B
2022-09-16	CPM-Ant	OpenBMB	model	500			10B
2022-09-03	TuGraph	Ant Group	library	1.74K
2022-08-24	★ GLM-130B 2	Z.ai	model paper	7.65K		296	130B
2022-08-01	COYO-700M	Kakao	dataset	1.26K
2022-07-22	PanGu-Coder 3	Huawei	model paper	3.16K	5	36	2.6B (max)
2022-07-04	SecretFlow	Ant Group	library	2.67K
2022-06-24	YOLOv6 2	Meituan	library paper	5.89K		1.75K
2022-06-06	Mask DINO 2	IDEA Lab	library paper	1.54K		20
2022-06-01	Vision GNN (ViG) 2	Huawei	model paper	4.42K		194
2022-05-02	★ OPT (Open Pre-trained Transformer)	Meta	model				175B
2022-04-26	CogView2	Z.ai, BAAI	model	955		121
2022-04-12	★ Training a Helpful and Harmless Assistant (HH-RLHF)	Anthropic	paper			365
2022-04-05	★ PaLM	Google	model			2.13K	540B
2022-03-29	★ Chinchilla (Compute-Optimal Training)	Google	paper			663
2022-03-25	★ CodeGen	Salesforce	model	5.17K	1.81K	235	16.1B
2022-03-20	Delta Tuning 2	OpenBMB	paper library	1.04K
2022-03-07	DINO (DETR) 2	IDEA Lab	library paper	2.81K		760
2022-03-04	★ InstructGPT (RLHF)	OpenAI	paper			4.29K
2022-03-02	DN-DETR 2	IDEA Lab	library paper	605		55
2022-02-11	BMTrain	OpenBMB	library	624
2022-02-08	★ AlphaCode	Google	paper
2022-02-07	OFA: One For All	Alibaba	model	2.56K		258
2022-01-30	FEDformer 2	Huawei	model paper	802		541
2022-01-28	★ Chain-of-Thought Prompting	Google	paper			4.25K
2022-01-28	DAB-DETR 2	IDEA Lab	library paper	579		397
2022-01-25	SPIRAL 2	Huawei	model paper	604		7
2022-01-24	SenseCore AI Infrastructure	SenseTime	announcement
2022-01-14	DeepSpeed-MoE	Microsoft	paper	42.49K		55
2021-12-31	ERNIE-ViLG	Baidu	model			30	10B
2021-12-01	★ EXAONE 1.0	LG	model				300B
2021-12-01	tFold	Tencent	model	158
2021-11-30	Donut	Naver	paper	6.88K		5
2021-11-22	Fengshenbang 3	IDEA Lab	library model	4.13K	3.14K
2021-11-01	KoGPT	Kakao	model	1.01K	17		6.2B			2.05K
2021-10-13	ByteTrack	ByteDance	library	6.44K		105
2021-10-10	Yuan 1.0	Inspur	model	589		25	245B
2021-09-28	DiffVC 2	Huawei	model paper	604		25
2021-09-10	★ HyperCLOVA	Naver	model			4	204B		560B
2021-09-03	★ FLAN (Instruction Tuning)	Google	paper			69
2021-08-10	★ Codex	OpenAI	model	3.26K		1.43K	12B
2021-07-28	Triton	OpenAI	library	19.41K
2021-07-15	★ AlphaFold 2	Google	paper
2021-07-12	SPLADE	Naver	paper			2
2021-07-08	OpenDILab / DI-engine 2	SenseTime, PJLab	library	3.62K
2021-07-08	OpenPPL (PPLNN)	SenseTime	library	1.37K
2021-07-07	★ HumanEval	OpenAI	eval	3.26K		1.43K							164
2021-07-05	★ ERNIE 3.0 & 3.0 Titan	Baidu	model			195	260B
2021-07-01	Meituan Sky Project	Meituan	announcement
2021-06-28	PFP / Matlantis	PFN	model
2021-06-24	CPM-2	BAAI	model	164		15	198B
2021-06-17	★ LoRA (Low-Rank Adaptation)	Microsoft	paper			2.45K
2021-06-01	OceanBase	Ant Group	library	10.15K
2021-06-01	★ Wu Dao 2.0 2	BAAI	model paper				1.75T
2021-06-01	Wu Dao Corpora	BAAI	dataset
2021-05-26	CogView	Z.ai, BAAI	model	1.8K		383
2021-05-13	Grad-TTS 2	Huawei	model paper	604		43
2021-05-01	Trustworthy AI White Paper	Xiaomi	paper
2021-04-29	★ DINO	Meta	paper
2021-04-26	★ PanGu-alpha 2	Huawei, PCL	model paper	3.16K		94	200B
2021-03-20	★ Wu Dao 1.0	BAAI	model				11.3B (max)
2021-03-18	★ GLM (Original) 2	Z.ai	model paper	3.51K	285	21	10B
2021-03-18	P-Tuning	Z.ai	paper	2.08K		265
2021-03-01	M6 Series	Alibaba	model			48
2021-02-27	Transformer in Transformer (TNT) 2	Huawei	model paper	4.42K		1.01K
2021-02-26	★ CLIP	OpenAI	paper	33.74K		5.3K
2021-01-11	★ Switch Transformer	Google	paper			361
2021-01-05	★ DALL-E	OpenAI	model			1.13K	12B
2021-01-01	MS-MARCO-CN	Baidu	dataset			18
2021-01-01	PaddleNLP	Baidu	library	12.95K
2021-01-01	PaddleSpeech	Baidu	library	12.61K
2021-01-01	KoBART	SK Telecom	model	467	127
2020-12-07	HEBO 2	Huawei	library paper	2.77K		19
2020-12-01	CPM-1	BAAI	model	1.58K		22	2.6B
2020-10-22	★ Vision Transformer (ViT)	Google	paper	12.58K		21.58K
2020-10-08	Deformable DETR 2	SenseTime	model paper	3.98K	12.41K	1.87K
2020-09-12	FuxiCTR / BARS 3	Huawei	library paper	1.43K		12
2020-08-01	Vega	Huawei	library	848
2020-07-15	PaddleOCR	Baidu	library	81.73K
2020-06-11	★ GPT-3	OpenAI	model			3.03K	175B		300B	2.05K
2020-06-08	★ Liquid Time-constant Networks	Liquid AI	paper			3
2020-04-08	DynaBERT 2	Huawei	model paper	3.16K	12	119
2020-03-28	MindSpore	Huawei	library	4.69K
2020-03-13	★ ProGen	Salesforce	model	702		34	1.2B
2020-03-10	Bolt	Huawei	library	958
2020-02-13	★ DeepSpeed	Microsoft	library	42.49K
2020-01-23	★ Scaling Laws for Neural Language Models	OpenAI	paper			1.51K
2020-01-01	Kunlun XPU	Baidu	announcement
2020-01-01	PaddleDetection	Baidu	library	14.24K
2020-01-01	PaddleSeg	Baidu	library	9.34K
2020-01-01	KoGPT2	SK Telecom	model	558
2019-11-27	GhostNet 4	Huawei	model paper	4.42K		406
2019-09-23	TinyBERT 2	Huawei	model paper	3.16K	169.89K	137
2019-09-17	★ Megatron-LM	NVIDIA	library	16.66K		835
2019-09-01	NEZHA 2	Huawei	model paper	3.16K		86
2019-08-23	Ascend 910 Series	Huawei	announcement
2019-08-01	Chainer	PFN	library			27
2019-07-29	★ ERNIE 2.0	Baidu	model			74
2019-07-25	★ Optuna	PFN	library	14.34K
2019-07-19	SUMBT	SK Telecom	paper	90		24
2019-06-20	Alchemy	Tencent	dataset	114		65
2019-06-01	KoBERT	SK Telecom	model	1.41K	18.01K
2019-04-19	★ ERNIE 1.0 2	Baidu	model paper			770
2019-04-03	CRAFT	Naver	paper	3.38K		58
2019-02-14	★ GPT-2	OpenAI	model	24.92K	13.36M		1.5B			1.02K
2018-12-01	★ JAX	Google	library	35.79K
2018-10-11	★ BERT	Google	model				340M
2018-10-01	OpenMMLab / MMDetection 2	SenseTime, PJLab	library paper	32.75K		794
2018-06-11	★ GPT-1	OpenAI	model				117M			512
2018-06-01	MACE (Mobile AI Compute Engine)	Xiaomi	library	5.04K
2018-03-16	ApolloScape	Baidu	dataset	617
2018-03-01	Kata Containers	Ant Group	library	8.05K
2017-11-24	StarGAN	Naver	paper			18
2017-11-14	DuReader	Baidu	dataset	1.18K		51
2017-11-02	★ VQ-VAE	Google	paper			1.93K
2017-07-26	Xiao AI	Xiaomi	announcement
2017-07-20	★ Proximal Policy Optimization (PPO)	OpenAI	paper
2017-07-05	Apollo	Baidu	library	26.66K
2017-06-12	★ Attention Is All You Need (Transformer)	Google	paper
2017-06-01	Angel ML	Tencent	library	6.79K
2017-03-15	DiscoGAN	SK Telecom	paper	776		732
2017-01-18	★ PyTorch	Meta	library	100.65K		16.19K
2016-09-30	PaddlePaddle	Baidu	library	23.94K
2016-04-27	OpenAI Gym	OpenAI	library	37.22K
2016-01-27	★ AlphaGo	Google	paper
2015-11-09	★ TensorFlow	Google	library	195.63K		8.82K
2015-03-09	★ Distilling the Knowledge in a Neural Network	Google	paper			13.96K
2015-02-26	★ DQN (Deep Q-Network)	Google	paper
2015-02-11	★ Batch Normalization	Google	paper			24.38K
2014-12-17	Deep Speech 1 & 2	Baidu	paper
