Date Name Lab Type Stars Downloads Citations Params Active Tokens Context Intel Open Questions Tasks
2026-06-30 LongCat-2.0 Meituan model 1.6T 48B 1M
2026-06-30 Meituan Releases LongCat-2.0 (1.6T MoE) — Trained Entirely on Domestic Chinese Chips, No NVIDIA Reuters Meituan news
2026-06-29 Smooth Scaling Laws Hide Stepwise Token Learning Xiaohongshu paper
2026-06-22 AudioCALM: Continuous Autoregressive Language Modeling for Universal Audio Generation Alibaba paper
2026-06-22 Reinforcement Learning Towards Broadly and Persistently Beneficial Models OpenAI paper
2026-06-22 SpaceX Inks Compute Deal with Reflection AI for Colossus 2 (Up to ~$6.3B Through 2029) TechCrunch SpaceX news
2026-06-18 Laguna M.1 Released as Open Weights (Apache 2.0); Base and Post-Trained Checkpoints on Hugging Face Hugging Face Poolside news
2026-06-17 Xiaohongshu plans for Hong Kong IPO by year-end, targets US$70b valuation The Standard (via WSJ) Xiaohongshu news
2026-06-16 SpaceX Agrees to Acquire Cursor for $60B All-Stock, Days After IPO (Pending Regulatory Approval) TechCrunch SpaceX news
2026-06-16 Z.ai GLM-5.2 Tops the AA Intelligence Index as the Highest-Scoring Open Model (51); MIT Open Weights Released Crypto Briefing Z.ai news
2026-06-15 HCLTech to Buy 10.5% Stake in Sarvam AI for ~$150M (₹14.27B), Valuing It at $1.5B — Series B First Close ($234M Raised), Unicorn Status Reuters Sarvam news
2026-06-13 GLM-5.2 Z.ai model 753B 1M 51
2026-06-12 Kimi Code CLI Moonshot AI library
2026-06-12 Kimi K2.7-Code Moonshot AI model 1T 32B 262.14K
2026-06-11 MiniMax Sparse Attention (MSA) MiniMax paper
2026-06-11 NexAU (Agent Universe) Nex-AGI library
2026-06-11 Zonos2 (ZONOS2) Zyphra model
2026-06-11 SpaceX Prices Largest IPO Ever at $135/Share; Trades on Nasdaq as SPCX TechCrunch SpaceX news
2026-06-10 MiMoCode Xiaomi library
2026-06-09 Claude Fable 5 Anthropic model 60 11/100
2026-06-09 Claude Mythos 5 Anthropic model 11/100
2026-06-09 North Mini Code Cohere model 30B 3B 262.14K 21
2026-06-09 DiffusionGemma Google model 25.2B 3.8B 262.14K
2026-06-09 Nex-N2 Nex-AGI model 397B 17B
2026-06-09 Cohere Releases North Mini Code — Its First Developer-Focused Model, a 30B-A3B Apache-2.0 Agentic Coder Cohere Cohere news
2026-06-08 OpenAI Confidentially Files for IPO (Last Valued at $852B), One Week After Anthropic TechCrunch OpenAI news
2026-06-08 Xiaomi MiMo-V2.5-Pro-UltraSpeed Breaks 1000 Tokens/s on a 1T-Parameter Model (with TileRT) Xiaomi MiMo Xiaomi news
2026-06-05 DaX (大象) Alibaba paper
2026-06-05 Chronos-2 Amazon model 5.45K 12.42M 1 120M 8.19K
2026-06-04 BioMysteryBench Anthropic eval 99
2026-06-04 OPI-Struc (STELLA) BAAI dataset 1
2026-06-04 Nemotron 3 Ultra NVIDIA model 1.35K 550B 55B 1M 38
2026-06-04 Nemotron 3.5 Content Safety NVIDIA model 5.91K 4B
2026-06-04 SciCore-Omics OpenBMB model 82388B
2026-06-04 dots.tts 2 Xiaohongshu model paper 2B
2026-06-03 Gemma 4 12B Google model 816.16K11.95B262.14K
2026-06-03 ChartNet IBM dataset
2026-06-03 OfficeComprehensionBenchmark Microsoft eval 31.04K2
2026-06-03 RHELM Microsoft eval 1.3K7
2026-06-02 MAI-Code-1-Flash Microsoft model 137B 262.14K
2026-06-02 MAI-Thinking-1 Microsoft model 1T 35B 30T 262.14K
2026-06-02 Zamba2-VL (Vision-Language) Zyphra model 7B
2026-06-02 Build 2026: Seven MAI Models Launched — MAI-Thinking-1 (1T/35B Reasoning), MAI-Code-1-Flash, Multimodal Stack Refresh Microsoft AI Microsoft news
2026-06-02 Zamba2-VL Released: Hybrid SSM Vision-Language Models (1.2B / 2.7B / 7B) Zyphra Zyphra news
2026-06-01 MiniMax-M3 MiniMax model 428B 23B 1.05M 44
2026-06-01 Anthropic Confidentially Files for IPO at ~$965B Valuation, First Among AI Labs Fortune Anthropic news
2026-06-01 Nemotron 3 Ultra Announced at Computex Taipei — 550B/55B MoE, AAII 48, Ships June 4 on HuggingFace Artificial Analysis NVIDIA news
2026-05-31 HakushoBench NII eval 32.05K
2026-05-29 SchGen Microsoft model 1520B13.31K
2026-05-29 Universal Audio Tokenizer Tencent model 4
2026-05-29 Trinity Is Moving to OpenMDW-1.1 Arcee AI Blog Arcee news
2026-05-28 Claude Opus 4.8 Anthropic model 56 11/100
2026-05-28 Ultra-FineWeb-L3 OpenBMB dataset
2026-05-28 Step-3.7-Flash StepFun model 50.19K 198B 11B 262.14K
2026-05-28 Autonomous Agentic Data Engineering Tencent paper
2026-05-28 ByteDance Developing Custom CPU Chips to Support AI Rollout; Pursuing Both Arm and RISC-V Tracks Reuters ByteDance news
2026-05-28 Microsoft to Unveil Homegrown Coding Model + Image / Reasoning / Speech / Transcription Suite at Build 2026 The Information Microsoft news
2026-05-28 Mistral Chases AI Superintelligence to Counter U.S. Dominance WSJ Mistral news
2026-05-28 SKT Launches A.Biz Cowork Internal AI Agent (Beta) and AXMS 1.5 Platform Upgrade Seoul Economic Daily SK Telecom news
2026-05-27 Sci-Base PJLab dataset
2026-05-26 Granite Guardian 4.1 IBM model 1.17K 8B
2026-05-26 The MiniMax-M2 Series: Technical Report MiniMax paper
2026-05-26 LocateAnything-3B NVIDIA model 131.79K3B
2026-05-26 DeepMind CEO Demis Hassabis: Humanity Has 'a Few Years' to Prepare for AGI; 'Foothills of the Singularity' Axios Google news
2026-05-25 MiniCPM5-1B OpenBMB model 137.34K1.08B131.07K
2026-05-24 Granite Switch 4.1 IBM model 761.1K30B131.07K
2026-05-23 ScaleAcross Explorer Meta paper
2026-05-23 Nemotron-Labs Diffusion NVIDIA model 12.46K 14B
2026-05-23 BitCPM-CANN OpenBMB model 9.42K 7.17K 8B
2026-05-22 Intern-S2-Preview PJLab model 8096.73K36B131.07K
2026-05-22 Liang Wenfeng Commits to AGI Mission Over Near-term Commercialization as ¥70B (~$10B) Round Advances Bloomberg DeepSeek news
2026-05-21 Hunyuan Model Matrix Refresh: TurboS, T1, T1-Vision, and Hunyuan Voice (Top-8 Chatbot Arena, +50% T1-Vision Speed) KrASIA Tencent news
2026-05-21 Tencent Open-Sources Hy-MT2 Translation Family (1.8B / 7B / 30B-A3B) + IFMTBench Tencent Hunyuan Tencent news
2026-05-20 Qwen3.7-Max-Preview Alibaba model 1M 46
2026-05-20 Command A+ Cohere model 113.99K 218B 25B 128K 29 39/100
2026-05-20 Lens Microsoft model 2343.93K3.8B
2026-05-20 Qwen3.7-Max-Preview Unveiled at Alibaba Cloud Summit — AA Intelligence Index 57 (#1 Among Chinese Labs) Qwen Alibaba news
2026-05-20 Cohere Releases Command A+ — First Full Apache-2.0 Open Model with Lossless 4-bit Quantization and Native Citations VentureBeat Cohere news
2026-05-20 Kakao Partners with Google DeepMind on SynthID for Kanana Models — First Asian Firm to Adopt Seoul Economic Daily Kakao news
2026-05-20 Q1 FY27 Earnings: $81.6B Revenue (+85% YoY), $80B Buyback Authorized, Dividend Raised 25x to $0.25 NVIDIA NVIDIA news
2026-05-19 Antigravity 2.0 Google model
2026-05-19 Gemini 3.5 Flash Google model 1M 50 6/100
2026-05-19 Gemini Omni Google model
2026-05-19 Anthropic Hires Andrej Karpathy to Pre-training Team; Will Lead Sub-team Using Claude to Accelerate Pretraining Research CNBC Anthropic news
2026-05-19 Hitachi Deploys Claude to ~290,000 Employees; Embeds in Lumada 3.0 / HMAX for Critical Infrastructure Hitachi Anthropic news
2026-05-19 I/O 2026: Gemini 3.5 Flash (AAII 55), Gemini Omni Video Generator, Antigravity 2.0, Gemini Spark, AI Ultra Repriced $200/mo Google Google news
2026-05-19 Four years after ChatGPT, Xiaohongshu's AI restraint gives way to urgency KrASIA Xiaohongshu news
2026-05-18 First Vera CPU Deliveries to Anthropic, OpenAI, SpaceXAI, and Oracle NVIDIA NVIDIA news
2026-05-16 Full Attention Strikes Back (RTPurbo) Alibaba paper
2026-05-15 Grok Build Launches — Agentic Coding CLI Competing with Claude Code, Codex, and Antigravity Engadget SpaceX news
2026-05-14 TWN: Think When Needed Alibaba paper
2026-05-14 Realtime Voice API GA + gpt-realtime-2 Family (3 New Audio Models) OpenAI OpenAI news
2026-05-14 HCLTech to Anchor $300M Sarvam Round at $1.5B; Bessemer +$50M; NVIDIA, Prosperity7 Participating Outlook Business Sarvam news
2026-05-14 SKT × Korean Defense Ministry Sign MOU on Applying Sovereign AI Foundation Model to Defense SK Telecom SK Telecom news
2026-05-14 SpaceXAI Division Bleeding Researchers Since Merger; 11+ to Meta, 7+ to Thinking Machines TechCrunch SpaceX news
2026-05-13 Granite Embedding Multilingual R2 IBM paper
2026-05-13 NexRL Nex-AGI library
2026-05-12 Kuaishou Plans to Spin Off Kling AI Video Unit at $20B Valuation; Tencent in Talks for $2B Pre-IPO Round The Information Kuaishou news
2026-05-11 MiniCPM-V 4.6 OpenBMB model 615.51K1.3B262.14K7
2026-05-11 DeepSeek First External Funding Round Reportedly Near Close at $45–50B Valuation, Led by China's 'Big Fund III' SCMP DeepSeek news
2026-05-11 Zyphra Announces 15 MW of AMD Instinct MI355X GPU Capacity for Zyphra Cloud Memeburn Zyphra news
2026-05-09 ERNIE 5.1 Baidu model
2026-05-09 Step-Audio-R1.1 (Realtime) Tops Big Bench Audio at 96.4%, Surpassing Grok Voice Agent Artificial Analysis StepFun news
2026-05-07 Cola DLM ByteDance paper
2026-05-07 AI Co-Mathematician Google paper
2026-05-07 ZAYA1-74B-Preview Zyphra model 74B 4B 15T 262.14K
2026-05-07 OMAI Compute Cluster Goes Live — $152M NSF + Blackwell-Ultra Infrastructure for Open Science AI Ai2 Ai2 news
2026-05-07 Kakao Announces Kanana 2.5 — 150B Agent-Focused LLM at Q1 Earnings Call Korea Herald Kakao news
2026-05-07 Kimi Chatbot Maker Moonshot AI Valued at $20 Billion in Meituan-Led Round Bloomberg Meituan news
2026-05-07 Kimi Chatbot Maker Moonshot AI Valued at $20 Billion in Meituan-Led Round Bloomberg Moonshot AI news
2026-05-07 ZAYA1-74B-Preview: Scaling Pretraining on AMD (74B/4B MoE) Zyphra Zyphra news
2026-05-06 ZAYA1-8B Zyphra model 8B 700M
2026-05-06 DeepSeek in Talks for First-Ever Outside Round at $45B; Tencent + Big Fund III in Lead Group TechCrunch DeepSeek news
2026-05-05 TRIBE v2 (Brain Activity Foundation Model) Meta paper 1
2026-05-05 iOS 27 to Let Users Swap in Claude, Gemini, and Others as Default Apple Intelligence Model Bloomberg Apple news
2026-05-05 GPT-5.5 Instant Becomes Default ChatGPT Model; 52.5% Fewer Hallucinated Claims vs 5.3 Instant OpenAI OpenAI news
2026-05-04 Horizon Length in LLM Agent Training Microsoft paper
2026-05-04 Zyphra Launches Zyphra Cloud & Zyphra Inference — Serverless Inference for Open Models, AMD-First Zyphra Zyphra news
2026-05-03 Korea's National Growth Fund and SIF Approve KRW 560B (~$400M) Direct Equity in Upstage — First Software Co. Recipient Seoul Economic Daily Upstage news
2026-05-01 Huawei's AI Chip Gains Ground as DeepSeek and Others Shift Away from Nvidia Financial Times Huawei news
2026-04-30 OlmPool: Cracks in the Foundation Ai2 paper 7
2026-04-30 SenseTime Is Running Its New Model on Chinese Chips WIRED SenseTime news
2026-04-30 xAI Launches Grok 4.3 with Improved Agentic Performance and Lower Pricing Artificial Analysis SpaceX news
2026-04-29 Granite 4.1 IBM model 152619.44K30B15T512K961/100
2026-04-29 Mistral Medium 3.5 Mistral model 400.07K 128B 256K 30 33/100
2026-04-29 Granite 4.1 Released: 3B/8B/30B Dense Models, 512K Context, 8B Matches Prior 32B MoE IBM Research IBM news
2026-04-28 Laguna M.1 Poolside model 225.8B 23.4B 30T 262.14K
2026-04-28 Laguna XS.2 Poolside model 33.4B 3B 30T 262.14K
2026-04-28 MiMo-V2.5-Pro Xiaomi model 72.22K 1.02T 42B 1M 42 39/100
2026-04-28 Poolside Launches Laguna XS.2 (Open, Apache 2.0) and Laguna M.1 Agentic Coding Models Poolside Poolside news
2026-04-24 DeepSeek-V4 3 DeepSeek model paper 6.84M 1.6T 49B 33T 1M 44 50/100
2026-04-24 Cohere Completes Merger with Germany's Aleph Alpha, Creating Transatlantic AI Champion Financial Times Cohere news
2026-04-24 DeepSeek-V4 Released: 1.6T/49B MoE, First Frontier Model Trained Entirely on Huawei Ascend 950PR, MIT License DeepSeek DeepSeek news
2026-04-23 Sapiens2 Meta model 5B
2026-04-23 GPT-5.5 OpenAI model ~9.7T 1M 55 6/100
2026-04-23 Hy3 Preview Tencent model 37382.33K295B21B256K3433/100
2026-04-22 Qwen3.6 Open-Weight Models 2 Alibaba model 3.54K 10.41M 35B 3B 262.14K
2026-04-22 LLaDA 2.0-Uni Ant Group model 7.22K 16B
2026-04-22 Tencent and Alibaba in Talks to Invest in DeepSeek at $20B+ Valuation — First External Funding Bloomberg DeepSeek news
2026-04-22 Cloud Next '26: TPU 8t/8i Announced, Deep Research Max Agents, Chrome Auto Browse, Thinking Machines Lab Multi-Billion Deal 9to5Google Google news
2026-04-21 SpaceX Strikes Deal for Right to Acquire Cursor for $60B Bloomberg SpaceX news
2026-04-20 BAR: Branch-Adapt-Route Ai2 paper 150
2026-04-20 Kimi K2.6 Moonshot AI model 1T 32B 262.14K 43 33/100
2026-04-20 Qwen3.6-Max-Preview: Alibaba's Most Powerful Model, #1 on Six Coding Benchmarks (AA Intelligence Index 52) Qwen Alibaba news
2026-04-20 Amazon Invests $5B More (Total $13B); Anthropic Commits $100B+ AWS Spend Over 10 Years, Secures Up to 5GW Compute Anthropic Anthropic news
2026-04-17 Grok 4.3 SpaceX model 38
2026-04-16 Claude Opus 4.7 Anthropic model ~4T 54 11/100
2026-04-16 LeapAlign: Post-Training Flow Matching Models at Any Generation Step ByteDance paper
2026-04-16 Prefill-as-a-Service: Cross-Datacenter KVCache for Next-Generation Models Moonshot AI paper
2026-04-16 Claude Opus 4.7 Released Anthropic Anthropic news
2026-04-16 ByteDance Recruits DeepSeek R1 Lead Author Daya Guo for Seed Agent Team SCMP ByteDance news
2026-04-16 DeepSeek R1 Lead Author Daya Guo Joins ByteDance Seed Amid Intensifying AI Talent War SCMP DeepSeek news
2026-04-16 DeepSeek V4 Imminent — 1T-Parameter MoE to Run Solely on Huawei Ascend 950PR Chips Dataconomy DeepSeek news
2026-04-16 How France's Mistral Built a $14 Billion AI Empire by Not Being American Forbes Mistral news
2026-04-16 GPT-Rosalind Launched for Life Sciences Drug Discovery OpenAI OpenAI news
2026-04-15 Revenue Run Rate Hits $30B; VCs Offer Up to $800B Valuation Axios Anthropic news
2026-04-15 Upstage Becomes Korea's First Generative AI Unicorn with $126M Series C Seoul Economic Daily Upstage news
2026-04-14 Lightning OPD: Efficient Post-Training for Large Reasoning Models NVIDIA paper
2026-04-14 Gemini Robotics-ER 1.6 Launched; Boston Dynamics Partnership for Industrial AI Boston Dynamics Google news
2026-04-14 NAACP Sues xAI Over Memphis Colossus Data Center Pollution CNBC SpaceX news
2026-04-13 OpenAI Touts Amazon Alliance, Says Microsoft Has 'Limited Our Ability' to Reach Enterprise CNBC OpenAI news
2026-04-13 StepFun Unwinding Offshore Structure to Pave Way for HK IPO at Up to $10B Reuters StepFun news
2026-04-12 SoftBank/NEC/Honda/Sony Form JV for Trillion-Parameter Physical AI Model; $6.3B Government Backing Nikkei Asia SB Intuitions news
2026-04-10 Nexus: Common Minima for Better Generalization ByteDance paper
2026-04-10 Alibaba Token Hub Created: 5 AI Units Consolidated Under CEO Eddie Wu; RMB 380B ($53B) 3-Year Commitment SCMP Alibaba news
2026-04-10 Cohere in Advanced Merger Talks with Germany's Aleph Alpha Reuters Cohere news
2026-04-10 SK Telecom Partners with Rebellions and Arm for Sovereign AI Inference Infrastructure Rebellions SK Telecom news
2026-04-10 xAI Spending Pushed SpaceX to Nearly $5B Loss; CFO Anthony Armstrong Departs The Information SpaceX news
2026-04-09 Metis: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models Alibaba paper
2026-04-09 HiFloat4 Format for LLM Pre-training on Ascend NPUs Huawei paper
2026-04-09 EXAONE 4.5 LG model 33B 256K
2026-04-09 Efficient RL Training for LLMs with Experience Replay Meta paper
2026-04-09 EXAONE 4.5 Released — LG's First Open-Weight Vision-Language Model Korea Herald LG news
2026-04-09 Naver Shuts Down Clova X Chatbot; Pivots to Vertical AI Integrated into Search, Shopping, Finance Seoul Economic Daily Naver news
2026-04-08 Muse Spark Meta model 260K 43
2026-04-08 Muse Spark Unveiled — First Model from Superintelligence Labs (Proprietary) Bloomberg Meta news
2026-04-08 Zhipu Hikes Prices Again as China AI Monetization Wave Quickens Bloomberg Z.ai news
2026-04-07 Harrier Microsoft model 387.22K 27B (max)
2026-04-07 GLM-5.1 Z.ai model 3.39K 123.47K 754B 40 44/100
2026-04-07 Claude Mythos Withheld from Public Release; Project Glasswing Cybersecurity Consortium Launched with Apple and Google Fortune Anthropic news
2026-04-07 Ascend 950PR AI Chip in Production; 750K Units Planned for 2026; Alibaba, ByteDance, Tencent Place Massive Orders TrendForce Huawei news
2026-04-07 GLM-5.1 Open-Source Release Scores #3 on Code Arena (1530 Elo); Stock Surges 19% BuildFastWithAI Z.ai news
2026-04-06 AI Agent Traps Google paper
2026-04-06 MedGemma 1.5 Google model 1.51K 416.35K 4B
2026-04-06 Multi-GW Compute Partnership Expansion with Google Cloud and Broadcom TechCrunch Anthropic news
2026-04-06 NVIDIA Acquires SchedMD (Slurm Workload Manager); Draws Regulatory Scrutiny Reuters NVIDIA news
2026-04-03 Microsoft Announces $10B Japan AI Infrastructure Investment (2026-2029) WSJ Microsoft news
2026-04-03 TII Launches Falcon Perception — 600M-Parameter Open Multimodal Model for Grounding and Segmentation TII TII news
2026-04-03 Xiaomi Reveals MiMo-V2-Pro (1T Parameters), Approaching GPT-5.2 / Opus 4.6 Performance VentureBeat Xiaomi news
2026-04-02 Qwen 3.6-Plus Alibaba model 1M
2026-04-02 Trinity Large Thinking Arcee model 10.66K 512K 24 44/100
2026-04-02 Gemma 4 Google model 31B 4B (max) 256K 29 39/100
2026-04-02 MAI Multimodal Stack (Transcribe / Voice / Image) Microsoft model
2026-04-02 SWE-HERO NVIDIA paper
2026-04-02 Alibaba Unveils Third Closed-Source AI Model in Focus on Profit Bloomberg Alibaba news
2026-04-02 Arcee's New Open-Source Trinity Large Thinking Is the Rare Powerful U.S.-Made Model VentureBeat Arcee news
2026-04-02 Gemma 4 Open Models Released Google Developers Blog Google news
2026-04-02 Poolside's $2B Series C Collapses; CoreWeave Exits 2GW Texas Data Center (Project Horizon) DataCenterDynamics Poolside news
2026-04-02 Sarvam AI Nearing $300-350M Raise at $1.5B Valuation Led by Bessemer with Nvidia and Amazon Bloomberg Sarvam news
2026-04-01 Simple Self-Distillation for Code Generation Apple paper
2026-04-01 Scaling Reasoning Tokens via RL and Parallel Thinking ByteDance paper
2026-04-01 Procedural Knowledge at Scale Improves Reasoning Meta paper
2026-04-01 Speech LLMs as Contextual Reasoning Transcribers Microsoft paper
2026-04-01 GLM-5V-Turbo Z.ai model 744B 40B 28.5T 202.75K 34
2026-04-01 Moonshot AI Raising $1B at $18B Valuation; Working with CICC and Goldman Sachs on HK IPO Bloomberg Moonshot AI news
2026-03-31 Think-Anywhere Alibaba paper 55
2026-03-31 ASI-Evolve SII paper 726
2026-03-31 OpenAI Closes $122B Round at $852B Valuation OpenAI OpenAI news
2026-03-31 Zhipu's Losses Climb 60% After Chinese AI Rivalry Worsens Bloomberg Z.ai news
2026-03-30 Mistral AI Raises $830M in Debt to Set Up a Data Center Near Paris TechCrunch Mistral news
2026-03-28 daVinci-LLM SII model 1543B
2026-03-28 Falcon Perception TII model 71512.87K600M
2026-03-28 DeepSeek Before V4: Culture, Organization, and Liang Wenfeng's Unique Goals (English summary) LatePost (晚点) DeepSeek news
2026-03-26 Cohere Transcribe Cohere model 551.93K2B
2026-03-26 Intern-S1-Pro PJLab model 1T 22B
2026-03-26 China's Moonshot AI Seeks Listing in Hong Kong Under Heightened Scrutiny WSJ Moonshot AI news
2026-03-25 LongCat-Next Meituan model 4382.18K74B3B
2026-03-25 Alibaba Launches AI Model Task Force; Top Researcher Resigns The Information Alibaba news
2026-03-25 MiniMax-M2.7, GLM-5 at 1/3 Cost Latent Space MiniMax news
2026-03-24 DeepSeek's Latest Job Postings Highlight Pivot to Agentic AI Bloomberg DeepSeek news
2026-03-23 SkillRouter Alibaba paper
2026-03-23 Felis ByteDance paper
2026-03-20 LongCat-Flash-Prover Meituan model 8564560B27B
2026-03-19 dots.mocr Xiaohongshu model 3B
2026-03-18 Path-Constrained Mixture-of-Experts Apple paper
2026-03-18 Qianfan-OCR Baidu model 174.67K4B
2026-03-18 MiniMax-M2.7 MiniMax model 38 22/100
2026-03-18 MiMo-V2-Omni Xiaomi model 35
2026-03-18 MiMo-V2-Pro Xiaomi model 1T 42B 1M 40
2026-03-18 MiMo-V2-TTS Xiaomi model
2026-03-18 Chinese AI Developer Zhipu to Create New Unit for Product Development The Information Z.ai news
2026-03-17 PRISM: Demystifying Retention and Interaction in Mid-Training IBM paper
2026-03-17 Pre-training LLM without Learning Rate Decay Enhances Supervised Fine-Tuning SB Intuitions paper
2026-03-16 Mixture-of-Depths Attention (MoDA) ByteDance paper 267
2026-03-16 Mistral Small 4 Mistral model 45.71K 119B 6.5B 256K 12 39/100
2026-03-16 Attention Residuals 2 Moonshot AI paper library 3.3K
2026-03-16 CUBE: A Standard for Unifying Agent Benchmarks ServiceNow paper
2026-03-15 Scientific Judge 2 Baidu paper dataset 405
2026-03-13 OpenSWE / daVinci-Env SII dataset 187
2026-03-12 RoboBrain-Dex BAAI model 41
2026-03-12 IndexShare (IndexCache): Cross-Layer Index Reuse for Sparse Attention Z.ai paper
2026-03-11 Nemotron 3 Super NVIDIA model 745.73K 120B 12B 1M 25 83/100
2026-03-10 Exclusive Self Attention Apple paper
2026-03-10 Ai2 CEO Ali Farhadi Steps Down; Microsoft Hires Key Researchers GeekWire Ai2 news
2026-03-09 Anthropic Sues Trump Admin Over Pentagon AI Blacklist CNBC Anthropic news
2026-03-09 OpenAI Acquires Promptfoo for AI Agent Security OpenAI OpenAI news
2026-03-08 Scalable Training of MoE Models with Megatron Core NVIDIA paper
2026-03-06 Sarvam-105B Sarvam model 12.57K 105B 10.3B 128K 12 39/100
2026-03-06 Sarvam-30B Sarvam model 30B 2.4B 32K 7 39/100
2026-03-05 GPT-5.4 OpenAI model ~2.2T 1M 51 6/100
2026-03-05 GPT-5.4 Released with 1M Token Context OpenAI OpenAI news
2026-03-04 RIVER PJLab dataset 10
2026-03-01 OLMo Hybrid Ai2 model 48.9K 7B 6T
2026-03-01 LLM-jp-4 NII model 4.22K 32B 3.8B 11.7T 65.54K
2026-02-28 AnyTouch2 / ToucHD 2 BAAI dataset paper
2026-02-25 MaxClaw MiniMax library
2026-02-25 ZUNA (EEG Foundation Model) Zyphra model 380M
2026-02-25 Tencent-Backed AI Startup StepFun Is Said to Plan Hong Kong IPO Bloomberg StepFun news
2026-02-19 Gemini 3.1 Pro Google model 1M 46 6/100
2026-02-19 Gemini 3.1 Pro Released, Ties #1 on AA Intelligence Index Google DeepMind Google news
2026-02-17 OLMix Ai2 paper
2026-02-17 Tiny Aya Cohere model 1.79K3.35B
2026-02-17 Grok-4.20 SpaceX model 2M 37
2026-02-17 Mercury 2 Released: Diffusion LLM with AA Index 33 at 1000 tok/s Inception Labs Inception Labs news
2026-02-16 Qwen3.5 5 Alibaba model 3.54K 397B 17B 1M 34 39/100
2026-02-16 WebWorld Alibaba model 391.26K32B
2026-02-16 Ling 2.5 Ant Group model 215 1T 1M
2026-02-16 ZoomBench Ant Group dataset 155
2026-02-15 Optimal Batch Size Scheduling via Functional Scaling Laws Meituan paper
2026-02-14 Doubao-Seed-2.0 ByteDance model
2026-02-14 Doubao-Seed-2.0 Family Launched (Pro / Lite / Mini / Code) TechNode ByteDance news
2026-02-13 Cohere's $240M Year Sets Stage for IPO TechCrunch Cohere news
2026-02-12 MiniMax-M2.5 MiniMax model 585589.2K229B3428/100
2026-02-12 GEBench StepFun dataset 54
2026-02-12 FireRed-Image-Edit Xiaohongshu model
2026-02-12 Xiaomi-Robotics-0 2 Xiaomi model paper 4.7B
2026-02-12 Anthropic Raises $30B Series G at $380B Valuation Anthropic Anthropic news
2026-02-11 Ming-Flash-Omni-2.0 Ant Group model 2.66K
2026-02-11 MiniCPM-SALA OpenBMB model 9.42K 6.33K 1M
2026-02-11 Step-3.5-Flash 3 StepFun model paper dataset 2.08K 325.86K 196B 11B 256K 26
2026-02-11 GLM-5 3 Z.ai model paper 3.39K 102.79K 744B 44B 28.5T 40 50/100
2026-02-11 Slime: Asynchronous RL for Agentic Tasks Z.ai library 6.07K
2026-02-09 Protenix ByteDance model 1.94K
2026-02-09 InternAgent-1.5 PJLab paper
2026-02-08 Data Darwinism / Darwin Corpora SII dataset 154
2026-02-07 Seedance 2.0 ByteDance model
2026-02-07 FireRed-OpenStoryline Xiaohongshu library
2026-02-06 Baichuan-M3 2 Baichuan paper model 2451.84K235B
2026-02-05 Claude Opus 4.6 Anthropic model ~5.3T 1M 44 11/100
2026-02-05 Kling 3.0 2 Kuaishou model paper
2026-02-05 Claude Opus 4.6 Released with 1M Context Anthropic Anthropic news
2026-02-04 RationaleRM Alibaba dataset
2026-02-03 MiniCPM-o 4.5 OpenBMB model 25.58K 203.83K
2026-02-02 WAXAL: African Language Speech Corpus Google dataset
2026-02-02 Kimi K2.5 2 Moonshot AI model paper 2.02K 1.64M 32B 38 33/100
2026-02-02 daVinci-Agency SII model 9
2026-02-02 SpaceX Acquires xAI at $1.25T Combined Valuation Fortune SpaceX news
2026-01-30 Keel: Post-LayerNorm Is Back ByteDance paper
2026-01-29 SenseNova-MARS 3 SenseTime model paper dataset 1147432B (max)
2026-01-28 Trinity Large Arcee model 523 398B 13B 17T 512K
2026-01-28 Trinity Mini / Nano Arcee model 14.85K 26B 3B 131K
2026-01-28 ACE-Step-1.5 StepFun model 10.97K
2026-01-27 K2 Think V2 MBZUAI model 1.28K 70B 262.14K 17 89/100
2026-01-27 LongCat-Flash-Lite 2 Meituan model paper 2.08K1744/100
2026-01-27 Mistral AI Surges Revenue 20-Fold to Over $400 Million ARR MLQ Mistral news
2026-01-27 Tencent Bets Its AI Future on 28-Year-Old From OpenAI Caixin Tencent news
2026-01-26 DeepPlanning Alibaba dataset
2026-01-26 daVinci-Dev SII model 28
2026-01-26 Solar Pro 3 Upstage model 102B 12B 128K
2026-01-23 LongCat-Flash-Thinking-2601 2 Meituan model paper 2547.66K560B27B
2026-01-22 ERNIE 5.0 2 Baidu model paper 22.4T22
2026-01-22 EvoCUA Meituan library 3234.12K
2026-01-21 CorpusQA: A 10 Million Token Benchmark for Corpus-Level Analysis and Reasoning Alibaba eval
2026-01-20 Yuan 3.0 Ultra Inspur model 236351T68.8B
2026-01-20 Step-3-VL-10B 2 StepFun model paper 406484.26K10B
2026-01-15 Tao Qin Elected 2025 ACM Fellow ACM ZGCA news
2026-01-12 Engram: Conditional Memory via Scalable Lookup DeepSeek paper
2026-01-12 Alphabet Hits $4T Market Cap CNBC Google news
2026-01-11 Distributional Clarity: The Hidden Driver of RL-Friendliness in Large Language Models Baidu paper
2026-01-11 Solar Open 100B Upstage model 3.87K 102B 12B 19.7T 131.07K
2026-01-09 PaCoRe: Learning to Scale Test-Time Compute StepFun paper 334110
2026-01-09 Zhipu and MiniMax IPO ChinaTalk MiniMax news
2026-01-09 Zhipu and MiniMax IPO ChinaTalk Z.ai news
2026-01-06 xAI Raises $20B Series E at $230B Valuation CNBC SpaceX news
2026-01-05 Yuan 3.0 Flash Inspur model 187840B3.7B
2026-01-05 K-EXAONE LG model 236B 23B 25 28/100
2026-01-05 HyperCLOVA X SEED Omni Naver model 6488B
2026-01-05 Falcon-H1R TII model 95.76K 7B 256K 10 44/100
2026-01-03 HyperCLOVA X SEED Think Naver model 156.09K 32B 128K 17 31/100
2026-01-01 FlashInfer-python-paddle Baidu library
2026-01-01 Agentar-Z-100K Z.ai dataset
2025-12-31 FineWeb-Mask ByteDance dataset
2025-12-31 mHC: Manifold-Constrained Hyper-Connections DeepSeek paper
2025-12-31 OpenOneRec Kuaishou library 81233
2025-12-30 SeedFold ByteDance paper
2025-12-30 LongCat ZigZag Attention Meituan paper 842
2025-12-27 RollArt: Disaggregated Multi-Task Agentic RL Training at Scale Alibaba paper
2025-12-27 A.X K1 SK Telecom model 3033.17K519B33B10T131K
2025-12-23 MiniMax-M2.1 MiniMax model 54410.19K229B3128/100
2025-12-23 VIBE & OctoCodingBench MiniMax dataset
2025-12-23 Step-DeepResearch StepFun library 561
2025-12-23 Zhipu AI's Rise from Tsinghua Lab Pandaily Z.ai news
2025-12-22 SekoTalk / Seko 2.0 SenseTime model 43
2025-12-22 GLM-4.7 Z.ai model 3444/100
2025-12-19 Kanana-2 Kakao model 15030B3B32K
2025-12-19 Kakao Open-Sources Kanana-2 Model Optimized for Agentic AI Korea Times Kakao news
2025-12-18 Seed1.8 ByteDance model 218
2025-12-18 EXAONE Path 2.5 LG paper
2025-12-18 Towards Scalable Pre-training of Visual Tokenizers MiniMax paper 490
2025-12-18 HY-Motion 1.0 Tencent paper
2025-12-18 Seed1.8 Released as a Generalized Agentic Model ByteDance Seed ByteDance news
2025-12-17 Peter DeSantis to Lead Unified AGI Org; Rohit Prasad Departing CNBC Amazon news
2025-12-17 Tencent restructures AI operations, promotes high-profile recruit to chief AI scientist SCMP Tencent news
2025-12-16 Molmo 2 Ai2 model 64118B (max)
2025-12-16 MiMo-V2-Flash 2 Xiaomi model paper 1.33K 70.62K 309B 15B 27T 33
2025-12-16 MOPD (Multi-Teacher On-Policy Distillation) Xiaomi library 1.33K
2025-12-15 Nemotron 3 Nano NVIDIA model 2.18M 30B 3.5B 25T 1M 18 83/100
2025-12-15 NVIDIA in Advanced Talks to Acquire AI21 Labs for $2-3B SiliconANGLE AI21 Labs news
2025-12-10 LLaDA 2 2 Ant Group model paper 4309.38K
2025-12-09 JAIS 2 MBZUAI model 2.5K 70B 2.6T 8.19K
2025-12-08 LongCat-Image 3 Meituan model paper 69547.59K6B
2025-12-06 K2-V2 (LLM360) MBZUAI model 18370B
2025-12-05 NEO (Native VLM Architecture) 2 SenseTime model paper 82519B (max)
2025-12-05 Hunyuan 2.0 Tencent model 406B 32B 256K
2025-12-04 Nex-N1: Agentic Models via Large-Scale Environment Construction Nex-AGI paper
2025-12-02 Amazon Nova 2 Amazon model 1M 22 11/100
2025-12-02 Mistral Large 3 Mistral model 1.98K 675B 41B 256K 16 39/100
2025-12-02 Nova 2 Model Family and Nova Act GA at re:Invent 2025 TechCrunch Amazon news
2025-12-02 Anthropic Acquires Bun, Claude Code Hits $1B ARR Anthropic Anthropic news
2025-12-01 Ministral 3 Mistral model 782.22K14B (max)6
2025-12-01 John Giannandrea to Retire; Amar Subramanya Named VP of AI Apple Apple news
2025-11-30 gelab-zero (STEP-GUI) StepFun library 2.19K 2.21K
2025-11-28 LFM2 (Liquid Foundation Models 2) Liquid AI model 24B 2.3B 5 28/100
2025-11-27 DeepSeek-Math-V2 2 DeepSeek model dataset 1.59K384
2025-11-24 Claude 4.5 Opus Anthropic model ~3.4T 200K 35 11/100
2025-11-24 HunyuanOCR Tencent model 1.65K 347.7K 1B
2025-11-20 OLMo 3 Ai2 model 10.31K 32B 5.9T 65.54K 8 89/100
2025-11-20 AICC: A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser PJLab dataset
2025-11-20 HunyuanVideo-1.5 Tencent model 4.47K 2.34K
2025-11-20 MiMo-Embodied: X-Embodied Foundation Model Xiaomi paper 1.12K
2025-11-19 LPLB (Linear-Programming Load Balancer) DeepSeek library 505
2025-11-19 Step-Audio-R1 StepFun model 67322233B
2025-11-19 Yann LeCun Departs Meta to Found AMI Labs CNBC Meta news
2025-11-17 SenseNova-SI (Spatial Intelligence) 3 SenseTime model paper dataset 2718B (max)
2025-11-15 Doubao Seed Code ByteDance model 256K 26 11/100
2025-11-15 Doubao Seed Code (Reasoning Coder) Hits AA Intelligence Index 34 Artificial Analysis ByteDance news
2025-11-14 Miloco (Xiaomi Local Copilot) Xiaomi library 2.61K
2025-11-13 M100 Chip Baidu announcement
2025-11-12 AlphaProof Google paper
2025-11-12 Interview: Ant Group's Open Model Ambitions Interconnects Ant Group news
2025-11-10 kosong Moonshot AI library 520
2025-11-06 InfinityStar ByteDance model
2025-11-06 Step-Audio-EditX StepFun model 92943.36K3B
2025-11-05 SoftBank and SB Intuitions launch Sarashina API for enterprise access to Japanese LLM SoftBank SB Intuitions news
2025-11-03 LongCat-Flash-Omni 2 Meituan model paper 49262560B27B
2025-11-01 LightX2V SenseTime library 2.36K
2025-11-01 Inception Labs Raises $56M Seed from Menlo, Andrew Ng, Karpathy Inception Labs Inception Labs news
2025-10-31 GATE LG paper
2025-10-30 Emu3.5 BAAI model 1.52K907
2025-10-30 Kimi Linear 2 Moonshot AI model paper 1.4K 48B 3B
2025-10-29 Ouro ByteDance model 87.71K 2.6B 7.7T
2025-10-28 ODesign BAAI model 311
2025-10-28 URSA (Uniform Discrete Diffusion) BAAI model
2025-10-28 Parallel Loop Transformer ByteDance paper
2025-10-28 OpenAI Completes For-Profit PBC Restructuring OpenAI OpenAI news
2025-10-27 MiniMax-M2 MiniMax model 2.6K 127.97K 230B 10B 28 28/100
2025-10-27 CoKE: Context as the Key to Biomolecular Understanding PJLab paper 18
2025-10-27 JanusCoder PJLab model 8047
2025-10-27 Hunyuan Mirror Tencent paper 1.14K2.99K1
2025-10-25 LongCat-Video 3 Meituan model paper 4.26K 3.21K 13.6B
2025-10-24 KAT-Coder 2 Kuaishou model paper 72B
2025-10-23 Anthropic to Expand Google Cloud TPU Use to 1M+ TPUs Anthropic Anthropic news
2025-10-22 Seed3D 1.0 ByteDance model
2025-10-20 DeepSeek-OCR / OCR-2 DeepSeek model 23.27K 2.35M
2025-10-17 LongCat-Audio-Codec Meituan paper 301
2025-10-16 MorphoBench ZGCA paper 13
2025-10-15 Granite 4.0 IBM model 32B9B556/100
2025-10-15 InteractiveOmni 2 SenseTime model paper 88B (max)
2025-10-15 Granite 4.0: Hybrid Mamba Architecture, First ISO 42001 Certified Open Models IBM IBM news
2025-10-14 Rex-Omni 2 IDEA Lab model paper 1.44K 33.78K 3B
2025-10-14 Zhipu AI Breaks US Chip Reliance With First Major Model Trained on Huawei Stack SCMP Z.ai news
2025-10-13 RITE: Reinforcement Learning for Tool-Integrated Interleaved Thinking Meituan paper
2025-10-09 Ling 2.0 / Ling-1T 2 Ant Group model paper 3.35K 1T 50B 10 44/100
2025-10-01 R-HORIZON-Websearch Meituan dataset 26
2025-10-01 GDPval OpenAI eval 11320
2025-10-01 IBM Research Names Jay Gambetta as Director; Dario Gil to DOE IBM IBM news
2025-09-30 GLM-4.6 Z.ai model 355B2344/100
2025-09-29 Ring 4 Ant Group model paper 25838.2K1T63B262.14K
2025-09-29 DeepSeek-V3.2 2 DeepSeek model paper 3.59M2685B37B33
2025-09-28 HunyuanImage-3.0 2 Tencent model paper 3.12K2.61K13B
2025-09-26 Qwen3Guard Alibaba model 465
2025-09-25 Expanding Reasoning Potential (CoTP) Meituan paper
2025-09-24 LRM-Eval / ROME BAAI dataset 5
2025-09-23 ByteWrist ByteDance model
2025-09-23 LongCat-Flash-Thinking 2 Meituan model paper 285103560B27B
2025-09-23 Symphony-MoE PCL paper
2025-09-22 BGE-Reasoner BAAI model 31959
2025-09-22 ScaleCUA PJLab model 1.11K58
2025-09-18 Seedream 4.0 ByteDance model 1
2025-09-17 AToken Apple paper 140
2025-09-16 Shanghai launches innovation institute to bridge AI research and industry Shanghai Municipal Government SII news
2025-09-15 checkpoint-engine Moonshot AI library 963
2025-09-08 PLaMo 2 PFN model 34.05K31B2T32K
2025-09-05 Klear 3 Kuaishou model paper 8258146B2.5B
2025-09-05 MiniCPM4.1 2 OpenBMB model paper 9.42K49.98K8B
2025-09-02 Baichuan-M2 2 Baichuan paper model 212938132B
2025-09-02 Apertus Swiss AI model 161.34K270B15T65.54K289/100
2025-09-01 VeOmni ByteDance library 2K
2025-09-01 LongCat-Flash-Chat 2 Meituan model paper 1.34K81.37K1560B27B128K
2025-09-01 Hunyuan-MT Tencent model 71056.53K30B3B
2025-09-01 RLinf ZGCA library
2025-09-01 TwinBrainVLA ZGCA paper
2025-09-01 Mistral AI Raises EUR 2B at EUR 12B Valuation Mistral AI Mistral news
2025-08-28 HyperOS 3 Xiaomi announcement
2025-08-26 MiniCPM-V 4.5 2 OpenBMB model paper 25.58K93.36K
2025-08-25 GEPO PCL paper
2025-08-25 InternVL 3.5 PJLab model 10.06K3241B (max)28B (max)
2025-08-23 HunyuanVideo-Foley Tencent paper
2025-08-21 Fin-PRM: Process Reward Model for Financial Reasoning Alibaba paper 502
2025-08-21 Waver ByteDance model 938
2025-08-21 DeepSeek-V3.1 DeepSeek model 21
2025-08-21 Intern-S1 PJLab model 241B 28B
2025-08-20 Seed-OSS-36B ByteDance model 88536.95K36B12T512K1844/100
2025-08-20 Nemotron Nano V2 NVIDIA model 15.37K 12B 128K
2025-08-20 Seed-OSS-36B Released as Apache-2.0 Open-Weight Model VentureBeat ByteDance news
2025-08-15 PXDesign ByteDance model 229
2025-08-15 Physical Autoregressive Model (PAR) PCL paper
2025-08-14 NextStep-1 2 StepFun model paper 6886214B
2025-08-14 Hunyuan-GameCraft 1.0 Tencent model 72376
2025-08-14 Cohere Raises $500M at $6.8B Valuation Cohere Cohere news
2025-08-14 Cohere Hires Long-Time Meta Research Head Joelle Pineau as Chief AI Officer TechCrunch Cohere news
2025-08-12 Mistral Medium 3.1 Mistral model 128K1511/100
2025-08-12 InternBootcamp PJLab library 349
2025-08-11 GLM-4.5V Z.ai model 2.33K 167.01K 106B 12B 64K
2025-08-07 CANN Huawei library
2025-08-07 GPT-5 OpenAI model ~4.1T400K366/100
2025-08-07 TMA-Adaptive FP8 Grouped GEMM PJLab paper 25
2025-08-06 ACAVCaps Xiaomi dataset 424
2025-08-05 OmniScale ByteDance paper 2K
2025-08-05 Seed Diffusion ByteDance model
2025-08-05 dots.vlm1 Xiaohongshu model
2025-08-01 Qwen-Image 2 Alibaba model 7.98K 173.33K 20B
2025-08-01 MegaDFT ZGCA paper
2025-08-01 Ai2 and UW Awarded $152M from NSF and NVIDIA for Open Scientific AI GeekWire Ai2 news
2025-07-31 Seed-Prover ByteDance model 433
2025-07-30 dots.ocr Xiaohongshu model 3B
2025-07-29 Libra-Bench & PIE_bench Meituan dataset
2025-07-28 MixGRPO Tencent paper 1.14K
2025-07-28 GLM-4.5 2 Z.ai model paper 2355B19
2025-07-27 SenseNova V6.5 SenseTime model
2025-07-27 StepFun-Prover-Preview StepFun model 356032B
2025-07-27 HunyuanWorld 3 Tencent model 2.85K615
2025-07-25 Step-3 2 StepFun model paper 453144.09K321B38B
2025-07-24 A.X 3.1 SK Telecom model 1330534B
2025-07-24 SoftBank Corp. to Build the World's Largest AI Computing Infrastructure Using NVIDIA DGX SuperPOD with NVIDIA Blackwell GPUs SB Intuitions SB Intuitions news
2025-07-23 Towards Greater Leverage: Scaling Laws for Efficient MoE Ant Group paper
2025-07-23 ASI-Arch SII paper 726
2025-07-22 Qwen-Code Alibaba library 25.07K
2025-07-22 Qwen3-Coder 2 Alibaba model 16.61K 1M 480B (max) 35B (max) 14 44/100
2025-07-22 Seed-X Series ByteDance model 1721307B
2025-07-22 Reka Raises $110M Series B at $1B Valuation Reka Reka news
2025-07-17 Agentar-DeepFinance-100K Ant Group dataset 35
2025-07-17 Apple Intelligence Foundation Models Tech Report 2025 Apple Apple news
2025-07-14 EXAONE 4.0 LG model 33.89K32B628/100
2025-07-12 Scaling Laws for Optimal Data Mixtures Apple paper
2025-07-11 Kimi K2 4 Moonshot AI model paper 10.84K 2.71M 5 1T 32B 256K 19 44/100
2025-07-10 FlexOlmo Ai2 model 1501.4K33B
2025-07-10 KAT (Kwai-AutoThink) 2 Kuaishou paper model 5572B
2025-07-09 EXAONE Path 2.0 LG paper
2025-07-09 Grok-4 SpaceX model ~3.2T256K336/100
2025-07-09 Grok-4 Released with Native Tool Use and Reasoning xAI SpaceX news
2025-07-07 POLAR PJLab paper 167
2025-07-05 How to Train Your LLM Web Agent ServiceNow paper
2025-07-03 IFBench Ai2 eval 140130058
2025-07-03 A.X 4.0 SK Telecom model 158653
2025-07-01 CodePRM: Execution Feedback-enhanced Process Reward Model for Code Generation Huawei paper
2025-07-01 Voxtral Mistral model 337.47K
2025-07-01 Solar Pro 2 Upstage model 31B64K811/100
2025-06-30 openPangu Huawei announcement
2025-06-30 Meta Superintelligence Labs Created; Wang Named Chief AI Officer CNBC Meta news
2025-06-27 HyperCLOVA X THINK Naver model 128K
2025-06-27 Hunyuan-A13B 2 Tencent model paper 81646.54K80B13B256K
2025-06-26 Kwai Keye-VL 3 Kuaishou model paper 785192.8K31B3B262.14K
2025-06-25 OctoThinker SII model 1888B (max)
2025-06-24 Video-XL-2 BAAI model 54
2025-06-17 Mercury (Diffusion LLM) Inception Labs model 1128K25
2025-06-16 SciSage / SurveyScope BAAI library
2025-06-16 MiniMax-M1 2 MiniMax model paper 3.15K 855 456B 45.9B 1M
2025-06-15 AI-Driven Agentic Design Platform for Tumor Immunotherapy Drugs ZGCA announcement
2025-06-15 ZGCA & ZGCI Unveil AI-Driven Tumor Immunotherapy Drug Design Platform Zhongguancun Academy ZGCA news
2025-06-13 Scientists' First Exam PJLab eval 83066
2025-06-12 Seed-1.6 (AdaCoT) ByteDance model 256K
2025-06-12 Magistral Mistral model 38.49K224B (max)1250/100
2025-06-12 Predictable Scale Part II: Farseer StepFun paper
2025-06-12 Seed-1.6 Introduces Adaptive Chain-of-Thought (AdaCoT) ByteDance Seed ByteDance news
2025-06-11 FlagEvalMM BAAI library 106
2025-06-10 Seedance 1.0 ByteDance model
2025-06-09 RedNote joins AI race with its own open-source model that it says bests Alibaba, DeepSeek SCMP Xiaohongshu news
2025-06-07 The Illusion of Thinking Apple paper
2025-06-06 RoboBrain 2.0 2 BAAI model 1.09K573
2025-06-06 MiniCPM4 2 OpenBMB model paper 9.42K11.73K8B
2025-06-06 Ultra-FineWeb OpenBMB dataset 30
2025-06-06 dots.llm1 2 Xiaohongshu model paper 142B14B11.2T32.77K
2025-06-05 RoboRefer / RefSpatial BAAI model 263
2025-06-04 MiMo-VL 2 Xiaomi model paper 6433.93K7B
2025-06-04 Chinese social media app Xiaohongshu's $26 billion valuation bolsters GSR fund Bloomberg Xiaohongshu news
2025-06-01 HumanSense Benchmark Ant Group dataset
2025-06-01 BrowseComp & WideSearch Moonshot AI dataset
2025-06-01 kimi-agent-sdk Moonshot AI library 487
2025-06-01 kimi-cli Moonshot AI library 8.94K
2025-06-01 Kimi-Dev 2 Moonshot AI model paper 1.23K2.67K72B
2025-06-01 Kimi-Researcher Moonshot AI model 81
2025-06-01 walle Moonshot AI library 21
2025-06-01 AgentCPM Series 3 OpenBMB paper 8001.67K
2025-06-01 A.X Encoder SK Telecom model 2.68K
2025-06-01 CF-Div2-Stepfun StepFun dataset
2025-06-01 SteptronOss StepFun library 575
2025-06-01 MiMo-Audio 2 Xiaomi model paper 7B
2025-06-01 BrowseComp Z.ai dataset
2025-06-01 KTransformers Z.ai library 17.26K
2025-05-30 AReaL Ant Group library 5.29K
2025-05-28 Ming-Omni Ant Group model 65642
2025-05-28 DeepSeek-R1-0528 DeepSeek model 671B 37B
2025-05-28 Pangu Embedded Huawei paper 118
2025-05-28 Skywork Open Reasoner 1 Skywork model 74332B
2025-05-27 Pangu Pro MoE Huawei paper 1331
2025-05-27 HunyuanVideo-Avatar Tencent paper
2025-05-26 SynLogic 2 MiniMax paper dataset 203121
2025-05-23 One RL to See Them All: Visual Triple Unified RL MiniMax paper 333
2025-05-22 Claude 4 Anthropic model ~1.4T1M2511/100
2025-05-22 XRing O1 Xiaomi announcement
2025-05-21 Devstral 2 Mistral model 123B256K15
2025-05-21 Falcon-H1 TII model 11811.02K34B256K
2025-05-20 BAGEL ByteDance model 6K82014B
2025-05-17 Video-SafetyBench BAAI eval 122.26K
2025-05-17 Model Merging in Pre-training of LLMs ByteDance paper
2025-05-15 BGE-Code-v1 BAAI model 11.8K4.84K
2025-05-15 Apriel-Nemotron-15B Reasoning Model with NVIDIA ServiceNow ServiceNow news
2025-05-14 AlphaEvolve Google paper 6
2025-05-12 Seed1.5-VL ByteDance model 1.58K 200B 20B
2025-05-12 MiniMax-Speech: Intrinsic Zero-Shot TTS MiniMax paper
2025-05-12 Step1X-3D: High-Fidelity Textured 3D Assets StepFun model 870
2025-05-10 Gated Attention for Large Language Models Alibaba paper
2025-05-08 Seed-Coder-8B ByteDance model 7558B65.54K
2025-05-07 DeerFlow ByteDance library 70.89K
2025-05-07 Pangu Ultra MoE 2 Huawei model paper 16 718B 39B
2025-05-07 HunyuanCustom Tencent paper 1.22K1
2025-05-06 CCI 4.0 BAAI dataset
2025-05-06 OpenSeek BAAI model 2614
2025-05-06 RoboOS 2 BAAI library 575
2025-05-02 MiMo (Reasoning) 2 Xiaomi model paper 91.71K17B
2025-05-01 AWorld Ant Group library 1.2K
2025-05-01 Kanana 1.5 Kakao model 8915.7B3B32K
2025-04-30 Amazon Nova Premier Amazon model ~470B1M1311/100
2025-04-30 DeepSeek-Prover-V2 DeepSeek model 1.27K 633 671B
2025-04-30 Nova Premier Launched as Amazon's Most Capable AI Model TechCrunch Amazon news
2025-04-30 Phi-4 Reasoning Models Released with Chain-of-Thought Microsoft Microsoft news
2025-04-29 Qwen3 9 Alibaba model paper 27.3K741T22B (max)13
2025-04-29 First LlamaCon Developer Conference Meta AI Meta news
2025-04-25 PolyMath Alibaba dataset 44
2025-04-25 Kimi-Audio 2 Moonshot AI model paper 4.65K80.22K7B
2025-04-24 Step1X-Edit StepFun model 2.22K110
2025-04-22 TTRL: Test-Time Reinforcement Learning PJLab paper
2025-04-19 SRPO: Staged History-Resampling Policy Optimization Kuaishou paper
2025-04-17 Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping 3 NVIDIA paper dataset 1
2025-04-16 o3 OpenAI model ~3T200K306/100
2025-04-15 DataDecide Ai2 paper
2025-04-15 ReTool: Reinforcement Learning for Strategic Tool Use in LLMs ByteDance paper 2
2025-04-15 Kling 2.0 1 Kuaishou model
2025-04-15 Kimina-Prover 2 Moonshot AI model paper 370717
2025-04-15 miniF2F-test (Rectified) Moonshot AI dataset 370
2025-04-15 Apriel ServiceNow model 32415B (max)4.5T
2025-04-15 Step-R1-V-Mini StepFun model
2025-04-15 ZR1-1.5B Zyphra model 1.5K1.5B
2025-04-15 Apriel-5B: ServiceNow's First Open SLM ServiceNow ServiceNow news
2025-04-14 InternVL3 PJLab model 10.06K578B (max)
2025-04-12 SenseNova V6 SenseTime model 600B
2025-04-10 Scaling Laws for Native Multimodal Models Apple paper
2025-04-10 Seed1.5-Thinking: Advancing Superb Reasoning Models with RL ByteDance paper 1
2025-04-10 Pangu Ultra 2 Huawei model paper 77135B
2025-04-10 Kimi-VL 2 Moonshot AI model paper 1.2K123.09K116B2.8B
2025-04-08 Amazon Nova Sonic Amazon model
2025-04-08 Dream 7B Huawei model 1.25K7B
2025-04-08 Skywork R1V Series Skywork model 4138B
2025-04-08 Nova Sonic Speech-to-Speech Model Launched on Bedrock AWS Amazon news
2025-04-07 BaichuanMed-OCR Baichuan model 16972B (max)
2025-04-05 Llama 4 Meta model 400B 17B 1M 14 28/100
2025-04-05 Llama 4 Scout and Maverick Released (First MoE, Multimodal) Meta AI Meta news
2025-04-04 Nemotron-H NVIDIA model 58.35K56B20T
2025-04-03 DeepSeek-GRM: Inference-Time Scaling for Generalist Reward Modeling DeepSeek paper 1
2025-04-01 MiniMax Speech Series MiniMax model
2025-03-31 Amazon Nova Act Amazon model 909
2025-03-31 Amazon Unveils Nova Act, an AI Agent That Controls a Web Browser TechCrunch Amazon news
2025-03-30 ToRL: Scaling Tool-Integrated RL SII paper 3491
2025-03-28 Doubao-Deep-Thinking ByteDance model
2025-03-27 OpenComplex 2 BAAI model 269
2025-03-26 Qwen2.5-Omni-7B Alibaba model 4.02K 777.76K 7B
2025-03-25 Gemini 2.5 Pro Google model ~1.2T1M276/100
2025-03-24 SimpleRL-Zoo: Investigating and Taming Zero RL for Open Base Models ByteDance, Meituan paper
2025-03-21 Hunyuan-T1 Tencent model
2025-03-18 Sable BAAI model 2
2025-03-18 Llama-Nemotron (Nano/Super/Ultra) NVIDIA model 123.53K1128K953/100
2025-03-18 HaploVL Tencent model 65
2025-03-17 EXAONE Deep LG model 1.4K32B
2025-03-16 ERNIE 4.5 Baidu model 7.72K424B (max)47B (max)956/100
2025-03-16 ERNIE X1 Baidu model
2025-03-12 Gemma 3 Google model 1.42M4027B128K
2025-03-10 Seedream 2.0 ByteDance paper
2025-03-10 Reka Flash 3 Released (Open-Weight, 21B) Reka Reka news
2025-03-07 Ling 2 Ant Group paper model 25838.63K116B
2025-03-06 QwQ-32B Alibaba model 52.45K 32B
2025-03-06 BGE-VL 2 BAAI model dataset 11.8K4.55K
2025-03-06 Predictable Scale Part I: Step Law StepFun paper 1
2025-03-04 CogView-4 Z.ai model 1.1K
2025-03-03 Aya Vision Cohere model 160.27K32B
2025-03-01 Command A Cohere model 2.28K111B256K833/100
2025-03-01 Sarashina2.2 3 SB Intuitions model 12.98K4B
2025-02-28 3FS (Fire-Flyer File System) DeepSeek library 9.96K
2025-02-28 Smallpond DeepSeek library 4.96K
2025-02-28 Image-01 MiniMax model
2025-02-27 RoboBrain BAAI model 553132
2025-02-27 UniTok ByteDance paper 526
2025-02-27 DualPipe DeepSeek library 2.96K
2025-02-27 EPLB (Expert Parallelism Load Balancer) DeepSeek library 1.39K
2025-02-27 Hunyuan Turbo S 2 Tencent model paper 560B56B
2025-02-26 DeepGEMM DeepSeek library 7.36K
2025-02-26 BIG-Bench Extra Hard (BBEH) Google eval 4.52K23
2025-02-26 Kanana Kakao model 280132.5B3T
2025-02-26 Granite 3.2: Multimodal Vision and Chain-of-Thought Reasoning IBM IBM news
2025-02-25 DeepEP DeepSeek library 9.71K
2025-02-24 Claude Code Anthropic library 131.53K
2025-02-24 Baichuan-Audio 2 Baichuan paper model 22210310B
2025-02-24 FlashMLA DeepSeek library 12.7K
2025-02-24 Reasoning with Latent Thoughts: On the Power of Looped Transformers Google paper 1
2025-02-24 Muon Optimizer 2 Moonshot AI paper library 1
2025-02-24 Topic Over Source: The Key to Effective Data Mixing for LLM Pre-training PJLab paper 1
2025-02-22 Moonlight-3B/16B Moonshot AI model 1.49K32.57K116B (max)
2025-02-20 SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines ByteDance eval 188426.53K285
2025-02-19 Qwen2.5-VL Alibaba model 52
2025-02-19 FlexTok Apple paper 319
2025-02-18 MoBA: Mixture of Block Attention for Long-Context LLMs Moonshot AI paper 2.13K2
2025-02-18 Hunyuan-Large-Vision Tencent model 389B52B
2025-02-17 Mistral Saba Mistral model 24B32K
2025-02-17 OpenDWM / MaskGWM 2 SenseTime library paper 398
2025-02-17 Grok-3 SpaceX model ~2.1T1M18
2025-02-17 Step-Audio / Step-Audio2 StepFun model 27
2025-02-17 Grok-3 Launched, Trained on 200K GPU Colossus Cluster xAI SpaceX news
2025-02-16 AdaGC: Improving Training Stability for Large Language Model Pretraining Baidu paper
2025-02-16 NSA: Native Sparse Attention DeepSeek paper 2
2025-02-15 1bit-Merging 1 Huawei paper
2025-02-14 WebOrganizer Ai2 paper
2025-02-14 LLaDA 2 Ant Group model 3.82K2.27K5
2025-02-14 Step-Video-T2V 2 StepFun model paper 3.19K300B
2025-02-12 Wu Yonghui Joins ByteDance as Head of Seed Basic Research SCMP ByteDance news
2025-02-11 Nature Language Model (NatureLM) Microsoft model 46146.7B13B
2025-02-05 Scaling Laws for Upcycling Mixture-of-Experts Language Models SB Intuitions paper 9
2025-02-05 LIMO SII paper 543
2025-02-04 OpenAI and Kakao to Jointly Develop AI Products for South Korea CNBC Kakao news
2025-02-01 ModernBERT-Ja SB Intuitions model 44.81K310M
2025-01-30 MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding PJLab paper
2025-01-30 MedXpertQA PJLab eval 4.46K
2025-01-26 Baichuan-Omni-1.5 2 Baichuan paper model 1911.84K7B
2025-01-24 Baichuan-M1 3 Baichuan paper model 821514.5B
2025-01-24 FireRedASR Xiaohongshu model 8.3B (max)
2025-01-23 UltraRAG OpenBMB library 5.58K
2025-01-22 Doubao-1.5-Pro ByteDance model 200B 20B 256K
2025-01-22 UI-TARS ByteDance library 10.9K621.09K5
2025-01-22 DeepSeek-R1 DeepSeek model 5.35M 671B 37B 20 50/100
2025-01-22 Revisit Self-Debugging with Self-Generated Tests for Code Generation Meituan paper 1
2025-01-21 Hunyuan3D 2.0 3 Tencent model paper 13.93K75.95K6
2025-01-21 Stargate Project: $500B AI Infrastructure Initiative OpenAI OpenAI news
2025-01-20 Kimi k1.5 2 Moonshot AI model paper 3.47K11
2025-01-17 ComplexFuncBench Z.ai dataset 180
2025-01-15 InternLM3 PJLab model 7.22K
2025-01-14 MiniMax-01 3 MiniMax model paper 3.43K172.55K1456B4M
2025-01-14 MiniCPM-o 2.6 OpenBMB model 25.58K386.43K8B
2025-01-10 GThinker PCL model
2025-01-09 WanJuan 3.0 (WanJuan-SiLu) PJLab dataset
2025-01-07 Cosmos NVIDIA model 9
2025-01-03 AgentRefine Meituan paper
2025-01-01 Document Parse Upstage library
2024-12-31 OLMo 2 Ai2 model 3.72K232B6T4.1K
2024-12-26 DeepSeek-V3 3 DeepSeek model paper 227 671B 37B 10
2024-12-25 QVQ Alibaba model
2024-12-24 LLM-jp-3 (172B) NII model 20172B2.1T4.1K
2024-12-23 Baichuan4-Finance 2 Baichuan paper model 1
2024-12-18 NOVA (Non-quantized Video Autoregressive) BAAI model 651
2024-12-13 DeepSeek-VL2 DeepSeek model 5.3K2.68K22
2024-12-13 Liquid AI Raises $250M Series A Led by AMD Liquid AI Liquid AI news
2024-12-13 Profile: Shanghai AI Lab: Driving both AI safety and development MERICS PJLab news
2024-12-12 Phi-4 Microsoft model 801.85K2414B550/100
2024-12-12 Phi-4 Released: 14B SLM Specializing in Complex Reasoning Microsoft Research Microsoft news
2024-12-09 ProcessBench Alibaba dataset 1
2024-12-06 Aya Expanse Cohere model 48.1K332B128K
2024-12-06 EXAONE 3.5 LG model 10.25K232B
2024-12-06 Densing Law of LLMs OpenBMB paper 2
2024-12-06 InternVL 2.5 PJLab model 10.06K3491378B (max)
2024-12-05 Language Model Ladders Ai2 paper
2024-12-05 Infinity & InfinityStar ByteDance model 1.57K1
2024-12-05 Liquid: Scalable Multi-modal Generation ByteDance model 6431
2024-12-05 Divot Tencent model 87
2024-12-05 Moto Tencent paper 177
2024-12-04 GenCast Google paper
2024-12-04 RedStone Microsoft dataset 161
2024-12-03 Amazon Nova Amazon model 2~90B300K811/100
2024-12-03 AWS Trainium2 (Trn2 / Trn2 UltraServer) Amazon announcement
2024-12-03 HunyuanVideo Tencent model 12.19K613B
2024-12-03 SEED-Voken Tencent paper 1.01K
2024-12-03 GLM-4-Voice: End-to-End Spoken Chatbot Z.ai model 2
2024-12-03 AWS Trainium2 Chips Generally Available; Trainium3 Previewed TechCrunch Amazon news
2024-12-01 Falcon 3 TII model 10B
2024-11-25 Model Context Protocol (MCP) Anthropic library
2024-11-22 Tülu 3 Ai2 model 3.75K2.6K7
2024-11-22 Zamba2 (Hybrid SSM/Transformer Suite) Zyphra model 19311037.4B3T
2024-11-22 Amazon Doubles Anthropic Investment to $8 Billion CNBC Amazon news
2024-11-21 AIMv2 Apple paper 1.42K2
2024-11-21 DINO-X 2 IDEA Lab model paper 1.39K4
2024-11-20 Hymba NVIDIA paper 2792
2024-11-19 Aquila-VL-2B BAAI model 49
2024-11-15 MARS (Make vAriance Reduction Shine) ByteDance paper 721
2024-11-08 Sarashina2-8x70B SB Intuitions model 12
2024-11-08 SB Intuitions releases 460B-parameter Japanese LLM Sarashina2-8x70B for academia and industry SB Intuitions SB Intuitions news
2024-11-04 Hunyuan-Large 2 Tencent model paper 1.59K4826389B52B7T256K
2024-11-04 Hunyuan3D 1.0 Tencent model 3.48K
2024-11-01 SimpleQA OpenAI eval 124.33K
2024-11-01 InternThinker PJLab model
2024-10-29 Agentforce Platform Launched for Enterprise AI Agents Salesforce Salesforce news
2024-10-28 AutoGLM 2 Z.ai model paper 1
2024-10-28 Zhongguancun Institute of Artificial Intelligence Established ZGCI ZGCA news
2024-10-24 Infinity-MM BAAI dataset
2024-10-24 MotionCLR 2 IDEA Lab library paper 17
2024-10-24 Skywork-Reward 2 Skywork model 37
2024-10-22 OmniGen BAAI model 4.33K171
2024-10-17 Janus 4 DeepSeek model paper dataset 17.75K10.89K11
2024-10-15 Zyda-2 Zyphra dataset
2024-10-11 Baichuan-Omni 2 Baichuan paper model 2737B
2024-10-09 MLE-bench OpenAI eval 1.57K975
2024-10-09 PLaMo-100B PFN model 118100B2T
2024-10-09 Demis Hassabis & John Jumper Awarded Nobel Prize in Chemistry Google DeepMind Google news
2024-10-07 Falcon Mamba TII model 123.63K37.27B5.8T8.19K
2024-10-02 Llama-3.1-Nemotron-70B NVIDIA model 584128K
2024-10-02 Poolside Raises $500M Series B at ~$3B Valuation, Led by Bain Capital Ventures Crunchbase News Poolside news
2024-10-01 TxT360 MBZUAI dataset 22
2024-09-27 Emu3 2 BAAI paper 2.42K4
2024-09-25 Molmo Ai2 model 9132.17K8
2024-09-23 MobileUI Dataset Xiaomi dataset 79
2024-09-23 MobileVLM Xiaomi model 79
2024-09-19 Qwen2.5 3 Alibaba model paper 27.3K7472B (max)18T10
2024-09-18 Qwen2-VL Alibaba model 27.42K76
2024-09-18 Qwen2.5-Coder 2 Alibaba model paper 16.61K37480B (max)7
2024-09-18 Qwen2.5-Math 2 Alibaba model paper 1.08K1272B (max)
2024-09-12 o1 OpenAI model ~3.5T200K236/100
2024-09-11 Pixtral 12B Mistral model 4.18K612B128K
2024-09-05 AdEMAMix Optimizer Apple paper
2024-09-05 DeepSeek-V2.5 DeepSeek model 7.71K
2024-09-05 MiniCPM3-4B OpenBMB model 9.42K40.57K4B
2024-09-05 Open-MAGVIT2 Tencent library 1.01K
2024-09-05 FireRedTTS Xiaohongshu model
2024-09-05 Silvio Savarese Named to TIME 100 Most Influential in AI Salesforce Salesforce news
2024-09-03 OLMoE Ai2 model 1.03K103.29K76.9B1.3B5.13T4.1K
2024-08-31 Hailuo AI (Video-01 / 2.3) MiniMax model
2024-08-29 CogVLM2 Z.ai model 6
2024-08-28 Auxiliary-Loss-Free Load Balancing Strategy DeepSeek paper 6
2024-08-26 Fire-Flyer AI-HPC: Cost-Effective Software-Hardware Co-Design DeepSeek paper
2024-08-21 Minitron NVIDIA paper 3803
2024-08-21 Sarashina2 SB Intuitions model 91370B2.1T4.1K
2024-08-12 CogVideoX: Text-to-Video Diffusion Models Z.ai model 12.78K16
2024-08-07 EXAONE 3.0 LG model 41.67K27.8B
2024-08-05 MiniCPM-V 2.6 OpenBMB model 25.58K238B
2024-08-01 EXAONEPath 1.0 LG paper 1
2024-08-01 MiniMax Music Series MiniMax model
2024-07-29 Apple Foundation Models (AFM) Apple paper 4
2024-07-29 MindSearch PJLab library 6.87K2
2024-07-24 Mistral Large 2 Mistral model 6.26K123B128K
2024-07-23 Llama 3.1 Meta model 213.13K 405B 15.6T 128K 9 39/100
2024-07-22 RazorAttention: KV Cache Compression Through Retrieval Heads Huawei paper
2024-07-20 Consent in Crisis: The Rapid Decline of the AI Data Commons Cohere paper 11
2024-07-20 Falcon 2 TII model 4.48K211B5.5T8.19K
2024-07-18 Mistral NeMo Mistral model 32.02K12B128K
2024-07-16 Codestral Mamba Mistral model 47.98K7.3B
2024-07-11 EchoMimicV2 & V3 Ant Group paper 4.25K2
2024-07-11 Skywork-Math Skywork paper
2024-07-06 Kolors 2 Kuaishou model paper 4.61K401
2024-07-05 SenseNova 5.5 SenseTime model
2024-07-05 Vimi SenseTime model
2024-07-04 LLM-jp (v1/v2) NII model 513B
2024-07-04 Step-2 StepFun model 1T
2024-07-03 LivePortrait 2 Kuaishou library paper 18.53K7.37K9
2024-07-03 InternLM2.5 PJLab model 7.22K106.78K1M
2024-07-02 InternVL 2.0 PJLab model 10.06K305.47K108B (max)
2024-07-01 QPlanner LG paper
2024-07-01 Mathstral 7B Mistral model 19.29K7B
2024-07-01 MMLongBench-Doc PJLab dataset 1471
2024-06-28 ERNIE 4.0 Turbo Baidu model
2024-06-26 Zhang Hongjiang, founder of BAAI: 'AI systems should never be able to deceive humans' Financial Times BAAI news
2024-06-24 Mooncake 2 Moonshot AI paper dataset 5.55K13
2024-06-24 Large Vocabulary Size Improves Large Language Models SB Intuitions paper 1
2024-06-20 Claude 3.5 Sonnet Anthropic model 200K1011/100
2024-06-17 AquilaMed-RL BAAI model 241
2024-06-17 DeepSeek-Coder-V2 2 DeepSeek model paper 6.83K4.01K485
2024-06-17 Nemotron-4 340B NVIDIA model 5437340B9T4.1K
2024-06-14 MASt3R Naver paper 2.99K1
2024-06-12 SciRIFF Ai2 dataset 1
2024-06-11 Dasheng 3 Xiaomi paper model 4247B
2024-06-10 LlamaGen ByteDance model 1.95K5
2024-06-10 Apple Intelligence Introduced at WWDC 2024 Apple Apple news
2024-06-06 Qwen2 2 Alibaba model paper 27.3K4972B (max)14B (max)6
2024-06-06 Kling 2 Kuaishou model paper
2024-06-05 GLM-4V Z.ai model 2.33K
2024-06-04 Seed-TTS 2 ByteDance model paper 1.56K4
2024-06-03 Skywork-MoE Skywork model 1405563146B22B
2024-06-01 agentUniverse Ant Group library 2.27K1
2024-06-01 FlagScale BAAI library 518
2024-06-01 Yuan Embedding Inspur model 3.53K
2024-05-30 MotionLLM 2 IDEA Lab model paper 3866
2024-05-29 Codestral Mistral model 13.02K22B32K
2024-05-28 Yuan 2.0-M32 Inspur model 1951.68K140B3.7B
2024-05-27 RLAIF-V OpenBMB paper 455223
2024-05-23 DeepSeek-Prover DeepSeek model 5771186
2024-05-22 Baichuan 4 Baichuan model
2024-05-21 Scaling Monosemanticity Anthropic paper
2024-05-16 Grounding DINO 1.5 2 IDEA Lab model paper 1.12K11
2024-05-15 ByteFF ByteDance model 84
2024-05-14 Piccolo2 Embedding Model 2 SenseTime model paper 145431
2024-05-14 Hunyuan-DiT 2 Tencent model paper 4.29K11.5B
2024-05-13 GPT-4o OpenAI model ~720B128K116/100
2024-05-13 Plot2Code Tencent dataset 241
2024-05-08 AlphaFold 3 Google paper
2024-05-07 DeepSeek-V2 2 DeepSeek model paper 5.01K 5.35K 102 236B 21B 4
2024-05-07 Granite Code IBM model 1.25K1034B4.5T
2024-04-26 llm-jp-corpus NII dataset 474
2024-04-25 InternVL 1.5 PJLab model 10.06K9.81K1626B
2024-04-25 ShareGPT-4o PJLab dataset 16
2024-04-24 SenseNova 5.0 SenseTime model 200K
2024-04-22 OpenELM Apple model 23B
2024-04-22 SEED-X Tencent model 5581
2024-04-18 Reka Core, Flash, and Edge Reka model 125367B128K439/100
2024-04-17 ABAB 6 / 6.5 MiniMax model
2024-04-17 Mixtral 8x22B Mistral model 4.66K 141B 39B 64K
2024-04-12 MiniCPM-V 4 OpenBMB model paper 25.58K163.09K239B
2024-04-11 MiniCPM-V 2.0 OpenBMB model 67.98K2.8B
2024-04-03 VAR (Visual Autoregressive Modeling) ByteDance model 8.7K7
2024-04-02 HyperCLOVA X Naver model 6
2024-04-01 RULER: What's the Real Context Size of Your LLM? NVIDIA eval 1113
2024-03-28 Jamba AI21 Labs model 51541398B94B256K522/100
2024-03-28 Dataverse Upstage library 564
2024-03-28 sDPO Upstage paper 2
2024-03-28 Jamba: First Production-Grade SSM-Transformer Hybrid Released AI21 Labs AI21 Labs news
2024-03-23 Step-1 StepFun model 130B
2024-03-23 Step-1V / 1.5V / 2V StepFun model
2024-03-23 Understanding Emergent Abilities from the Loss Perspective Z.ai paper 1
2024-03-19 MergeKit Arcee library 7.13K3
2024-03-17 Grok-1 SpaceX model 314B 78B 8.19K
2024-03-12 Command R / R+ Cohere model 33.71K104B128K
2024-03-11 Unraveling the Mystery of Scaling Laws: Part I Meituan paper
2024-03-08 DeepSeek-VL DeepSeek model 4.13K8.76K45
2024-03-08 CogView3 Z.ai model 144
2024-03-04 Claude 3 Anthropic model 200K1211/100
2024-03-01 Kimi 2M Moonshot AI model 2M
2024-02-28 WanJuan 2.0 (WanJuan-CC) PJLab dataset
2024-02-27 BioT5+ Microsoft paper 4
2024-02-23 MegaScale ByteDance library 24
2024-02-21 SDXL-Lightning ByteDance model 72.67K6
2024-02-21 Gemma Google model 26.93K2257B
2024-02-15 Gemini 1.5 Pro Google model 2821M106/100
2024-02-15 SAMformer 2 Huawei model paper 1901
2024-02-15 Sora OpenAI model
2024-02-07 Moirai Salesforce model 1.52K109.96K32311M
2024-02-06 SenseNova 4.0 SenseTime model
2024-02-05 BGE-M3 2 BAAI model paper 11.8K28.87M52
2024-02-05 DeepSeek-Math 2 DeepSeek model paper 3.32K69
2024-02-04 Qwen1.5 Alibaba model 110B (max)
2024-02-01 OLMo Ai2 model 6.53K2.4K97B2.46T2.05K
2024-02-01 Aya 101 Cohere model 8.81K1213B
2024-02-01 MiniCPM 3 OpenBMB model paper 9.42K3.73K202B (max)
2024-01-31 Dolma Ai2 dataset 1.51K9
2024-01-30 YOLO-World Tencent paper 6.4K28
2024-01-29 Baichuan 3 Baichuan model
2024-01-23 InternLM2 2 PJLab model paper 7.22K19.21K27
2024-01-20 TFLOP Upstage paper 51
2024-01-19 Depth Anything ByteDance model 8.26K22
2024-01-17 AlphaGeometry Google paper
2024-01-17 GLM-4 Z.ai model 7.07K
2024-01-15 SciGLM / SciInstruct Z.ai paper 4
2024-01-11 DeepSeek-MoE 2 DeepSeek model paper 18.65K 16 16B 2.8B
2024-01-09 Lightning Linear Attention Ant Group paper 2
2024-01-09 Baichuan-NPC Baichuan model
2024-01-04 LLaMA Pro Tencent model 5131.37K1
2024-01-01 VSAG Ant Group library 482
2024-01-01 FlagAI BAAI library 3.87K
2023-12-28 PanGu-pi 3 Huawei model paper 3.16K27B (max)
2023-12-28 Spike No More: Stabilizing the Pre-training of Large Language Models SB Intuitions paper 2
2023-12-23 SOLAR 10.7B Upstage model 51.95K710.7B6
2023-12-22 GraphCast Google paper 6.67K170
2023-12-21 DUSt3R Naver paper 7.19K3
2023-12-21 InternVL: Scaling up Vision Foundation Models PJLab model 10.06K176B
2023-12-20 Emu2 BAAI model 1.77K5647
2023-12-16 Paloma Ai2 dataset
2023-12-11 Mixtral 8x7B Mistral model 60.57K 120 46.7B 12.9B 32K
2023-12-06 Gemini 1.0 Google model 80932K56/100
2023-12-05 MLX Apple library 26.82K
2023-12-05 Lenna 2 Meituan model paper 871
2023-12-05 ReasonDet Meituan dataset 871
2023-11-29 DeepSeek-LLM 2 DeepSeek model paper 7.03K1.57K8767B
2023-11-29 GNoME (Materials Discovery) Google paper 1.19K
2023-11-28 Falcon (7B / 40B / 180B) TII model 31115180B3.5T2.05K
2023-11-27 MagicAnimate & Make Pixels Dance ByteDance model 10.91K7
2023-11-27 Yuan 2.0 Inspur model 688102.6B (max)
2023-11-27 UniRepLKNet Tencent model 1.07K34
2023-11-22 T-Rex 3 IDEA Lab model paper 2.68K8
2023-11-20 GPQA: Graduate-Level Google-Proof Q&A Anthropic, Ai2 eval 51021198
2023-11-14 Qwen2-Audio 2 Alibaba model paper 2.08K1.48K277B (max)
2023-11-06 CogVLM Z.ai model 6.74K77
2023-11-02 DeepSeek-Coder 2 DeepSeek model paper 23.66K6.35K10733B (max)
2023-10-30 Skywork-13B Skywork model 7291113B3.2T4.1K
2023-10-25 DiQAD Baidu dataset 1
2023-10-19 KwaiYiiMath 2 Kuaishou model paper
2023-10-17 ERNIE 4.0 Baidu model
2023-10-17 BitNet Microsoft paper 63
2023-10-13 VideoCrafter Tencent library 5.06K
2023-10-12 Aquila2 BAAI model 4453634B (max)
2023-10-09 Kimi-v1 Moonshot AI model 200K
2023-10-05 MathCoder PJLab paper 339304
2023-10-04 SEED / SEED-LLaMA Tencent model 64118
2023-10-01 DALL-E 3 OpenAI model
2023-09-29 ToRA: Tool-Integrated Reasoning Agent Microsoft paper 1.12K21
2023-09-28 PLaMo-13B PFN model 16213B4.1K
2023-09-27 Mistral 7B Mistral model 10.81K 479.84K 288 7.3B 32K
2023-09-26 InternLM-XComposer PJLab model 2.92K13.08K31
2023-09-25 Qwen-Agent Alibaba library 16.51K
2023-09-25 qwen.cpp Alibaba library 627
2023-09-21 PengCheng-Mind PCL model 64200B1.5T
2023-09-08 Ant Financial LLM Ant Group model
2023-09-08 CodeFuse Ant Group model
2023-09-08 Fin-Eval Ant Group dataset
2023-09-07 Hunyuan-LLM Tencent model
2023-09-06 Baichuan 2 3 Baichuan paper model 4.1K210.77K12513B2.6T
2023-09-01 OpenSPG & OpenAGL Ant Group library 2.12K
2023-09-01 XTuner PJLab library 5.15K
2023-08-31 ERNIE 3.5 Baidu model
2023-08-31 Belebele Meta eval 1109.8K122
2023-08-30 JAIS MBZUAI model 1.06K2330B1.63T
2023-08-29 LongBench Z.ai eval 1.19K104.75K21
2023-08-24 Qwen-VL Alibaba model 136.35K139
2023-08-24 Code Llama Meta model 39870B
2023-08-22 Lagent & AgentLego PJLab library 2.26K
2023-08-21 WanJuan 1.0 Corpus PJLab dataset 8
2023-08-20 ViT-Lens Tencent paper 1901
2023-08-18 KwaiYii Kuaishou model 175B
2023-08-15 Aquila 2 BAAI model paper 4451.06K33B (max)
2023-08-11 MiLM-6B Xiaomi model 4586B
2023-08-08 Baichuan-53B Baichuan model 53B
2023-08-04 SoftBank launches an OpenAI for Japan: SB Intuitions, building LLMs and generative AI in Japanese TechCrunch SB Intuitions news
2023-08-03 Qwen 3 Alibaba model paper 21.27K18.12K8972B (max)
2023-08-02 BGE Text Embeddings BAAI model 11.8K13.9M
2023-08-01 FlagEmbedding & C-MTEB 2 BAAI library dataset 11.8K70
2023-08-01 ABAB 5 / 5.5 MiniMax model
2023-07-31 ToolLLM: Facilitating LLMs to Master 16000+ APIs OpenBMB paper 5.66K69
2023-07-30 SEED-Bench Tencent eval 3645819.24K12
2023-07-19 EXAONE 2.0 LG model
2023-07-18 Llama 2 Meta model 112.62K 70B 2T 4.1K 3
2023-07-16 ChatDev 2 OpenBMB paper library 33.36K69
2023-07-13 InternVid PJLab dataset 2.28K32
2023-07-11 Emu BAAI model 1.77K29
2023-07-11 Baichuan-13B Baichuan model 2.93K14.72K13B
2023-07-07 SenseNova 2.0 Upgrade SenseTime model
2023-07-06 InternLM-1.0 PJLab model 7.22K1.68K104B
2023-07-05 PanGu-Weather 2 Huawei model paper 1.36K125
2023-07-01 InternEvo PJLab library 420
2023-07-01 OpenCompass PJLab library 7.08K
2023-06-25 ChatGLM2 / ChatGLM3 Z.ai model 13.68K124.79K6B
2023-06-23 MME (Multimodal Evaluation) BAAI eval 17.87K2.37K14
2023-06-20 Phi-1 ("Textbooks Are All You Need") Microsoft model 991.3B
2023-06-20 UniAD SenseTime, PJLab paper 4.64K1
2023-06-15 Baichuan-7B Baichuan model 5.65K148.39K7B1.2T
2023-06-14 WebGLM Z.ai paper 1.6K2
2023-06-12 detrex 2 IDEA Lab library paper 2.29K15
2023-06-07 AlphaDev Google paper
2023-06-01 DB-GPT Ant Group library 18.97K
2023-06-01 DLRover Ant Group library 1.66K
2023-06-01 LMDeploy PJLab library 7.9K
2023-06-01 RefinedWeb TII dataset 156
2023-05-31 Let's Verify Step by Step OpenAI paper 2.14K30
2023-05-30 GPT4Tools Tencent paper 77033
2023-05-29 Mix-of-Show Tencent paper 43130
2023-05-27 CPM-Bee OpenBMB model 2.41K310B
2023-05-22 Grouped Query Attention (GQA) Google paper 27
2023-05-20 PengCheng-Nebula PCL announcement
2023-05-17 Ziya LLM 4 IDEA Lab model 4.13K 1.83K
2023-05-04 StarCoder ServiceNow model 22.54K 192 15.5B 1T 8.19K
2023-04-20 UltraChat & UltraFeedback OpenBMB dataset 2.86K
2023-04-11 SenseChat / SenseNova Launch SenseTime model
2023-04-11 SenseMirage SenseTime model 10B (max)
2023-04-10 Stable-DINO 2 IDEA Lab library paper 2425
2023-04-06 Grounded SAM 3 IDEA Lab library paper 17.63K 90
2023-04-05 Segment Anything (SAM) Meta model 54.33K 538
2023-04-01 BMTools OpenBMB library 2.77K
2023-03-27 EVA-CLIP BAAI model 2.68K 80
2023-03-27 Qianfan Platform Baidu announcement
2023-03-20 PanGu-Sigma 2 Huawei model paper 7 1.1T 329B
2023-03-14 OpenSeeD 2 IDEA Lab library paper 7592
2023-03-14 GPT-4 OpenAI model ~666B 128K 76/100
2023-03-14 ChatGLM-6B Z.ai model 41.05K 1.28K 176 6B
2023-03-09 Grounding DINO 2 IDEA Lab model paper 10.25K 2.22M 245
2023-02-27 LLaMA Meta model 3.9K 65B
2023-01-30 BLIP-2 Salesforce model 581.31K 914
2023-01-23 Microsoft Extends Multibillion-Dollar OpenAI Partnership Microsoft Microsoft news
2023-01-01 FlagEvaluation BAAI library 13
2022-12-22 Tune-A-Video Tencent paper 4.37K 27
2022-12-15 Constitutional AI Anthropic paper 306
2022-12-06 InternVideo / InternVideo2 PJLab model 2.28K 93
2022-12-05 Painter BAAI model 2.6K 10
2022-11-30 Speculative Decoding Google paper 34
2022-11-12 AltCLIP & AltDiffusion BAAI model 3.87K 111.27K 10
2022-11-10 InternImage 2 PJLab model paper 2.83K 41
2022-11-02 Chinese CLIP Alibaba model 5.93K 53
2022-11-02 Taiyi 3 IDEA Lab model paper 4192
2022-10-06 ByteTransformer ByteDance library 4791
2022-09-30 CodeGeeX 2 Z.ai model paper 8.79K 49
2022-09-21 Whisper OpenAI model 102.39K 5.05M 1.16K 1.55B
2022-09-16 CPM-Ant OpenBMB model 500 10B
2022-09-03 TuGraph Ant Group library 1.74K
2022-08-24 GLM-130B 2 Z.ai model paper 7.65K 296 130B
2022-08-01 COYO-700M Kakao dataset 1.26K
2022-07-22 PanGu-Coder 3 Huawei model paper 3.16K 536 2.6B (max)
2022-07-04 SecretFlow Ant Group library 2.67K
2022-06-24 YOLOv6 2 Meituan library paper 5.89K 1.75K
2022-06-06 Mask DINO 2 IDEA Lab library paper 1.54K 20
2022-06-01 Vision GNN (ViG) 2 Huawei model paper 4.42K 194
2022-05-02 OPT (Open Pre-trained Transformer) Meta model 175B
2022-04-26 CogView2 Z.ai, BAAI model 955121
2022-04-12 Training a Helpful and Harmless Assistant (HH-RLHF) Anthropic paper 365
2022-04-05 PaLM Google model 2.13K 540B
2022-03-29 Chinchilla (Compute-Optimal Training) Google paper 663
2022-03-25 CodeGen Salesforce model 5.17K 1.81K 23516.1B
2022-03-20 Delta Tuning 2 OpenBMB paper library 1.04K
2022-03-07 DINO (DETR) 2 IDEA Lab library paper 2.81K 760
2022-03-04 InstructGPT (RLHF) OpenAI paper 4.29K
2022-03-02 DN-DETR 2 IDEA Lab library paper 60555
2022-02-11 BMTrain OpenBMB library 624
2022-02-08 AlphaCode Google paper
2022-02-07 OFA: One For All Alibaba model 2.56K 258
2022-01-30 FEDformer 2 Huawei model paper 802541
2022-01-28 Chain-of-Thought Prompting Google paper 4.25K
2022-01-28 DAB-DETR 2 IDEA Lab library paper 579397
2022-01-25 SPIRAL 2 Huawei model paper 6047
2022-01-24 SenseCore AI Infrastructure SenseTime announcement
2022-01-14 DeepSpeed-MoE Microsoft paper 42.49K 55
2021-12-31 ERNIE-ViLG Baidu model 30 10B
2021-12-01 EXAONE 1.0 LG model 300B
2021-12-01 tFold Tencent model 158
2021-11-30 Donut Naver paper 6.88K 5
2021-11-22 Fengshenbang 3 IDEA Lab library model 4.13K 3.14K
2021-11-01 KoGPT Kakao model 1.01K 17 6.2B 2.05K
2021-10-13 ByteTrack ByteDance library 6.44K 105
2021-10-10 Yuan 1.0 Inspur model 58925 245B
2021-09-28 DiffVC 2 Huawei model paper 60425
2021-09-10 HyperCLOVA Naver model 4 204B 560B
2021-09-03 FLAN (Instruction Tuning) Google paper 69
2021-08-10 Codex OpenAI model 3.26K 1.43K 12B
2021-07-28 Triton OpenAI library 19.41K
2021-07-15 AlphaFold 2 Google paper
2021-07-12 SPLADE Naver paper 2
2021-07-08 OpenDILab / DI-engine 2 SenseTime, PJLab library 3.62K
2021-07-08 OpenPPL (PPLNN) SenseTime library 1.37K
2021-07-07 HumanEval OpenAI eval 3.26K 1.43K 164
2021-07-05 ERNIE 3.0 & 3.0 Titan Baidu model 195 260B
2021-07-01 Meituan Sky Project Meituan announcement
2021-06-28 PFP / Matlantis PFN model
2021-06-24 CPM-2 BAAI model 16415 198B
2021-06-17 LoRA (Low-Rank Adaptation) Microsoft paper 2.45K
2021-06-01 OceanBase Ant Group library 10.15K
2021-06-01 Wu Dao 2.0 2 BAAI model paper 1.75T
2021-06-01 Wu Dao Corpora BAAI dataset
2021-05-26 CogView Z.ai, BAAI model 1.8K 383
2021-05-13 Grad-TTS 2 Huawei model paper 60443
2021-05-01 Trustworthy AI White Paper Xiaomi paper
2021-04-29 DINO Meta paper
2021-04-26 PanGu-alpha 2 Huawei, PCL model paper 3.16K 94 200B
2021-03-20 Wu Dao 1.0 BAAI model 11.3B (max)
2021-03-18 GLM (Original) 2 Z.ai model paper 3.51K 2852110B
2021-03-18 P-Tuning Z.ai paper 2.08K 265
2021-03-01 M6 Series Alibaba model 48
2021-02-27 Transformer in Transformer (TNT) 2 Huawei model paper 4.42K 1.01K
2021-02-26 CLIP OpenAI paper 33.74K 5.3K
2021-01-11 Switch Transformer Google paper 361
2021-01-05 DALL-E OpenAI model 1.13K 12B
2021-01-01 MS-MARCO-CN Baidu dataset 18
2021-01-01 PaddleNLP Baidu library 12.95K
2021-01-01 PaddleSpeech Baidu library 12.61K
2021-01-01 KoBART SK Telecom model 467127
2020-12-07 HEBO 2 Huawei library paper 2.77K 19
2020-12-01 CPM-1 BAAI model 1.58K 22 2.6B
2020-10-22 Vision Transformer (ViT) Google paper 12.58K 21.58K
2020-10-08 Deformable DETR 2 SenseTime model paper 3.98K 12.41K 1.87K
2020-09-12 FuxiCTR / BARS 3 Huawei library paper 1.43K 12
2020-08-01 Vega Huawei library 848
2020-07-15 PaddleOCR Baidu library 81.73K
2020-06-11 GPT-3 OpenAI model 3.03K 175B 300B 2.05K
2020-06-08 Liquid Time-constant Networks Liquid AI paper 3
2020-04-08 DynaBERT 2 Huawei model paper 3.16K 12119
2020-03-28 MindSpore Huawei library 4.69K
2020-03-13 ProGen Salesforce model 70234 1.2B
2020-03-10 Bolt Huawei library 958
2020-02-13 DeepSpeed Microsoft library 42.49K
2020-01-23 Scaling Laws for Neural Language Models OpenAI paper 1.51K
2020-01-01 Kunlun XPU Baidu announcement
2020-01-01 PaddleDetection Baidu library 14.24K
2020-01-01 PaddleSeg Baidu library 9.34K
2020-01-01 KoGPT2 SK Telecom model 558
2019-11-27 GhostNet 4 Huawei model paper 4.42K 406
2019-09-23 TinyBERT 2 Huawei model paper 3.16K 169.89K 137
2019-09-17 Megatron-LM NVIDIA library 16.66K 835
2019-09-01 NEZHA 2 Huawei model paper 3.16K 86
2019-08-23 Ascend 910 Series Huawei announcement
2019-08-01 Chainer PFN library 27
2019-07-29 ERNIE 2.0 Baidu model 74
2019-07-25 Optuna PFN library 14.34K
2019-07-19 SUMBT SK Telecom paper 9024
2019-06-20 Alchemy Tencent dataset 11465
2019-06-01 KoBERT SK Telecom model 1.41K 18.01K
2019-04-19 ERNIE 1.0 2 Baidu model paper 770
2019-04-03 CRAFT Naver paper 3.38K 58
2019-02-14 GPT-2 OpenAI model 24.92K 13.36M 1.5B 1.02K
2018-12-01 JAX Google library 35.79K
2018-10-11 BERT Google model 340M
2018-10-01 OpenMMLab / MMDetection 2 SenseTime, PJLab library paper 32.75K 794
2018-06-11 GPT-1 OpenAI model 117M 512
2018-06-01 MACE (Mobile AI Compute Engine) Xiaomi library 5.04K
2018-03-16 ApolloScape Baidu dataset 617
2018-03-01 Kata Containers Ant Group library 8.05K
2017-11-24 StarGAN Naver paper 18
2017-11-14 DuReader Baidu dataset 1.18K 51
2017-11-02 VQ-VAE Google paper 1.93K
2017-07-26 Xiao AI Xiaomi announcement
2017-07-20 Proximal Policy Optimization (PPO) OpenAI paper
2017-07-05 Apollo Baidu library 26.66K
2017-06-12 Attention Is All You Need (Transformer) Google paper
2017-06-01 Angel ML Tencent library 6.79K
2017-03-15 DiscoGAN SK Telecom paper 776732
2017-01-18 PyTorch Meta library 100.65K 16.19K
2016-09-30 PaddlePaddle Baidu library 23.94K
2016-04-27 OpenAI Gym OpenAI library 37.22K
2016-01-27 AlphaGo Google paper
2015-11-09 TensorFlow Google library 195.63K 8.82K
2015-03-09 Distilling the Knowledge in a Neural Network Google paper 13.96K
2015-02-26 DQN (Deep Q-Network) Google paper
2015-02-11 Batch Normalization Google paper 24.38K
2014-12-17 Deep Speech 1 & 2 Baidu paper