SII
The Shanghai Innovation Institute (SII), established in September 2024, is a high-level innovation hub bridging academic research and industrial application, founded on the premise that "AI belongs to young people." It has received ~500M yuan (~$70M) in donations and has 50+ industry partners. Its president is Song Haitao (also at SJTU). Its primary AI research arm is the GAIR Lab (Generative AI Research; on GitHub and HuggingFace), which is deeply integrated with Shanghai Jiao Tong University (SJTU) and collaborates frequently with Ant Group/InclusionAI and Sand.ai.
SII's research labs practice "radical openness": they release not just weights but full training trajectories (logs, intermediate checkpoints, data mixtures). The daVinci series spans pretraining science (daVinci-LLM-3B, trained on 8T tokens, matching OLMo-3 7B at half the parameters), software engineering (daVinci-Dev 32B/72B, 58.5% on SWE-Bench), agentic AI (daVinci-Agency, 353B, 47% gain from only 239 samples), and multimodal generation (daVinci-MagiHuman, a 15B single-stream video+audio model built with Sand.ai).
The LIMO ("Less Is More") line demonstrated that 817 curated samples can match training on 100x more data (63.3% on AIME24; COLM 2025). Data Darwinism established a systematic L0–L9 taxonomy for data curation, producing the Darwin-CC corpus (504B tokens, 1B+ HuggingFace downloads). ASI-Evolve explores whether AI can accelerate its own development, discovering 105 novel attention architectures surpassing recent benchmarks.
People
- Song Haitao — President, SII
- Pengfei Liu — Director, GAIR Lab; Associate Professor, SJTU
- Ethan Chern — Lead Researcher (daVinci, LIMO); formerly CMU (MS, Language Technologies)
News
- 2025-09-16 Shanghai launches innovation institute to bridge AI research and industry — Shanghai Municipal Government