FlexOlmo
A new paradigm for LLM training via data collaboration. Data owners independently train MoE expert modules on their private data, then contribute those experts to a shared model, without ever sharing the raw data. Contributed data can be activated or deactivated at any time, enabling opt-out based on licensing or permissions.
FlexOlmo-7x7B-1T: a 33B-parameter MoE combining independently trained experts on public-mix, news, math, code, academic, creative-writing, and Reddit data. Combining experts yields a 41% average relative improvement. Apache 2.0 licensed.
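The opt-out mechanism above can be pictured as masking an expert out of the router at inference time, so a withdrawn expert is never selected. The sketch below is a minimal illustration under assumed details (random toy experts, a linear router, top-k softmax gating); it is not the FlexOlmo implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, top_k = 8, 4, 2

# Toy stand-ins: each "expert" is a small linear map, plus a linear router.
# These are illustrative assumptions, not FlexOlmo's actual modules.
experts = [rng.standard_normal((d, d)) * 0.1 for _ in range(n_experts)]
router_w = rng.standard_normal((d, n_experts)) * 0.1

def moe_forward(x, active_mask):
    """Route x through the top-k ACTIVE experts.

    Deactivated experts (opt-out) get -inf router logits and are never chosen.
    """
    logits = x @ router_w
    logits = np.where(active_mask, logits, -np.inf)   # mask withdrawn experts
    k = min(top_k, int(active_mask.sum()))
    top = np.argsort(logits)[-k:]                     # indices of top-k active experts
    weights = np.exp(logits[top] - logits[top].max()) # softmax over selected logits
    weights /= weights.sum()
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(d)
all_on = np.ones(n_experts, dtype=bool)
opt_out = all_on.copy()
opt_out[1] = False                  # data owner 1 withdraws their expert
y_full = moe_forward(x, all_on)     # all experts available
y_masked = moe_forward(x, opt_out)  # expert 1 can no longer be routed to
```

The key design point this illustrates: because each expert is a self-contained module selected by the router, removing one requires only a mask change, not retraining the shared model.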
Model Details
Architecture: MoE
Parameters: 33B (total)
Paper: arXiv:2507.07024