One RL to See Them All: Visual Triple Unified RL
Introduces the V-Triune system and the Orsta models (7B/32B), which unify visual reasoning and perception tasks via reinforcement learning. Reports gains of up to +14.1 on MEGA-Bench Core.
Paper
arXiv: 2505.18129