Mixture of Block Attention mechanism for efficient long-context processing.

Citations: 1
scaling, attention, architecture
