NEZHA
model, paper
NEural contextualiZed representation for CHinese lAnguage understanding. A Chinese pretrained language model based on BERT, with improvements including functional relative positional encoding, whole word masking, and mixed precision training. Reported state-of-the-art results on several Chinese NLU tasks at the time of publication.
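To make "functional relative positional encoding" concrete, here is a minimal sketch. It assumes the encoding is a fixed (not learned) sinusoidal function of the offset j - i between two token positions, in the style of the original Transformer position encoding. The function names, the `depth` parameter, and the base of 10000 are illustrative assumptions, not taken from this entry, and any offset clipping a real implementation might apply is omitted.

```python
import math


def relative_position_encoding(i, j, depth, base=10000.0):
    """Fixed sinusoidal vector for the offset between positions i and j.

    Hypothetical helper: depth is the per-head hidden size (assumed even).
    """
    if depth % 2 != 0:
        raise ValueError("depth must be even")
    offset = j - i
    vec = []
    for k in range(depth // 2):
        # Each sin/cos pair uses a different wavelength.
        angle = offset / (base ** (2 * k / depth))
        vec.append(math.sin(angle))  # dimension 2k
        vec.append(math.cos(angle))  # dimension 2k + 1
    return vec


def relative_position_table(seq_len, depth):
    """Build the seq_len x seq_len x depth table of encodings."""
    return [
        [relative_position_encoding(i, j, depth) for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    table = relative_position_table(seq_len=4, depth=4)
    # Pairs with the same offset share the same vector.
    print(table[0][1] == table[2][3])
```

Because the vectors depend only on the offset and are computed rather than trained, the same function can in principle be evaluated for sequence lengths not seen during pretraining.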