PaLM
Pathways Language Model. A 540B-parameter dense Transformer trained on 6,144 TPU v4 chips using the Pathways system, on 780B training tokens spanning multiple natural languages and code.
PaLM was the first model to exceed average human performance on the BIG-bench suite and demonstrated strong multi-step reasoning. It established Google as a frontier-lab competitor and served as the foundation for PaLM 2 and the early Bard. By Chowdhery et al. Proprietary.
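A back-of-the-envelope sketch of the training compute implied by the figures above, using the common C ≈ 6·N·D approximation (a scaling heuristic, not a number reported in the PaLM paper):

```python
# Rough training-compute estimate via the standard C ≈ 6*N*D heuristic.
# N and D are the parameter and token counts stated above; the result
# is an order-of-magnitude sketch, not an official figure.
N = 540e9   # parameters
D = 780e9   # training tokens
flops = 6 * N * D
print(f"~{flops:.2e} training FLOPs")  # on the order of 2.5e24
```

This places PaLM's training run in the ~10^24 FLOP regime, consistent with its use of 6,144 TPU v4 chips.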
Model Details
Architecture DENSE
Parameters 540B
Paper
arXiv: 2204.02311