Overturning OpenAI's Conclusion: DeepMind Redefines the Relationship Between Parameter Count and Training Scale in Pretraining!


Training Compute-Optimal Large Language Models
https://arxiv.org/pdf/2203.15556.pdf






[1] Scaling Laws for Neural Language Models
[2] https://www.lesswrong.com/posts/midXmMb2Xg37F2Kgn/new-scaling-laws-for-large-language-models
[3] https://www.zhihu.com/question/570189639/answer/2787763735

