🔁 Hugging Face retweeted
Abhinav Adduri @abhinadduri
We updated the State Embedding 600M checkpoint on the @ArcInstitute Hugging Face
This model was trained with 4x FLOPs compared to the preprint model. It achieves significantly lower val/loss and does better on internal evals - would recommend using this over the 4 epoch one for
