Andy Martin | almart.in
chart shared by @tengyuma

Adam, a 9-year-old optimizer, is the go-to for training LLMs (e.g., GPT-3, OPT, LLaMA). Introducing Sophia, a new optimizer that is 2x faster than Adam on LLMs. Just a few more lines of code could cut your costs from $2M to $1M (if scaling laws hold). https://t.co/GrMY600lLO

@tengyuma

contributed by Andy on May 24, 2023