A 397-billion-parameter model runs on a laptop. Not because it uses all of its parameters at once, but because it learned which four experts matter for each input.
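To make "which four experts matter" concrete, here is a minimal sketch of top-k expert routing in a mixture-of-experts layer. It is an illustration under stated assumptions, not the model's actual code: the post does not say how the router works, so the 16 toy experts, the random router scores, and the softmax over only the selected scores are all hypothetical choices.

```python
import math
import random

def top_k_route(logits, k=4):
    """Return (expert indices, normalized weights) for the k highest-scoring experts."""
    # Pick the k experts with the largest router scores.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over just the selected scores so the weights sum to 1
    # (one common convention; assumed here, not taken from the post).
    m = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - m) for i in top]
    total = sum(exps)
    return top, [e / total for e in exps]

def moe_forward(x, experts, router_logits, k=4):
    """Combine outputs of only the k selected experts; the others are never called."""
    idx, weights = top_k_route(router_logits, k)
    return sum(w * experts[i](x) for i, w in zip(idx, weights))

# Hypothetical toy setup: 16 "experts", each just scales its input.
random.seed(0)
experts = [lambda x, s=s: s * x for s in range(16)]
router_logits = [random.gauss(0.0, 1.0) for _ in range(16)]

idx, weights = top_k_route(router_logits, k=4)
y = moe_forward(2.0, experts, router_logits, k=4)
print(idx, weights, y)
```

The point of the sketch: only four of the sixteen expert functions run for a given input, so compute per input scales with the experts chosen, not with the total parameter count.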