The Jevons paradox says that making something more efficient increases how much of it gets used. Microsoft is buying a nuclear power plant for AI shit. If they can train a trillion-parameter model for one-thousandth the cost… they will instead train a quadrillion-parameter model.
Or I guess if they’re smart they’ll train a trillion-parameter model longer. Or iterate like crazy, when training takes hours instead of months.