This is an automated archive made by the Lemmit Bot.

The original was posted on /r/singularity by /u/141_1337 on 2023-12-28 17:24:12.


The important bits:

“[This is] something much more rich into Windows that will drive higher compute demands,” said Bajarin. “For the first time in a long time, you’re going to see software that requires levels of compute that we don’t have today, which is great for everyone in silicon. A lot of it’s based around all this AI stuff.”

The explosion of generative AI tools like ChatGPT and Google Bard — and the large language models (LLMs) that underlie them — has given rise to server farms with thousands of GPUs. What could one desktop PC bring to the table? The answer is complex.

First, the AI on a client will be doing inference, not training. Training is the compute-intensive part of genAI; inference simply matches new inputs against an already-trained model and requires a much less powerful processor.
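To make that distinction concrete, here is a minimal PyTorch sketch (a toy linear model standing in for a real LLM, all names chosen purely for illustration) contrasting a training step with an inference step:

```python
import torch
import torch.nn as nn

# Toy model standing in for a much larger LLM -- purely illustrative.
model = nn.Linear(128, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 128)          # a batch of inputs
y = torch.randint(0, 10, (32,))   # labels (only needed for training)

# Training step: forward pass, loss, backward pass, weight update.
# Gradients and optimizer state inflate memory use, and the backward
# pass adds substantial extra compute on top of the forward pass.
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()

# Inference step: a single forward pass with gradients disabled.
# No labels, no gradients, no optimizer state -- far lighter work,
# which is why it can run on client hardware.
with torch.no_grad():
    predictions = model(x).argmax(dim=-1)
```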

And enterprises are extremely uncomfortable sharing or processing their company's data through public cloud services like ChatGPT. “The things that I hear consistently coming back from CIOs and CSOs are data sovereignty and privacy. They want models running locally,” said Bajarin.

AI training is very expensive to run, either in the cloud or on-premises, he adds. Inferencing is not as power hungry but still uses a lot of juice at scale.

As models get more efficient and compute gets better, you’re better off running inference locally, because it’s cheaper to run it on local hardware than in the cloud. So data sovereignty and security are driving the desire to process AI locally rather than in the cloud.
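As a rough sketch of what local inference can look like in practice, here is an example using the Hugging Face transformers library to run a small open model entirely on local hardware. The model name and prompt are assumptions picked for illustration; any locally downloadable model would do:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example only: any small, locally downloadable open model would work here.
model_name = "microsoft/phi-2"

# The tokenizer and weights are downloaded once and cached locally,
# so company data never has to leave the machine at inference time.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Summarize the key risks in our internal quarterly report:"
inputs = tokenizer(prompt, return_tensors="pt")

# Pure inference: a forward pass plus token-by-token generation,
# far cheaper than the training run that produced the weights.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```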