You can look at the stats on how much of the model fits in VRAM. The lower the percentage, the slower it goes, although I imagine that's not the only constraint. Some models are probably faster than others regardless, but I really haven't done a lot of experimenting. It's too slow on my card to even compare output quality across models. Once I have 2k tokens in context, even a 7B model takes a second or more per token. I have about the slowest card that ollama even says you can use. I think there is one worse card.
ETA: I’m pulling the 14B Abliterated model now for testing. I haven’t had good luck running a model this big before, but I’ll let you know how it goes.
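For anyone wondering where to see those stats: ollama reports them itself. A minimal sketch (the model name is just an example, substitute whatever you've pulled):

```shell
# Show loaded models; the PROCESSOR column reports the CPU/GPU split,
# e.g. "100% GPU" when the model fully fits in VRAM,
# or something like "48%/52% CPU/GPU" when it spills over.
ollama ps

# Run with --verbose to print timing stats after each response,
# including "eval rate" in tokens per second.
ollama run mistral --verbose
```

The "eval rate" line is the easiest number to compare across models on the same card.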