If you’d asked me a couple of years ago which machine I’d want for running large language models locally, I’d have pointed straight at an Nvidia-based dual-GPU beast with plenty of RAM, storage, and processing power. That was the safe answer, of course. CUDA has been around for a long time with a thriving ecosystem, and much of AI has been built off the back of it. Paired with a ton of VRAM (such as the RTX 3090, which has 24GB of VRAM), it would have been the best option most consumers could even fathom.