Do I even need my own GPU?
Model picked. Next: what to run it on. VRAM is tight, and the monthly GPU rental price jumped by a third right in front of me. I opened the calculator: 24 GB or 48, rent or buy a small server for ~600k ₽. The question read as "which card do I buy."
But before paying, I tried to stand up the production path on what I already had.
And I walked straight into the pitfalls (I'll start with those, as always).
→ The convenient model format, the one everything runs fast on locally, simply won't start on the production engine: the architecture isn't supported. Dead end, shame.
→ The format that does start on the production engine works, and fits comfortably in 24 GB. So 48 and the newest cards aren't needed.
→ The rented spot box rebooted and hung on me a few times mid-experiment, and a couple of hours of work were gone for nothing. (Spot is spot; no complaints, but I felt the friction.)
While I was fighting formats, a bigger question formed. I'm picking a GPU to run the model. But the model is a commodity I don't lean on much — I can even move it to Yandex or GigaChat. 24 GB is enough, the flagship isn't needed, my own box would sit idle most of the day. So maybe the question isn't which card to buy at all?
That question turned everything around. More on it in the next part.