6 CFO-approved strategies to optimise GPU costs

For AI start-ups, the decision on which GPU provider to partner with is critical. This choice can either accelerate your company’s growth or put a significant strain on your finances.

With generative AI companies dedicating up to 70% of their budgets to computing power, even slight cost reductions in GPU expenses can have a lasting impact on your start-up’s financial health. Here are six CFO-approved strategies to help you choose the most cost-effective GPU provider for your needs.

Choose the right provider

GPU options range from so-called "bare metal" clouds all the way up to the big hyperscale cloud providers like Oracle, AWS, Microsoft Azure, and Google Cloud.

"Bare metal" GPU providers might look like the most appealing option from a pricing standpoint, but it's worth investigating further. They offer direct access to dedicated servers without a virtualisation layer – which means they take care of the hardware, but you deploy the software stack on your own. While their compute pricing is the lowest on the market, they demand significant additional technical expertise from your team. The extra expenditure on headcount, or on hiring in professional DevOps services, can eventually add up and wipe out your savings.

At the other end of the spectrum, hyperscalers offer a high level of service and flexibility, but they also come at the highest price on the market. If you have very deep pockets or operate at sufficient scale, they're a great option; for start-ups, however, these costs may not be justifiable. It is still worth checking their special programmes for start-ups, especially at early stages, to see if they are the right fit.

One compromise – sitting right in the middle of these two options – is dedicated GPU clouds, like CoreWeave, Lambda Labs or Nebius. For start-ups, these providers offer the best balance between cost, services, and performance.

Consider more than costs

Many cloud providers list GPU prices on their websites, but don’t include costs for vCPU, RAM and storage – as well as the additional data transfer costs. Advertising pricing like this helps to make prices more attractive, but all these extras will add up. When it comes to comparing providers, make sure you compare like for like and calculate the total cost of the infrastructure you need with any provider you are considering – rather than just the standalone GPU costs.
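To make the like-for-like comparison concrete, here is a minimal sketch of an all-in monthly cost calculation. All rates and resource quantities below are hypothetical, purely for illustration – a provider with a cheaper headline GPU rate can still end up more expensive once the extras are included.

```python
# Illustrative all-in cost comparison. All prices are invented for
# illustration; substitute the quotes you actually receive.

def total_monthly_cost(gpu_hourly, vcpu_hourly, ram_gb_hourly,
                       storage_gb_monthly, egress_gb_price,
                       gpus, vcpus, ram_gb, storage_gb, egress_gb,
                       hours=730):
    """Total monthly bill: GPU + vCPU + RAM + storage + data transfer."""
    compute = hours * (gpus * gpu_hourly
                       + vcpus * vcpu_hourly
                       + ram_gb * ram_gb_hourly)
    return compute + storage_gb * storage_gb_monthly + egress_gb * egress_gb_price

# Provider A advertises a cheaper GPU rate but bills every extra separately.
provider_a = total_monthly_cost(
    gpu_hourly=1.80, vcpu_hourly=0.02, ram_gb_hourly=0.004,
    storage_gb_monthly=0.10, egress_gb_price=0.09,
    gpus=8, vcpus=96, ram_gb=880, storage_gb=4000, egress_gb=2000)

# Provider B lists a higher GPU rate but bundles vCPU, RAM and egress.
provider_b = total_monthly_cost(
    gpu_hourly=2.10, vcpu_hourly=0.0, ram_gb_hourly=0.0,
    storage_gb_monthly=0.03, egress_gb_price=0.0,
    gpus=8, vcpus=96, ram_gb=880, storage_gb=4000, egress_gb=2000)

print(f"Provider A: ${provider_a:,.0f}/month")
print(f"Provider B: ${provider_b:,.0f}/month")
```

With these example numbers, the provider with the higher advertised GPU rate comes out cheaper overall – which is exactly why the standalone GPU price is a poor basis for comparison.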

Watch out for hidden expenses

The lowest price of any product is always subject to some special conditions, and GPU hours are no exception. The cheapest listed GPU price often comes with a long commitment that must be paid in advance. Not only does this require a large amount of capital to foot an up-front payment, but it also limits your flexibility to switch providers or redeploy your capital further down the line.

Try before you buy

If you are training AI models at scale, it's important to conduct a proof-of-concept test before making any long-term commitments. Every cloud provider is different, and the best way to ensure one can meet your technical requirements is to run smaller test trainings to assess the hardware's speed and performance. A more expensive platform that performs better may work out cheaper in the long run: although you pay more per GPU hour, you require fewer hours to train your models.
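The per-hour-price versus hours-needed trade-off is simple arithmetic. A hypothetical sketch, with invented numbers, of how a proof-of-concept result feeds into the comparison:

```python
# Sketch: a pricier GPU hour can still be cheaper overall if the
# hardware finishes the same training run in fewer hours.
# Both prices and hour counts are hypothetical.

runs = {
    # provider: (price per GPU-hour in $, hours measured in the PoC run)
    "budget_cloud":  (1.50, 1000),  # slower hardware, longer run
    "premium_cloud": (2.40, 550),   # faster hardware, shorter run
}

costs = {name: price * hours for name, (price, hours) in runs.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost:,.0f} for the full training run")
```

Here the premium provider's run costs $1,320 against the budget provider's $1,500, despite the 60% higher hourly rate – the kind of result only a real proof-of-concept test can surface.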

While not every cloud provider publicly offers a free trial, for a large enough contract, many will be open to some kind of proof-of-concept period – after which you can renegotiate your contract length.

Insist on premium support

The level of support available will always depend on the amount of compute you are purchasing – and this is a rule for all providers. If you are only training a model on a single GPU, you can’t have the same support expectations as a customer who has 512 GPUs being used in production.

Therefore, when you’re choosing between cloud providers, consider the level of support each offers. If you are planning to train large models or use a cloud provider for resource-intensive inference, you want a dedicated support engineer and an SLA that guarantees 24/7 support. Anything less could create problems in the future.

Maximise efficiency in your GPU spending

Choosing the right GPU provider and plan comes down to your start-up’s specific needs. Prepaying for a reserved amount of compute power often leads to better pricing, but it demands significant upfront capital and might not align with your operational requirements.

On the other hand, a pay-as-you-go (PAYG) model offers higher flexibility, allowing you to scale GPU usage based on demand, though it typically comes with a higher per-hour cost. Not every workload necessitates continuous 24/7 GPU availability, which is what many of the lowest-cost plans are designed for. Some providers offer hybrid plans, letting you reserve a base level of GPU resources while paying extra for any additional usage beyond that.

The key is to thoroughly analyse and model the total costs under different scenarios to identify which provider offers the optimal balance of cost, flexibility, and performance for your start-up. Sometimes, the most valuable time spent by an ML team isn’t on training AI models, but on fine-tuning compute expenses.
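One way to carry out that scenario modelling is to cost each plan type at several utilisation levels. The sketch below uses entirely hypothetical rates and a made-up 400-hour reserved base block, just to show the shape of the comparison:

```python
# Hypothetical scenario model: reserved vs pay-as-you-go vs hybrid
# pricing at different monthly utilisation levels. All rates invented.

RESERVED_RATE = 1.60   # $/GPU-hour, billed 24/7 whether used or not
PAYG_RATE = 2.50       # $/GPU-hour, billed only for hours actually used
HOURS_IN_MONTH = 730

def reserved(used_hours):
    return RESERVED_RATE * HOURS_IN_MONTH       # pay for every hour

def payg(used_hours):
    return PAYG_RATE * used_hours

def hybrid(used_hours, base_hours=400):
    # Reserve a base block at the reserved rate; burst at the PAYG rate.
    burst = max(0, used_hours - base_hours)
    return RESERVED_RATE * base_hours + PAYG_RATE * burst

for used in (200, 450, 730):   # light, medium, round-the-clock usage
    print(f"{used:>3}h used -> reserved ${reserved(used):,.0f}, "
          f"PAYG ${payg(used):,.0f}, hybrid ${hybrid(used):,.0f}")
```

With these example rates, PAYG wins at light usage, the hybrid plan wins at medium usage, and the reserved plan only pays off near round-the-clock utilisation – which is why the analysis has to be run against your actual expected workload.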
