1200 USD gone in unmonitored cloud spending 😭😭😭

Felipe Lujan
3 min read · Aug 6, 2023

Subtle details in terminology can keep alerts from triggering while spending skyrockets. Learn from my mistakes and save.

As a GDE in the GCP category, I consider myself a well-seasoned user in topics such as cloud infrastructure, data and compute migrations, and management of IaaS/PaaS offerings from GCP.
GCP has evolved a lot since I became a GDE, with Vertex AI being the technology that I’m trying to catch up on, especially regarding pricing.

The most significant mishap of my learning journey? *1200+ USD of unmonitored spending piled up in Vertex AI usage during July of 2023 alone.*

The root cause.

I was under the impression that models from Vertex’s Model Garden worked in a “per request” or “per use” model; needless to say, they don’t.

The new offerings (PaLM 2 chat and text generation, code completion, and code chat) charge you a small amount per 1000 characters, billing you “per request” or “per inference.” However, deploying models from the Model Garden works differently.

Deploying those models requires dedicated CPU, RAM, and GPU; therefore, you’re billed for those resources independently of how many requests you send. I learned this after the fact.
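
To make the difference concrete, here is a minimal sketch comparing the two billing shapes. The rates are made-up placeholders, not actual GCP prices, and the function names are hypothetical; only the structure (per 1000 characters versus per hour of a deployed node) reflects the distinction described above.

```python
# Placeholder rates for illustration only; NOT real GCP prices.
PER_1000_CHARS_USD = 0.0005  # hypothetical usage-based rate
NODE_HOURLY_USD = 1.00       # hypothetical hourly rate for a dedicated CPU/RAM/GPU node


def per_request_cost(total_chars: int) -> float:
    """Usage-based billing: cost scales with the characters processed."""
    return total_chars / 1000 * PER_1000_CHARS_USD


def deployed_endpoint_cost(hours_deployed: float, nodes: int = 1) -> float:
    """Resource-based billing: cost scales with time deployed, even when idle."""
    return hours_deployed * nodes * NODE_HOURLY_USD


# A month of sporadic use: 200 requests of roughly 2000 characters each.
usage_based = per_request_cost(200 * 2000)

# The same month with one node left deployed around the clock (31 days).
resource_based = deployed_endpoint_cost(hours_deployed=31 * 24)

print(f"usage-based:    {usage_based:.2f} USD")
print(f"resource-based: {resource_based:.2f} USD")
```

Whatever the real rates are, the second figure does not depend on how many requests were sent, which is why sporadic use can still produce a large bill.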

A post-mortem analysis shows spending accumulating throughout the month.

The devil is in the details.

I always set up fairly conservative billing alerts. So how did 1200 USD disappear without me noticing?

It turns out that usage covered by discounts or credits does not qualify as spending in GCP’s billing system.

One could argue that resource consumption causes spending regardless of whether credits are being consumed or a credit card is being charged, but GCP doesn’t work like that at the moment. Link to documentation.

Due to that subtle difference in terminology, the alert I set up at 200 USD never triggered, and neither did those at 450 USD and 500 USD.
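
A simplified model of why the thresholds stayed silent, assuming the budget compares thresholds against cost after credits are subtracted. The function and parameter names are illustrative and not part of any GCP API, and the assumption that credits covered the full amount is mine.

```python
def triggered_alerts(gross_cost, credits_applied, thresholds, include_credits=True):
    """Return the thresholds crossed by the cost the budget tracks.

    include_credits=True models a budget that subtracts credits before
    comparing against thresholds; False models one that tracks gross usage cost.
    """
    tracked = gross_cost - credits_applied if include_credits else gross_cost
    return [t for t in thresholds if tracked >= t]


thresholds = [200, 450, 500]  # the alert amounts from this post, in USD
gross = 1200.0                # approximate July usage
credits = 1200.0              # assumption: credits covered the whole amount

# Budget on net cost: tracked cost is 0, so nothing fires.
net_alerts = triggered_alerts(gross, credits, thresholds, include_credits=True)

# Budget on gross cost: all three thresholds are crossed.
gross_alerts = triggered_alerts(gross, credits, thresholds, include_credits=False)

print(net_alerts)
print(gross_alerts)
```

If your budget configuration lets you choose which credits count toward the budgeted amount, tracking cost before credits would behave like the second call.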

No signs of intrusions or leaked credentials.

At some point, I was afraid I might have leaked credentials in a Jupyter Notebook or that a malicious Notebook author had accessed my GCP project. The evidence that seems to rule out that hypothesis is the Vertex AI API logs, which show nothing but sporadic use.

Conclusion.

Regular GCP customers are unlikely to find themselves in a similar situation; nevertheless, capping API usage is the safest option for preventing unexpected spending.
Also, remember that you can use BigQuery to explore and analyze billing data.
Technicalities aside, keeping an eye on your cloud infrastructure’s billing panel and maintaining an inventory will always be in fashion.
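
As a starting point for the BigQuery suggestion, here is a hedged sketch that builds a query summing gross cost and credits per service. It assumes Cloud Billing export to BigQuery is enabled and that the table follows the standard usage cost export schema (`invoice.month`, `service.description`, `cost`, and a repeated `credits` record with `amount`). The table ID and helper names are hypothetical.

```python
def build_monthly_cost_query(table: str) -> str:
    """Build a query that sums gross cost and credits per invoice month and service."""
    return f"""
    SELECT
      invoice.month AS invoice_month,
      service.description AS service,
      ROUND(SUM(cost), 2) AS gross_cost,
      ROUND(SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) AS c), 0)), 2) AS credits
    FROM `{table}`
    GROUP BY invoice_month, service
    ORDER BY gross_cost DESC
    """


def run_query(sql: str):
    """Run the query; needs the google-cloud-bigquery package and valid credentials."""
    from google.cloud import bigquery  # imported lazily so the builder works without it

    client = bigquery.Client()
    return list(client.query(sql).result())


# Hypothetical table ID; replace it with your own billing export table.
sql = build_monthly_cost_query("my-project.billing.gcp_billing_export_v1_XXXXXX")
print(sql)
```

Looking at `gross_cost` next to `credits` makes usage visible even when credits bring the net charge to zero.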
