
Why I Chose GCP Over AWS for Our OTA Pipeline

Real trade-offs, real costs, and why the "just use AWS" default isn't always right.

22 January 2026 · 8 min read

The Problem We Were Solving

At Flipkart, I was involved in building the OTA (over-the-air) update pipeline for our internal devices. The system needed to handle firmware packaging, delta generation, staged rollouts, and telemetry — all at a scale that demanded serious infrastructure decisions.

The default assumption was AWS. Most of Flipkart's infrastructure runs on it, and there's organizational muscle memory around their services. But when I actually mapped our requirements to offerings from both providers, GCP came out ahead in several areas that mattered for this specific use case.

Where GCP Won: Data Pipeline and Storage

Our OTA pipeline generates a lot of telemetry — device health, update success rates, rollback events, version distribution across the fleet. We needed a way to ingest, process, and query this data without building a custom ETL pipeline.

GCP's BigQuery was the decisive factor. We could run SQL queries over terabytes of telemetry data without provisioning clusters or planning capacity, which was a clear step up from operating a provisioned Redshift cluster. Our analytics team could self-serve queries from day one.
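To make "self-serve queries" concrete, here is a minimal sketch of the kind of query an analyst might run: update success rate per firmware version. The table name `update_events`, its columns, and the sample rows are all hypothetical, and SQLite from the Python standard library stands in for BigQuery so the example runs anywhere. The SQL itself is plain enough that it should carry over to BigQuery with a dataset-qualified table name.

```python
import sqlite3

# Hypothetical telemetry rows: (device_id, firmware_version, status).
EVENTS = [
    ("dev-1", "1.4.0", "success"),
    ("dev-2", "1.4.0", "success"),
    ("dev-3", "1.4.0", "rollback"),
    ("dev-4", "1.5.0", "success"),
    ("dev-5", "1.5.0", "failed"),
]

# Success rate per firmware version; table and column names are assumptions.
QUERY = """
SELECT firmware_version,
       COUNT(*) AS attempts,
       AVG(CASE WHEN status = 'success' THEN 1.0 ELSE 0.0 END) AS success_rate
FROM update_events
GROUP BY firmware_version
ORDER BY firmware_version
"""


def success_rates(events):
    """Load events into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE update_events "
            "(device_id TEXT, firmware_version TEXT, status TEXT)"
        )
        conn.executemany("INSERT INTO update_events VALUES (?, ?, ?)", events)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for version, attempts, rate in success_rates(EVENTS):
        print(f"{version}: {attempts} attempts, {rate:.0%} success")
```

The point is less the query than who can write it: anyone comfortable with SQL, with no pipeline code in between.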

Cloud Storage's straightforward lifecycle policies also made it simpler to configure than S3 for our firmware artifact storage. Both services offer strong read-after-write consistency (S3 has since December 2020), so consistency was not the differentiator. The pricing was comparable, but the developer experience was noticeably smoother.
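For a sense of what a lifecycle policy looks like, here is a rough sketch that builds a Cloud Storage lifecycle configuration as JSON. The thresholds (move to Nearline after 30 days, delete after 365) are illustrative assumptions, not the values we used, and the helper function name is made up for this example.

```python
import json


def firmware_lifecycle_config(nearline_after_days=30, delete_after_days=365):
    """Build a Cloud Storage lifecycle configuration as a plain dict.

    The default thresholds are illustrative assumptions only.
    """
    if nearline_after_days >= delete_after_days:
        raise ValueError("objects must move to Nearline before they are deleted")
    return {
        "rule": [
            {
                # Move older firmware images to a cheaper storage class.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": nearline_after_days},
            },
            {
                # Remove artifacts once they are past the retention window.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


if __name__ == "__main__":
    print(json.dumps(firmware_lifecycle_config(), indent=2))
```

Saved to a file, the output can be applied to a bucket with `gsutil lifecycle set lifecycle.json gs://BUCKET_NAME`, where the bucket name is a placeholder.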

Where AWS Still Wins

I want to be fair: AWS has a broader ecosystem and more mature offerings in several areas. AWS IoT Core is more feature-rich than GCP's IoT offering, and Google retired its own Cloud IoT Core service in August 2023. If we were building a traditional IoT fleet management system, AWS would have been the better choice.

AWS Lambda is also more mature than Cloud Functions for complex event-driven workflows, and Step Functions is a better orchestration tool than anything GCP offers natively. For our OTA pipeline, we used Cloud Run instead, which fit our containerized workload model better than Lambda would have.

The IAM model in AWS is more granular, which is both a strength and a weakness. It gives you more control but also more rope to hang yourself with. GCP's IAM felt more intuitive for our team size.

The Cost Surprise

Everyone talks about cloud costs, but the real cost isn't compute or storage — it's developer time. Our team of four engineers had the GCP pipeline running in production in six weeks. Based on conversations with teams that built similar systems on AWS, the AWS path would have taken closer to ten weeks due to the additional configuration surface area.

On raw infrastructure costs, GCP was about 12% cheaper for our workload profile. Sustained use discounts applied automatically to eligible compute usage, whereas on AWS comparable discounts require committing to Reserved Instances or Savings Plans upfront. For a new project with uncertain scaling patterns, this flexibility was valuable.
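The two numbers are easier to compare side by side. A back-of-the-envelope sketch: the team size, the six and ten week timelines, and the 12% figure come from above, while the baseline bill of 10,000 is a made-up unit chosen only to show the proportion.

```python
def engineer_weeks(team_size, calendar_weeks):
    """Total effort as team size multiplied by calendar time."""
    return team_size * calendar_weeks


def discounted_bill(baseline, percent_cheaper):
    """Bill after an integer percentage reduction from the baseline."""
    return baseline * (100 - percent_cheaper) / 100


gcp_effort = engineer_weeks(4, 6)    # 24 engineer-weeks
aws_effort = engineer_weeks(4, 10)   # 40 engineer-weeks (estimated)
effort_saved = aws_effort - gcp_effort  # 16 engineer-weeks

# Hypothetical baseline: an AWS bill of 10,000 in arbitrary currency units.
gcp_bill = discounted_bill(10_000, 12)

if __name__ == "__main__":
    print(f"Effort saved: {effort_saved} engineer-weeks")
    print(f"GCP bill for a 10,000 AWS baseline: {gcp_bill:.0f}")
```

Sixteen engineer-weeks is four people for a month, which for most teams costs far more than a 12% difference on the infrastructure bill.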

What I Would Do Differently

In hindsight, I would have invested more time in Terraform from the start. We initially used the GCP console and gcloud CLI, which worked fine for the first few weeks but became a maintenance headache as the infrastructure grew. Infrastructure as code is not optional for any serious cloud project.

I also underestimated the importance of multi-cloud readiness. While GCP was the right choice for this project, tightly coupling to any single provider's proprietary services makes future migration painful. If I were starting today, I'd abstract the storage and compute layers behind interfaces that could be swapped.
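A minimal sketch of what that abstraction could look like for storage. Every name here (`ArtifactStore`, `InMemoryArtifactStore`, `publish_firmware`, the key layout) is hypothetical. The in-memory class stands in for a real backend, and a Cloud Storage or S3 implementation would expose the same two methods.

```python
from typing import Protocol


class ArtifactStore(Protocol):
    """The only storage operations the pipeline code is allowed to rely on."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryArtifactStore:
    """Stand-in backend for tests; a cloud-backed class would mirror this shape."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        try:
            return self._blobs[key]
        except KeyError:
            raise FileNotFoundError(key) from None


def publish_firmware(store: ArtifactStore, version: str, image: bytes) -> str:
    """Upload a firmware image through the interface and return its key."""
    key = f"firmware/{version}/image.bin"  # hypothetical key layout
    store.put(key, image)
    return key


if __name__ == "__main__":
    store = InMemoryArtifactStore()
    key = publish_firmware(store, "1.5.0", b"\x00\x01")
    print(key, len(store.get(key)))
```

Because `publish_firmware` only sees the protocol, swapping providers means writing one new class rather than touching every call site. The trade-off is that provider-specific features stay out of reach unless the interface grows to include them.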

The takeaway isn't that GCP is better than AWS. It's that the right choice depends on your specific workload, team expertise, and timeline. The worst thing you can do is default to a provider without evaluating the alternatives for your use case.
