Against cloud / distributed / serverless bull
Collection of articles arguing against resume-driven migrations to the cloud / usage of big data technologies / similar issues
[last updated: 2023-12-30]
So, these days everything is planetary-scale, distributed, big data, cloud-ready, fault-tolerant, elastically scalable and whatever buzzwords cloud providers' marketing departments could come up with. George Carlin would be proud.
These marketing departments have been extremely successful in convincing people that
- everyone needs these properties
- these are only achievable by paying cloud providers
- in fact, even simple hosting requires paying them obscene amounts of money
- paying cloud providers obscene amounts of money is a good deal
- no form of hosting other than fully managed cloud is sustainable (not own DCs, not colocation, not renting bare metal servers or VPSs)
This led to the situation where
- using overcomplicated and overpriced solutions is being forced onto developers by non-technical management and incompetent but certified (by the same cloud providers) "cloud-architects" and consultants
- working on a large-scale solution hosted in a private multi-DC environment is viewed less favorably as a resume entry than building a simple form on Azure, thus stimulating resume-driven development
This has been bothering me for a while. After engaging in a lot of arguments online, I compiled a collection of links that help illustrate that
- modern computers are extremely fast and have a lot of memory, so most applications will never need more than a single machine, or a rack at most.
- simple tools get you extremely far these days. Expanding on the previous point, computers are now fast enough to handle workloads that actually required complex distributed systems 15-20 years ago.
- bloated cloud-first designs lead to extreme infrastructure costs that become a significant business expense and even affect company valuation
- running massive workloads in the cloud is extremely expensive, with the costs ballooning 5-10x+
- companies don't really have that many requests coming in, or that much data, so there would be nothing to scale even if modern computers were slower
Single server / classic software / on prem capabilities
- Use one big server by Nima Badizadegan(2022)[reddit thread(2022), hacker news thread(2022)]. An article that shows the capabilities of modern hardware:
> One server today is capable of:
> Serving video files at 400 Gbps (now 800 Gbps)
> 1 million IOPS on a NoSQL database
> 70k IOPS in PostgreSQL
> 500k requests per second to nginx
> Compiling the linux kernel in 20 seconds
> Rendering 4k video with x264 at 75 FPS
- Computers are fast(2015)[reddit thread(2020), hacker news thread 1(2015), hacker news thread 2(2017), hacker news thread 3(2020), hacker news thread 4(2023)]. A little fun game showing how many operations a modern computer can do in a second.
- StackExchange runs on 11 colocated servers and they can function with just one(2016)[reddit thread 1(2016), reddit thread 2(2016), reddit thread 3(2016)].
- StackExchange runs on 23 servers(2023) while serving 1.3B pages / month.
- How much can you really get out of a 4$ VPS?(2023)[hacker news thread(2023), reddit thread(2023)]
- PoC of running Twitter on a single server by Tristan Hume(2022)[reddit thread(2023), hacker news thread]. A PoC of replicating most of Twitter's functionality with a single server.
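To put the StackExchange figure above in perspective, here is a rough back-of-the-envelope sketch that converts 1.3B pages / month into an average rate per second. The 30-day month and the helper name `average_per_second` are assumptions made for illustration. Real traffic is bursty, and one page view usually triggers several requests, so treat the result as an order-of-magnitude estimate only.

```python
# Assumption: a 30-day month; real months differ slightly.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def average_per_second(events_per_month: float) -> float:
    """Average event rate per second for a given monthly total."""
    return events_per_month / SECONDS_PER_MONTH


# 1.3B pages / month, as quoted for StackExchange in the list above.
pages_per_second = average_per_second(1.3e9)
print(f"{pages_per_second:.0f} pages/s on average")  # roughly 500
```

Even allowing a generous multiplier for peaks and for requests per page, this is a long way below the 500k requests per second quoted above for a single nginx server.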
"Big" data
- Your Data Fits in RAM(2013-2022+)[hacker news thread(2015), hacker news thread 2(2020)]. A website that shows that modern servers can hold up to 60TB of data in RAM, and you don't need Hadoop / Spark / etc. to process data at this scale. Inspired by the original tweet by Gary Bernhardt(2015).
- Don't use Hadoop - your data isn't that big(2013)[habr thread(2013), reddit thread 1(2013), reddit thread 2(2014), reddit thread 3(2016), reddit thread 4(2017), hacker news thread 1(2013), hacker news thread 2(2015), hacker news thread 3(2017)]. An article by Chris Stucchio that shows that most of the time you're much better off with SQL, simple Python scripts, or even Excel.
- Scalability! But at what COST?(2015)(paper)[hacker news thread 0(2015), hacker news thread 1(2016), hacker news thread 2(2021), hacker news thread 3(2022)]. An article by Frank McSherry that shows that sometimes you can beat a whole Hadoop cluster with single-threaded code running on your laptop.
- Bigger data; same laptop(2015). An extension of the previous link.
- Command-line Tools can be 235x Faster than your Hadoop Cluster by Adam Drake(2014)[hacker news thread 1(2015), hacker news thread 2(2018), hacker news thread 3(2020), hacker news thread 4(2022), reddit thread 1(2018)]. Title says it all.
- Big Data Is Dead by Jordan Tigani(tech lead of Google BigQuery and founder of MotherDuck)(2023)[reddit thread(2023), hacker news thread(2023), habr thread(2023)]. Ex-cloud evangelist finds out that most of their clients barely query megabytes of data with a solution designed to process petabytes, and has an epiphany.
- The simple joys of scaling up by Jordan Tigani(2023)[hacker news thread(2023), habr thread(2023)].
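As a hedged illustration of the "simple Python scripts" point made in the links above, here is a minimal sketch of a streaming group-by count that uses only the standard library. The column names and sample rows are made up for the example, and `count_by_column` is a hypothetical helper, not code from any of the linked articles.

```python
import csv
import io
from collections import Counter


def count_by_column(lines, column):
    """Stream CSV rows and count how often each value of `column` occurs.

    Only one row is held in memory at a time, so the input can be far
    larger than RAM as long as the number of distinct values stays small.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[column]] += 1
    return counts


# Hypothetical sample data; in practice `lines` would be an open file.
sample = io.StringIO("user,result\nalice,win\nbob,loss\nalice,win\n")
print(count_by_column(sample, "result").most_common())
# [('win', 2), ('loss', 1)]
```

The same function works unchanged on a multi-gigabyte file opened with `open(path, newline="")`, which is the kind of job that often gets a cluster assigned to it by default.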
Distributed bloat / overpriced cloud
- The Cost of Cloud, a Trillion Dollar Paradox(2021) by Sarah Wang and Martin Casado of Andreessen Horowitz / a16z[hacker news thread 1(2021), hacker news thread 2(2022), reddit thread(2021)]. An article that analyzes the negative impact of jacked-up cloud pricing on companies' capitalization and estimates it at the $100B mark.
- Twitter thread about Parler's insane hardware requirements(2021) / John Carmack's comment[habr(2021)].
- Tweet by Christopher Petrilli(Twitter engineer)(2022) saying that their estimates of moving Twitter to the cloud predicted a $300M monthly AWS bill even with volume discounts.
- Lyft pays $0.14 in cloud costs per ride(2019). The author argues that it's a good deal.
Surprise bills from the cloud
- Basecamp(37signals) gets a $3.2M bill and moves off the cloud(2023)[reddit thread(2023) full of similar experiences]
Migrations from the cloud / serverless technologies
- Zendesk Moves from DynamoDB to MySQL and S3 to Save over 80% in Costs(2023)[reddit thread 1(2023), reddit thread 2(2023), reddit thread 3(2023)]
- Amazon Prime Video moves from their own serverless Lambda to plain EC2 and gets a 90% cost reduction(2023)[reddit thread(2023), hacker news thread(2023)].
- Prerender.io moves from the cloud and reduces costs by 80%($1M -> $200k)(2022)[reddit thread 1(2022), reddit thread 2(2022)]
- Dropbox moves off AWS and saves 80%($92.5M -> $17.19M)(2015-2016):
* The Epic Story of Dropbox's Exodus From the Amazon Cloud Empire(2016)[reddit thread]
* Dropbox's S-1 Form(2018):
> Cost of revenue decreased $16.8 million or 4% during 2016, as compared to 2015, primarily due to a net decrease of $39.5 million in our infrastructure costs due to our Infrastructure Optimization. The net decrease of $39.5 million included a $92.5 million decrease in expense related to our third-party datacenter service provider, offset by a $53.0 million increase in depreciation, facilities, and support expense related to our infrastructure
> Cost of revenue decreased $21.7 million or 6% during 2017, as compared to 2016, primarily due to a $35.1 million decrease in our infrastructure costs due to our Infrastructure Optimization.
- Hey(37signals) moves off the cloud to save $7M over five years(2023)
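The savings figures in this list are easy to sanity-check. Below is a small sketch that recomputes the Prerender.io percentage and the net 2016 decrease from the Dropbox S-1 quote. The helper `savings_percent` and the variable names are assumptions made for illustration, and all inputs are taken as quoted above rather than independently verified.

```python
def savings_percent(before: float, after: float) -> float:
    """Percentage saved when a cost drops from `before` to `after`."""
    return 100 * (before - after) / before


# Prerender.io figures as quoted in the list above: $1M -> $200k.
print(f"Prerender.io: {savings_percent(1_000_000, 200_000):.0f}% saved")

# Dropbox 2016 figures from the S-1 quote, in millions of dollars.
third_party_decrease = 92.5  # less paid to the third-party datacenter provider
own_infra_increase = 53.0    # more depreciation, facilities, and support
net_decrease = third_party_decrease - own_infra_increase
print(f"Dropbox 2016 net infrastructure decrease: ${net_decrease:.1f}M")
```

This prints 80% for Prerender.io and $39.5M for Dropbox, matching the quoted numbers. Note that the Dropbox saving is net of the extra cost of running their own infrastructure, which is the fair way to compare.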