Hadoop, Jamstack and ChatGPT: A Rumination on Tech Hype and Progress
You can’t scroll the internet these days without encountering yet another Top 5 list of ways to use AI to do something new and exciting. AI is clearly a generational development in computing – although one that creates in me some internal conflict.
My natural optimism believes the coming AI revolution will, in the long term, exceed everyone’s hype-fueled expectations, while my inner contrarian sees an overwrought hype cycle that will, short term, launch more than a few false starts.
And that got me thinking about innovations like Hadoop and Jamstack, technologies one might glance at and quickly dismiss as short-lived. Doing so, however, would be to miss the lasting innovation entirely.
Infrastructure software – the underlying layers that allow engineers to build novel applications – exists on a continuum: containers were born from virtual machines; frameworks like Next.js built on Facebook’s React; ML-driven models improved on rules-based models. Everything builds on what came before, and only rarely does anything spring up out of the blue.
Hadoop took us from Big Data to just “data”
I worked at Cloudera in 2011 – the heyday – when Hadoop was a quickly emerging trend and every enterprise wanted to move off Teradata and Netezza and onto Cloudera’s Hadoop. Data practitioners were waking up to the idea of cloud-based data repositories and shifting their thinking from deleting non-core data to storing every last bit.
Unfortunately, Hadoop’s time as an industry darling reads like a classic rise and fall. To the uninitiated looking at it in 2023, Hadoop may not seem to have amounted to much. But that conclusion would be selling it short.
The second paragraph of the original Dean/Ghemawat MapReduce paper, published in 2004, states, “Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines.” What was a novel declaration at the time is now conventional wisdom, yet Databricks and the concept of a Lakehouse would not exist without this innovation. Hadoop is responsible for bringing compute to the data and for provisioning different clusters for storage and compute. With that came the ability to query data in cloud storage (e.g., EMR).
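For readers who never wrote a MapReduce job, here is a minimal sketch of the "functional style" that quote describes, using word count as the example. It runs on a single machine with a thread pool standing in for the cluster, and the function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions, not Hadoop's API.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(mapped_outputs):
    """Group intermediate values by key, the step a framework does between phases."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(documents, workers=4):
    # Each document is mapped independently of the others, which is what
    # lets a framework spread the work across many workers (here: threads).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, documents))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    docs = ["big data is data", "data is an asset"]
    print(word_count(docs))
```

The point of the sketch is that the author of map_phase and reduce_phase writes no parallelism code at all; because the two functions have no shared state, the runtime is free to decide where and how many times to run them.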
The technical innovations were astounding, and yet the true legacy was cultural: getting us to see the right data as an asset and something worth productizing, whether big data or small data. dbt, now an industry standard, builds on the Hadoop era, keeping the emphasis on scale while addressing how data assets are created and collaborated on.
Today we just call it data. But back then it was a hefty proper noun, Big Data, before everyone got on board.
Jamstack’s innovation: elevating the edge
Jamstack† was catchy and helped create a category. It helped entire cohorts of web developers create and deploy progressive websites and web applications faster; these were tangible benefits. And yet, five years later, few of us are still talking about it.
Dismissing Jamstack as a fad, though, would unfairly deny it the credit it deserves.
Jamstack represented a limit-pushing concept for a modern web development architecture, one that did away with the database (until this month!) and that served pre-built files with copies stored at the edge. That progress led to innovations from Cloudflare, such as Cloudflare Workers, and to next-generation Platforms-as-a-Service rebuilding a beloved first-gen solution, Heroku, from scratch, with the edge in mind. You can also see Jamstack’s technical DNA on the homepage of Fly.io, which tells us to “Deploy app servers close to your users.”
Jamstack’s innovation was to elevate the edge, to bring content closer to the user.
ChatGPT broke new ground. What will we build on it?
So, why am I writing this? I’m writing about prior eras as a reminder to myself – a timely one, now that AI is bringing groundbreaking technology into play.
We haven’t (yet) stopped talking about ChatGPT. We long ago stopped talking about Jamstack, just as we have stopped talking about Hadoop and Big Data. But with a discerning eye, we pull from each of these epochs their resonant innovation – the innovation that constitutes a building block to future technologies. I am constantly attuned to that.
†Mathias Biilmann introduced Jamstack in a presentation at Smashing Conference 2016.