Should We Be Using Kubernetes: Did the Best Product Win?
Episode Sponsor: PagerDuty - Check out the features in their official feature release: https://fnf.dev/4dYQ7gL

This episode dives into a fundamental question facing the DevOps world: did Kubernetes truly win the infrastructure race because it was the best technology, or were there other, perhaps less obvious, factors at play? Omer Hamerman joins Will and Warren to take a hard look at it. Despite the rise of serverless solutions promising to abstract away infrastructure management, Omer shares that Kubernetes has seen a surge in adoption, with potentially 70-75% of corporations now using or migrating to it. We explore the theory that human nature's preference for incremental "step changes" (Kaizen) over disruptive "giant leaps" (Kaikaku) might explain why a solution perceived by some as "worse" or more complex has gained such widespread traction.

The discussion unpacks the undeniable strengths of Kubernetes, including its thriving community, its remarkable extensibility through APIs, and how it inadvertently created "job security" for engineers who nerd out on its intricacies. We also challenge the narrative by examining why serverless options like AWS Fargate can often be a more efficient and less burdensome choice for many organizations, especially those not requiring deep control or specialized hardware like GPUs. The conversation highlights that the perceived "need" for Kubernetes often emerges from something other than technical superiority.

Finally, we consider the disruptive influence of AI and "vibe coding" on this landscape; how could we not? As LLMs are adopted to accelerate development, they tend to favor serverless deployment models, implicitly suggesting that for rapid product creation, Kubernetes might not be the optimal fit. This shift raises crucial questions about the trade-offs between development speed and code quality, the evolving role of software engineers toward code review, and the long-term maintainability of AI-generated code.
We close by pondering the broader societal and environmental implications of these technological shifts, including AI's massive energy consumption and the ongoing debate about centralizing versus decentralizing infrastructure for efficiency.

Links:
Comparison: Linux versus E. coli

Picks
Warren - Surveys are great, and also fill in the Podcast Survey
Will - Katana.network
Omer - Mobland and JJ (Jujutsu)
--------
1:06:35
--------
Mastering SRE: Insights in Scale and at Capacity with Aimee Knight
In this episode, Aimee Knight, an expert in Site Reliability Engineering (SRE) whose experience hails from Paramount and NPM, joins the podcast to discuss her journey into SRE, the challenges she faced, and the strategies she employed to succeed. Aimee shares her transition from a non-traditional background in JavaScript development to SRE, highlighting the importance of understanding both the programming and infrastructure sides of engineering. She also delves into the complexities of SRE at different scales, the role of playbooks in incident management, and the balance between speed and quality in software development.

Aimee discusses the impact of AI and machine learning on SRE, emphasizing the need for responsible use of these tools. She touches on the importance of understanding business needs and how it affects decision-making in SRE roles. The conversation also covers the trade-offs in system design, the challenges of scaling applications, and the importance of resilience in distributed systems. Aimee provides valuable insights into the pros and cons of a career in SRE, including the importance of self-care and the satisfaction of mentoring others.

The episode concludes with a discussion of some of the hard problems, such as the on-call burden for large teams and the technical expertise an org needs to maintain higher-complexity systems. Is the average tenure in tech decreasing? We discuss it and do a deep dive on the consequences in the SRE world.

Picks
The Adventures In DevOps: Survey
Warren's Technical Blog
Warren: The Fifth Discipline by Peter Senge
Aimee: Sleep Token (Band) - Caramel, Granite
Will: The Bear Grylls Celebrity Hunt on Netflix
Jillian: Horizon Zero Dawn Video Game
--------
1:17:55
--------
Exploring MCP Servers and Agent Interactions with Gil Feig
In this episode, we delve into the concept of MCP (Model Context Protocol) servers and their role in enabling agent interactions. Gil Feig, the co-founder and CTO of Merge, shares insights on how MCP servers facilitate efficient and secure integration between various services and APIs.

The discussion covers the benefits and challenges of using MCP servers, including their stateful nature, security considerations, and the importance of understanding real-world use cases. Gil emphasizes the need for thorough testing and evaluation to ensure that MCP servers effectively meet user needs.

Additionally, we explore the implications of MCP servers for data security, scaling, and the evolving landscape of API interactions. Warren chimes in with experiences integrating AI with auth. Will stuns us with some nuclear fission history. And finally, we also touch on the balance between short-term innovation and long-term stability in technology, reflecting on how different generations approach problem-solving and knowledge sharing.

Picks:
The Adventures In DevOps: Survey
Warren: The Magicians by Lev Grossman
Gil: Constant Escapement in Watchmaking
Will: Dungeon Crawler Carl & Atmos Clock
--------
1:04:57
--------
No Lag: Building the Future of High-Performance Cloud with Nathan Goulding
Warren talks with Nathan Goulding, SVP of Engineering at Vultr, about what it actually takes to run a high-performance cloud platform. They cover everything from global game server latency and hybrid models to bare metal provisioning and the power and cooling constraints that come with modern GPU clusters.

The discussion gets into real-world deployment challenges like scaling across 32 data centers, edge use cases that actually matter, and how to design systems for location-sensitive customers, whether that's due to regulation or performance. Additionally, there's talk about where the hyperscalers have overcomplicated pricing, and where simplicity in a flatter pricing model and optimized defaults are better for everyone.

There's a section on nuclear energy (yes, really), including SMRs, power procurement, and what it means to keep scaling compute with limited resources. If you're wondering whether your app actually needs high-performance compute or just better visibility into your costs, this is the episode.

Picks
The Adventures In DevOps: Survey
Warren: Jetlag: The Game
Nathan: Money Heist (La Casa de Papel)
--------
1:00:38
--------
Ground Truth & Guided Journeys: Rethinking Data for AI with Inna Tokarev Sela
Inna Tokarev Sela, CEO and founder of Illumex, joins the crew to break down what it really means to make your data "AI-ready." This isn't just about clean tables; it's about semantic fabric, business ontologies, and grounding agents in your company's context to prevent the dreaded LLM hallucination. We dive into how modern enterprises just cannot build a single source of truth, no matter how hard they try, all the while knowing that one is required to build effective agents that utilize the available knowledge graphs.

The conversation unpacks democratizing data access and avoiding analytics anarchy. Inna explains how automation and graph modeling are used to extract semantic meaning from disconnected data stores, and how to resolve conflicting definitions. And yes, Warren finally coughs up what's so wrong with most dashboards.

Lastly, we quickly get to the core philosophical questions of agentic systems and AGI, including why intuition is the real differentiator between humans and machines. Plus: storage cost regrets, spiritual journeys disguised as inference pipelines, and a very healthy fear of subscription-based sleep wearables.

Picks
The Adventures In DevOps: Survey
Warren: The Non-Computability of Intuition
Will: The Arc Browser
Inna: Healthy GenAI skepticism
Join us in listening to experienced experts discuss cutting-edge challenges in the world of DevOps: from applying the mindset at your company, to career growth and leadership challenges within engineering teams, to avoiding the common antipatterns. In every episode you'll meet a new industry veteran guest with their own unique story.