
Latent Space: The AI Engineer Podcast

Latent.Space

274 episodes


    Giving Agents Computers — Ivan Burazin, Daytona

    21/05/2026 | 1h 10min
    Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets!
    On the product side, everyone is getting Computer - Perplexity, Manus, Cursor, and so on. Meanwhile on the research side, agentic evals like TerminalBench and GDPVal are also assuming computer (Harbor). On both ends, the consolidating LLM OS stack has become a standard toolkit, and Daytona is one of a small set of AI Infra companies that are booming because of it.
    “The end of localhost” has been Ivan Burazin’s obsession for more than a decade.
    Something that is all too familiar…
    Long before agents became the default way people talked about software development, Ivan was already chasing the idea that development should not depend on a fragile local machine. CodeAnywhere, one of the first browser-based IDEs, was an early attempt at that future: move the development environment into the cloud, make setup reproducible, and free developers from the endless “works on my machine” tax.
    The thesis was directionally right, but the market wasn’t ready yet.
    However, agents changed that. They do not care about a laptop, desk setup, or favorite editor. They need a computer they can access through an API: something stateful enough to keep working, fast enough to spin up instantly, flexible enough to resize, isolated enough to be safe, and composable enough to run the messy real-world workflows that real software engineering actually requires.
    Daytona isn’t just selling “sandboxes” in the narrow code-execution sense. It is the latest version of Ivan’s original localhost thesis.
    In this episode, Daytona’s CEO joins swyx to explain why AI agents need more than code execution boxes: they need composable computers, stateful sandboxes, instant startup, dynamic resources, and infrastructure that can survive workloads going from zero to 100,000 CPUs.
    We go deep on the new agent compute market: Daytona’s hard pivot from human dev environments to AI sandboxes, the New Year’s Eve MVP that customers begged for, why Daytona runs on bare metal with its own scheduler, how one customer runs almost 850,000 sandboxes a day, and why RL/eval workloads went from 0% to roughly 50% of usage in just months. Ivan also explains why agents need Windows and macOS machines, why CLI may matter more than MCP, why Kubernetes is painful for this workload, and why the future AI cloud may look more like Stripe than AWS.
    We discuss:
    * How Daytona grew out of CodeAnywhere, Shift, and the “end of localhost” thesis
    * Why Daytona pivoted from human dev environments to AI sandboxes
    * Why agents need composable computers instead of disposable code execution boxes
    * The New Year’s Eve MVP that customers chased API keys for
    * Why Daytona chose bare metal, stateful snapshots, and its own scheduler
    * How Daytona spins up one sandbox in ~60ms and 50,000 sandboxes in ~75 seconds
    * Why Daytona’s biggest customer runs ~850,000 sandboxes a day
    * How RL/eval workloads create zero-to-100,000 CPU spikes
    * Why RL workloads went from 0% to roughly 50% of Daytona usage
    * Why customers compare Daytona against EKS/GKE and say they’re “never going back”
    * Why every AI agent may need a computer, including Windows and macOS environments
    * The Apple licensing constraints that make macOS sandboxes hard
    * Why CLI gives agents more power than MCP
    * How open source helps agents integrate Daytona
    * Why agent-generated PRs may break today’s CI/CD assumptions
    * Why AI SaaS companies reselling tokens may face a cold shower
    * Why the AI cloud may look more like Stripe than AWS
    Ivan Burazin
    * LinkedIn: https://www.linkedin.com/in/ivanburazin
    * X: https://x.com/ivanburazin
    Daytona
    * Website: https://www.daytona.io
    * X: https://x.com/daytonaio
    Timestamps
    * 00:00:00 Hook
    * 00:01:12 Introduction
    * 00:03:15 CodeAnywhere, Shift, and the end of localhost
    * 00:05:58 What Daytona is: composable computers for AI agents
    * 00:08:07 The pivot from dev environments to AI sandboxes
    * 00:10:17 The New Year’s Eve MVP and customers begging for API keys
    * 00:12:56 Bare metal, stateful sandboxes, and Daytona’s scheduler
    * 00:17:28 60ms startup, 50,000 sandboxes, and 850K daily runs
    * 00:21:53 Spiky RL/eval workloads and the new agent infra problem
    * 00:28:12 RL workloads, Kubernetes pain, and dynamic resizing
    * 00:33:31 Why every AI agent needs a computer
    * 00:38:48 macOS sandboxes and Apple’s licensing problem
    * 00:44:28 Why CLI may matter more than MCP
    * 00:48:11 Open source, GitHub stars, and agent integration
    * 00:53:11 Git, CI/CD, and agent collaboration bottlenecks
    * 00:58:15 Founder life and building a 25-person infra company
    * 01:02:44 AI SaaS, token resale, and API-first business models
    * 01:06:10 GPU sandboxes, data centers, and compute growth
    * 01:09:48 Why the AI cloud may look more like Stripe than AWS
    * 01:11:26 Closing thoughts
    Transcript
    Introduction: Daytona, CodeAnywhere, and the End of Localhost
    Swyx [00:00:02]: Okay, we’re in the studio with Ivan Burazin, CEO of Daytona. Welcome.
    Ivan [00:00:07]: Thanks for having me, man.
    Swyx [00:00:08]: Ivan, you and I go back.
    Ivan [00:00:10]: Way back.
    Swyx [00:00:11]: I don’t even know how you found me. Did you reach out, or was it for Shift?
    Ivan [00:00:17]: I reached out to you. I was one of the co-founders of CodeAnywhere, the first browser-based IDE, so we had been thinking for a long time that localhost should die. And you had this article.
    Swyx [00:00:29]: End of localhost.
    Ivan [00:00:30]: I reached out to you because of that, and then we talked. I was actually at a different job by then, as the head of developer experience, and you were quite well-versed in that. So I reached out to you, among other people, to ask how we should go about it and what the key things were at that point in time. You were nice enough to take the call, and I remember I was late to the call with you.
    Swyx [00:00:51]: I don’t remember.
    Ivan [00:00:52]: I remember because I was with my girlfriend, or wife at that point in time, I’m not sure. It’s the same person, so that’s great. I was late because we were in Italy on vacation. I felt so bad, and you were so nice about it.
    Swyx [00:01:10]: The reason I’m nice is because I’m also late to other people, so who’s without sin here? For those who don’t know, there’s this whole thing that you did in the past, InfoBip Shift, and that was basically one of the inspirations for me starting AI Engineer. I have to thank you for giving me that push to be like, “Oh, you can build and sell conferences?”
    Ivan [00:01:34]: I remember you offered me advisory shares at the beginning, and I was so focused on what we were doing that I said no. I should’ve taken the advisory shares. So I’m sorry, dude. But anyway.
    Swyx [00:01:43]: We’re not, we’re not venture backed.
    Ivan [00:01:44]: No, it doesn’t matter.
    Swyx [00:01:45]: Yeah, anyway. I think what’s impressive about you is that CodeAnywhere is the thing that you’ve been trying to build, and you kind of put it on hold and then came back after InfoBip. Just give us the origin story going into Daytona.
    From CodeAnywhere and Shift to Daytona
    Ivan [00:02:05]: Sure. Going really way back, me and my co-founder have been together a long time. I’ve said this multiple times: it’s like we were married and divorced and married again. Some people actually ask me whether my co-founder is my partner; they took it literally. It’s not literal, but we have done multiple companies together, and to your point, we had this shift where we went from CodeAnywhere to the conference called Shift, and then back to Daytona. We originally started out stacking servers, doing virtualization and routers in the early 2000s, basically all these things at a foundational level. That was a services company, which we sold to focus on what my co-founder actually invented, which was the very first browser-based IDE. I say the first; before us was actually Heroku. They did it for a very short time until they became Heroku. But outside of them, we were the only one, and it was called.
    Swyx [00:02:55]: There was Cloud9.
    Ivan [00:02:57]: Cloud9 came out slightly after us. There was Replit, which came out when we stopped doing it, and they have been successful since then, which is great. There was Nitrous.io. There were quite a few that existed at the time, but it was too early. The interesting part is that at that point in time there was no VS Code, there was no Kubernetes, and Docker had just started; I’m not sure if it was even public at that point. So we had to build the whole stack ourselves, and that was the key learning that we brought into Daytona and are still using today. So it was super early. About 3 million people used CodeAnywhere. It was angel-backed more than venture-backed. We ended up paying everyone back because it didn’t have that sort of scale. But three years ago, we started something similar with Daytona, which is not what we are today: it was automating dev environments for human engineers, basically the underlying stack of CodeAnywhere. And then we did a hard pivot last January to sandboxes. And so here we are.
    Swyx [00:04:01]: Historic pivot, yeah. It’s one of those things where I had independently invested in CodeAnywhere, but also in E2B, and then both of you pivoted into the same thing, and I’m like, “F**k.”
    Ivan [00:04:12]: You invested in Daytona. And you were the first. If we had not got your check, we wouldn’t have done it.
    Swyx [00:04:18]: No way.
    Ivan [00:04:19]: No, it was like, “We have to get him on board first,” and you were that kicker that we, that got us off the ground.
    Swyx [00:04:23]: No, because you were putting me on your pitch deck, man. I was like, “Man, this is like a guilt trip if I don’t invest.”
    Ivan [00:04:29]: That’s because it was your quote. It’s like we.
    Swyx [00:04:30]: Yeah. It’s the end of localhost.
    Ivan [00:04:31]: Did a bunch of research about end of localhost and who was interested in that.
    Swyx [00:04:34]: I wrote that blog post, and every single company in that field reached out to me, and then every VC who was receiving those pitches also had to call me and talk through it with me.
    Ivan [00:04:47]: It’s finally happening though.
    Swyx [00:04:48]: It was really super interesting.
    Ivan [00:04:48]: It’s finally happening.
    Swyx [00:04:49]: It’s finally happening.
    Ivan [00:04:49]: Yeah, it’s finally.
    Swyx [00:04:49]: It’s finally happening, with maybe sort of non-human users. Yeah, so what is Daytona today? Let’s get like a quick description. I’m wearing the shirt.
    What Daytona Is Today: Composable Computers for AI Agents
    Ivan [00:04:58]: You’re wearing the shirt. Yes,.
    Swyx [00:04:59]: It says, I think your branding is very good. Like, it’s very consistent. It runs AI code. Like, it cannot be simpler.
    Ivan [00:05:05]: Exactly, but we’re gonna probably have to change that.
    Swyx [00:05:07]: Oh, s**t.
    Ivan [00:05:07]: It’s also a subset of what we do. Unfortunately, because we really love this. Run AI Code is super simple. People interpret it in different ways. I think we’ve given out 5,000 or 6,000 of these shirts. People wear them with pride because it doesn’t really market us.
    Swyx [00:05:21]: Yeah, Daytona’s on the back.
    Ivan [00:05:22]: It markets to the person wearing it, so I think we did a really good job on that one. But it is also a subset of what we do, because when people think about Run AI Code, they just think about these small, let’s call them isolates: code execution boxes where you send some code and you get an output. Whereas what Daytona is today is essentially composable computers for AI agents. The market calls them sandboxes, which can be misleading.
    Swyx [00:05:44]: All these things. All these things on.
    Ivan [00:05:45]: Yeah, exactly, it can be misleading because people usually think of a sandbox as a demo or a test environment rather than a production-grade environment. But think of the laptop that you have in front of you, or the computer that’s over there. Or my wife is an architect, so she has a Windows machine with a 3D graphics card inside to do 3D rendering. As humans, we have different computers, or different compositions of computers. Our strong belief is that agents, today and going forward, will need all these different compositions of computers to do different types of tasks. And so we offer that, basically, through an API.
    Swyx [00:06:19]: Yeah. I’m trying to front-load all the aha moments or the wow moments so that people stay engaged and click like and subscribe. The market is exploding, right? You have been reporting 74% month-on-month growth, and it’s been growing like this for a while. And it’s not just you guys. It’s every single.
    Ivan [00:06:41]: Everyone, yeah.
    Swyx [00:06:42]: Sort of, compute provider. I don’t know if you agree with me saying compute provider or not.
    Ivan [00:06:48]: It’s fine.
    Swyx [00:06:48]: Yeah. So organically PLG-driven growth, but enterprise is also doing super well. I wanna rewind to January of last year, when you did the pivot. You obviously called this market early, you were positioned for it, and you are now one of the market leaders. But what was the insight that made you do the pivot?
    The Pivot: From Human Dev Environments to Agent Sandboxes
    Ivan [00:07:06]: The insight that made us do this pivot came the quarter before that, so the end of 2024. Basically, we did a demo. I think we discussed this as well: Devin was not public. You actually gave me access to Devin at that time. So Devin.
    Swyx [00:07:25]: I did?
    Ivan [00:07:26]: Yeah, you gave me access.
    Swyx [00:07:26]: I don’t think I was supposed.
    Ivan [00:07:27]: Yeah, exactly.
    Swyx [00:07:28]: Yeah, I.
    Ivan [00:07:28]: So it doesn’t matter. You.
    Swyx [00:07:29]: Yeah. I gave like three friends access.
    Ivan [00:07:31]: Yeah, or it was a call and you showed it to me. It doesn’t matter. But OpenDevin was available, which is now called OpenHands. And so we’re like, “Oh, this seems to be a thing. This is not public. Let’s take our automation of dev environments for humans, take OpenDevin, and launch that as a SaaS.” And we did that. Not very many people signed up and used it, but a lot of people who were building agents reached out, and they were like, “Hey, my agent needs a compute sandbox runtime,” whatever you wanna call it. I forgot what it was called at that point. And we were like, “Oh, amazing. This is a new market. Here is our infrastructure. Here’s our product. Go.” What we found really fast was that people did not like what we had built. It didn’t work. I remember talking to people at the beginning, when we were building the sandbox for agents. People were like, “Oh, why is it different? It’s the same thing. We have EC2, we have VMs, we have all these things.” But everyone we gave it to, it was like 20 or 30 people, they all said, “No. This is not what we need. This sort of breaks.” And me and my co-founder didn’t know a lot about this, because we’re infra people. We’re not AI people. So I took it upon myself to watch every single podcast that exists, including all of these, and get up to date: read all the blogs, understand what’s going on.
    Swyx [00:08:45]: Do you wanna shout out who else was useful, just in case people are also looking.
    Ivan [00:08:49]: Generally, I looked at a few podcasts, different segments and different types. So there’s you guys, No Priors, and Bill Gurley’s was great while.
    Swyx [00:09:04]: BG2, yeah.
    Ivan [00:09:05]: Yeah, while it was around. So there’s a few. 20VC is interesting for a different dynamic. There was also Redpoint’s.
    Swyx [00:09:14]: We’re not really about the compute market.
    Ivan [00:09:15]: It was also already - Sorry?
    Swyx [00:09:16]: You’re, you want - You’re looking at the agent infra market.
    Ivan [00:09:19]: I was looking at the agent market and the AI market in general, understanding who the players are, what the perception is, and how that goes. And obviously you complement this with going to conferences, events, and meetups, and reading white papers, doing all the things that you have to do to understand what’s happening. When we had an idea of what we had to build, literally on New Year’s Eve, I half vibe-coded the first MVP, the first minimum viable product, of what Daytona is today. I went to sleep at like 3:00 AM or something like that. I had just put my baby daughter and wife to sleep, Happy New Year, and went back to doing this. I sent it to my co-founder, my CTO, and he saw it in the morning. He’s like, “This is absolute garbage. Do not show this to anybody at all, but the idea is good.” And so he took two weeks, and he rebuilt it.
    Swyx [00:10:09]: Did it look like that? It was a rough idea.
    Ivan [00:10:12]: Oh, not even close. It was way worse. It was a simplistic view of what it should be. It worked, but it was not ideal. And so he went down the rabbit hole, which is his job as CTO, and he came back with this version. We then called all the people that had said, “This is garbage,” a quarter earlier. We set up these calls, and we just demoed it to everyone. And all the calls went long, every single one. They were 15-minute calls, and they all went to 25 or 30 minutes. And everyone said, “We want access.” There was no login, just an API key, because it was just a beta or an alpha. And we’re like, “Sure, yeah. Okay, thank you very much.” But the next day, if we had not sent it, every single one came back: “Where is my API key?” Everyone wanted it. We’re like, “S**t. This is it.” So, one, the understanding, to your point, was that most people thought it was the same infrastructure for humans and agents. We had understood a quarter earlier that it’s not. We just didn’t know what the right primitive was. And then we found it (we can talk about what that is) and gave it to these people. I’ve done multiple companies in my life, and I’ve never experienced this: people literally call you if you do not give them access. They want access right now. So it’s like, okay, the thing that they want doesn’t seem to exist, or they have not found it, and they really want what we have. That’s when we understood that we’re onto something. And then you think about the size of the market: the market for human engineers in the enterprise is a very large market, so think GitLab or whatnot.
    But the market for every single agent that will ever exist in the future, what is that market? How big is that? And we’re like, “We are all in on this.” And so that is where we made the cut between the old product and the new one.
    Bare Metal, Stateful Sandboxes, and the Lambda + EC2 Model
    Swyx [00:12:02]: Yeah. But it wasn’t composable at the time?
    Ivan [00:12:05]: It was basically just a Linux box where you could define the number of CPUs, disk, and RAM. That is what you could do, but you couldn’t have multiple operating systems, you couldn’t resize it on the fly, you couldn’t add a GPU, you couldn’t do all those things. It was just the first variation of that, yeah.
    Swyx [00:12:22]: Was it bare metal from the start?
    Ivan [00:12:24]: It was bare metal from the start. And so the interesting thing that we thought about right away, so our.
    Swyx [00:12:29]: Which, give people the background, what is the normal path?
    Ivan [00:12:32]: Yeah, so, basically most providers run this on top of VMs. And also.
    Swyx [00:12:37]: Firecracker.
    Ivan [00:12:38]: Yeah, they run on Firecracker and VMs. We have multiple isolation layers, and we can do that too. But in the common approach, one, the state of the machine, the hard disk, is not part of the sandbox itself. And two, they’re not meant to last forever. Most of them are preemptible: there’s a limit on how long they can live. Our thought going into this was that agents will be like humans, in the sense that you don’t want your laptop to be shut down until you’re done with work. You want to close the lid and open the lid, and it’s the same state. Agents would want that too, the pause and come back. But agents also really want speed. So we needed something insanely fast, long-running, and stateful. It’s like combining a Lambda and an EC2, those two things together. We didn’t have an idea of how others did it, because we didn’t know that there was a market around this. It was more like, okay, this is what they need. We looked at Kubernetes, and it wasn’t good enough for that. We looked at Nomad, and it didn’t enable that. So my CTO drew on our history of writing our own scheduler at CodeAnywhere. He’s like, “Oh, the learnings from there,” and he brought them over. And the funny thing is, when our third co-founder saw it, he’s like, “Dude, what is this? This is like 2008.” Like we went back in time, and he’s like, “Exactly.” And so the reason why Daytona is super fast, and you see this on benchmarks, is that we essentially run on bare metal.
    We have our own scheduler, and we use the disk, CPU, and RAM of the underlying machine, which means your IOPS are insanely fast because there’s no network hop to an EBS volume or something like that. The snapshots, the point-in-time templates, are also preloaded on the bare metal machines. So when you fire off a sandbox from a template or a snapshot, you’re essentially directed to the bare metal machine where that snapshot sits on the NVMe drive, and then it literally just turns on that machine, and it’s local. There’s no network latency. Those are the specifics we came up with when we were thinking from first principles about what a computer would look like for an agent, and that’s what we created.
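    The placement idea described here, sending a new sandbox to a bare metal machine that already holds the requested snapshot on local disk, can be sketched in a few lines of Python. This is an illustrative sketch only, not Daytona’s scheduler: the `Host` class, the `place` function, the host and snapshot names, and the tie-breaking rule (most free CPUs) are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Host:
    """A bare-metal machine with locally cached snapshots (hypothetical model)."""
    name: str
    free_cpus: int
    snapshots: set[str] = field(default_factory=set)


def place(hosts: list[Host], snapshot: str, cpus: int) -> Host | None:
    """Pick a host for a new sandbox, preferring snapshot locality."""
    # Only hosts with enough free CPUs are eligible at all.
    fits = [h for h in hosts if h.free_cpus >= cpus]
    if not fits:
        return None
    # Hosts that already hold the snapshot on local disk need no network copy.
    local = [h for h in fits if snapshot in h.snapshots]
    candidates = local or fits
    # Assumed tie-break: spread load by choosing the host with the most free CPUs.
    chosen = max(candidates, key=lambda h: h.free_cpus)
    chosen.free_cpus -= cpus
    # After a cold start the snapshot is assumed to stay cached on that host.
    chosen.snapshots.add(snapshot)
    return chosen


hosts = [
    Host("metal-a", free_cpus=64, snapshots={"python-3.12"}),
    Host("metal-b", free_cpus=128, snapshots={"node-22"}),
]
picked = place(hosts, "python-3.12", cpus=4)
print(picked.name if picked else "no capacity")
```

    In this toy model the request lands on `metal-a` even though `metal-b` has more free CPUs, because locality is checked before load. A real scheduler would also have to weigh RAM, disk, isolation, and eviction of cached snapshots.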
    Benchmarks, 60ms Startup, and 50,000 Sandboxes
    Swyx [00:15:02]: Yeah. I don’t know if you endorse this, but there’s someone that does compute SDK, and you guys do very well on there with the TTI, right? Is this a relevant benchmark for you guys? I don’t know.
    Ivan [00:15:16]: I don’t know, and it changes every day. So today RKL is.
    Swyx [00:15:18]: I don’t know what RKL is. Never heard of it.
    Ivan [00:15:20]: Yeah. RK, yeah, so it is there.
    Swyx [00:15:22]: You are, at least a third of the next tier of performance, and then, there’s a lot of other better-known names that are very slow to start.
    Ivan [00:15:31]: Yeah. We’ve been number one by far for a long time, and now there are different definitions of sandboxes, different isolation patterns, and other differences. RKL runs it literally on S3, the data, so it’s very different, and they spin up a container for that, so it’s a different type of thing. The definition of a sandbox is something that we all need to agree on. But yeah, we’re insanely fast at getting these things up and running. And you can see even there that it’s 0.10 to 0.11, so.
    Swyx [00:16:03]: Close enough. Yeah. what else do you need, right?
    Ivan [00:16:05]: Yeah. I don’t think the benchmarks equate to market ownership or revenue or anything like that. I’ve seen this with multiple benchmarks, not just in sandboxes, but benchmarks in general.
    Swyx [00:16:20]: It’s table stakes. It’s just like.
    Ivan [00:16:21]: Exactly. But it doesn’t hurt.
    Swyx [00:16:22]: Just roughly check.
    Ivan [00:16:22]: Like you definitely have to be up there and you have to be competing so that people know that, oh, this is definitely one of the top. Because this is only one dimension of what customers look for. There’s other things like how many can you spin up consecutively? There’s a feature set, there’s support, there’s like all different things that people look at, but you definitely have to be there, on the benchmarks.
    Swyx [00:16:40]: How many people do people spin up consecutively?
    Ivan [00:16:43]: So we have.
    Swyx [00:16:43]: Or concurrently, is the Concurrency, right?
    Ivan [00:16:45]: There are three metrics that we look at. One is the time to spin up one, and our time to spin up one is 60 milliseconds, including network latency. So request, spin up, reply: the whole thing, 60 milliseconds. But if you wanna spin up 50,000 at once, we are now at about 75 seconds. So it takes about 75 seconds to spin up 50,000 concurrently. Some others, and there’s public data around this, take 2,000 seconds, which is over 30 minutes. There are different variations of that. So it is the speed of one, the speed of many, and then how many you can consistently have up and running. Right now we basically have no limit to how much we can add, because we own our own metal. Our biggest customer does about 850,000 every single day, so they’re just shy of a million every single day. We do have a request for half a million concurrent, which is literally half a million CPUs running somewhere. So that’s an interesting.
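    As a back-of-the-envelope check, the figures quoted here can be turned into rates. The sketch below only does arithmetic on numbers from the conversation (50,000 sandboxes in about 75 seconds, about 850,000 per day, 2,000 seconds for a slower provider). The helper name `per_second` is made up for the example, and reading the 850,000 figure as sandbox starts spread evenly over a day is an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(count: int, seconds: float) -> float:
    """Average rate in items per second."""
    return count / seconds


# Burst: 50,000 sandboxes started in about 75 seconds.
burst_rate = per_second(50_000, 75)
# Sustained: about 850,000 sandboxes per day, assumed evenly spread.
daily_avg_rate = per_second(850_000, SECONDS_PER_DAY)
# The slower figure quoted for other providers, converted to minutes.
slow_minutes = 2_000 / 60
# Ratio between the slower figure and the 75-second figure.
speedup = 2_000 / 75

print(f"burst:     {burst_rate:.0f} sandboxes/s")
print(f"daily avg: {daily_avg_rate:.1f} sandboxes/s")
print(f"2,000 s is {slow_minutes:.1f} minutes, about {speedup:.0f}x slower than 75 s")
```

    The gap between the burst rate (hundreds per second) and the daily average (around ten per second) is the same spikiness that comes up later in the conversation about capacity planning.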
    Swyx [00:17:44]: They pay by like vCPU seconds.
    Ivan [00:17:47]: By seconds, yeah.
    Swyx [00:17:47]: Or whatever. Yeah. Okay, and then the other thing is the sleeping and the resuming, the stateful resumption of all these things. What kind of workload are people putting through this? Do we measure by gigabytes in memory, gigabytes in storage, network-attached storage? Which are the costly ones out of all these features?
    Workload Economics: CPU, RAM, Network, and Storage
    Ivan [00:18:15]: The most expensive thing is CPU.
    Swyx [00:18:18]: Okay. Yeah, of course.
    Ivan [00:18:18]: Yeah. Then it’s RAM, then it’s disk. We actually don’t charge.
    Swyx [00:18:22]: Which is snapshotting, right?
    Ivan [00:18:23]: No, it’s actually the, snapshotting’s part of it, but basically the size of your hard disk, of your machine. So do you have 10 gigabytes, do you have 20, do you have 50, do you have whatever? And then the transference of that. Right now, currently we don’t charge for, network at all at Polychron.
    Swyx [00:18:37]: Oh, you gotta, yeah, you gotta fix.
    Ivan [00:18:38]: Yeah. It’s a larger and larger part of our bill, so we’re working on that part. Obviously, the hard disk is the least expensive, so it’s basically CPU, RAM, then for us network, because we don’t charge the customer, and then hard disk. That’s how it’s split up. But there are also different types of workloads, and we basically split them into two types in Daytona. One is what we call background agents, or long-running agents. The other is RL and evals, which I sort of put together. They have very different patterns of usage. If you look at the usage of a background agent, and I’ll just name names of companies, not specifically.
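    The billing model sketched in this exchange (pay per second, with CPU costing the most, then RAM, then disk, and network not billed to the customer) can be illustrated as follows. All prices below are hypothetical placeholders chosen only to preserve that ordering; they are not Daytona’s price list, and `SandboxSpec` and `cost_usd` are names invented for this sketch.

```python
from dataclasses import dataclass

# Hypothetical per-second prices in USD. Only the ordering CPU > RAM > disk
# comes from the conversation; the numbers themselves are made up.
PRICE_PER_VCPU_SECOND = 1.4e-5
PRICE_PER_GIB_RAM_SECOND = 4.5e-6
PRICE_PER_GIB_DISK_SECOND = 3.0e-8
# Network is deliberately absent: per the conversation it is not billed to the customer.


@dataclass(frozen=True)
class SandboxSpec:
    """Resources requested for one sandbox (hypothetical shape)."""
    vcpus: int
    ram_gib: int
    disk_gib: int


def cost_usd(spec: SandboxSpec, seconds: float) -> float:
    """Bill a sandbox for the seconds it actually ran."""
    per_second = (
        spec.vcpus * PRICE_PER_VCPU_SECOND
        + spec.ram_gib * PRICE_PER_GIB_RAM_SECOND
        + spec.disk_gib * PRICE_PER_GIB_DISK_SECOND
    )
    return per_second * seconds


spec = SandboxSpec(vcpus=2, ram_gib=4, disk_gib=10)
print(f"10 minutes: ${cost_usd(spec, 600):.4f}")
```

    With per-second billing, a sandbox that runs for a short eval and then stops costs only for those seconds, which is part of why spiky workloads are attractive to customers and hard on the provider.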
    Background Agents vs. RL/Evals: Two Usage Shapes
    Swyx [00:19:21]: Yeah, open, all hands.
    Ivan [00:19:23]: Yeah. So a background agent is a Cognition, a Lovable, a Harvey. These are all long-running background agents. If you look at their usage patterns, they are similar to humans, which is follow the sun. Basically, noon is probably the highest and midnight is the lowest, and weekends are lower and weekdays are higher.
    Swyx [00:19:42]: Yeah, that’s a fun question. How global is it? Is it very US-centric or?
    Ivan [00:19:46]: The US is a large part, but we have currently, we have Asia, Europe, and the US regions.
    Swyx [00:19:52]: So it’s quite global.
    Ivan [00:19:53]: Yeah, it’s quite global. We have it all over. It’s interesting, and I talked to you a bit about this: our number one city by user.
    Swyx [00:20:01]: Hmm.
    Ivan [00:20:02]: Is Singapore.
    Swyx [00:20:04]: Oh, wow. Amazing.
    Ivan [00:20:05]: Which is an interesting one, right? Not by revenue, just by individual head count.
    Swyx [00:20:09]: Really?
    Ivan [00:20:09]: Just like an interesting thing.
    Swyx [00:20:10]: Singapore is weirdly high in the adoption charts of AI for the population. It’s like a seven, eight million population. And it keeps showing up.
    Ivan [00:20:20]: No, it’s quite interesting. We were quite shocked, and I was like, “Oh, this is interesting.” And also one that’s up there.
    Swyx [00:20:24]: There’s a reason I’m doing AI Engineer Singapore. It’s because I’m from there.
    Ivan [00:20:27]: We’re gonna be there as well. And it’s interesting that Japan is in the top, or Tokyo’s in the top, which in all the tech cycles it has never been. So it’s quite interesting that they’re.
    Swyx [00:20:39]: I think the Japanese just love AI. Yeah. It’s that, and then it’s Brazil. That’s it.
    Ivan [00:20:44]: Brazil has always been in.
    Swyx [00:20:45]: I think.
    Ivan [00:20:46]: Even if you look at GitHub’s data, and historically with CodeAnywhere, it was always the US, Western Europe, and then you’d have India, Brazil, China. But Singapore, and specifically Japan, was never in that top.
    Swyx [00:21:01]: Yeah. Weird pockets.
    Ivan [00:21:01]: Weird. Yeah, so it’s very global.
    Swyx [00:21:02]: Okay, but that helps you distribute your load across all times of day?
    Ivan [00:21:08]: The interesting thing is like we have those kind of loads, but if you look at the researcher loads, they’re quite different. So what they are is like if you give them concurrency of 10,000 or 50,000 or 100,000 CPUs at ARMb, when they fire off a run, it’s just 100%. And then it just runs, and then it stops. So it’s very, the usage pattern is squares basically, right? And it’s also not follow the sun, because people will fire it off at midnight before they go to sleep but then wake up and so it’s very unpredictable, so you don’t know where that is. So the shapes of the usage are quite different than we have had before. And also what’s interesting is when it’s sort of a follow the sun, even if you have a high growth company, you can sort of predict your usage patterns and have enough capacity for that, because it’s sort of, it grows in a, in a way you can project. When you have companies doing sort of like evals and RL, they’re super spiky. So they’re gonna come in, it’s like, “We’re gonna use nothing, then can we have 100,000?” Right? And then go back down. And then 100,000, go back down. So it’s very different, right? And.
    Swyx [00:22:09]: Do you want to lock them into commits so.
    Ivan [00:22:11]: Yeah, we do.
    Swyx [00:22:12]: Yeah, okay.
    Ivan [00:22:12]: So we have to lock them into some sort of commits to have that capacity, because basically we have to have the capacity for peak. Right? And so right now, Daytona’s mean utilization is 15%, one-five.
    Swyx [00:22:25]: Oh my God.
    Ivan [00:22:26]: So it’s very low.
    Swyx [00:22:27]: Because it’s very spiky.
    Ivan [00:22:27]: It’s very spiky, but we get up to 90%. So we have these things. And so what we’re looking at right now as a company is similar to Cloudflare, where you can geo-move things around, but that works really well for basically the background agent, where it’s follow the sun. But this is not. It’s a very different shape. Obviously with scale you figure these things out, but that’s an interesting new problem that we have as a compute provider in the agent space. And when we were doing the conference recently, we talked to Nikita from Neon and.
    Swyx [00:22:57]: I should bring it up.
    Ivan [00:22:58]: Parag from Parallel and whatnot, and everyone has the same problem. The usage is super spiky, and this is something that has not happened before. The amplitudes were never this high, right? So it’s quite an interesting use case and problem to solve.
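The gap between a 15% mean and a roughly 90% peak follows directly from "square" usage: capacity has to be sized for the peak, while the average is dominated by the idle hours between runs. The sketch below illustrates that arithmetic with a made-up day of load. The burst sizes, durations, and the 110,000-CPU capacity are assumptions chosen for illustration, not Daytona's actual numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Burst:
    start_hour: int  # hour at which the run is fired off
    hours: int       # how long the run holds its CPUs
    cpus: int        # CPUs held for the whole run (one "square" in the graph)


def hourly_load(bursts: list[Burst], horizon_hours: int) -> list[int]:
    """CPUs in use during each hour of the horizon."""
    load = [0] * horizon_hours
    for b in bursts:
        end = min(b.start_hour + b.hours, horizon_hours)
        for hour in range(b.start_hour, end):
            load[hour] += b.cpus
    return load


def utilization(load: list[int], capacity: int) -> tuple[float, float]:
    """Return (mean, peak) utilization as fractions of provisioned capacity."""
    mean = sum(load) / (len(load) * capacity)
    peak = max(load) / capacity
    return mean, peak


# Hypothetical day: two eval/RL runs, one of them fired off at midnight.
bursts = [
    Burst(start_hour=0, hours=2, cpus=100_000),
    Burst(start_hour=15, hours=2, cpus=80_000),
]
load = hourly_load(bursts, horizon_hours=24)
mean, peak = utilization(load, capacity=110_000)
print(f"mean utilization {mean:.0%}, peak utilization {peak:.0%}")
```

With these invented inputs the fleet is nearly full for four hours and empty for twenty, which is the shape that makes committed capacity attractive to the provider.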
    Compute Conference and Spiky Agent Infrastructure
    Swyx [00:23:12]: Yeah, I don’t know if we’re gonna bring this up again, but let’s just talk about the conference. You had like 1,000-something people at the Warriors game, at the... Sorry, where is it?
    Ivan [00:23:22]: Chase Center.
    Swyx [00:23:23]: Chase Center.
    Ivan [00:23:23]: Chase Center.
    Swyx [00:23:24]: I went. It was very impressive. Obviously, you know how to throw a conference. What did you learn? You pulled together all these impressive names.
    Ivan [00:23:33]: What I.
    Swyx [00:23:34]: What were you looking for?
    Ivan [00:23:35]: My thesis behind the Compute Conference was: let’s bring together people that are building infrastructure for AI agents. Because when I think of what we’re building, the agent is the primary user, so what are the ergonomics and usage patterns of agents? And what I found (this was a theory, it wasn’t proven) is that we all have these problems, as I touched on. As I was saying on stage, we all have the same underlying infra problems, which are these spiky, unpredictable workloads that we’ve never had before in human compute or human infrastructure. And again, it’s the same when I was talking to Parag or when I was talking.
    Swyx [00:24:20]: Lynn. Nikita.
    Ivan [00:24:21]: Lynn, Nikita. Lynn especially, I was talking to her the other day as well. It is a very interesting type of problem to solve. I can touch on Cloudflare, because there’s been a lot of talk recently about how they solve that: they have a bunch of geos, and as users work in different places, and depending on your tier, they can move you around the geos. That’s how they get the higher utilization. But you can sort of predict these. You’ll rarely get a spike that is 10 orders of magnitude. Let’s say one of your customers has an exponential curve. What is that to, and I’m using Cloudflare as an example, 10%, 20%, whatever it is? I don’t have this data, I’m just guessing. It’s surely not 10x, right? And so how do you go out and solve this problem? We’re all solving this in different ways. So we have.
    Swyx [00:25:11]: She also has the same thing.
    Ivan [00:25:12]: Yeah, I know specifically that like Neon had that issue as well. Like how are we solving these spiky loads and things like that ‘cause we talked about it. And so the interesting thing for me to actually internalize was, yes, everyone that’s building for agents first is going through this, and we’re all solving similar problems, which is quite.
    Swyx [00:25:28]: Let me double-click on this. So for example, Neon, I happen to know that they’re very S3 oriented, right? They’re just fully bet on S3, and you get to benefit from S3’s distribution and infrastructure. So I would imagine that Neon doesn’t have to care, whereas Lynn maybe has to care a bit more, because obviously she’s doing GPU inference. For listeners, we did an episode with her one and a half years ago. And you have to care. But like, right?
    Ivan [00:25:54]: Parag cares for sure, and Nikita.
    Swyx [00:25:58]: And Parag is CEO of Parallel.
    Ivan [00:25:59]: Parallel, yeah.
    Swyx [00:26:00]: Former CTO of Twitter.
    Ivan [00:26:01]: Twitter, yeah.
    Swyx [00:26:02]: They are the search.
    Ivan [00:26:03]: Yeah, they’re search, yeah.
    Swyx [00:26:03]: You and I know, but the listeners don’t know.
    Ivan [00:26:08]: Yeah, we can put it down on the screen, ‘cause when we were talking.
    Swyx [00:26:11]: I’ll put it up on the, on the screen.
    Ivan [00:26:12]: Yeah, right.
    Swyx [00:26:12]: People can look it up if they need.
    Ivan [00:26:14]: Look it up. And yes, but they still have CPU and RAM allocation that you have to have up and running. You have to allocate that and have that ready. And so there’s basically two ways to do it. One is you over-provision so you can handle the bursts. Two, you basically have, I don’t know if this is a term, just-in-time compute, which is: as your usage comes in, you fire off requests for VMs or bare metal at other cloud providers and then get them up and running.
    Swyx [00:26:43]: This is if you go above 100%, right?
    Ivan [00:26:45]: Yeah, this is.
    Swyx [00:26:46]: Like your overflow.
    Ivan [00:26:46]: Your overflow, like spillage or whatever you call it.
    Swyx [00:26:48]: You probably lose money on it, but it doesn’t matter, right?
    Ivan [00:26:50]: Well, you might, you might not. That is a more cost-effective way to do it, but it’s a slower way to do it. Because basically what you have to do is queue your requests, spin up this just-in-time compute, get it all ready, provision it, and then get your workload there. And so if the time isn’t that important, that’s fine, and you can do that. But for, let’s say, the RL training runs, the reason why a lot of people come to us is because GPUs are more expensive than CPUs, right? So you want your GPU running at 100% the entire time. And so when you’re running runs on CPUs, when one CPU cycle is winding down and the next one is spinning up, you want that to be instantaneous so that your GPU doesn’t go idle, right? And if you then have to go out and provision machines, you’re essentially telling the GPU that it has to wait, and that’s incurring cost. So there’s things that you have to try to solve for there.
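The trade-off between over-provisioning and just-in-time compute can be put into rough numbers. The following is a minimal sketch, not a description of how Daytona schedules anything: the function names, the start-up times, and the GPU price are all hypothetical, and the model assumes the GPUs stay idle until every requested sandbox is up.

```python
def gpu_stall_seconds(requested: int, warm_capacity: int,
                      warm_start_s: float, jit_provision_s: float) -> float:
    """Seconds the GPU side waits before all requested sandboxes are running.

    Requests that fit in already-provisioned (warm) capacity only pay the
    sandbox start time. Any overflow first waits for new machines to be
    provisioned at another provider, then pays the start time.
    """
    if requested <= warm_capacity:
        return warm_start_s
    return jit_provision_s + warm_start_s


def stall_cost(stall_s: float, gpus: int, gpu_dollars_per_hour: float) -> float:
    """Dollar cost of `gpus` GPUs sitting idle for `stall_s` seconds."""
    return stall_s / 3600 * gpus * gpu_dollars_per_hour


# Hypothetical run: 3,000 sandboxes requested, 512 GPUs waiting on them.
for warm in (4_000, 2_000):  # over-provisioned vs. relying on overflow
    stall = gpu_stall_seconds(requested=3_000, warm_capacity=warm,
                              warm_start_s=0.1, jit_provision_s=180.0)
    cost = stall_cost(stall, gpus=512, gpu_dollars_per_hour=2.0)
    print(f"warm={warm}: stall {stall:.1f}s, idle GPU cost ${cost:.2f} per step")
```

The per-step cost looks small, but it is paid every time the CPU side has to wait, which is why the speaker frames instant sandbox start-up as a GPU-utilization problem.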
    RL Workloads, Declarative Images, and Kubernetes Replacement
    Swyx [00:27:43]: Yeah, let’s talk about the different workload, right? You said that, what was it? A few months ago, you had zero RL workload and now it’s 50%.
    Ivan [00:27:52]: It will be this one, 50%, yeah.
    Swyx [00:27:54]: Let’s talk about how different it is, right? Like I imagine, for example, a lot less dynamic code generation of like arbitrary code. Like here, it’s probably all the same code. You’re just doing parallel runs or something, I don’t know.
    Ivan [00:28:05]: Yeah. So you’ll have multiple... It depends. For each run, you’ll have a snapshot. And for the most part, they actually do use our declarative image builder, which is like, “Oh, the agent wants these dependencies, these env vars.”
    Swyx [00:28:17]: These ones, yeah.
    Ivan [00:28:18]: Yeah, the declarative image builder, it.
    Swyx [00:28:20]: Which is a very Modal-like thing that they.
    Ivan [00:28:22]: Yeah. And so we build it on the fly and then we propagate that snapshot, and you can spin up as many sandboxes as you want against that snapshot. And then if you have to do changes, the model can, or like it could be also be automated. It’s like, “Oh, now for the next run, we need to install these things or remove these things or whatever to get, a task done,” and then it goes off and runs that. So yes, that is something that it seems that they prefer. The number one reason I found, or should I say, let’s take a step back. What we are competing against in that environment is essentially managed Kubernetes. So EKS, GKE, whatever. That is what the vast majority run on. And anyone that has tried Daytona versus GKE, EKS is like, “I’m never going back.” That has always been. There’s a few reasons. One is the ergonomics. So if you have, if you’re using Kubernetes to spin that up, you have to essentially manage the interface interactions with that. Daytona, although as a compute provider, it’s more akin to a Twilio and Stripe from a consumption perspective than it is an AWS. Like you have an API, an SDK, it’s quite like easy and seamless to get these things up and running, that’s one. The other is the speed to which we spin up, which we mentioned earlier, which is much faster, and the scale to which we can go to. We haven’t got into features, but an interesting feature is that it’s very hard to OOM, or out of memory, our sandboxes, because we can dynamically on the fly.
    Swyx [00:29:48]: Resize.
    Ivan [00:29:49]: Resize, which is impossible on almost any other thing. There are some technologies that enable you to do that, but it’s a very hard thing. And so we actually saw this when the TerminalBench team brought us in. So thank you, Alex and the team, who brought us into this whole space.
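The declarative image flow described a few turns earlier (declare dependencies and env vars, build once, propagate a snapshot, start many sandboxes against it, rebuild only when the declaration changes) can be modelled in a few lines. This is an illustrative sketch only: `ImageSpec` and `SnapshotCache` are hypothetical names invented here, not Daytona's SDK, and the "build" is a counter rather than a real image build.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSpec:
    """Declarative description of an environment (hypothetical model)."""
    base: str
    packages: tuple[str, ...] = ()
    env: tuple[tuple[str, str], ...] = ()

    def with_packages(self, *names: str) -> "ImageSpec":
        """Return a new spec with extra packages, e.g. for the next run."""
        merged = tuple(sorted(set(self.packages) | set(names)))
        return ImageSpec(self.base, merged, self.env)

    def snapshot_id(self) -> str:
        """Content hash: identical declarations map to the same snapshot."""
        payload = json.dumps(
            {"base": self.base,
             "packages": sorted(self.packages),
             "env": sorted(self.env)},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()[:12]


class SnapshotCache:
    """Builds each distinct spec once, then fans sandboxes out from it."""

    def __init__(self) -> None:
        self.builds = 0
        self._snapshots: dict[str, ImageSpec] = {}

    def ensure(self, spec: ImageSpec) -> str:
        sid = spec.snapshot_id()
        if sid not in self._snapshots:
            self.builds += 1  # stand-in for an actual image build
            self._snapshots[sid] = spec
        return sid

    def spawn(self, spec: ImageSpec, count: int) -> list[str]:
        """Return identifiers for `count` sandboxes started from one snapshot."""
        sid = self.ensure(spec)
        return [f"{sid}-{i}" for i in range(count)]


cache = SnapshotCache()
run1 = ImageSpec("python:3.12", packages=("numpy",), env=(("SEED", "0"),))
cache.spawn(run1, count=1000)                 # one build, many sandboxes
run2 = run1.with_packages("pytest")           # next run changes dependencies
cache.spawn(run2, count=1000)                 # triggers exactly one new build
print(cache.builds)
```

Keying the snapshot on a hash of the declaration is one common way to make rebuilds happen only when the declared environment actually changes; whether Daytona does it this way is not stated in the conversation.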
    Swyx [00:30:05]: It’s just very rare that, a framework would just say, “Guys, just use Daytona.”
    Ivan [00:30:11]: Yeah, I think it says it somewhere. Yeah.
    Swyx [00:30:13]: Yeah. I was like, “What is this?”
    Ivan [00:30:15]: There’s multiple there, and they also mention a few other places. And so Daytona specifically... we have, just jumping on themes here... I don’t know where it says Daytona.
    Swyx [00:30:27]: I, there.
    Ivan [00:30:27]: Doesn’t matter.
    Swyx [00:30:28]: There’s a very strong recommendation, which is very unusual. Which is, it’s.
    Ivan [00:30:33]: We do not pay them for this, just.
    Swyx [00:30:34]: I know, yeah. They just like you.
    Ivan [00:30:35]: Yeah, they like us. And another thing: Daytona has multiple isolation sets underneath. The customer doesn’t have to know what they are. But basically we have Docker, which is a container, hardened with Sysbox. So it’s Docker isolation with security equivalent to a VM, but it’s still a container. That is the default, and they, especially in these training workloads, really like being able to use just a basic Docker container as the interface, and we enable Docker-in-Docker. So for these RL runs, if you need to do a Docker Compose or Kubernetes, you can spin up a k3s inside of these things, which unlocks a huge amount of workloads that you cannot do on other providers. So just on that part it’s much more interesting. We showed them that we could do that, and they enjoyed that quite a bit. They being the TerminalBench people.
    Swyx [00:31:28]: Those people, yeah.
    Ivan [00:31:29]: And Harbor people.
    Swyx [00:31:29]: Harbor people. Are they a company yet?
    Ivan [00:31:33]: I do not know.
    Customer Pull, Slack Connect, and the Computer Use Bet
    Swyx [00:31:35]: Okay. All right. It’s super obvious that there’s a lot of excitement and success around these things. So tell us more, right? This is an exploding workload, and Harbor adopted you, which helped speed things along. But what are you learning as this new workload comes online?
    Ivan [00:31:53]: There’s a couple things that we learned, which we chatted about in the beginning. This has led our story: as we mentioned, we talked to a lot of customers along the way, and we add more features and more tool sets as we talk to customers. And it’s interesting, and I think it’s because the ecosystem is so small and/or the models get smarter: when we see one user come with a request, we know it goes on the roadmap if three to five customers come with the same request in that week. It’s very bizarre. It happens so many times, which is.
    Swyx [00:32:27]: Because they’re all friends.
    Ivan [00:32:28]: Sorry?
    Swyx [00:32:28]: They all, they’re all friends. They’re all in the same group chat.
    Ivan [00:32:30]: Yeah, probably. ‘Cause they’re like, “Oh, can you do this?” And I’m like, “Okay, this is interesting. We’ll put it on a feature request.” And then the next one’s like, “Oh, can you do this?” “Okay.” It’s always the same, right? And so what I personally try to do is be on as many, quote-unquote, “sales calls” as I can. I’m in every Slack channel. We literally have about 1,000 Slack Connect channels, something like that. There’s so many interesting things you find out when you have all the Slack channels. You can also see where people transfer between companies. You see leave Slack channel, enter Slack channel. It’s an interesting thing. Also, I digress, but I feel that Slack Connect is literally what LinkedIn should be. You have a list.
    Swyx [00:33:08]: LinkedIn charges you to, use your own connections, but Slack doesn’t, right? Slack is like, do it for free. It’s more lock-in. It’s great.
    Ivan [00:33:15]: Yeah. It’s amazing. Yeah. It’s one of the reasons.
    Swyx [00:33:17]: You’re gonna pay Slack for life.
    Ivan [00:33:18]: Exactly. You’re there for life. So that’s interesting. And so one of the newer things we were talking about earlier is we made a big bet and put a lot of investment into computer use. That has not publicly seen the light of day. We haven’t GA’d that yet, but we have.
    Swyx [00:33:32]: Is there a thing I can pull up?
    Ivan [00:33:33]: There is computer use there. It’s right up a bit.
    Swyx [00:33:36]: Oh, yeah. Okay.
    Ivan [00:33:38]: What we talked about, and what we’ve seen publicly, is there’s this theme now about the human emulator. Elon from xAI has talked about this publicly. And if you think about the models today, they’re actually quite sophisticated and they can do a lot of work, but they still don’t have access to all the tools. I’m a strong believer that the most efficient way for an agent to work is essentially headless, or through a terminal or whatnot. But if we look at knowledge work in general, there’s about 100 million knowledge workers in the US, about a billion in the world, and their salaries aggregate to 10 trillion in the US, 50 trillion worldwide.
    Swyx [00:34:24]: Wow.
    Ivan [00:34:25]: Something like that. And if we look at the five most important sectors of that, so healthcare and government and financial services and whatnot, that’s about 56% of that. So let’s say it’s about half. So worldwide it’s about 25 trillion, and most of that work is actually still locked into legacy apps inside of Windows, which is not going anywhere for a very long time. People just won’t invest in that. How much of it? Our assumption is the following: in the RPA market, which is a similar market, though not the same, 25% of these white-collar workers’ work is automated. If an agent is more sophisticated, can go through more runs, figure stuff out, let’s say it’s 40%, right? And so if you take 40% of that, you get to essentially $10 trillion a year.
    Swyx [00:35:17]: That’s a TAM.
    Ivan [00:35:18]: That is a TAM. So that’s the TAM of the models, right? That’s not essentially ours. But you get to that size, and to be able to do that, you essentially have to give agents these computers with the legacy apps. So computer use, either Mac or Windows or Linux. Linux we obviously have, and others have. But Windows specifically is something very new, and the only option right now is an EC2 with Windows, or Azure. Both of them take anywhere from three to five minutes to spin up. We’ve created an actual sandbox, so it’s a second instead of milliseconds, but you have point-in-time snapshots, you have forking, you have all the things that you have from a sandbox, and it essentially enables you to hopefully unlock all this value. And so that’s been our big push and bet, but we’ve kept our ear to the ground: what are the next things in the market?
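The back-of-the-envelope sizing above reduces to two multiplications. The sketch below reproduces it using only the figures quoted in the conversation: roughly 50 trillion in worldwide knowledge-worker salaries, about half of it in the top five sectors (56% rounded down, as the speaker does), and an assumed 40% automatable share for agents against the 25% quoted for RPA. The variable names are made up for illustration.

```python
# All amounts in trillions of US dollars per year, as quoted in the conversation.
worldwide_salaries_t = 50.0  # aggregate knowledge-worker salaries worldwide
top_sector_share = 0.50      # "about 56%", rounded down to "about half"
rpa_share = 0.25             # share of work quoted as automated in the RPA market
agent_share = 0.40           # the speaker's assumption for more capable agents

top_sector_t = worldwide_salaries_t * top_sector_share
rpa_baseline_t = top_sector_t * rpa_share
agent_tam_t = top_sector_t * agent_share

print(f"top sectors: {top_sector_t:.2f}T")
print(f"at the RPA rate: {rpa_baseline_t:.2f}T")
print(f"at the assumed agent rate: {agent_tam_t:.2f}T")
```

Using the unrounded 56% instead of one half would raise the final figure from 10 to 11.2 trillion, so the estimate is only as precise as the rounding in its inputs.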
    RPA Returns: Why Agents Still Need Computers
    Swyx [00:36:06]: Yeah, knowledge work, and the next wave of RPA. I got very excited about RPA during COVID times. UiPath was IPO-ing, and it was very hot. Isn’t it Eastern European?
    Ivan [00:36:20]: It is, Romanian.
    Swyx [00:36:21]: Romanian? Yeah, it might be the only big Romanian unicorn. I think there’s a stage being set for the resurgence of RPA, ‘cause everyone understands that no one wants to deal with these shitty apps and no one’s gonna rewrite them. You just have to do remote, programmatic operation of them.
    Ivan [00:36:45]: If you wanna unlock it, my own setup was basically the following. I was doing a board deck recently, last month, whatever, and I’m like, “Okay, let’s just do it automated.” All our data’s in ClickHouse and PostHog and QuickBooks, where everyone else’s is, and I basically connected that all to my Claude Code. Like, go off, here’s the integrations, go do that. It pulled out the first report, which was great. It connected to Brex and all these things, pulled it, which was great, and then I say, “Okay, now pull out this, and this,” and I kept getting really well-designed, McKinsey-style reports, but the data said partial data. Missing data, partial data. It can’t access all the things, and I got so frustrated. So I got my Mac Mini virtual sandbox with OpenClaw. I gave it its own account in our company, and then I went to all these services and created a read-only account, so literally like an intern in your company. And I would say, “Now go and do this report,” and it would get the same, like, “I can’t via the MCP or the API or whatever. I can’t get all the information.” I’m like, “Go log in.” And it will log into the website, go in, export the data, and do the thing end to end. So even for things that have APIs today, not all of it is exposed. I get immense value right now, but it has to be computer use, unfortunately, and so I spend a bunch of tokens just on that, but I get the job done. And so if even a startup like ours, using all the hottest tools, still needs a computer use agent, what hope does Goldman have of going headless, right?
    Swyx [00:38:22]: Yeah, what a - Why isn’t Microsoft doing this?
    Ivan [00:38:27]: I’m pretty sure, Satya had a post yesterday.
    Swyx [00:38:29]: Oh, okay. I see.
    Ivan [00:38:29]: Which was like, “Every agent needs a computer.”
    Swyx [00:38:31]: I see, I see.
    Ivan [00:38:32]: So they have launched something recently.
    Swyx [00:38:34]: Yeah, they have Microsoft Power Automate, I’m sure, I’m sure, they’re gonna have their version.
    macOS Sandboxes, Apple Constraints, and the Windows Opportunity
    Ivan [00:38:39]: Version of that, yeah.
    Swyx [00:38:39]: You’re gonna try to do yours. I know there’s always demand for Mac, but I know it’s tricky to host macOS sandboxes.
    Ivan [00:38:49]: We will have macOS sandboxes fairly soon. The problem with macOS sandboxes is... I’m deep in this, I don’t know how interesting it is.
    Swyx [00:38:55]: No, it’s.
    Ivan [00:38:56]: MacOS has this problem.
    Swyx [00:38:57]: It’s a licensing thing, right?
    Ivan [00:38:58]: Licensing thing. So one, you’re allowed to run only two parallel VMs per machine. Two, you can only license to a different user every 24 hours. So theoretically, if I wanna charge you per second and I charge you one second, I have to have it idle for the rest of the day. I can’t have anyone else using it. So the pricing will be different in the sense that we would have to charge for 24 hours, and that’s not even the most difficult thing. The thing above that is, from a security perspective, they enable you to do memory snapshot, pause, resume, but only on the same physical machine. And so what you can do in the Windows world or Linux world is that I can move your snapshot from one machine to another in the background and manage load, right? Here, if you wanna do that, you essentially have to have your.
    Swyx [00:39:49]: Yeah, snapshots. Yeah.
    Ivan [00:39:50]: Your.
    Swyx [00:39:51]: It’s like.
    Ivan [00:39:51]: Physical machine.
    Swyx [00:39:52]: You can’t break it up.
    Ivan [00:39:53]: You can’t move things around, and that part is there from a security standpoint, as it is written. I understand the security aspect of that, but it prevents you from doing these really scalable agentic workloads.
    Swyx [00:40:08]: You need to do a vibe-coded, clean room implementation of macOS that you can then... That’s like Clean OS or something. I don’t know.
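The two licensing constraints mentioned above (two parallel VMs per physical machine, and a 24-hour minimum before a machine can serve a different user) translate directly into billing and fleet-size arithmetic. The sketch below is a simplified model of the constraints exactly as they are described in the conversation; the function names are hypothetical, and the real licence terms may contain details not covered here.

```python
import math

MAX_VMS_PER_HOST = 2   # parallel macOS VMs per physical machine, as stated
MIN_LEASE_HOURS = 24   # a machine can switch to another user only every 24 hours


def billable_hours(used_seconds: float) -> int:
    """Round usage up to whole 24-hour blocks, with a one-block minimum."""
    blocks = max(1, math.ceil(used_seconds / (MIN_LEASE_HOURS * 3600)))
    return blocks * MIN_LEASE_HOURS


def hosts_needed(concurrent_vms: int) -> int:
    """Physical machines required to run this many VMs at once."""
    return math.ceil(concurrent_vms / MAX_VMS_PER_HOST)


# One second of use still occupies the machine for a full day.
print(billable_hours(1))
# A 10,000-way parallel run needs thousands of physical machines.
print(hosts_needed(10_000))
```

This is why per-second pricing does not carry over: the provider's cost floor is a machine-day regardless of how briefly the sandbox ran.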
    Ivan [00:40:17]: So. We have.
    Swyx [00:40:18]: ‘Cause Linux was originally like a clean room rewrite of Unix.
    Ivan [00:40:21]: Okay. Yeah.
    Swyx [00:40:21]: Or something like that, right? Like same thing to macOS. Someone needs to do it.
    Ivan [00:40:25]: Someone will do that, and someone will have some long-running agents for a few days to figure this stuff out. But yeah, we’re really close to offering something, ‘cause people do want it, but the pricing will be different, and the feature set will be sort of stringent.
    Swyx [00:40:38]: Yeah, nobody’s gonna use this. Well, the labs will, because they want to automate macOS.
    Ivan [00:40:42]: They have to do RL. But the point with the RL part is, if you do RL on macOS, then when the next iteration of the model comes out, it will be able to use these tools significantly better. Then you actually need to run that somewhere. So you’re gonna have to have that later on. And if anyone at Apple is listening, I very much feel that they are shooting themselves in the foot on the scale of compute or licensing revenue they could get if they would just enable a concurrency model similar to what you can get on Windows and Linux.
    Swyx [00:41:17]: Yeah. And I’m sure they’ve heard this before. They just don’t care. Maybe they will change their mind with the new CEO.
    Ivan [00:41:24]: Yeah. We’ll see.
    Swyx [00:41:25]: We’ll see.
    Ivan [00:41:25]: High hopes.
    Swyx [00:41:26]: High hopes.
    Ivan [00:41:26]: High hopes.
    Swyx [00:41:27]: Okay. But it’s very clear the market opportunity is huge in Windows, and you can go for a long time on just Windows, but your customers are gonna want both. And it is interesting to me that this is the sort of God application of agents, right? How big was OpenClaw for you guys? Was there a significant bump?
    OpenClaw, Agent Labs, and the B2B2C Sandbox Market
    Ivan [00:41:54]: Not for us because we.
    Swyx [00:41:54]: Because you already.
    Ivan [00:41:55]: We’re kind of positioned differently. Although it’s completely PLG and we have individual developers that use it, most of the users of Daytona are either B2B or B2B2C. In the researcher world, it’s B2B, so you’re selling to labs and neo labs and things like that. But on the long-running agents, from a scale and revenue perspective, it’s mostly B2B2C, where you have an app-layer agent that uses you at a big scale.
    Swyx [00:42:26]: Like a Manus. Yeah.
    Ivan [00:42:28]: Like a Manus Lovable type of thing.
    Swyx [00:42:31]: Yeah. B2B2C is basically what I’ve been calling an agent lab, which is kind of like: you’re not a model lab, but you’re making a very good wrapper that is a platform that other people can sign up for so they don’t have to code those things. It sounds like a much better market than the direct OpenClaw market.
    Ivan [00:42:56]: I’ve done multiple things. The CodeAnywhere part of our career path was very much an end-user developer product. And so that is great. You can get a lot of developer love, and I feel that we do as a company have a bunch of developer love. But it’s a different type, where it’s people building these things. Again, it’s more akin to a Twilio, because as a person, you wouldn’t run Twilio. I don’t know how many people remember, it was like the “Ask your developer” billboard and whatnot. And people really love Twilio, but they only used it inside of, “Oh, I’m building this app or service or thing.” And so we’re very much directed at that. And you also know that I used to work for a competitor of Twilio, so it’s kind of ingrained in my DNA.
    Swyx [00:43:35]: People don’t know Infobip is that big.
    Swyx [00:43:35]: People don’t know InfoBip is that big.
    Ivan [00:43:38]: Yeah, it’s.
    Swyx [00:43:39]: Because.
    Ivan [00:43:40]: It’s a billion euro.
    Swyx [00:43:40]: They’re all American. They’re like, “Whatever’s in Europe doesn’t matter to me.” But like it’s the, it’s the same size or bigger? Same size?
    Ivan [00:43:46]: It’s about half the size.
    Swyx [00:43:47]: Half the size?
    Ivan [00:43:48]: Yeah, about half the size.
    Swyx [00:43:48]: It’s like, yeah.
    Ivan [00:43:48]: Still huge. Multiple billions a year. Yes.
    Swyx [00:43:51]: That’s crazy.
    Ivan [00:43:51]: Exactly. These are really interesting, large revenue-generating, very sticky businesses. Whereas when your focus is the end developer, it is a very hard sell, because they’re very price sensitive, very price conscious. And it’s very hard to scale. Your cap is the number of people that, first of all, wanna spin that up, and then spin up multiple of these. Whereas in the enterprise, everyone’s talking about how many tokens they’re spending. A lot of companies today are like, “Spend as much as you can.” Basically that is where we’re going. And so if you think about that paradigm, where you’re selling to companies that say, “Spend as much as you can to generate productivity,” versus, “Oh, I’m a single person. I have this much budget, and I’m doing this thing because it’s fun or it’s helping me out or whatever,” it’s a different go-to-market strategy, I think.
    MCP, CLIs, and Sandboxes as the Agent Runtime
    Swyx [00:44:50]: Yeah, there’s a lot of discussion. I’m just kind of going through like the mental list of things that are in your favor, which is, for example, MCP versus CLI. Like obviously you want CLI. It’s been very good for you. I feel like it’s maybe a drop in the bucket or maybe it’s huge. I’m just checking whether it’s like these are big trends.
    Ivan [00:45:10]: Those things work well in our favor, to your point, just because every.
    Swyx [00:45:13]: They’re kind of drop in the bucket, right?
    Ivan [00:45:15]: I think it’s sort of all the things coming together. There’s so many things that impact that. To your point, OpenClaw wasn’t huge for us, but having the agent SDK from Anthropic, or Claude Code, was very interesting. The reason why it was interesting is that a lot of, I don’t know what to call them, app-layer agent companies are like, “Oh, I can create this new app, this new agent. I just use Claude Code, and I throw it into a sandbox, and then I have my interface to the human on top of that.” And so that enabled so many more companies to actually offer this, and then they would pull on sandboxes. So that was interesting. And to your point on MCP versus the CLI: the MCP is an interface against an API, whereas with the CLI you can actually go do things. It’s the difference between integrations and actually running scripts or analysis against a thing. So being able to use a CLI very well enables the agent to do more things. People will invoke a sandbox, run it in the CLI, and it’ll do analysis on that data and then give you an actual result, versus just pulling data from an API source.
    Swyx [00:46:29]: Yeah, it’s a layer of indirection basically. It’s the same thing as agentic search versus RAG, where you’re.
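The distinction drawn above, between pulling data through an integration and running a script against that data, can be shown with a small stand-in. In this sketch a temporary directory and a child Python process play the role of the sandbox; that is an assumption made to keep the example self-contained, and it provides none of the isolation a real sandbox would. The data and function names are invented.

```python
import csv
import subprocess
import sys
import tempfile
from pathlib import Path

# Made-up monthly figures standing in for data behind some API.
ROWS = [("2026-01", 120), ("2026-02", 180), ("2026-03", 150)]


def fetch_rows() -> list[tuple[str, int]]:
    """Integration-style access: the caller only gets raw rows back."""
    return ROWS


# The script an agent might write and run itself in a CLI-style workflow.
ANALYSIS_SCRIPT = """
import csv
with open("revenue.csv", newline="") as f:
    values = [int(row["amount"]) for row in csv.DictReader(f)]
print(sum(values) / len(values))
"""


def analyze_in_scratch_dir(rows: list[tuple[str, int]]) -> float:
    """Write the data to a scratch directory and run a script against it."""
    with tempfile.TemporaryDirectory() as workdir:
        path = Path(workdir) / "revenue.csv"
        with path.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["month", "amount"])
            writer.writerows(rows)
        result = subprocess.run(
            [sys.executable, "-c", ANALYSIS_SCRIPT],
            cwd=workdir, capture_output=True, text=True, check=True,
        )
    return float(result.stdout.strip())


print(fetch_rows())                            # raw data only
print(analyze_in_scratch_dir(fetch_rows()))    # a computed result
```

The second call is the pattern the speaker credits for sandbox demand: the agent needs somewhere to execute the script, not just an endpoint to read from.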
    Ivan [00:46:34]: Exactly, yeah.
    Swyx [00:46:34]: Just like you just win whenever people put more agents into their workflow. And so like it doesn’t really matter, but I’m just kinda teasing out like what else have people heard about that like it’s sort of, “Oh yeah, this is another sandbox use case. Oh yeah, that’s another one.” Am I, am I missing any big ones?
    Ivan [00:46:51]: The thing is the computer use stuff, which I think is probably the most interesting one. And to your point, we’ve talked to so many people over the last year. It’s like, “Oh, why do you need a sandbox? Why do you need this?” And it’s like, “Oh, I need a sandbox for this. I need a sandbox for that.” It’s like, “Oh, I need it for every single thing.” And I sound like a broken record, but you use a laptop every single day, right? And you are n of one. It’s just you. And by the way, the PC market is about equal to the cloud market in total. So it’s about 150, 180 billion a year. Something like that. Roughly, the three cloud hyperscalers are about equal to Apple, HP, Lenovo, whatever. It’s a little bit less, but it’s sort of like that. And now imagine, how big is the addressable market? How many people are there in the world now? What’s the latest data?
    Swyx [00:47:45]: Let’s call it eight billion.
    Ivan [00:47:46]: Eight billion. And let’s say you can have two computers, like one personal and one business, whatever. So it’s double that, right? That’s 16 billion. How many agents are gonna be running in two years, in 10 years, in 100 years? And for every single task, they will need one of these. So how big is that? That market is essentially, quote unquote, “infinite”. You will get to the point, and Dylan Patel from SemiAnalysis, who usually talks about GPUs, was at the conference talking about this, where CPUs will now be a bottleneck because they will be the constraint. We won’t be able to grow, or we won’t be able to have enough of these, because there won’t be enough CPUs.
    Swyx [00:48:23]: Yeah. Well, I actually had a really good podcast with Doug O’Laughlin, who is president at SemiAnalysis, where they’ve basically been like, yeah, it was a GPU shortage first, but then it’s cascaded down to memory and now to CPUs.
    Ivan [00:48:35]: CPU, yeah.
    Swyx [00:48:35]: What’s next? Networking. Networking actually has been in shortage for a while if you’re looking at just GPU networking. But yeah, it’s really crazy the amount of computer use that’s going on. Cool. Other questions: one very big part is the open sourceness, which you didn’t have to do and your competitors don’t do. A lot of people are worried about keeping their projects open source because some competitor can just fork it. I don’t know if there are any reflections on just being an open source company.
    Open Source, Trust, and Enterprise Procurement
    Ivan [00:49:15]: Yeah. There’s a bunch. So the original product that we did was open source.
    Swyx [00:49:19]: Yeah. CodeAnywhere.
    Ivan [00:49:20]: So doing that was actually very good for us. There’s basically a saying. What’s the saying? Companies that are doing really well measure themselves against free cash flow, the ones that are kinda okay, it’s EBITDA, and then it goes all the way down.
    Swyx [00:49:36]: The worst is like GitHub stars.
    Ivan [00:49:37]: GitHub stars. GitHub stars are the worst, yeah. So you go all the way down to GitHub stars. And so our original one was GitHub stars. That’s what we talked about. Now we’re at the point where we’re talking about revenue, so we’ve gone up the stack on that. And so we started.
    Swyx [00:49:47]: No, profit.
    Ivan [00:49:48]: Yeah. We’ll get there. We’ll get there. But basically at that point we did stars on GitHub and it was useful. In the original variation, we split the core into its own repo and it was Apache 2.0, so very permissive. And then we would bundle that on the enterprise side with a proprietary repo. So it was like open core, but the repository was very clean. When we did the pivot, we didn’t have time to rethink this. We had this open source community, and it felt a shame not to do that, but we still did want to add some restrictions. So in the new sandbox product we went with AGPL 3, which is kind of a shortcut way to do that while staying open source. And it is true open source in the sense that an enterprise can use it if it wants, but you essentially can’t make a competitor without open sourcing your stuff, which.
    Swyx [00:50:42]: It’s one of three approaches. There’s BSL and some of the others, like the Elastic License.
    Ivan [00:50:47]: Yeah. There are some others there. Pure open source believers agree that this is not full open source, and I totally respect that. That is absolutely true, but we did leave that. And Daytona, in its essence, everything outside of what’s under a feature flag today, which is like the Windows stuff, GPU stuff, and whatever, is in this open source. Everything is there, like our own scheduler. I’ve had some competitors say, “You guys are actually open source open source. Like, you’re real. You can actually see that.” And people do like that, and it has helped a bit, but it has helped more in the consumption of our cloud product than in actually transferring people over. The reason is you can send the repository to your agent when you’re integrating Daytona, and it just has more context. It’s like, “Oh, okay. This is why this is happening. This is why that.”
    Swyx [00:51:41]: You could equivalently just have docs that you can Yeah, so, okay.
    Ivan [00:51:45]: I agree. And to be fair, it actually doesn’t really help the growth significantly today. We’ve had this conversation with investors and other people. It’s like, “How do you convert people.
    Swyx [00:51:56]: Dude,.
    Ivan [00:51:56]: From open source?”
    Swyx [00:51:57]: The open source business conversation is so all over the place, right? For listeners who maybe haven’t thought this through, a lot of people say, “Oh, it’s our free tier,” right? Like, “Run it yourself, but when you get serious, call us.” Right? And then for me personally, because of my Temporal experience, it’s actually the GTM into some of the largest companies, where we wouldn’t pass their review process, maybe because we’re too young of a company, or there are parts of the stack that just don’t work with them. But because it’s open source, they adopt it, and then later on we figure it out. So that’s the low end and the high end. I don’t know if it.
    Ivan [00:52:37]: No, absolutely, and that has been the case historically. The thing that we have found in this AI transition, and we haven’t talked about this, is that Daytona’s customers are everything from the single developer and the YC startup to, people say Fortune 500, I’ll say Fortune 5, like the biggest companies in the world.
    Swyx [00:52:55]: Big Neo labs. You told me about the, we’re gonna keep them anonymous.
    Ivan [00:52:59]: All the enormous companies, right? And because the market pull is so strong, we’re able to circumvent these processes. To be clear, we pass security audits, we pass all these things. But as you mentioned with Temporal, way back in the day, in our old version of Daytona, it took us months, and usually at the end they would churn off because, “Oh, you’re too small of a company. We don’t trust you enough.” Whereas today we’ve had these large companies push us through. Usually when you go through procurement to become a vendor of large companies, it takes you like two, three months. We get it done in five days now. And this is not saying that we’re great. It’s more, I think, a sign of where the market is today. So the open source is something that we, from a go-to-market perspective, don’t think about that much, because everything that we’ve created right now has been PLG through the cloud product, people signing up and just pulling us inwards.
    GitHub, Agent-First Versioning, and CI Bottlenecks
    Swyx [00:53:53]: Yeah, this is a personal interest, and I don’t know if you have an answer, but, do you have problems with GitHub?
    Ivan [00:54:02]: I do. A little bit. A little bit.
    Swyx [00:54:04]: Yeah. Tell me, tell me. ‘Cause I’m thinking about, well, okay, what would it take to replace GitHub?
    Ivan [00:54:09]: There’s a lot of things. I’ve thought about this, and I’ve talked, I’ve tweeted about this, and I looked at some. I’ve actually invested personally in some.
    Swyx [00:54:17]: Is it, Entire?
    Ivan [00:54:18]: No, I haven’t done it.
    Swyx [00:54:18]: No? Okay.
    Ivan [00:54:19]: Yeah, and I’ve met Thomas virtually and we’ve talked. And this was my reason for that: we have a bunch of background long-running agents, and for a time most of them were coding agents. Like, everyone was building a competitor to Lovable or Devin or whatnot. What we saw from our customers was that they were all trying to figure out how to do versioning. Everyone is doing it in different ways. There were some really weird ways people were doing that, and the reason was that GitHub as is was an overhead. It wasn’t fast enough for what they needed, and it didn’t solve the problem that they needed solved. And to be fair, GitHub is for after the inner loop, right? It is post your laptop, right?
    Swyx [00:55:07]: Yeah, GitHub is the point at which the outer loop starts.
    Ivan [00:55:11]: So people started using that for sandboxes, which is the inner loop, which is usually on your laptop, right? That is not what it’s made for. Actually, the most interesting one is we had one customer that would literally take the entire code base inside the sandbox, and every so often, I forgot what the time interval was, they would just dump it all into a JSON and push that to S3. And that’s it.
    Swyx [00:55:37]: Make your own Git.
    Ivan [00:55:38]: But there aren’t even diffs. It’s just the whole thing every single time, because it was super fast. It didn’t matter. And then they would go back and search and find what the file was and write it, and whatnot. Because it’s text files, it’s JSON, they’re very small, so the network cost is very low, and they didn’t care, and they just did it that way. And I’m like, if people are doing this, that means there needs to be a new solution to this problem, right? And so for me, it’s quite interesting to look at who is building these types of new things. Agent first. I think Git as is still exists in the future, maybe even GitHub exists, but there will be a whole new sort.
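    The whole-tree approach that customer used can be sketched roughly as follows. This is an illustration only, assuming a small tree of UTF-8 text files; the function names and the `{"files": ...}` layout are hypothetical, and the upload to object storage is left out so the example stays self-contained.

```python
import json
import tempfile
from pathlib import Path


def snapshot_tree(root: Path) -> str:
    """Serialize every file under root into one JSON document (no diffs)."""
    files = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Keys are paths relative to the root, with forward slashes.
            files[path.relative_to(root).as_posix()] = path.read_text(encoding="utf-8")
    return json.dumps({"files": files})


def restore_file(snapshot: str, relative_path: str) -> str:
    """Look up one file's contents in an earlier snapshot."""
    return json.loads(snapshot)["files"][relative_path]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "src").mkdir()
        (root / "src" / "app.py").write_text("print('v1')\n", encoding="utf-8")

        blob = snapshot_tree(root)
        # In the setup described above, this blob would be pushed to S3 on a timer.
        print(len(blob), "bytes in snapshot")
        print(restore_file(blob, "src/app.py"))
```

    Every snapshot is a full copy, so storage grows with the number of snapshots rather than with the size of each change. That is the trade Ivan describes: no diffing logic at all, paid for with redundant bytes that are cheap when the files are small text.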
    Swyx [00:56:15]: Yeah, exactly. Git is like the deploy artifact to kick off CI/CD. But then there’s a layer before that is like the agent collaboration layer.
    Ivan [00:56:23]: Yeah. And so I think something needs to be said there. Another interesting thing is just CI right now. The amount of PRs being created is insane right now, right? In general.
    Swyx [00:56:33]: Even for you guys, right?
    Ivan [00:56:34]: Everyone’s creating a bunch of PRs. Everyone. And then all that has to go through CI, and that’s the bottleneck. Everyone’s bottleneck. Not just Actions; go to any CI provider, and if you have a high throughput of PRs, you will not be able to keep up. There’s one company we’re talking to, they do 1,000 PRs a day. And they’re just waiting. They just have a queue on that, right?
    Swyx [00:56:55]: What do they use, Buildkite?
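    A back-of-envelope calculation shows why that volume turns into a queue. The sketch below is illustrative only: the 1,000 PRs a day comes from the conversation, while the 30-minute CI run, the runner count, one CI run per PR, and evenly spread arrivals are all assumptions.

```python
MINUTES_PER_DAY = 24 * 60


def average_busy_runners(prs_per_day: float, ci_minutes_per_pr: float) -> float:
    """Average number of runners kept busy (arrival rate times job duration)."""
    return prs_per_day * ci_minutes_per_pr / MINUTES_PER_DAY


def backlog_growth_per_day(prs_per_day: float, ci_minutes_per_pr: float, runners: int) -> float:
    """PRs added to the queue each day when the runners cannot keep up."""
    capacity_per_day = runners * MINUTES_PER_DAY / ci_minutes_per_pr
    return max(0.0, prs_per_day - capacity_per_day)


if __name__ == "__main__":
    prs = 1000          # from the conversation
    ci_minutes = 30     # assumed average CI duration per PR
    runners = 10        # assumed fixed runner pool

    print("runners busy on average:", round(average_busy_runners(prs, ci_minutes), 1))
    print("backlog growth per day:", backlog_growth_per_day(prs, ci_minutes, runners))
```

    Under these assumptions the pool needs roughly 21 runners busy around the clock just to break even, and real PR traffic is bursty, so the peak requirement would be higher than the average.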
    Ivan [00:56:58]: I don’t know what they.
    Swyx [00:56:59]: Circle?
    Ivan [00:57:00]: They’re, whatever.
    Swyx [00:57:00]: Technically your tech can be used for CI.
    Ivan [00:57:03]: That’s, that was the conversation. That was the conversation.
    Swyx [00:57:06]: Is that a serious conversation?
    Ivan [00:57:08]: We’ll see how that goes. We’ve had quite a few conversations around that. We are not a CI provider by any means, right?
    Swyx [00:57:13]: But what’s missing?
    Ivan [00:57:15]: No, so essentially.
    Swyx [00:57:17]: Nothing.
    Ivan [00:57:18]: Essentially you could use a Daytona sandbox instead of whatever you use for your GitHub runners.
    Swyx [00:57:27]: Yeah. The only thing I would say is maybe CI machines are supposed to be very cheap, maybe the low end, because CI is supposed to be non-blocking, something like a background job. The urgency is not that important for CI.
    Ivan [00:57:45]: Performance is, though. Performance is, yeah.
    What Sells Daytona: Responsiveness, Support, and Customer Trust
    Swyx [00:57:48]: Yeah, okay, that is interesting. Before we leave Daytona and go into sort of broader founder takes and what have you, are there any other Daytona elements that are interesting that we haven’t touched on?
    Ivan [00:58:04]: Interesting Daytona things. There’s, there.
    Swyx [00:58:06]: I can, I can give you more prompts if you want.
    Ivan [00:58:07]: Yeah, I’d love more prompts, actually.
    Swyx [00:58:09]: Okay. So when startups evaluate you, you have all these names, and you have more that you can’t even name, and they see your whole wall of competitors. And yeah, you have differentiation versus many of these, but what sells them?
    Ivan [00:58:26]: The thing that we found that sells people the most is maybe more of a day two thing than a day one thing. And we’ve seen this again and again. We have a bunch of case studies, and a bunch more still coming out. They’re all done by a third party, so we don’t do the case studies, and it’s actually interesting to watch those interviews. They’re recorded, and because it’s a third party, people are actually more open, and they will tell you, “Oh, we use this competitor,” or, “We like this competitor more,” or whatever. And the number one thing that people come back to us for is that we have an insane responsiveness.
    Swyx [00:58:57]: In terms of your team?
    Ivan [00:58:58]: In terms of the team, yeah. Insane responsiveness has been by far the number one. Now, we can talk about features and breadth of product and concurrency and CPUs and all those things, but if all other things are equal, that is very much a differentiator, I’ve found. And I didn’t know.
    Swyx [00:59:15]: Is that entirely Slack or Slack plus email?
    Ivan [00:59:18]: It is, there’s email there as well, there’s calls, but the vast majority is like on Slack. So it’s Slack. Like, we have had customers like, “Hey, we have a problem. Can you get on Huddle?” Like, we will get on that Huddle like in five minutes, literally. I’ve done this multiple times, so yeah.
    Swyx [00:59:31]: Wait, okay, so how big are you?
    Ivan [00:59:33]: 25 today.
    Swyx [00:59:34]: How do you do this kind of support like this?
    Ivan [00:59:36]: We’re insane. We don’t sleep. 007, have you heard the new thing?
    Swyx [00:59:40]: 007. I’ve met your team. They’re very impressive, they’re very dedicated, but also, how do you get a team to do that?
    Startup Culture, Family Tradeoffs, and Enjoying the Pain
    Ivan [00:59:48]: So there’s.
    Swyx [00:59:49]: I have Slack exhaustion?
    Ivan [00:59:51]: Yeah, we all have Slack exhaustion. We’re very tired. The thing that is unique, I don’t know about unique to us, but unique to any successful serial founder, is that you’re able to pull in people that you’ve worked with before, and you can’t do that as a first-time founder. I couldn’t have done that. But of the 25 people in Daytona, I think about 13 of them we have worked with for seven-plus years. So it’s high trust, high throughput, and we know what we’re signing up to do. And especially, these people worked with us when we were starting, when we were actually hustling, hungry-for-food-level hustling, and so those are the people that work with us. Now, the new segment that has come in is almost all one degree of separation, so it’s someone that someone has known, and they sort of come into this org. And we’ve had people that have not fit into the org as well. It’s a type of culture where there is a high expectation of being online and replying to these things, and I do that first. If you ask any engineer, they’re like, “You never sleep,” about me. And I don’t do it as an example. That’s just how I’m wired. My wife doesn’t appreciate that, I have to tell you. I told her about 996, she said, “I wish.”
    Swyx [01:01:09]: It’s like these Chinese people are slacking.
    Ivan [01:01:13]: Yeah. So, that is something there. And so I think every company has their own culture, and that’s something very deep in ours. And it’s something that’s come up again and again, and every single day we’re reminded about that. And I didn’t go out thinking that is how I’m gonna build it. It’s just how I’ve built these things right now.
    Swyx [01:01:29]: Yeah. So okay, I’ll transition a little bit to the founder side. I’m very impressed by you in general, by your sort of balance. You have a young family.
    Ivan [01:01:38]: Two kids, yeah.
    Swyx [01:01:39]: Two kids now.
    Ivan [01:01:40]: Yeah, two kids now. Yeah.
    Swyx [01:01:41]: I think a lot of people I meet, they’re like, “Oh, I’m starting a family. I can’t be a founder,” and all that, what’s your advice to those people?
    Ivan [01:01:48]: Everyone has their own. It’s hard. So my family, they’re here right now, but usually I fly between Croatia and here. A lot of our team is in Croatia. A part of our team, and it’s growing, is here now in San Francisco. And so I spend a lot of time away from my family, and that is hard. That is a sacrifice that you have to make. People say, on your deathbed, you’re gonna miss some of those things. And that probably might be true, but going into this, I already said, I know that this is gonna hurt, and everything has to hurt. By the way, I’m very much of a feeling that everything has to hurt. Going to the gym hurts. Losing weight hurts. Like, everything has to hurt, right? It does. Like, we all.
    Swyx [01:02:32]: No pain, no gain.
    Ivan [01:02:33]: It is literally, but you actually have to enjoy the pain, and if you don’t enjoy the pain, it’s not for you. And so you get accustomed to that pain. I love the kids. I have a daughter and a son. My daughter is the eldest; I love her and do miss her when she’s not here, but that’s what I signed up for, and there is a plan and a target for what I’m trying to achieve. And now hopefully with my wife, who does support me, we can get ourselves together more. But she takes a large portion of that. And so if you have a partner on the other side that is okay with that, then you can do that. But even if they are, you have to be okay with not being there, right?
    Swyx [01:03:11]: Yeah. This is my vision for you, this meme.
    Ivan [01:03:15]: Yeah. I.
    Swyx [01:03:15]: That’s your kids in the future.
    Ivan [01:03:18]: Yeah, I think.
    Swyx [01:03:18]: It’s like this,.
    Ivan [01:03:18]: We have to teach them that they’re not rich.
    Swyx [01:03:19]: Because Dad, built the compute sandboxes.
    Ivan [01:03:21]: Yeah, you built compute sandboxes. Dad made sandboxes. Dad made sandboxes.
    Swyx [01:03:25]: Built the spiritual successor to serverless and Kubernetes, for agents. Any other sort of hot topics, trends? You have a lot of hot takes. Actually, what you are best known for: you were sort of in hustle culture mode, right? And someone quoted you and said, “I haven’t even heard of you, bro. Just log off and take the Christmas off.” And then your response was?
    Ivan [01:03:53]: Oh, my response was, “That’s why I can’t.”
    Swyx [01:03:56]: I think that’s very typical of you. I don’t have it here. I can’t bring it up. But I think that’s very typical of the culture, and I think you have a lot of interesting hot takes like that. Any other takes on the startup ecosystem?
    SaaS Token Resellers, API Revenue, and Startup Hot Takes
    Ivan [01:04:11]: Oh, yeah, the startup ecosystem. This was the recent one, and this is general business. It didn’t come off well on Twitter, I think. Some people at least misread it. Which is: the market is adding a premium to SaaS vendors that are reselling tokens. And I think that’s incorrect.
    Swyx [01:04:34]: Why?
    Ivan [01:04:35]: So why I think that’s incorrect is, one, your pricing depends on what the price is, whether it’s public market or private or whatever. The person reading that is saying the re-accelerated revenue is equal to the old revenue, which it’s not even close. Because on SaaS, you had typical SaaS margins, whatever they were, right? Stickiness and all these things. Now what you’re doing is saying, “Here is my agent, and I have whatever the margin is.” It’s way worse, right? And now you’re using Anthropic or OpenAI or whatever through me, the SaaS product, and then we as a community are saying that is re-acceleration. So one, I think that’s wrong because it’s not the same. The makeup is not the same. The other thing, going back to what I mentioned earlier, the Kua and how I set up OpenCloud and whatever: I don’t want your agent, essentially. Because right now we have a problem, and this has historically been the case, that you have data siloed in, again, ClickHouse, QuickBooks, it’s all siloed, and now you’re giving me an agent that’ll give me the data, but it’s still siloed, right? And so now I have to take that data and then get another agent.
    Swyx [01:05:52]: Just expose the data to my agent.
    Ivan [01:05:53]: Just expose the data. Just expose it. I’m like, “Just expose everything and charge me for that.” So charge me for consumption of the API. You’ll have your old seat-based pricing for humans. Charge me for this. The number of agents will skyrocket, and essentially you’ll have more usage, and you can charge more if your product has value. There are arguments about whether some of them do have value. It’s a database, not a database. We can get into that. But some of them really do, and I was actually shocked that the first person to do this was Benioff.
    Swyx [01:06:24]: Salesforce, yeah.
    Ivan [01:06:25]: Sales.
    Swyx [01:06:25]: Agentforce?
    Ivan [01:06:26]: There was a tweet, I think three days ago, where he said every product in Salesforce has been exposed via an API.
    Swyx [01:06:33]: Wow.
    Ivan [01:06:33]: Everything. And I’m like, now I understand why this person has built.
    Swyx [01:06:38]: This guy’s king.
    Ivan [01:06:38]: This insane. Kudos to him. Amazing. It’s like, thank you. I don’t know if you listened to me or someone else, but thank you. This is the direction of the world, and if you can get real acceleration against that, against consumption of API, that is actual revenue, and that is actual real acceleration, and that is where value comes from. And I think that there will be a cold shower when people understand no one’s actually gonna use and pay for these agents and tokens, and that wasn’t actually really a solution, and it’ll drop back down.
    Swyx [01:07:05]: Yeah. Yeah, look, obviously, I think generally correct, and I agree. I think - But people are going to try to become an AI company.
    Ivan [01:07:15]: No, absolutely. And nothing against that. To be very clear, this is not a downer on anyone that’s building this thing. Everyone has to get to the revenues, get to the multiples, get the valuations, do what you have to do to get to the next step. Absolutely agree. But we, as a community, are now saying, “Oh, this is the magical way to get out.” This is not. That is not what is happening, right?
    Swyx [01:07:35]: Yeah. No, I think, there was like this kitchen appliance company that put out some AI nonsense recently.
    Ivan [01:07:42]: It was also the sneaker as well. It was called Allbirds.
    Swyx [01:07:44]: Allbirds. No, Allbirds is pivoting to GPU. That’s fine. It’s like, I have some money left, I’m just gonna do some lottery tickets. Would you go into offering GPUs?
    GPU Sandboxes, Data Centers, and Bare Metal Economics
    Ivan [01:07:55]: Oh, yeah, we will. But not for inference. Essentially, what we think about is the GPU sandbox. Just like you have a GPU in your computer, you have a GPU in the sandbox. There are workloads that do need GPUs. Again, I always go back to 3D rendering because it’s the easiest one to comprehend. But if you wanna do any type of RL on CAD or something like that, you will need a GPU in the sandbox, and so that’s coming now as well, yeah.
    Swyx [01:08:18]: How about own data centers?
    Ivan [01:08:20]: Own data centers. So we run on co-location providers, bare metal machines. We technically can run on that or on our own data center. That’s how we architected it. Today, from a gross profit margin perspective, it doesn’t make sense for us to get into that. You have to raise a large amount of capital and take on a large amount of risk for single-digit percentage points. So today, that doesn’t make sense, but we are fundamentally architected so that we can do that if we want.
    Swyx [01:08:47]: Yeah. You’re a large customer of these guys now. Do you see any opportunity?
    Ivan [01:08:51]: We will see. We will see, yeah.
    Swyx [01:08:54]: Yeah. I see a lot of people trying to do the bare metal thing. We talked to Railway the other day, and they’re also doing a very similar strategy.
    Ivan [01:09:04]: I think they’re building out something, or they have their own sort of data centers now.
    Swyx [01:09:07]: Yeah, they have majority their own data centers, but I do think they still use Equinix and all those things. So I think it’s just interesting that this model basically hasn’t changed. It’s basically a real estate model. They manage the facilities and then you do everything else. I wonder how it can be changed for the future, because the AI wave is the opportunity to reinvent everything. Anything else? Cool. I think that’s about it. I didn’t have any other topics. I think this is as good and comprehensive as it gets: if you have any questions about the compute market, sandboxing, and Daytona, this is the best place to start. Where does this go, man? Like, we’re here in April. Things are growing 75% month to month. Where are we gonna be by end of year?
    The Agent Cloud: New AWS, New Stripe, or Something Else
    Ivan [01:09:58]: It’s an insane number. I’m sort of scared to say it out loud. It’s very big, just the sandbox market. And we talked about this in general: the entire infrastructure market is growing 40%, plus or minus, month over month. Everyone is growing 40% month to month. And that’s also a hot take. It’s like, if you’re not growing 40%-ish, it’s not that - It’s just the market. You don’t have to come to work to grow that amount, basically. I’m half kidding, but that’s where it’s going. And so where does it end? We will see. The thing that I think about, at least from a CPU perspective, and GPU is even crazier, is that there’s a high probability that actually owning the CPUs beforehand will be a go-to-market tactic. As you probably do talk to a lot of GPU providers, their growth is hindered by the amount of GPUs that they have right now, right?
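    To put the month-over-month figures in perspective, compound growth can be worked out directly. The sketch below is only arithmetic: the 75% and 40% rates are the ones quoted in the conversation, while the count of eight compounding months (April to the end of the year) and the assumption that a rate holds constant are illustrative, not a forecast.

```python
def compound_multiple(monthly_growth: float, months: int) -> float:
    """Total growth multiple after compounding a constant monthly rate."""
    return (1 + monthly_growth) ** months


if __name__ == "__main__":
    months_left = 8  # assumed: April through December
    for rate in (0.75, 0.40):
        multiple = compound_multiple(rate, months_left)
        print(f"{rate:.0%} per month for {months_left} months -> {multiple:.1f}x")
```

    Held constant for eight months, 75% a month compounds to roughly 88x and 40% a month to roughly 15x, which is why Ivan hesitates to say the year-end number out loud.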
    Swyx [01:10:47]: Yeah. It’s just like, it’s whatever NVIDIA decides to bless that day.
    Ivan [01:10:51]: That’s how much they’re gonna grow, right? And the CPU market in general, be it something like Railway, for example, or Vercel or whatnot, or deployment, or the sandboxes, they’re still CPUs. So each is growing at the pace of their market, plus or minus. But it’s still not constrained by that. And so my thought is, for all of us in this market, and databases fall into that as well because databases also run on CPUs, we all have to grow as fast as we can so we can get enough CPUs tomorrow from Intel or from NVIDIA, because they have CPUs now, and everyone else later on. So it’ll be interesting when we get to that cap.
    Swyx [01:11:30]: Okay. Maybe one way I’ll phrase this is: is the potential here the new Heroku, the new AWS, or, what’s it, the new Stripe but for compute? What’s the analogy that is most appropriate?
    Ivan [01:11:48]: That’s interesting. There are analogies. So there’s new Cloudflare, but new Cloudflare is new Cloudflare.
    Swyx [01:11:54]: New Cloudflare.
    Ivan [01:11:54]: They’re actually doing a really good job.
    Swyx [01:11:56]: Cloudflare owns networking. No one can fight. It’s like, come on.
    Ivan [01:11:59]: No, they’re doing really well. What I said is in the sense that their whole agent portfolio is actually really good. I should say there are some technical limits, I think, personally, around everything being constrained under Workers. Like, Workers is their thing. But from a go-to-market vision perspective, I think they’re actually really good. I think they actually get it, unlike some other companies. And to your question: there will be an equivalent, everyone says like an AWS for AI agents, but to your answer, it might look more like Stripe than AWS, in a sense. So there will be a cloud built out specifically for agents. That cloud will have sandboxes, and it will have web search, and it’ll have databases like SQLite or Neon or whatever, specifically for agents, and other things. We are not at the end of the new infrastructure primitives for agents. There are more coming. People think like, “Oh, there’s nothing else. This is it.” There are more. We have some ideas about the next ones. We don’t have time to do them, but there are definitely more primitives being built out for agents, and there will be, I think, a cloud that runs all that together.
    Swyx [01:13:07]: Yeah. Yeah, OpenAI has said AI cloud, Vercel has said AI cloud, and you are potentially also one of the prospective AI clouds. I think it’s a very big prize to win. Well, thanks for coming on.
    Ivan [01:13:18]: Thank you for having me. It’s been amazing.
    Swyx [01:13:19]: Yeah. Okay. That’s it.


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
  • Latent Space: The AI Engineer Podcast

    Railway: The Agent-Native Cloud — Jake Cooper

    20/05/2026 | 1h 28min
    This was recorded before Railway suffered a major GCP outage on May 19, despite being a multi-AZ, multi-zone mesh ring, with HA fiber interconnects between their Metal <> GCP <> AWS, because workload discoverability was unintentionally still tied to GCP. All has been resolved with a post-mortem.
    Railway did not start as an AI infrastructure company.
    It was founded in 2020, years before agents became the default way people thought about deploying software. Jake Cooper, formerly at Bloomberg and Uber, started Railway with a simple obsession: the activation energy to ship something to production should be near zero. Push code, get a URL, iterate. No Dockerfiles, no Kubernetes manifests, no Ansible scripts stacked on Ansible scripts.
    For years, this was a slow grind. Railway spent its first 18 months hand-acquiring its first 100 users with Jake personally greeting every Discord signup on a second monitor.
    Today, Railway has raised $124m and is growing very fast. A 35-person team supports 3 million users, adding roughly 100,000 signups a week. Their bare metal data centers have a 3-month payback period vs. renting in the cloud, with 70% margins funding aggressive cloud bursting when needed. The servers they own have actually appreciated in value as RAM prices have climbed, meaning the value of their hardware now exceeds the capital they've raised.
    From rebuilding Railway’s network overlay over a weekend to moving the vast majority of workloads onto its own bare metal data centers, Jake Cooper is trying to build a new cloud for an agent-native world. In this episode, Railway’s founder and “conductor” joins swyx and Alessio to unpack why the next era of software infrastructure is not just “Heroku but newer,” what agents need that humans did not, and why the old deployment loop of Git, PRs, CI/CD, and static cloud resources may be heading for a rewrite.
    We go deep on Railway’s infrastructure stack: own-metal data centers, three-month cloud payback periods, cloud bursting, data center debt, Railpack, Nixpacks, Temporal, feature flags, Central Station, content-addressable filesystems, agent-safe production forks, and why the CLI may become more important than the canvas in an agent world. Jake also shares the founder journey behind Railway, how the company survived losing $500K/month, why it now serves millions of users with only 35 people, and why he believes the pull request is dying.
    We discuss:
    * How Railway went from a slow six-year grind to adding 100,000 users a week
    * How Railway thinks about agents as the next dominant software species
    * Why agents need version control, observability, compute, storage, and orchestration at 1000x scale
    * The economics of Railway’s own-metal data centers and three-month payback
    * How Railway uses cloud bursting while scaling its own infrastructure
    * Why data center debt can be a better tool than venture debt for infra startups
    * Central Station, Railway’s internal system for clustering customer feedback and incidents
    * Why responsible disclosure and over-communication matter for platforms
    * Why feature flags, progressive rollouts, and shadow traffic are essential for agents
    * Temporal’s strengths, pain points, and why workflows matter for agents
    * Railpack, Nixpacks, Nix, and lazy-loaded content-addressable filesystems
    * Why “cattle, not pets” may change if you can clone the pets
    * Why Railway is building a new cloud from scratch instead of copying hyperscalers
    * The solo founder path, focus, writing, and how Jake thinks about company building
    Railway:
    * Website: https://railway.com/
    * X: https://x.com/Railway
    Jake Cooper:
    * LinkedIn: https://www.linkedin.com/in/thejakecooper/
    * X: https://x.com/JustJake
    Timestamps
    00:00:00 Introduction: What Is Railway?
    00:02:07 Jake’s Path to Railway
    00:06:13 Railway’s Six-Year Growth Story
    00:08:52 Rebuilding the Business After the Free Tier
    00:11:17 Agents as the Next Software Platform
    00:13:29 Railway’s Infrastructure Philosophy
    00:15:42 Bare Metal, Cloud Economics, and the Compute Crunch
    00:17:22 Cloud Bursting and Five-Cloud Networking
    00:20:20 Data Center Debt and Infra Financing
    00:23:31 Data Centers in Space
    00:25:24 What Agents Need From Infrastructure
    00:28:24 CLIs, Canvas, and Agent-Native UX
    00:35:15 Central Station, Incidents, and Responsible Disclosure
    00:40:30 Safe Rollouts, SRE Agents, and Production Forks
    00:45:00 AI SRE, Specs, Code, and Tests
    00:48:24 Self-Replicating Infrastructure and the New Serverless
    00:53:18 Heroku, Temporal, and Workflow Engines
    01:04:07 Railpack, Nixpacks, and Lazy-Loaded Filesystems
    01:06:01 Coding Agents, Token Spend, and Roadmap Acceleration
    01:10:56 The Pull Request Is Dying
    01:12:28 Feature Flags and the Agent-Era SDLC
    01:16:15 Cattle, Pets, and Cloning Machines
    01:19:29 Solo Founder Lessons
    01:24:12 Focus, GPUs, and Building a New Cloud
    01:28:20 Closing Thoughts
    Transcript
    Alessio [00:00:00]: Hey, everyone. Welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I’m joined by Swyx, editor of Latent Space.
    Swyx [00:00:10]: Hey, hey, hey. Today we’re in the studio with Jake Cooper of Railway.
    Alessio [00:00:14]: Conductor of Railway.
    Swyx [00:00:15]: Conductor at Railway. Yeah.
    Alessio [00:00:16]: Choo-choo.
    Swyx [00:00:17]: Do you actually have that anywhere, like on your business card?
    Jake [00:00:20]: We call some of our volunteer moderators conductors. I don’t have a business card. We’re not that big yet. At some point I will. I got handed a nice business card from the Supermicro folks, and I was like, “Damn, this is pretty official.”
    Swyx [00:00:30]: Business cards are coming back.
    Jake [00:00:32]: They’re cool. They’re hip. The conductor thing is good. We’re trying to figure out what we want to call each other internally. Some people think it’s super cringe and say, “You don’t need a name for people internally.” Some people want to call each other something. We still don’t have a really good one.
    Jake [00:00:55]: We’ve got New Railcrews, Trainiacs. Nothing has stuck yet.
    Swyx [00:01:00]: I like Trainiac. Trainiac sounds good. Railwayians. For those who don’t know, what is Railway? Let’s give people a crisp definition up front.
    Jake [00:01:09]: Railway is the easiest way to ship anything. You go to the canvas, or you talk with Claude, and you say, “Deploy a Postgres instance, deploy my GitHub repository, run this code,” and you’re off to the races.
    Swyx [00:01:22]: You’ve got a nice animation on the landing page.
    Jake [00:01:24]: Thank you. None of my work, by the way. They don’t let me touch the design stuff anymore.
    Jake [00:01:25]: We want to make it trivially easy not just to deploy things, but to evolve applications over time. Most tooling right now stacks entropy on top of entropy: Docker, Kubernetes, Ansible scripts, and all these other things. If we can version all of your software and keep track of all the changes, then we can make it trivial to clone environments, fork into a parallel universe, get copies of production data, get copies of any services, make changes, validate them, and collapse them back in without reproducing everything across a staging environment.
    The Railway Origin Story: From Uber Systems to a New Cloud
    Swyx [00:02:07]: I was looking at your background: Bloomberg, Uber. Nothing immediately stands out as, “This guy is going to found the next great platform as a service.” What prepared you for Railway?
    Jake [00:02:21]: It was curiosity to keep going deeper. I started out on front-end stuff, working on Wolfram Mathematica and porting it over. Then I briefly moved to Bloomberg, then toward Uber and distributed systems, taking the Jump Bikes systems and moving them to a distributed system built on top of Cadence, the pre-Temporal Temporal.
    Swyx [00:02:44]: Which, by the way, I’m happy to talk about, pros and cons.
    Jake [00:02:48]: Totally.
    Swyx [00:02:51]: But let’s do the Railway story.
    Jake [00:02:52]: It has been a continual step of wanting an experience. Whether it’s walking up to a bike, unlocking it, and having it work frictionlessly, or something else, the depth required to make that happen follows from the experience. A lot of the work I do, and a lot of the team does, is in service of that experience. We fundamentally don’t care how deep we have to go. We will swim to the bottom of the swimming pool to get the experience.
    Jake [00:03:17]: I don’t have a physics PhD. I did an EECS degree. It has always been about figuring out the next step: how do we get there? That’s what led to starting Railway for that experience and then moving all the way to bare metal data centers. I was adding patches to the kernel this week to get the experience there because I can see how much better it can be.
    Swyx [00:03:49]: Adding patches to the Linux kernel this week?
    Jake [00:03:51]: Yeah. Not upstream. Our fork.
    Swyx [00:03:52]: That’s a flex. Railpack? No, this is different. This is the OS on top of Railpack?
    Jake [00:03:57]: No, this is an actual kernel patch. It’s always literally: what do we have to do to get that experience? Then figure it out. Anything is figureoutable.
    Swyx [00:04:10]: Would you send the patch upstream, or does it not fit other use cases?
    Jake [00:04:13]: Maybe. We have to work out the experience internally. It has to do with the storage layer we’re building for some of the agentic stuff. Maybe it’ll be useful upstream, but it’s deeply useful for us internally.
    Open Source, Forks, and Non-Deterministic Versioning
    Swyx [00:04:29]: You mentioned open source before. How do you think about starting from open source, and then coding agents letting you do a lot more from forks of it?
    Jake [00:04:38]: GitHub’s original sin is that it’s almost a series of broken pointers. You have this thing, then you clone it, and now you’ve lost the whole upstream. How do we make it trivial for people to modify really small pieces of it?
    Jake [00:04:51]: We think of Git in a discrete sense: I’ve either made a change and merged upstream, or I haven’t. What would it look like if it were percentage-based, a little more non-deterministic, or a stream of changes that users traverse as a percentage rolled out in general and then rolled all the way up?
    Jake [00:05:13]: We have the open-source kickback program and let you deploy templates because we want to make it trivial for people to version these shards over time. It solves a large problem around authentication, authorization, and security. NPM has a way to define, “Don’t take any new packages.” The ideal end state is that you roll out progressively to users with the minimum impact zone and continue rolling up. JPMorgan should probably be the last one on the patch line, for all our sakes, because our money and livelihoods are there.
    Jake [00:05:53]: It’s okay if Johnny Vibe Coder gets a broken patch because there’s so much entropy in the system that the rubber has to meet the road at some point. You have to test at varying levels.
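    The percentage-based rollout Jake describes can be sketched as deterministic bucketing: each user hashes to a stable bucket, and a patch reaches that user once the rollout percentage passes the bucket. This is an illustrative Python sketch, not Railway's implementation; the tier names, the `TIER_START` thresholds, and the function names are assumptions made up for the example.

```python
import hashlib

# Hypothetical risk tiers: the global rollout percentage at which each tier
# starts receiving a patch. Higher-risk users start later.
TIER_START = {"hobby": 0, "business": 25, "critical": 90}


def bucket(user_id: str, patch_id: str) -> int:
    """Map a (user, patch) pair to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{patch_id}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def effective_percent(tier: str, global_percent: float) -> float:
    """Share of a tier that should have the patch at a given global percentage."""
    start = TIER_START[tier]
    if global_percent <= start:
        return 0.0
    # Scale linearly from 0% at the tier's start up to 100% at full rollout.
    return 100.0 * (global_percent - start) / (100 - start)


def receives_patch(user_id: str, patch_id: str, tier: str, global_percent: float) -> bool:
    """True if this user is inside the rolled-out share of their tier."""
    return bucket(user_id, patch_id) < effective_percent(tier, global_percent)
```

    Because the bucket is derived from a hash rather than a random draw, a user who has received a patch keeps it as the percentage rises, and the "critical" tier sees nothing until the rollout is 90% complete everywhere else.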
    The Long Grind: First Users, Free Tier, and Making the Business Work
    Swyx [00:06:13]: I wanted to pull up this glorious chart, which is your usage or number of daily signups?
    Jake [00:06:22]: Daily signups, I think.
    Swyx [00:06:24]: You started six years ago. It was a slow grind, and now you’re on a rocket ship. You say, “Don’t doubt your fight and don’t quit.” Maybe pick out certain points that were key inflections for the company.
    Jake [00:06:40]: At the start, it’s about getting your first 100 users, hell or high water. We had a website and a support link. The support link was the Discord channel. I had notifications on with two monitors: the monitor I was working on and the other monitor with Discord. If anybody came in, I was immediately like, “Hey, how’s it going?” It was rare, so getting those first 100 users to come back was the start.
    Jake [00:07:14]: Then you build a consultancy factory because users want all these things. You have to go back to the board and ask, “What is the actual product offering I want to build on top of this?”
    Jake [00:07:28]: VCs want charts that always go up and to the right, but in reality you don’t necessarily want charts that look like that. For us, there have been periods of expansion where we add features to test use cases, and periods of compaction where we ask, “If the experience we have is good, how do we make it significantly better?” Maybe we strip out features that don’t fit our ICP anymore.
    Jake [00:07:57]: The boom from 2022 to 2023 came from the free tier. Everybody under the sun was using it.
    Swyx [00:08:09]: A lot of Reddit bots and Discord bots.
    Jake [00:08:12]: And crypto miners. When you build an open product on the internet where anybody can sign up, the internet is a horrible place with so many things. You go through periods of asking, “How do I reach as many people as possible?” Then, “How do I fit the exact use case for the people who really matter and are really excited about this specific thing?”
    Jake [00:08:39]: Then there was a two-year period of making the actual business work. During the free-tier era, we were losing about half a million dollars a month.
    Swyx [00:08:59]: On a $20 million bank account.
    Jake [00:09:02]: On a $20 million bank account with maybe $50,000 a month in revenue. That’s a horrible business. I don’t know how anybody invested. But you have to go through it and say, “We have an experience people love, but the business has to work.”
    Jake [00:09:17]: There are two schools of thought. You can run the horrible business all the way up with bad margins, or you can go back and make it work. We’ve always wanted a super lean team. We’re 35 people right now. It’s very small.
    Swyx [00:09:36]: Supporting three million already?
    Jake [00:09:38]: Yeah. We’re adding 100,000 users a week right now, so it’s growing fast. We don’t want to add headcount for the sake of headcount or throw bodies at problems. We want to build systems. It’s hard to build systems during expansion because you’re adding things to the system because people are asking for them or things are breaking.
    Jake [00:10:00]: We had to cut off the free users for a little while, rebuild the business, and make sure it worked. We want to reach as many people as possible because software is important. It’s become difficult to create things in the physical world, so it’s important to make it easy for people to build in the virtual world and have access to creation. But there are legs to that journey.
    Jake [00:10:30]: You can see divots in the charts. If you follow between 2025 and 2026, it’s either summer or winter. People go on holiday with family.
    Swyx [00:10:50]: It affects that much?
    Jake [00:10:51]: Yeah. It’s kind of B2C and kind of B2B. People are shipping constantly, then they stop. Our activation curve now shows more people activating on weekdays because we have more business users, so it smooths out over time.
    Agents as the New Interface to Deployment
    Swyx [00:11:17]: Was there a point where you started prioritizing AI development or agent development?
    Jake [00:11:24]: We’ve prioritized agentic as a top-of-funnel thing. Over the last six months, we’ve deeply prioritized agentic as a mechanism to build and deploy things because we believe the curve is so steep and that is how people will build and deploy software.
    Jake [00:11:42]: It almost fundamentally doesn’t matter whether this is dot-com or not because we’re all on the internet anyway. If agents are going to deploy a bunch of things and we hit an inference wall at some point, we’ll fix those problems. The dominant species over the next 10 years is that we’ve moved from assembly to C to C++ to JavaScript to words. You’re going to need to close that loop.
    Swyx [00:12:13]: When you say this is dot-com, did you mean buying the domain, or the general case?
    Jake [00:12:17]: I mean the dot-com era, when companies had a huge run-up because people understood the internet was important. Then they hit bottlenecks, fundamental laws of physics, math didn’t work, and everybody came back down to earth. But it didn’t matter because the internet became so impactful. If you operate on a long enough time horizon, you should build these things anyway because you can see where it’s going.
    Jake [00:12:45]: That’s where I think a lot of agent stuff is. You get to a point where you’re running thousands of agents in parallel. What is the inference cost? What is the compute cost? How do you make that efficient? How do you coordinate all this? We have issues coordinating humans; we don’t even have good tooling for that. Now we have to figure out how to get agents to coordinate, safely version changes, and know when to raise their hand for someone to intervene. Otherwise it becomes an interrupt factory.
    Railway’s Infrastructure Thesis: Network, Compute, Storage, and Metal
    Swyx [00:13:19]: Let’s go right into the technical side. What are the core infrastructure or architectural beliefs of Railway that allow you to do what you do?
    Jake [00:13:29]: The primitives matter a lot for us. We need network, compute, storage, and orchestration around it. You need control over a lot of those things. We’ve talked a lot about how we don’t really use Kubernetes because we want higher-order control to place workloads in very specific places.
    Jake [00:13:48]: The reason is that you have to be very efficient with agents: memory reuse and all these other things, or you’re going to massively blow up your cost structure. Being able to rack and stack your own servers and build your own metal unlocks performance and cost. Experiences where you’re running 1,000 agents in parallel are not massively cost prohibitive.
    Jake [00:14:13]: Token use and compute use are blowing up. Over time, those things have to get a lot more efficient. You can get a lot of margin to make those experiences solid by building your own metal. That’s all in service of offering a differentiated experience to as many people as humanly possible.
    Swyx [00:14:51]: You have a data center in Singapore.
    Jake [00:14:53]: Yeah. We have two in every other region now. In Singapore, we’re adding a second one in Q3.
    Swyx [00:14:58]: What’s it like? I’ve never built a data center. Do you go to Equinix and say, “I want some slots?”
    Jake [00:15:05]: Yeah. Equinix. You basically go and say, “I want power and I want a cage.” They say, “Great, here’s what it’s going to be.” You rent the cage for a period of time, fill it with racks and servers, and hook up internet to it. That’s all the pieces.
    Swyx [00:15:36]: Then you handle everything else.
    Jake [00:15:37]: You handle everything else.
    Swyx [00:15:39]: What’s the math versus clouds doing it for you?
    Jake [00:15:43]: Compared with renting in the cloud, our payback period when we go to metal is about three months.
    Swyx [00:15:50]: Which is crazy.
    Jake [00:15:51]: It’s nuts. That’s four years of depreciated hardware. You’re going to see a lot of this compute crunch because hyperscalers are buying up a lot of stuff. We’re working directly with OEMs, resellers, and people building these machines: Supermicro, Dell, and others.
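    The three-month payback figure is a simple ratio: up-front hardware cost divided by the monthly saving from no longer renting equivalent capacity. A worked sketch follows; the dollar amounts are invented so that the result lands on three months and are not Railway's numbers, and `payback_months` is a hypothetical helper.

```python
def payback_months(hardware_cost: float, monthly_cloud_cost: float, monthly_metal_opex: float) -> float:
    """Months until owned hardware has paid for itself versus renting."""
    monthly_saving = monthly_cloud_cost - monthly_metal_opex
    if monthly_saving <= 0:
        raise ValueError("metal never pays back if it does not save money each month")
    return hardware_cost / monthly_saving


# Hypothetical figures for one server (assumptions, not from the episode).
months = payback_months(hardware_cost=30_000, monthly_cloud_cost=11_000, monthly_metal_opex=1_000)
print(months)  # 3.0
```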
    Jake [00:16:11]: Upstream, there’s a bunch of supply pressure. When we raised our last round, between deploying capital for servers and now, the amount of money we’ve raised is less than the amount of money we have in the bank plus the value of the servers because the servers have appreciated as RAM has gone up. It’s nuts how valuable hardware has become.
    Jake [00:16:50]: If you look at hyperscalers, they deployed around $80 billion of capital expenditures this year, and next year will be more. That’s a massive infrastructure build-out. You look at that and think it’s crazy that they’re spending way more than the Manhattan Project. But if every person is going to run dozens or hundreds of agents in parallel, you have no conceptual idea how much compute is required to make that experience happen, even if you’re deeply efficient and sharing resources. And that doesn’t even count inference.
    Swyx [00:17:22]: How do you plan the build-out? The growth chart is so vertical. Are you usually at 100% utilization as soon as racks are live? How far ahead are you planning?
    Jake [00:17:33]: We still maintain cloud presence for bursting. We work with AWS, GCP, and a few other clouds. We can rent, and then the moment we get space or power, we compact those workloads off the cloud. We started on the clouds, then built a system to migrate to our own metal. There’s nothing that says you can’t continually do that again, and that’s exactly what we do. We never want to be compute constrained.
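    The burst-then-compact loop described here can be sketched as two greedy passes: fill owned metal first and overflow to rented cloud, then pull cloud workloads back whenever new metal capacity comes online. This is a toy model under stated assumptions (a single CPU dimension, the `Workload` type, and the `place` and `compact` names are all hypothetical), not Railway's scheduler.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    cpus: int
    location: str = "unplaced"


def place(workloads: list, metal_free_cpus: int) -> int:
    """Fill owned metal first; anything that does not fit bursts to cloud.

    Returns the metal CPUs still free afterwards.
    """
    for w in workloads:
        if w.cpus <= metal_free_cpus:
            w.location = "metal"
            metal_free_cpus -= w.cpus
        else:
            w.location = "cloud"
    return metal_free_cpus


def compact(workloads: list, metal_free_cpus: int) -> int:
    """Move cloud workloads back onto newly available metal, largest first."""
    on_cloud = sorted((w for w in workloads if w.location == "cloud"),
                      key=lambda w: w.cpus, reverse=True)
    for w in on_cloud:
        if w.cpus <= metal_free_cpus:
            w.location = "metal"
            metal_free_cpus -= w.cpus
    return metal_free_cpus
```

    Calling `place` with 16 free metal CPUs on workloads of 8, 8, and 4 CPUs leaves the last one on cloud; a later `compact` with 8 new CPUs brings it back and leaves 4 free.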
    Jake [00:18:09]: At the start of the year, we actually became compute constrained because one upstream provider wasn’t able to give us quota at the rate we needed, and the hardware was slower. I spent a weekend rebuilding our entire network overlay so we could straddle five clouds: Oracle, AWS, ourselves, GCP, and one other one. We can do more than that now.
    Jake [00:18:38]: We got into a spot where we were trying to pack instances tight because we couldn’t get enough compute. That led to a few reliability issues, which are now past us. I made a tweet pointing out that it’s becoming harder and harder to acquire compute at the rate these models need to acquire compute. We got bit by it.
    Swyx [00:19:15]: How do you think about pricing knowing you might not have your own metal available at all times? Are you pricing assuming you need extra margin if you end up going into the cloud?
    Jake [00:19:26]: Because we’ve built out our metal data centers, our margins on metal are around 70%. We can deeply subsidize the cloud business if we want to scale at a reasonable rate. We have a few levers: metal, which makes the margins; cloud burst; debt to buy servers; and venture capital. It’s an interesting operational problem: how much cash do we have, how much should we raise, how quickly can we deploy it, and can we scale revenue as quickly as we scale compute?
    Jake [00:20:05]: If we continue making it trivially easy for people to build and deploy, then the faster we close that loop and the more operationally excellent we are with capital, the faster the business can scale. It’s almost a straight linear deployment rate.
    Financing Infrastructure: Hardware Debt, VC, and Operational Leverage
    Swyx [00:20:20]: I think infra startups raising debt is a tool people don’t utilize enough or know enough about. What can you tell us about that? Is it secured against your CPUs?
    Jake [00:20:32]: It’s secured against our hardware.
    Swyx [00:20:37]: What rates do you get? Who are the lenders?
    Jake [00:20:39]: We pay prime plus a spread, and we can refinance any of the debt as rates go down. The terms are pretty good. The unfortunate thing is that Twitter has no nuance, so people say, “Venture debt bad.” But as with all things, there are specific tools and areas where you can be deliberate instead of using one tool as a hammer. Venture capital is not the hammer for everything. You have to explore and figure out what works.
    Swyx [00:21:12]: VC is usually the most expensive financing you can get.
    Jake [00:21:15]: Yeah. I also think people think about VC incorrectly from a capital-raising perspective. Most people think, “How do I raise as much money as possible from whoever is probably the best I can get at that time?” That’s close to right, but what we’ve tried to do is figure out what unfair advantage we can buy with that equity.
    Jake [00:21:34]: It’s the most expensive equity you’re going to give away at that point in time, assuming the company keeps getting better. How do you use it to work with someone stellar who complements you? In the seed stage, I had never started a company. Ray Tonsing had good advice, and I could text him all the time. He was really fast. Awesome.
    Jake [00:22:01]: Then with John and Erica at Unusual, they said, “You roughly know what you’re doing building a product. We’ll mostly leave you alone and be available for advice.” Amazing. Then we got to Series A and the business was an operational tire fire because we didn’t know how to scale a business. Work with Erica, and Jordan is over at Redpoint, so bonus.
    Jake [00:22:28]: Now we’ve raised from TQ and FPV as we’re moving into enterprises. Every step of the way, we’ve asked: who can we partner with at this specific time to unlock the next section of the journey? I don’t know enterprise sales. As an engineer, I can eyeball what features we might need, and we have wonderful people internally who can help. But you want boardroom dynamics where everyone is aligned and asking, “How do we win this?” instead of bickering about strategy.
    Data Centers in Space and the Physics of Compute
    Swyx [00:23:31]: You had a tweet about data centers in space. Why no data centers in space?
    Jake [00:23:37]: It’s not “no data centers in space.” My hot take is that I think it is solvable. I’ve just never seen anybody solve it.
    Swyx [00:23:49]: You said, “How are you going to dissipate that much heat in a vacuum?” You’re making a physics claim.
    Jake [00:23:55]: I haven’t seen anybody prove how you’re going to dissipate that much heat in a vacuum. It doesn’t mean it’s not possible. It just means nobody has brought it up yet.
    Swyx [00:24:05]: Astrophage.
    Jake [00:24:06]: I don’t know what that is.
    Swyx [00:24:07]: The Martian thing. Okay, you’re very logical.
    Jake [00:24:09]: It could work. A lot of people are putting the cart before the horse. They say, “We’re going to put data centers in space.” Okay, but how? “We have time to figure it out.” It’s like in The Martian where they ask how they’re going to intercept something and say, “We’ll figure it out.”
    Swyx [00:24:36]: Making a bet on human invention is weird because you blindly trust that it can be solved. But with physics, there are first-principles bounds you can put on it. Maybe not. Maybe you’re asking to travel through time or break a fundamental thermodynamic law.
    Jake [00:24:57]: I don’t know how VCs do this either. How do you know what’s not possible and a grift versus what’s possible but sounds completely insane? “We’re going to put data centers in space.” Coin flip as to which it is, and I guess you’ll know in 10 years. That’s one cycle.
    What Agents Need: Versioning, Observability, and 1,000x Scale
    Swyx [00:25:23]: Moving back to agents. The branching, fast spin-up, and orchestration you do feels like pre-work that happened to be exactly what agents want. What do agents want differently than humans?
    Jake [00:25:37]: They want the ability to version things. It’s not that different; it materializes slightly differently. Agents want a way to test changes incrementally. Engineers have feature flags. Is there a reason agents can’t use feature flags? I don’t think so.
    Jake [00:25:54]: They want version control. Can we use Git or not Git? That one is up in the air. I think something outside Git will emerge for how we version these things over time. They need observability. You need to query what happened, when it happened, which steps failed, traces, logs, metrics, and all the rest. They need network, compute, and storage. They need to write files, save files, iterate on files, and snapshot file systems.
    Jake [00:26:25]: A lot of what humans needed is in line with what agents need. Branching and forking are not different; we’re just moving 1,000 times quicker. It can look like you need something massively different, but what you need is something massively better than what existed. You need orchestration massively better than Kubernetes. You need networking probably better than Envoy. It goes all the way down the stack.
    Jake [00:26:55]: If the workload profile doesn’t change so much as it gets massively compressed because you need thousands of these things, what assumptions change? etcd is going to melt. You need to replace it with something. You can go all the way down the stack and say, “That part has to change, that part has to change, and that part has to change.”
    Jake [00:27:19]: The interesting thing about the super-exponential curve is that you have to build systems where you can rip out those parts at any time because a new bottleneck might emerge. You get good at parallel agents, and a different part of the system breaks. So it’s similar to what humans needed, but at 1,000x scale.
    Jake [00:27:55]: How do you do code review in the age of agents?
    Swyx [00:28:00]: You throw more agents at it.
    Jake [00:28:01]: You don’t. But then who reviews for CVEs and all these other things?
    Swyx [00:28:07]: More agents.
    Jake [00:28:08]: And that’s how we hit the inference wall. You can continually throw agents at the problem, but I think there’s a limit to the number of agents you can throw at a problem.
    CLI, Agent Handles, and Closing the Loop
    Swyx [00:28:24]: You already had a CLI before it was cool. How is the shape of what you’re exposing changing, if at all?
    Jake [00:28:28]: CLIs have always been cool. The CLI changes because we think about how to give Claude, Codex, ChatGPT, or any model a handhold.
    Jake [00:28:50]: A CLI is a single command: deploy, get logs, and so on. Things that were prohibitively annoying to humans are not annoying to agents. They’re nice. If I handed you a CLI with 40 arguments and 600 flags, you’d think, “I’m never going to use all of this.” But if you hand it to an agent, it says, “This is excellent. I have so many handles to work with.”
    Jake [00:29:24]: If you’re going to expose things to agents that way, you want as many handles as possible where they can get information, query dynamic information, and close the loop quickly. Most problems right now are about how to close the loop as quickly as possible. Where does the agent get stuck, and how can you remove that?
    Jake [00:29:49]: Telemetry is important. If you can tell where the agent gets stuck from the CLI and say, “12% of people deviate from the happy path because of this, and now I add this argument and drive it down to 2%,” you massively increase the rate of loop closure.
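    The "12% deviate from the happy path" measurement can be sketched as a pass over per-session CLI telemetry: find the first failing step in each session, then report the overall deviation rate and where sessions get stuck. The event shape (a list of `(step, ok)` tuples per session) and the step names are assumptions for illustration, not Railway's telemetry schema.

```python
from collections import Counter


def deviation_report(sessions):
    """Return (deviation_rate, stuck_at) for a list of sessions.

    Each session is a list of (step, ok) tuples in execution order.
    Only the first failing step in a session is counted.
    """
    stuck_at = Counter()
    for steps in sessions:
        for step, ok in steps:
            if not ok:
                stuck_at[step] += 1
                break
    total = len(sessions)
    rate = sum(stuck_at.values()) / total if total else 0.0
    return rate, stuck_at


# Hypothetical sessions: one of four gets stuck at the "deploy" step.
sessions = [
    [("login", True), ("deploy", True), ("logs", True)],
    [("login", True), ("deploy", False)],
    [("login", True), ("deploy", True)],
    [("login", True), ("deploy", True), ("logs", True)],
]
rate, stuck_at = deviation_report(sessions)
print(rate, stuck_at.most_common(1))
```

    Comparing the rate before and after adding an argument or flag gives the kind of 12%-to-2% measurement Jake mentions.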
    Jake [00:30:03]: That’s how we think about not just the CLI, but every point in the dashboard. It’s a user journey: I hear about Railway. I get something deployed. I get my first green build or aha moment. I see an endpoint, logs, whatever. Then I iterate. The iteration loop is indefinite. The user wants to deploy a new thing, a Postgres instance, change code, and keep iterating.
    Jake [00:30:36]: If you focus on the iteration loops and what’s blocking them from closing quickly, one thing we say internally is: you never want to be waiting on compute anymore. You always want to be waiting on intelligence. If you’re waiting on compute, there’s a bottleneck that needs to be destroyed because eventually that bottleneck becomes so large that another workflow emerges to change it.
    Jake [00:31:04]: We’ve built a product where you push code, build it, and so on. But I fundamentally believe the push-pull loop is going away. We’ll get to a point where you make a small change in production, that change is versioned across your infrastructure, you’re working alongside copy-on-write versions of your database and infrastructure, and then you merge it in and it’s instantaneously live. That’s the holy grail of loops. The push-pull-rebuild thing is a point of friction that we’re removing entirely.
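    The copy-on-write fork idea can be illustrated with a small overlay: reads fall through to the shared base, writes land in a private delta, and a merge applies the delta back. Real systems do this at the block or filesystem level; this dictionary-based sketch, including the `Fork` class and its method names, is only an assumption-laden illustration of the concept.

```python
class Fork:
    """A copy-on-write view over a base mapping."""

    def __init__(self, base: dict):
        self._base = base   # shared, not copied
        self._delta = {}    # only the keys this fork has changed

    def get(self, key):
        # Prefer the fork's own change; otherwise read through to the base.
        if key in self._delta:
            return self._delta[key]
        return self._base[key]

    def set(self, key, value):
        # Writes never touch the base until merge() is called.
        self._delta[key] = value

    def merge(self):
        """Collapse the fork's changes back into the base."""
        self._base.update(self._delta)
        self._delta.clear()


production = {"replicas": 1, "image": "app:v1"}
fork = Fork(production)
fork.set("replicas", 3)
```

    At this point `production["replicas"]` is still 1 while `fork.get("replicas")` returns 3; calling `fork.merge()` makes the change live in `production`.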
    Canvas as Output: Dashboards, Context Anchors, and Hyperstructures
    Swyx [00:31:43]: It’s incredibly fast. If anyone hasn’t tried it, that fast feedback is great. My hot take is that Railway was famous for its canvas, which visualizes your infrastructure and lets you manipulate it visually. But that was for humans. For the next phase of growth, Railway CLI is more important than canvas.
    Jake [00:32:05]: The canvas is funny because it’s a mechanism to show changes over time. You’re right that previously we used it a lot as an input. Moving forward, its goal is more like an output. You would go to the canvas, make changes, see them, and watch your infrastructure evolve. Now agents have access to the CLI and can make those changes. So the canvas becomes an output: what information does the human need at this moment to make suitable decisions about control requests? Do I approve this or not?
    Jake [00:32:57]: It also has to be an anchor for your context, a port in the storm. Think of it like layers in a file system. You start with a project, then drill down into services, then into a function or code, because you want to represent the entire thing not just in your head, but in the canvas. Other people can share that representation, think on the same wavelength, and move quickly.
    Jake [00:33:33]: A lot of organizations get in trouble as they scale because all the context lives in someone’s head. “How does this microservice work?” “I have no idea; go ask this person.” Then you have whole categories of products built around context discovery. A lot of that melts away if you have a solid hierarchy and can infinitely nest services, code, context, and everything else all the way down. That’s what lets you build these structures over time.
    Jake [00:34:18]: It’s also what lets us build what I’ve called hyperstructures: things that are way bigger. You look at the Golden Gate Bridge and ask, “How did we build that?” There’s a meme that we lost the technology. To some extent, yes, because the coordination that built those things evolved and changed. We lost some of the art of building structure as we jammed everything into Slack.
    Swyx [00:34:52]: But you jam everything in Discord.
    Jake [00:34:53]: Same point. It doesn’t matter. It’s message passing and interrupts, message passing and interrupts.
    Swyx [00:35:00]: So you’re arguing there should be something better and more structured than Slack?
    Jake [00:35:04]: Yeah. For sure. I think Slack is awful, and Discord is awful too.
    Central Station: Context Routing, Support, and Incident Clusters
    Swyx [00:35:09]: This is the equivalent of my mom test. What have you built that embodies your solution to this?
    Jake [00:35:15]: Internally, we’ve built a tool called Central Station that aggregates all the context from our users. Every piece of feedback, every customer support item, everything gets aggregated into clusters. If an incident is brewing, we can determine how many users are affected and break off a discussion based on that.
    Jake [00:35:40]: That is more helpful than long-running channels where you’re trying to decide which channel to put something in. If you can dynamically aggregate information and dynamically route it to the right person based on context, it works better. We know internally that these four people are close to networking. If we see a networking thing, we can drill it down to those four people. If it’s with this part, we can look at the commits. This is no longer a manual process internally.
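A minimal sketch of the aggregate-then-route idea described above, not Central Station's actual implementation. Keyword matching stands in for whatever clustering the real tool uses, and the `OWNERS` map, `KEYWORDS` lists, and threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical ownership map: topic -> people close to that area.
OWNERS = {
    "networking": ["ana", "bo", "cy", "dee"],
    "caching": ["eli"],
}

# Hypothetical keyword lists used to assign a report to a topic.
KEYWORDS = {
    "networking": ("dns", "timeout", "tcp"),
    "caching": ("cache", "stale", "invalidat"),
}


def classify(text: str) -> str:
    """Return the first topic whose keyword appears in the text."""
    lowered = text.lower()
    for topic, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return topic
    return "unsorted"


def cluster(reports: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Group (user, text) reports by topic, tracking the affected users."""
    clusters: dict[str, set[str]] = defaultdict(set)
    for user, text in reports:
        clusters[classify(text)].add(user)
    return dict(clusters)


def route(clusters: dict[str, set[str]], threshold: int = 2) -> dict[str, list[str]]:
    """Pick owners for every topic with at least `threshold` affected users."""
    return {
        topic: OWNERS.get(topic, [])
        for topic, users in clusters.items()
        if len(users) >= threshold
    }


reports = [
    ("u1", "Cache is not invalidating after deploy"),
    ("u2", "Seeing stale assets from the CDN"),
    ("u3", "DNS lookup timeout on private network"),
]
clusters = cluster(reports)
print(route(clusters))
```

Counting distinct users per cluster, rather than raw messages, is what lets a brewing incident be sized by how many people it affects.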
    Jake [00:36:13]: If you go to station or help.railway.com, that’s why we built it. We wanted to scale with a massive amount of leverage by aggregating feedback.
    Swyx [00:36:27]: This is built in-house?
    Jake [00:36:28]: Yep.
    Swyx [00:36:29]: I remember helping out on this one with Angelo in 2023. You scale a lot with a very small team.
    Jake [00:36:38]: Yeah. We’re about 10 times bigger now.
    Swyx [00:36:40]: You have your full developer code here? Very cool.
    Jake [00:36:44]: If you go to railway.com/stats, we expose this as a pub-sub-able thing. It’s all real-time metrics. There’s a way to get it as JSON somewhere if you care.
    Jake [00:37:01]: We’re big on trying to build everything in public and talk about what we’re working on. We’ve had issues in the past, and we’ll say, “Here’s how we’re fixing these things.” We’ve gotten compliments and flak for incident reports. We’re always trying to make them better and talk with people.
    Incidents, Disclosure, and Progressive Rollouts
    Swyx [00:37:20]: You had a big one recently. I liked that it was scoped to 3,000. You presumably used Central Station. Talk through what happened and how you address it internally as a team.
    Jake [00:37:38]: Internally, this one really sucked. It had to do with an upstream provider that didn’t behave the way its own documentation said it would, which is unfortunate given they wrote the RFC for how the behavior should work. We rolled those things out, and Central Station caught it initially when a couple of users said caches weren’t invalidating. We turned it off immediately.
    Jake [00:38:03]: When you roll out to a large user base of three million people, you get a lot of disparate behaviors. We tested in staging and had tests, but we hit an edge case. We’ve hardened those systems, and now we can make that better. But it was a tough one.
    Swyx [00:38:39]: I always wonder how private disclosure is supposed to work if people find an issue. Are they supposed to contact you first? When you run a platform, these things will happen. What channels should people pursue to quietly resolve it before it becomes a bigger incident?
    Jake [00:38:59]: There’s responsible disclosure. We err on the side of over-disclosing and letting you know something is wrong versus having your provider gaslight you. We’ve erred on sharing those things more publicly, even if they impact a small subset of users. That’s a decision we’ve made internally. We have four values. One is honor. The honorable thing is to notify people to the widest degree at which they may have been affected or there was an issue, and then confront it head-on: why did it happen, what can we do better?
    Swyx [00:39:45]: Not the whole user base. That’s because of incremental rollouts and other things?
    Jake [00:39:50]: Yeah. Progressive rollouts.
    Swyx [00:39:54]: That should be the norm at all large platforms.
    Jake [00:39:58]: It should. A variety of companies do this. There’s the quote that Meta runs 10,000 different versions of Meta. To our earlier point about agents, they need the same thing. They need shadow traffic and all these other things. We’ve built so much ceremony around production being sacred that we need to make it trivially easy to test different behaviors in a safe environment. Then you can make mistakes in a safe environment.
    Safe AI SRE: Customer Agents, Forked Environments, and Production Parity
    Alessio [00:40:30]: Do you see a world where these things get automatically caught, not necessarily by your agent, but by your customer’s agent? The cache invalidation issue seems easy to check if you know to look for it.
    Jake [00:40:44]: It’s hard because to determine it, we almost need to hook into your observability infrastructure. That’s why we have the template loop on the platform: so you can roll things out progressively. You can roll out to Johnny Vibe Coder initially, or push a shard that someone consumes at their own leisure. Or you can roll it out over weeks: 0.1% of people, 1% of people, early adopters, then all the way up. That’s the non-deterministic version control we talked about earlier.
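The staged percentages above (0.1%, 1%, early adopters, everyone) are commonly implemented with stable hash bucketing. The sketch below assumes that approach; the `STAGES` fractions, the feature name, and the function names are illustrative, not Railway's rollout system.

```python
import hashlib

# Hypothetical stages: fraction of users who receive the change at each step.
STAGES = [0.001, 0.01, 0.10, 1.0]


def bucket(user_id: str, feature: str) -> float:
    """Map a (feature, user) pair to a stable value in [0, 1)."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:4], "big") / 2**32


def is_enabled(user_id: str, feature: str, stage: int) -> bool:
    """A user is in the rollout once their bucket falls under the stage fraction."""
    return bucket(user_id, feature) < STAGES[stage]


users = [f"user-{i}" for i in range(10_000)]
for stage, fraction in enumerate(STAGES):
    enabled = sum(is_enabled(u, "new-cache", stage) for u in users)
    print(f"stage {stage}: target {fraction:.1%}, enabled {enabled}")
```

Because each user's bucket never changes, anyone who received the change at an early stage keeps it at every later stage, so widening the rollout only ever adds users.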
    Jake [00:41:30]: I believe that’s where most things should go, because most companies end up building staged rollout systems in-house. It’s the same thing built again and again at every company. There’s a massive opportunity to consolidate developer debt.
    Alessio [00:41:45]: You should have a free tier. Model providers give free tokens if you let them use the data. You could give free compute if someone is the number-one shard that goes out and lets you plug into their observability.
    Jake [00:41:55]: We do that. That’s why we talked about the impact on 3,000 people. We start with lower-impact people. Larger companies on the platform are last to receive those rollouts so they have a version of the platform that’s deeply stable.
    Alessio [00:42:16]: I have three services, so I’m sure I get the first rollout. You can nuke my thing at any time. There are all these SRE agent companies. Observability people also want agents that fix upstream problems. You have your own agent in the canvas now. How do you see that playing out?
    Jake [00:42:39]: It’s the stacking entropy problem. If you don’t have primitives to make iteration in production safe, it becomes difficult. If you’re an observability provider saying, “Here’s the fix to this error,” assume 80% are good and make sense. But in the last 20% long tail of complex issues, if you let somebody stamp it, you create an opportunity for an incident.
    Jake [00:43:08]: That’s why forked environments are important. People have staging, but it always drifts from production. You need primitives, workflows, and experience built first-party on the platform so you can fork any service at any point in time.
    Jake [00:43:33]: I think of the canvas as a sheet of transparency paper. The agent is a little guy you push up into the canvas. It should say, “I need to copy that service and that service so I can test these two things.” It gets a read-only copy of production. Anything that’s PII gets marked as a transform when we clone the database, create a copy-on-write version, or read from it. Then the agent makes changes and asks, “Does this actually work?” as close to production as possible.
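A rough sketch of forking data with PII marked as a transform, assuming a table is just a list of dictionaries. A deep copy stands in for a real copy-on-write clone, and `PII_COLUMNS`, `mask`, and `fork_table` are hypothetical names.

```python
import copy
import hashlib

# Hypothetical schema annotation: which columns hold PII.
PII_COLUMNS = {"email", "name"}


def mask(value: str) -> str:
    """Replace a PII value with a stable token derived from its hash."""
    return "pii_" + hashlib.sha256(value.encode()).hexdigest()[:12]


def fork_table(rows: list[dict]) -> list[dict]:
    """Return an independent copy of the rows with PII columns transformed."""
    forked = copy.deepcopy(rows)
    for row in forked:
        for column in PII_COLUMNS & row.keys():
            row[column] = mask(str(row[column]))
    return forked


production = [
    {"id": 1, "email": "a@example.com", "name": "Ada", "plan": "pro"},
    {"id": 2, "email": "b@example.com", "name": "Bo", "plan": "hobby"},
]
fork = fork_table(production)
fork[0]["plan"] = "enterprise"  # the agent's change stays in the fork
print(fork[0])
print(production[0]["plan"])
```

The agent works against `fork`, so production rows are never mutated and raw PII never leaves the source table.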
    Jake [00:44:22]: That’s how close you have to be, or you get massive drift. The system becomes unstable. You see this with massive systems built on Docker for local, Kubernetes for production, and a specific thing for something else. That complexity slows developers and becomes unstable at scale, making it hard to iterate. We want to compress that way down and say, “As close to prod as possible is where we want to be.”
    From AI SRE Skeptic to Agent Believer
    Swyx [00:45:00]: I was texting Erica for questions, and she says you were originally not a believer in AI SRE. Have you come around on it?
    Jake [00:45:10]: I flipped, but I’m still not a believer in AI SRE if you don’t have the primitives to make it safe. If you unleash AI SRE on production infrastructure without safe primitives for copying volumes and making sure things are fine, it’s going to nuke your production database. It’s not a matter of if, but when. I’m a big believer in making those loops safe.
    Jake [00:45:33]: I was a deep AI skeptic until 2023. In 2024, I thought, “Maybe I can roughly make this thing do it.” In 2025, I thought, “Now I can hold this.” Over winter break, everybody came back saying, “It’s almost impossible to hold this wrong.”
    Swyx [00:46:01]: Did you see this on the Claude docs? Clawdbot? OpenClaw?
    Jake [00:46:06]: It’s gotten to a point where it’s harder to hold it wrong than to hold it right. There’s a scene in Avengers where Vision picks up Thor’s hammer and says it’s terribly well-balanced. It self-balances and works well. I’m a deep believer at this point that this will be the dominant species: assembly, C, C++, JavaScript, words.
    Swyx [00:46:35]: It feels like a big jump.
    Jake [00:46:37]: It is. But it’s not like you abandon CPU-based discrete logic and move straight to fuzzy logic. You need both. Your skills should call code or applications or some static structure. You can use skills to distill what the procedure should be or how the code should act.
    Jake [00:47:02]: I’m coming to a thesis: you need three points. You need a clear spec defining the system, the code, and the tests. When you say it out loud, if you’ve been in engineering long enough, you’re like, “Of course. That’s an RFC, tests, and code.” But they all matter. Having them together lets them reinforce each other: the spec and tests match, but the code doesn’t, so reconcile it. Or the tests and code match but the spec doesn’t, so reconcile that. That’s the iteration loop.
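The three-point reconciliation loop can be written down as a small decision function. This is a sketch of the reasoning only: the three boolean inputs are assumed to come from some external check, which the conversation does not specify.

```python
def reconcile(spec_matches_tests: bool, tests_match_code: bool, spec_matches_code: bool) -> str:
    """Name the artifact to fix, given which pairs currently agree."""
    if spec_matches_tests and tests_match_code and spec_matches_code:
        return "in sync"
    if spec_matches_tests:
        return "code"   # spec and tests agree, so the code is the outlier
    if tests_match_code:
        return "spec"   # tests and code agree, so the spec is stale
    if spec_matches_code:
        return "tests"  # spec and code agree, so the tests are stale
    return "all three disagree: needs a human decision"


print(reconcile(True, False, False))
print(reconcile(False, True, False))
```

Two artifacts agreeing is what makes the third one the thing to fix; with only two artifacts there would be no way to tell which side is wrong.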
    Jake [00:47:41]: That’s why you’re seeing people talk about software factories, docs, and reconciliation. Some of that is architecture astronautics if you don’t implement it, but that loop is where most things will end up.
    Swyx [00:48:07]: For listeners, we’ve been talking about this on the pod for three years: the holy trinity of specs, tests, and code. Itamar Friedman from Qodo is the reference if people want to look it up.
    Self-Modifying Infrastructure and the End of Push-Pull-Rebuild
    Swyx [00:48:18]: One thing I want to mention on the OpenClaw idea is self-modification. I don’t know how Railway would support it, but I have my OpenClaw, and I just tell it that it has the Railway CLI and can do whatever. In theory, whatever capabilities or new infra it needs, it can call the Railway CLI, provision it, and add it to itself. The agent can modify its own infra.
    Jake [00:48:45]: It’s nuts. I have a loop set up where you put the Railway CLI on top of something that runs on Railway. You’re authenticated as whatever the current box is, and you can make any changes to it. Then you call Railway deploy, and it deploys itself.
    Jake [00:49:04]: It’s like: “I need to spin up this instance of this environment. I already exist in this environment. Excellent, I have access to a Postgres instance now.” That’s where we want to go with agentic, self-replicating infrastructure. That’s your loop: iterate in production. You continue making changes. If it works, merge it upstream. If it doesn’t, throw it away.
    Jake [00:49:37]: How do you make throwaway copies trivial to spin up and super cheap? The era of “I have an AWS instance with four vCPU and 16 gigs of RAM” is going to get destroyed. If you do that for agents, you need a thousand of those machines. It’s prohibitively expensive compared with what we’ve spent a ton of time figuring out: the atomic unit of deploy, whether you call it isolates, sandboxes, or something else. Only pay for what you use, spin up instantaneously, and close the loop as quickly as possible.
    Jake [00:50:15]: If the system can self-replicate safely and say, “This is my environment, I’m making these changes,” it can come back with, “Does this look good? This is a new state of infrastructure given this prompt. I think I’ve solved it.” Then you go back and say, “Actually, it looks different.” It does the loop again. Then you say, “Cool. Apply.”
    Swyx [00:50:38]: That’s retroactively obvious, which is the most useful kind. Any other comments on agent deployment on Railway?
    Jake [00:50:51]: It’s getting better every day. I’m on X or Twitter. You can always yell at me about the parts not working as well as they should, because plenty of things should work way better.
    The New Serverless: Stateful, Long-Running, Pay-for-What-You-Use Linux
    Swyx [00:51:04]: At this stage, when people want massively or embarrassingly parallel compute, they usually talk serverless. I feel like there’s a new serverless compared to the previous five years of serverless. You’re in that new bucket. Do you have comparisons or philosophical differences you want to call out?
    Jake [00:51:31]: It’s somewhere in between. It’s the ability to run stateful, long-running workflows or executions.
    Swyx [00:51:42]: Vercel has Fluid Compute, Cloudflare has some container thing, Google has Cloud Run and others.
    Jake [00:51:55]: That’s where everything is roughly going, and it’s why we’ve been working on this for six years. We believe users need access to a computer: a box that speaks Linux. They need to deploy what they want. Other systems change the surface area of what you can build. For us, users need a computer and need to deploy anything they truly want. That’s why we’ve focused on the primitives: network, compute, storage. If we give you those and expose them so you can run things indefinitely, that’s where we believe it’s going.
    Jake [00:52:43]: Twitter has no nuance, so everyone says “servers” or “serverless.” It’s always somewhere in the middle: I want to run it for a long time, but I don’t want to provision the resource statically or pay for things I’m not using. That’s been our thesis from day one: pay only for what you use, run it indefinitely, and it is full Linux.
    Swyx [00:53:12]: That’s why I like the naming of Fluid. It’s fluid. Flexible.
    Heroku, Focus, and Carrying the Torch Without Becoming the Past
    Swyx [00:53:18]: Another milestone is Heroku’s official deprecation. You’re one of the presumptive new Herokus. “New Heroku” has been a category for as long as I’ve been in developer tooling. It’s finally happening. What was that like? Any behind-the-scenes of, “This is the moment”?
    Jake [00:53:42]: You have people where you’re like, “You were running stuff on here? You, as this company?” It’s crazy that names you would know are running on it and now coming to us saying, “We want to move a lot of this off.”
    Swyx [00:54:00]: Any behind-the-scenes on why Salesforce let Heroku stagnate?
    Jake [00:54:05]: I can only guess. It’s hard when it’s not your business. Salesforce’s business is to build a great CRM. That’s their focus. Then you acquire a compute business as an offshoot. A lot of early Meta people talk about focus. Boz has a write-up about how in the early days of Meta they had no money, so they were forced to focus. Then they turned on the money tree and had no reason not to split their focus.
    Jake [00:54:52]: But that dilutes your product. You get offshoots where you ask, “Is this the focus of the business?” If it’s not core, it languishes. A lot of companies get in trouble when they split focus because they’re fighting a multi-front war, not just externally but internally for alignment. Where are we going? What are we doing? What is our purpose?
    Jake [00:55:24]: If you’re Salesforce-built and mission-driven, you want to work on Salesforce. Heroku is off to the side. It’s not core to the business. Getting resources, budget, focus, and alignment internally becomes hard. It was a matter of time.
    Swyx [00:56:06]: Kudos to them for calling it out instead of leaving it unknown.
    Jake [00:56:12]: Their release was a little odd. They called it out, but they didn’t say they were shutting it down. Behind the scenes, I think they issued messages to people saying they should close accounts and that they were going to deprecate and remove things over time.
    Jake [00:56:30]: It’s crazy because some of my first deployment experiences were on Heroku. You start with dragging things into an FTP server, then you try to get a deploy working, and then it’s Heroku. It was the on-ramp for us. But the wheel turns. New things emerge. We’re happy to carry the torch for a lot of that. But we don’t want to be the new Heroku. We want to be the way people build and deploy software, and ultimately the way people monetize software over time.
    Swyx [00:57:19]: It’s still a big crown to be the new Heroku. There are 50 companies that fought for that.
    Jake [00:57:23]: Everybody is holding some portion of it. We’re happy to support people and companies. The platform works differently. The game loop is similar, but we’ve been dogmatic about where these things are going: primitives, agents, fan-out. Some things fit; some workflows need to change. We have an approximation of Heroku pipelines with the environment system. It’s exciting. We’ve got a ton of people we can support, and it’s growing a lot.
    Temporal, Workflow Engines, and State Machines
    Swyx [00:58:12]: I have one more technical question about Temporal. I’ve sold my shares. You’re a power user and one of our earliest customers. I met you through Temporal. You built on Temporal. You have complaints. This may be the most neutral and informed conversation anyone will hear about Temporal without someone working at the company.
    Jake [00:58:39]: That’s fair. I’ve used Temporal for almost 10 years because of Cadence at Uber.
    Swyx [00:58:52]: Give people a sense of what Cadence was at Uber.
    Jake [00:58:57]: Cadence was the precursor to Temporal. It powered trip actions and rides: when you rent a Jump bike or scooter or car, you’re running workflows for a period of time and saying, “This ride will run indefinitely until it finishes.” You attach information: you paused in this zone, so add this charge to the bill. When you end the trip, the workflow is done. That experience was powered by Cadence at the time.
    Swyx [00:59:34]: I used to say it’s like programming the entire user journey top-down as one function.
    Jake [00:59:39]: It’s a powerful idea and important. It’s also important for the next phase of the agentic journey. You want an agent to do a specific task, be complete or incomplete on that task, and move on to the next thing. You need a way to manage workflows dynamically.
    Jake [00:59:59]: Temporal was always great in theory, and great when you got it working the way you wanted in production. But it required you to model the entire journey in your head. If you didn’t, you could cause issues where replaying the state of the workflow causes non-determinism.
    Swyx [01:00:25]: Because it works on deterministic workflow history.
    Jake [01:00:28]: Exactly. I describe it as a jet engine. If you know how to operate it and run it, it’s great. But you can’t hand it to people trying to build complicated things if they don’t have the whole state in their head.
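To make the replay problem concrete, here is a toy history-based engine. It is not Temporal's or Cadence's API: `Context`, `activity`, and the ride workflows are invented for illustration. It only shows why code that changes its sequence of steps can no longer be replayed against an older recorded history.

```python
class NonDeterminismError(Exception):
    pass


class Context:
    """Records activity calls on a first run; checks them against history on replay."""

    def __init__(self, history=None):
        self.history = list(history or [])
        self.replaying = history is not None
        self.position = 0

    def activity(self, name: str, fn):
        if self.replaying and self.position < len(self.history):
            recorded_name, result = self.history[self.position]
            if recorded_name != name:
                raise NonDeterminismError(
                    f"history has {recorded_name!r} at step {self.position}, code asked for {name!r}"
                )
            self.position += 1
            return result  # reuse the recorded result instead of re-running
        result = fn()
        self.history.append((name, result))
        self.position += 1
        return result


def ride_v1(ctx):
    fare = ctx.activity("compute_fare", lambda: 12)
    return ctx.activity("charge", lambda: f"charged {fare}")


def ride_v2(ctx):
    # A code change that reorders steps while old runs are still in flight.
    ctx.activity("send_receipt", lambda: "sent")
    fare = ctx.activity("compute_fare", lambda: 12)
    return ctx.activity("charge", lambda: f"charged {fare}")


first = Context()
print(ride_v1(first))

replayed = Context(history=first.history)
print(ride_v1(replayed))  # the same code replays cleanly

try:
    ride_v2(Context(history=first.history))
except NonDeterminismError as err:
    print("replay failed:", err)
```

The failure only appears when an in-flight run recorded by the old code is replayed by the new code, which is why the author of a change needs the whole workflow in their head.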
    Jake [01:00:48]: We run our whole deployment pipeline on top of it. That’s a reasonably complicated workflow: pre-commit hooks, signaling, queuing, and all the rest. We ran into the same thing at Uber. As you express a large workflow, it gets more complicated, with more states in the state machine that you have to map back to the workflow.
    Swyx [01:01:15]: It’s a lot of ifs.
    Jake [01:01:16]: Exactly. At Uber, we built a system for doing the state machine and testing it. We’ve started to build some of those things here because it’s grown heavily. It’s not quite love-hate. When it works well, it works super well. But if someone who doesn’t have full context puts something into the system that invalidates state or causes non-determinism, or spins off a ton of activities, you have to keep track of underlying SRE knobs like activity slots. Those should scale with memory, vCPU, and so on. It becomes a bear to scale.
    Swyx [01:02:10]: You need a capable sysadmin running things behind the scenes. If you moved off, what would you do?
    Jake [01:02:19]: We’d build our own workflow engine. We have a few internally that we’ve worked on.
    Swyx [01:02:27]: This is one of those classes of things you typically wouldn’t vibe code, but I’m wondering if you can.
    Jake [01:02:33]: I still don’t think you should vibe code it. You still want to run decent tests to make sure it works.
    Swyx [01:02:39]: Temporal didn’t invent that from scratch either. There are libraries you can run. On top of that, it’s just a state machine that you have to map out. Ultimately, you define the instructions you want and run them through a state machine.
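A minimal sketch of "define the instructions and run them through a state machine", using a hypothetical deploy pipeline. The states, events, and transition table are assumptions for illustration, not Railway's actual pipeline, and a real engine would also need persistence, retries, and timers.

```python
from enum import Enum


class State(Enum):
    QUEUED = "queued"
    BUILDING = "building"
    DEPLOYING = "deploying"
    LIVE = "live"
    FAILED = "failed"


# Hypothetical deploy pipeline: allowed (state, event) -> next state.
TRANSITIONS = {
    (State.QUEUED, "start"): State.BUILDING,
    (State.BUILDING, "build_ok"): State.DEPLOYING,
    (State.BUILDING, "build_failed"): State.FAILED,
    (State.DEPLOYING, "healthy"): State.LIVE,
    (State.DEPLOYING, "unhealthy"): State.FAILED,
}


def step(state: State, event: str) -> State:
    """Apply one event, rejecting anything the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.value!r}") from None


def run(events: list[str], state: State = State.QUEUED) -> State:
    """Fold a list of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state


print(run(["start", "build_ok", "healthy"]))
print(run(["start", "build_failed"]))
```

Keeping every legal transition in one table makes the "lot of ifs" explicit and testable: any event not listed for the current state fails loudly instead of silently corrupting the workflow.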
    Jake [01:03:00]: It’s very doable. Workflow stuff is interesting. Restate is doing neat stuff here.
    Swyx [01:03:10]: You’re tied into JavaScript. Are you a JavaScript maxi?
    Jake [01:03:13]: Internally, we have TypeScript, Rust, and Go. We don’t add more languages. Actually, we have a little C because we write BPF code and hooks. But those are the languages.
    Swyx [01:03:28]: Is this for sidecars?
    Jake [01:03:32]: No. It’s for the networking stack, volumes, and things like that. We use TypeScript a lot because it powers the dashboard, but we’re moving a lot of workflow stuff off the dashboard stack and into the infrastructure stack.
    Railpack, Nixpacks, and Content-Addressable Filesystems
    Swyx [01:04:00]: Cool. Any other technical infrastructure stuff? Railpack?
    Jake [01:04:07]: We built an engine for determining dependencies based on source code. It’s called Railpack. We built the first version, Nixpacks, on top of Nix, and then we moved.
    Swyx [01:04:17]: People have been trying to get me to adopt Nix and NixOS for four years. Is it ever going to be a thing?
    Jake [01:04:23]: I don’t know. We’re excited about it, but it has pain points. Think of it as a stack of versioned binaries at specific slices in time. If you want version X and version Y, you bloat the package space, which blows up image size and makes real-world workloads difficult.
    Swyx [01:04:53]: But you content-address it and cache it. In theory, there are optimizations.
    Jake [01:05:00]: In theory, yes. But with a large enough user base and disparate enough machines, you run into a problem Meta described in the XFaaS paper, their internal serverless system. It becomes difficult at scale unless you break out specific runtimes.
    Jake [01:05:24]: We didn’t want to do that because we wanted to truly allow you to deploy anything. That was our initial thing with Nix. But we’ve moved toward interesting work around content-addressable file systems that can lazy-load anything from any point and page it into memory.
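A small in-memory sketch of a content-addressable file system with lazy loading. It assumes a snapshot is a manifest of path to content hash; `BlobStore` and `LazyFilesystem` are illustrative names, and real systems page data in at block granularity rather than whole files.

```python
import hashlib


class BlobStore:
    """Content-addressed storage: the key of a blob is the hash of its bytes."""

    def __init__(self):
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)  # identical content is stored once
        return key

    def get(self, key: str) -> bytes:
        return self.blobs[key]


class LazyFilesystem:
    """A snapshot is a manifest of path -> hash; bytes are fetched on first read."""

    def __init__(self, store: BlobStore, manifest: dict[str, str]):
        self.store = store
        self.manifest = manifest
        self.loaded: dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        if path not in self.loaded:
            self.loaded[path] = self.store.get(self.manifest[path])
        return self.loaded[path]


store = BlobStore()
manifest = {
    "/app/a/libssl.so": store.put(b"openssl-3 bytes"),
    "/app/b/libssl.so": store.put(b"openssl-3 bytes"),  # same content, same key
    "/app/main.py": store.put(b"print('hi')"),
}
fs = LazyFilesystem(store, manifest)
print(len(store.blobs))
print(fs.read("/app/main.py"), len(fs.loaded))
```

Two paths with identical bytes share one blob, and a file costs nothing until it is first read, which is how this approach avoids the image bloat described for stacks of versioned binaries.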
    Swyx [01:05:48]: Amazing.
    Jake [01:05:49]: The future is very bright. It’s crazy, and it’s going to be nuts.
    Coding Agent Spend, Roadmaps, and Token ROI
    Swyx [01:05:54]: Founder journey stuff?
    Alessio [01:05:56]: Your Claude usage: you tweeted you’re going to spend $300K this month?
    Jake [01:06:01]: I think we got to $200K.
    Alessio [01:06:02]: Coding agents?
    Jake [01:06:03]: Yeah.
    Swyx [01:06:04]: Across the company?
    Alessio [01:06:05]: You only have 35 people, so I’m sure they’re not all spending $10K a month. What’s the distribution?
    Jake [01:06:10]: I think I’m at about $25K. We have power users all the way down. We came back from winter break, and I basically said, “If you’re writing code by hand, you’re doing this wrong.” The tools are good enough now that you can move extremely quickly. There are issues and pain points, but you should be reviewing the code the agent writes instead of writing it by hand.
    Jake [01:06:40]: Architectural patterns matter more now than ever, but you shouldn’t spend your time generating code you would write. If you know how to write it, ask the agent to write it and reconcile it until it looks like you would have written it yourself.
    Jake [01:06:58]: People misconstrue my propensity to push people toward agents as connected to our growth and some reliability bumps. They’re not necessarily related. The tools are good enough to move extremely quickly and build things way larger than you could before.
    Jake [01:07:19]: To the earlier point about cooling data centers in space: I don’t know. But with software, you can ask, “How would I build block storage from scratch? How would I do these things?” I have ideas because I have history and have read papers. Let me work them out and build massive test benches with thousands of tests, because those are now free to author. If you’re not using AI systems to speed-run your roadmap and reconcile your existing system onto the future, you’re missing a large point of what’s happening.
    Alessio [01:08:12]: What’s the path to spending $3 million a month? Is it bound by ideas and things customers can absorb?
    Jake [01:08:19]: For most companies, it’s bound by deployment at this point. That’s why we’ve seen a massive boom in users and companies, from Fortune 50s down, asking how to get developers to move faster. You’ll probably hit your CFO before any technical limits because they’ll look at the eye-watering amount of money spent on tokens. Inference costs have to come down, but we’re inference constrained now. There will be price discovery around what makes sense for an org to adopt.
    Jake [01:09:06]: I think you’ll end up with the F1 driver concept. If someone is really adept at these things, it makes sense to put them in a $3 million car. If they’re not, it probably doesn’t make sense. You’ll take a few people and say, “You can drive the F1 car. We need to go in this direction. Figure out if it works and prototype it.”
    Jake [01:09:33]: We’ve done some of that and vastly accelerated our roadmap. We thought we’d ship something in a few years; now we can probably ship it in a few months because we validated it and don’t have to build it incrementally. We can skip steps and move toward our vision.
    Alessio [01:09:58]: A lot of people are realizing the roadmap doesn’t always have a business impact, so they say tokens are too expensive. But if your roadmap were built to make more money by the time you built it, you’d have token pricing for it, the same way you do with sales. You’d spend a billion dollars on sales if you knew you would get $2 billion of revenue.
    Jake [01:10:19]: Exactly. A naive way to measure this is the percentage of tokens that end up in production. If you can measure impact because those tokens end up in production, that’s awesome. But the burden of proof will rise. Internally, we have a growing number of pull requests that haven’t merged. The question becomes: how do you get this into production? It’s about how quickly you can build and deploy software, which is exciting because that’s our whole thing.
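The naive metric can be computed as tokens spent on merged work divided by all tokens spent. The sketch below assumes token spend can be attributed per pull request, which the conversation does not claim; the `PullRequest` type and the numbers are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    tokens_spent: int
    merged: bool


def production_token_share(prs: list[PullRequest]) -> float:
    """Fraction of all tokens spent that belong to merged pull requests."""
    total = sum(pr.tokens_spent for pr in prs)
    if total == 0:
        return 0.0
    shipped = sum(pr.tokens_spent for pr in prs if pr.merged)
    return shipped / total


# Illustrative numbers only, not figures from the conversation.
prs = [
    PullRequest(tokens_spent=400_000, merged=True),
    PullRequest(tokens_spent=100_000, merged=True),
    PullRequest(tokens_spent=500_000, merged=False),
]
print(f"{production_token_share(prs):.0%}")  # prints 50%
```

A growing pile of unmerged pull requests shows up here as a falling share even when total spend is flat.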
    The SDLC Shift: Prompt Requests, Feature Flags, and Safe Rollouts
    Swyx [01:10:56]: The SDLC is changing. One thesis is that the pull request is dying. It’s going to be the prompt request. Beyond that, code review is also kind of dying if you have all the other systems in place. What else is changing about the SDLC?
    Jake [01:11:19]: The AI SRE and the tools to make it happen. AI SRE is pie-in-the-sky aspirational. What does it take to get an AI SRE? What tools do you need to build?
    Swyx [01:11:32]: You should expose your tooling to customers at some point. The Central Station command center.
    Jake [01:11:39]: We have it for template maintainers. Template maintainers can deploy and maintain templates, and they get feedback. We’re going to expose those things incrementally.
    Swyx [01:11:51]: Clustering around incidents. Everyone has a version of that, but I don’t think anyone has solved it.
    Jake [01:11:56]: I won’t say we’ve solved it internally, but it’s gotten so good that we can see incidents forming pretty quickly. At some point, those will be things either someone else builds or we build. We’ve always built things purpose-built for us. If it makes sense to make it useful for users, monetize it, or turn that loop into a profit center instead of a cost center, we want to do that.
    Jake [01:12:28]: Pull request is definitely dying.
    Swyx [01:12:29]: Do you do first-party feature flagging and incremental rollout stuff?
    Jake [01:12:34]: We have a feature-flagging engine we built internally and will eventually roll out.
    Swyx [01:12:38]: I don’t see it as a user. How come you didn’t give us what you have?
    Jake [01:12:43]: We have to beta test it. We care a lot about the quality of the things. There’s plenty we’ve used internally that doesn’t make it all the way through the journey because it fails. It works for one service but not multiple services. We’d have to build it for multiple services and know that if we released it, we’d rebuild it again and again. Some things are worth that, but many inform the roadmap.
    Jake [01:13:18]: We don’t want to dilute the experience by saying, “This works, but only for this service,” unless it’s a core initiative. Over the next few months, we’ll roll out things that work for a single service, then multiple services, then multiple services across the environment. You have to be deliberate. Otherwise you create broken disparate experiences and support load because people ask how to use the feature.
    Jake [01:13:52]: It’s the earlier expansion and compaction pattern. You expand the company to get features, then compact and smooth them out so the experience is stellar. You told me in the hallway, “It’s gotten so much better.” Internally we’re saying, “This part really sucks. We need to make it significantly better.”
    Swyx [01:14:11]: I can attest to that over the last three years watching you build Railway. For listeners, feature flagging is a huge part of Uber culture. So much so that they have too many feature flags and another thing to remove feature flags. Facebook has Gatekeeper. Agents are going to need this. It’s fundamental to incremental rollouts. OpenAI acquired Statsig. GPT-5 is routing and flagging through different models.
    Jake [01:14:56]: It’s super important. If the software development lifecycle is going to change because we’re doing things 1,000 times faster and 1,000 times more concurrently, what becomes important at scale?
    Jake [01:15:16]: Before I started Railway, I built a feature-flagging product and tried to sell it. It was an easier version of LaunchDarkly. I ran into a problem: anyone small enough to adopt your technology doesn’t care about feature flags, and anyone large enough to need feature flags needs so much scale that you have to build out all the infrastructure. I scrapped it.
    Jake [01:15:42]: But what is old is new again. Companies are trying to move quickly, but you can’t YOLO a vibe-coded thing straight into production. You need to say, “Here’s my blast radius, my impact, and I want to shadow it for these users.” Feature flags. You’re going to need the tools larger companies built to maintain their structures. Everything gets compressed by 1,000x so everybody can build those structures quickly.
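One way to picture "shadow it for these users" is a wrapper that always serves the existing behavior and runs the candidate on the side for a chosen set of users. This is a sketch under that assumption; `shadowed`, the pricing functions, and the user IDs are hypothetical.

```python
def shadowed(old_impl, new_impl, shadow_users: set[str], mismatches: list):
    """Serve old_impl to everyone; also run new_impl for shadow users and log differences."""

    def handler(user_id: str, request):
        result = old_impl(request)
        if user_id in shadow_users:
            try:
                candidate = new_impl(request)
            except Exception as err:  # a failing candidate must never reach the user
                mismatches.append((user_id, request, repr(err)))
            else:
                if candidate != result:
                    mismatches.append((user_id, request, candidate))
        return result

    return handler


def old_price(cents: int) -> int:
    return cents + cents * 8 // 100


def new_price(cents: int) -> int:
    return round(cents * 1.08)


log: list = []
price = shadowed(old_price, new_price, shadow_users={"internal-1"}, mismatches=log)
print(price("customer-9", 1999))  # not shadowed
print(price("internal-1", 1999))  # shadowed, still gets the old result
print(log)
```

The blast radius is the `shadow_users` set: the new code path runs for those users, but its output is only compared and logged, never returned.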
    Jake [01:16:07]: That’s exactly where we are: compressing the software development lifecycle, then expanding it and adding more new things.
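As a rough illustration of the "blast radius" idea Jake describes, a percentage rollout behind a feature flag can be sketched in a few lines. This is a minimal sketch, not Railway's or LaunchDarkly's implementation; the function names, the flag name, and the allowlist parameter are all hypothetical.

```python
import hashlib

def bucket(flag: str, user_id: str) -> int:
    """Map (flag, user) to a stable bucket in [0, 10000), i.e. basis points."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000

def is_enabled(flag: str, user_id: str, rollout_percent: float,
               allowlist: frozenset = frozenset()) -> bool:
    """Return True if the flag is on for this user.

    Users in the allowlist (hypothetical shadow/test users) always get the
    feature; everyone else is admitted only if their stable bucket falls
    below the rollout percentage.
    """
    if user_id in allowlist:
        return True
    return bucket(flag, user_id) < rollout_percent * 100

if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    exposed = sum(is_enabled("new-deploy-flow", u, 5) for u in users)
    print(f"{exposed} of {len(users)} users see the feature at a 5% rollout")
```

Because the bucket is derived from a hash rather than a random draw, a given user stays in or out of the rollout across requests, and raising the percentage only ever adds users.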
    Cattle, Pets, and Clonable Infrastructure
    Swyx [01:16:15]: Another term that comes to mind for newer developers is “cattle, not pets.” People treat production like a pet. It has a name. You baby it and keep it alive. With cattle, you can mass farm, roll out, portion parts out, and kill them.
    Jake [01:16:37]: I think that might change. You can move toward having pets as long as you have a cloning machine for your pets.
    Swyx [01:16:52]: Yeah.
    Jake [01:16:52]: If you can snapshot every single thing at every frame, it doesn’t matter if something gets obliterated because you have a snapshot of it. The things we’ve built right now are designed to block changes from the hermetically sealed DevOps line. You have to write a Dockerfile because you need a specific cut of the file system.
    Jake [01:17:14]: What if you had the whole file system? What if you snapshot it and lazily load the entire file system? Then you get around this problem entirely. You don’t need the ceremony of Dockerfiles, Ansible scripts, or other things. You can iterate, snapshot, ask if it’s the right loop or state, and then merge it into production. Merge the file system.
    Swyx [01:17:45]: Why not?
    Jake [01:17:46]: It’s going to be fun.
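The "snapshot the whole file system and load it lazily" idea can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name, the manifest layout (path to blob ID), and the fetch callback are hypothetical and say nothing about how Railway actually implements snapshots.

```python
from typing import Callable, Dict

class LazySnapshot:
    """Immutable view of a file tree whose contents load on first read."""

    def __init__(self, manifest: Dict[str, str], fetch: Callable[[str], bytes]):
        self._manifest = dict(manifest)      # path -> blob ID
        self._fetch = fetch                  # called only when a blob is needed
        self._cache: Dict[str, bytes] = {}   # blob ID -> contents

    def read(self, path: str) -> bytes:
        blob_id = self._manifest[path]
        if blob_id not in self._cache:       # lazy: fetch on first access only
            self._cache[blob_id] = self._fetch(blob_id)
        return self._cache[blob_id]

    def fork(self, changes: Dict[str, str]) -> "LazySnapshot":
        """Create a child snapshot with some paths pointing at new blobs."""
        child = LazySnapshot({**self._manifest, **changes}, self._fetch)
        child._cache = self._cache           # unchanged blobs are shared
        return child

if __name__ == "__main__":
    store = {"b1": b"v1 config", "b2": b"app binary", "b3": b"v2 config"}
    base = LazySnapshot({"/etc/app.conf": "b1", "/bin/app": "b2"}, store.__getitem__)
    candidate = base.fork({"/etc/app.conf": "b3"})
    print(base.read("/etc/app.conf"), candidate.read("/etc/app.conf"))
```

Taking a snapshot here costs only a copy of the manifest, and the parent stays untouched, which is what makes "iterate, snapshot, then merge" cheap in this model.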
    Swyx [01:17:47]: This is a whole other can of worms, but if you cataloged the stateful things in a VM and developed dedicated solutions for each, you can cut the problem down a lot. It’s surprising people weren’t trying until now.
    Jake [01:18:04]: It has always been surprising to me because these are the things we would work on. It’s obvious.
    Swyx [01:18:11]: At first principles, you need them. Everyone needs them in theory. Then the big clouds don’t do them, so you assume it’s impossible.
    Jake [01:18:18]: Exactly. You think, “Meta has all the people writing eBPF code, and they’re doing something with them.” But you need that kind of work to solve these problems. Whatever is required, however deep we have to go, we’ll go all the way down to the kernel’s TCP/IP stack if needed. If we need to modify something to make it work for the mental model of the universe moving forward, we’ll do it and keep going down.
    Swyx [01:18:52]: That sounds fun.
    Jake [01:18:53]: It’s so much fun. I have to peel myself away from fun, interesting problems to make sure we can scale the company in a way that works. There are so many fun problems: getting information from customers to support to the person who built the thing internally, safe iteration, context from the dashboard to users, drilling down to the infrastructure layer, and managing orchestration as a real-time operating system versus a feedback control system. It’s just so fun.
    Solo Founder Lessons: Obsession, Writing, and Focus
    Swyx [01:19:29]: Speaking of the founder side, you’re famously outside the YC/SF consensus. You go to YC, get a co-founder, and do all these things. You did none of that.
    Jake [01:19:40]: None.
    Swyx [01:19:45]: In the elevator you said a co-founder makes sense if one person is the tech person and the other is the biz dev person. But you have to contain those multitudes yourself. How do you do it?
    Jake [01:19:58]: I try to get eight hours of sleep.
    Swyx [01:20:11]: Is there a balance: 50/50, 30/30/30? What’s the mental model as a solo founder?
    Jake [01:20:17]: There’s no balance. You have to think about all these things and be obsessed with them. Be obsessed with how people think about your product from a go-to-market perspective, and be obsessed with the kernel-level change that makes a user’s SSH connection never drop. I want a universe where you can snapshot everything and it feels like iterating on a VM.
    Jake [01:20:47]: You have to be obsessed at every layer of the stack. That’s what makes it easier for me. Some people are obsessed with different portions of the company journey, and if you can segment those lines well and be clear about ownership, you’ll have a good time.
    Jake [01:21:12]: I said two is the worst number of co-founders because you have no tiebreak. You disagree, and how do you resolve it?
    Swyx [01:21:38]: Usually someone is CEO, so they have the tiebreaker.
    Jake [01:21:43]: Totally. It’s hard every way you cut it. It’s hard if you get help, and it’s hard if you do it yourself. Running things is hard, but it’s so rewarding and fun.
    Swyx [01:21:56]: What have you found useful? A coach? Any advice that has been helpful?
    Jake [01:22:01]: I like to write a lot. I get in trouble a lot for my Twitter. I once said if you’re working weekends, you’re messing up your planning. I’ve gone back and forth on that because right now we’re at an extenuating time where it makes sense to work more. The goals are clear in my mind. If you have the vision and know where you’re going, work harder to distill that vision and do those things.
    Jake [01:22:33]: If you’re not certain and need clarity, disconnect and take your weekends seriously. Write about where you are, what you want to do, where you want to go, and what problems you’re solving.
    Jake [01:22:56]: Writing is important. I don’t love the word meditation, but whatever gets you into mental clarity is important when you’re trying to say, “We’re here and need to be here,” or “We’re here and I think we need to be in this general space for this to work.”
    Jake [01:23:22]: Disconnect, hang out with people you love, and work hard when you’re working. I try to work sunup to sundown, Monday to Friday, all out. I disconnect on Saturday and come back Sunday afternoon to write, plan the week, and do everything else. It works well for me.
    Jake [01:23:43]: Another hot take: most advice should be digested and thrown out the window. If it’s helpful, it’ll come back. You’ll learn it through experience. We have made failure very expensive as a society, and it makes it difficult for people to walk off the paths.
    GPUs, Focus, and the Dominant Role of Agents
    Swyx [01:24:03]: Anything you haven’t tweeted and gotten in trouble with that you want to preview to the world?
    Jake [01:24:12]: The agent stuff is crazy. It’s going to be the dominant way people do pretty much everything, provided we can get the inference required for that to happen. Over the next 10 years, you’ll see a fundamental shift in how people think about authoring the logic in their head.
    Swyx [01:24:36]: One way of phrasing it is: if Allbirds can become a GPU provider, so can Railway.
    Jake [01:24:44]: I think there’s a lot of “everyone becomes a GPU provider” that is actually not becoming a GPU provider. You’re defined more by the things you don’t do than the things you do, because it’s easy to say yes to a lot of things.
    Jake [01:24:56]: Anthropic is amazing and moving into different zones. They’re moving into Figma-like things.
    Swyx [01:25:09]: As we’re recording, Mike Krieger was on Figma’s board, they removed him Monday, and then they launched this today.
    Jake [01:25:18]: Things move fast right now. But agents are going to be the way people operate.
    Swyx [01:25:25]: So your answer is focus: no GPUs for now, but never say never.
    Jake [01:25:27]: Focus. We will not do GPUs now, but we 100% will do GPUs at some point in the future. That’s not me leaking our roadmap because we don’t have plans to do GPUs. It’s just a function of needing FLOPS at some point. If you’re fully vertically integrated and want to make it trivial for people to iterate, build, and deploy, you need access to this core piece of fundamental logic.
    A New Cloud From First Principles
    Swyx [01:25:57]: Presumably your own data center traffic is a minority of your workload right now, but is there a point where it’s a majority or you turn off public clouds?
    Jake [01:26:10]: At some point, we got to 100% data center: our own data centers. Right now, the vast majority of what exists on our platform is on our bare-metal data centers.
    Swyx [01:26:21]: So you’re already there.
    Jake [01:26:23]: Yeah. The transition was completed at some point, and then we grew so fast that we had to scale back on that. It got to 100% on the Datadog dashboard and then divoted back into the 90s because we were adding capacity.
    Swyx [01:26:45]: You’re literally building a new independent cloud, and people assume that could never happen post-AWS.
    Jake [01:26:53]: It’s hard. We’re going to figure out a bunch of things to make sure the platform is deeply reliable. But you have to break ground on new things when you decide to build a cloud from scratch but not copy the hyperscalers.
    Jake [01:27:10]: We’ve been deliberate about inventing our own infrastructure from scratch based on reading a ton of papers, while promising ourselves we wouldn’t copy someone else’s homework. If we copy someone else, we lose. You become them over time. You need a core thesis for why this business needs to exist now.
    Jake [01:27:33]: For us, the activation energy required to deploy something in production on hyperscalers is far too high. We believe it should be instantaneous. There should be no friction between your thought and the reality that comes out and that you can share with friends. That’s what we’re building toward at every layer of the stack. If we have to go down to energy, we’ll go down to energy.
    Jake [01:27:58]: It matters for giving people access to this tooling. It’s gated not just for citizen developers who are now vibe coding. You have multiple layers: citizen developer, front-end developer, back-end developer, DevOps person, and more. Those layers need to disappear so people can just ship.
    Swyx [01:28:20]: Amazing. That’s the future of cloud.
    Jake [01:28:22]: Awesome. Thanks for coming on. Thank you for having me. It’s been wonderful.


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
  • Latent Space: The AI Engineer Podcast

    The Autonomous Drone Tech Stack & Economics of Drones — Yaroslav Azhnyuk, The Fourth Law & Guest Host Noah Smith, Noahpinion

    18/05/2026 | 1h 59min
    The future of war has been evolving before our eyes in Ukraine, yet the West still plans to fight the last war. In this special episode, guest host Noah Smith (@noahpinion) and Brandon Anderson sit down with Yaroslav Azhnyuk (@YaroslavAzhnyuk), a serial tech founder who went from building Petcube to founding The Fourth Law, one of the world’s most advanced AI-guided drone companies. Over two hours we cover the technology, tactics, and geopolitics of drone warfare, and why the modern battlefield has already left the West behind:
    * Yaroslav’s personal history and the Ukraine war [00:01:04 – 00:14:01]
    * The modern drone tech stack: why FPV drones are the new god of war, the future of the rifleman, fiber optic vs. AI, five levels of autonomy, and the eight dimensions of the autonomous battlefield [00:14:01 – 01:05:13]
    * The geopolitics and economics of drones: China’s manufacturing advantage, the drone race, Western defense readiness, countermeasures, and why the gap is widening [01:05:13 – 01:58:57]
    For those looking for Noah Smith’s commentary, it really gets going around the 00:51:31 mark.
    Yaroslav Azhnyuk / The Fourth Law:
    * X: https://x.com/YaroslavAzhnyuk
    * LinkedIn: https://www.linkedin.com/in/yaroslavazhnyuk/
    * The Fourth Law: https://thefourthlaw.ai
    Noah Smith:
    * Substack: Noah Smith
    * X: https://x.com/noahpinion
    Timestamps
    00:00:00 Cold Open: China’s 4 Billion Drones and the Cameras-to-Explosives Pipeline
    00:01:04 Introduction: Brandon, Noah Smith, and Yaroslav Azhnyuk
    00:05:41 From Tech Entrepreneur to Defense: Petcube, Brave One, and the D3 Fund
    00:10:42 The Ethics of Building Weapons: Dual-Use Technology and the Wolf at the Door
    00:14:01 The Tech Stack: Cameras, Autonomy Modules, Interceptors, and a Semiconductor Fab
    00:18:47 Fiber Optic vs. AI: The Radio Horizon Problem and $32/km Cable
    00:25:32 FPV Drones: The New God of War — 70–80% of Frontline Casualties
    00:28:28 The Five Levels of Drone Autonomy: From Terminal Guidance to Full Autonomy
    00:41:37 The Eight Dimensions of the Autonomous Battlefield
    00:45:32 AI Safety and the Morality of Autonomous Weapons
    00:51:31 The End of the Rifleman? Noah’s 2013 Prediction vs. Battlefield Reality
    01:05:13 China’s Manufacturing Advantage and Western Vulnerabilities
    01:24:21 Policy Advice for Western Defense: Defense Valley and the Widening Gap
    01:32:54 The Drone Race: Who’s Ahead, Category by Category
    01:41:57 Countermeasures: Shotguns, Jammers, Lasers, and Fishnets
    01:58:19 The Wedding and Final Takeaway: Be Prepared for War
    Transcript
    Cold Open: China, FPV Drones, and the New Warning Sign
    Yaroslav [00:00:00]: Think about this. Last year, Ukraine produced 4 million FPV drones. Ukraine is not the most industrious nation in the world. China can produce 4 billion of these FPV drones.
    Noah [00:00:10]: Would you say that right now China is now the supreme conventional military power on Earth, given its ability to manufacture and deploy drones in the quantity and quality that you just described?
    Yaroslav [00:00:20]: I don’t think we have all the information to claim that, but we cannot count it out, and that alone should be a big warning sign. As I say, at some point in my life I went from making cameras that fling treats to pets to cameras that fling explosives at the occupiers. So that’s the short story. And when you think about what your nation and your compatriots are going through, you realize that the only morally right thing to do is to fight back, and it is immoral not to fight back, and then the choice becomes very clear.
    Introduction: Yaroslav Azhnyuk, Petcube, and the Last Flight into Kyiv
    Brandon [00:01:04]: Welcome to Latent Space. I’m Brandon. I normally do science podcasts, but today we’re going to do something a little bit different. I’m joined by Noah Smith of Noahpinion on Substack and Twitter, and he has lots of interesting things to say about drones. As a guest, we have Yaroslav Azhnyuk, founder of The Fourth Law and several other drone-related startups. To get started: it is February 23rd, 2022. You are running a pet startup. You’re connecting pets with their owners. Let’s get into a little bit of background. How did you get started in tech, and what were you working on before the war in Ukraine started?
    Yaroslav [00:01:50]: Good to be here. Thank you. On February 23rd, late in the evening, 11:00 PM Kyiv time, my wife and I landed in Kyiv. Actually, she was my fiancée then. We came from Lviv, where we were looking at the church where our wedding was supposed to take place. We got into a cab from the airport to our home, and the driver was like, “You crazy. Everyone’s leaving Kyiv. Why do you come?” We’re like, “What? Nothing’s going to happen. Dude, chill.” And then obviously, eight minutes later, or eight hours later, the bombs fell on the city. It was quite surreal. We probably landed on the last flight that landed in Kyiv, or one of the last flights. As for my background, I’m a tech guy. I studied applied mathematics at Kyiv Polytechnic, born and raised in Kyiv. My parents are PhDs from academia, and my grandparents too, in everything from linguistics to nuclear physics. And I’m an entrepreneur, so I’ve built a bunch of companies. Petcube is the one you were referencing. I lived in San Francisco from 2014 to 2020, building Petcube, which is one of the leading pet device companies in the world, selling lots of pet cameras. And then, as I say, at some point in my life I went from making cameras that fling treats to pets to cameras that fling explosives at the occupiers. So that’s the short story.
    February 24th: Leaving Kyiv as the Invasion Begins
    Noah [00:03:28]: February 24th, I guess a few hours after you went to check out your wedding chapel, what do you do?
    Yaroslav [00:03:37]: We had a plan for this situation. My parents and family live in Kyiv, and we were like, “Okay, this has actually started. The worst has come true.” So we basically packed our belongings, got in the car, and spent 17 hours driving west. I’m pretty sure most people in our audience have watched at least one apocalyptic movie in their life, and it felt exactly like that. Missiles are falling. There was smoke in Kyiv.
    Yaroslav [00:04:20]: My dad and I went to the central part of the city, probably 800 meters from the presidential office, to pick some stuff up at his workplace. He’s the head of an academic institution, so he had to take some things with him. It was super surreal. The streets are empty. The gas stations are out of gas. We found a gas station, but we didn’t have spare canisters with us. The car was diesel, and we figured out that diesel can actually be stored in plastic canisters, so we bought some windshield washer fluid, poured it out of the canisters, and poured the diesel in. So it was like that. And then helping friends get out, like my friend and his dog. My brother was riding in a separate car, and we found a place for my friend who didn’t have a car. It was totally surreal. And of course we didn’t know this would last for so long. We didn’t know whether Ukraine would be able to defend Kyiv. There was very little information and very little insight into the future.
    From Pet Cameras to Defense Tech: Building for Ukraine and the Free World
    Noah [00:05:42]: What are your thoughts on how you defend Ukraine? You eventually start building drones. What is the process of getting from building devices that connect owners with pets to building drones, and what other things did you do to help the war effort along the way?
    Yaroslav [00:06:07]: It’s definitely non-trivial, right? I didn’t get any military education when I was a student. Normally, in Ukraine, you would go to a military school even if you’re getting higher education in another field. I decided to skip that, which is an unusual way to go. And I never thought that I would be engaged in a war effort. What is war? Of course, wars are over. It’s the end of history. One thing you have to understand about many Ukrainians, and I guess it’s also true of most of the people I’ve met here in the US, is that your nationality is a big part of your identity. So when that comes under attack, it’s something deeper than just the country you live in coming under attack, right? On day one, I figured I was going to fight back with everything I could. But I didn’t think on day one that I was actually going to do weapons. We did a bunch of things. We were reaching out to a number of American congresspeople and senators, advocating for support of Ukraine and for voting for Lend-Lease, which passed in May 2022 but didn’t actually work as expected. We helped start Brave One, which is now a very important defense innovation cluster, sort of like the DIU here in the US. We helped start a fund called D3, which was started or co-started by Eric Schmidt, former CEO of Google. So a bunch of these odd things. But by 2023 it was obvious that, A, this thing is going to last a lot longer, and B, the whole world is shifting and there’s going to be a new arms race, with warfare redefined by drones as platforms. For the first time in history, you have a platform that is software-defined, that can increase your battlefield capabilities in a step change, overnight. It’s as if you were able to push a software update and get all of your Roman legionnaires a new helmet. That has never been possible before. It’s the first time in the history of war this is possible. All of that, and many other things like supply chain fragilization and the impact AI is going to have on all of this, became evident to me in 2023, and I thought, “Okay, I should do what I know how to do best: start a tech company and leverage the global techno-capitalist machine to provide defensibility to Ukraine and the free world.” That’s literally the mission of the company: increase the defensibility of Ukraine and the free world. And then there was some soul-searching, asking yourself, “Okay, I actually know nothing about weapons. Am I actually ready to make things that other people use to kill bad people?”
    Yaroslav [00:09:36]: When you think about what your nation, what your compatriots are going through, and think about all the terror of places like Bucha, the occupied cities in the east and south, the abducted children, the raped women, all the economic damage that’s being done, and the intention to destroy a whole nation, to commit genocide against the people of Ukraine, you realize that the only morally right thing to do is to fight back, and it is immoral not to fight back. And then the choice becomes very clear. And look, we’re just passing the ammunition. We’re not doing the actual job. The actual fighters and defenders and heroes are the people in the armed forces. We’re just support.
    The Moral Question: Weapons, Responsibility, and Fighting Back
    Noah [00:10:33]: I have so many questions. Actually, I know you seem to have a question. Do you want to ask anything?
    Yaroslav [00:10:38]: No, I’m just listening. Go ahead.
    Noah [00:10:40]: I do want to talk about some of, let’s say, the moral issues, like you just said. You end-
    Yaroslav [00:10:50]: I think there are no issues there.
    Yaroslav [00:10:52]: What would an example of a moral question be in this case?
    Noah [00:10:55]: No, I mean, okay. As you just said, you are creating the tools, but others are using them.
    Noah [00:11:05]: I was maybe thinking of having this conversation later, but one of the questions is this: you are building them for your homeland, which is, I think, a very strong, morally defensible position, but this technology is not going to stay with you, right?
    Noah [00:11:26]: You will probably be selling these to other people. So the future is really where the moral issues may come into play.
    Yaroslav [00:11:38]: This question becomes easier and more complete if we ask it not about a particular technology or a particular weapon, but recognize that it applies to any kind of technology, right? Take a knife, or fire. You can use a knife to do surgery and save people’s lives, or you can use it as a weapon to take people’s lives.
    Noah [00:12:06]: Cut tomatoes, too.
    Yaroslav [00:12:08]: Cut tomatoes too.
    Noah [00:12:09]: Yes, knife.
    Yaroslav [00:12:09]: That’s helpful.
    Noah [00:12:10]: In Japan, they call sword and knife by the same word.
    Yaroslav [00:12:14]: It’s like that with any technology. Large language models, right? Look at how powerful they are, and yet they’re available to anyone in North Korea or in Russia.
    Yaroslav [00:12:29]: That’s one side of the argument. The other side is: as a maker, what is your responsibility for how the tools you’re creating will be used? There’s definitely some responsibility, right? Then what should the decision process look like? Should you try to calculate all the possible scenarios before starting to work on something? Or do you create something that is needed now to save people’s lives, and then think about addressing the unwanted edge cases later? In an ideal world, or okay, not an ideal world, in a mythical world where there is one governing party that gets to decide everything, and there is no other country that can decide on its own, you could say, “Well, we need to calculate all the consequences, and only then maybe build this building in place of this park, because maybe we need this park in the city,” right? That kind of situation. But when you’re in a forest, in front of a wolf, you’re first going to deal with the wolf that wants to eat you, and then you’re going to go consult Greenpeace. That’s the kind of situation Ukraine is in.
    The Fourth Law, Odd Systems, and Ukraine’s Drone Stack
    Noah [00:13:59]: Fair enough. Because this is a tech podcast, I did want to spend some time talking about the tech that you’ve developed and what you’ve been working on. Can you explain, first of all, the problem you were trying to solve from a technical standpoint? And then maybe go into some of the solutions and some of the design process that led you from designing little lasers guided with an iPhone to building drones.
    Yaroslav [00:14:34]: It so happened that I started one company called The Fourth Law, and its goal was and is to make massively scalable on-drone autonomy. In parallel with that, together with my Petcube co-founders, partners, and friends, we started another company called Odd Systems, which was focused on making thermal cameras. Thermal cameras see thermal radiation and are used to see at night. Those companies are now getting closer and closer together, and we’re probably going to merge them. This group of companies is currently the leading team in on-drone AI and thermal imaging on the Ukrainian battlefield, and likely one of the leading, if not the leading, in the world. We have three business units: cameras, drone autonomy, and drones. The cameras and drone autonomy units sell daytime and nighttime cameras and different types of drone autonomy modules to other drone manufacturers, over 200 drone manufacturers in Ukraine. The UAV business unit sells the drones themselves to the armed forces of Ukraine, the Ukrainian government. And there are different types of drones. There are front strike drones, as we call them, so FPV strike drones and bombers, and then interceptors. And there are different kinds of interceptors. We do Shahed interceptors and we do ISR interceptors. We don’t do the deep strike-
    FPV Drones, Interceptors, and Battery-Powered Warfare
    Noah [00:16:32]: What’s an ISR interceptor?
    Yaroslav [00:16:33]: ISR stands for intelligence, surveillance, reconnaissance. Those are basically drones which the Russians use to watch over positions and then communicate where the targets are.
    Noah [00:16:48]: It’s a reconnaissance.
    Yaroslav [00:16:48]: ISR is sort of the classical term for a reconnaissance drone.
    Noah [00:16:53]: Are all of these drones that you just described battery-powered? ‘Cause I know that the deep strike drones still have, like, some sort of-
    Yaroslav [00:17:01]: Internal combustion engine?
    Noah [00:17:02]: Internal combustion engine. Are all the things you’re talking about battery-powered?
    Yaroslav [00:17:06]: What we’re working on is all battery-powered, right? We don’t do the deep strikes, right? And then in terms of autonomy-
    Noah [00:17:12]: You can catch a Shahed with a battery-powered thing? It’s not too fast to catch?
    Yaroslav [00:17:17]: No, absolutely. Look, Shahed interceptor, like ours, it’s called Zero, it goes up to 326 kilometers per hour.
    Noah [00:17:26]: For reference, how fast is a Shahed?
    Yaroslav [00:17:28]: In the terminal phase it could be 280, but in the cruise phase it’s, like, 220-ish.
    Yaroslav [00:17:36]: Yeah. And sorry, you can convert that into miles if you’re interested.
    Noah [00:17:41]: No, that’s fine.
    Noah [00:17:41]: Multiply by two thirds or point six or something.
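For readers who want the conversion Noah gestures at, here is a small sketch. The exact factor is 1 mile = 1.609344 km, so multiplying by about 0.62 is closer than two thirds. The speeds are the ones quoted in the conversation; the function name is just illustrative.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile

def kmh_to_mph(kmh: float) -> float:
    """Convert a speed from kilometers per hour to miles per hour."""
    return kmh / KM_PER_MILE

if __name__ == "__main__":
    speeds_kmh = {
        "Zero interceptor (top speed)": 326,
        "Shahed (upper figure quoted)": 280,
        "Shahed (cruise)": 220,
    }
    for name, kmh in speeds_kmh.items():
        print(f"{name}: {kmh} km/h is about {kmh_to_mph(kmh):.0f} mph")
```

On these figures the interceptor tops out at roughly 203 mph against roughly 137 mph for a cruising Shahed.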
    Yaroslav [00:17:44]: That’s easy. Yeah, I was saying that for autonomy modules, we make autonomous systems for the frontline, for interceptors, and some for deep strikes as well, at different levels of autonomy. So from terminal guidance, which covers the last 500 meters, give or take, to autonomous bombing, to autonomous target detection, to autonomous navigation, and all of that across day and night, different terrains, different times of the year, and different platforms like quadcopters and fixed wing, and maybe some other platforms. So it’s quite a wide variety of products. We also have our own simulation. We have our own training school for the war fighters. And we’re about to start construction of two semiconductor plants to make sensors for thermal cameras. That’s super exciting for me as a computer science guy: doing semiconductors. Super cool.
    Noah [00:18:49]: In terms of core drone technologies, basically one is an FPV replacement without fiber optics, and the other is-
    Yaroslav [00:18:59]: You-
    Noah [00:18:59]: Signal tracking with interceptors.
    Yaroslav [00:19:00]: With or without fiber optics. Fiber optics is just a sort of communication module.
    Yaroslav [00:19:05]: You can use a classical analog video link and radio link; those would be two separate radios. You can do digital, or you can do fiber optic. Fiber optic has its own advantages, but it also adds weight, decreases the range, and decreases how fast you can turn with the drone.
    Noah [00:19:33]: Do you need AI for fiber optic drones?
    Yaroslav [00:19:36]: You can use AI for fiber optic drones. AI replaces a human, right? Fiber optic makes your communication link more resilient. So those are slightly different goals. If you want, you can have AI controlling hundreds of fiber optic drones instead of having hundreds of operators, one for each.
    Fiber Optics, Radio Horizons, and Terminal Guidance
    Noah [00:20:03]: I guess I thought the key reason people moved to fiber optic drones was electronic countermeasures. Or, I guess, to counter those.
    Yaroslav [00:20:13]: I think that’s a correct assessment from a public awareness standpoint. In practice it’s somewhat more complicated, because besides electronic countermeasures, you have the issue of the radio horizon for FPV drones.
    Yaroslav [00:20:36]: I believe the Earth is round. Some people disagree. But basically, say you fly a drone, and you have a ground station over here and a drone flying over there.
    Yaroslav [00:20:49]: If your drone is flying high, you have good direct radio visibility. But usually Russian infantry and vehicles are on the ground, and if you want to hit them, you need to go low. The lower you go, the more likely you’ll get behind a hill or a forest, and if you’re far enough away, you’ll just get behind the curvature of the Earth. You get into what’s called a radio shadow. That is a real bummer, because for the last 60 or 20 meters you won’t be able to see anything, and it will be very difficult to hit the target. And the distances that these FPV drones operate at can be quite large. For example, here in the US there was this Drone Dominance program competition, and in Drone Dominance the furthest distance was about 10 kilometers.
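The radio-horizon effect described above can be estimated with the standard line-of-sight approximation d ≈ sqrt(2·k·R·h), where R is the Earth's radius, h is the antenna height, and k ≈ 4/3 is the commonly used factor for normal atmospheric refraction. The sketch below assumes flat terrain with no hills or forests, and the 10 m ground-station mast and the drone altitudes are hypothetical values chosen for illustration, not figures from the conversation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def horizon_km(height_m: float, k: float = 4 / 3) -> float:
    """Distance to the radio horizon for an antenna at height_m.

    k is the effective-Earth-radius factor: 4/3 is the usual assumption
    for standard refraction, and k=1 gives the purely geometric horizon.
    """
    return math.sqrt(2 * k * EARTH_RADIUS_M * height_m) / 1000

def max_link_km(station_m: float, drone_m: float, k: float = 4 / 3) -> float:
    """Longest line-of-sight link between two antennas over smooth terrain."""
    return horizon_km(station_m, k) + horizon_km(drone_m, k)

if __name__ == "__main__":
    station_height_m = 10.0  # hypothetical ground-station mast
    for altitude_m in (100.0, 20.0, 2.0):  # hypothetical drone altitudes
        reach = max_link_km(station_height_m, altitude_m)
        print(f"drone at {altitude_m:5.1f} m: link holds out to about {reach:.0f} km")
```

The square-root dependence is why the link is lost exactly when it matters: dropping from cruise altitude to ground level for the final approach removes most of the line-of-sight range, even before terrain is taken into account.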
    Noah [00:21:44]: What was drone dominance? What was that competition?
    Yaroslav [00:21:47]: Drone Dominance is a program started by the US government to accelerate the development of drone technology here in the US.
    Noah [00:21:57]: Got it. And the longest range thing they were using was 10 kilometers.
    Yaroslav [00:22:00]: Was 10 kilometers, right. In Ukraine, if your drone doesn’t fly at least 20 or 25, no one’s interested in it. Many hits are happening between 30 and 40 kilometers, and that’s what’s expected from a regular 10-inch FPV drone. At that distance, even at altitudes of like 60 to 100 meters, you might start losing the link. So some of the earlier AI technology that was fielded in FPV drones was this terminal guidance technology. That was the first product that we ever launched. It helps you as an operator: once you see the target from 200, 300, 500 meters, you lock onto the target, and then it just drives the drone towards the target no matter what, even after you’ve lost the visual connection. So optic fiber solves that. However, if you want to go like 20 kilometers with optic fiber, that will add an extra three kilos of useful weight to your drone. So
    Noah [00:23:12]: ‘Cause the cable that you have to unspool as you go weighs.
    Noah [00:23:15]: It is heavy.
    Yaroslav [00:23:15]: First, the spool is about 800 grams, so a bit less than a kilo, and then 10 kilometers of optic fiber is another kilo, something like that. That takes away from your useful mass, and now you need a 15-inch drone, and it can only carry maybe one or two kilos of explosives if you want to go 20 kilometers. If you want to go to 30 or 40, well, 30 is probably the max. 40 is very problematic on optic fiber. And then the problem with optic fiber is that it’s actually getting super expensive. And why? Because of all the data centers for AI. That’s literally the same optic fiber-
    Noah [00:24:01]: We’re running out of centers
    Yaroslav [00:24:02]: That’s being used there.
    Yaroslav [00:24:02]: Like, when Ukrainians and Russians come to Chinese factories to buy the optic fiber, they’re like, “We’re out. We sold it out to the Americans.” That’s the craziest thing. So optic fiber went up in price from like $4 per kilometer to like $32 per kilometer in a few months at the beginning of this year. And I’ve
    Brandon [00:24:26]: Claude Code is stopping the Russian drone effort here.
    Yaroslav [00:24:30]: Ukrainian as well. Yeah.
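    The weight and price figures quoted above fit together with simple arithmetic. The sketch below only restates the numbers from the conversation (an 800-gram spool, roughly one kilo per 10 kilometers of fiber, and a price move from $4 to $32 per kilometer); the linear mass model and the function names are assumptions made for illustration.

```python
SPOOL_KG = 0.8           # empty spool, "about 800 grams"
FIBER_KG_PER_KM = 0.1    # "10 kilometers of optic fiber is another kilo"


def fiber_mass_kg(range_km: float) -> float:
    """Mass of spool plus fiber, assuming mass grows linearly with length."""
    return SPOOL_KG + FIBER_KG_PER_KM * range_km


def fiber_cost_usd(range_km: float, usd_per_km: float) -> float:
    """Cost of the fiber alone at a given per-kilometer price."""
    return range_km * usd_per_km


if __name__ == "__main__":
    for range_km in (10, 20, 30):
        mass = fiber_mass_kg(range_km)
        old_cost = fiber_cost_usd(range_km, 4)
        new_cost = fiber_cost_usd(range_km, 32)
        print(f"{range_km} km: {mass:.1f} kg of fiber kit, ${old_cost:.0f} before vs ${new_cost:.0f} after")
```

    For 20 kilometers this gives about 2.8 kg, in line with the "extra three kilos" mentioned, and the fiber cost per drone rises eightfold at the quoted prices.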
    Brandon [00:24:31]: Ukrainian. But I read somewhere that the Russians had grown more dependent on fiber optic drones relative to the Ukrainians, and that’s one reason why the Ukrainians have sort of regained the initiative in drones recently.
    Brandon [00:24:42]: How accurate’s that?
    Yaroslav [00:24:43]: The Russians were the first ones to scale that. I think as of now, Ukraine has caught up. As of maybe three months ago, Ukraine has mostly caught up on fiber optic. Yeah.
    Brandon [00:24:57]: In terms of FPV drone damage, what percent would you say is now fiber optic versus, like, autonomous?
    FPVs as the New God of War: Tanks, Artillery, and Cost per Kill
    Yaroslav [00:25:07]: For our audience, I actually cannot answer that question. I know the answer, but I would not disclose that. But for our audience, I think another interesting fact is that out of all the casualties on the front line, between 70 and 80% are caused by FPV drones.
    Brandon [00:25:30]: FPV drones are the new universal weapon of warfare.
    Yaroslav [00:25:34]: It’s
    Brandon [00:25:35]: Land warfare, anyway
    Yaroslav [00:25:35]: They used to say that artillery is the god of war, because artillery used to cause, like, 80% of casualties, and now, on that ranking-
    Brandon [00:25:46]: FPV
    Yaroslav [00:25:47]: FPV drones rule.
    Brandon [00:25:48]: FPV drones are the god of war.
    Yaroslav [00:25:51]: Sort of. They dethroned artillery. But that’s not to say that artillery is not useful or not needed. All of these systems are needed. Maybe except cavalry, although Russians still use it. Have you seen the videos of Russians using mules and horses?
    Brandon [00:26:09]: What is the usefulness-
    Yaroslav [00:26:10]: It’
    Brandon [00:26:10]: Of a tank in the modern-
    Yaroslav [00:26:11]: That’s where we need Greenpeace to say a word, but they’re silent. Yeah.
    Brandon [00:26:15]: What’s the use of a tank on the modern battlefield?
    Yaroslav [00:26:21]: It’s diminishing.
    Brandon [00:26:22]: Diminishing.
    Yaroslav [00:26:22]: However, I think there might be technologies which will revive the tank. Look, a tank still provides you armor, and armor is important. You still need armor and firepower, right? Like, you can have an armored personnel carrier that provides you armor. The challenge that currently exists is that armor is not very well protected against incoming drones. However, there are ways to protect it. We were talking about this before the podcast. The CEO of Rheinmetall recently sort of ridiculed the Ukrainian drone industry, saying that there is nothing interesting there, no real innovation, nothing that stands up compared to, like, Rheinmetall or Boeing, and that it’s all made by housewives. There was obviously a ton of memes about this, with people ridiculing the CEO of Rheinmetall. And one of the best quotes I heard on this topic is from my friend Alexey Babenko, who’s the head and founder of VIARI Drone, which is one of the largest manufacturers of FPV drones. They’re our partner. They’re using our autonomy. He said that the drones we manufacture in one day will be more than enough to destroy all the tanks Rheinmetall manufactures in a year.
    Yaroslav [00:27:52]: Then, yeah, cost-wise, of course, a drone is like $500 and a Rheinmetall tank is, what, probably 5 million-ish or maybe more.
    Brandon [00:28:00]: Don’t mess with those housewives.
    Yaroslav [00:28:03]: Drone wives.
    Brandon [00:28:04]: Drone wives.
    Yaroslav [00:28:06]: That’s it.
    Noah [00:28:06]: There’s a classic saying that everyone always fights the last war.
    Noah [00:28:12]: So from your standpoint, how did we get to the point where tanks became irrelevant, at least for now, in a matter of just a few years?
    Yaroslav [00:28:24]: Look, I think it’s the same way: how did we get to the point that calculators became irrelevant?
    Yaroslav [00:28:31]: Now we have iPhones. Like, why would you need a calculator? Technology progresses and its influence grows non-linearly. It’s all exponential. So I can tell you that full autonomy, when you put it on a drone... Look, if you think about a tank, it’s not a direct comparison, but even take a drone and an artillery shell in terms of, sort of, cost per kill. An artillery shell for 155 caliber, which is a standard NATO caliber, currently has a market price of about $4,000 per piece. Compare that to, say, $400 per drone. The shell is 10 times more expensive. Account for the amortization of the artillery gun, for how vulnerable it is, and for the tactical capabilities it gives you as compared to a drone, and you’ll figure out that an FPV drone is maybe three orders of magnitude more versatile, more useful, more capable than classic artillery. There are different types of artillery, not just the 155. You have mortars, you have all that. But give or take, roughly three orders of magnitude, maybe. Again, it doesn’t have that firepower. It’s still not a one-to-one comparison.
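    The unit-cost comparisons quoted in this part of the conversation reduce to simple ratios. The sketch below uses only the prices mentioned by the speaker ($4,000 per 155 mm shell, $400 or $500 per FPV drone, roughly $5 million per tank); the helper name is hypothetical, and the calculation says nothing about effectiveness, which the speaker treats as a separate and rougher estimate.

```python
SHELL_155MM_USD = 4_000       # quoted market price per 155 mm shell
FPV_DRONE_USD = 400           # drone price quoted in the artillery comparison
TANK_USD = 5_000_000          # "probably 5 million-ish or maybe more"
FPV_DRONE_VS_TANK_USD = 500   # drone price quoted in the tank comparison


def units_per_budget(budget_usd: int, unit_cost_usd: int) -> int:
    """How many whole units a given budget buys."""
    return budget_usd // unit_cost_usd


if __name__ == "__main__":
    print("drones per shell:", units_per_budget(SHELL_155MM_USD, FPV_DRONE_USD))
    print("drones per tank: ", units_per_budget(TANK_USD, FPV_DRONE_VS_TANK_USD))
```

    At the quoted prices, one shell costs as much as 10 drones and one tank as much as 10,000 drones. The "three orders of magnitude" figure is the speaker's broader judgment about versatility and is not derived from these ratios.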
    Yaroslav [00:29:53]: Now, take that FPV drone. When you put full autonomy on that FPV drone, which can be not very expensive, like, the systems that we’re producing are in the hundreds of dollars of pure BOM
    Full Autonomy: From Human Pilots to Smartphone-Directed Drone Missions
    Noah [00:30:06]: Just to interrupt: you said full autonomy. Just a second ago you were saying that the autonomy here is guidance, right? It’s not decision-making.
    Yaroslav [00:30:14]: No, I was saying that’s the first and sort of easiest piece of autonomy that was fielded by us. But if you add full autonomy to a drone
    Brandon [00:30:24]: I think he’s asking: can you, for the listeners, explain what the term full autonomy means?
    Yaroslav [00:30:29]: Basically, I think a good way to think about an FPV drone is as an iPhone of warfare. It’s very inexpensive, very mass producible, very versatile. You don’t need a bunch of other things when you have an iPhone in your pocket. You don’t need an MP3 player, you don’t need a calculator, you don’t need other things. All right? So an FPV drone is an iPhone. Or, okay, Apple please don’t sue me, a smartphone. And then, when you add autonomy, it sort of becomes like Uber or ride sharing. Okay? What it means is, instead of being a trained pilot who has this complex remote controller device, which requires a couple of months of training to actually pilot the drone, and then having to pilot it for 30 minutes, flying towards the target, et cetera, now you basically have your smartphone, you have a drone, you pick up your smartphone, and you say, “We are here. The bad guys are here. Go and get them.” And the drone goes up, flies in a given direction, localizes itself on the map, finds the dedicated area where the bad guys are supposed to be, sees the bad guys, bombs them, watches, so it does a damage assessment, returns back, and sits down, and then you can pick it up and watch the video if you didn’t have the radio link, right?
    Noah [00:31:59]: That’s a bomber drone.
    Yaroslav [00:32:00]: That’s full autonomy for a bomber drone, right?
    Noah [00:32:03]: You’re saying that no human decision is made in this entire process?
    Brandon [00:32:06]: That’s not, that’s not what he’s saying.
    Yaroslav [00:32:07]: A human decision was made at the beginning of the process-
    Noah [00:32:09]: I get it. I get it
    Yaroslav [00:32:09]: The same way as you would fire an artillery.
    Yaroslav [00:32:12]: When you fire artillery, you don’t stop the shell 500 meters away from a target and ask whether you want to strike or not. Exactly: a human decision is always made at some point. So that’s full autonomy, and such full autonomy is happening as we speak. And such full autonomy increases the capabilities of an FPV drone, which is already, like, three orders more powerful than an artillery shell. Full autonomy increases its capabilities by four orders of magnitude, because now you can have 100 times as many people who can use it, because you don’t need to train those people, and this is important. You can have 10 times the mission success rate, and you can have 10 times the utility per drone, because now, instead of being a one-way kamikaze, it can be a bomber.
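    The "four orders of magnitude" claim is the product of the three factors just listed. The sketch below simply multiplies them; treating the factors as independent multipliers is an assumption on top of the speaker's rough estimates, and the dictionary keys are paraphrases rather than defined terms.

```python
import math

# Rough multipliers as stated in the conversation; none of them is a measured value.
AUTONOMY_MULTIPLIERS = {
    "people able to operate the drone": 100,
    "mission success rate": 10,
    "utility per drone (reusable bomber vs one-way)": 10,
}


def combined_multiplier(multipliers: dict[str, float]) -> float:
    """Product of all factors, assuming they compound independently."""
    return math.prod(multipliers.values())


def orders_of_magnitude(value: float) -> float:
    """Base-10 logarithm, i.e. how many powers of ten the value spans."""
    return math.log10(value)


if __name__ == "__main__":
    total = combined_multiplier(AUTONOMY_MULTIPLIERS)
    print(f"combined: {total:,.0f}x, about {orders_of_magnitude(total):.0f} orders of magnitude")
```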
    Brandon [00:33:05]: Now wait, you said 10 times the mission success rate, which means that fully autonomous bomber drones succeed in their missions 10 times more often than human piloted bomber drones do. That’s an important thing to know.
    Noah [00:33:17]: Maybe, to push back on
    Brandon [00:33:19]: They’re superhuman. They’re 10X superhuman.
    Yaroslav [00:33:22]: They’re not vulnerable to electronic warfare. They don’t care about the radio horizon. They don’t lose track during navigation. They are not susceptible to human error, like when an artillery shell or another drone blows up beside you and you’re like, “Hell no, I’m getting out of here.” Right? That doesn’t happen to an autonomous drone. All of those things. Like, one of the brigades that’s using our drones with just first level autonomy literally said that their success rates-
    Brandon [00:33:53]: What’s first level autonomy?
    Yaroslav [00:33:54]: First level autonomy is just the terminal guidance.
    Yaroslav [00:33:57]: By the way, we have video of that. We can watch that.
    Brandon [00:33:59]: Terminal guidance means a human gets it nearby and then the AI takes over.
    Yaroslav [00:34:03]: The human flies it all the way, like 30 kilometers towards the target, and obviously the target was probably given to that human by someone who’s flying some ISR drone, some reconnaissance drone, right? So all the way to the target, and once you see the target from a distance of 500 meters, you do a target lock, and from there the drone flies autonomously. Just that feature alone has increased one guy’s mission success rate, his call sign is Grom, from 20% to 71%, and it also increased his kill zone from three kilometers to 10 kilometers. That means there’s a certain area around the front line which is a designated kill zone. Whenever the enemy goes into that area, it’s almost guaranteed to be destroyed by a drone. And then obviously the drones are not launched from, like, the zero line. They’re usually launched from, like, minus 10 kilometer-
    Mission Success, Failure Modes, and the Five Levels of Autonomy
    Brandon [00:35:03]: What is a zero line?
    Yaroslav [00:35:05]: Zero line is sort of an imaginary line of control between two conflicting forces.
    Brandon [00:35:14]: It’s important to explain these things to a lot of the listeners who are
    Yaroslav [00:35:17]: Thank you for asking
    Brandon [00:35:18]: Unfamiliar with warfare.
    Noah [00:35:20]: Myself.
    Noah [00:35:20]: I’m one of those listeners.
    Brandon [00:35:20]: You said that level one autonomy, in other words just terminal guidance, just, like, human gets it to the finish line and then it goes over the finish line, increases mission success from 20 something percent to 71%, or something like that.
    Yaroslav [00:35:33]: Increases the kill zone
    Brandon [00:35:34]: Increases the kill zone
    Yaroslav [00:35:34]: Three kilometers to 10 kilometers.
    Brandon [00:35:36]: Got it.
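    One way to read the jump from 20% to 71% is in drones spent per successful mission. The sketch below assumes each sortie is an independent attempt with a fixed success probability, so the expected number of attempts per success is 1/p. That independence is a simplifying assumption, not something stated in the conversation, and the function name is illustrative.

```python
def expected_drones_per_success(success_rate: float) -> float:
    """Mean number of independent attempts needed for one success (geometric model)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1 / success_rate


if __name__ == "__main__":
    before = expected_drones_per_success(0.20)   # quoted rate without terminal guidance
    after = expected_drones_per_success(0.71)    # quoted rate with terminal guidance
    print(f"before: {before:.1f} drones per success, after: {after:.1f}")
    print(f"kill zone depth ratio: {10 / 3:.1f}x")  # quoted 3 km -> 10 km
```

    Under this model the quoted improvement cuts the expected drones per successful mission from five to roughly one and a half.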
    Yaroslav [00:35:36]: On both parameters-
    Brandon [00:35:37]: What is full autonomy, dude? And
    Noah [00:35:38]: Actually, real quick, can we define mission success and, like, maybe what the failure modes of missions are?
    Brandon [00:35:44]: I have a guess what mission success is.
    Noah [00:35:46]: But I could
    Brandon [00:35:47]: Get ‘em.
    Yaroslav [00:35:49]: No, but that’s a very good question, in fact, because even if you fly into the target, well, first, the target can be damaged or destroyed. Those are two different modes. Then there can be different targets. A sole infantryman is one kind of target. A dugout where there are supposedly some enemies is another kind of target, and some mechanical equipment is another type of target. Then radio emitting equipment: often the target that the military wants to get more than anything else is some enemy radio tower, or some small radio dish, that really makes life difficult in that combat area. So those are different targets, right? It can be destroyed, it can be damaged. Then sometimes the drone hits but doesn’t explode. That happens. And then there are other failure modes. You didn’t even reach the target because, A, you were jammed by electronic warfare; B, you lost control over the drone because of the radio horizon; C, you were jammed by a different type of electronic warfare that happens way before you hit the target area and impacts your video receiver. Jamming on video and jamming on control are two different types of jamming. Then something malfunctioned on the drone, just a mechanical malfunction, maybe a motor broke or whatever. So all of those are different failure modes. Or maybe you got lost navigating to your target. That happens, too.
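    The outcomes listed in this answer form a small taxonomy that could be used to label mission records consistently. The sketch below is one hypothetical way to encode it; the class and member names are invented for illustration, and counting "damaged" as a success is an assumption, since the conversation does not settle how success is scored.

```python
from collections import Counter
from enum import Enum


class MissionOutcome(Enum):
    TARGET_DESTROYED = "target destroyed"
    TARGET_DAMAGED = "target damaged"
    HIT_NO_DETONATION = "hit but did not explode"
    CONTROL_LINK_JAMMED = "control link jammed"
    VIDEO_LINK_JAMMED = "video link jammed"
    LOST_BEHIND_RADIO_HORIZON = "control lost behind the radio horizon"
    MECHANICAL_FAILURE = "mechanical malfunction"
    NAVIGATION_ERROR = "got lost while navigating"


# Assumption: both destroyed and damaged count as a successful mission.
SUCCESSES = {MissionOutcome.TARGET_DESTROYED, MissionOutcome.TARGET_DAMAGED}


def success_rate(log: list[MissionOutcome]) -> float:
    """Share of logged missions whose outcome is in SUCCESSES."""
    if not log:
        raise ValueError("log must contain at least one mission")
    counts = Counter(log)
    return sum(counts[outcome] for outcome in SUCCESSES) / len(log)
```

    Separating video jamming from control jamming, as the speaker does, matters here because the two would otherwise collapse into a single "jammed" bucket and hide which link is failing.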
    Noah [00:37:41]: With level one autonomy, basically you manage to point in a direction.
    Noah [00:37:49]: You go there, and then for the last mile the drone takes over.
    Yaroslav [00:37:52]: I defined this, but it sort of got picked up by the industry. We define five levels of autonomy. Level one is terminal guidance. It’s what we just discussed. Level two is bombing. Level three is autonomous target detection and engagement decision. Level four is autonomous navigation. And level five is autonomous takeoff and landing.
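    The five levels just listed map naturally onto an ordered enumeration. This is a minimal sketch of that mapping; the identifiers and the `describe` helper are hypothetical, and only the level numbers and their descriptions come from the conversation.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    TERMINAL_GUIDANCE = 1
    BOMBING = 2
    TARGET_DETECTION_AND_ENGAGEMENT = 3
    AUTONOMOUS_NAVIGATION = 4
    AUTONOMOUS_TAKEOFF_AND_LANDING = 5


# Descriptions as given by the speaker.
DESCRIPTIONS = {
    AutonomyLevel.TERMINAL_GUIDANCE: "terminal guidance",
    AutonomyLevel.BOMBING: "bombing",
    AutonomyLevel.TARGET_DETECTION_AND_ENGAGEMENT: "autonomous target detection and engagement decision",
    AutonomyLevel.AUTONOMOUS_NAVIGATION: "autonomous navigation",
    AutonomyLevel.AUTONOMOUS_TAKEOFF_AND_LANDING: "autonomous takeoff and landing",
}


def describe(level: int) -> str:
    """Human-readable label for a level number from 1 to 5."""
    autonomy_level = AutonomyLevel(level)  # raises ValueError for unknown levels
    return f"Level {autonomy_level.value}: {DESCRIPTIONS[autonomy_level]}"


if __name__ == "__main__":
    for level in AutonomyLevel:
        print(describe(level))
```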
    Noah [00:38:15]: Those are good things to know
    Yaroslav [00:38:16]: Those are five levels of autonomy. Now, if you
    Noah [00:38:19]: I have a question for you.
    Yaroslav [00:38:19]: Sorry. Like, let me finish with
    Noah [00:38:21]: Sorry
    Yaroslav [00:38:21]: Theoretical part.
    Noah [00:38:23]: What is Tesla running at right now?
    Yaroslav [00:38:25]: Tesla?
    Noah [00:38:25]: No, sorry.
    Yaroslav [00:38:26]: That’s a very good point. It was inspired by exactly that, the levels of self-driving autonomy.
    Noah [00:38:32]: Waymo’s level five, right?
    Noah [00:38:35]: You just tell it where you want to go, it picks you up, and then you go there.
    Yaroslav [00:38:36]: I think if you look at the classic definitions for self-driving cars, Waymo is still, like, level four, because it still requires human control, even if remote. If a Waymo gets in trouble, there is an operator who takes over and resolves this. So that would still be level four. It doesn’t map directly, but it’s also five levels.
    Brandon [00:38:58]: Can I interject a question here? In terms of an FPV drone that’s like a suicide drone that’ll just blow itself up killing something, how do you know what it hit? Like, does it just transmit back, or do you sort of lose track of it and hope it hit? Like, what happens with that?
    Yaroslav [00:39:16]: That’s a great question. So
    Brandon [00:39:18]: You need another drone
    Yaroslav [00:39:19]: The current battlefield in Ukraine is saturated with different types of drones. Obviously you have all the FPV drones. Last year alone, Ukraine manufactured about 4 million of these, and Russia maybe 20% less than that. For this year, the publicly voiced target was 7 million on the Ukrainian side. So we’re getting into serious numbers here. Besides those, there are different reconnaissance drones, ISR as we call them. There is tactical level ISR, where both Ukrainians and Russians usually use the Mavic drone by DJI. Then there are a bunch of locally produced drones, which are fixed wing drones that can stay in the air for much longer than a Mavic, maybe, like, half an hour. And then there are drones that can stay up for many hours or even up to a day. Those drones are more expensive, have more expensive cameras, et cetera. We hunt the drones that the Russians launch. The Russians hunt our drones, and so on. But ideally, when you are a group of soldiers operating an FPV, you’ll have someone in your company or your platoon who has an ISR asset that will do target designation for you. They’ll say, “Oh, there’s a Russian vehicle over there. Go and get him.” And you go there, you get it, and they’re like, “Okay, confirmed.”
    Battlefield Surveillance and the Eight Dimensions of Autonomy
    Brandon [00:40:57]: Those guys are watching. They have their own drones in the sky.
    Yaroslav [00:40:59]: Target destroyed. They have, like, a carousel of drones, because one Mavic cannot stay up more than 30 minutes. It
    Brandon [00:41:06]: They’re constantly surveilling the battlefield.
    Yaroslav [00:41:07]: Almost every spot on the battlefield.
    Yaroslav [00:41:11]: It’s not always the case. Sometimes you will not have a surveillance asset, so then you would launch another FPV just to confirm that there was a hit. Then if you see there was a hit and you’re not sure if it was completely destroyed, you maybe hit again for good measure.
    Brandon [00:41:26]: You double tap.
    Yaroslav [00:41:28]: That’s how it works. But I was about to give you another piece of taxonomy. So you have five levels of autonomy, right? Then you have sort of eight dimensions of the autonomous battlefield. What are the eight dimensions? It’s crucial to understand how autonomy evolves in a modern battlefield environment. Dimension number one is level of autonomy: what are the capabilities that your asset has? Dimension number two is the platform you’re operating on. It can be a quadcopter or a fixed wing drone, maybe a long range drone or a short range drone, but it can also be a missile. You can have autonomy even on an artillery shell or a ground vehicle or a sea vehicle. All of those are different platforms. Level three would be domain. So it’s ground to ground, or ground to air as an intersection, or ground to sea or sea to air. There are all the nuances with different domains. Then level four would be higher levels of autonomy, such as swarming, drone carriers, drone nests, et cetera.
    Brandon [00:42:39]: Now when you’re saying level, you’re talking about dimensions, not about-
    Yaroslav [00:42:42]: Sorry. Yeah
    Brandon [00:42:43]: Autonomy levels. So dimension four.
    Yaroslav [00:42:43]: Dimension. Yeah, I was supposed to say dimension. I say dimension because each of them works with the others, right? So you might have, like, third level autonomy on a fixed wing drone operating in land to air, and stuff like that, right? And then operating in a swarm or operating from a nest. Right? Then dimension number five is environment. Is it day or night? Is it summer or winter? Is it humid, cold, dry? What kind of target is it? Is your target hiding in a forest, or is it behind a hill or within buildings? All of that is environment. Then dimension number six is command and control. How are you dealing with, like, tens of thousands of those assets around the battlefield? How are you coordinating that on the higher levels of command? How are you collecting data? All that.
    Yaroslav [00:43:44]: Dimension number seven would be infrastructure, so things like simulation, data collection tools, security, deployment mechanisms, et cetera. All those systems have to be developed separately and integrated with all the others. And finally, dimension number eight is sort of distribution. Have you deployed 100 of these systems or 100,000 of these systems? Because those are two very different ballgames. So that now gives you a broader overview of how autonomy propagates across the battle space.
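    The eight dimensions can be read as the fields of a single record, with one value per axis. The sketch below is a hypothetical encoding: the field names are invented, and the example instance takes level three, fixed wing, land to air, and swarm from the speaker's own example, while the remaining values are placeholders made up to complete the record.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AutonomyProfile:
    autonomy_level: int        # dimension 1: level of autonomy, 1 to 5
    platform: str              # dimension 2: quadcopter, fixed wing, missile, ...
    domain: str                # dimension 3: ground to ground, ground to air, ...
    collective_behavior: str   # dimension 4: swarming, drone carriers, drone nests
    environment: str           # dimension 5: day or night, season, terrain, target type
    command_and_control: str   # dimension 6: how many assets are coordinated, and how
    infrastructure: str        # dimension 7: simulation, data tools, security, deployment
    deployed_units: int        # dimension 8: distribution, e.g. 100 vs 100,000 systems


# Placeholder values except for the first four, which follow the speaker's example.
example = AutonomyProfile(
    autonomy_level=3,
    platform="fixed wing",
    domain="land to air",
    collective_behavior="swarm",
    environment="night, winter, forest",
    command_and_control="battalion level",
    infrastructure="simulation and data collection",
    deployed_units=100,
)

if __name__ == "__main__":
    for number, field in enumerate(fields(example), start=1):
        print(f"dimension {number}: {field.name} = {getattr(example, field.name)!r}")
```

    Keeping the axes as separate fields reflects the speaker's point that each dimension combines with the others instead of forming a single scale.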
    Targeting, Human Responsibility, and Rules of Engagement
    Noah [00:44:23]: As someone who has done machine learning, gone out of distribution, and had things go horribly wrong: several of these axes of thinking about drone warfare seem like they could be very susceptible to some sort of distribution shift if you start making things autonomous.
    Yaroslav [00:44:41]: Like what?
    Noah [00:44:41]: I mean Well, first of
    Yaroslav [00:44:43]: I’m very interested in the sort of scenarios that you’re thinking about.
    Noah [00:44:48]: The most obvious one: if I assume these are computer vision guided systems for at least the last mile, how do you ensure that, say, when you have some fog roll in or something, the drones don’t just attack the wrong thing? It probably will not turn around and fly back and attack you, but you
    Yaroslav [00:45:10]: It’s the same question: how do you ensure that your mortar fire hits the right thing? Well, with mortar fire, it could be plus or minus half a kilometer, give or take. So maybe you fire one, and then you fire another. Drones are actually much better at being precise in those scenarios. And to your point, I think five to 10 years from now it will be immoral to use weapons without AI.
    Yaroslav [00:45:44]: ‘Cause weapons without AI will be more likely to cause collateral damage or unwanted damage. In the same way, it will be immoral to drive your own car manually on a public road, because it’s more likely to cause unwanted damage.
    Noah [00:46:02]: Wow, I never considered that might
    Brandon [00:46:04]: Really? That’s definitely coming.
    Yaroslav [00:46:07]: Anyway.
    Brandon [00:46:07]: No, but that’s, I don’t know, it’s an obvious thought. I agree with you.
    Brandon [00:46:12]: No, obviously they’re not going to let you drive once most of the cars on the road are autonomous.
    Noah [00:46:17]: No, that one, don’t I believe.
    Yaroslav [00:46:19]: No, I think you were talking about drones, right?
    Brandon [00:46:21]: The drones, right. Cool.
    Yaroslav [00:46:22]: The weapons, right?
    Brandon [00:46:23]: Friendly fire and collateral damage and stuff like that is all minimized with AI.
    Brandon [00:46:27]: Here’s my question. Let’s go to level six autonomy. Let’s take all of the target selection. Let’s take all the battlefield data, integrate it into one big AI, and have that big AI basically be in command of the battlefield and agentically do target selection.
    Yaroslav [00:46:44]: Be the general, right?
    Brandon [00:46:44]: It’s a general. It’s, you’ve cut humans out of the loop except maybe as dexterous robots, repairing drones and fastening things to drones or maybe something like that because you don’t have those robots yet. How soon are we there? AI general.
    Yaroslav [00:46:58]: The most important thing to ask ourselves is who will be faster to that: us or our adversaries?
    Brandon [00:47:07]: I assume us, but how fast will we be to that? I hope us.
    Yaroslav [00:47:11]: I hope so too.
    Brandon [00:47:12]: How fast can we... Like, when are we looking at that, in terms of time horizons, years?
    Yaroslav [00:47:18]: Technically, it could be done now. Of course, there’s some engineering work to be done. The bigger challenge is deployment, right? Take the operation in Iran. Publicly, it was claimed that, I think, a Palantir system was used for target designation, et cetera. So it is not exactly as you say, where the AI makes all the decisions, but basically the AI goes through all the data you have, gives you these 1,027 different targets, and says, “To confirm, please press Okay.” And you look at the targets and you’re like, “Yeah, sounds right. Press Okay.” So I think that’s where we are now already, or where we were a couple of weeks ago, as we’re recording this on April 10th. Another question is how massively deployable it is. Is every decision being made like that, or just some of the decisions? And then there are different levels of command and control: the platoon, the company level, the battalion, et cetera. But the tricky thing when we get into that territory is: if your enemy is getting the advantage of being a thousand times faster than you by deploying such systems, what do you do?
    Yaroslav [00:49:10]: You got to-
    Brandon [00:49:12]: If the enemy is a thousand times faster than you at deploying those systems?
    Yaroslav [00:49:16]: Like, if the enemy starts deploying level six autonomy, as you call it, and you have not started doing
    Brandon [00:49:22]: You’re in trouble
    Yaroslav [00:49:23]: Yes, exactly. So you have to catch up. My point is that it is very important to think about the safety of these systems, but that thinking should not slow you down in developing them, because they are critical for your survival, right? And one person who doesn’t get to think about the ethics of the war is a dead person. That person surely doesn’t get to think about that.
    Brandon [00:49:52]: What would be the safety risk of such a system?
    Yaroslav [00:49:55]: Of course-
    Brandon [00:49:56]: Friendly fire?
    Yaroslav [00:49:56]: Just wrong decisions, right?
    Brandon [00:49:59]: I see.
    Yaroslav [00:49:59]: Maybe, these decisions-
    AI Command Decisions, Dead Zones, and Complex Battlefields
    Brandon [00:50:06]: Skynet AI decides it’s going to use
    Yaroslav [00:50:08]: No, these-
    Brandon [00:50:08]: Drone army to kill us
    Yaroslav [00:50:09]: Decisions will not only be made about drones. They are likely to be made about what the humans should do on your side as well. Then obviously some environments are more like the Ukrainian-Russian war, where you have
    Brandon [00:50:26]: It will have to choose to risk lives. It will have to choose to sacrifice human lives-
    Yaroslav [00:50:28]: Of course
    Brandon [00:50:29]: On your side.
    Yaroslav [00:50:29]: Of course. And then some environments are just, like, dead zones, and there are no civilians there, or virtually no civilians close to the front line, because it’s super dangerous. Everyone has evacuated from there. But there are other environments which are more like, okay, there’s a counterterrorist operation. There’s a group of terrorists or a group of civilians. Or take the recent operations in Iran: I imagine that the US and Israeli forces do not want to harm civilians. They only targeted the military targets there, right? So in those situations, it’s a different level of responsibility for that decision-making as well. And then there is just such a big variety of those military missions, and I’m not well-informed or well-educated enough in military science to tell you about all those scenarios. We would need to put some general beside me, and maybe a Ukrainian general and an American general would tell you very different stories about these things.
    Brandon [00:51:34]: Got it. Can I ask a few more questions? All right. So in 2013, one of my first paid articles ever was about how the era of drones will change human society. I was just sitting around bored, thinking about things.
    Yaroslav [00:51:54]: You were way ahead of your time.
    Brandon [00:51:55]: I said, I said, “The following will happen.”
    Yaroslav [00:51:57]: This article is real. I’ve read it.
    Yaroslav [00:51:58]: It’s actually-
    Brandon [00:51:59]: I said small autonomous suicide drones will cleanse the battlefield of human infantry. Human infantry will not be able to stand against swarms of AI-powered suicide drones. I didn’t even know about, like, AlexNet at the time, I think.
    Yaroslav [00:52:19]: You’re just an avid sci-fi reader.
    Brandon [00:52:23]: I’m an avid sci-fi reader, but also, like, there will be a way to do that. It’s a nonlinear multidimensional search problem, and if you get enough compute, you’ll find some search algorithm that will get you there. And so
    Brandon [00:52:38]: I, yeah, I think that one sentence describes the bitter lesson right there.
    Brandon [00:52:41]: It’s just a multidimensional search space. You search it somehow. I don’t know. Get a grad student-
    Yaroslav [00:52:47]: Sooner or later
    Brandon [00:52:47]: To make a search algorithm.
    Brandon [00:52:48]: It’s not that hard. Anyway, I guess the point is that human infantry on the battlefield will be gone at the end. I wrote that in 2013. Many people on social media laughed at me for that, called me hysterical, said things like, “Electronic warfare will knock all the drones out of the sky,” or, “You need humans to hold ground.” That’s something you still hear from a lot of people on social media today. I feel that this article that I’ve written has never been directionally wrong. It has gotten more and more right steadily over time. We’re reading the battlefield reports from Ukraine, where human infantry are basically a few guys hiding in dugouts for months, and I’m not sure what they’re doing.
    Yaroslav [00:53:35]: That’s on Ukraine’s side. On the Russian side, that’s just like a zerg rush.
    Brandon [00:53:38]: The zerg rush, and then they just die. Then, but they have some guys in dugouts too, right? Like hiding in dugouts for months.
    Yaroslav [00:53:45]: They have. Yeah.
    Brandon [00:53:45]: But, like, what are those guys doing in the dugouts? Are they providing, like, frontline reconnaissance? Like, what are they doing?
    Yaroslav [00:53:54]: If there is a guy in a dugout with some bullets and an automatic weapon, the other guy cannot come and take that dugout. That-
    Brandon [00:54:07]: I see
    Yaroslav [00:54:08]: They’re establishing control over territory.
    Brandon [00:54:10]: I see. So there still is a use for human infantry on the battlefield as of today.
    Yaroslav [00:54:15]: Like
    Brandon [00:54:15]: How long will that last?
    Yaroslav [00:54:17]: I think it will last for a while. This is funny. There’s this whole layer of modern Ukrainian culture built around the war-related stuff. So there is this punk rock band that is called SZC, I guess that’s what it would be in English, which is short for, like, a deserter or something like that. So anyhow, this band has a song titled “2030.” It’s basically about the year 2030, and the war still goes on as, like, the third world war or whatever. And they sing about AI and, like, cyborgs and everything, but the simple infantry is still needed, and we’re still, like, getting cold in those dugouts, and we’re still doing our job. That’s sort of the theme of the song. And it seems like that’s actually what’s going to happen. There are
    Ground Robots, Simulation, and the Limits of World Models
    Brandon [00:55:30]: Ground robots will not replace humans in the dugouts soon.
    Yaroslav [00:55:34]: I’m very much interested in following the whole humanoid robot theme and
    Brandon [00:55:39]: What about like a dog robot?
    Noah [00:55:41]: Or just mobile controlled platforms or something.
    Brandon [00:55:44]: Spider robot, yeah.
    Brandon [00:55:45]: Everything evolves into a crab.
    Brandon [00:55:46]: You build a crab robot.
    Yaroslav [00:55:47]: A humanoid-
    Noah [00:55:48]: The carcinization of warfare.
    Yaroslav [00:55:51]: There is a lot of utility in humanoid robots because the world is designed around humanoids. So I would not, like, 100% disqualify the possibility that sometime 10 years in the future, humanoid robots will actually be fighting. So that’s an actual Terminator kind of scenario.
    Brandon [00:56:14]: Yeah, in the first Terminator movie, you look at what they’ve got on the battlefield, they’ve got flying bomber drones and humanoid robots.
    Yaroslav [00:56:20]: Look, the cost of running large language models is getting so low that you can have basically an inexpensive computer running what was a state-of-the-art model a year and a half ago, running it locally on a device with an open source model, which also means that the Chinese can have it, the Russians can have it, the North Koreans can have it, et cetera. So that is already possible. And when we’re looking at the acceleration of neural nets: if not for the acceleration of the large language models, I would’ve said that I don’t think humanoid robots will be able to be useful on the battlefield earlier than in 10 years. But if you account for the exponential, it might be five years or so. The problem with all autonomous systems, and it starts with self-driving cars and goes all the way to modern-day AI agents, is that to make them really useful, you have to solve such a long tail of edge cases that it’s really difficult. Like, we were promised self-driving cars, what, like 2007, Sebastian Thrun and Google, and even before that all the challenges, everything. And Elon of course told us it’s going to be one year from 2014, and now we still don’t have self-driving Teslas everywhere. We have Waymos in SF and some other places, but they’re still, like, not perfect. So I expect something similar from self-flying drones and fully autonomous drones, and we saw that firsthand: with each level of autonomy that we’re adding, there is a very wide distance between a prototype and something that is ready to be scaled to millions of units, and then something that has been scaled to millions of units. But the race with, like, AI coding tools is just insane. So things might accelerate very fast, faster than we can imagine.
    Noah [00:58:46]: I think your point is that, due to this long-tail behavior, level one autonomy as you’ve defined it is actually very natural. Like, you basically are just solving an image recognition and tracking problem.
    Yaroslav [00:59:02]: It’s actually interesting that you say it that way, and I thought about this the very same way, and we have this joke that there are, like, 200 companies in Ukraine which are trying to solve last-mile targeting, or terminal guidance. It seems like we’re, like, the only company that actually solved that, because even that problem-
    Noah [00:59:22]: I’m not saying it’s, I’m not saying it’s trivial, but it’s at least something that you imagine given our current state.
    Yaroslav [00:59:26]: Like us and Eric Schmidt, like Eric Schmidt’s companies are pretty good.
    Yaroslav [00:59:29]: Like, I actually have lots of respect for what they’re doing, and they have been practically influential and helpful on the battlefield, and they have good engineering.
    Noah [00:59:38]: I wasn’t saying it’s trivial. I’m just saying this is something naturally adaptive, based upon things that we know work well. But some of the other domains, where you do have to make decisions and you have a long tail, become much harder, and you worry about edge cases more.
    Yaroslav [00:59:57]: Like, the more complex the behavior you’re trying to simulate, the more edge cases there are, right? The more ways to do it wrong there are. And then there are different approaches. If you read academic papers about robotics, right, the robot is represented as something that has sensor input, and then you have three levels of logic or decision-making, which are perception, planning, and control, and then you have actuators as output. So pre-neural nets, you would do perception, planning, and control all with classic logic, right? Then, with AlexNet and computer vision, you could do perception with neural nets and the rest with logic. You can currently do each of those separately with neural nets, each of those separately with logic, or you can just have one huge neural net that takes lots of sensory data. It’s not just pixels. Could be sound, could be accelerometer, could be everything, as input, and just outputs the controls. And some of the self-driving car companies are doing that or, like, experimenting with different ways of doing that. So the way you implement those features also influences how many degrees of freedom the system would have, right? Like control: you can do classical algorithmic control with Kalman filters and PID controllers, et cetera, or you can do a neural net that was trained in a gym with reinforcement learning, et cetera. And those would be two different behaviors of a system.
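    For readers who want the classical pipeline made concrete, here is a minimal sketch of the "perception, then control" split using a scalar Kalman filter and a PID controller. Everything in it is an illustrative assumption rather than anything described in the episode: the class names `Kalman1D` and `PID`, the toy altitude-hold plant in `simulate`, and all gains and noise values are hypothetical. The planning stage is reduced to a fixed target altitude.

```python
import random
from dataclasses import dataclass


@dataclass
class Kalman1D:
    """Scalar Kalman filter tracking one slowly changing value (perception)."""
    estimate: float = 0.0
    variance: float = 1.0
    process_noise: float = 0.05       # assumed: how much the true state drifts per step
    measurement_noise: float = 0.04   # assumed: sensor noise variance

    def update(self, measurement: float) -> float:
        # Predict: the state is modeled as unchanged, so only uncertainty grows.
        self.variance += self.process_noise
        # Correct: blend prediction and measurement using the Kalman gain.
        gain = self.variance / (self.variance + self.measurement_noise)
        self.estimate += gain * (measurement - self.estimate)
        self.variance *= 1.0 - gain
        return self.estimate


@dataclass
class PID:
    """Textbook PID controller (control)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(target: float = 10.0, steps: int = 400, dt: float = 0.05,
             noise_std: float = 0.2, seed: int = 0) -> float:
    """Toy altitude hold: the controller output is treated as a climb rate."""
    rng = random.Random(seed)
    perception = Kalman1D()
    control = PID(kp=1.0, ki=0.0, kd=0.1)  # integral term unused for this toy plant
    altitude = 0.0
    for _ in range(steps):
        measured = altitude + rng.gauss(0.0, noise_std)  # noisy sensor input
        estimated = perception.update(measured)          # perception
        error = target - estimated                       # "planning": fixed setpoint
        climb_rate = control.step(error, dt)             # control
        altitude += climb_rate * dt                      # actuator and plant
    return altitude


if __name__ == "__main__":
    print(f"final altitude: {simulate():.2f}")
```

    The end-to-end alternative mentioned above would replace the filter and the controller with a single learned function from raw sensor data to `climb_rate`, which is one reason the two approaches can behave so differently on edge cases.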
    Noah [01:01:53]: I-- Maybe my point was just much more high level. It’
    Yaroslav [01:01:56]: Or, if you go even higher level, you can, like, train, like Fei-Fei Li and folks who are doing, like, physical, sort-
    Brandon [01:02:08]: World models
    Yaroslav [01:02:08]: World models, right, physical intelligence. They’re trying to make these big models that sort of understand the world, and then supposedly you have such a model and you can tell a drone, “Okay, like, go over that hill and, like, find the bad guys and then get them,” or “Make me a video, make me a photo of the guy smiling, and get back to me.” Right? That’s one way. Another way, you have, like, these subsystems: one is navigation, another is finding the person, another is, like, getting to them to take a photo. And those are, again, very different behaviors. And it’s not that one is necessarily better than the other, and we might have more technological ability to do one or the other. But all of those systems will exist. And then again, you should always keep in mind that it’s not only the good guys that are developing these systems; the bad guys are developing these systems as well.
    China’s Drone Supply Chain and the West’s Manufacturing Gap
    Noah [01:03:00]: I guess where I’m going with this is back to Noah’s original thought about the end of the soldier. And so in order to replace-
    Brandon [01:03:10]: Or at least the end of the rifleman.
    Noah [01:03:11]: Or the end of the rifleman, yeah.
    Yaroslav [01:03:13]: I’m not seeing that very close. As much as I’m a lover of sci-fi and all of that, and a technologist, the more I try to be-
    Yaroslav [01:03:27]: Like, I try to have a certain humility about these things, and in the military domain there was just so much human history and blood and tears dedicated to understanding this art of war and perfecting it and so on. There is so much knowledge in there that I don’t feel like I’ve even started to comprehend a lot of it. But one thing that I really understood is that even though drones are now causing eighty percent of the casualties, you go to the actual officers, you talk to the actual, like, brigade commanders, corps commanders, and they explain to you how all of it fits together: how, when you’re thinking about an operation that involves a couple thousand people to get this piece of land out of the enemy’s hands, to de-occupy it, it is so complex. It involves dozens of different types of drones, and then land operations and reconnaissance operations, psychological operations, and then aviation and tanks and logistics and all kinds of different assets. So modern warfare is really very complex, and the fact that drones are the latest, coolest thing, and then AI is the latest, coolest thing, doesn’t mean that now it’s that and only that, right? So yeah. Whoever’s looking into that, I think, should realize that it’s not just what the press talks about, that the reality is much more difficult, much more complex.
    Brandon [01:05:17]: Let’s talk about China and China’s manufacturing capabilities. So suppose that someone, like suppose the United States went to war with China. And
    Yaroslav [01:05:26]: I hope not.
    Brandon [01:05:27]: I hope not as well. But suppose that drones were very essential to that war, all the types of drones that we’re talking about here, and suppose that China said, “All right, well, you need X and Y and Z to make those drones to fight us, and we control the production of X and Y and Z, so we’re just going to cut you right off, and now you have no drones.”
    Brandon [01:05:47]: I know that a number of countries, including Ukraine and Taiwan, have been making moves to China-proof their drone production so that China couldn’t do that. Examples of things they might be able to cut off might include rare earths, the fiber optic cable that you were talking about before, and various other things where, even if they don’t control one hundred percent of the production, they control enough of it that it would be extremely expensive to produce without relying on Chinese sources. Or the market’s fragmented enough, et cetera. What do you see as China’s key bottlenecks, and how easy are those to overcome in terms of China-proofing drone production in case of a war against China?
    Yaroslav [01:06:30]: Let me start by saying that although China does not sell directly to Ukraine and it does sell directly to Russia, a lot of Ukrainian supply chains start in China, right?
    Yaroslav [01:06:49]: We’re not in a conflict with China, and we would not want to be in a conflict with China. And we’d hope that China stays a neutral power between Ukraine and Russia and the US as well. That said, in the scenario that you’re describing, everything is much worse.
    Yaroslav [01:07:11]: Think about this. Last year, Ukraine produced four million FPV drones. Ukraine is not the most industrious nation in the world.
    Yaroslav [01:07:19]: China can produce four billion of these FPV drones.
    Yaroslav [01:07:23]: China can make them not drones with propellers, but fixed-wing drones, which go not forty kilometers far, but maybe two to three hundred kilometers inland. Slightly more expensive.
    Brandon [01:07:34]: With internal combustion
    Yaroslav [01:07:36]: No. With
    Brandon [01:07:36]: Battery-powered fixed-wing drones.
    Yaroslav [01:07:38]: Battery, yeah.
    Brandon [01:07:39]: What’s the propulsion system on those? Propellers?
    Brandon [01:07:43]: I don’t-- I just don’t know how that works.
    Yaroslav [01:07:44]: You have that. They can also make them all fully autonomous. They have DJI, the world’s most advanced drone company. They can make them fully autonomous without GPS, without anything. Then they can put those drones on maybe tens of thousands of fully autonomous underwater submarines, or maybe not even that, just on shipping containers and barges that ship goods, or freight ships. And then they show up with millions of drones packed onto those sea vessels. They show up at any coastline in the world, be it Taiwan or be it California, and they have millions of long-range impactors targeted at a piece of land.
    Yaroslav [01:08:38]: What do you do with that? There are not enough hunter submarines. There are not enough anti
    Brandon [01:08:46]: Ship missiles.
    Yaroslav [01:08:47]: Anti-ship missiles, anti-ship planes. They can produce these assets in tens of thousands of factories, because they’re so simple to produce that even if the FBI director picks up a phone, calls the President of the United States, and says, “Hey, the scenario Yaroslav was warning us about is beginning to unfold. We need to do a preemptive strike,” you wouldn’t have enough assets to do preemptive strikes, because there can be, like, tens of thousands of places where these things are being manufactured. And so to counteract a scenario like that, we would need to have, like, a similar amount of mass-
    Brandon [01:09:39]: You mean a similar number of drones.
    Yaroslav [01:09:41]: Yes, to intercept that, like, either at sea or in the air, et cetera, at a similar cost, right? So the economics should work out. I’ll tell you that currently, we in the West and we in the United States, we don’t have the technology to do that. We don’t
    Four Layers Behind China: Technology, Manufacturing, Components, and Rare Earths
    Brandon [01:10:01]: What technologies, key technologies do we lack?
    Yaroslav [01:10:03]: Like autonomy, mass drone manufacturing, stuff like that.
    Brandon [01:10:06]: We lack autonomy technology?
    Yaroslav [01:10:09]: I think so.
    Brandon [01:10:10]: Because our computer vision algorithms are not as good?
    Yaroslav [01:10:12]: It’s not only about the computer vision algorithms. Like, if a group of companies founded by Eric Schmidt two, three years ago, and my small startup, well, maybe not as small, but it was also founded three years ago, are sort of two of the leading companies in the world, and maybe there are a couple of others who are capable of something like that but not really on small drones, I do think we’re behind China in technology. So we lack technology, we lack mass manufacturing capacity, we lack the components, and we lack the rare earth materials. So there are four layers in which we’re behind in this challenge. And that’s why my point is that in the West, and especially in the United States, there should be far more smart people working in defense, and there should be more funding, if we want to keep a semblance of our good past life.
    Brandon [01:11:14]: That’s really important. Would you say that right now, as things stand, in conventional terms, abstracting from strategic nuclear weapons, China is now the supreme conventional military power on Earth, given its ability to manufacture and deploy drones in the quantity and quality that you just described?
    Yaroslav [01:11:35]: Look, I don’t, I don’t think we have all the information to claim that but
    Yaroslav [01:11:41]: We cannot count it out, and that alone should be a big warning sign. We have not seen Chinese drones in action. We’ve seen some of the Iranian drones in action and Russian drones in action. Not Chinese, really. We have not seen Chinese forces in action. Obviously, hopefully, this never happens, but in a conflict on the scale of US-China, there are many sort of classical assets that we should not discount. As we just discussed, we should not discount artillery in the land war, and we should not discount aircraft carrier groups and the air force, and long-range missiles and electronic warfare and satellites, et cetera. But then there are also things that we, at least we as the general public, don’t really know about China. I’m sure there’s a lot of information that US intelligence has about the Chinese capabilities. I think if you get back to the scenario that I just described, and if you take that, like, sort of to the maximum, you basically see that whoever has the bigger manufacturing capacity, that side wins.
    Brandon [01:13:03]: That’s just a typical law of conventional warfare. Has been forever.
    Yaroslav [01:13:07]: Sort of.
    Noah [01:13:07]: Do you read Noah’s blog?
    Yaroslav [01:13:09]: Not as often as I would like. But I read Noah’s X.
    Brandon [01:13:15]: It’s not necessary.
    Noah [01:13:15]: It’s a theme where
    Brandon [01:13:16]: Don’t read my X.
    Brandon [01:13:19]: It’s just for
    Noah [01:13:19]: He doesn’t, he has no opinion about certain things. Yeah
    Brandon [01:13:22]: It’s just jokes.
    Yaroslav [01:13:22]: No opinion. Okay.
    Brandon [01:13:22]: Okay, so I guess there are two questions here. The first question is, could the United States and other countries allied with the United States even develop supply chains that are independent of China to make any of these drones? And the second question is, could they do it in sufficient mass? And so I think the answer to the question of whether they can do it in sufficient mass is, today, no. But in an extended, prolonged war situation, things change a lot. All the development restrictions that we put on new factories go out the window, and there’s a sense of urgency. Ukraine obviously wasn’t making all these drones before the war.
    Yaroslav [01:14:04]: Of course.
    Brandon [01:14:04]: So if America had the same kind of urgency that Ukraine has now, things would happen. Things would move, and of course, America has allies too, or had allies until recently, and may have them again in the future. But America has or had allies that would also scale up very quickly, like Japan and European countries, if we ever ally with them again, et cetera. And so a lot of things could then change in terms of the actual mass. So, in terms of looking at China and saying they have all these factories today, and looking at the history of conventional warfare: America had very little defense production capability on the eve of World War II, and ended up easily outproducing everyone else, even the Soviet Union.
    Yaroslav [01:14:47]: Maybe not easily. Yeah.
    Brandon [01:14:49]: Not easily, but by a long, a long shot.
    Yaroslav [01:14:51]: Also the added benefit of not being attacked.
    Brandon [01:14:54]: That’s right. That’s right.
    Yaroslav [01:14:54]: That helps.
    Brandon [01:14:55]: Who knows how secure they are now, or what, where cyber influence-
    Yaroslav [01:15:03]: No, look, I totally agree with your sentiment. I’m even less doomerish than you are. Or as it seems to me, you’re a little bit doomerish, but, like, in the long term, you’re bullish.
    Choke Points, Europe’s Wake-Up Call, and Defense Industrial Policy
    Brandon [01:15:17]: I’m not, I’m not doomerish. I’m thinking about what we need to do.
    Brandon [01:15:21]: I’m not, I’m not thinking like, “Oh, we’re doomed.” That’s not my point. It’s never useful saying that. If you’re doomed, then just don’t go on podcasts.
    Brandon [01:15:28]: Go pet a rabbit and play a video game or something. Anyway, no, we’re not doomed, but I’m saying, step one: what are the key choke points that we need to deal with tomorrow? Besides rare earths, which we already know, what are the other key choke points that the West needs to free itself from Chinese supply chains on in order to manufacture even one drone free of Chinese supply chains?
    Yaroslav [01:15:54]: There are companies here who are doing that. Like, we have good friends, a company called Neros. I know they’re down in El Segundo or whatever, like, somewhere in Southern California.
    Brandon [01:16:05]: What are the most pressing choke points besides rare earths that everyone talks about?
    Yaroslav [01:16:09]: That’s one of the pieces that we do, thermal cameras. That’s like actually a big one.
    Brandon [01:16:16]: Thermal cameras.
    Yaroslav [01:16:17]: Then, like, the motors. Like, you need the special-
    Brandon [01:16:25]: Even after you have the magnets, then you turn them into a really good motor.
    Yaroslav [01:16:28]: You need these special magnets, and then that’s sort of your rare earth component.
    Brandon [01:16:34]: That’s, that’
    Yaroslav [01:16:34]: Like, with rare earths, it’s not that, oh, there are these metals that, for some reason, God only put under Chinese territory and not under any other. No, they’re distributed. There are plenty of them around the Earth. It’s about the refining capabilities and, like, investing in that and so on. And then, frankly, at some point, we don’t have that many humans. That’s where the humanoid robots help. China is a big, populous country. The population of, like, the united West is comparable to that, but the population of the US is much lower. And I definitely think that the whole West should get its act together, because, ubi semper victoria, ibi concordia. There’s always victory where there is union.
    Brandon [01:17:27]: Agreement.
    Yaroslav [01:17:27]: Agreement, yes.
    Yaroslav [01:17:31]: I think we, as the free nations of the world, should get our act together, because freedom is what unites us. And I’m also, like, pretty mad at what’s happening in the European Union. And I think that the current US administration is the best thing that has happened to Europe since World War II, probably. Or since post-World War II, because World War II wasn’t the best thing.
    Brandon [01:17:59]: Trump withdrawing the image of omnipotent American support forced the Europeans to get their butts in gear, unite, and develop their defense industries.
    Yaroslav [01:18:07]: Also, like, doing that not in a nice way, right? Like, when JD Vance came to the Munich forum one year ago, he wasn’t, like, super nice, like, “Oh, please, our European friends, could you please increase your defense spending?” He was somewhat pushy. Let’s put it that way. And I think that was a necessary measure. Like, I’ve been thinking about that. Could he have been nicer? I was like, no, because the voters of European leaders, of the European countries, would not have understood this. They would not get the message. And now I think the message has gotten across, but Europe is still sort of slow to wake up, I would put it that way. Things are getting better, but I’m not happy about the speed at which they’re getting better. So when I would go to some of the European capitals, I would get back pretty depressed from, like, talking to their military officials and their entrepreneurs, et cetera. Here, I’ve been in the US for the last month or so. I’m not depressed. I’m actually excited. I still think you should, like, 10X the effort in making sure that you remain the strongest power in the world and can defend your values, et cetera. But I’m very optimistic, and definitely once we are in danger, I think there are, like, lots of very smart people in the West who can figure these things out. But people in China are also extremely smart. It’s very different from even the Cold War sort of situation. Like, the Soviet Union was economically a very declining power. China’s not like that. And then if we look at the electric car race, I think they’re ahead of the US and ahead of the whole world, definitely ahead of Europe, which used to be sort of a car superpower. When you look at AI, I think they’re almost where we are, maybe slightly behind. When you look at humanoid robotics, I would argue they’re ahead. And in many other areas, like medicine and biosciences, there are lots of interesting things there, and in the consumer space, there are lots of interesting things there. I don’t know if you’ve heard of this podcast called 996. I don’t know if it’s still airing or not. It used to be a fantastic podcast by some American Chinese businessmen, maybe venture funds.
    Humility About China, Taiwan, and Deterrence
    Brandon [01:20:55]: About the Chinese economy?
    Yaroslav [01:20:56]: About China from a sort of tech venture point of view. And I lived in China for maybe four months, and I visited a couple of times. Like, even WeChat is such a more advanced app than anything we have in the West. So it’s very important not to be too arrogant, and I think we’re guilty of that, like, definitely in the US. Sometimes we tend to be too arrogant. Like, I think humility always helps, at least me personally. And then I think we obviously don’t have to be enemies. With Ukraine and Russia, it’s like Russia came to kill all of these people and get all this territory. With China and the US, it’s not like that, and thank God it’s not like that, right?
    Brandon [01:21:54]: It might be with China and Taiwan. Maybe.
    Yaroslav [01:21:57]: Hopefully not. Yeah. It’s
    Brandon [01:21:59]: Hopefully not
    Yaroslav [01:22:00]: It’s like China has its own problems, probably with human rights, et cetera. But hopefully, it’s still not beyond the fixing point.
    Brandon [01:22:13]: Hopefully. Hopefully.
    Yaroslav [01:22:14]: We should be armed, right? We should be ready for whatever, and that alone decreases the probability of any conflict. If you’re weak, you’re basically provoking the conflict. The problem with Europe these days is that, like, last year, Ukraine and Russia went from the drone technology of 2025 to the drone technology of 2026. Europe went from winter of 2022 to spring of 2022. So Europe didn’t even make one year of progress. And the US, I would argue, made less than a year of progress as well in the last year. So the technological gap is getting wider and wider and wider. And at some point, like, I’m looking at Poles, who are, like, very close to us and close to Russia.
    Brandon [01:23:06]: Polish people-
    Yaroslav [01:23:07]: Polish people
    Brandon [01:23:08]: Not surveys.
    Yaroslav [01:23:09]: Not, yeah. Oh, yeah, sorry. Yeah. That’s what I meant. Sorry, not my first language.
    Brandon [01:23:12]: When I’m looking at the polls, what do they, what do they say?
    Yaroslav [01:23:15]: Polish people. Poles.
    Brandon [01:23:16]: No, it’s the right word.
    Brandon [01:23:18]: You’re just thinking about-
    Yaroslav [01:23:20]: No, we.
    Yaroslav [01:23:20]: I’m looking at them, and they bought like 100 tanks and four submarines. It’s like, dudes, you don’t have, like, 1,000 people who know how to operate an FPV. What the hell you’re doing?
    Brandon [01:23:30]: Poland is not preparing for war correctly.
    Yaroslav [01:23:33]: From what I can
    Brandon [01:23:36]: They’re doing a very bad job
    Yaroslav [01:23:36]: They’re not doing it right. And the problem is they’ll be in a situation where they’re so proud of their winged hussars and, like, their cavalry, and the enemy is attacking with airplanes and tanks. That’s literally how the gap is getting wider between Russia and Poland.
    Brandon [01:23:57]: That happened in 1939.
    Yaroslav [01:24:01]: I don’t want that to happen again.
    What America Should Learn from Ukraine’s Defense Valley
    Brandon [01:24:03]: All right, so the Europeans need to wake up more. If you were advising America’s defense establishment, which you might be doing in real life, or if you were saying things on a podcast that might be heard by some people connected to that defense establishment, which you may or may not be: besides more funding, which will be necessary for literally anything, what are the top policy priorities for America to increase its readiness right now? Let’s say three to five priorities.
    Yaroslav [01:24:38]: Look, I really like this quote, I think it’s by Arthur C. Clarke, that “the future is already here - it’s just not evenly distributed yet.” And just the same way as Silicon Valley is this sort of future location for all things tech, Kyiv and Ukraine are sort of the defense valley. It’s the point where the future of defense has already arrived, and there is a ton of things to learn from that, starting with hundreds of companies in very particular fields, to the battlefield experience of commanders at every level, from soldier and sergeant to platoon-level commander to brigade-level commander, special forces and intelligence, all of that, to how the government organizes the infrastructure and the playing ground for all these businesses to flourish, et cetera. So I would definitely look into much tighter integration and exchange of experience and so on. That would be one thing.
    Yaroslav [01:26:03]: I think reform in procurement would be another thing, and I think that’s what is currently being done with drone dominance. I think Pete Hegseth is leading that, and maybe some other people in the administration. I think that’s an extremely powerful and right thing to do, and they should scale that big time.
    Yaroslav [01:26:26]: Obviously, any military person would say, “Well, yes, okay, Yar, you’re fine, cool, but Ukraine and its war theater are very much different from the potential scenarios that the U.S. might have to fight,” and yes, I agree, but there is still so much to learn, even, like, from the sea warfare that Ukraine is doing, and then long-range drones like these Shaheds that unfortunately damaged some of the American equipment in the Middle East. They can fly up to two thousand kilometers. So if you think about the Pacific region, two thousand kilometers covers a lot of area, with all the, like, islands and aircraft carriers, et cetera.
    Brandon [01:27:16]: I think America is learning that lesson right now in Iran, in the Middle East.
    Yaroslav [01:27:20]: You would think so, but then, I’m not sure. There were so many chances to learn that lesson from Ukraine before, and I don’t think it was, like, fully learned, so I’m not sure how fully the Middle East lessons have been learned.
    Brandon [01:27:34]: Perhaps losing a war to a minor power will teach America.
    Yaroslav [01:27:38]: You can, you
    Brandon [01:27:39]: Although their economic weapon will be the most important and decisive by far, but still, some of our bases were supposedly, allegedly rendered unusable by their Shahed-type drones.
    Yaroslav [01:27:51]: Look, I think there are so many lessons to be taken from this. Like Russia, a much bigger power, attacking Ukraine. Given the same logic that we discussed, whoever has more production capacity should win. But then Russia didn’t achieve victory in Ukraine, and then the US didn’t get, like, full victory in Iran. Probably achieved some of the goals, but probably not all of them. So you can flip that. Like when you say, “Okay, what if China has so much more capacity than the US? What if they attack us for whatever reason? How can we hold them back if we don’t have the rare earths?” Well, as the Ukrainian and Iranian examples show, you actually can hold back something like that even if you’re the less capable party.
    Brandon [01:28:42]: Well, those examples did rely on Chinese supply chains, though.
    Yaroslav [01:28:47]: Partially, yes. But then if you think about Ukraine in February 2022, for the first half a year or a year, there wasn’t much reliance on the Chinese supply chain. We were just relying on whatever we’d got. So that’s one side of things. Another side of things is basically how much suffering can you withstand along multiple axes? It’s not just the military axis, it’s also the economic axis and the political axis, I would argue. So one of the reasons why wars stop or start is because the political pressure on the leadership internally in the country is so high that you just have to stop, right? So I think that differs big time depending on whether you are seen by the population as the party which started the conflict or the one who was attacked. That’s one part. Another is just the overall state of the society. And one thing I’m worried about in Europe now is that people are not ready to fight even if they’re attacked. When people are asked about that, they’re like, “Oh, I’m just going to move somewhere where there’s no war.” So that’s a challenge, and that’s what makes Europe weaker right now. And the US never really had to, I think, fight a foreign war on its own turf. I hope that never happens, but in case that happened, I don’t know how the rich cities of the East or West Coast, how people would behave. Like, would all the Wall Street bankers and Silicon Valley VCs mobilize and really start working on defense stuff? I would love to think so. I like-- That’s the way I think about the American spirit.
    The Nuclear Lesson: Budapest, Deterrence, and the World After 2022
    Brandon [01:30:49]: The way we did in World War II.
    Yaroslav [01:30:53]: In a way, but look, it wasn’t that clear in World War II, and Churchill famously said, “America will always make the right decision after trying all the wrong ones,” right? And one could argue that there is this sort of USA that lives in popular culture and was sort of created by Hollywood as cool dudes that will always come and do the right thing, right? And then if you look at international politics
    Yaroslav [01:31:21]: It doesn’t necessarily always look like that. Like the Budapest Memorandum: Ukraine gave up all of its nuclear weapons, the second, or rather third largest nuclear arsenal, because the US and Russia and the others were very persuasive and they’re like, “Yeah, just give it away. We guarantee you security.” And then they’re like, “Oh, it’s not guarantees, it’s assurances. We used the word assurances, so therefore we didn’t promise you much. You just gave it away for free.” And then Russia attacks and there’s no reaction. So the whole world, like 2022, the whole world looks at it and is like, “Oh, okay, so maybe we should get nukes.” So my prediction: in the next couple of decades, a lot more countries will be working on their own nukes.
    Brandon [01:32:02]: They really should. I’ve consistently advocated for specifically Japan, South Korea, and Poland to get nukes. But obviously Ukraine should as well, but can’t
    Yaroslav [01:32:11]: Someone could argue that if a country currently doesn’t work on its own nuclear program, the government is doing a disservice to the country and should be fired. Because it seems from recent world history that that is the only way to actually provide credible deterrence, all right? So I think in Europe, people are not quite sure how America will behave. Will it behave as the Hollywood hero, or will it behave pragmatically as it did at the beginning of World War II, or as it did when Ukraine was attacked by Russia and the US just decided to sort of push the Budapest Memorandum aside, because of course Russia’s a nuclear power and, like, we don’t want to mess with it.
    The Drone Race: Where Ukraine, Russia, and the West Stand
    Brandon [01:32:59]: Everyone says Russia’s behind right now in the drone war.
    Yaroslav [01:33:04]: True. Okay.
    Brandon [01:33:04]: But that wasn’t true a year ago. So a year ago people were saying either Russia was ahead or they’re at parity, or maybe a year and a half ago.
    Brandon [01:33:12]: Russia has more people, four times as many people about, or more.
    Yaroslav [01:33:17]: I think give or take, yeah. 30 versus like 120-ish. Yeah.
    Brandon [01:33:21]: Four times as many people.
    Brandon [01:33:27]: More help from China.
    Yaroslav [01:33:28]: Like, the economy is like 10 to 20 times bigger, I don’t know. A lot bigger.
    Brandon [01:33:33]: A lot of oil money, a lot of oil money, that Ukraine just doesn’t have. More direct help from China than Ukraine is getting.
    Brandon [01:33:41]: Russia just has this massive advantage in scaling against Ukraine itself. Ukraine has financial assistance from the EU, but right now Ukraine is ahead in the drone race
    Yaroslav [01:33:54]: I’m not sure about that by the way.
    Brandon [01:33:56]: Well, that was going to be my next question. Is that true? And if it is true, how long before Russia manages to pivot, course correct, and regain the lead?
    Noah [01:34:05]: Sorry. For my own curiosity, can we define drone race?
    Yaroslav [01:34:09]: Look, I think, also for our listeners, it’s helpful to understand that there are
    Yaroslav [01:34:17]: At least 30 different types, categories of drones, right? First you have different domains. You have flying drones, ground vehicles, and you have sea vehicles, and you have undersea vehicles, right? Then for each of those domains, you have multiple use cases. Like for ground vehicles, you have logistics, evacuation, mining, de-mining
    Yaroslav [01:34:48]: Like maybe something else. For aerial, you have reconnaissance, front strike, mid strike, deep strike, mining, de-mining, radio repeating, kamikaze and bombing, ISR, different types of surveillance, so tactical surveillance, operational-level surveillance, maybe strategic-level surveillance at some point.
    Yaroslav [01:35:17]: Logistics also with aerial drones. For sea drones, same thing. So in each of those categories, you have dozens, sometimes over 100 companies and products which compete. So that’s the current Ukrainian battlefield. From the Russian side, it’s less of a zoo, as we say. In each category, they usually have one to maybe three products, and then they scale it in a sort of centralized fashion. And so when you talk about who’s behind or ahead in drone warfare, you’ve got to analyze
    Brandon [01:36:04]: It’s asymmetric, so it’s hard to compare
    Yaroslav [01:36:05]: Sort of area by area, right? So if you’re talking about front strike, I would argue that Ukraine has gotten ahead recently after scaling the fiber optic. Before that Russia was slightly ahead. So Ukraine got ahead. With mid strike, so say something like 40 to 200 kilometers
    Yaroslav [01:36:35]: It’s hard for me to judge. At some point Russia was ahead. I think maybe we’re getting ahead as well, and in deep strike we recently got ahead, so we are doing more damage to Russia with deep strike drones than they’re doing to us. In sea drones, we’re consistently ahead, always were ahead. In ground drones, I think we’re ahead. Yeah, I think like on
    Brandon [01:37:00]: Where are they still ahead?
    Yaroslav [01:37:01]: In general, I think we’re ahead. Where are they still ahead? I think in certain parts of the components, like GPS-free navigation, or navigation with these CRPA antennas, which are pretty good. They have these winged bombs that they drop from their bomber planes.
    Yaroslav [01:37:33]: I forgot the English name for it.
    Brandon [01:37:34]: Glide bomb?
    Yaroslav [01:37:35]: Sort of. Yeah. So they’re ahead on that side, and it’s like it’s difficult to protect from those.
    Brandon [01:37:42]: What’s the range of that?
    Yaroslav [01:37:45]: It can be pretty big. I think it’s like, can be up to 80 kilometers. Then obviously the range-
    Brandon [01:37:52]: From like a fighter plane, like a strike?
    Yaroslav [01:37:54]: The range is a very iffy subject here because the range is
    Yaroslav [01:38:01]: Basically the distance from where you drop the bomb to where it lands, but also you drop it from a fighter plane, and fighter planes are susceptible to aerial interceptor missiles. So on our side, we have our own fighter planes, and we have the ground anti-air systems. And then those two assets, they have their radars and radar fields. And then, depending on the enemy tactics, you can calculate how big the aerial area is that you cover with those assets. And look, I’m not a professional military guy, so I’m covering these topics in layman’s terms. Don’t quote me on this. I’m just trying to make this as understandable to an average listener as possible.
    Brandon [01:38:50]: Helicopters. I’ve recently seen reports of drones taking out helicopters in the air, and that this is new.
    Brandon [01:39:00]: Is that new? Is that going to be a big deal? Is that going to eventually get rid of helicopters the way drones are getting rid of tanks on the battlefield?
    Helicopters, Drone Carriers, and Future Air Defense
    Yaroslav [01:39:10]: Look, helicopters are also versatile assets. Front strike helicopters, I think we’re going to be seeing fewer and fewer of them. These few Russian helicopters that Ukraine has intercepted with drones were more like edge cases than a systematic helicopter-hunting campaign. I think it is possible to turn it into a systematic countermeasure against helicopters.
    Brandon [01:39:38]: Will those be battery-powered drones themselves, do you think?
    Yaroslav [01:39:41]: Potentially. And there are like so many different scenarios. Like you can have large aerial drone carriers carrying interceptor drones.
    Brandon [01:39:54]: That then go hit the helicopters.
    Yaroslav [01:39:56]: For example. Or you can have battery-powered interceptor drones, but not of the missile-with-a-propeller type, like many of these well-known drones such as Sting or P1-SUN. They look basically like a missile with a quadcopter behind it. But you can also have plane-like, fixed-wing aerial interceptors.
    Brandon [01:40:25]: Does anyone, does anyone have like a little like, drone that flies super low under the helicopter and like shoots it from underneath?
    Yaroslav [01:40:33]: Like in theory you can imagine that but it’s just
    Brandon [01:40:37]: Or like surface, a drone that carries surface-to-air missiles somehow.
    Yaroslav [01:40:40]: I don’t think that’s very practical because whatever you have going on land will be just super slow and not fast enough to be able to hunt down a helicopter.
    Brandon [01:40:50]: I mean in the air. Is there a drone capable of carrying a small surface-to-air missile that can skim low and then launch its little missile, like a flying missile platform or something?
    Yaroslav [01:41:00]: In theory, but a big part of a mission like that is not just kinetically getting to a helicopter, but also identifying it, first by means of radar and then visually, and placing the asset you have, the interception asset you have, in the right place at the right time. So the combination of those things is much more complex than just how we can strike it from behind or from below. But that does not mean helicopters are becoming completely useless. For example, helicopters are used to intercept deep strike drones. Ukraine uses a lot of helicopters to shoot down Shaheds.
    Yaroslav [01:41:44]: Russia uses helicopters to shoot down our deep strike drones.
    Counter-Drone Systems: Shotguns, EW, and Surviving FPVs
    Brandon [01:41:50]: Oh, so, some ideas about drone countermeasures, things people do technologically to try to shoot down FPV drones or bomber drones or whatever.
    Brandon [01:42:03]: Dumb question that I probably already know the answer to, but for the listeners: why can’t you use a shotgun to shoot down drones that are coming after you? Why can’t you just shoot the thing?
    Yaroslav [01:42:11]: That’s the main weapon that people use against them.
    Brandon [01:42:15]: Why aren’t they very good?
    Yaroslav [01:42:17]: They’re pretty good. There are hundreds, maybe thousands of cases of drones being shot down with shotguns, definitely thousands, both by Ukrainians and Russians. There are even statistics on
    Brandon [01:42:29]: Got it
    Yaroslav [01:42:29]: What percentage of Ukrainian FPV drones didn’t accomplish the mission because they were shot down by a shotgun.
    Brandon [01:42:35]: Got it. So if I’m a guy with a shotgun, I’m walking around, FPV drone comes for me
    Yaroslav [01:42:40]: I don’t recommend that.
    Brandon [01:42:42]: No. I don’t plan on it.
    Brandon [01:42:44]: I’m saying suppose that were the case. Or suppose there is a guy, he’s not me.
    Brandon [01:42:50]: He’s dumber than me, okay? He’s got a shotgun, he’s walking around. FPV drone is sent. Someone says, “Okay, there’s a guy walking around. Kill him. FPV drone go.”
    Brandon [01:43:00]: FPV drone goes after him. And he has a shotgun.
    Brandon [01:43:03]: What are his chances of using that shotgun to shoot down the drone before the drone gets him? Are you allowed to say that?
    Yaroslav [01:43:08]: Depending how good you are with a shotgun. I’ll tell
    Brandon [01:43:11]: Random dude
    Yaroslav [01:43:11]: Like, I was talking to some Ukrainian pilot group, and they told me there was this Russian guy. He was just like Rambo.
    Yaroslav [01:43:20]: He’s like, he like, he shot down like seven FPV drones. They couldn’t, they couldn’t get him. They finally got him, but it was like nothing they’ve seen before, right?
    Brandon [01:43:30]: Got it.
    Brandon [01:43:30]: Your average non-Rambo.
    Yaroslav [01:43:32]: Average non-Rambo will just die.
    Brandon [01:43:34]: Will just die. So there’s like very low chance that they’ll be able to use a shotgun to shoot down the drones.
    Yaroslav [01:43:38]: Rather low chance. Yeah.
    Brandon [01:43:39]: Got it. Well, that was the kind of question I was getting at. And there’s no sort of portable electronic countermeasure that can very effectively get FPV drones if you’re just holding it?
    Yaroslav [01:43:50]: There are plenty. It just depends. Electronic countermeasures are used all across the front line. The tricky thing is that electronic countermeasures cover certain bands of radio frequencies.
    Brandon [01:44:06]: Let me simplify my question. Sorry.
    Yaroslav [01:44:07]: Like, each side tries to find a frequency that will not be covered.
    Brandon [01:44:10]: Let me simplify my question. Is there a man portable system that will give me a greater than 50% chance of living if an FPV drone specifically targets me to come kill me right now?
    Yaroslav [01:44:21]: Look, if your system jams the frequency the drone works on, and the drone doesn’t have fiber optic or last-mile autonomy, then you have a 100% chance that it will not fly towards you. But then what is the chance of not facing a drone that can use a different frequency, or autonomy, or fiber optic? Well, that depends on the area you’re in and who your adversary is in that area, in that zone.
    Brandon [01:44:51]: I guess this question that I was trying to ask was maybe too dumb.
    Yaroslav [01:44:57]: No, it’s a great question. There are no dumb questions here. It’s just that my answers, if you feel the common theme here, are that in practice, in war, things are way more complex than they seem.
    Brandon [01:45:11]: I’ve read tons of things that say that basically if you’re walking around in the open and drones come for you, you’re not 100% dead, but you’re probably dead. I want listeners to understand why. People who are paying a tiny bit of attention to this debate, to this issue, from far away, intermittently, in America, I think don’t understand the weakness of our military against this kind of attack, against drone attack.
    Yaroslav [01:45:48]: I think there was I
    Brandon [01:45:49]: They have a lot of mechanisms, psychological mechanisms, by which they cope with the idea of drones. I would like to bust those mechanisms by explaining why drones defeat human infantry on the battlefield.
    Yaroslav [01:46:01]: It’s just a guided bomb flying at you, and it knows exactly where you are, right? It’s not that it’s the ultimate weapon, but I think one of the things that went viral in the Ukrainian defense tech bubble, even before the words of the CEO of Rheinmetall, was some American battle tank pilot who was interviewed and asked whether he’s afraid of FPV drones, and he’s like, “No, our tanks are strong.” And that went viral among Ukrainians because they’re like, “Dude, you have no idea what you’re talking about. Don’t mess with those drones.” Like, the Abrams tank, great tank, but against an FPV drone, sorry, dude, but it’s
    Brandon [01:46:54]: Not just deadly
    Yaroslav [01:46:54]: Not going to work.
    Brandon [01:46:55]: Deadly.
    Yaroslav [01:46:55]: No, I was like, maybe not from one drone, but like a dozen drones will take it out. So yeah. But there is hope. So you just have to have kinetic countermeasures. Interesting thing-
    Brandon [01:47:10]: Kinetic countermeasure means a thing that shoots down the drone.
    Yaroslav [01:47:13]: Can mean many things. So if you, if you go to Ukrainian east and sort of territories close to the front lines, I think like about 50 kilometers in from the front line, all the roads are covered by fish nets.
    Yaroslav [01:47:31]: You literally, you ride in a corridor of fish nets, and that’s the mechanical countermeasure against the drone.
    Brandon [01:47:39]: You count that as a kinetic countermeasure?
    Yaroslav [01:47:41]: Mechanical. It says mechanical. Yeah.
    Brandon [01:47:42]: Got it. Got it.
    Brandon [01:47:43]: I don’t know all the jargon, so it’s, I’m, I’
    Yaroslav [01:47:45]: Whatever.
    Brandon [01:47:45]: What I’m talking about.
    Yaroslav [01:47:46]: Whatever. Then the tanks, if you look at Russian tanks and sometimes Ukrainian tanks or equipment, they all look like porcupines. They have these long, sticking-out, I don’t know, poles? We talked about poles already on this podcast.
    Brandon [01:48:05]: Different kind of poles.
    Yaroslav [01:48:05]: Different kind of poles.
    Brandon [01:48:06]: A third kind of poles.
    Yaroslav [01:48:06]: That’s the way to protect from drones. That’s the way to make the drone detonate maybe half a meter or a meter away from the actual shell of the tank. Or yeah, sometimes there are nets on top of these tanks, just some extra equipment welded on. Then of course, there are guns
    Yaroslav [01:48:35]: What both Russians and Ukrainians are beginning to experiment with is a kind of interceptor drone, an anti-FPV interceptor drone, which you put on top of something like a gun, a harpoon sort of thing. When you see a drone coming at you, maybe you can notice or hear it from 200 meters or 100 meters. So you have a couple of seconds, and you grab that thing, you point it, and you fire it, and onboard it has certain AI that helps guide the small drone towards the attacking drone and intercept it that way. So those are the things that are being developed, and we’re working on some of these things as well. And then you can imagine armor with hundreds of drones on top of it, which are protector drones. They’re sort of like active armor. Whenever they see a drone-
    Brandon [01:49:27]: Huh
    Yaroslav [01:49:27]: Coming at you, they, like, take off.
    Lasers, Skynex, and the Cost-to-Effect Problem
    Brandon [01:49:29]: That’s cool. What about the kind of things that the Germans are building, which is basically like a big truck with some sort of automated shotgun on it?
    Yaroslav [01:49:40]: Like, they have Skynex. It’s by Rheinmetall, by the guy whom we mentioned today. Skynex is considered to be an okay weapon. Their shots are quite expensive though. So I’ll tell you this different story, about
    Brandon [01:50:00]: It’s about cost to fire each shot really and stuff.
    Yaroslav [01:50:03]: Cost to effect, in a sort of more abstract way. So last year I was speaking at the Land Europe conference. It’s the biggest US Army conference in Europe, called Land Europe. And there was an expo there, and there was a Raytheon, an RTX booth there. And Raytheon is an amazing company. Gosh, we love Raytheon. They’re making Patriots. Patriots are the best. And they make a bunch of other things. And they had this laser gun project there, basically.
    Brandon [01:50:44]: That’s what I was going to ask about next is laser.
    Yaroslav [01:50:46]: The laser thing, they have it in two variations, a 10 kilowatt laser and a 20 kilowatt laser. I’m like, “Okay, 10 kilowatt laser, tell me about it. Can it take down an FPV drone?” He’s like, “Yes, of course it can.” I’m like, “Okay, cool. How much time does it take to take down an FPV drone?” And they’re like, “Well, maybe three seconds.” I’m like, “Three seconds. That’s a lot of time. But okay, maybe fine. And what if the FPV drone tries to evade, right?” And he’s like, “Well, we will retarget it again.” And I’m like, “And then the three seconds start again?” “Yeah.” “Okay. Well, can it take down like a dozen FPV drones?” They’re like, “Yeah, for sure.” I’m like, “Okay, a dozen FPV drones, 30 seconds? Maybe, yes. Two kilometers? Maybe yes, maybe no.” And I’m like, “Okay, how much does it cost?” And he said something like $3 million or something like that.
    Yaroslav [01:51:44]: I’m like, “Okay, $3 million. So that is 6,000 FPV drones.
    Yaroslav [01:51:51]: I doubt this thing will be able to handle 6,000 FPV drones or even 600 FPV drones coming at it at the same time.” So you have this kind of economics. And this product may not necessarily be a product against an FPV drone, or against an FPV drone in an active battlefield environment. It might be guarding a stadium in a peaceful country. And then some random dudes launch a couple of drones above a stadium, you shoot them down. Okay, everyone’s happy, although the drone will fall down, maybe fall on someone’s head. That wouldn’t be cool. So you would want something like catching bad drones with a net above a stadium or something like that. But whatever.
    Yaroslav [01:52:33]: My point is the economics matters
    Brandon [01:52:35]: You’re talking about the 6,000 drones. If you sent them one by one, it wouldn’t, it would just be pew.
    Yaroslav [01:52:40]: But who would send them one by one?
    Brandon [01:52:40]: If you sent a mass of 6,000, it wouldn’
    Yaroslav [01:52:42]: Of course, yeah.
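The cost-to-effect arithmetic in this exchange can be made explicit with a small sketch. The $3 million system cost, the 6,000-drone equivalence, and the three seconds per engagement are the rough figures quoted above; the function names and the 60-second arrival window are hypothetical, chosen only for illustration, and the model ignores misses, evasion, and retargeting.

```python
# Figures as quoted in the conversation; all are rough, recalled numbers.
LASER_SYSTEM_COST_USD = 3_000_000   # "something like $3 million"
EQUIVALENT_FPV_DRONES = 6_000       # "that is 6,000 FPV drones"
SECONDS_PER_KILL = 3.0              # "maybe three seconds" per drone


def implied_drone_cost(system_cost_usd: float, equivalent_drones: int) -> float:
    """Unit cost per drone implied by the system-cost-to-drone-count comparison."""
    return system_cost_usd / equivalent_drones


def time_to_clear(num_drones: int, seconds_per_kill: float) -> float:
    """Seconds to engage drones strictly one after another.

    Assumption: no misses, no retargeting, no slew time between targets.
    """
    return num_drones * seconds_per_kill


def drones_stopped(window_seconds: float, seconds_per_kill: float) -> int:
    """Drones a single laser can engage before a simultaneous wave arrives.

    window_seconds is a hypothetical figure: the time between the wave
    entering range and reaching the target.
    """
    return int(window_seconds // seconds_per_kill)


if __name__ == "__main__":
    # Implied unit cost of one FPV drone in USD.
    print(implied_drone_cost(LASER_SYSTEM_COST_USD, EQUIVALENT_FPV_DRONES))
    # A dozen drones engaged sequentially, in seconds.
    print(time_to_clear(12, SECONDS_PER_KILL))
    # All 6,000 drones sent one by one, in hours of continuous firing.
    print(time_to_clear(EQUIVALENT_FPV_DRONES, SECONDS_PER_KILL) / 3600)
    # Drones engaged from a mass wave with a hypothetical 60-second window.
    print(drones_stopped(60.0, SECONDS_PER_KILL))
```

Under these assumptions the comparison implies roughly $500 per drone, and a wave arriving within a minute lets the laser engage only a small fraction of an equal-cost swarm, which is the point being made about sending drones as a mass rather than one by one.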
    Brandon [01:52:46]: What about just a more powerful laser, like a 100-kilowatt laser or something, that wouldn’t need to spend, that would
    Yaroslav [01:52:51]: No, that’s worse. You need less powerful laser that achieves the same effect.
    Brandon [01:52:56]: For cost of the system.
    Yaroslav [01:52:56]: Yeah, a more powerful laser would be more expensive, heavier, more difficult to transport. It will be more difficult to make many of them. And therefore you wouldn’t be able to cover a long front line, and it would be super expensive to replace if it gets damaged, all of those issues. So the reason why FPV drones or iPhones became so popular is because they’re small and everyone can have one. And so it is with the countermeasures. You were asking me about policy advice. So that’s another sort of mental shift that you’ve got to go through. It’s no longer about an aircraft carrier that costs whatever, $14 billion, and takes forever to build. It’s about mass that you can iterate on very quickly. You can upgrade it. Everyone can operate it. And then that mass, when it is combined, or the technologies, when they’re extrapolated from one domain to another domain, they add up, right, as it happens with software. So I think that’s important.
    Noah [01:54:14]: Can I ask a follow-up question? So Russia is not necessarily the smartest army you could be fighting. What would happen if your adversary was smarter? Do you think things would change meaningfully?
    Yaroslav [01:54:31]: Look, I don’t know if I fully agree with not the smartest army. Who is the smartest army?
    Brandon [01:54:37]: Ukraine?
    Noah [01:54:38]: That’s a great question.
    Yaroslav [01:54:40]: I don’t know. I don’t know.
    Yaroslav [01:54:43]: I think those are like, very dangerous assumptions to make.
    Brandon [01:54:48]: Who was the smartest army in World War I?
    Yaroslav [01:54:51]: Like, well, define smart.
    Russia’s Strategy, Western Assumptions, and Preparing for War
    Brandon [01:54:53]: The United States. Yeah.
    Yaroslav [01:54:53]: Why do you think so?
    Yaroslav [01:54:55]: Why do you think Russia is not the smartest army?
    Noah [01:54:56]: Maybe this is just my own, information bubble.
    Yaroslav [01:55:00]: I’m just like, maybe I agree with you. But I’m naturally wired to challenge those assumptions.
    Noah [01:55:06]: No, that’s a really good point. I guess, from my information bubble, it seems like Russia’s strategy has largely been to just throw resources, people-
    Yaroslav [01:55:17]: You are living in a Western propaganda information bubble, of course.
    Yaroslav [01:55:21]: Like, as am I.
    Yaroslav [01:55:22]: Like, because we’re all rooting for Ukraine to win, right? Sorry, go on.
    Noah [01:55:26]: But going back to this, granted, there’s a history of large powers failing to take over smaller... strategically, you
    Yaroslav [01:55:38]: David and Goliath
    Noah [01:55:40]: They, this
    Brandon [01:55:40]: They fail a lot more now than they used to. The success rate of taking-
    Noah [01:55:44]: That’s true
    Brandon [01:55:44]: Places over has gone way down.
    Noah [01:55:46]: Certainly, yeah. But regardless, I do wonder, like, if Russia had not essentially assumed victory early, it may have been different, yeah
    Yaroslav [01:55:56]: I, like, they’re super stupid, of course.
    Yaroslav [01:55:58]: Like, they were marching in with their parade costumes, and they were thinking they’re going to have a parade in Kyiv in a few days. That was super stupid. And there were lots of stupid things, like they have no regard, no care for human life. They’re sending those Russian folks just, like, without armor, without anything, like folks on crutches, sending them to storm Ukrainian positions. And it’s
    Brandon [01:56:23]: They’re the Zerg.
    Noah [01:56:23]: You think at this point there’s
    Yaroslav [01:56:24]: I actually have a good friend. He’s American. He’s from Seattle. He served in the Special Forces here in the US, had been on maybe three deployments, and then went to Ukraine, volunteered.
    Yaroslav [01:56:39]: He’s been fighting since, like, 2022. He’s a very good friend of mine. So at some point he’s been texting me, and he’s like, “Okay, I’m near Pokrovsk,” and sorry, not Pokrovsk. It was, gosh, the other city, Chasiv Yar.
    Yaroslav [01:56:55]: And he’s like, “Okay, so what the Russians are doing, they’re just creating so much work for all the psychologists who are going to heal those Ukrainian, whatever, riflemen or machine gunners, who are just, like, shooting at the Russians who keep coming nonstop,” right? So the Russians are causing psychological trauma to Ukrainians because they’re dying in such a stupid way.
    Noah [01:57:26]: Jeez
    Yaroslav [01:57:26]: That is indeed stupid of sort of Russian higher command, et cetera, et cetera, et cetera. But then that’s the resource they have. And
    Brandon [01:57:38]: If you’ve got, if you’ve got Zerglings, you use your Zerglings.
    Yaroslav [01:57:40]: That’s the way. That’s their strategy. That’s their way of strategy, right?
    Brandon [01:57:43]: If you’re going to play Back in the That’s what you do.
    Yaroslav [01:57:46]: If you play StarCraft, that’s how Zergs win.
    Brandon [01:57:48]: Are Ukrainians the Terrans?
    Yaroslav [01:57:52]: I don’t know. I hope we will become Protoss soon.
    Yaroslav [01:57:57]: I’m working on that. I’m working on that.
    Brandon [01:58:02]: Protoss had fairly bad political management at the top
    Yaroslav [01:58:04]: I wish Protoss with a speed closer to like, humans or Terrans, whatever it is. Hopefully we can do Protoss technology with a Zerg speed. That would be the best. I think that’s what the housewives are working on in fact.
    Brandon [01:58:20]: You cannot beat those housewives. Do not oppose Ukrainian housewives.
    Yaroslav [01:58:23]: Do not mess with Ukrainian housewives, for sure. Yeah.
    Noah [01:58:26]: Two final questions. First one, you started out by telling us a story about going to a chapel on February 23rd.
    Noah [01:58:34]: Were you able to get married there? Can you finish that story?
    Yaroslav [01:58:40]: We actually, we did get married, but we postponed the wedding as a social event, until the war is over.
    Noah [01:58:49]: Then last question, what do you want our audience to take away? If you have one point you want them to walk away with what would it be?
    Yaroslav [01:58:58]: You want peace, be prepared for war. Got to invest in defense and security.
    Noah [01:59:04]: All right. Thanks. Thank you for talking with us.
    Yaroslav [01:59:06]: Thank you.
    Noah [01:59:07]: Thank you, Noah, for all the great questions.
    Yaroslav [01:59:11]: No, it was fantastic.
    Yaroslav [01:59:12]: Thanks so much.
    Brandon [01:59:13]: Really fun.
    Noah [01:59:13]: Awesome. Thanks.


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
  • Latent Space: The AI Engineer Podcast

    AI-Native Healthcare: 100M Doctor Visits, 10–20 Hours Saved, Prior Auth in Minutes — Janie Lee & Chai Asawa, Abridge

    14/05/2026 | 1h 5min
    Special discounts up for AIE Melbourne (LS discount) and AIE World’s Fair (group discounts up to 25% - CFPs still open for Autoresearch and Vertical AI) Cya there!
    Abridge did not start as a “GPT wrapper”. It was founded in 2018, years before the Cambrian explosion of AI application-layer companies. OpenAI launched ChatGPT publicly on November 30, 2022, and by then Abridge had already spent years doing the unglamorous work of building trust for one of the highest-context, most important workflows in healthcare: the conversation between a patient and a clinician.
    Abridge’s original wedge was clinical documentation: listen to the visit, generate the note, reduce the clerical burden, and let clinicians spend more time with patients instead of the EHR. Because Abridge focused on how doctors actually document, how health systems actually buy, how EHR integration actually works, how clinicians verify outputs, and how missing context during a visit turns into downstream friction across billing, prior authorization, quality, and follow-up, the adoption of LLMs became a force multiplier on a workflow already optimized for sensitive context gathering.
    The company has scaled fast: Abridge says it is projected to support 80M+ patient-clinician conversations this year across 250 large and complex U.S. health systems, with support for 28+ languages and 50+ specialties. It raised $300M at a $5.3B valuation in June 2025, after a $250M round earlier that year.
    Today, Janie Lee and Chaitanya “Chai” Asawa of Abridge join us for another crossover pod with Redpoint’s Jacob Effron (who is on the board of Abridge) to dive into how Abridge is building the clinical intelligence layer for healthcare, starting with ambient documentation, then expanding into clinical decision support, prior authorization, payer/provider/pharma workflows, and eventually real-time agents that act before, during, and after the patient conversation.
    We go inside the product, data, infra, evals, workflow, privacy, and org design choices behind bringing AI into one of the highest-stakes enterprise environments, from 100M+ medical conversations and specialty-specific evals to real-time alerts, EHR integration, de-identification, clinician-scientist teams, and why healthcare may solve some of the hardest AI problems first.
    We discuss:
    * Why Abridge started with clinical documentation, “pajama time,” and saving clinicians 10–20 hours a week
    * The transition from ambient scribe to clinical intelligence layer: save time, save money, and save lives
    * Why conversations between patients and clinicians may be the most important workflow in healthcare (patient visit summary feature)
    * Chai’s “healthcare-coded Glean” framing: context is king, but healthcare raises the stakes on safety, evals, and rollout
    * Why Abridge wants AI to feel like “air conditioning”: always in the background, but only interrupting when it truly matters
    * The prior authorization example: turning a denied MRI weeks later into real-time guidance while the patient is still in the room
    * Why payer policies, EHR data, medical literature, and hospital-specific guidelines make the problem hard, and also create the moat
    * How Abridge thinks about ambient form factors: mobile, desktop, in-room devices, nursing workflows, multimodality, and future AR
    * The multi-sided healthcare customer: CMIOs, CFOs, CIOs, clinicians, patients, payers, and pharma
    * The hardest AI problem at Abridge: high-quality, low-latency, low-cost real-time support in a high-stakes clinical setting
    * When Abridge uses frontier models vs proprietary models, and why its unique data from medical conversations matters
    * Why “every agent is a coding agent underneath,” and how the EHR can be thought of as a filesystem for healthcare agents
    * How Abridge approaches personalization across individual doctors, specialties, and health systems
    * Why “AI slop” is AI without context, and how edits, memories, and clinician preferences create a data flywheel
    * Abridge’s eval stack: LFDs, LLM judges, in-house clinicians, third-party evaluators, specialty-specific evals, and progressive rollout
    * HIPAA, PHI, de-identification, one-way anonymization, customer contracts, and learning from healthcare data safely
    * What changes when you operate at 100M+ conversations: reliability, cost, post-training, model routing, and infrastructure optimization
    * Why the same clinical conversation can serve doctors, patients, payers, pharma, and future clinical-trial workflows
    * How Abridge works with EHRs, and why deep interoperability is table stakes for clinician adoption
    * Why healthcare AI has regulatory tailwinds, why 80/20 does not work here, and why high-stakes domains may drive AI forward
    * Why Abridge embeds “clinician scientists” into product and eval teams
    * What Chai learned from Glean about search, quality, and durable AI infrastructure
    * Why the future of AI infra may look like context layers, event-driven systems, Kafka, Temporal, sockets, CRDTs, and tools built for humans
    * Why Janie changed her mind on “PRDs are dead,” and why crisp written clarity matters more in complex AI products
    * How Abridge uses Claude Code, Cursor, and coding agents internally
    Abridge:
    * Website: https://www.abridge.com/
    * X: https://x.com/AbridgeHQ
    Janie Lee:
    * LinkedIn: https://www.linkedin.com/in/janiejlee
    Chaitanya “Chai” Asawa:
    * LinkedIn: https://www.linkedin.com/in/casawa
    Timestamps
    00:00:00 Introduction and what Abridge does
    00:02:05 From ambient documentation to clinical intelligence
    00:04:04 Clinical decision support and context as king
    00:06:57 Alert fatigue, proactive intelligence, and prior authorization
    00:12:36 Ambient AI form factors and healthcare customers
    00:16:59 The hardest AI problems in healthcare
    00:18:26 Frontier models, proprietary data, and model strategy
    00:21:07 The EHR as a filesystem for agents
    00:24:03 Personalization, memory, and clinician preferences
    00:30:40 Evals, LLM judges, and progressive rollout
    00:36:47 HIPAA, de-identification, and privacy
    00:39:21 100M conversations and operating at scale
    00:44:10 EHR integration and the clinical intelligence layer
    00:46:39 Healthcare regulation, latency, and high-stakes AI
    00:50:11 Clinician scientists and long-tail quality
    00:53:04 Lessons from Glean and durable AI infrastructure
    00:57:03 The future of agentic healthcare workflows
    00:57:34 PRDs, product clarity, and building serious AI products
    01:03:11 AI coding tools at Abridge
    01:04:06 Outro
    Transcript
    Introduction: Abridge, Clinical Intelligence, and the Latent Space x Unsupervised Learning Crossover
    Swyx [00:00:00]: Okay. This is a special crossover Latent Space Unsupervised Learning pod.
    Jacob [00:00:07]: Very excited to do this.
    Jacob [00:00:08]: At this point, we get together once a year.
    Swyx [00:00:10]: Once a year
    Jacob [00:00:11]: And this is a fun occasion to get to do it on.
    Swyx [00:00:13]: I really wanted to talk to Abridge, but I felt very underqualified because healthcare is not something we cover very intensely. It just so happens that Redpoint are big investors in and supporters of Abridge.
    Jacob [00:00:27]: Anytime you want to have a portfolio company on your podcast
    Jacob [00:00:29]: Please, by all means.
    Swyx [00:00:31]: So we’ll introduce our guests. Chai and Janie, welcome to the pod.
    Janie [00:00:34]: Thanks for having us.
    Chai [00:00:35]: Thank you.
    Janie [00:00:35]: We’re excited to be here.
    Chai [00:00:36]: Thank you.
    Swyx [00:00:36]: So for listeners, what do you guys do, just to situate you guys in the company?
    Janie [00:00:42]: Abridge is a clinical intelligence layer for health systems. We really started with documentation and building for clinicians. As we think about reducing the burden that clinicians have, they’re spending 10 to 20 hours a week on documentation, and there’s a massive doctor shortage in the country. We also think that conversations between patients and clinicians are probably the most important workflow in healthcare. It’s where care is given and received, and if you think about the 20% of our GDP that goes towards healthcare, almost everything is a derivative of that conversation, whether it’s the claim, the payment, the actual diagnosis given, or the treatment. We’ve started with the conversation to reduce the burden for doctors on documentation, but we’re really excited about the path ahead as we become this broader clinical intelligence layer.
    Chai [00:01:34]: I’m Chai. I work on clinical decision support at Abridge.
    Swyx [00:01:37]: Yes.
    Chai [00:01:37]: And so as Janie said, we’re uniquely situated because we started off with the clinical note. What I’m really excited about, and where we’re expanding towards, is: what are all the things you can do before the conversation, during the conversation, and after the conversation if you had access to all the context about patients, payer guidelines, and medical literature, and could put that together and serve it? Healthcare could look fundamentally different.
    Swyx [00:02:01]: And that’s the context engine that you guys have?
    Chai [00:02:04]: Yes.
    Swyx [00:02:04]: Is that what it’s called? Okay.
    Swyx [00:02:05]: So historically, as I understand it, the company started in 2018. A lot of people would be familiar with the AI voice notes form factor, where doctors would be like, “Well, do you consent to being recorded?” It replaces handwriting and what have you. But it sounds like more recently there’s been a big transition in the company. Tell me about the broader transition.
    From Documentation to Clinical Intelligence: Save Time, Save Money, Save Lives
    Janie [00:02:26]: So from a transition perspective, we really think about our journey in acts. The first act was: how do we help save time? And that’s where a lot of that original product was.
    Swyx [00:02:37]: By the way, one of those interesting stats
    Swyx [00:02:39]: On your landing page was, doctors spend time after hours.
    Janie [00:02:43]: They call it pajama time.
    Swyx [00:02:44]: Why is that pajama time?
    Janie [00:02:46]: Doctors after work in their pajamas
    Swyx [00:02:48]: In their pajamas. Oh
    Janie [00:02:49]: At home are just writing and catching up on their notes every day.
    Janie [00:02:53]: For some of our favorite customer love stories, we have a Slack channel called Love Stories. We have clinicians telling us, “Abridge has kept us from retiring early,” or “We’re now finally able to
    Janie [00:03:06]: go home and eat dinner with our kids for the first time.”
    Chai [00:03:08]: Save the marriage in some cases.
    Swyx [00:03:10]: One of the quotes was “We’re not divorcing anymore.”
    Swyx [00:03:12]: I’m asking, “Why?”
    Swyx [00:03:14]: Because they’re working too much.
    Janie [00:03:16]: But in terms of where we’re going and where we’re expanding, we really think about our second and third acts. The second is: how do we help health systems save and make more money? Health systems are operating with record-low operating margins. It’s getting harder and harder to serve patients, and they have some regulatory tailwinds but also a lot of headwinds coming their way, and AI is ripe for helping on the save-and-make-more-money piece. And then ultimately, how do we help save lives? The fact that our software and our product is opened millions of times a week, before, during, and after a patient walks in the room, gives us a massive opportunity with products like clinical decision support, which Chai is building, and so many others, to improve patient outcomes. It’s probably one of the most important workflows and problems to be going after right now.
    From Glean to Healthcare: Context Is King
    Jacob [00:04:04]: One thing that’s interesting, Chai, is you came over to Abridge from Glean and clinical decision support, which for our listeners is, in the context of a visit, helping a doctor figure out the right type of care. It’s really a search problem in many ways, going through lots of different data sources. Very analogous to your previous role as one of the earliest engineers over at Glean. I’m sure a lot of our listeners are curious what’s similar about the problems that you’re going after now and what feels different, now that you’re in healthcare.
    Chai [00:04:33]: Very similar. Taking a step back, with every wave, there are a lot of very similar patterns that happen across different products. A lot of social networking products look the same. A lot of credit-based products look the same. And we’re seeing something very similar in the agent era with many companies, of course, in Redpoint’s portfolio and so forth. And the key insight between both companies is that you have amazing models, but context is king. Context is what puts them to work. So I see a lot of similarities, in that this is a healthcare-coded version of Glean, but the differences are really interesting. A couple of things come to mind. First and foremost, the rigor of the setting we’re in. The downside risk is extremely high here in healthcare. It can be fatal in some cases: you prescribe something that the patient is allergic to, for example. Whereas at Glean, it’s “Oh, you got the question wrong.” It wasn’t the end of the world in most cases. And so what does that mean? That shapes our evaluation strategy, both offline evaluation and progressive rollout, and there’s a lot more we could go into there. The second thing that comes to mind is vertical versus horizontal. In both cases, there’s a large variance, but Glean is a much more horizontal company, so there’s a variance of personas and companies that you’re working with. We also have a variance of personas, different types of specialties, different hospital systems. But the variance is a little more narrow. So from a product perspective, you’re able to focus far more, especially when you have a maturing technology and you’re building new products that never existed before. It lets you go after them much more easily, especially in healthcare, where so many problems were solved with labor and process that it’s extremely ripe for AI to keep helping augment and enable.
And the final thing that’s really interesting about Abridge specifically, compared to many other companies in the AI area, is the modality we started with, where we’re ambient and we’re always listening in the background. Many more AI products will go that way, but it’s how we started. And that’s the greatest form of AI we can create: AI that’s seamless. You’re not looking at your screen. It’s always there. It’s always helping you out and being proactive. It’s the Jarvis vision: at every hackathon I went to over the past decade, there was always a Jarvis competitor. But Abridge very much started from that opportunity and continues to go that way.
    Ambient AI and Alert Fatigue: When Should the Product Interrupt?
    Jacob [00:06:57]: One thing that is super interesting then from a product perspective is you have this always-on seamless in the background and then you have to decide when you break the wall almost and say, “Hey, clinician, you might not have thought about X,” or whatever it is that you want to do. And in healthcare traditionally there’s been this idea of alert fatigue and a million pop-ups and then a doctor just ignores all of them. It’s probably a pattern that a lot of builders are thinking through now. How do you think about the right way to intervene or to pop up in a doctor visit?
    Janie [00:07:26]: It’s such a good question. Alerts are notorious in healthcare specifically. Over 90% of alerts are ignored. The first and most important thing is context is everything, as Chai alluded to and I also think about how do we go from being reactive alerting to really proactive intelligence at the point at which it matters most. One thing we like to say is we want our product to feel like air conditioning. It should be in the background just making things better and if there is something that has great clinical risk and we’re acutely aware that intervening now and not later is incredibly important, we should decide to act. But if you think about proactive versus reactive, instead of alerting a clinician during a visit when they’re with their patient having a pretty serious and sensitive conversation, how do we prep a clinician before they walk into the room with that patient? And so historically, clinicians might have to manually go through charts with a patient that they’ve had over the course of months or years and they’ll try to suss out what are the things they should be doing. You can imagine a world with Abridge. We’ll summarize all of the most recent context for you, tell you based on the reason for a visit the patient is coming in for the types of things you should be discussing. And so you’re going into that conversation prepped rather than walking in cold to that patient visit and then having this product interrupt you five or 10 times throughout the visit. And there might be times where it’s really important to interrupt. We have a product called Prior Authorization and so this is when you may go into a doctor’s office with knee pain. They’ll prescribe you an MRI and so many of us have had this experience before, where in four weeks you’ll get a call saying, “Hey, Sean, that MRI that you were prescribed wasn’t approved and why don’t you come back in? 
We’ll figure it out.” In a world with Abridge, we might choose to quietly but still alert a doctor in that visit. And alert is probably not even the word we would want to use. Before a patient leaves, we would want to tell the doctor, “Hey, Doctor, before Sean leaves, you should ask him, has he had physical therapy and has his pain lasted for more than six weeks? Because the Aetna plan that he’s on in California requires six things. We’ve already confirmed four of them have been met ‘cause we have all the context. But these two last criteria, if you can address with Sean before he leaves the room, we could guarantee that your MRI is approved before you leave.” And so when you think about clinical usefulness, impact to the patient, there are instances in which if we can catch a doctor while the patient is still in the room, as we think about save time, save money, save lives, we get to check all of those boxes. But when doctors have 15 minutes between visits, we have to be really thoughtful about when it matters.
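    The prior authorization example above (six policy criteria, four already confirmed from context, two left to ask about) can be pictured as a simple gap check. The following is a minimal sketch under stated assumptions, not Abridge's implementation: the `Criterion` type, the `unmet_criteria` function, and every criterion except physical therapy and six weeks of pain are hypothetical and invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One requirement in a hypothetical payer policy."""
    key: str       # fact name looked up in the patient context
    question: str  # what to ask the patient if the fact is not confirmed


def unmet_criteria(policy, context):
    """Return the criteria that the known context does not yet confirm."""
    return [c for c in policy if context.get(c.key) is not True]


# Illustrative six-criterion policy; NOT a real payer policy.
MRI_POLICY = [
    Criterion("knee_exam_documented", "Was a knee exam documented?"),
    Criterion("xray_on_file", "Is there a prior X-ray on file?"),
    Criterion("nsaid_trial", "Has the patient tried anti-inflammatories?"),
    Criterion("no_contraindication", "Is MRI free of contraindications?"),
    Criterion("physical_therapy", "Has the patient had physical therapy?"),
    Criterion("pain_over_six_weeks", "Has the pain lasted more than six weeks?"),
]

# Four criteria already confirmed from existing context; two still unknown.
context = {
    "knee_exam_documented": True,
    "xray_on_file": True,
    "nsaid_trial": True,
    "no_contraindication": True,
}

# Print only the questions worth raising before the patient leaves the room.
for criterion in unmet_criteria(MRI_POLICY, context):
    print(criterion.question)
```

    The point of the sketch is the shape of the output: rather than alerting on the whole policy, only the unconfirmed criteria surface as questions.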
    Prior Authorization: Reducing Latency in Care
    Chai [00:10:23]: There’s this interesting product opportunity AI has, which is reducing latency in the world. Prior authorization is an example of where care gets delayed, and great AI can reduce that. And the problem with alerts before was partially a technical problem: the quality of your alerts really matters. They’re going to get ignored if they’re noisy, similar to engineering, where there are noisy alerts that you can’t act on. But if you can make really high-quality alerts with both the context, as Janie said, and really high-quality models, then you can create a whole other game.
    Janie [00:10:53]: And I really like that experience because it starts to tease apart, what makes this so hard and unique. One, to make that prior authorization example possible, think about all the data that you need to have. You need to integrate with the electronic health record to know all of the patient context. Do we have access to your previous labs, previous imaging? And then to match you and to know that you’re on Aetna, we have to collect all of the different payer policies and they vary by state. Some of these payer policies live on websites. Some of them live in unstructured 50-page PDF files.
    Jacob [00:11:31]: I thought this episode was
    Jacob [00:11:31]: To make sure we didn’t scare people from healthcare.
    Janie [00:11:34]: But when you think about the things that make it hard, it also gives you the moat.
    Janie [00:11:39]: And then the second is the AI and the model quality we need to be able to hang our hat on. Similarly, when I worked at Opendoor, I worked on pricing models. Every outlier wiped out the margins of 30, and so similarly here in healthcare, the bar for accuracy is so high. And then I’d say the last is that workflow is everything. If insurance companies deploy AI, it typically happens too late, and this is when you have the notorious comical examples of AIs just fighting each other when it’s too late. But if we can pull forward both the use of the AI and the ability to solve problems while the patient’s in the room, you can start to collapse what typically takes weeks or months after your visit, ideally down to minutes or real time. And it’s where healthcare is both very difficult but also extremely rewarding if you can crack it.
    Product Form Factors: Mobile, Desktop, In-Room Devices, and AR
    Swyx [00:12:36]: Just to get some baseline on the form factors, because I’ve seen some videos on your website and stuff. You guys talk a lot about ambient AI. Is it primarily on the phone? Is there any other form factor that people get Abridge in? Is there an Abridge room setup where it’s always on? I don’t know.
    Jacob [00:12:55]: An Abridge podcast studio.
    Janie [00:12:58]: Primary form factor is mobile and desktop. Usually
    Janie [00:13:00]: Clinicians are walking in and out of rooms with mobile but at the end of the day, when they’re closing out their notes or wanting to prep for the day ahead, they might use desktop. We have been having a lot of really interesting partnership conversations with a lot of these in-room device companies as you think about the power of multimodality and even more data, as you think about all of what is not captured today. It is fascinating to think about, especially even as we go into building and scaling our nursing product. It’s one where nurses constantly, as they’re walking in to check in on a patient for two minutes or maybe even 30 seconds,
    Janie [00:13:43]: Starting an Abridge experience is probably going to take longer than the visit. And so what can we do with in-room devices that are always on starts to raise really interesting and fun product questions.
    Swyx [00:13:54]: I was thinking, the way in tech companies we have all these Google Meet
    Swyx [00:13:58]: And other things, we might as well set up entire rooms with just Abridge tech.
    Chai [00:14:02]: Very much. AR glasses and related form factors are also relevant: how do we bring the information to the clinician in real-time without a screen, while still letting them focus on the patient?
    Swyx [00:14:18]: Do you think they want that? I’m skeptical of AR, but I’m curious what you’ve tried.
    Chai [00:14:26]: Admittedly, it’s not a near-term product roadmap
    Chai [00:14:29]: By any means. I’m being far-fetched.
    Jacob [00:14:31]: There’s some sick AR stuff for surgeries.
    Swyx [00:14:33]: Really?
    Jacob [00:14:33]: When people are trying to visualize, you’re about to make an incision but you want to see what the cut might look like or what the body might look like inside, and they can layer in imaging.
    Swyx [00:14:43]: That’s cool.
    Chai [00:14:45]: At some point in the future.
    Janie [00:14:46]: But a lot of our largest customers, at the largest health systems, are integrating already, and so even as we think about it, building into it unlocks a lot of product capabilities.
    Swyx [00:14:57]: And just to establish the terminology. Sorry, and I know I’m asking basic questions somewhat for myself but also for the audience who might be
    Health Systems, Buyers, Clinicians, Patients, and Payers
    Swyx [00:15:05]: Less integrated. When you say health systems, it’s like the Johns Hopkins, the Kaiser Permanentes.
    Janie [00:15:09]: Mayos, the Kaisers of the world.
    Swyx [00:15:10]: These are your customers, right? And the outcome that you deliver for them is happier doctors, reduced cost of processing, reduced mistakes. It’s weird in a sense that I feel like there’s also a secondary customer, the customer of the customer, and I don’t know, do you think about it that way?
    Janie [00:15:28]: The other interesting and complex part of building product is we have our buyers, who are the chief medical information officers
    Janie [00:15:39]: The chief financial officers, the CIOs of these large health systems. Our users today are clinicians, but if you think about who downstream is impacted, it’s patients. And so with every product we build, we think about who we’re building for, who the secondary user is, and what that means in terms of experience, security compliance, or the ROI that we have to make tangible. Like you said, time savings is one of them. But CFOs care about a lot more than just time savings. We have to show that for every dollar you put into Abridge, because you have more compliant documentation or because you have fewer queries coming from your billing team, we save or add real dollars to your bottom line or top line. Those are things that we’re constantly thinking about because of the dynamic across all three sets of users.
    Chai [00:16:32]: There’s a whole other axis too with the payers and pharma
    Chai [00:16:35]: as well. Connecting all these three big stakeholders in healthcare is
    Swyx [00:16:39]: Do the payers ever see your data? Sorry, the payers meaning the insurers, right?
    Chai [00:16:44]: Yes.
    Swyx [00:16:44]: They also see Abridge data?
    Chai [00:16:47]: No
    Swyx [00:16:47]: Like the direct integration to you guys
    Chai [00:16:48]: They wouldn’t see the raw Abridge data but when you’re working together on something like prior authorization, whatever information they need, we’d communicate to them.
    Jacob [00:16:59]: That’s cool. I would love to dig into the AI side. You still have a lot of problems on the AI side. And so maybe to start at the highest level, what’s one of the hardest problems you have to solve in AI at Abridge today?
    The Hardest AI Problems: Quality, Latency, and Cost
    Chai [00:17:11]: To make things simple, let’s build off the prior auth example. One thing Janie talked about is that this data is all over the place, and there’s this combinatorial explosion of procedures, payer policies, and even sometimes different health systems. There can be some cross-product of all of these different considerations you have to take into account. But what’s really hard about this problem is doing it in real time in the conversation. In any AI product, usually the three KPIs you care about are quality, latency, and cost. Now, what we’re saying is we want to do this in real time in the conversation, guiding the clinician. How do we do it in a way that does not break the bank? But we also need very intelligent models, because you’re working with this cross-product of data and all this context layer as well. So you need high intelligence and high quality, because you don’t want the alert fatigue, but you also need to be fast and cost-effective. And so that’s where a lot of clever engineering goes. It’s, okay, without getting into all the details here, can you model these policies in some intermediate representation, or are there other things you can do that make this problem tractable? And of course, the Pareto frontier is always changing, but we are also trying to do this now.
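    The quality, latency, and cost trade-off described above is a Pareto frontier problem: a setup is worth keeping only if no other setup is at least as good on all three axes and strictly better on one. Here is a minimal sketch of that selection. The `Config` type, the candidate names, and all of the numbers are made up for illustration and do not describe any real model or any Abridge system.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str
    quality: float     # higher is better
    latency_ms: float  # lower is better
    cost_usd: float    # lower is better (per request)


def dominates(a, b):
    """True if `a` is no worse than `b` on every axis and better on at least one."""
    no_worse = (a.quality >= b.quality
                and a.latency_ms <= b.latency_ms
                and a.cost_usd <= b.cost_usd)
    strictly_better = (a.quality > b.quality
                       or a.latency_ms < b.latency_ms
                       or a.cost_usd < b.cost_usd)
    return no_worse and strictly_better


def pareto_frontier(configs):
    """Keep only the configs that no other config dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Hypothetical candidates with invented numbers.
CANDIDATES = [
    Config("large-model", 0.95, 2400, 0.040),
    Config("small-model", 0.80, 300, 0.002),
    Config("small-model+policy-ir", 0.93, 450, 0.004),
    Config("large-model-unbatched", 0.95, 3000, 0.050),
]

for config in pareto_frontier(CANDIDATES):
    print(config.name)
```

    In this toy data, the unbatched large model drops out because the batched one matches its quality at lower latency and cost. The other three remain, and choosing among them is a product decision rather than a computation.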
    Model Strategy: Third-Party Models, Proprietary Data, and Medical Conversations
    Jacob [00:18:26]: What implications has that had for what you take off the shelf and say, “You know what? We don’t need to be world-class at X. We’ll just take this from the model providers or from some infrastructure player,” and where you’re like, “No, this is where we spend most of our time focused”?
    Chai [00:18:38]: This is the fun challenge in AI.
    Jacob [00:18:42]: It changes every three months? So
    Chai [00:18:42]: Of course, with the shifting landscape, we try to be extremely thoughtful about predicting the trends of where third-party models are going and where we can uniquely go. Sometimes when you talk about AI models, we’re like, the models are just going to get infinitely better. But I don’t think... Maybe in the grandness of time you could say that, but within every month, every quarter, there are specific ways they’re getting better. They’re training on a lot more coding data to be better coding agents, for example. And so
    Chai [00:19:14]: We have to think about where the unique data is that we’re uniquely training on. Or, to step back a little, where a proprietary model brings an advantage to us is if it can give higher quality, or lower cost and latency for similar quality, very similar to many other companies. And when we can do that is when we have proprietary data. So, for example, we have on the order of eighty million, or now getting close to hundreds of millions, of medical conversations.
    Jacob [00:19:44]: It’s insane.
    Chai [00:19:45]: This is a unique data set. And it’s very interesting because this data set is effectively a large part of the trace between the patient and the provider. That’s where the quote-unquote debugging happens in healthcare. We have these traces at scale, as, as our CEO has even called it, an exhaust that comes out of our product. And so when you have these traces, that’s how you can train better agents on certain use cases, whether it’s your transcription and diarization use cases or note generation models, and we can do that much cheaper and faster. But we’re also always working with these third-party model providers. We closely collaborate with them, and that’s how we predict where the trends are going. The thing that I think about a lot is that I know the model providers are going to train much more on agentic workflows and so forth, so that’s great, because you get a better agentic harness. But the other thing that’s interesting is that, because a large class of queries to the consumer model providers is healthcare queries, they might optimize to train on a lot of healthcare data to encode the knowledge in the weights. And this is just a great thing for us as well, where the off-the-shelf models can keep getting better at general healthcare information. Our strategy is that we have a constellation of models, we can use something for this and something for that, and at the end of the day we only care about the best product experience.
    EHR as File System: Agentic Workflows and Real-Time Interfaces
    Jacob [00:21:07]: And, you have, overall capabilities improving. I’m curious, as these models get better, is there something you look at and you’re “, three months ago, we really couldn’t do that but God, the the latest models really allow us to do it”?
    Chai [00:21:19]: So here’s something interesting that I’ve, been toying with. So all models are... This wasn’t super obvious a year ago but now it’s become clear and clear that almost every agent is a coding agent underneath the hood? So you give it whatever file system, it can write its own code and so forth. So when you think about within healthcare and the use case that we have, you can think of the EHR effectively like a file system. It’s just — it’s a storage of all this information. It’s a lot of information there that cannot fit into the context window, at least of today’s models and you want to use that context effectively for all these product use cases we’re talking about. And so if you have better agents that can, manipulate data, read that data, treat it as a file system as we see they’re going and we know model companies are investing this way, then that very directly benefits us.
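    To make the "EHR as a file system" framing concrete, here is a minimal sketch of the kind of tools an agent could be given: list, read, and search over a record tree, so it pulls in only the pieces it needs instead of loading the whole chart into the context window. Everything here is an assumption for illustration. `RECORD`, `list_dir`, `read_file`, and `grep` are hypothetical names, the record contents are invented, and none of this is a real EHR API.

```python
# Hypothetical in-memory stand-in for one patient's record; not a real EHR API.
RECORD = {
    "labs": {
        "2025-01-10_cbc.txt": "WBC 6.1, HGB 13.8",
        "2025-03-02_a1c.txt": "A1C 6.9",
    },
    "imaging": {
        "2024-11-20_knee_xray.txt": "No fracture. Mild joint space narrowing.",
    },
    "notes": {
        "2025-03-02_visit.txt": "Knee pain for 8 weeks. PT not yet tried.",
    },
}


def _resolve(path):
    """Walk the record tree and return the node at `path`."""
    node = RECORD
    for part in [p for p in path.strip("/").split("/") if p]:
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node


def list_dir(path="/"):
    """List entry names under a directory-like node."""
    node = _resolve(path)
    if not isinstance(node, dict):
        raise NotADirectoryError(path)
    return sorted(node)


def read_file(path):
    """Return the text of a single document."""
    node = _resolve(path)
    if isinstance(node, dict):
        raise IsADirectoryError(path)
    return node


def grep(pattern, path="/"):
    """Return paths of documents under `path` whose text contains `pattern`."""
    node = _resolve(path)
    if not isinstance(node, dict):
        return [path] if pattern.lower() in node.lower() else []
    base = path.rstrip("/")  # "/" becomes "" so child paths start with one slash
    hits = []
    for name in sorted(node):
        hits.extend(grep(pattern, f"{base}/{name}"))
    return hits


print(list_dir("/"))
print(grep("knee"))
```

    An agent with these three tools can search first and then read only the matching documents, which is the same pattern coding agents use on a source tree.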
    Swyx [00:22:09]: Yeah. Okay, cool. Again, just establishing basic things. But we’re going back to the model stuff. I’m really interested in double-clicking more on the real-time element, which is pretty important for both of you. Is real-time just batches of every one minute, every five minutes? Is that how we do it? Or is there some more native, genuinely real-time in the sense that OpenAI has a real-time API or Gemini has a real-time API?
    Chai [00:22:35]: Yeah. Yeah. So today it is more on a batch basis, but there are interesting
    Chai [00:22:41]: Prototypes that we have that are still not fully voice in, text out in that sense. But can you trigger your models, your agents or agentic workflows, at the right times in the conversation?
    Chai [00:22:58]: And so you can imagine different techniques to bring this latency down, and you want to bring the feedback loop down as much as you can. And so there’s a lot of clever engineering there without fully... Maybe one day we’ll do full voice in and text out, train a model to do something like that.
    Swyx [00:23:15]: You do — People don’t want voice in voice out?
    Chai [00:23:18]: Now we aren’t creating experiences that are, during the conversation, inter — It’s almost like
    Swyx [00:23:25]: Might be too disruptive
    Chai [00:23:26]: Too disruptive until, who knows, maybe eventually you could have full voice agents once we improve the quality and the comfort with the technology. But right now that change is much more gradual and it’s more text-focused, text out.
    Janie [00:23:42]: And so much of what our product is currently trying to do is allow a clinician to focus on their patient. Maybe at some point, but right now patients and clinicians don’t want a third voice, at least a literal voice, in that room. And so how do we be there with all the context and information ready at hand when there’s the right moment?
    Personalization: Individual Doctors, Specialties, and Health Systems
    Jacob [00:24:03]: Janie, one thing I’m curious about is how you think about personalization in the product. I imagine every doctor is a special snowflake in their own way, has their own way they like to do things. There are probably a bunch of different approaches you could take to doing that, both within the model layer itself but then also just with clever prompting or engineering. How do you
    Jacob [00:24:20]: Deliver on that?
    Janie [00:24:21]: It’s such a good question. Personalization is massive for us. We think about personalization at three levels. The first is at the individual level, the second is at the specialty level and the third is at the health system or organization level. To your point, there are a lot of individual preferences. When a note is produced, it is almost a reflection, so deeply personal, of a doctor’s work and how they give care. And so do they have preferences on things like style? They might want bullets versus paragraphs, really concise versus comprehensive. They also might have phrases that they really like to use or templates that they want every note to follow. And we see it in our feedback all the time: “We want two spaces in between sentences or I refuse to use this tool.” And so that’s something that we’ve had to build in. The tricky part is how you make sure that stylistic preferences don’t interfere with accuracy and quality, and that’s something that we’ve really had to refine and hone over time. Second is at the specialty level. A cardiologist’s note or workflow is going to look very different from a dermatologist’s.
    Jacob [00:25:32]: I assume cardiology notes are the highest stakes for you guys, given your CEO is a cardiologist.
    Jacob [00:25:36]: It’s “Oh my God, make sure we get this one.”
    Janie [00:25:37]: Shiv, our CEO, is still a practicing cardiologist. He rounds once a month. And so he’s our first call when we want quick and easy user feedback too.
    Janie [00:25:46]: But specialties require a lot of personalization, both in terms of what the product looks like, so we make sure that as new users onboard, we catch that and the product proportionally reflects it. But also on the back end, evals at the specialty level are hard-earned to calibrate and get. What does a really great dermatology note look like? What makes it complete? What makes it compliant and billable is very different than for a primary care doctor. And so it’s not just about what the product experience looks like, but also about back-end tuning and really deepening our understanding for the specialists. What does great output look like? That’s a problem that we need to calibrate internally, externally, online and offline. It takes lots of cycles but is necessary in a high-stakes environment. And then at the health system level, for products like clinical decision support, you have health systems who’ve spent years or decades refining their best practices and they want to know, “Hey, we love your clinical decision support product, but how do we embed our own hospital guidelines into it to inform clinicians before, during or after a visit what best practices should look like?” And as you think about deepening moats as well, when health systems trust us with that data and allow us to productize it directly into the clinical workflow, it makes us a really great partner to health systems who want to build something that truly meets their needs and their practice guidelines.
    AI Slop, Memory, and Product Data Flywheels
    Chai [00:27:23]: And I want to add onto that. The clinical documentation problem is very similar to AI writing that doesn’t feel like your own, and we call that slop. One framing of slop is AI without context. But we have all that context, and the clinicians can have it and can guide it. And so part of the other interesting exhaust for us is memory, which is one of these new systems of record
    Chai [00:27:49]: Almost.
    Janie [00:27:50]: And we also have all the edits people make in our product, and when you think about a data flywheel and how we get better over time, that becomes a really powerful mechanism for going deeper in personalization.
    Jacob [00:28:04]: It’s interesting. I love this idea of working with systems on the guidelines they built up over a long time. I feel like so many of the best AI app companies today are... The question is: How do you take the expertise that a law firm or a bank has built up over many years and then add that as context and also a special sauce over an AI tool? And so it seems like y’all are really doing that very effectively.
    Janie [00:28:24]: We’re now starting to have our customers ask, “What are other customers doing?”
    Janie [00:28:28]: “And how are they doing it?”
    Janie [00:28:30]: And as we think about having visibility across such a large set of care being delivered right now, that’s a really interesting place we could also partner.
    Swyx [00:28:40]: I’m just curious. I — This may be a nothing question but, how different are health system guidelines from each other? Don’t they all converge to the same thing? And if not, where do they differ?
    Chai [00:28:52]: At a really high level, they’re going to talk about very similar things, but the difference is probably more in the details. “Oh, you should refer to specialists only when XYZ conditions are met,” and so forth, and different organizations may have different practices and guidelines around that. So at a high level they’re talking about similar things, but the details are, of course, what shapes the context and the decisions you make.
    Swyx [00:29:15]: And this all goes into the context engine and it might affect the notes but maybe not.
    Chai [00:29:21]: For these local pathways, we’re definitely thinking about it a little more for our clinical decision support product.
    Chai [00:29:26]: So yeah.
    Swyx [00:29:27]: Which is your stuff, yeah.
    Swyx [00:29:28]: And then the memory, which you raised, just tell us more about that. What have you tried in memory? What’s the structure of the memory? What works? What doesn’t work?
    Chai [00:29:38]: There are, of course, many different ways you could do memory: can you bake it into the model weights or can you do it in some external store? For us, what’s interesting is that when the models are rapidly changing, whether in-house or third-party, baking it into the model weights could be a little throwaway. And so you need to find a way to decompose the problem and separate the preferences from the underlying models. The thing that’s easiest to start with, and that we’re most excited about right now, is having a separate store for memory, where you have, for example, a memory sub-agent that’s working in the background, figuring out which parts of the clinician’s actions are important to remember for the long term. And then you can also imagine background jobs that are running and collating these memories, similar to sleep, of course, and to patterns other products use as well. That means learning over all the action data we have: again, note edits, the conversations they had and the actual transcripts.
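A rough sketch of the "separate store plus background consolidation" pattern Chai outlines, offered as an editor's illustration under stated assumptions. The `MemoryStore` class, its method names and the promote-after-N-repeats rule are hypothetical; the episode does not say how Abridge decides what to remember. A simple frequency threshold stands in here for the memory sub-agent.

```python
from collections import Counter, defaultdict


class MemoryStore:
    """Hypothetical preference store kept outside any model weights."""

    def __init__(self, min_count: int = 2) -> None:
        self.min_count = min_count
        self._events = defaultdict(list)   # clinician_id -> raw observed actions
        self._memories = defaultdict(set)  # clinician_id -> consolidated preferences

    def observe(self, clinician_id: str, action: str) -> None:
        # Cheap write path: just buffer the raw action (e.g. a kind of note edit).
        self._events[clinician_id].append(action)

    def consolidate(self) -> None:
        # Background job: promote repeated actions to long-term memory,
        # then clear the buffer so each event is only counted once.
        for clinician_id, actions in self._events.items():
            for action, count in Counter(actions).items():
                if count >= self.min_count:
                    self._memories[clinician_id].add(action)
            actions.clear()

    def recall(self, clinician_id: str) -> list[str]:
        # Read path: preferences to inject into whichever model is in use.
        return sorted(self._memories[clinician_id])


store = MemoryStore()
store.observe("dr_a", "use bullets")
store.observe("dr_a", "use bullets")
store.observe("dr_a", "expand abbreviations")
store.consolidate()
```

Because the store is independent of the model, swapping the underlying model does not discard what has been learned about a clinician, which is the concern Chai raises about baking preferences into weights.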
    Evals: LFD, LLM Judges, and Clinical Safety
    Jacob [00:30:40]: What about evals? How in the world do you... It is such a complex product surface area. We would love to hear you riff on that and also how has that evolved? I’m sure you’ve gotten better at it, so any learnings along the way.
    Janie [00:30:50]: From an evals perspective, from day one when we build any new product or feature, we think about what good looks like. There are table-stakes things like clinical safety, but then you start to get deeper into what good quality looks like. When you go into something like our core product, there’s stuff like style and completeness, and there are things like whether this note becomes something that can be billable, which is very high stakes for a health system. We have a number of ways in which we get confidence for this. We have internal, in-house clinicians who do what we call an LFD process to give us our very first pass at whether this is or isn’t a good enough output: look at the effing data.
    Jacob [00:31:41]: LFD?
    Chai [00:31:42]: That’s why I was smiling. I was “Is Janie going to mention what it stands for?”
    Jacob [00:31:46]: I was not... There’s like a million acronyms.
    Jacob [00:31:48]: How am I supposed to know that I don’t? So “Oh yeah, of course, an LFD.”
    Swyx [00:31:51]: I’ve never heard of LFDs.
    Chai [00:31:53]: It’s a bridge for sure.
    Janie [00:31:55]: I got through three days and then I had to ask someone.
    Janie [00:31:58]: I thought it was just me that didn’t know
    Janie [00:32:01]: It’s our internal process.
    Swyx [00:32:02]: But “look at the data” is a meme in ML, ‘cause you tend to not look at it. You just want to look at number go up.
    Chai [00:32:06]: Exactly.
    Swyx [00:32:07]: But yes.
    Janie [00:32:08]: So we make sure we look at the data, and then as we think about all of the components of good output, we, one, create LLM judges across all of these, and we make sure, with annotated data and either internal or external evaluators, that these judges are calibrated. And then depending on the stakes, we also work with in-house and third-party evaluators across all of these before we ship any big change. And the goal, in terms of evolution, is how do you go from this process taking months down to weeks, down to days? Some of it is a true science and ML problem. A lot of it is also just hard operational work. Have you planned ahead in terms of what you need? Have you really optimized the capacity that you need across all of the different specialties? Have you gotten a really good sense of which third parties are great to work with for what use cases? This takes a lot of domain expertise and lots of mistakes and errors in figuring that out. And so as much as it is an ML problem, so much of it has also been operational gains that are hugely important, where domain-specific expertise is everything.
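One common way to check whether an LLM judge is "calibrated" against human annotators is chance-corrected agreement. The sketch below computes Cohen's kappa for binary pass/fail labels. It is an editor's illustration only: the episode does not say which agreement metric Abridge uses, and the function name and example labels are made up.

```python
def cohens_kappa(judge: list[bool], human: list[bool]) -> float:
    """Chance-corrected agreement between two raters on binary labels."""
    if len(judge) != len(human) or not judge:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(judge)
    # Fraction of items where the judge and the human gave the same label.
    observed = sum(j == h for j, h in zip(judge, human)) / n
    # Agreement expected by chance, from each rater's own pass rate.
    p_judge = sum(judge) / n
    p_human = sum(human) / n
    expected = p_judge * p_human + (1 - p_judge) * (1 - p_human)
    if expected == 1.0:
        # Both raters gave one identical label to every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels: did each note pass a "completeness" check?
judge_labels = [True, True, False, False]
human_labels = [True, False, False, False]
kappa = cohens_kappa(judge_labels, human_labels)
```

Raw agreement here is 75%, but kappa is lower because some of that agreement would happen by chance. A team might track kappa per specialty and only trust a judge for automated gating once it clears an agreed threshold.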
    Specialty-Level Evaluation and Progressive Rollouts
    Jacob [00:33:23]: But it’s funny, ‘cause I feel like people talk about healthcare like it’s one giant market and the reality is
    Jacob [00:33:26]: It’s dozens and dozens of sub-markets. And so it feels like in your evals you have to build that up across the board, probably.
    Swyx [00:33:34]: And is specialization the primary cardinality at... That’s the word that comes to mind.
    Janie [00:33:40]: Sometimes, depending on the product or the use case. If we’re making a note improvement or feature for a particular specialty, definitely, but we have products that are for nurses. We have products that are really aimed at making the document or the output a lot more billable. And so we’ll want to work with coding teams and not necessarily clinicians. And so like
    Jacob [00:34:05]: Coding meaning healthcare coding.
    Janie [00:34:06]: Yes. Yes.
    Jacob [00:34:07]: Not
    Chai [00:34:07]: Yes. I see you.
    Swyx [00:34:07]: Other kinds.
    Janie [00:34:09]: But is this output proportional to the work that was delivered? Is there sufficient documentation to justify the amount that a health system may end up charging? And so, specialty sometimes but also domain, which is very different across all of the different products that we’re working on. Building out that network is not easy and is where a lot of our operational investments have gone.
    Chai [00:34:35]: And I see a lot of analogies to self-driving cars here, where part of it is we really want progressive rollout of features to test in the real world: is this useful? Is this going to work? One big difference compared to past lives is that before, I’d build a product, maybe I’d alpha it and then I’d GA it the next week, ‘cause I’m “Go, move fast, ship,” and whatnot. But the mentality here is that I want to make contact with reality as quickly as possible, but I want a progressive rollout. Because however large an offline eval set I get, I want its distribution to match the real-life distribution. And over time, by rolling out early, similar to how Waymo has a tagline, “The world’s most experienced driver,” another thing that can at least linearly increase for us is the size of our evaluation sets, both offline and online, and it all feeds back.
    Janie [00:35:25]: Something that’s been earned over time, speaking of evolution, is just the trust we’ve gotten with customers. Historically, a lot of these health systems, when they bring on new vendors, their release cycles are quarters, sometimes twice a year. We’ve gotten our customers onto monthly release cycles, which is pretty fast for health systems, but what is more exciting over the last, call it, few quarters has been that a subset of our customers have said, “We want to innovate with you. We trust you,” and we have a pretty decent chunk of our customers who say, “We’ll develop with you outside of these monthly release cycles. We have a higher tolerance. We know that the stakes are very high but we want to be the first ones using these products, giving you feedback.” And so for a pretty substantial set of our customers, we’ve been able to convince them to let us ship in this gradual way before GA. Something we talk about a lot internally is that trust is earned in drops and lost in buckets, and so we still can’t do what I used to do when I worked at Loom. We had 30 million users. I’d just be rolling out experiments left and right. The bar is still quite high for iterative rollout, but because of the trust we’ve earned, we’re able to learn at pretty high volume very quickly.
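The progressive rollout that Chai and Janie describe, with some customers opting in ahead of general availability, is often implemented with deterministic percentage bucketing. The sketch below is a generic version of that pattern, not Abridge's system; the function names, the `early_access` set and the customer IDs are hypothetical.

```python
import hashlib


def rollout_bucket(customer_id: str, feature: str) -> int:
    """Map a (feature, customer) pair to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{feature}:{customer_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(
    customer_id: str,
    feature: str,
    percent: int,
    early_access: frozenset[str] = frozenset(),
) -> bool:
    # Customers who opted in to pre-GA development always get the feature.
    if customer_id in early_access:
        return True
    # Everyone else is admitted once the rollout percentage passes their bucket.
    return rollout_bucket(customer_id, feature) < percent


partners = frozenset({"health-system-a"})
```

Because the bucket depends only on the feature name and the customer ID, raising `percent` from 10 to 50 keeps every customer who already had the feature and only adds new ones, so the online evaluation population grows without churn.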
    Privacy, HIPAA, and De-Identification
    Swyx [00:36:45]: Your scale is still pretty huge.
    Swyx [00:36:47]: We were going to go into scale in a sec. One thing I wanted to follow up on in evals, which, again, just coming from a generalist engineer point of view, just thinking through what people would be scared of in doing this, is the privacy and HIPAA
    Jacob [00:37:00]: Elements of this. I have zero experience in that. What do you have to do? What is surprisingly not that bad?
    Chai [00:37:06]: So one thing that’s really important here from a compliance perspective is that any of the data we use needs to be de-identified, any real-world data we use as a basis of the online eval sets we’re learning from. And there are very clear government guidelines on what counts as PHI. And so we’ve even built models that can take, for example, a clinical transcript and remove all the key PHI indicators, so you have a scrubbed, de-identified version. And one thing that’s important is that first you’ve got to get confidence in that model in the first place and prove that out. Because now you have multiple probabilistic systems on top of each other.
    Chai [00:37:46]: But once you have that, then you can train on it, use it for evaluation and so forth, provided you also have, and this is one of the cool things you can do from a business side, the right data contracting with your partners.
    Jacob [00:37:57]: Is the anonymization one way? Once it’s done, you cannot undo it? Or is there someone
    Chai [00:38:01]: Yes
    Jacob [00:38:02]: Who holds the master key that can... Yeah, okay. So it’s one way.
    Chai [00:38:05]: It’s one way. Yeah.
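To make the "one way" property concrete, here is a deliberately simple editor's sketch. Chai says Abridge uses trained models for de-identification; the regular expressions below are a swapped-in stand-in so the example stays self-contained, and they are nowhere near sufficient for real PHI removal. The patterns, placeholder names and sample sentence are all hypothetical. What the sketch does show is that matches are replaced with fixed placeholders and no mapping back to the original values is kept.

```python
import re

# Illustrative patterns only; real PHI covers far more than these three cases.
_PATTERNS = [
    ("[DATE]", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}-\d{3}-\d{4}\b")),
    ("[MRN]", re.compile(r"\bMRN[:\s]*\d+\b", re.IGNORECASE)),
]


def scrub(text: str) -> str:
    """Replace matched identifiers with placeholders, keeping no lookup table."""
    for placeholder, pattern in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


# Synthetic sentence, not real patient data.
scrubbed = scrub("Seen on 3/14/2024, call 555-123-4567, MRN: 00123.")
```

Since nothing records which date or number each placeholder replaced, the scrubbed text cannot be reversed from the output alone, which matches the "no master key" answer above.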
    Jacob [00:38:06]: That’s how it works. I just wanted to... Because there’s a lot of this learning from feedback and everything that you would want to debug more, but you can’t because you just physically don’t allow yourself to.
    Janie [00:38:17]: Some of it’s also written in our customer contracts in terms of who can or can’t access PHI data, how long do we retain it,
    Jacob [00:38:27]: Very good
    Janie [00:38:27]: Before it gets de-identified. And so we have a pretty high bar for who can access that PHI data, just to make sure that we always respect our customer data and privacy. But that’s something that we partner with our customers on too, to make sure that as we want full, as close to precision as possible in that quality
    Janie [00:38:48]: We can still use it.
    Jacob [00:38:50]: But it’ll be fascinating to see how that space evolves. Because I used to work at a company that did a lot of healthcare data in the cancer space, and if you asked the average cancer patient, “Hey, do you want other patients to be able to learn-”
    Chai [00:39:03]: Take it.
    Jacob [00:39:03]: “... Learn from your experience?”
    Chai [00:39:04]: Take it all.
    Jacob [00:39:05]: They’re “Please.”
    Jacob [00:39:06]: “I’d love, nothing more than for other people to be able to learn from
    Jacob [00:39:10]: The experience that I had.” And so in the past it was a lot harder to do that learning. But with this technology, that might really be practical and so it’ll be fascinating to see how that continues to evolve.
    Chai [00:39:21]: There’s so much in our data set of 100 million conversations.
    Chai [00:39:26]: You can imagine things like insights that you can give to the clinician. How could you have reacted to this? Coaching, or insights around which treatments are effective. Because you have, again, this data source that was never captured before, but that’s where intuition or experience is created from, going back to this idea that the conversation is the agent of truth.
    Operating at Scale: Reliability, Cost, and Token Efficiency
    Jacob [00:39:46]: Back to the 100 million conversations, I feel like you have this insane scale that maybe only a few other AI app companies have and everyone else dreams of. So not everyone has had to confront this yet, but maybe just talk about some of the challenges of operating at that scale and what our listeners have to look forward to if they ever get to this level of scale.
    Chai [00:40:05]: At larger and larger scale, of course there’s general infrastructure reliability. In any given startup, you’re building the plane while it’s flying. So there’s some notion of that. But what gets interesting on the AI and ML side for sure is that as you get to more and more scale, one, you have the data to first and foremost do this. But you also start thinking about costs or infrastructure in a whole different way at scale versus in a prototype.
    Chai [00:40:34]: You can use the most expensive model, you can burn as many tokens as you want but when you’re doing 100 million conversations
    Jacob [00:40:41]: Token max on leaderboards are less upsetting than that context.
    Chai [00:40:45]: . When you’re doing that and so that comes for we have the data and we also have the team that’s able to post-train based on this and you can optimize for efficiency, especially in areas where you believe that maybe a lot of the quality headroom is less so and you don’t expect the other off-the-shelf models to go that way, such that you want to do, efficiency maximization, in terms of compute and tokens.
    Jacob [00:41:08]: I feel like you guys live in the future in some way where most use cases today are really just in use case discovery mode, where it’s “God, I really hope I can find something that can get to scale,” and so you’re always going to use the most powerful model. And then the few things that do get to this level of scale, you start to do those optimizations.
    Chai [00:41:22]: It’s a natural trajectory where it’s like zero-to-one, we’re not talking about any of these optimizations.
    Chai [00:41:26]: But when maybe we’re in the one-to-100 or so forth, then we’re in optimization mode, and what works out really well is you’ve got all this data from zero-to-one that lets you do this.
    What Comes Next: The Conversation as the Shared Healthcare Platform
    Jacob [00:41:36]: That’s fascinating. I feel like one thing that’s so interesting about the Abridge footprint is that you’re in the doctor-patient visit in real-time. I always like to say there’s probably 50 years’ worth of product you could build on top of that. What are each of you most excited about building, either in the short term or medium term or even long down the line?
    Janie [00:41:53]: Something that I get really excited about is that the same conversation can serve so many stakeholders. If you think about the conversation, a doctor needs to know, what is the documentation, how do I make sure that this fully represents the care I gave? A patient needs to know, “What the heck just happened? This was really overwhelming. What are my next steps?” A payer needs to know, was proper and appropriate care given? A pharma company might want to know why a drug isn’t being properly used, or whether there is a good candidate for a clinical trial they’re about to run. And where I get excited is that our product, our platform and our infrastructure can be the same product across all of those things, and can start to collapse what are today separate, very expensive, complex systems that serve each one of these stakeholders in very different ways into a singular platform that enables not just more efficiency across the board but also better outcomes for everyone. All of us experience healthcare in probably very painful ways, and knowing that there is a world in which we can simplify a lot is really exciting to me, and it all starts with the conversation.
    Chai [00:43:15]: It’s interesting. I think of it as very similar to going back to the KPIs that any AI product cares about. How do you increase quality of care? How do you reduce latency to care? And how do you reduce costs? Which is huge in healthcare
    Jacob [00:43:28]: They call it the triple aim in healthcare.
    Chai [00:43:30]: But it’s very similar to building AI products, and the thing that really excites me is that latency piece. We talked about one example earlier, prior authorization: can you reduce the latency to care? But you can imagine so much more. As soon as a lab value gets updated, do you have a background agent that kicks off and uses all the context to say, “Oh, hey, the patient should do this next,” for example? And flagging that to the clinician, who’s always in the loop, but reducing that latency to care. And then you can imagine, and this is much further down the road, even connecting that directly to the patient and the consumer. And so how can you build a bridge to all of these things?
    EHR Partnerships and the Clinical Intelligence Layer
    Jacob [00:44:10]: Very cool. The connections piece is just an ever-growing thing. And one of the key partners is the EHR and I wonder what that relationship is like. Will they look at this as something that is valuable enough that they want to own someday?
    Janie [00:44:29]: On our partnerships with the EHRs, we know that we have to be extremely close partners with all the EHRs we work with. Being able to pull and push all of the data into the right places is table stakes; if we can’t do that, health systems don’t want to use us. Second, the reality of today is that clinicians spend a lot of their days in the EHR. So much of what allowed us to win in the largest health systems was pretty direct and very close partnerships with some of the largest electronic health records that allowed us to pull and push data with APIs that weren’t ready out of the box. And clinicians want to save clicks. Anytime we introduce a new product that adds two clicks for them in their day, they’re “We’re not going to use it.”
    Janie [00:45:21]: They have 15-minute back-to-back appointments with their patients. They’re spending hours during pajama time doing documentation. Every second and every minute counts, and so we really think about being deeply integrated into the EHR as also table stakes to getting real usage and adoption. And for anything that we build or introduce, we talk a lot internally about “earn the right,” which is that we have to provide so much value or save so much time that people will use us. But those are the two things that are close to us: we know that the product won’t be used unless it is deeply interoperable.
    Chai [00:46:01]: And strategically, to your point, it’s like, what does the EHR want to own versus us? EHRs are really focused on the clinical workflows and so forth, but some of the things that we’re talking about here, I think, are traditionally outside of that domain, like connecting payers and providers together with provider policies, or the clinical trial matching, as Janie brought up. And so we position ourselves as building this entirely new clinical intelligence layer across, again, providers, pharma and payers.
    Chai [00:46:33]: And so that’s a whole different ballgame that we try to play
    Chai [00:46:36]: In combination with them.
    Jacob [00:46:37]: But it’s like a different layer of scope.
    Healthcare AI Regulation, Technical Depth, and What Changed Their Minds
    Jacob [00:46:39]: I’m curious, you are both relative newcomers to healthcare. There are lots of futuristic healthcare AI takes of “Oh, everything will look different.” Now that you’ve been in healthcare for a bit, and you live at the edge of AI, what have you changed your mind on as you think about what healthcare looks like in ten, 20 years? Any updates to your mental model from the time being close to the problems?
    Chai [00:47:02]: One thing that I
    Chai [00:47:04]: Was hesitant about before, and it’s a common thing people ask me about when I’m trying to recruit engineers, is definitely, oh, healthcare, heavily regulated space. And it is, rightfully so. You want to keep the patients safe at the end of the day. But one of the interesting things that surprised me coming to the company is how many really favorable regulatory tailwinds there are as well. The government really wants interoperability between all these systems that we talked about, so that agents can access this information. And just in January, the FDA released updated guidance on clinical decision support, which is what I work on. They used to have guidance from like 2022 that required you to mention all these options and do all these other things, but the update is written in a very forward-looking way. And so for me, what’s been really cool is that there’s this very special moment in AI in general, we all know that, but there’s also a special regulatory moment in healthcare.
    Janie [00:48:05]: One thing I would call out is that for the very reasons things are higher stakes or potentially considered more difficult in healthcare, it’s where some of the hardest AI problems will get solved first, just because the bar is so high. When I first joined, I was “Oh, this is where we’ll be on the tail end of where all of the AI innovation will be able to be applied.” But when you think about zero-error evals or multi-step workflows that have really low tolerance, a lot of the innovation will happen here just because we have to or else we can’t ship.
    Jacob [00:48:42]: ‘Cause like in other domains, you’d much rather just solve the 80%-is-good-enough problems first
    Janie [00:48:46]: 80/20 doesn’t work here
    Chai [00:48:48]: And building off that, traditionally, there was a bit of stigma that, oh, healthcare companies are not that interesting from a technical perspective or I’ve seen that or faced that myself. But these are really hard and fun problems from a pure technical perspective beyond just the impact. How do you bring the latency of this thing down and make it really high-quality?
    Reducing Latency: Clinical Workflows, Agents, and Implementation Reality
    Jacob [00:49:07]: How do you bring the latency of things down?
    Chai [00:49:10]: Yeah. Yeah. Yeah. So okay, let’s answer the latency question. Hopefully this isn’t too redundant with some of the things I’ve said earlier, but part of it is that with any latency, you have to ask what is really your bottleneck. In a lot of workflows, sometimes it’s the model itself. And so that’s where our data flywheel, our post-training team and so forth come in: can you make the models far more efficient? So that’s one aspect of latency. But there are whole other aspects of latency where, on top of that, if you use a constellation of different models, it’s like thinking fast and slow. Can you first use a cheap, fast model that triages and hands off to a larger model where you get more intelligence, and so forth, and so all these
    Chai [00:49:56]: Clever tricks to make it work.
    Chai [00:49:58]: And by the way, we also realize that the Pareto frontier is changing, and so these tricks may not get us to where we want to be in five years, but we need them if we want to build a useful product right now.
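The "thinking fast and slow" triage Chai mentions is commonly called a model cascade. The sketch below shows the control flow only, as an editor's illustration: `cascade`, the confidence threshold and both stub models are hypothetical stand-ins, and real systems would call actual inference endpoints and derive confidence in some model-specific way.

```python
from typing import Callable, Tuple

FastModel = Callable[[str], Tuple[str, float]]  # returns (answer, confidence)
SlowModel = Callable[[str], str]


def cascade(
    query: str, fast: FastModel, slow: SlowModel, threshold: float = 0.8
) -> Tuple[str, str]:
    """Answer with the fast model when it is confident, otherwise escalate."""
    answer, confidence = fast(query)
    if confidence >= threshold:
        return answer, "fast"
    # Low confidence: pay the extra latency and cost for the larger model.
    return slow(query), "slow"


# Stub models standing in for real inference calls (purely illustrative).
def stub_fast(query: str) -> Tuple[str, float]:
    # Pretend short queries are easy and long ones are uncertain.
    confidence = 0.95 if len(query) < 20 else 0.4
    return f"fast:{query}", confidence


def stub_slow(query: str) -> str:
    return f"slow:{query}"


easy = cascade("refill request", stub_fast, stub_slow)
hard = cascade("summarize the full medication history", stub_fast, stub_slow)
```

The threshold is the knob that trades latency and cost against quality, which is why a shifting frontier of model price and capability can make today's setting obsolete.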
    Jacob [00:50:11]: Should we go to the quick-fire or you want to ask more about Abridge? We can stuff everything that’s not Abridge into the quick-fire
    Swyx [00:50:16]: I don’t mind. I feel like Janie was on the topic of more long tail stuff, which is
    Swyx [00:50:21]: Not the eighty/twenty thing, and that really matters. If you have any tips or cool stories or just general approaches that have worked for you, that’s interesting to dig into.
    Janie [00:50:32]: One of them is even just how we staff our teams looks different than a traditional software engineering team, I’d say.
    Swyx [00:50:40]: Let’s go.
    Clinician Scientists, Edge Cases, and Evals at Scale
    Janie [00:50:41]: We have a bunch of folks with different roles who are clinicians and so we have this role called the clinician scientist and I heard one of our leaders refer to them as mutants recently. But they are people who’ve had clinical backgrounds, so MDs typically, who are also deeply technical, somewhere, on the spectrum of like a full stack engineer all the way to like extremely scrappy prompter. But having each of these people embedded within our teams instantly raises the bar for everything that we build because not only are they determining, is this product clinically useful but they’re deeply embedded in our whole evals process. And so when we talk about LFDs, when we talk about what is our actual evaluation criteria, you don’t want Chai or me creating what those are because we don’t have clinical background. But is probably unique to Abridge but has been game changing. And when you think about where the puck is going, you have people build with clinical backgrounds who are technical and where AI tools are going, they just become
    Janie [00:51:53]: More and more, critical and like the killers of the team. And so that’s one. And then the second is just the scale at which we do evals to catch that long tail up front before anything ever gets into production is something that we’ve pretty much like really started to fine-tune, both from a scale but when do we know we need to get several hundred versus several thousand offline responses, what helps us make that quick decision and make this less of an art and as much of a science as possible. But that’s also been something we’ve had to tune over time.
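    The "several hundred versus several thousand offline responses" question can be framed with two textbook sample-size formulas. This is a generic statistical sketch, not Abridge's stated method; the function names and default values are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def n_for_margin(margin: float, confidence: float = 0.95, p: float = 0.5) -> int:
    """Samples needed to estimate a pass rate p to within +/- margin.

    Uses the normal approximation n = z^2 * p * (1 - p) / margin^2,
    with p = 0.5 as the worst case.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)


def n_to_observe(rate: float, confidence: float = 0.95) -> int:
    """Samples needed to see at least one failure that occurs at `rate`.

    Solves 1 - (1 - rate)^n >= confidence for n.
    """
    return math.ceil(math.log(1 - confidence) / math.log(1 - rate))
```

    Under these assumptions, a +/-5% margin needs a few hundred responses (`n_for_margin(0.05)` is 385), while a +/-1.5% margin or a one-in-a-thousand failure mode pushes the count into the thousands (`n_for_margin(0.015)` is 4269, `n_to_observe(0.001)` is 2995).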
    Swyx [00:52:27]: And you have partners who opted in to give you those evals.
    Janie [00:52:31]: So we work either internally or with third-party for offline evals and then we have customers who also agree to give us, whether it’s like thumbs up, thumbs down to like choose this or that, a lot of data to get us to what is as close to fully confident as possible.
    Swyx [00:52:51]: The term that comes to mind is
    Swyx [00:52:53]: Like active learning on things where you’re weak. I feel like it’s a lost art
    Swyx [00:52:58]: Is a lot of the polish that comes into doing something like this.
    Janie [00:53:02]: Really.
    Chai [00:53:03]: Hundred percent.
    Lessons from Glean: Technical Foundations and AI App Infrastructure
    Jacob [00:53:04]: Maybe, on a totally unrelated note, Chai, you had a very storied run at Glean before heading over to Abridge. And so I’m curious, since it was one of the early AI app success stories. As reflecting back on that experience, what do you think Glean got most, maybe most wrong? Yeah, curious for your reflections.
    Chai [00:53:24]: The... I attribute Glean’s success really to very strong technical foundations, that have really stood the test of time. And so it started with — it started with a known problem and like finding information where work is hard. The best technology at the time was to build really high-quality search. A lot of times enterprise search startups failed because the quality wasn’t great enough. But the learning that people took away from that is, oh, enterprise search is not good enough. And so like quality, really changes the game of like if something can be useful or not. It’s like similarly like people may have taken it that way, “Oh, Alexa voice assistants are not that useful.” But when you have quality, things can change the game. And so Glean’s early foundations, by bringing people who had built search at Google, the best place to have ever built search and being really creative and having a very concrete problem to solve but with the right technical backgrounds, laid the foundation for all of its success for the many years to come. And what’s interesting is always figuring out, hey, how does a company adapt in this, as we all know and we’ve talked many times, in this changing landscape. And so for Glean, how do you put this context layer to the use, has been the thing that we’ve really, the last few years, has been the fun from the challenge. That where like you could say, that’s been the opportunity for the company as well as the challenge as well.
    Jacob [00:54:46]: Definitely a competitive market. It feels like one at the epicenter of the foundation models and, the hyperscalers, so it’ll be interesting to see how it all plays out.
    Chai [00:54:55]: When you think about can you build something that helps everyone at knowledge work as well is a massive opportunity.
    Jacob [00:55:02]: Always my mental model is like there’s a few markets that are like the foundation model companies have to win or are like big enough to go after and it’s probably like consumer code and that.
    Jacob [00:55:11]: And so it would definitely be interesting to see how it plays out. One thing we often think about on the investing side is, the pace of progress in models changes so fast and so the building patterns adjust so fast. And it’s always hard to figure out, what pieces of the way people are building today, the infrastructure tools they use, are going to prove persistent versus, okay, six months later we’re doing something completely different because
    Jacob [00:55:31]: Models have improved. I’m curious of the stuff you use today, how do you think about the pieces of AI infrastructure software that feel a little bit more persistent?
    Chai [00:55:40]: So generally, if you take the thesis that the models are going to be more and more agentic, before we had to build a lot of scaffolding around that. In previous gigs, I’ve — we’ve effectively, we made our own DSL effectively and you can view the because the models were not capable enough, so you needed to simplify things. And you can view it similar to other agent frameworks. But over time, if the models become more and more agentic and can use the similar tools that we already have, where it’s like computer use, writing code itself in sandbox, much more around, far more about, what are the right context layers and the tools to give agents. And then the other things that I think about are how do you really build truly event-driven real-time systems and especially at Abridge, again, where you’re doing something real-time in the conversation. And so there’s a lot of event-driven technology. And by the way, stuff that we’ve always used in the past, whether it’s Kafka, Temporal, Sockets and so forth, how do you bring that together is also durable. Or thinking about patterns in which humans collaborated with each other on Google Docs. How do you think about like CRDT and so forth when you have conflicts, when you have multi-agent systems? So all these things that we’ve built for — the things we’ve built for humans are the things that are going to be, continue to be durable.
    Jacob [00:56:55]: Just with like 1,000 times more the scale of agents running at them instead.
    Jacob [00:56:58]: They’re going to really work.
    Chai [00:56:58]: So make sure that they scale, of course and fast and whatnot. Without a doubt, yes.
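    Chai mentions CRDTs as a pattern built for human collaboration that may carry over to multi-agent systems. As a minimal sketch of the idea, here is a grow-only counter (G-Counter), one of the simplest state-based CRDTs. The class and the agent names are illustrative only and not taken from any system discussed in the episode.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """State-based grow-only counter: each replica only bumps its own slot."""
    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for replica, n in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), n)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


# Two hypothetical agents update independently, then exchange state.
agent_a = GCounter("agent-a")
agent_b = GCounter("agent-b")
agent_a.increment(2)
agent_b.increment(3)
agent_a.merge(agent_b)
agent_b.merge(agent_a)
```

    After the exchange both replicas report the same value without any coordination or locking, which is the property that makes this family of data structures attractive when many writers act concurrently.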
    How Agentic Does Abridge Become?
    Swyx [00:57:03]: Does Abridge become more agentic over time than, what is the next more agentic version of that look like?
    Swyx [00:57:10]: ‘Cause you’re already pretty proactive it’s, with like the notifications.
    Chai [00:57:15]: And so I view that as like a piece of being agentic but I also view it as maybe some of the things we mentioned before, oh, reacting to labs or, doing work in the background or doing
    Chai [00:57:25]: Even more capabilities on behalf of the clinician, who we believe has a super important role to play as, in terms of patient connection and so forth.
    What They Changed Their Minds On: PRDs, Prototypes, and Judgment
    Jacob [00:57:34]: I’m curious for both of you, what’s one thing you’ve changed your mind on in AI in the past year?
    Janie [00:57:39]: The one I flopped on and this is much more product specific, is, probably the hotter take is that prototypes are the end all be all and that PRDs are dead.
    Janie [00:57:51]: We’ve tried switching and... We continue to evolve the way product is developed and, the products that we’re building are extremely complicated and nuanced and it is very difficult for a prototype to capture the full complexity of what can we or can’t we do with this data. What and who... Is this the actual right problem to be solving for in a world where software has become so cheap? Yes, this is a cool looking prototype but should we be spending any of our precious hours here? If so, why? And how does this deepen our moat in a world of decreasing moats? Does this require custom implementation from our customer to use? None of that gets captured in a prototype and so we’ve, we’re continuously evolving the way that we develop product here but even if not written in the same traditional ways as it was two years ago, as a team we’ve gotten pretty, high conviction that in a world of so much noise, crisp written clarity is more important than ever. It might now live in a markdown file that more teams and systems can use as context but that’s probably one that is much more
    Swyx [00:59:06]: So you’re
    Janie [00:59:06]: Function specific to me.
    Jacob [00:59:08]: I love that.
    Swyx [00:59:09]: You’re disagreeing with the consensus
    Janie [00:59:10]: That PRDs are dead
    Swyx [00:59:11]: That’s great, yeah.
    Swyx [00:59:12]: So you are like
    Janie [00:59:14]: That prototypes are the thing.
    Janie [00:59:14]: We should partner with AI to create great documentation but first, probably most important, is strategically answering like why is this problem the one our company and our product should solve? What happens if the next 20 competitors build this? Why, what is our right to win and does this help us differentiate in any way or are we just adding noise? It’s important
    Swyx [00:59:39]: That’s a high bar. I don’t know if I could answer that
    Swyx [00:59:41]: Because a lot of the times the answer is let’s do it first.
    Janie [00:59:44]: And when the cost of doing it first is so expensive, we just talked through the process of getting something out to customers. You need to have a higher bar for as a business, should we invest here? And as all of our roles evolve, one of product or like all of our jobs become should we do this thing? And that’s something that is worth the time spending up front on. And then, as you think about prototypes, it’s still really valuable to quickly show, “Here are the 20 ways we could do it. Clinician, I would love your feedback, which one resonates more?” Or as you get into deeper fidelity, you can also make the prototypes deeper fidelity and like get it as close to production ready as possible. But, beyond that, to get it out to customers, there’s a lot of implementation details, security compliance, edge cases, things that never get caught in a prototype that need to be written out somewhere. And so they look different but still more important than ever.
    Jacob [01:00:52]: It’s interesting. I imagine a lot of that also is like given the context of the stage that Abridge is at.
    Jacob [01:00:58]: I feel like for so many early stage companies, it’s just a desperate race to... You throw like 30 things at the wall, you’re “Please, something just like resonate with my end buyer.” and, you find something and that’s, why the prototype first approach is so powerful. But for you all, it’s like anything you’re going to do is across 200 systems, there’s like a whole, implementation change management side of things and you get a few big bullets to fire at at what you want those systems to do. And so being really thoughtful about that.
    Chai [01:01:25]: It makes a ton of sense and maybe the prototype first takes will all grow into your view of the world when they’re a bit more scaled.
    Janie [01:01:32]: The weekend demo versus it works at the largest health systems is, a massive gap. I don’t think it means we can’t go fast. This is the fastest I’ve built in my career, right now and the
    Chai [01:01:47]: Compared to Loom?
    Janie [01:01:48]: From a the complexity and the scale of the products we’re trying to build and the problems we’re trying to solve, I’d say, yes, maybe I, updated a flow or, shipped a new feature pretty quickly but if you think about some of the products we’re building, we’re trying to collapse prior authorization, things that used to take 45 days across maybe 20 different touch points into one. I’m building faster than I ever have and so the thoughtfulness allows us just to go fast at the right things. It sounds contradictory but that
    Chai [01:02:28]: No
    Janie [01:02:28]: Thought up front
    Chai [01:02:28]: Go slow to go fast.
    Janie [01:02:29]: Exactly.
    Chai [01:02:30]: It’s interesting. In the... When a lot of things are changing and in the AI discourse, sometimes we lose sight of things that always stood the test of time. Judgment and clarity always matters. As an engineer, sometimes I don’t want a prototype. I would like to see... I want the written, the clarity that comes from writing and then we build that. And again, for some things, of course, where it’s a small thing, yeah, just ship the prototype. That’s why, don’t sweat the details. So the interesting thing, the nuance that gets lost sometimes in discussion is, sometimes we need to recalibrate our judgment for sure because the costs and gains have changed but that doesn’t mean we go all the way on one spectrum or the other.
    AI Tools, Claude Code, and Closing Notes
    Chai [01:03:11]: Outside of your specific tool, I always like to ask this question, any other AI tools that you guys are enjoying?
    Chai [01:03:16]: Claude Code. But, that feels, too basic of an answer.
    Chai [01:03:20]: Is all of Abridge engineering very built on Claude Code?
    Chai [01:03:23]: Yes.
    Chai [01:03:23]: Wow.
    Chai [01:03:23]: Very much so. I won’t
    Chai [01:03:26]: We also have Cursor as well.
    Chai [01:03:28]: Many of the
    Chai [01:03:29]: I’m just checking the boxes here.
    Chai [01:03:30]: Many of the tools available but it’s like you look at just earlier in the day, you see an engineer’s screen. You see, six different, Claudes running at it. Sometimes the same person, I’ve seen them on the sofa now with the remote control as well on the mobile. But, very much so. One of the interesting things for me is, as a relatively new person to companies, Claude Code helps me onboard much faster or any of these AI code... And, I feel like I learn so much. I do love the memes of “Claude’s going to do this.” So, I’d like to see Claude,
    Chai [01:04:00]: The venture equivalent is “I’d like to see Claude go do a company at a billion dollars pre-revenue.” Like
    Where to Learn More: Whitepapers, Research, and AbridgeHQ
    Chai [01:04:06]: We always like to leave the last word in these conversations to you both. And so, any place you want to point folks where they can go learn more about Abridge, the work you’re doing, any of the research you guys have done, whatever. The floor is yours.
    Chai [01:04:18]: A couple places. If you... On our Abridge website, we have a lot of our whitepapers where we’ve done a lot of interesting work, such as, reducing a hallucination objection.
    Chai [01:04:27]: Very well-presented, by the way. I liked it. Yeah.
    Chai [01:04:29]: Thank you. Our science team rigorously defined what is the problem. And one of the interesting things, by the way, at Abridge, is we have multiple stats professors on staff as well. So in that specific whitepaper, Michael Oberst, who’s a professor at JHU. And so we have multiple... And from that comes very high rigor and then also our taste for design comes from really good presentation. But setting that aside and we’re going to have many more technical topics there, please follow our Twitter account as well, AbridgeHQ. And then the other thing I’ll plug a little is, we have an open house of diving deep into AI and healthcare coming up with Andreessen Horowitz.
    Chai [01:05:07]: Amazing. Well, thanks so much.
    Janie [01:05:09]: Thanks.
    Chai [01:05:09]: This was super fun.
    Chai [01:05:10]: Thanks so much.
    Chai [01:05:10]: Thank you.


    This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe
  • Latent Space: The AI Engineer Podcast

    🔬Doing Vibe Physics — Alex Lupsasca, OpenAI

    05/05/2026 | 1h 31min
    Some people are going crazy over GPT 5.5. Some people. This is the story of the Jagged Frontier. People who use AI to write emails or even code implementation work find the lift moderate whereas people pushing the limits of the model are figuring out that the limits just moved outwards.
    Alex Lupsasca has been tracking this limit for a year and a half now. “When GPT5 came out, it was able to reproduce one of my best papers (that took a very long time to come up with) in 30 minutes.”
    But Alex also notes that this shift was mostly invisible.
    I remember when GPT-5 came out… on Twitter, the reception was lukewarm. A lot of people were like, well, we expected a lot more, and it’s not better at writing email. And I remember thinking, well, okay, GPT-3 could write email. How much better can it get at writing email? That’s not the point. But at the science frontier, the capabilities were really taking off.
    We walk through his paper and more with him in today’s Science pod! Watch here.

    The “Oscar for physics”
    Alex made an early splash in his career with breakthroughs in our understanding of black holes. He’s also known for Black Hole Explorer and an iPhone app that makes visualizing black holes fun and interactive to regular audiences. Alex won the 2024 New Horizons in Fundamental Physics Breakthrough Prize. Known as the “Oscar for physics” this is arguably the most prestigious prize an early stage theoretical physicist can win.
    Alex first saw promise for AI in theoretical physics after he asked o3 for help on his research. In the podcast, Alex recalls asking GPT for help with a calculation that would have taken days, and getting a result in eleven minutes.
    He immediately recognized how impactful AI would be for his work, even though his physicist colleagues and the larger community gave it a lukewarm or skeptical reception.

    The Move 37 Moment for AI x Physics
    GPT-5 had just been released, and Alex tried asking it to solve a problem in a just-published paper. GPT-5 returned no answer. But Mark Chen, CRO of OpenAI, pushed a bit harder, and had Alex prime the model with a textbook warmup problem, which it easily solved. After using this “priming” trick, GPT-5 was able to reproduce his full result in eleven minutes (yes, the paper was released after the model’s training cutoff).
    “This changes everything.” Alex notes that we seem to be on the edge of a massive change in theoretical physics reasoning. A year prior, LLMs were just starting to do correct math. Now ChatGPT could reproduce his hardest paper in the time it takes to get a coffee.
    Alex was on sabbatical at Vanderbilt, and he joined OpenAI to start pushing the boundary of AI’s ability to accelerate physics.

    “AI solved the problem before the plane landed”
    Alex began to put GPT through its paces, reaching out to colleagues for problems they were stuck on. His old PhD advisor (Prof. Andrew Strominger at Harvard) had an insight about certain physical quantities known as “single-minus gluon tree amplitudes”.
    In certain cases, these amplitudes may be nonzero, although they had previously been shown to always vanish. The team pushed this intuition forward, and came up with a formula for these quantities that appeared nonzero, but which was otherwise completely intractable.

    After more than a year on this problem, no real progress had been made.
    Prof. Strominger planned to visit OpenAI to work on the problem the week after the initial conversation started. In that one week ChatGPT fully solved the problem, as Alex recalled, before Prof. Strominger’s plane even landed.
    What was interesting is not only that ChatGPT solved this problem, but how it solved it. The model quickly found a limiting case (known as the “half-collinear regime”) that in hindsight has a nice intuitive explanation. Taking this limit, the gnarly results collapsed down to a simple and intuitive formula!
    The last step was to prove this intuitive formula. The team started with a fresh session, gave a prompt with the context of what they previously learned, and let the model loose. Not only was ChatGPT able to reproduce the previous result, it was able to prove it using a technique unknown to the authors!

    The Vibe Physics moment
    With a concrete success in the bag, the team asked if they could generate new physics from scratch using ChatGPT. They took on what they felt to be a harder problem, looking at the graviton, a proposed particle that should appear when one combines gravity and quantum mechanics. They wrote up a simple prompt asking ChatGPT to perform the same research as the gluon paper but instead for gravitons. And then hit go!
    What came next was truly “vibe physics”, with ChatGPT pushing out 110 pages of novel physics, new calculations, and novel techniques. This was over the course of a day, with most interactions following the now-familiar pattern for anyone who uses a coding agent:
    GPT: Here's your .
    Would you like me to do ?
    Alex: Yes, please do!
    GPT:
    And for those who look deeply, this really was not just a direct 1-1 mapping between gluons and gravitons. ChatGPT imported new techniques that were necessary due to the nature of gravitons, and used them flawlessly.
    They spent the next three weeks verifying all the results. And voila! A new paper featuring novel results in quantum gravity, generated in less than three days total. Truly a “Feel the AGI” moment.

    For those interested, there’s a blog post with the full transcript from initial prompt to final paper. Even if you know no physics, it’s crazy seeing pages of correct calculations fall out of simple prompts such as “Yes calculate outside of SD first. This is the first step.”

    Out-of-domain = new knowledge
    The thing that is qualitatively different between Vibe Physics and Vibe Coding is that Vibe Physics means actually extending the frontier of human knowledge. Looking at the Gluon and Graviton results, they seem in retrospect, like many results in physics and math, like natural extensions of what we already know. This is in fact part of what makes them beautiful. But this was a problem that stumped experts in the domain for a year. Although it does still have a bit of a recombinant flavor, this thing has never been done before.
    It may be that there are still large classes of problems that AI won’t do well on, and approaches that an AI might not think to take. This is the “taste” that everyone has been talking about. Alex told us that these capabilities, however, allow him to explore many possible avenues in order to map out much more ambitious problems to tackle. With AI able to output results basically as fast as we can conceive and validate them, the scope of what one theorist can hope to achieve has just gotten a lot, lot bigger.


About Latent Space: The AI Engineer Podcast
The podcast by and for AI Engineers! In 2025, over 10 million readers and listeners came to Latent Space to hear about news, papers and interviews in Software 3.0. We cover Foundation Models changing every domain in Code Generation, Multimodality, AI Agents, GPU Infra and more, directly from the founders, builders, and thinkers involved in pushing the cutting edge. Striving to give you both the definitive take on the Current Thing down to the first introduction to the tech you'll be using in the next 3 months! We break news and exclusive interviews from OpenAI, Anthropic, Gemini, Meta (Soumith Chintala), Sierra (Bret Taylor), tiny (George Hotz), Databricks/MosaicML (Jon Frankle), Modular (Chris Lattner), Answer.ai (Jeremy Howard), et al. Full show notes always on https://latent.space www.latent.space