I Studied Stripe's AI Agents... Vibe Coding Is Already Dead
YouTube, March 25, 2026
Transcript

0:00 Are you vibe coding or are you agentic engineering? The difference is massive. Keep that question in mind as you look at one of the best engineering teams on the planet to determine if they're vibe coding or agentic engineering.

0:12 Stripe engineers are shipping 1,300 pull requests every single week. Get this. There is zero human-written code and they're doing it right. Imagine what will happen to their insane numbers of $1.9 trillion in total volume, up 34%, which is the equivalent of 1.6% of global GDP. Stripe's doing 1 billion this year and they power all of the best companies that you and I use, and you yourself might be running on Stripe as well.

0:44 What happens when Stripe multiplies all this with agents? And not just agents, what happens when they multiply it with their custom end-to-end solution they're calling minions, fully unattended coding agents that start from a Slack message and end in a production-ready PR? This is Stripe's one-shot, end-to-end coding agent.

1:06 To me, the minions aren't even the interesting part here. The interesting stat here to me is that their agents operate a codebase with millions of lines of code, operating an uncommon stack with a number of homegrown libraries that are unique to Stripe and therefore unknown to LLMs. On top of that, the stakes that Stripe operates in are extremely high. The code they write moves over $1 trillion per year of payment volume. They have a number of real-world dependencies, regulatory and compliance obligations that their code must honor.

1:41 Now, here's a simple, important question for you. Do you think Stripe can afford to vibe code? I personally have written millions of lines of code with agents and without agents.
1:54 I've been building with agents since it was first possible, way, way back in the day when we were using GPT-3.5 Turbo. Many engineers don't even know that model exists or once existed. So allow me to clarify these terms a little bit. Agentic engineering is knowing what will happen in your system so well you don't need to look. Vibe coding is not knowing and not looking. It's very clear Stripe engineers are agentic engineering. And in this video we'll break down Stripe's agentic layer so you can take the best pieces and add them to your agentic systems. Vibe coding is the lowest-hanging fruit. When you agentic engineer systems just like Stripe has, from the prompt to your skills to your custom agents to your agent harness all the way up through your tech stack, you capitalize on the greatest opportunity for engineers to ever exist. Agents.

2:53 Let's look at their agentic system at a high level so that we can analyze the key pieces of their system. If these components interest you, definitely stick around. We're going to be breaking down Stripe's key components. And as we do this, you'll see what you have and what you're missing.

3:05 All right. So, the first thing is the API layer. They have a way to communicate with their agents. As you'll see, they have many ways to do this. Then they have a warm devbox pool. What is this? This is an agent sandbox, a space to place their agent. Fantastic. They then have the agent harness. Stripe built their agent harness. They forked it from a tool we'll cover in a second here. And then they have this blueprint engine, the marriage of the old world and the new world, code and agents. This is super, super important. This single piece has given Stripe a massive edge. You'll see why in a second. All right. Then we have the rules file. How did they manage the context problem?
3:40 Agents cannot read their 100-million-line codebase. So how do they solve that problem? We'll then talk about the meta layer of their tool shed. You can imagine they have hundreds of tools and tens of services that they want their agents to operate with. How do they solve that problem? They built a tool shed.

3:57 All right. Then of course they have a way to validate all their agents' work. This is a critical validation layer that they can use to give their agents feedback and to validate that they're not breaking existing working features that are helping them generate and maintain the movement of that $1 trillion. All right, so we're talking about real stakes that the Stripe engineers are facing. All right, this is not a greenfield rapid prototype application. All right, these are serious stakes with real-world consequences.

4:23 And then of course you need a place to review your agents' work. They're using GitHub PRs. Everyone's using GitHub PRs. This is the standard. Nothing new here. All right. But these are the critical pieces. We're going to walk through these piece by piece and understand how they put these together to build their agentic layer.

4:36 Let's go ahead and start with their minions. So what is Stripe's take on agentic coding? Let's find out. Agentic coding has gone from new and exciting to table stakes. Unattended coding agents have gone from possibility to reality. You know Stripe engineers know what they're talking about, because this is true. If you are not agentic coding, the gap between you and the agentic coding team within a week, within a month, is going to be astronomical. Okay? It's going to be exponential. This is the last moment to hop on the train. Stripe minions are Stripe's homegrown coding agents. They're fully unattended, built to one-shot tasks. Thousands of pull requests merge each week.
5:14 So one week goes by, Stripe engineers merge a thousand pull requests. Let's just really understand that scale. All right. And as I mentioned, they contain no human-written code. They realize you have to stop coding to get the real scale, to get the real power out of these agents. You work on the agents, not the application. Right? This is a weird mindset shift that you need to make if you're going to be building with agents.

5:37 Now, interesting to note here: "Our developers can still plan and collaborate with traditional agentic coding tools, Claude and Cursor. But in a world where one of our most constrained resources is developer attention, the agents allow for parallelization of tasks." This is super, super, super critical. All right, they realize that their most important resource, and really any software company's most important resource, is your developers' time. It's your developers' attention. And when you maximize the leverage your developers get, you can do crazy things like this. They see their engineers spinning up multiple minions in parallel, able to solve multiple problems at the same time in different conditions.

6:15 All right, so this is fantastic. So the first thing we need to figure out is why they built the minions in the first place. Why did they build it themselves? What's the point of this? Isn't Claude Code good enough? Vibe coding a prototype from scratch is fundamentally different from contributing to Stripe's codebase. Okay, interesting. Say more. Stripe's codebase encompasses hundreds of millions of lines of code across a few large repositories. Okay. Written in Ruby, an uncommon stack, with homegrown libraries. LLMs don't have it baked in, right? It's not in the model's training data. Stakes are high. Stripe moves over $1 trillion per year in payment volume.
6:48 As mentioned, they have real-world dependencies and compliance obligations. LLM agents are really great at building from scratch when there are no constraints on the system. However, iterating on any codebase of scale, complexity, and maturity is inherently much harder. Very, very true. Engineers build sophisticated models to make changes inside their large repo.

7:09 This is huge. And we talk about this on the channel all the time. Specialization is how you win. When you're building a great product, it is literally a specialized solution to a specialized problem. So, why would you stop at your tooling? Your tooling and your code must also be specialized. So, this is why they built their own custom agent. It's because they're solving specific problems in specific ways better than anyone. And again, this is a theme we talk about on the channel all the time. Specialization is your advantage.

7:40 And in last week's video, we talked about the Pi coding agent, because there are many coding agents, but this one is mine. We emphasized this very idea. You can customize your prompt. You can customize your skills. You can customize your custom agents, and you can specialize your agent harness. The more you're specializing, the more you're building specific solutions to specific problems, the bigger your edge is. And the more you distance yourself from the out-of-the-box experiences that a lot of agentic coding tools are driving everyone toward, the better off you're going to be.

8:13 So, Stripe built minions to solve their specific problem and to operate their large codebase better than anyone. Makes sense, right? Big shout out to everyone who shared that video. That one went absolutely viral, and for a good reason.
8:25 Engineers are realizing that we don't want to be super locked in to a single tool like Claude Code or Cursor or Codex or whatever. Every tool is going to have a problem, but the tool that won't have a problem is the one you customize to solve your specific problems better than anyone. There are many coding agents, but this one is mine. I love this slogan.

8:42 Let's see how Stripe customized their minions. So, what is it like to use a minion? Right away, we jump into another critical idea. There are several entry points for minions. They're designed to integrate as ergonomically as possible where Stripe engineers are. All right, so they use a CLI, a web interface, and they have Slack. Already they have three points of contact for kicking off their API, I assume, right? They have a separate application which kicks off their agents, right? Their pool of agents. But they have multiple ways to interface with that primary service. Very important.

9:18 And so we can see here, you know, here's a clear example of an engineer at Stripe using that @ symbol, @devbox, and then they write their prompt to the agent, right? Makes sense. Nothing new there.

9:26 Okay. And so we can see here they have a custom UI that they built, right? They have an interface to allow them to work with their custom agent. So, you know, on the left you can see kind of a typical view. We have that log of tools and the thought process that their agents go through. And then on the right we can see that they have all the modified files. So they can see very, very quickly what's going on with that agent. And then of course in the top right here they have their actions. All right, so "create pull request." And I'm sure they have some prompt interface here as well. So nice and simple, very concise. You can see that they're just surfacing the most important information.
9:58 And this hints at another key aspect of your agents and your agentic system. You need to be able to observe what's going on.

10:04 Once a task has been completed, a minion will create a branch, push it to CI, and prepare a pull request following Stripe's PR template. And then they're going to request a review from a Stripe engineer. And they can also iterate. So this is a classic end-of-process setup. When you're agentic coding, you show up at the beginning and the end, during planning and during review, and ideally not once in the middle. All right? And that is what creates an out-loop agentic coding system. You just write the prompt and you just do the review. There's in-loop agentic coding and then there's out-loop agentic coding. All right? We'll circle back to that idea in a second.

10:38 So how do their minions actually work? A minion starts in an isolated developer environment, or a devbox. Fascinating. So, this is a concept we've talked about on the channel. They're giving their agents their own environment to operate in, right? Which is the same type of machine that Stripe's engineers write code on. This is a simple yet powerful idea. If you want your agent to do what you can, you must give it the tools and the environment that you have. So, Stripe realizes this. They reuse their developer setup for their agents. They give them everything that the engineer has. Super, super powerful idea here.

11:13 Devboxes are pre-warmed, so one can be spun up in 10 seconds. Love that. Not very fast in general, but for the machine that they're booting up, which I think they'll mention in a moment, that is very fast. They're booting up full-on AWS EC2 instances. All right, with Stripe's code and services preloaded. They're isolated. This is a safe space to place their agent.
11:31 And they do this so that they can run minions on devboxes without human permission checks. Of course, this also gives you parallelization without the overhead of something like Git worktrees, which just fall apart at certain scales. All right, after some time, the Git worktrees just fall apart.

11:46 You're going to need your own dedicated device. I have a Mac Mini here as a local, personal, kind of private device. But recently, I also just said, "Screw it. I'm going to need more scale." And I started spinning up entire devboxes for my agents on, you know, use your favorite cloud hosting tool, GCP, AWS, and some of the other ephemeral agent sandbox tools like E2B, Modal, and so on. But this is a really big idea, right? The more autonomy you give your agents and the more you set up their environment to be like yours, the more they can act and perform as you would.

12:17 The core agent loop runs on a fork of Block's coding agent, Goose, one of the first widely used coding agents, which they forked early on. So shout out Goose. They took this and they customized the orchestration flow in an opinionated way to interleave agent loops and deterministic code. Huge, huge, huge idea here, and they're going to expand on this even more in a moment with one of the big ideas they talk about later, which is their blueprint engine.

12:42 Okay, so this is a huge, huge, huge idea. Let me just emphasize this. You want to be interleaving the agent loop with deterministic code. And what type of operations, right? We're talking Git, linters, and most importantly testing. Okay, this lets your agents, your system, operate with feedback. And this gives you the best of both worlds. You get the deterministic world and the non-deterministic reasoning and creativity world.
13:05 And they explicitly say that here: they run a mix of the creativity of the agent with assurances that they'll always complete Stripe-specific steps like linters. So here we have Stripe agentic engineering: determinism with agents.

13:18 All right. So a couple of additional things to note here. They're connected to MCP. They use Cursor and Claude Code in some conditions. They operate agent rule files. We'll talk about that more in a second. This solves the large context problem for Stripe. All agent rules are conditionally applied based on subdirectories. Super important. They have MCP, as I mentioned. They have this tool shed idea, which is basically a meta tool to help them select one or more of their 400 MCP tools.

13:46 Okay. A really big piece of why this blog post is so incredible, you know, shout out to the Stripe engineers, shout out to Alistair Gray, is the fact that they're operating at such a massive scale, with such success, and they're still gaining massive value from their agents and from the agentic layer that they're building.

14:02 All right, minions are built with a goal of one-shotting, but if they don't, the key is to give them feedback. Key, key idea. Two more ideas here. "We seek to shift feedback left when thinking about developer productivity." The best thing for humans and agents is basically that you want the issues to happen earlier rather than later, right? On the engineer's device, on the agent's device, as early in the process as possible. All right? And then if local testing doesn't catch anything, they have a whole suite of tests, over 3 million tests, that run upon push. Key idea here: they figured out a way to selectively run tests on push. All right? And they're choosing from 3 million tests. Okay? And this is going to, as you can imagine, offer feedback to their agentic system.
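The "shift feedback left" idea can be sketched as a small loop: run cheap local checks first and only spend a CI round once they pass. This is a minimal sketch, not Stripe's implementation. The names `run_with_feedback`, `CheckResult`, `max_local_rounds`, and `max_ci_rounds` are hypothetical, and the agent and both check functions are passed in as plain callables so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    passed: bool
    feedback: str = ""


def run_with_feedback(
    implement: Callable[[str], None],       # hypothetical agent call
    local_checks: Callable[[], CheckResult],  # cheap: linters, local tests
    ci_checks: Callable[[], CheckResult],     # expensive: selected CI tests
    task: str,
    max_local_rounds: int = 3,  # assumed limit, not from the source
    max_ci_rounds: int = 2,     # assumed limit on expensive rounds
) -> bool:
    """Return True once CI passes, False if the round budget runs out."""
    prompt = task
    for _ in range(max_ci_rounds):
        # Inner loop: iterate against cheap local feedback first.
        for _ in range(max_local_rounds):
            implement(prompt)
            local = local_checks()
            if local.passed:
                break
            prompt = f"{task}\n\nFix these local failures:\n{local.feedback}"
        else:
            # Local checks never passed; do not spend a CI round.
            return False
        ci = ci_checks()
        if ci.passed:
            return True
        prompt = f"{task}\n\nFix these CI failures:\n{ci.feedback}"
    return False
```

The design choice worth noting is the ordering: the expensive check only runs after the cheap one passes, so most failures are fed back to the agent before any CI time is spent.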
14:42 Now, here's something that I would critique Stripe on a little bit here. Due to cost constraints, they only let their minion run at most two rounds of CI. All right, so you can imagine at this scale that they have to just limit this for it to be cost-efficient. This is where I would push back a little bit. We'll talk about that later, but this is an interesting thing here, right? So, they basically limited the rounds of feedback for their minion to just two.

15:04 This is part one of their blog. Let's look at part two and dig into some of the details of some of these key nodes, right? Specifically, their agent sandbox and their powerful blueprint engine, because their blueprint engine sits at the center of how they operate their Stripe minions at scale.

15:25 So, here's part two: devboxes, hot and ready. So for maximum effectiveness, their minion agents require a cloud developer environment that's parallelizable, predictable, and isolated. So this is very clearly an agent sandbox. Okay, it gives them a place to operate at scale with full autonomy. And if something goes wrong, if they destroy something, the agent can't cause as much damage as it could if it were operating your device or, God forbid, a device connected to the production system. And I completely agree. Containerization, Git worktrees, they're great, but they have hard limits, and it's hard to really, really scale without giving each agent its own device, right? Again, if you want your agent to perform like you, give it the tools that you have.

16:06 All right. What else can we learn about Stripe's devbox here? So, very cool. Stripe's devbox is a full-on computer, right? It's an EC2 instance and it contains their source code and services under development. Very, very cool.
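A pre-warmed pool of this kind can be sketched as a queue of already-provisioned machines that is topped up whenever one is handed out. This is a minimal sketch under stated assumptions: `WarmPool`, `Devbox`, and `_provision` are hypothetical names, and the slow step (booting an instance and loading code and services) is replaced by a placeholder that just assigns an ID.

```python
import itertools
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Devbox:
    box_id: int
    task: Optional[str] = None


class WarmPool:
    """Keeps `target_size` boxes provisioned ahead of demand."""

    def __init__(self, target_size: int):
        self._ids = itertools.count(1)
        self._target = target_size
        self._ready = deque()
        self.refill()

    def _provision(self) -> Devbox:
        # Placeholder for the slow part: booting a machine and
        # preloading source code and services (assumed, not shown).
        return Devbox(box_id=next(self._ids))

    def refill(self) -> None:
        # Top the pool back up to its target size.
        while len(self._ready) < self._target:
            self._ready.append(self._provision())

    def acquire(self, task: str) -> Devbox:
        # Hand out a ready box if one exists; fall back to a cold start.
        box = self._ready.popleft() if self._ready else self._provision()
        box.task = task
        self.refill()
        return box

    @property
    def ready_count(self) -> int:
        return len(self._ready)


pool = WarmPool(target_size=3)
box = pool.acquire("fix flaky test")  # served from the warm queue
```

In a real system the refill would happen in the background rather than inline in `acquire`; it is done synchronously here only to keep the sketch short.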
16:20 Many engineers use one devbox per task, and this means that every engineer might have half a dozen running at a time. Check out how awesome this is, right? They're allowing their engineers to scale their impact by allowing parallelization of their agents, and every agent has its own sandbox. Okay, so a question I would ask them is: do their minions have access to additional minion subagents, or not even subagents, other primary agents that are specialized across their codebase? Very cool stuff here. This is again part of their agentic system, right? It's giving them scale very, very quickly so that engineers can knock out more problems than ever before.

16:58 All right: "We want it to feel effortless to spin up new devboxes." Right, ready in 10 seconds. Hot and ready. So fantastic. The raw pieces of engineering should feel effortless. You want to be building systems that allow you to move at the agentic speed, the speed of agents.

17:15 Something kind of funny happened to me the other day while I was reading this blog. Actually, I'll throw the image on the screen. I had to save it. The agentic speed is just insane. Your agents can process information much, much faster than you can. I was reading through this blog, you know, it took me maybe 20 minutes to read through part one and part two and take notes on this. I also spun up a Claude Code agent to read the blog alongside me. It read the whole thing, of course, in, what was it, 5 seconds. And so I just had a really funny interaction where, you know, I was shocked, and then I said something and the agent literally said nothing. It was the first time I've ever had my agent respond with nothing. It was just a really interesting interaction point. And this is the agentic speed, right?
17:55 It's this multiplied by every single agent you can spin up. Your agents can read, they can code, they can engineer at agentic speed. So you need to build the system that allows you to tap into that. And you can see here Stripe is doing that with their powerful devboxes that spin up in just 10 seconds and somehow set up their entire gigantic repository. Millions of lines of code, tens of thousands, probably hundreds of thousands of files.

18:18 All right. And you know, props to Stripe: "We built out devboxes for the needs of human engineers long before LLM coding agents. As it turns out, parallelism, predictability, isolation were very, very good properties for engineers as well as agents." Fantastic.

18:33 We're almost at the blueprint, which is a really, really big idea. But let's talk about their agent a little bit more, right? So, they built this on their own. They forked Goose. Let me be clear about that. They forked Goose and then they customized it to work within Stripe's LLM infrastructure. Okay? So, you can imagine they have custom prompts, custom skills, custom agents, and then they customize the agent harness. All right? And again, this was the big idea we talked about in last week's video. I'll leave that linked in the description for you if you're interested. Customizing your agentic harness gives you a massive edge. You can do it your way. You can build it to fit the needs of your specific problem.

19:10 Okay, once again, I want to beat this idea over the head. Specialization is the advantage of every engineer. Now, you can build specialized solutions, specialized developer tools to help you solve your problems at the agentic speed. Okay, the speed of agents, not the speed of humans. And so they focus their use on the needs of minions rather than human-supervised tools.
19:33 And this is another big idea we need to double-click into. "That's the use case well filled by third-party tools such as Cursor and Claude Code," okay, "which are made readily available for our engineers."

19:43 So a couple of things here. They're not limiting, they're not forcing their engineers to use any specific tooling. That's a terrible idea in general. But what they are doing is building two types of agentic coding tools: in-loop and out-loop. I've talked about this on the channel before. This is a critical idea to get right if you want to do more with your agentic engineering.

20:02 When you are in-loop agentic coding, your butt's in the seat at your desk and you're prompting back and forth and back and forth and back and forth. This is great for highly specialized work. This is great for when you're building the system that builds the system, but this is bad for everything else. Okay? As a general rule, I recommend to engineers now that you spend more than 50% of your time building the system of agents that build your application for you. That's in-loop, right? And that's really the value prop of in-loop. You get full control. You can see everything. It's very manual, but it is very slow and expensive. You're using human engineering time.

20:34 Then there's out-loop agentic coding. And this is what Stripe's minions offer Stripe. Okay, they are building an out-loop system that operates at scale, in parallel, in the devbox, right? In dedicated agent sandboxes. This is a big, big idea. Why is that? It's because now instead of having one engineer with one terminal or one engineer with three terminals, you can have one engineer with six agent sandboxes operating and solving problems at scale in parallel, right? And six is just the beginning of this.
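The "one engineer, six sandboxes" picture amounts to fanning a list of tasks out to independent workers and collecting the results for review. Here is a minimal sketch using Python's standard thread pool. `run_minion` is a hypothetical stand-in for "start an unattended agent in its own sandbox and wait for its result", and `max_parallel=6` simply mirrors the number used in the talk.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run_minion(task: str) -> str:
    # Hypothetical stand-in: a real version would launch an agent
    # in a dedicated sandbox and block until it reports back.
    return f"PR ready for: {task}"


def dispatch(tasks: List[str], max_parallel: int = 6) -> List[str]:
    """Run each task in its own worker; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_minion, tasks))


results = dispatch(["fix flaky test", "update API docs", "bump dependency"])
```

The engineer's involvement stays at the two ends: writing the task list going in and reviewing `results` coming out.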
21:03 The whole idea here is that you should be handing off more work over time to your out-loop system. If you're building a great agentic layer, if you're building a great system that has agents operating your services for you, you should slowly be handing off more work to them. Okay? And that saves you from the expensive time that you'll spend. And you know, never forget your time is your most important resource. It is constantly running out. Okay? Let me just be super clear about that. But your agentic systems, you can clone, you can dupe, you can parallelize these as far as your system allows you to. All right? And that's the lever that agentic engineering unlocks. If you build the system that builds the system, you get massive, massive reasoning at scale. You get access to intelligence that engineers your way and, at some point, better than your way. But that's key.

21:54 So I just wanted to really dial into that. Minions give Stripe engineers access to out-loop agentic coding. Very, very powerful. And so they talk about specialization a little bit more. And you know, they're really hitting on this idea I just mentioned there. Off-the-shelf local coding agents are usually optimized for workflows where the engineer is sitting looking over the agent's shoulder, right? And I just call this babysitting the agent. Minions are fully unattended, and so their agent harness can't use human-facing features. Okay, they built the minion to be fully autonomous, right? They're built so that humans cannot interject. That's not the point, right? The point is that they operate on their own. Again, in-loop agentic coding, out-loop agentic coding. Claude Code, minions. Okay? And just to emphasize it once again, you know, Claude Code has the ability, Cursor has the Cursor CLI.
22:40 And of course, there are great tools we've covered on the channel like pi.dev. You can programmatically inject these into your out-loop systems, right? You can deploy an agent outside the loop and have it run on a cron job, have it run via an API request, and so on. All right, that is where all agentic engineers must move to get massive leverage. You can see Stripe's engineers using minions to do just that.

23:04 All right, so they talk about permissions. Let's focus on the big idea here. Right next to devboxes, the next most important thing here for sure is their blueprint engine. So let's talk about this thing. So what is this? So they talk about workflows versus agents. They talk about loops. They talk about, you know, a series of steps, which is like the workflow. This is what a lot of prompts and skills actually are. They're just steps that you want to work through. Sometimes you have an agent that's actually doing some intelligent reasoning, right? A loop with tools. But you can do much better than that. You can push a lot further. And that's exactly what Stripe has done.

23:37 Minions are orchestrated with a primitive they call blueprints. Blueprints are workflows designed in code that direct a minion run. Okay. And then they go on to say blueprints combine the determinism of workflows with agents' flexibility in dealing with the unknown.

23:54 What is this? Every Tactical Agentic Coding member knows this as an ADW, an AI developer workflow. This is the past and the future. This is code plus your agent. Okay, the highest leverage point of agentic coding is when you put these two together. You have step-by-step workflows that have determinism and non-determinism put together.
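A workflow that mixes deterministic steps with agent steps can be sketched as an ordered list of nodes, each tagged as either code or agent. This is a minimal illustration of the general pattern, not Stripe's blueprint engine. The node names, the `Node` type, and `fake_agent` are all hypothetical, and the agent call is stubbed so the example runs without any model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


@dataclass
class Node:
    name: str
    kind: str  # "agent" (calls a model) or "code" (fully deterministic)
    run: Callable[[State], State]


def fake_agent(prompt: str) -> str:
    # Stub standing in for a real agent loop with tools.
    return f"[agent output for: {prompt}]"


def implement_task(state: State) -> State:
    state["patch"] = fake_agent(f"Implement: {state['task']}")
    return state


def run_linters(state: State) -> State:
    # Deterministic step: no model involved, just code.
    state["lint_ok"] = bool(state.get("patch"))
    return state


def push_changes(state: State) -> State:
    # Deterministic step: only push when linting succeeded.
    state["pushed"] = state.get("lint_ok", False)
    return state


BLUEPRINT: List[Node] = [
    Node("implement_task", "agent", implement_task),
    Node("run_linters", "code", run_linters),
    Node("push_changes", "code", push_changes),
]


def run_blueprint(nodes: List[Node], state: State) -> State:
    """Execute nodes top to bottom, recording which kind ran when."""
    for node in nodes:
        state = node.run(state)
        state.setdefault("trace", []).append(f"{node.kind}:{node.name}")
    return state


result = run_blueprint(BLUEPRINT, {"task": "add retry to webhook sender"})
```

Because the sequence lives in code, steps such as linting always run, regardless of what the agent step decides to do.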
24:13 In essence, a blueprint is like a collection of agent skills interwoven with deterministic code so that particular subtasks can be handled most appropriately. Okay, there are some things like a linter, for instance, or like a git commit, or a whole number of things, right? Running tests, creating certain structures, creating certain templates, certain reusable pieces, certain hard deterministic code pathways. There are certain pieces that an agent would perform worse in. Adding an agent to specific steps actually makes the whole system worse, more brittle, and more expensive, frankly. So, for these steps, why would you throw an agent at that problem? Right? The real advantage that Stripe has completely identified here with blueprints is the fact that agents plus code beats agents alone, and agents plus code beats code alone. That's the big idea here.

25:01 So, Alistair goes on to break this concept down here. You have the agent calls here, implement the task, fix the CI failures, whatever. But you also have the actual nodes, run configured linters, push changes, which are fully deterministic. Okay, they don't invoke an LLM at all. They just run code. So imagine, you know, some top-to-bottom process where you have the agent running and then you have code running and then you have the agent running and then you have code running, right? And so on. This is what you want to build, right? It's the combination of both the agent and your code. Okay, because not everything needs an agent and not everything needs code. Okay, very, very powerful idea here.

25:36 Another advantage of creating these blueprints, of combining code plus agents, is that their blueprint machinery makes context engineering with subagents easy. Why is that?
25:45 It's because they're operating at a specific step. And so at that step, you might constrain the tools, you might constrain the system prompt, right? Or you might modify the conversation as required by the subtask at hand. Okay? And again, we're hitting on this idea of specialization. There are specific steps in your engineering, in your product, in your tool, that you've uniquely implemented. Okay? And so when you can break that down into a deterministic or an agentic process, step by step, this allows you to specialize, right?

26:12 And so, you know, once again, what are we doing? We're back at foundational engineering. If you're trying to tackle a big problem, chunk it up into small pieces. Every big problem is just, you know, a few small problems put together. And then chunk those problems into types and then give them to code or give them to agents. Okay, that's what their blueprint system is effectively doing. To me, this is the highest leverage point. This is what makes their agentic layer, their agentic system, so powerful. It's the combination of code and agents inside of a repeatable format for success. Okay?

26:46 Because guess what they can do? They can now deploy meta-agentics. They can effectively create an agent that builds their blueprint in just the right way, and then they can validate it, right? I wouldn't be surprised if they had a blueprint for creating blueprints. All right.

27:00 Anyway, let's move into context. So, they use a rule files setup that, you know, is much like CLAUDE.md or AGENTS.md. Due to the size of the repository, they can't have unconditional global rules. So, they need a specific solution to do this. They're using a standardized rule format much like Cursor's. All right. So this is a rule format that looks like this.
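As a rough illustration of a rule file in this style (not one of Stripe's actual files), here is a markdown rule with front matter and a small loader that attaches it only when the path being touched matches one of its glob patterns. The field names `description`, `globs`, and `alwaysApply` are modeled on Cursor-style rules and should be treated as assumptions, as should the example paths and rule text. The front matter is parsed by hand to avoid a YAML dependency.

```python
from fnmatch import fnmatch
from typing import Dict, List, Tuple

# Hypothetical rule file: front matter between the --- lines, then the rule body.
RULE_FILE = """\
---
description: Conventions for the payments service
globs: services/payments/*, lib/ledger/*
alwaysApply: false
---
Use the internal money type for all amounts; never use floats.
"""

Rule = Tuple[Dict[str, object], str]


def parse_rule(text: str) -> Rule:
    """Split a rule file into (front matter dict, body text)."""
    _, front, body = text.split("---", 2)
    meta: Dict[str, object] = {}
    for line in front.strip().splitlines():
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    raw_globs = str(meta.get("globs", ""))
    meta["globs"] = [g.strip() for g in raw_globs.split(",") if g.strip()]
    return meta, body.strip()


def rules_for(path: str, rules: List[Rule]) -> List[str]:
    """Return the rule bodies that apply to `path`."""
    selected = []
    for meta, body in rules:
        always = meta.get("alwaysApply") == "true"
        # Note: fnmatch's * also matches "/" so nested paths match too.
        if always or any(fnmatch(path, g) for g in meta["globs"]):
            selected.append(body)
    return selected


rule = parse_rule(RULE_FILE)
context = rules_for("services/payments/charge.rb", [rule])
```

Only the rules whose patterns match the current path end up in the agent's context, which is what keeps a very large repository from loading every rule at once.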
So you have your primary directory (whatever tool you're using tries to claim that name), then you have /rules, and then you have some markdown files. The interesting part is that you have a markdown file with some front matter. .mdc files are the most popular file format for this, and for good reason. In these rules you can specify the glob pattern that activates this context, or a specific subset of this context. Then there's the rule anatomy: you can apply a rule intelligently, or you can apply it only when specific files are being accessed. This gives you more control over the context that's loaded as you access different directories throughout your codebase.

This is the structure that Stripe's minions use, and the big line is right here: "We almost exclusively give minions context from files that are scoped to specific subdirectories or patterns, automatically attached as the agent traverses the file system." They're using Cursor-style rules to do that, and they've combined it with a format from Claude Code. Once again, you can see that they're building customized agentic solutions that best solve the problems they're facing, and they're combining the best of the industry. I'm not saying that how Cursor agents or Claude Code agents do things is wrong. That's not the point. There are many ways to do things. The question is what's the best way for you, and how do you get the most leverage out of what's available? We can see Stripe engineers doing exactly that.

The last important idea to mention here is how Stripe gathers its MCP tools. So how does Stripe put its tools together?
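Coming back to the scoped rule files for a moment, the glob-based activation can be sketched in a few lines of Python. This is an illustration only, not Cursor's or Stripe's implementation: the `Rule` type, the rule names, the paths, and the patterns are all made up. It uses the standard library's `fnmatchcase`, whose `*` also matches across directory separators, which is looser than most real glob engines.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Rule:
    name: str
    globs: list[str]  # patterns that would come from the rule file's front matter
    body: str         # markdown guidance to load into the agent's context


# Hypothetical rules; a real setup would parse these from files on disk.
RULES = [
    Rule("payments", ["services/payments/*"], "Use the payments ledger helpers."),
    Rule("migrations", ["*.sql"], "Never edit an applied migration."),
]


def rules_for(path: str, rules: list[Rule]) -> list[str]:
    """Return the names of rules whose glob matches the file being accessed."""
    return [r.name for r in rules if any(fnmatchcase(path, g) for g in r.globs)]


def context_for(path: str, rules: list[Rule]) -> str:
    """Join the bodies of every matching rule into one context string."""
    active = set(rules_for(path, rules))
    return "\n".join(r.body for r in rules if r.name in active)
```

Only the rules whose patterns match the file at hand get loaded, so a file outside every pattern adds nothing to the context.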
As we all know, tools are an essential element of the core four: context, model, prompt, tools. Tools are what created agentic coding. They're the only reason any of this is possible, because our agents can now use tools to take actions as we can. So how does Stripe handle their 500 MCP tools? Won't this immediately cause a token explosion? Absolutely, it totally would. What they've done is build a centralized internal MCP server called Toolshed, which makes it easy for Stripe engineers to make new tools that are automatically discoverable in their agentic systems. Very, very powerful stuff. All of their agentic systems are able to use Toolshed.

I want to be super clear about this: we're talking about meta-agentics. This is something that keeps coming up over and over. You build prompts that create prompts. You make agents that build agents. You have skills that build skills. You have tools that allow you to select tools. Toolshed is a tool that unlocks tools for their agents. These are called meta-agentics, and they're a powerful way to solve a whole class of repeat problems in the space of agents. To be clear, this is not new at all. OG engineers watching, you've heard of things like metaprogramming, passing functions into functions. This is not a new phenomenon. What is new, and what's really important for you and me to focus on when we're building out these powerful agentic layers, is to think about when we need to build the thing that builds the thing. So Stripe uses Toolshed to create and connect to nearly 500 MCP tools. Very, very powerful.
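The "tool that unlocks tools" idea can be sketched as a small registry in Python. This is a hedged illustration, not Stripe's Toolshed and not the MCP protocol: the `ToolShed` class, the tags, and both example tools are hypothetical. It shows one way to register tools centrally and hand an agent only the subset relevant to its task, rather than every tool description at once.

```python
class ToolShed:
    """Hypothetical central registry: register tools once, select a subset per task."""

    def __init__(self):
        self._tools = {}

    def register(self, name, description, tags):
        """Decorator that adds a function to the registry under the given tags."""
        def decorator(fn):
            self._tools[name] = {
                "fn": fn,
                "description": description,
                "tags": set(tags),
            }
            return fn
        return decorator

    def select(self, wanted_tags):
        """Return sorted names of tools sharing at least one tag with the request."""
        wanted = set(wanted_tags)
        return sorted(n for n, t in self._tools.items() if t["tags"] & wanted)

    def call(self, name, **kwargs):
        """Invoke a registered tool by name."""
        return self._tools[name]["fn"](**kwargs)


shed = ToolShed()


@shed.register("search_docs", "Search internal docs", tags=["docs"])
def search_docs(query: str) -> str:
    return f"results for {query}"


@shed.register("open_ticket", "Open a ticket", tags=["tickets"])
def open_ticket(title: str) -> str:
    return f"ticket: {title}"
```

An agent working on a documentation task would then be given `shed.select(["docs"])` instead of the full catalog, which is the kind of selection that keeps token usage in check.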
You can imagine they have all types of internal and external services that they want to connect to, and Toolshed lets them do that. This was completely net new to me. I had not seen a concept like this before, and I think it's really cool: a centralized location to load specific tools. So big shout-out to the team for building something like this.

Lastly, one of the big ideas they talk about, and one that's just super critical for engineering, is that you iterate. This is just great engineering. All this stuff is so new and is moving so quickly. You, me, Stripe engineers, it doesn't really matter who you are. It's not about what you can do anymore. It's about what you can teach your agents to do for you. This is a big idea. It's one of the central theses we talk about in Tactical Agentic Coding, in addition to building agentic layers: handing off work and thinking about your agents as tools that you're templating your engineering into. That's the name of the game. Teach your agents how to build like you would so you can scale them to the moon.

All right, so what else do we have here? A lot of really great ideas. I'm curious what you think if you've operated in codebases with more than 10,000 files.
Comment down below: what would you rank Stripe's agentic layer, based on everything we've gone through here and our high-level understanding of their system? They have multiple CI entry points. They have EC2 agent sandboxes that mirror developer environments. They have their own custom agent harness. They have a customizable blueprint engine that lets them combine code and agents to outperform either. They have rules files for context engineering. They have Toolshed for selecting one, or many, of 500 tools. They of course have CI for self-validation, and they have GitHub PRs to review the work their agents have done on their dedicated agent sandboxes.

All right, so rank this. I'm super curious what you think. Rank Stripe's agentic layer out of 10. And again, this is for those of you who've worked on codebases larger than 10,000 files. No offense, guys, but I don't want to hear a vibe coder's opinion on Stripe's end-to-end system. But for mid-to-senior-level-plus engineers, I'd love to hear what you think. I'm going to give Stripe an eight out of 10. It's a very, very powerful agentic layer.

Let me be super clear here: I have no ego in this. Let me say it this way. I cannot solve Stripe's programmable financial infrastructure problems better than any one of the engineers on their team could. They own that problem in this problem space, so that's not what I'm saying at all. They are the experts there for sure. My expertise is in agentic engineering, in building agentic layers. So I only have two notes of feedback that I would pitch to them as potential improvements. The first thing is this, and they identify it right away.
Why only two rounds of feedback in their CI for their agents? They say speed, completeness, cost, time, compute, and so on. These are fair constraints and reasons to only run two rounds, but I think this is a mistake, frankly. Think about yourself as an engineer. Has anyone ever said to you, "Solve this problem. You have two attempts"? You just have two shots at this? No, no one has said that. It often takes us tens or hundreds of tries to get something right. So I think limiting their minions to just two shots is potentially going to cost them more developer time, and it also delays the next learning about how to improve their agentic system, learning they would get by letting their agents run more. I think what you learn from running five rounds of your agent is going to be a lot more informative than running just two. But I could be totally wrong. Again, they know their system better than we do. That's my first note.

My last note is about the language around their minions. They call these end-to-end agents, but you might have noticed they have a prompt step and they have a review step. That's two steps. End to end is this: you take out the review. It's prompt to production, P2P. This is something we've talked about inside Tactical Agentic Coding. This is the north star for all agentic engineers. It's a concept called ZTE, zero touch engineering: prompt to production, no review, no human in the loop. I want to be a little critical about their language here.
I know that this is industry standard, and of course, again, they're operating at a scale most of us engineers will never get to, but that's what I would push Stripe to think about next. What are the low-level, simple tasks, maybe some lower-risk tasks, developer tools stuff, some non-user-facing stuff, and even some user-facing stuff, that they could actually ship end to end? The value isn't in doing it. It's in answering the question: what would it take for you to run a prompt and trust that your agentic system can deliver this to production without human oversight? The value is in the journey of the question. So that's another area where I would really try to push the Stripe engineers to that next level.

I made a prediction on this at the end of last year in our 2026 top 2% engineering video. I think in 2026 we're going to see a blog post very similar to this one, from an engineer operating at serious scale, we're talking tens of millions in revenue, where they break down their agentic layer and talk about how they ship from prompt to production with ZTE, zero touch engineering.

So those are the only two notes I have. Again, I'm not trying to... Stripe has some of the most cracked engineers on the planet. This is just a note on the agentic system and not a note on any of their true core domain problems, because again, if you operate in a specific domain for years and years, no one knows how to solve it better than you do. All right, so those are my two notes. Big shout-out to the Stripe engineering team and Alistair Gray for writing this up. This is a great post.
This really caught my eye, and I thought it would be valuable to share with you because it really emphasizes the point that building a powerful agentic layer comes down to owning all the pieces, bottom to top. Now, there is a point at which you want to start owning your agentic technology. If you're creating a brand-new, net new product, you probably don't need to do that for a while. You just don't have the scale for it. Out of the box, it's going to work for you for a while. But then there's going to be a point where you need a specific solution, a customized solution to solve a specific problem, and you want to boil that all the way down. Just like your application is a detailed, edge-case-covering solution, your agent should reflect that too. That's why we covered the Pi coding agent. There are many, but this one is mine. The whole idea I want to connect with you on is that specialization goes all the way up the chain, all the way into the agent harness, all the way to the stack of technology that you operate.

So anyway, big shout-out to the Stripe engineers. This was really fun. I like blogs like this. Frankly, I'm getting a bit tired of everyone hyperfixating on models and prompts and skills. Let's uplevel this and talk about the systems that have agents inside of them, that contain agents and code and modern engineering technology that puts it all together to generate real value for you, your team, your company, and ultimately your users and customers. Because that's where the value really is. That's what makes all this stuff actually matter at all. If you're still watching, first off, big thanks to you.
I hope these ideas make sense. You really want to be thinking about the agentic layer as a whole, not just your coding tool, not just the models. Let's ease up on the obsession with these models and who's winning and what gen AI company is more... Let's just focus on solving problems by building agentic layers with the key pieces. Stripe has outlined a lot of them. Every agentic layer, every product, is going to run into the problems that each one of these nodes is a solution to. So let's pay attention to them. Let's think about how these are pieces of the puzzle of building at scale with agents. And this is just one interpretation. No one has all the answers right now. It's about collecting the right context for solving the problem of agentic engineering, and pushing what you can do further before the industry, before the mainstream, catches up.

Everything we do in engineering represents an asymmetry of information, and then technology, and then results, with your product, with your tool, with your team, and so on. So you want to be pushing forward on this stuff. Don't let off the gas. Stay focused on valuable information like this blog and, me being biased, this channel. I really try to focus in on concrete signal in the industry, not hype, not slop. There's going to be a lot of both of those as we move week after week. But I want this to be a place where you, the engineer, can come to focus and get some serious insight on how you can continue to win in the age of agents. If you made it to the end and you like this content, definitely feel free to check out Tactical Agentic Coding.
This is my take on how to scale far beyond AI coding and vibe coding, with advanced agentic engineering so powerful your codebase runs itself. As you can imagine, a lot of the ideas detailed in this blog, in the architecture of how Stripe built their agentic coding tool, have been detailed in here. I'll be honest, I'm not gloating or anything. I've been early to this. This is what happens when you're a first mover, when you bet big on an emerging technology. Everything you're going to see over the next year, I have in Tactical Agentic Coding and Agentic Horizon, the second part of this course, detailed here. So if you're interested, I'm going to leave a link to this. You can see all the ideas are really set in stone here, and thousands of engineers, some of your favorite engineers, mind you, are inside this course, have taken this course, have gotten massive value, and are moving ahead of the curve. So I'm going to leave this in here, link in the description for you.

Of course, I'm also going to link the minions post. Definitely give this a look. And I'll bet that if we search hiring... Yeah, so Stripe is hiring. If you're an engineer that's interested in this, you can tell them that IndyDevDan sent you if you want. And again, just a big shout-out to the Stripe team. This is really great stuff, really great engineering in the age of agents. No matter what, stay focused and keep building.