Pi CEO Agents. Claude 1M Context. Multi-Agent Teams.
YouTube / March 25, 2026
Transcript

0:00 Engineers, there are three massive innovations available to you that unlock high-leverage multi-agent teams like this. This customized Pi agent harness is running seven Claude 1 million context window agents, but these aren't your normal low-level worker agents. This is my CEO and board multi-agent team. This is a glimpse into the future where your agents are not only helping you ship the low-level work, they're helping you make high-level, game-time strategic decisions.

0:35 The first of three innovations here is of course the new Claude 1 million context models. Now, these models existed before, but somehow Anthropic was able to cut the pricing down. And that's the real headliner here. It's one price for the full context window. No long context premium. No model lab has been able to do this while maintaining a model that's actually useful. Gemini 3 series have claimed the 1 million token context window. Remember when we had Zuck with Llama 4 claiming the 10 million Maverick model? It's complete garbage. Anthropic and the Claude team have actually pulled it off. You can see right around that 250k mark is where things start degrading. But how fast things degrade really matters. For a lot of tasks, you don't need perfect retrieval. You need retrieval that doesn't lose the key ideas, the key context, the key information that you're using to solve the problem at hand. And it's become quite clear the Opus 4.6 and Sonnet 4.6 models can do that. They can get you there.

1:39 This enables the next frontier of agentic engineering, and it's really not getting talked about enough. This means that the core four just got a massive buff. Right now, everyone is still using agents as worker bees. Coding, planning, taking actions. You know the drill, right?
1:56 You've seen it. You're probably running one right now. When you combine the 1 million context window with these additional two key emerging agentic tools, you unlock incredible capability that sits at the center of knowledge work: decision-making. At this next tier of agentic engineering, it's not just you making key decisions anymore. And it's not just your team either. It's you, your team, and a country of geniuses in a data center.

2:28 Here's a question to hold in your mind as we work through the CEO and board. How much is a high-leverage decision really worth to you? Really think about that. How much are you willing to spend? How long are you willing to wait to get the best intelligence to answer your hardest questions? Think about that as we break down how this works.

2:45 First things first, as normal on the channel, let's break down the architecture of the system. What we have at a fundamental level is uncertainty in and decisions out. That is the purpose of the CEO and board multi-agent system. We take specific inputs and then we deliver specific results. This is a big piece of agentic engineering. You must know what your inputs are and what your outputs are. And then at a higher level, what's the real thing, right? What's the valuable thing that you're trying to attain here?

3:13 Now, the workflow is where all the magic happens. We have the brief, our input prompt or question, and then out comes the memo. This is the Jeff Bezos response, the recommendation that your CEO and board is going to create for you. What happens during the workflow step? This is the full breakdown. You sit at the top, you have a customized agent harness, and then we have a multi-agent orchestration pattern. Whenever you see a single node going out to multiple agents, immediately think multi-agent orchestration.
3:42 That's exactly what's happening here. We have the powerful 1 million context Opus 4.6 model acting as our CEO here. And that's exactly what's happening right here. Our CEO is controlling the conversation. You can see they're wrapping up the meeting here. As most meetings go, you know, it's over time. Thankfully, it's not over budget, though. All of our agents are equipped with the 1 million token context window. We can do a lot with this as we move forward here. Our CEO is conducting a valuable conversation with multiple agents. This is a multi-agent orchestration application and it's built into a customized Pi agent harness.

4:18 We're building a unique experience here. This is where specialization is going to take the cake. The Pi agent harness lets us do that. As Pi likes to say, there are many coding agents, but this one is mine. We have that prompt in, also known as our brief, and then we get the memo out. Fantastic. But how does it work?

4:33 This here is our actual workflow. You've got to know your inputs. You've got to know your outputs. And then you have to design your workflow. So the CEO is going to frame the decision. The board is going to debate. So we're going to have multiple agents battling back and forth, making key points, arguing pros and cons of a certain question that you have. Importantly here, we're going to check our constraints. We're building with agents and we're wrapping them in code in our own customized agent harness. That means they stop when we say stop, or at least, you know, roughly, as we saw here, a little bit over time. The CEO agent is just wrapping up here. We'll come back to how this works in a moment. But you can see here our CEO controls the entire workflow. They decide when enough is enough.
5:13 And then the CEO creates a final memo, a final decision. Think about a high-leverage, high-impact leader coming into the room, the chief executive officer. They get all the information, they debate with the board, and then they create their final memo. Perfect timing. Our memo was created. It automatically opened up VS Code. And now something really cool is going to happen here. We're going to actually get a natural language summary of all of our work. So, let's go ahead and just wait for this to fire off here. We're using, of course, a skill that connects to ElevenLabs.

5:45 >> Board decided unanimously on YouTube Shorts as the primary platform but reframed the question. 62%.1% churn rate and referral redesign. Blog gets migrated. Decision gates at day 30, 45, and 60 keep us honest.

6:01 >> Nice. So we got a natural language summary there. You know, that's just icing on the cake. We added a skill to our CEO agent, so of course we can customize the CEO as much as we want. The final response comes here from our CEO agent. Again, we'll break this out in just a moment. That entire process that you just saw produced a final memo. So this is the action, right? This is the answer. So question in, answer out.

6:20 And that brings us to the actual configuration. So this is a great branching endpoint to really dive into things. We have a couple of key sections here in our configuration file that controls the entire process. Constraints. This is really, really important. You don't want this stuff to run forever. And you want to set budgets for this, right? Because with 1 million context running rampant, you can do a lot, but it has to be guided properly. We then have our paths, all the key files. We'll break these out in a moment. The key ones here, as you can see, are the briefs and the memos.
6:47 And then finally, you can customize whatever agent you want to come to the board. It's your board. So, you can say what opinions you want to bring to the table. And the key part here is every one of these agents has their own customized system prompt. You can also add additional tools and capabilities via skills to every individual agent, further pushing their own unique capabilities. And there's one additional high-leverage piece to this, one of the three key innovations that we'll talk about later. So this is the configuration. This is how that system works. And it's all about removing uncertainty.

7:21 When you have a big question, a career, company, even personal lifestyle decision, what you really have is a question, and you want a decision out. As agentic engineers, we need to push what we can do with our agents, with our compute. This is a really, really high-leverage way I've been using agents over the past, let's say, half a year now. And I want to share this with you now because of the new capability unlocked by the 1 million context window models.

7:45 Inside of this memo, we had a question go in. Which shorts platform should we lead with? You know, we're running this company called Blend Stack. We needed to answer the question, where should we put our marketing efforts in terms of our shorts channel? So, this is just one of many, many, many business decisions you might have to make. All right, so we have a final decision here. Here you can see our CEO agent put together a decision map. Really broke things down for us here in a very visual way. We then have our top three recommendations here. Fix the retention engine. Then YouTube Shorts. So it looks like the awareness of having multiple agents debate created a lot of value here. It didn't just select a shorts platform.
8:19 It said you actually have a retention platform issue. Fix that first, otherwise nothing else matters. Right? So great callout there. You can see the stances of every single one of our board members. So, this is where a lot of the value lies. Every one of these board members has their own unique position, their own opinion, their own stake in the matter. And we have my favorite one down here, the moonshot agent, which you'll see why as we work through this. Our memo also points out the resolved and unresolved tension between all the members. Remember, this is a multi-agent adversarial pattern or adversarial tool. The whole point is that we have multiple perspectives that don't agree. So we can really flesh out the mental model that we need to make the best decision possible, and specifically that our CEO needs to make the best decision possible. And then we can just look at the conversation as a kind of observer above it all. And we can use this to make high-leverage decisions. You can see trade-offs and risks, next actions, so on and so forth. Everything you would want in a strategic decision-making partner is here.

9:25 Let's boot this up from scratch. I'm going to really show you what it looks like to make a high-leverage decision with this tool. As usual, I use a justfile to set up repeat agentic processes. If I type j, which is an alias for just, you can see we have the CEO command. If we type j ceo, you can see the exact command that that kicks off. We're moving into the app and then we're running this as a Pi extension. Right away, we're presented with our CEO and board, our strategic decision-making multi-agent team. We have a couple of constraints. Constraints are really important for making real decisions because in reality the world always applies constraints to us.
10:00 And the nice part here is for the first time ever we can actually apply real constraints to the duration and the budget of our meetings. The first thing to note about this system is that this is not your normal agent harness. If you try to prompt something, it will not accept it. This is not a conversation. This is a one-shot multi-agent system. So the only command we can run here is CEO begin. Let's fire this off.

10:23 The first thing this does is it looks for briefs. So, we'll break down the codebase in a second, but you can see here I've got a few different briefs. I've been testing a few different questions and problems that I want to present to this multi-agent team. Let's go ahead and run a classic one. A big decision for every successful product is acquisition offers. All right, so let's go ahead and run this and then we'll break down the inputs and outputs as this executes.

10:44 As soon as we hit this, the CEO first gathers context. After it has the full picture, you can see it's updating its mental model, its personal memory file, its scratch pad, and it's just going to write down a couple of key facts here. It's going to take some notes on the question and answer, and then it's running its converse tool. And so, you can see the CEO is writing to everyone at the head of the table, and it's announced something. Board, we're here to make a call on the Neutra Holdings acquisition offer. And then we have our multi-agent team, board members: revenue, technical architect, compounder, product strategist, contrarian, moonshot. They're all responding in parallel at the same time, thinking through how they would solve this problem and giving feedback to the CEO. And so this process is going to repeat as long as there is time and budget.
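The repeat-until-constrained loop described here can be sketched in Python. This is a minimal illustration under stated assumptions, not the harness's actual code: `BoardMember`, `Meeting`, and `deliberate` are hypothetical names, board members are stand-in callables rather than real model calls, and the CEO's follow-up message is a placeholder.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical type: a board member is any callable that takes the CEO's
# message and returns (reply_text, cost_in_dollars).
BoardMember = Callable[[str], tuple[str, float]]


@dataclass
class Meeting:
    max_seconds: float   # wall-clock limit for the debate
    max_dollars: float   # spend limit for the debate
    spent: float = 0.0
    transcript: list[tuple[str, str]] = field(default_factory=list)


def deliberate(ceo_message: str, board: dict[str, BoardMember],
               meeting: Meeting,
               clock: Callable[[], float] = time.monotonic) -> str:
    """Broadcast to the board in rounds until time or budget runs out.

    Returns the reason the loop stopped, which a harness could hand back
    to the CEO as a wrap-up notice.
    """
    if not board:
        raise ValueError("the board needs at least one member")
    start = clock()
    with ThreadPoolExecutor(max_workers=len(board)) as pool:
        while True:
            # Fan out: every member answers the same message in parallel.
            futures = {name: pool.submit(member, ceo_message)
                       for name, member in board.items()}
            for name, future in futures.items():
                reply, cost = future.result()
                meeting.spent += cost
                meeting.transcript.append((name, reply))

            # Constraints are checked only after a full round, so a
            # meeting can finish slightly over its limits.
            if clock() - start >= meeting.max_seconds:
                return "time limit reached"
            if meeting.spent >= meeting.max_dollars:
                return "budget reached"
            # Placeholder: a real CEO agent would write the next message.
            ceo_message = f"{len(meeting.transcript)} replies so far. Push further."
```

Checking the limits between rounds rather than mid-round is one way to get the "a little bit over time" behavior seen in the demo: in-flight replies are allowed to finish before the loop stops.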
11:31 We're looking for that 2 to 5 minute range before we start wrapping things up. And then we have a $1 to $5 constraint. Just to be super clear, I normally run this with a much, much higher budget and constraints. I prioritize high-leverage decision-making much more than I prioritize time and compute costs, of course up to some threshold. It's going to be different for every engineer.

11:53 Our agents are starting to work and respond to our CEO. And this of course runs in parallel, right? This isn't sequential. Our agents aren't blocked. They're all thinking. They're all forming their own opinions based on their system prompt, right? Based on their own internal thoughts and memory files. And so you can see things start to tick up there. Product strategist has responded here. 12 cents for that. These are all running the Sonnet model and I put the Opus model on our CEO. Every staff member has responded and now our CEO is going to iterate, and so we're still in time. We're still in budget. So our agent is updating its internal notes here. And you can see here the moonshot has rejected, as it often does. The moonshot wants to go bigger. It's looking for that moonshot opportunity in this decision. But you can see here the room is 4 to 1. And so our CEO is continuing the conversation, broadcasting out to its multi-agent team, and it wants to push the discussion further. That's the core process, right? As long as we're in time and in budget, the system is just going to keep churning. They're going to keep finding all the holes, all the cracks, and come to a concrete decision on our question, on our brief.

12:55 So, this is a good time to dive in and understand the inputs and the outputs of the system, right?
12:59 Because this is a unique multi-agent application that's going to deliver unique results. So, it must be designed in a unique way. This is one of my big problems with all the out-of-the-box agents, all the Claude Codes, all the Codexes. If you don't specialize them and you don't build custom agents inside of them, you're getting the normal distribution of what everyone else is building. Especially if you're writing super short, super simple prompts, you're really relying on everything else you've built to make that context, and therefore the result, specialized.

13:26 Just to give it away a little bit, that second big innovation is a customizable agent harness like Pi. There are many coding agents, but this one is mine. That's the value proposition. That's the big second unlock here: a customizable agent harness where you're not just doing this normal prompting back and forth in the loop, blah blah blah. This is old news. There's much more you can do, especially when you add the 1 million true context into your system.

13:53 All right, so our agents are responding. The compounder, looking for that competitive compounding advantage, is responding here. Looks like it's going to generate an SVG to aid its argument here. We've gone over that 5-minute mark. So, what's going to happen here is, on completion of this message, our agent harness is going to add a response block back to the CEO and it's going to tell the CEO, hey, we're moving over on time and/or budget. It's time to wrap things up. For all the cost min-maxers, $2.50, not bad at all. Very, very cheap. The real incredible part is, as mentioned, with these models, 1 million tokens available to you with, you know, serious intelligence, the cost is never going to go up. It's steady cost. I've always been so annoyed with the Gemini models.
14:35 They always increase the price after 200K. Anthropic has broken that barrier. And if you're operating in Claude Code, of course, you now have access to these 1 million context models by default. In-loop, the advantages are obvious: more context. But out-loop, the advantages are much less obvious, right? When you build powerful tools like this and you plug it up to your out-loop system, the leverage you can get is now massive.

14:59 There we go. Okay, max reached. So we are over in time. The CEO is going to end the deliberation. It's going to end the debate. It's prompting everyone. We've hit our constraint. Final position. One statement each. So every agent now, every board member, is going to give their final closing statement on what they think we should do with this question, with this brief, with this problem that we're presenting to our multi-agent team, our CEO and board. They're going to follow the normal process. Let's go and jump into the system so we can understand how this actually works.

15:33 So, as mentioned, uncertainty in, decision out. We put in a brief and we get a memo out. So, this application runs inside of its own app. Here you can see we have the extension right here. It's a single extension and it's quite massive. I need to actually break this out. The real value and the real customization happens here. If we open up .pi, we have CEO agents. This is interesting right away. We're all used to the .claude directory. You have agents, you have plugins, you have commands, you have skills. With the Pi agent harness, you can make your own structures. You're not limited to what currently exists. So, we have a brand new set of files here. We have our own agents. We have our own configuration file. We have our own expertise. More on that later.
16:11 And then we have the three essential nodes of the system, right? Our briefs, which are our prompts, and then our memos, which are the response from our system. And then the middle step, right? The deliberation, also just known as the debate. So, you can see here I've had several previous debates.

16:25 Let's go ahead and start with our configuration file. So the configuration file has a few key notes. We have our meeting, which sets up all the constraints. We have our brief sections. So these are the sections that a brief must have. So we're applying prompt engineering best practices as part of the system. So if we try to execute a brief that doesn't have key questions, it's not going to run. And then we have our paths, right? So these are just references to our briefs, our debates, our memos, and our agents. And then of course we have our customizable board members. And so you can see I left two out. Feel free to add them. And then we have our remaining agents. And each one of these is their own system prompt.

17:01 And you can see our memo opened up here. And we're working through an acquisition offer. All right. So we're a business that's been presented with an acquisition offer from Neutra Holdings.

17:14 >> The board recommends accepting Neutra Holdings' $12 million acquisition offer at 11 times ARR. The vote was 5 to 1 in favor with three conditions. A retention-linked earnout of $1.5 to $2 million, a 90-day knowledge transfer period, and a founder clarity question on whether the retention signal was ever tested. Moonshot dissented, arguing the blend engine is platform infrastructure worth far more. But nobody in three rounds could name the root cause of five quarters of decelerating growth. And that silence was the signal.

17:42 >> Oh, yeah, declining revenue. That silence was the signal. Okay, brutal.
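The configuration file walked through above (meeting constraints, required brief sections, paths, board members) could be modeled roughly as follows. This is a sketch under assumptions: the real file's format and key names are not shown in the video, so every field name and directory path here is hypothetical. Only the 2 to 5 minute and $1 to $5 ranges and the six board roles come from the walkthrough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeetingConstraints:
    min_minutes: float
    max_minutes: float
    min_budget_usd: float
    max_budget_usd: float

    def validate(self) -> None:
        # Reject ranges that are negative or inverted.
        if not 0 <= self.min_minutes <= self.max_minutes:
            raise ValueError("minutes range is invalid")
        if not 0 <= self.min_budget_usd <= self.max_budget_usd:
            raise ValueError("budget range is invalid")


@dataclass(frozen=True)
class BoardConfig:
    meeting: MeetingConstraints
    brief_sections: tuple[str, ...]   # sections every brief must contain
    paths: dict[str, str]             # where briefs, debates, memos, agents live
    board: tuple[str, ...]            # one system prompt per member


# Hypothetical values; the directory names are placeholders.
EXAMPLE = BoardConfig(
    meeting=MeetingConstraints(min_minutes=2, max_minutes=5,
                               min_budget_usd=1.0, max_budget_usd=5.0),
    brief_sections=("situation", "stakes", "constraints", "key questions"),
    paths={"briefs": "briefs/", "deliberations": "deliberations/",
           "memos": "memos/", "agents": "agents/"},
    board=("revenue", "technical-architect", "compounder",
           "product-strategist", "contrarian", "moonshot"),
)
```

Keeping the constraints in their own validated type means a bad budget or duration range fails before any agent is started.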
17:44 Another final decision made. We got a decision framework. We have that summary file. And we have our memo, the most important piece, the output of this system: a full board memo with recommendations, stances, tensions, and next actions.

17:58 So, let me be super clear here about what we have. This is a decision-making agentic coding tool. And really, we're not coding at all here, right? This is an agentic engineering tool. We're starting to uplevel the conversation about what our agents are doing. We're engineering and solving problems with our agents. We're not just coding anymore. As time goes on, I really want you to pay attention to this trend. Your agents can do much more. This is why it's so important to pay attention to the core four. This has been one of the biggest things I just haven't seen in the industry. So, I wanted to bring it to you here and hand this idea directly to you. Your agents can help you make strategic decisions.

18:35 The system prompts you're going to see here are not normal system prompts. If we take a quick look at the CEO system prompt, you can see that this looks very different from system prompts you're probably used to. Why? Because we have once again specialized the experience. Our agent harness knows how to parse this unique front matter from our agent's system prompt, which further customizes the capability. Our CEO has expertise. It has skills, and of course it's got a model, and one more field here that I'll leave for another day.

19:06 Let's talk about the briefs, right? What was the brief that we input into the system? If we open up a new terminal, type j ceo, and we do CEO begin, you know, we use this acquisition offer brief. So this is how the system works. So when you have a question, you'll come into the briefs section here.
19:24 You'll use the brief template. And of course, all my Tactical Agentic Coding members know exactly why we template things out. If you template your engineering, your agents can do exactly what you did, right? Exactly what you did. This is the big advantage a lot of engineers have missed. When you're prompting back and forth and you're not creating prescriptions, workflows, and systems for your agents to repeat, you miss out on all the true leverage, which is templating your engineering into your systems and teaching your agents how to do what you do.

19:54 All right, so we have a brief template. You can see the key structure there. And you'll notice how this aligns directly with the required sections. So we have the brief there. We have our stakes. We have our constraints and we have our key questions. This is in our template. It's in our validation. And if we submit a brief into our system that does not have these sections, it just rejects it. And so we are forcing great prompt engineering as a pattern into the agent harness. I'm specializing requirements for success.

20:24 What does an actual brief look like? Let's see, which one were we focused on there? Let me just grab this one here. Acquisition offer. And that's going to be right here in briefs. And I'm going to save all these and push it to the codebase for all Agentic Horizon members to access. But you can see here we have the brief. And what does this look like? This is how I am using agents to solve problems and get direction on critical decisions. And near the end, we're going to talk about why just running this in ChatGPT or Claude in a single agent just isn't as impactful. It's okay. It's a great starting point for feedback. I'm sure you've done it.
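The brief validation described above, where a brief missing a required section is rejected, might look something like this in Python. It is a sketch under assumptions: the section names and the idea that sections are markdown headings are guesses from the walkthrough, and `missing_sections` and `validate_brief` are hypothetical helpers, not the harness's API.

```python
import re

# Assumed section names; the real template may word them differently.
REQUIRED_SECTIONS = ("situation", "stakes", "constraints", "key questions")

# Matches markdown headings such as "## Stakes" and captures the title.
_HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", flags=re.MULTILINE)


def missing_sections(brief_markdown: str,
                     required: tuple[str, ...] = REQUIRED_SECTIONS) -> list[str]:
    """Return the required sections that have no matching heading."""
    headings = {m.group(1).lower() for m in _HEADING.finditer(brief_markdown)}
    return [section for section in required if section not in headings]


def validate_brief(brief_markdown: str) -> None:
    """Raise instead of running a meeting on an incomplete brief."""
    missing = missing_sections(brief_markdown)
    if missing:
        raise ValueError("brief rejected, missing sections: " + ", ".join(missing))
```

Running the check before any agent starts keeps a thin, two-sentence prompt from ever reaching the board.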
20:58 I do it sometimes, but never for serious things where context is required. If you don't context engineer the right information into your agents or single agent, it's not going to perform like you want it to. We've pushed this even further by having multiple agents.

21:12 So, the brief looks like this, right? Nice and simple. You can see the situation here. So, this is what our agent worked through. Should we take the 12 million acquisition offer? So, this is a private equity company that is combining and rolling up, purchasing supplement companies. It made a formal offer to acquire our company, Blend Stack (all right, so we are Blend Stack), for 12 million cash. Kind of a low stakes S&P exit. Still really good. That's 11x our current ARR of 1 million. Non-negotiable. Price expires in 30 days. We have stakes, constraints, and we have the key question.

21:43 We're forcing this structure in all of our briefs. We want you, the engineer, or any engineer on your team, or any product member on your team, to really think through things. Don't just write a stupid two-sentence prompt. Think. Give your agents serious information to work with and they will give you a serious result. If you type in a lazy BS prompt, it'll do its best because that's what it's been trained on, right? That's the RL that it just keeps getting hammered with, thumbs up or thumbs down. But if you really are serious in your prompt engineering (and prompt engineering is very much still alive, don't let anyone make you think that it's not), it will give you much better, higher quality results, especially when you multiply it with multiple agents like we have here.

22:22 But you can see here we also have additional files, additional context for the business that's going to persist. So we have business metrics here.
22:29 This is just auxiliary information, and product overview. Right? So what are we? We are Blend Stack, a direct-to-consumer supplement company. We let users build, you know, customizable supplement stacks, right? So pretty cool idea. This is a real mock of legitimate company strategic decisions. There are companies that do exactly this out there in the wild, but I'll bet you they don't have agents to help them make critical decisions. So we have that additional information here, and this is what our brief looks like.

22:52 So to be super clear, every single one of our agents, you know, every single staff member, receives all the extra context. So the CEO prompts them with the key information from the brief, but then every one of them is on the exact same page. They all load the key additional context, right? So product overview, business metrics, they're all on the same page. This is key. This is very, very important.

23:14 So that's the brief. That's the beginning of our flow, right? So if you hop back to our flow: uncertainty in, decision out. The brief is the input. This is the prompt. This is the plan. This is the question. Now what comes out is the memo. But there's something that sits here in the middle, the workflow. And that's where the deliberation is. Also just known as, you know, the debate, to put it in simple terms. But you can see here we have a bunch of middle-stage files. Most important is the conversation. Observability is a key element of building powerful agentic systems. If you don't measure it, you cannot improve it. Full stop. So here we have the full conversation that occurred.
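A fully observable conversation like this one can be kept as an append-only log. Here is a minimal Python sketch, assuming each message records who sent it and who it was addressed to. The `Message` fields, the `"all"` broadcast marker, and the JSON Lines storage format are illustrative choices, not details confirmed by the video.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Message:
    sender: str            # who is speaking, e.g. "ceo"
    recipients: list[str]  # agent names, or ["all"] for a broadcast
    text: str


def to_jsonl(messages: list[Message]) -> str:
    """Serialize the conversation as one JSON object per line."""
    return "\n".join(json.dumps(asdict(m)) for m in messages)


def from_jsonl(text: str) -> list[Message]:
    """Rebuild the conversation from its JSON Lines form."""
    return [Message(**json.loads(line)) for line in text.splitlines() if line.strip()]


def visible_to(messages: list[Message], agent: str) -> list[Message]:
    """Messages an agent may re-read: sent by it, addressed to it, or broadcast."""
    return [m for m in messages
            if m.sender == agent or agent in m.recipients or "all" in m.recipients]
```

One line per message keeps the file easy to tail while a meeting is running and easy to replay afterwards.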
23:49 If we look at the CEO here, and in fact as part of our staff's system prompts, we constantly tell them: reread the entire conversation, so they know who's saying what to whom and who's responding. Right, so we have the from and the to. Most of our conversation here is happening from the CEO to everyone, and then everyone, or, you know, the individual staff members, are responding to everyone else. We can dial this in further. Right now I have the system configured in a simple call-and-response way, broadcast mode. But you can see how this could be valuable, right? Because in reality, on real boards, there is often what's called backroom talk, or behind-the-door, under-the-table talk, where the compounder and the moonshot might want to converse about something, you know, behind everyone else's backs, right? And they might want to collude and find some middle ground, make some trade-offs together, and make a key decision. This is all good to help you make your key decision, right? You want those adversarial models battling it out, debating it out, and once again, just to expose flaws in your mental model and your system's mental model of the brief and the memo.

24:54 The conversation here is fully visible. We have the tool use of every agent. So you can see, you know, we are pulling in, right, we should have a bunch of, yeah, path reads here. We have a bunch of read tools, and some agents will also write to their own personalized memory file, specifically an expertise file. It's a bit distinct from memory. Memory is quite vague, but expertise is memory and patterns surrounding a specific problem. And that's the third key innovation here. If I close this, you can see I have this expertise file. This is one of the big topics we discussed inside of Agentic Horizon.
25:30 All members of Agentic Horizon already know about this. You've known about this for a while now. But expertise is a really powerful pattern. This isn't just arbitrary memory. It's not just "memorize this." It's based on your domain expertise, the thing that you focus on the most. And so I have this in a very simple form here. You can see here's the CEO scratch pad for this session. Of course, I have this as a presentational example for you to get started with. But in my real versions of my multiple CEO agents, I have multiple versions of this depending on the product, tool, or client work that I'm doing. I have CEO agents that are tracking a whole slew of working expertise. The real production expertise files that I have running are, you know, tens of thousands of tokens. And I was able to expand their mental model so much because of these new powerful 1 million context window models. This truly changes the types of multi-agent experiences you can build, the types of agentic engineering that you can do.

26:27 So anyway, that's the expertise file. These are, you know, the middle state of what's going on. And you'll notice here we also have some SVGs. All of our agents can create SVGs to aid their argument. The revenue agent created this multi-year plan thinking about what must go right to beat $12 million in a future exit. Year 1, year 2, year 3. And it's got these bull, base, and bear cases laid out like a great investor or a great board member would think them through. It's really just pointed out that 12 million today is an excellent offer given everything that's going on in the bull case and the bear case. Every agent can create SVG elements to visually compel the CEO toward their argument. So very, very powerful here.
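The expertise pattern described above, notes scoped to one domain that an agent accumulates and later reloads, can be sketched in a few lines of Python. The `Expertise` class and its rendered markdown layout are hypothetical; the video does not show the real file format.

```python
from dataclasses import dataclass, field


@dataclass
class Expertise:
    """Domain-scoped notes an agent accumulates across sessions (hypothetical shape)."""
    domain: str
    notes: list[str] = field(default_factory=list)

    def learn(self, note: str) -> None:
        # Skip blanks and exact repeats so the file grows only with new patterns.
        note = note.strip()
        if note and note not in self.notes:
            self.notes.append(note)

    def render(self) -> str:
        """Format the notes as a block that could be placed into a system prompt."""
        lines = [f"## Expertise: {self.domain}"]
        lines += [f"- {note}" for note in self.notes]
        return "\n".join(lines)
```

Scoping each store to a single domain is what separates this from general memory: the rendered block stays focused on one kind of problem even as it grows.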
I hope you can see that I am going to beat the out-of-the-box agent experience, right? I hope that's not even a question for you. Having multiple agents inside of your system is a massive advantage. When you add multiple agents to your pipeline to help you solve a specific problem, you get multiple perspectives. And frankly, you multiply the amount of context, the angles, and the perspectives available to solve a certain problem. It's no different than having a great, diverse team with different opinions. If you get a bunch of Chads on your team, every Chad is going to say, "Yeah, we should use React." You really want these unique perspectives. You want to build them into your own agent harnesses, your own customized agentic experiences. And that's why the PI agent harness is a really big piece of that. We've done it here in 2,000 lines of code. I do need to break this up a little bit because it's quite large. 2K lines in one file is not a great design pattern, so I'm going to clean this up.
We have our memo, so let's open this up again. You can see it's a very simple memo. We have our decision-making diagram here. Our CEO has said that we should accept the offer: 12 million, a cash outcome. There's a condition. It has really broken this down for us. Rejected: the moonshot did not like this. Everyone else is on the same page. We have a nice, simple, high-level visual of the decision, thanks to the PI agent harness and thanks to our own customization. Our system takes in brief markdown files in a very specific format and puts out a memo with an SVG and an MP3. So very, very concrete, unique inputs and outputs. We've built a unique system here. I harp on this a lot every single week on the channel, but it's so critical.
If you're not building specialized agents (context, model, prompt, tools), and you're not going high level and customizing your agent harness, you are in the normal distribution of what everyone is getting out of their agents. Right now, the big mainstream is the Claude Code agent running these powerful Opus and Sonnet models. It doesn't take a lot to push out of that distribution just a little bit. You have the core four. Change one of them. Change two of them. Don't just rely on your context being different. We can dial into the full memo here and really get a great breakdown of the full decision in the exact output format that we've specified. That's the memo.
Let's understand that we've customized the system prompt. We do have a traditional system prompt in the format that you've seen on the channel over and over and over. Consistency is an extraordinarily powerful tool. Purpose, variables, instructions, workflow, context, report. And this is in the system prompt. You can see here I have a bunch of static variables and I have runtime variables. These are dynamic variables that get updated in the system prompt before the agent starts, so that after it boots up, our agent is aware of them. We have also dynamically inserted a couple of additional things: expertise and skills. So these are all getting added. I'm modifying that normal agentic coding experience that you're likely used to and doing it in a specific way. How can I do this? It's because I know the actual primitives underneath the tool. This is where vibe coders have no shot. They cannot build something like this. The awareness just isn't there. You can go all the way down to the prompt level, the system prompt level, and you can redesign how things work.
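As a rough illustration of the static and runtime variables described above, the sketch below fills placeholders into the six named sections before the agent boots. This is one plausible way to do it using Python's standard `string.Template`, not the author's code: the function name, the `##` heading layout, and the variable names `board_size` and `session_id` are assumptions.

```python
from string import Template

# Section order taken from the transcript: purpose, variables, instructions,
# workflow, context, report.
SECTIONS = ["Purpose", "Variables", "Instructions", "Workflow", "Context", "Report"]


def render_system_prompt(
    sections: dict[str, str],
    static_vars: dict[str, str],
    runtime_vars: dict[str, str],
) -> str:
    """Fill $placeholders in each section and join the sections in a fixed order."""
    missing = [name for name in SECTIONS if name not in sections]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    # Runtime values are merged last, so they win if a name appears in both.
    variables = {**static_vars, **runtime_vars}
    parts = []
    for name in SECTIONS:
        # substitute() raises KeyError if a placeholder has no value, which
        # catches a forgotten runtime variable before the agent starts.
        body = Template(sections[name]).substitute(variables)
        parts.append(f"## {name}\n{body.strip()}")
    return "\n\n".join(parts)


sections = {name: f"{name} text goes here." for name in SECTIONS}
sections["Variables"] = "BOARD_SIZE: $board_size\nSESSION_ID: $session_id"
prompt = render_system_prompt(
    sections,
    static_vars={"board_size": "6"},        # hypothetical static variable
    runtime_vars={"session_id": "abc123"},  # hypothetical runtime variable
)
print(prompt)
```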
If you dive into the code, you'll see that I take the skill block here, which I can quickly update from the front matter of the agent, and I add it directly to the system prompt on bootup. And of course, we have the tool breakdown here for our primary agent. We don't need to go into too much detail here; you can imagine the rest. The key piece is that this applies to our CEO and the rest of our agents. Here's the compounder agent: there's a model, skills, and an expertise file. Similar format, similar structure: expertise, skills, runtime, yada yada yada. The key piece here is that we are strongly defining, and to be super clear, completely overwriting the system prompt. These are not coding agents. This is a CEO and a board built to make strategic decisions on your behalf. It's one of many, many thousands, tens of thousands of specialized agents that can be built. Coding is just the beginning. It's really just the beginning. It's a lot like vibe coding. It's the lowest-hanging fruit, because there are many more domains out there for you and me to access.
So anyway, this pattern repeats. Basically, I've created my own system prompt structure with front matter that my customized agent harness parses. From there, it's classic system prompt engineering. If you scroll down, you can see I've got temperament, how this role thinks, reasoning patterns, decision-making heuristics, and, you know, my favorite. Where's our moonshot agent here? The moonshot agent is our big-bet thinker, and I love this line: "What if we're thinking too small? You advocate for 10x moves, category-defining bets, the risky play that changes the trajectory of the entire business if it works."
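The front matter parsing and skill injection described above might look roughly like the sketch below. It deliberately handles only flat `key: value` front matter with the standard library and is not the harness code from the video: the field names (`name`, `model`, `skills`), the comma-separated skills format, and the `## Skills` heading are assumptions.

```python
def parse_agent_file(text: str) -> tuple[dict[str, str], str]:
    """Split an agent file into (front matter dict, prompt body).

    Only flat `key: value` lines between two `---` markers are supported.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text.strip()
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("front matter is not closed with '---'") from None
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def build_system_prompt(agent_text: str) -> str:
    """Append a skills block, read from the front matter, to the prompt body."""
    meta, body = parse_agent_file(agent_text)
    skills = [s.strip() for s in meta.get("skills", "").split(",") if s.strip()]
    if skills:
        body += "\n\n## Skills\n" + "\n".join(f"- {s}" for s in skills)
    return body


agent_md = """---
name: moonshot
model: some-model-id
skills: svg-charts, scenario-planning
---
What if we're thinking too small? You advocate for 10x moves."""
print(build_system_prompt(agent_md))
```

Because the returned string replaces the system prompt rather than extending a coding agent's default one, the agent's whole identity comes from this file, which matches the "completely overwriting" point made above.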
You can see that in the decision that was made in this last brief. Our moonshot agent was the only one that said, "Reject and run a 30-day investor test. See if we can raise some more funding. See if we can keep pushing." And so I love the moonshot agent. It's helped me think even bigger. Think long term. Think: what if you expanded beyond what you're doing right now? So anyway, that's just one of many personalities that you can build into your staff team, because you can customize this completely. Moonshot, contrarian, product strategist, and the compounder, thinking about compounding advantages. The revenue agent is really interesting as well. The revenue agent just wants cash now: a gravitational pull towards shipping, selling, and collecting money. "I want a version customers will pay for in 90 days." Maximize within the next 90 days. And this comes down to even more details: temperament, how this role thinks, reasoning patterns, and so on and so forth. That's how this system works.
The great part about this system is that the agents are going to retain knowledge. These are not normal agents. These are agent experts. You saw that scratchpad file. I've built that out to be much more persistent. In reality, we're not going to be jumping around making decisions across different business domains. We have our briefs here, and these are all different business domains, different samples to showcase how this works and how you can deploy a multi-agent team against specific decision sets. In reality, what you're going to end up with is a chain and a stack of decisions, questions and answers, briefs and memos that you and your teams have decided over time. And that's where the real value is.
It's in that specialization of stacking context that only you, your agents, and your team have. I have codebases with something like 20 briefs, and that means the expertise becomes more and more valuable. So I have versions of the CEO agent where the revenue agent has a long log file like this, much longer than this. It's a decent amount of tokens here, actually, 11K. It's taking notes on all the other members, and it kind of knows who it likes to agree with and who it often disagrees with. For instance, a pattern that I see, which makes sense: the revenue agent is often at odds with the compounder agent. Thinking in a sub-90-day prioritization range is typically at odds with a compounder who's thinking, how can we compound this advantage over multiple quarters and multiple years? So it's funny to see the agents actually taking notes, kind of colluding against each other a little bit. You want your agents opposing each other, at odds, so that they poke holes in each other's strategy, which ultimately gives you and your CEO every piece of information you need to create the best memo, aka the best answer to your question, the best solution to your problem.
Something important to note here as well: you saw that I'm exclusively using Claude models here. Agent model diversity is super, super important for creating better conflict and unique opinions. Overriding the system prompt does get us a lot of that value. But what we really want here is, you know, it would be great if OpenAI or Gemini were putting out true 1 million, or even 500K to 700K, token window models so that they could really be a part of this. The key here is that the models must be able to maintain long context so that they can retain and execute on tons of information about your business, about your product, about your life.
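The revenue-versus-compounder pattern mentioned above is the kind of thing that can be counted once a stack of memos exists. The sketch below tallies how often each pair of agents took different stances across past briefs. It is a toy illustration with made-up stances: the `disagreement_counts` function and the input shape (one dict of agent name to stance per brief) are assumptions, and nothing in the video indicates the agents' notes are structured this way.

```python
from collections import Counter
from itertools import combinations


def disagreement_counts(decisions: list[dict[str, str]]) -> Counter:
    """Count, per pair of agents, the briefs on which their stances differed."""
    counts: Counter = Counter()
    for stances in decisions:
        # Sort names so each pair is always keyed in the same order.
        for a, b in combinations(sorted(stances), 2):
            if stances[a] != stances[b]:
                counts[(a, b)] += 1
    return counts


# Hypothetical stances from two past briefs.
history = [
    {"revenue": "accept", "compounder": "reject", "moonshot": "reject"},
    {"revenue": "ship now", "compounder": "wait", "moonshot": "ship now"},
]
for pair, n in disagreement_counts(history).most_common():
    print(pair, n)
```

A tally like this could be written back into an expertise file so that the CEO agent knows in advance which pairs tend to clash.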
That's the key piece here. If they can't have it in their memory, their working memory, their context window, you're limited in the size of problem you can really ask them to give you a concrete opinion on, and a concrete solution to. That's important here. Sometimes, for my smaller products, I will actually drop down to some of these models. For instance, GPT 5.4 is very powerful and Gemini 3.1 Pro is very powerful, but only up to like 500K. I hope it's clear here that specialization is the advantage, and specialization once again increases the trust we have in our agents, because you've designed a system to create and maintain certain inputs and outputs. And, you know, we can just kick off another one here. We got an FDA warning. The FDA sent us a warning about our supplement company. We just kick that off.
So this really is a new frontier of capabilities. The Sonnet models with the 1 million context length: this changes things. This model in itself really does change the landscape. If you're doing normal in-loop agentic coding, it's pretty obvious that you can just add more context and you'll have to compress less, which is massively valuable. But this goes even further. When you build your own custom agent harness, let's be super clear: these are micro applications that do one thing extraordinarily well, the agentic way. That's what PI unlocks. And the last piece here is agent expertise. You can now have your agents track very, very long scratchpads and memory files, specifically expertise that helps them get an advantage in whatever situation you're putting that specific custom agent in. All right, so these are the three innovations I wanted to sit down and talk with you about today. Let me be super clear.
First, a true 1 million context window, thanks to the Opus and Sonnet 4.6 series. Second, we have customized agent harnesses. Once again, I'll link the video where we break down the PI coding agent versus Claude Code. PI is one of the only tools I see as a true Claude Code competitor because of the customization. And third, we have agent expertise. Thanks to that extended context window, your agents can remember and get on the same page as you about your product, your business, and so on and so forth. You can see once again our agents are creating their unique, opinionated responses to the FDA warning that this hypothetical business received. They're going to go on and help me make a critical business decision.
This codebase is available exclusively to Tactical Agentic Coding and Agentic Horizon members. For new engineers that don't know what this is, welcome. I don't sell or receive any sponsorships. The only thing I sell is handcrafted courses for mid to senior level engineers, engineers that ship to production. This is my take on how to scale far beyond AI coding and vibe coding with agentic engineering so powerful your codebase runs itself. You're starting to see big labs really catch up and implement these ideas. This course is unique in that it's not about Claude Code. It's not about any specific tool. This is about agentic engineering. It's about building the system that builds the system. Some of the big ideas we talk about: out-loop agentic engineering. Stop sitting in the terminal prompting back and forth and back and forth. Teach your agents how to build your way. Template your engineering. As we mentioned in the beginning, we talk about the key leverage points that change your impact, and much more. We build a system that builds the system.
I built this channel and this course in the same mindset, the same frame. Every single week I aim to be your favorite engineer's favorite engineer by really sitting down, dialing in, and breaking down what you can really do with this technology. To be super clear, there are two courses in here. There's the base course, Tactical Agentic Coding, and then there's the second, extended course, Agentic Horizon. I'm going to release the CEO and board agent tool for Agentic Horizon members. So you have to have both of these to gain access, but it'll be well worth it. We talk about big-hitting ideas you've seen all over the AI industry and some that you haven't, one of the big ones being agent experts. This is phase two, and it's really, really important to get on top of this stuff because phase three is coming. Everything we're going to be doing in phase three is going to build on top of everything we've done in phase two with agentic coding and agentic engineering as a whole. So this is here if you're interested, if you understand that valuable things are not always free.
You can see here my CEO and board is continuing to work. For everyone else, I'll hold this here for a second. Go ahead and take a screenshot. Try to get your Claude Code agent to reproduce the whole thing. Good luck. I'll also link the PI coding agent video in the description, which broke down the opportunity available for engineers that want to customize their agent harness and get an additional advantage. We are in the age of agents. The engineers that can scale and use compute in the form of agents are the engineers that are winning and will win. You know where to find me every single Monday. Stay focused and keep building.