推文 | Alfred's Site

账号:全部 mattpocockuk dair_ai simonw bcherny adocomplete trq212 dexhorthy yetone felixrieseberg 0xblacklight dani_avila7 zarazhangrui AlchainHust badlogicgames dotey petergyang vikingmute ryancarson ClaudeDevs kunchenguid leon7hao ctatedev iamzhihui kieranklaassen mckaywrigley lennysan thdxr RhysSullivan garrytan jakevin7 johnlindquist karpathy servasyy_ai theo 0xMovez 9hills RLanceMartin aidenybai dingyi hylarucoder mitchellh mitsuhiko mvanhorn omarsar0 steipete zeeg GeminiApp GergelyOrosz ZaynHao addyosmani antirez danshipper dillon_mulroy elonmusk ewind_dev idoubicc nummanali realWeZZard swyx thsottiaux yiliush 0xPaulius 0x_rody AYi_AInotes AnthropicAI Barret_China BenJames_____CMGS1988 ChadMoran DanielMiessler DaveJ DavidKPiano Dimillian FactoryAI FardeemM FarzaTV GoogleDeepMind HamelHusain HiTw93 HilaShmuel IfanJew IndieDevHailey Jack_W_Lindsey Jackywine JasonZX Jiaxi_Cui JinjingLiang KhalidWarsa LinearUncle LinghuaJ MatthewBerman MaxForAI Meari_V2_0_G Mnilax QingQ77 RaillyHugo Saccc_c Taniyatweets_VincentLogic ai_explorer25 aibuilderclub_alexalbert__alliekmiller andrewfarah antoinecojp anxue201 arkuy99 artman asynkimo bbssppllvv bearliu bentossell bridgemindai buaaxhm cailynyongyong cgtwts charmaine_klee chrisbarber chrisparkX clairevo delba_oliveira demishassabis doodlestein driaforall elithrar elvissun ericzakariasson francoisfleuret gabriell_lab iamsahaj_xyz imdigitalashish imvihv jarredsumner jerryjliu0 jianshuo jiayuan_jy jlongster jonas_nelle joshalbrecht jpschroeder kaiofreitas kepano lexrus logancyang mattyp nabeel nifinet ninthbit_ai oran_ge patrickc pejmanjohn petradonka prathamgrv prathyvsh quanruzhuoxiu quant_sheep raroque realCaigu repsiace rohanpaul_ai sama samuelstroschei sawyerhood shadcn shao__meng shaogefenhao shreyansj techgirl1908 techwith_ram thorstenball tianyi tobiaswup toddsaunders tricalt turingou tuturetom uniswap12 untraceable_the vanillaCitron velvet_shark wengtianxin wquguru xiaogaifun xicilion ycombinator zachlloydtweets ziwenxu_zodchiii

@0xblacklight·6月18日For You

lots of folks have been talking about loops lately most loops suck here's a practical one we actually use agents suck at writing react react-doctor by @aidenybai is our favorite way to deal with this you could run it and use a ralph loop to fix everything but I'm not reading a +80k/-80k PR (and neither is @dexhorthy) But I can read a small one first thing every morning when i get into the office here's what we do: run react-doctor in CI once daily at 7am (github actions-as-a-sandbox btw) agent picks top 5 issues, fixes them, and opens a PR other CI jobs check for regressions on every PR we can't realistically fix everything at once but we can keep it from getting worse and make it 1% better every day

↗ view on x.com

最近大家都在聊 loop，但大多数 loop 方案都烂透了。分享一个我们实际在用的： Agent 写 React 代码很烂。@aidenybai 的 react-doctor 是我们对付这个问题的最爱。你可以跑一次、用 ralph loop 把所有问题修光——但我不会去 review 一个 +80k/-80k 的 PR（@dexhorthy 也不会）。但我每天早上到办公室后能读一个小 PR。我们的做法是：每天早上 7 点在 CI 里跑 react-doctor（顺带说，GitHub Actions 是个绝佳沙盒） Agent 挑出前 5 个问题，修完，开一个 PR 其他 CI job 在每个 PR 上检查回归没办法一次把所有问题修完，但可以做到两件事：不让它变得更坏，以及每天 1% 的改进。

@0xblacklight·6月11日For You

.@mattpocockuk's /teach skill is great with humanlayer's inline HTML UI + skill bundle https://t.co/WVTXdmH4BA

↗ view on x.com

@mattpocockuk 的 /teach skill 搭配 humanlayer 的 inline HTML UI + skill bundle，效果很棒

@0xblacklight·6月9日For You

as the labs try to make models more "agentic" and better at long-horizon tasks, the models are overwhelmingly biased towards action so that when I ask a question about how something works, or "how could we accomplish X" the model just jumps straight to doing X which is not what I asked and which is ridiculously annoying

↗ view on x.com

随着各家实验室努力让模型更「智能体化」、更擅长长链任务，模型越来越偏向于直接行动。所以当我问「某件事是怎么运作的」或「我们怎么能做到 X」时，模型会直接跳去做 X。这不是我问的。而且非常烦人。

@0xblacklight·6月7日For You

This is such a banger There’s definitely a balance between “write a spec before writing code” and just getting into it Probably specs need to be more grounded in the code

@kinglycrow

so someone replied and then deleted, but I think it's a good point so i screencapped and anonymized it -- for me that friction is the point, forcing someone to create and maintain an RFC like thing this is the type of friction i feel like oss might need in the age of the agent https://t.co/1BTCjvZwBY

↗ view on x.com

这个说法太到位了「先写 spec 再写代码」和「直接动手」之间确实需要找平衡 Spec 大概需要更扎根于代码本身 [引用 @kinglycrow]：有人回复后删了，但我觉得这个观点值得保留，所以截图并匿名处理了——对我来说，那种摩擦感本身就是重点：强迫人去创建并维护类似 RFC 的东西。这种摩擦，正是 OSS 在 agent 时代可能需要的。

@0xblacklight·6月5日For You

agents are fundamentally a storage problem not even a compute problem lots of people are paying VM prices for POSIX filesystem APIs when agents actually don't need either agents don't see posix, they see tokens in / tokens out

@pdlug

The sandbox is a hack. LLMs got good at bash and grep, so they got their own computers. But most agent work is I/O. Move durability into the data layer and the loop becomes a cheap stateless function. That's the right primitive. So happy to see this.

↗ view on x.com

Agent 本质上是个存储问题，甚至不是计算问题。很多人在为 POSIX 文件系统 API 支付 VM 价格，但 agent 实际上两者都不需要。Agent 看不到 POSIX，它们看到的是 tokens in / tokens out。引用 @pdlug：沙盒是个 hack。LLM 擅长 bash 和 grep，所以就给了它们自己的电脑。但大多数 agent 工作是 I/O。把持久性移到数据层，循环就变成廉价的无状态函数——这才是正确的抽象原语。很高兴看到这个方向。

@0xblacklight·5月22日For You

imagine doing this with @EffectTS_

@swyx

working on a "take this vibecoded slop app and make it a production-ready, e2e tested, maintainable, parallelizable agent repo" skill. this thing ran for ~16 hours yesterday and made 103 commits all told and i ended up with exactly the same app but instead of fragile mvp it now looks like a codebase i can actually build on for th elong run

↗ view on x.com

想象一下用 @EffectTS_ 来做这个 [引用 @swyx]: 正在开发一个 skill：把 vibecoded 的烂摊子应用改造成生产就绪、端到端测试、可维护、可并行化的 agent 代码仓库。昨天跑了大约 16 小时，总共提了 103 个 commit，最终得到的还是完全相同的应用——但不再是脆弱的 MVP，而是一个我真正可以长期构建的代码库。

@0xblacklight·5月22日For You

this is basically code-mode for agent-driven SaaS configuration btw don't make the agent use a CLI or MCP server or agent browser - just let it write code that declares intent and let the provider work out how to wire it together

@0xblacklight

I recently chose one vendor over a second because the first one had a more robust API and in an afternoon codex has built a pulumi provider around their API for me so that all our configs in their SaaS are managed with declarative code that's version-controlled, type-safe, and explicit for agents (instead of needing their CLI/MCP server) and plugs into our other IaC so we don't need to go do things in dashboards and then configure it in our IaC this is what headless SaaS for agents means btw not "ship an MCP server" let me (or codex) configure it with code code mode for SaaS if you will - IaC for SaaS configuration

↗ view on x.com

这基本上就是 agent 驱动 SaaS 配置的「code mode」。别让 agent 去用 CLI、MCP server 或 agent browser——直接让它写声明意图的代码，让 provider 去负责 wiring（怎么把各部分连起来）。 [引用原推] 最近我选了 A 厂商而不是 B，原因是 A 的 API 更健壮。然后一个下午，codex 帮我围绕它的 API 构建了一个 Pulumi provider——现在他们 SaaS 里的所有配置都用声明式代码管理：版本控制、类型安全、对 agent 显式可读（而不是靠他们的 CLI/MCP server），还能接入我们其他的 IaC。从此不用再去 dashboard 里点点点，然后再去 IaC 里同步配置了。这才是「headless SaaS for agents」的真正含义——不是「出一个 MCP server」，而是让我（或 codex）用代码来配置它。可以叫它 SaaS 的 code mode，或者说 SaaS 配置的 IaC。

@0xblacklight·5月18日For You

the great thing about effect is you can adopt it super incrementally turn a promise into an effect turn an effect into a promise have a vendor API that's super flaky? wrap it in an effect, add bounded exponential backoff + jitter no other opinions required https://t.co/ZmmNNyU23v

↗ view on x.com

Effect 的优点是可以超增量地采用：把一个 Promise 转成 Effect，再把 Effect 转回 Promise—— 遇到特别不稳定的第三方 API？把它包进 Effect，加上有界指数退避 + jitter，其他地方完全不用动。

@0xblacklight·4月8日For You

have been saying this outer-loop agents just run things inside sandboxes when needed but operate on virtual filesystems and with virtual memory

@rafalwilinski

harness inside sandbox vs outside of sandbox debate is finally solved https://t.co/6wFM1v8GuJ

↗ view on x.com

一直在说这个 outer-loop agent 在需要时把任务丢进 sandbox 跑，但自身运行在虚拟文件系统和虚拟内存上 [引用 @rafalwilinski]：harness 放在 sandbox 内还是外的争论终于有定论了

@0xblacklight·3月28日For You

codexbros is this normal buddy had is brain fried in the RL torment nexus wouldn't even TRY reading outside the project directory (it could bc custom harness) https://t.co/sVOLqqLmET

↗ view on x.com

Codex 用户们，这正常吗？这家伙的大脑被 RL 训练折磨坏了——连尝试都不肯去读项目目录以外的文件（其实它是可以的，因为用的是自定义 harness）。

@0xblacklight·3月27日For You

the first search result for 'install claude code' on google is a (very convincing) clone of the claude code docs site that is distributing malware btw https://t.co/TeCZ4j4DAY

↗ view on x.com

顺便说一句，Google 搜索「install claude code」的第一条结果是一个（极具迷惑性的）claude code 文档仿冒站点，正在分发恶意软件 [截图链接]

@0xblacklight·3月26日For You

Indeed If you start with bad patterns you will never escape them due to LLMs auto-regressive nature We saw this working on a brownfield codebase and it heavily influenced our new workflow Nailing down good patterns for the agent to follow is an incredibly high-leverage point for humans

@dexhorthy

When @0xblacklight started working on the codelayer rewrite, he spent 2 weeks in vscode plumbing every pattern by hand so the clankers could be productive later - definitely paying off

↗ view on x.com

确实如此。如果你从糟糕的模式起步，由于 LLM 自回归的本质，你永远无法逃脱它们。我们在一个存量 codebase 上工作时深有体会，这也深刻影响了我们新的工作流。为 agent 奠定良好的模式，是人类能做的杠杆最高的事情之一。【引用 @dexhorthy】：当 @0xblacklight 开始做 codelayer 的重写时，他花了两周时间在 vscode 里手工铺设好每一个模式，这样后续的 clanker 才能高效工作——显然很值得。

@0xblacklight·3月24日For You

building something where you spent a lot of time thinking about the right abstractions and nailed them is amazing it's dramatically easier to ship if you nail this the temptation is to worry about all your abstractions up front the right thing to do though actually is to optimize for learning, ship things, and then refactor after patterns emerge from hitting the same problems 10 times

↗ view on x.com

花了大量时间思考正确的抽象层次并最终钉住它，这种感觉太棒了如果抽象做对了，交付速度会快得惊人很多人的误区是一开始就担心所有抽象的问题但正确的做法其实是：优先优化学习，先把东西发出去，等同一个问题撞墙 10 次之后，模式自然浮现，再重构

760 tweets · 188 sources