Tweets | Alfred's Site


@simonw·19h ago

Also a great example of positive contribution to open source by wanderingmeow - you don't need to contribute code to have a positive impact, just providing detailed feedback and confirmation that something like this works is enormously useful

@antirez

High quality interactions are still possible in the AI era. https://t.co/CjkD9gzo6G

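This is also a good example of wanderingmeow's positive contribution to open source: you don't need to contribute code to have a positive impact. Simply providing detailed feedback and confirming that a feature really works is extremely valuable in itself. [Quoting @antirez]: High-quality interactions are still possible in the AI era.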

@simonw·3d ago

Mitchell's post here reminded me of a similar conversation I had recently about how cheap it can be to port native mobile apps to React Native using coding agents... and then port them back again later if it turns out not to work out https://t.co/p4xZ6bNqHi

@mitchellh

It isn't unexpected that the focus of the Bun Rust rewrite is on the anti-Zig side more than anything, since the internet loves to hate. What is unexpected and unfortunate is that leadership within Bun hasn't tried to steer the conversation away from that at all. There are so many positive and interesting takeaways from this and I'm not really seeing any of them pushed as the primary message. A positive thing that hasn't been talked about at all is how far Bun came thanks to Zig. And even if you dump it now, it's meaningful for how good Zig was to even build a product to this point and impact by any metric. I would've loved to see anyone in leadership say this. On the interesting side is how fungible programming languages are nowadays. Programming languages used to be LOCK IN, and they're increasingly not so. You think the Bun rewrite in Rust is good for Rust? Bun has shown they can be in probably any language they want in roughly a week or two. Rust is expendable. It's useful until it's not, then it can be thrown out. That's interesting! There's been a lot of talk about memory safety and no doubt Rust provides more guarantees than Zig. But I'd love to see a better analysis of why Bun in particular suffered so much rather than take the language-blame path. How could engineering as a practice have been more rigorous to prevent this? What were the largest sources of crashes other programs should watch out for? How does Rust prevent them? How could Zig theoretically prevent them? That's interesting. I know the official blog post hasn't come out yet from Bun. But they're smart enough to know that that PR would stir up controversy the moment it opened, or they should've been. And plenty in the company have been tweeting and writing about it. It's somewhat telling to me in various dimensions what they chose to talk about first.
I tend to think I'm pretty good at corporate PR/comms (especially when it comes to developer audiences) and I think appealing to the negative is never the right long term strategy; it does work to get short term eyes though.

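@simonw: Mitchell's post reminded me of a similar recent conversation about how cheap it is to port native mobile apps to React Native using a coding agent... and how easy it is to port them back if that turns out not to work. Quoting @mitchellh: It is no surprise that discussion of Bun's Rust rewrite has centered on anti-Zig sentiment, since the internet loves to hate. What is truly unfortunate is that Bun's leadership has made no attempt to steer the conversation elsewhere. There are many positive and interesting angles here, but I see almost no one pushing them as the primary message. One positive angle nobody is discussing: Zig deserves much of the credit for how far Bun has come. Even if Bun drops it now, building a product to this scale and impact on Zig shows how good Zig is. I would have loved to see someone in leadership say so. The genuinely interesting angle is that **programming languages are becoming increasingly replaceable**. Languages used to mean lock-in, and increasingly they do not. You think Bun's Rust rewrite is good for Rust? Bun has shown it can probably move to any language in a week or two. Rust is expendable: useful until it is not, then thrown out. That is what is really worth discussing! I would rather see a deeper analysis of why Bun in particular crashed so much than see the blame pinned on the language. What were the largest sources of crashes? How does Rust prevent them? How could Zig theoretically prevent them? That is interesting. Appealing to the negative is never the right long-term communications strategy, even though it does win attention in the short term.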

@simonw·4d ago

Doing this is a great way to make a bonfire of your reputation

@GergelyOrosz

A person I know (and who is? was? a good professional) left an AI-generated comment under my LinkedIn post. Full-on AI slop. I asked: why do this? He replied. It’s because of “engagement.” People are burning their professional reputation, paying for AI tools, for nothing https://t.co/BxEgpt8Ppm

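@simonw: This is a great way to quickly burn your reputation. Quoting @GergelyOrosz: A person I know (who was a good professional) left an AI-generated comment under my LinkedIn post. Pure AI slop. I asked: why do this? He replied: for "engagement." People are paying for AI tools and burning their professional reputations, and getting nothing in return.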

@simonw·6d ago

Wrote about today's GitLab restructuring / "workforce reduction" announcement, and ended up digging around in version control for both the GitLab and the 37signals public employee handbooks to help illustrate my thoughts https://t.co/xkqehsa5hT

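@simonw: Wrote a post about today's GitLab restructuring / "workforce reduction" announcement, and while I was at it dug through the version history of the public GitLab and 37signals employee handbooks to support my points. https://t.co/xkqehsa5hT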

@simonw·6d ago

My Mac had less available memory than I expected, turned out the "claude" Claude Code processes on this machine (running in various terminal windows) were consuming ~30GB on their own! The largest one was using 4.9GB

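My Mac had less available memory than I expected. The culprit turned out to be the "claude" Claude Code processes running in various terminal windows on this machine: together they were using about 30GB! The largest one was using 4.9GB on its own.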
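One way to reproduce this kind of check is to total resident memory per command name. The sketch below is an illustration rather than the method used in the tweet: the function names are hypothetical, and it assumes a POSIX `ps` that reports `rss` in 1024-byte units. Summed RSS can count shared pages more than once, so treat the totals as an upper bound.

```python
import subprocess
from collections import defaultdict


def rss_by_command(ps_output: str) -> dict[str, int]:
    """Sum resident set size (KiB) per command base name.

    Expects the text produced by `ps -A -o rss= -o comm=`:
    one process per line, RSS first, then the command path.
    """
    totals: dict[str, int] = defaultdict(int)
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        rss_kib, command = int(parts[0]), parts[1].strip()
        # Group by base name so different install paths are counted together.
        totals[command.rsplit("/", 1)[-1]] += rss_kib
    return dict(totals)


def current_rss_by_command() -> dict[str, int]:
    """Run ps for all processes and aggregate the result."""
    result = subprocess.run(
        ["ps", "-A", "-o", "rss=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return rss_by_command(result.stdout)
```

Calling `current_rss_by_command().get("claude", 0) / 1024 / 1024` would give the combined figure in GiB for processes whose command name is exactly `claude`.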

@simonw·6d ago

New TIL: I figured out how to use my LLM CLI tool in a shebang line, which means you can write executable scripts in English, or hook up more complex scripts with a snippet of YAML template https://t.co/8mngqTbiTO

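New TIL: I figured out how to use my LLM CLI tool in a shebang line, which means you can write executable scripts directly in English, or hook up more complex scripts with a snippet of YAML template.

@simonw·May 10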

Shopify's River agent system lives in Slack and can only be used in public so that other employees can learn from what you do with it. Reminds me of how Midjourney's Discord-only launch helped people figure out the weird & complex craft of image prompting by watching each other

@tobi

Learning on the Shop floor

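Shopify's River agent system runs in Slack and can only be used in public channels, so other employees can learn from how you use it. This reminds me of how Midjourney originally launched only on Discord: by watching each other, people gradually worked out the strange and complex craft of image prompting. [Quoting @tobi]: Learning on the shop floor

@simonw·May 8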

Asking for HTML explanations of things is pretty neat, I tried it just now with the obfuscated Python POC for the new https://t.co/MFpUWZ2yRX Linux vulnerability: https://t.co/9g9YRHVvQX https://t.co/6751HGFvJz

@trq212

Using Claude Code: The Unreasonable Effectiveness of HTML

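Asking for HTML explanations of things is a pretty neat trick. I just tried it on the obfuscated Python POC for the latest Linux vulnerability: [link] [Quoting @trq212]: Using Claude Code: The Unreasonable Effectiveness of HTML

@simonw·May 7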

We already had gemini-3.1-flash-lite-preview back on March 3rd, not clear if this new gemini-3.1-flash-lite is different other than no longer being marked as a "preview". Pricing appears to be the same.

@GoogleAIStudio

gemini 3.1 flash-lite is here. It's our most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing https://t.co/QhaTNoLcgu

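@simonw: We already had gemini-3.1-flash-lite-preview back on March 3rd, and it is not clear how this new gemini-3.1-flash-lite differs other than no longer being marked as "preview". Pricing appears to be the same. [Quoting @GoogleAIStudio]: gemini 3.1 flash-lite is here. It's our most cost-efficient model, optimized for high-volume agentic tasks, translation, and simple data processing.

@simonw·May 7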

Saw this and thought "yes! ChatGPT voice mode is going to stop acting like a two-year-old model" but that upgrade hasn't shipped just yet

@OpenAI

Introducing GPT-Realtime-2 in the API: our most intelligent voice model yet, bringing GPT-5-class reasoning to voice agents. Voice agents are now real-time collaborators that can listen, reason, and solve complex problems as conversations unfold. Now available in the API alongside streaming models GPT-Realtime-Translate and GPT-Realtime-Whisper — a new set of audio capabilities for the next generation of voice interfaces.

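@simonw: I saw this and thought "Yes! ChatGPT voice mode is finally going to stop acting like a two-year-old model", but the upgrade hasn't reached the product yet. [Quoting @OpenAI]: Introducing GPT-Realtime-2 in the API: our most intelligent voice model yet, bringing GPT-5-class reasoning to voice agents. Voice agents can now be real-time collaborators that listen, reason, and solve complex problems as the conversation unfolds. Now available in the API alongside the streaming models GPT-Realtime-Translate and GPT-Realtime-Whisper, a new set of audio capabilities for the next generation of voice interfaces.

@simonw·May 7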

Under-reported details of the xAI/Anthropic Colossus data center deal: Anthropic get Colossus 1 but xAI keep using the larger Colossus 2, Colossus 1 has a REALLY bad environmental record, and xAI just shut down a bunch of older models on 2 weeks' notice https://t.co/oCKBRNwPVH

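@simonw: Under-reported details of the xAI/Anthropic Colossus data center deal: Anthropic gets Colossus 1 while xAI keeps using the larger Colossus 2; Colossus 1 has a really bad environmental record; and xAI just shut down a batch of older models on only two weeks' notice.

@simonw·May 6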

Watching @jarredsumner and @bcherny at Code w/ Claude talking about robobun, the Bun project's GitHub bot that's now made more contributions to Bun than Jarred has https://t.co/hXdWTC5kiG https://t.co/TRMW2zarCq

Watching @jarredsumner and @bcherny at Code w/ Claude talk about robobun, the Bun project's GitHub bot, which has now made more contributions to Bun than Jarred himself.

@simonw·May 6

I'm at the Code w/ Claude event in San Francisco, and I'll be live blogging the keynote here https://t.co/6U1Amd2XSZ

I'm at the Code w/ Claude event in San Francisco and will be live blogging the keynote here [link]

@simonw·May 6

I was talking with @josephruscio on the @heavybit podcast the other day when I realized that vibe coding and agentic engineering have started to blur a bit in some of my work - I published some extracts from the transcript https://t.co/evxoG06Vpa

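I was talking with @josephruscio on the @heavybit podcast the other day when I realized that vibe coding and agentic engineering have started to blur a little in some of my work. I published some extracts from the transcript: https://t.co/evxoG06Vpa

@simonw·May 5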

AI-run business experiments are interesting and fun up to the point where they waste the time of humans who haven't opted into the experiments - I think they need to keep their own human operators in the loop for outbound actions that affect other people https://t.co/QluYQ7lwi5

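AI-run business experiments are fun, but there is a line: they cross it when they start wasting the time of people who have not opted into the experiment. I think that for outbound actions that affect other people, AI systems must keep their own human operators in the loop.

@simonw·April 8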

Pelicans for Meta's new Muse Spark models - plus I did a bit of a deep dive into the Code Interpreter and fascinating "container.visual_grounding" tools in their https://t.co/Qq1tQtgKH7 chat UI https://t.co/NIQB6ZACDC

Pelicans for Meta's new Muse Spark models. I also took a deep dive into the Code Interpreter and the fascinating "container.visual_grounding" tools in their chat UI.

@simonw·April 8

The feature I most want from AI labs right now is documentation on which underlying search engines they use when their chat tools run a search. OpenAI and Anthropic and Meta AI all have search and I have NO IDEA what index they are using (I would hope that Gemini uses Google!)

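The feature I most want from AI labs right now: documentation of which underlying search engine their chat tools use when they run a search. OpenAI, Anthropic, and Meta AI all have search, and I have no idea what index they are using. (I would hope that Gemini uses Google!)

@simonw·April 7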

Wrote up some thoughts on Anthropic's Project Glassing, where their latest Opus-beating model is available to partnered security research organizations only. Given recent alarm bells raised by credible security voices I think this is a justified decision https://t.co/PKgzp99sA0

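Wrote up some thoughts on Anthropic's Project Glassing, where their latest Opus-beating model is available only to partnered security research organizations. Given the recent warnings from credible security voices, I think this is a justified decision.

@simonw·April 7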

754B parameters, 1.51TB on Hugging Face

@Zai_org

Introducing GLM-5.1: The Next Level of Open Source - Top-Tier Performance: #1 in open source and #3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo. - Built for Long-Horizon Tasks: Runs autonomously for 8 hours, refining strategies through thousands of iterations. Blog: https://t.co/hmyDe4Nel3 Weights: https://t.co/CuUjXcPKJD API: https://t.co/fz6reja4fb Coding Plan: https://t.co/Nk8Y98HNhU Coming to https://t.co/WCqWT0qCQb in the next few days.

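754B parameters, 1.51TB on Hugging Face. [Quote] Introducing GLM-5.1: The Next Level of Open Source - Top-tier performance: #1 in open source and #3 globally across SWE-Bench Pro, Terminal-Bench, and NL2Repo - Built for long-horizon tasks: runs autonomously for 8 hours, refining strategies through thousands of iterations. See the links for the blog, weights, API, and coding plan; it is coming to a further linked site in the next few days.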
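The two figures in the tweet imply a storage cost per parameter. A quick check, assuming "TB" means decimal terabytes and that the download is almost entirely weights (the repository may contain other files):

```python
def bytes_per_parameter(total_bytes: float, parameter_count: float) -> float:
    """Average storage cost of one parameter, in bytes."""
    return total_bytes / parameter_count


PARAMETER_COUNT = 754e9  # 754B parameters, from the tweet
SIZE_BYTES = 1.51e12     # 1.51TB, assuming decimal terabytes

ratio = bytes_per_parameter(SIZE_BYTES, PARAMETER_COUNT)
print(f"{ratio:.2f} bytes per parameter")
```

A result close to 2 bytes per parameter would be consistent with weights published in a 16-bit format, though the tweet itself does not say which format is used.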

@simonw·April 5

Billing different based on text contained in the system prompt is a really bad look

@steipete

Anthropic now blocks first-party harness use too 👀 claude -p --append-system-prompt 'A personal assistant running inside OpenClaw.' 'is clawd here?' → 400 Third-party apps now draw from your extra usage, not your plan limits. So yeah: bring your own coin 🪙🦞

@simonw: Billing differently based on text contained in the system prompt is a really bad look. Quoting @steipete: Anthropic now blocks first-party harness use too 👀 claude -p --append-system-prompt 'A personal assistant running inside OpenClaw.' 'is clawd here?' → 400 Third-party apps now draw from your extra usage, not your plan limits. So yeah: bring your own coin 🪙🦞

@simonw·April 5

I built a new Python CLI tool for scanning folders for secret strings, useful if you want to share a bunch of log files but first want to check they didn't accidentally leak API keys or similar. Run this command to learn more: uvx scan-for-secrets --help

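I built a new Python CLI tool for scanning folders for secret strings. It is useful if you want to share a batch of log files but first want to check that they don't accidentally leak API keys or similar. Run this command to learn more: uvx scan-for-secrets --help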
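The general idea of checking a folder for known secret values before sharing it can be sketched in a few lines. This is not the implementation or interface of `scan-for-secrets`; the function name and return shape below are hypothetical, and the sketch only matches literal strings that you supply.

```python
from pathlib import Path


def find_secret_occurrences(root, secrets):
    """Return (path, line_number, secret_index) for every line containing a secret.

    The index into `secrets` is reported instead of the value itself,
    so the report can be shared without repeating the secret.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            # errors="replace" lets binary or oddly encoded files be scanned too.
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        for line_number, line in enumerate(text.splitlines(), start=1):
            for index, secret in enumerate(secrets):
                if secret and secret in line:
                    hits.append((path, line_number, index))
    return hits
```

An empty result only means none of the supplied values appeared verbatim; encoded or escaped forms of a secret would need separate handling.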

@simonw·April 4

Started a new tag on my blog to track stories about AI-powered security research, which is very much having a moment right now - 11 posts so far already https://t.co/rlEjS0Ho1h

I started a new tag on my blog to track stories about AI-powered security research. The topic is very much having a moment right now, with 11 posts so far.

@simonw·April 3

Just noticed this has had 1.1m views now, which explains why I'm starting to see some less informed reactions to it crop up now it's broken out of purely tech Twitter

@lennysan

"Using coding agents well is taking every inch of my 25 years of experience as a software engineer, and it is mentally exhausting. I can fire up four agents in parallel and have them work on four different problems, and by 11am I am wiped out for the day. There is a limit on human cognition. Even if you're not reviewing everything they're doing, how much you can hold in your head at one time. There's a sort of personal skill that we have to learn, which is finding our new limits. What is a responsible way for us to not burn out, and for us to use the time that we have?" @simonw

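Just noticed this tweet has had 1.1 million views now, which explains why I'm starting to see some less informed reactions: it has broken out of purely tech Twitter. Quoting @lennysan: "Using coding agents well is taking every inch of my 25 years of experience as a software engineer, and it is mentally exhausting. I can run four agents in parallel on four different problems, and by 11am I am wiped out for the day. There is a limit on human cognition: even if you're not reviewing everything they do, there is only so much you can hold in your head at one time. There is a new personal skill we have to learn, which is finding our new limits. What is a responsible way for us to not burn out, and to use the time that we have?"

@simonw·April 3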

Warning to open source maintainers: the Axios supply chain attack started with some very sophisticated social engineering targeted at one of their developers https://t.co/ykpjVUFmUu

Warning to open source maintainers: the Axios supply chain attack started with some very sophisticated social engineering targeted at one of their developers.

@simonw·April 3

Anyone figured out a recipe to run Gemma 4 E2B or E4B against audio files locally on a Mac yet?

@osanseviero

@simonw The 2 small ones also support audio understanding! Including ASR, speech to translated text, and more

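Has anyone found a recipe for running Gemma 4 E2B or E4B against audio files locally on a Mac? Quoting @osanseviero: @simonw The two small ones also support audio understanding! Including ASR, speech to translated text, and more.

@simonw·March 27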

I've been vibe coding SwiftUI menu bar apps for my new Mac, turns out Claude Opus 4.6 and GPT-5.4 are both competent at Swift programming, no need to even open Xcode! https://t.co/1Vi0rtlVh1

I've been vibe coding SwiftUI menu bar apps for my new Mac. It turns out Claude Opus 4.6 and GPT-5.4 are both competent at Swift programming, with no need to even open Xcode!

@simonw·March 26

To me this mostly illustrates the futility of robust jailbreaking prevention

@kotekjedi_ml

New paper: We deploy Claude Code in an autoresearch loop to discover novel jailbreaking algorithms – and it works. It beats 30+ existing GCG-like attacks (with AutoML hyperparameter tuning) This is a strong sign that incremental safety and security research can now be automated. https://t.co/cDwxVydVPr

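@simonw: To me this mostly illustrates the futility of robust jailbreak prevention. [Quoting @kotekjedi_ml]: New paper: We deploy Claude Code in an autoresearch loop to discover novel jailbreaking algorithms, and it works. It beats 30+ existing GCG-like attacks (with AutoML hyperparameter tuning). This is a strong sign that incremental safety and security research can now be automated.

@simonw·March 24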

Thankfully the LiteLLM package has now been marked as "quarantined" on PyPI so attempting to install the compromised update via pip et al shouldn't work https://t.co/BmrbWCoLXn

@hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM pypi release 1.82.8. It has been compromised, it contains litellm_init.pth with base64 encoded instructions to send all the credentials it can find to remote server + self-replicate. link below

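Thankfully, the LiteLLM package has now been marked as "quarantined" on PyPI, so attempting to install the compromised update via pip and similar tools should no longer work. > [Quote] LiteLLM has been compromised, do not update. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and to self-replicate.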
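The `.pth` detail matters because Python's `site` module executes any line in a `.pth` file that starts with `import` followed by a space or tab, every time the interpreter starts. A minimal sketch for listing such lines follows; the function names are hypothetical and this is an inspection aid, not a detector for this specific attack. Legitimate packages also ship executable `.pth` lines, so a hit is a prompt to look closer rather than proof of compromise.

```python
import site
from pathlib import Path


def executable_pth_lines(directory):
    """Return (file_name, line_number, line_prefix) for .pth lines that Python would execute."""
    findings = []
    for pth in sorted(Path(directory).glob("*.pth")):
        try:
            lines = pth.read_text(encoding="utf-8", errors="replace").splitlines()
        except OSError:
            continue  # unreadable file: skip it
        for number, line in enumerate(lines, start=1):
            # The site module runs lines beginning with "import" plus a space or tab.
            if line.startswith(("import ", "import\t")):
                findings.append((pth.name, number, line[:80]))
    return findings


def scan_site_packages():
    """Check the interpreter's global and per-user site-packages directories."""
    directories = list(site.getsitepackages()) + [site.getusersitepackages()]
    return {d: executable_pth_lines(d) for d in directories if Path(d).is_dir()}
```

Long base64 payloads on an `import` line, as described in the quoted tweet, would show up truncated to 80 characters in the report.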

@simonw·March 24

Turns out you can run enormous Mixture-of-Experts on Mac hardware without fitting the whole model in RAM by streaming a subset of expert weights from SSD for each generated token - and people keep finding ways to run bigger models. Kimi 2.5 is 1T, but only 32B active so fits 96GB

@seikixtc

I got a 1T-parameter model running locally on my MacBook Pro. LLM: Kimi K2.5 1,026,408,232,448 params (~1.026T) Hardware: M2 Max MacBook Pro (2023) w/ 96GB unified memory Running on MLX with a flash-style SSD streaming path + local patching. This is an experimental setup and I haven’t optimized speed yet, but it’s stable enough that I’ve started testing it in an autoresearch-style loop. #LocalAI #MLX #MoE

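It turns out you can run enormous Mixture-of-Experts models on Mac hardware without fitting the whole model in RAM, by streaming a subset of expert weights from SSD for each generated token. And people keep finding ways to run bigger models. Kimi 2.5 has 1T parameters but only 32B active, so it fits in 96GB. [Quoting @seikixtc]: I got a 1T-parameter model running locally on my MacBook Pro. LLM: Kimi K2.5. Parameters: 1,026,408,232,448 (~1.026T). Hardware: M2 Max MacBook Pro (2023) with 96GB unified memory. Running on MLX with a flash-style SSD streaming path + local patching. This is an experimental setup and I haven't optimized speed yet, but it's stable enough that I've started testing it in an autoresearch-style loop. #LocalAI #MLX #MoE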
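The memory argument can be checked with rough arithmetic. The sketch below compares the weight footprint of the full model with that of the active parameters at a few precisions. The bit widths are assumptions for illustration (the tweets do not state the quantization used), and the estimate ignores the KV cache, activations, and runtime overhead.

```python
def weight_gigabytes(parameter_count: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return parameter_count * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 1_026_408_232_448  # from the quoted tweet
ACTIVE_PARAMS = 32e9              # "only 32B active"
RAM_GB = 96                       # unified memory on the quoted machine

for bits in (16, 8, 4):  # assumed precisions, not stated in the source
    total = weight_gigabytes(TOTAL_PARAMS, bits)
    active = weight_gigabytes(ACTIVE_PARAMS, bits)
    print(
        f"{bits:>2}-bit: full model {total:7.1f} GB, "
        f"active parameters {active:5.1f} GB, "
        f"full fits: {total <= RAM_GB}, active fits: {active <= RAM_GB}"
    )
```

Under these assumptions the full model exceeds 96GB at every listed precision while the active parameters stay below it, which is why streaming only the needed expert weights from SSD makes the setup feasible.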

@simonw·March 23

Starlette 1.0 is out! I used this as an opportunity to experiment with Claude Skills, since Claude isn't yet familiar with the (minor) breaking changes in the 1.0 release compared to 0.x https://t.co/IxQ18DShxg

Starlette 1.0 is out! I used this as an opportunity to experiment with Claude Skills, since Claude isn't yet familiar with the (minor) breaking changes in 1.0 compared to 0.x.

357 tweets · 110 sources