AI News and Investing

Ilya Sutskever (伊利亞·蘇茨克維)

Co-founder · Safe Superintelligence

Scaling ceiling · Smarter architectures

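Co-founder of SSI and former chief scientist at OpenAI. A prophet of the deep waters: he anticipates that scaling will hit a ceiling and that the next step lies in smarter models, not bigger ones.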

Recent interviews

  • Cited (r=0.25)
    Reiner Pope – The math behind how LLMs are trained and served
    @ Dwarkesh Podcast

    Did a very different format with Reiner Pope - a blackboard lecture where he walks through how frontier LLMs are trained and served. It’s shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk. It’s a bit technical, but I encourage you to hang in there – it’s really worth it. There are less than a handful of people who understa

  • Mentioned (r=0.15)
    OpenAI Misses Targets, Codex vs Claude, Elon vs Sam Trial, Big Hyperscaler Beats, Peptide Craze
    @ All-In Podcast

    (0:00) Bestie intros (3:05) OpenAI misses targets, Codex gains on Claude (20:02) AI cybersecurity: a market that's about to explode (31:03) Elon vs Sam Altman lawsuit (41:00) Big tech smashes earnings, Capex explosion (52:44) Vibecoding nightmare: AI deleted someone's codebase (58:33) Retatrutide craze: peptides go mainstream (1:06:34) Friedberg's Supreme Court experience Apply for Summit 2026: ht

  • Mentioned (r=0.70)
    #490 – State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI
    @ Lex Fridman Podcast

    Nathan Lambert and Sebastian Raschka are machine learning researchers, engineers, and educators. Nathan is the post-training lead at the Allen Institute for AI (Ai2) and the author of The RLHF Book. Sebastian Raschka is the author of Build a Large Language Model (From Scratch) and Build a Reasoning Model (From Scratch). Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/spons

  • Interviewed (r=0.95)
    Ilya Sutskever — We're moving from the age of scaling to the age of research
    @ Dwarkesh Podcast

    Ilya & I discuss SSI’s strategy, the problems with pre-training, how to improve the generalization of AI models, and how to ensure AGI goes well. Watch on YouTube; listen on Apple Podcasts or Spotify. Sponsors: Gemini 3 is the first model I’ve used that can find connections I haven’t anticipated. I recently wrote a blog post on RL’s information efficiency, and Gemini 3 helped me think it all thro

  • Mentioned (r=0.25)
    Adam Marblestone — AI is missing something fundamental about the brain
    @ Dwarkesh Podcast

    Adam Marblestone is CEO of Convergent Research. He’s had a very interesting past life: he was a research scientist at Google DeepMind on their neuroscience team and has worked on everything from brain-computer interfaces to quantum computing to nanotech and even formal mathematics. In this episode, we discuss how the brain learns so much from so little, what the AI field can learn from neuroscien

  • Mentioned (r=0.20)
    Thoughts on AI progress (Dec 2025)
    @ Dwarkesh Podcast

    What are we scaling? I’m confused why some people have short timelines and at the same time are bullish on the current scale up of reinforcement learning atop LLMs. If we’re actually close to a human-like learner, this whole approach of training on verifiable outcomes is doomed. Currently the labs are trying to bake in a bunch of skills into these models through “mid-training” - there’s an entire

  • Mentioned (r=0.30)
    Thoughts on AI progress (Dec 2025)
    @ Dwarkesh Podcast

    What are we scaling? I’m confused why some people have short timelines and at the same time are bullish on the current scale up of reinforcement learning atop LLMs. If we’re actually close to a human-like learner, this whole approach of training on verifiable outcomes is doomed. Currently the labs are trying to bake in a bunch of skills into these models through “mid-training” - there’s an entire

  • Mentioned (r=0.30)
    Podcast Strategy Doc (December 2025)
    @ Dwarkesh Podcast

    The mission: I originally titled my podcast The Lunar Society. I changed it to Dwarkesh Podcast eventually because people kept thinking it was a crypto podcast (“to the moon!!!”). I named it after The Lunar Society of Birmingham, an informal club that met in the late 18th century. Members included James Watt, Matthew Boulton, Erasmus Darwin, Joseph Priestley, and Josiah Wedgwood. These were the sci

  • Cited (r=0.15)
    RL is even more information inefficient than you thought
    @ Dwarkesh Podcast

    Recently, people have been talking about how it takes way more FLOPs to get a single sample in RL than it does in supervised learning. In pretraining, you get a signal on every single token you train on. In RL, you have to unroll a whole thinking trajectory that’s 10s of 1000s of tokens long in order to get a single reward signal at the end (for example, did the unit test for my code pass/did I ge

Core concept distribution (all-time)

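  • Scaling ceiling × 3
  • Smarter architectures × 2
  • Pipeline parallelism × 1
  • Model architecture × 1
  • Technical commentary × 1
  • OpenAI internal turmoil × 1
  • Pre-training × 1
  • Age of research × 1
  • Model jaggedness × 1
  • Generalization gap × 1
  • Genetic encoding × 1
  • Reward function × 1
  • Neuroscience × 1
  • Superhuman AI researcher × 1
  • AGI algorithm (演算法) × 1
  • Automated researcher × 1
  • AGI algorithm (算法) × 1
  • Continual learning × 1
  • SSI × 1
  • AGI bottleneck × 1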