
AI Explained Official Podcast

Philip - Host of AI Explained YT

Available Episodes

5 of 24
  • o3 breaks (some) records, but AI becomes pay-to-win
    A green card, o3 vs Gemini 2.5, 6 Benchmarks and a whole bunch of my thoughts on what on earth is happening in AI, from here to 2030. Plus, how AI is becoming pay-to-win, and why. Crazy times, 14 mins probably wasn’t enough.

    https://app.grayswan.ai/ai-explained
    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    00:33 - FictionLiveBench
    01:37 - PHYBench
    02:14 - SimpleBench
    02:54 - Virology Capabilities Test
    03:13 - Mathematics Performance
    04:29 - Vision Benchmarks
    05:43 - V* and how o3 works
    06:44 - Revenue and costs for you
    08:54 - Expensive RL and trade-offs
    09:40 - How to spend the OOMs
    13:27 - Gray Swan Arena

    Green Card: https://techcrunch.com/2025/04/25/an-openai-researcher-who-worked-on-gpt-4-5-had-their-green-card-denied/
    PHYBench: https://arxiv.org/pdf/2504.16074
    Virologytest: https://www.virologytest.ai/
    How o3 Vision Works: https://arxiv.org/pdf/2312.14135 https://x.com/sainingxie/status/1912570624523829573
    Visual puzzles: https://neulab.github.io/VisualPuzzles/
    Fiction Bench: https://x.com/ficlive/status/1912863028141244850
    https://geobench.org/
    https://simple-bench.com/
    AIME 2025: https://openai.com/index/introducing-o3-and-o4-mini/
    USAMO: https://x.com/mbalunovic/status/1914398518896193747
    NaturalBench: https://linzhiqiu.github.io/papers/naturalbench/
    Where’s Waldo: https://uk.pinterest.com/pin/492792384225896298/
    IMO and AlphaProof: https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/
    Crazy Revenue: https://www.theinformation.com/articles/openai-forecasts-revenue-topping-125-billion-2029-agents-new-products-gain?rc=sy0ihq
    Number of Users: https://www.theinformation.com/briefings/googles-gemini-user-numbers-revealed-court?rc=sy0ihq
    Subscriptions pay to win: https://www.forbes.com/sites/paulmonckton/2025/04/23/google-leak-reveals-new-gemini-ai-subscription-levels/
    GPU Trade-offs: https://x.com/sama/status/1915098951067554030
    RL Scale-up Amodei: https://www.darioamodei.com/post/on-deepseek-and-export-controls
    Log-linear Returns: https://x.com/bobmcgrewai/status/1895228291981943265
    2030 Scaling: https://epoch.ai/blog/can-ai-scaling-continue-through-2030
    Model Size: https://x.com/slow_developer/status/1874554473256997201
    Adam on AGI: https://x.com/TheRealAdamG/status/1913998366632968381
    Papers on Patreon: https://arxiv.org/pdf/2502.01839 https://arxiv.org/pdf/2504.13837
    Chollet Quote: https://x.com/fchollet/status/1912934762580447447
    OpenSim: https://opensim.stanford.edu/
    Non-hype Newsletter: https://signaltonoise.beehiiv.com/
    --------  
    14:33
  • o3 and o4-mini - they’re great, but easy to over-hype
    Critical analysis of the two most powerful new models behind ChatGPT, o3 and o4-mini. Not just the system cards, benchmarks, and my own tests, but some you may not have seen before. Yes, they can whip up amazing front-end in a few seconds, but you always have to ask what is in their data. Either way, they prove the gains from RL are just beginning…

    https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medium=simple_bench&utm_campaign=ai_explained
    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - o3 and o4-mini

    https://simple-bench.com/
    Plus, Teams and Pro, plus token count: https://x.com/btibor91/status/1912568994512662679
    System Card: https://openai.com/index/o3-o4-mini-system-card/
    Release Notes: https://openai.com/index/introducing-o3-and-o4-mini/
    https://deepmind.google/technologies/gemini/pro/
    https://x.com/DeryaTR_/status/1912558350794961168
    https://x.com/polynoamial/status/1912564068168450396
    API Pricing: https://openai.com/api/pricing/
    https://aider.chat/docs/leaderboards/
    Non-hype Newsletter: https://signaltonoise.beehiiv.com/
    --------  
    14:24
  • ‘Speaking Dolphin’ to AI Data Dominance, 4.1 + Kling 2: 7 Developments Critically Analysed
    This pod won’t just be about the release of GPT 4.1 in the last 48 hours, o3 build-up, Kling 2.0, a sneak peek at the next OpenAI model, or even the new Dolphin language tool. It will be about 7 such stories that contextualise where we are in AI and what is happening.

    https://www.emergentmind.com/

    Chapters:
    00:00 - Introduction
    00:30 - Kling 2.0
    01:35 - GPT 4.1
    05:25 - o3 Build-up
    07:37 - ‘Product Company’
    09:31 - Safe Superintelligence
    10:54 - DolphinGemma
    13:16 - Data Dominance?

    Kling 2.0: https://app.klingai.com/global/release-notes
    Dolphin Gemma: https://blog.google/technology/ai/dolphingemma/?s=09
    https://openai.com/index/gpt-4-1/
    OpenAI o3 Build-up The Information: https://www.theinformation.com/articles/openais-latest-breakthrough-ai-comes-new-ideas?rc=sy0ihq
    Physical reasoning: https://x.com/a_karvonen/status/1911839968990814503
    Fiction Live.bench: https://x.com/ficlive/status/1911853409847906626
    Altman Ted: https://www.youtube.com/watch?v=5MWT_doo68k
    https://simple-bench.com/try-yourself
    https://aider.chat/docs/leaderboards/
    4.5: https://www.youtube.com/watch?v=6nJZopACRuQ
    Geospatial reasoning: https://research.google/blog/geospatial-reasoning-unlocking-insights-with-generative-ai-and-multiple-foundation-models/
    Pioneers: https://x.com/OpenAIDevs/status/1910017976256119151
    Evals: https://www.youtube.com/watch?v=scsW6_2SPC4
    Anthropic Updates: https://www.bloomberg.com/news/articles/2025-04-15/anthropic-is-readying-a-voice-assistant-feature-to-rival-openai?srnd=phx-ai
    https://x.com/sethsaler/status/1912188383457059301
    https://techcrunch.com/2025/04/12/openai-co-founder-ilya-sutskevers-safe-superintelligence-reportedly-valued-at-32b/
    https://ai.meta.com/blog/llama-4-multimodal-intelligence/
    https://deepmind.google/technologies/gemini/pro/
    https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
    https://blog.google/products/google-cloud/ironwood-tpu-age-of-inference/
    OpenAI Documentary: https://www.patreon.com/posts/one-machine-to-121940490
    --------  
    20:09
  • AI CEO: ‘Stock Crash Could Stop AI Progress’, Llama 4 Anti-climax +‘Superintelligence in 2027’...
    The latest on Llama 4, and whether it signals a slowdown in AI, or solid progress. Plus, a deep dive on that viral prediction of superintelligence by 2027, and Amodei’s cautionary words on what could stop AI progress in its tracks. o3 news, and more, as well.

    Weights & Biases: https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medium=simple_bench&utm_campaign=ai_explained
    DeepSeek Doc: https://www.patreon.com/posts/openai-is-not-r1-125869969
    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    00:47 - Stock Crash
    02:28 - Llama 4
    10:55 - o3 News
    11:59 - OpenAI non-profit?
    13:13 - AI 2027

    Llama 4 Release: https://ai.meta.com/blog/llama-4-multimodal-intelligence/
    Dario Amodei Comments: https://www.youtube.com/watch?v=esCSpbDPJik
    Knowledge Cut-off: https://www.llama.com/docs/model-cards-and-prompt-formats/llama4_omni/
    Aider Polyglot: https://aider.chat/docs/leaderboards/
    Gemini 1.5: https://arxiv.org/pdf/2403.05530
    Fiction-LiveBench: https://fiction.live/stories/Fiction-liveBench-Mar-25-2025/oQdzQvKHw8JyXbN87
    OpenAI Valuation: https://www.nytimes.com/2025/03/31/technology/openai-valuation-300-billion.html?login=smartlock&auth=login-smartlock
    OpenAI Cybersecurity: https://www.bloomberg.com/news/articles/2024-01-16/openai-working-with-us-military-on-cybersecurity-tools-for-veterans
    Deep research System Card: https://cdn.openai.com/deep-research-system-card.pdf
    https://openai.com/index/paperbench/
    AI 2027: https://ai-2027.com/
    METR Paper: https://arxiv.org/pdf/2503.14499
    OpenAI non-profit: https://openai.com/index/nonprofit-commission-guidance/
    NYT Piece: https://www.nytimes.com/2025/04/03/technology/ai-futures-project-ai-2027.html?unlocked_article_code=1.804._yKi.QhwOp15Q3tcU&smid=url-share&s=09
    Kokotajlo predictions 2021: https://www.lesswrong.com/posts/6Xgy6CAf2jqHhynHL/what-2026-looks-like
    https://simple-bench.com/
    Non-hype Newsletter: https://signaltonoise.beehiiv.com/
    Podcast: https://aiexplainedopodcast.buzzsprout.com/
    --------  
    23:51
  • Gemini 2.5 Pro - It’s a Smart Chatbot … (New Simple High Score)
    Gemini gets a new record on Simple Bench, and several other benchmarks. I’ll go deep to explore its nuances, including how it deceptively reverse engineers answers, does better on certain coding benchmarks than others, may have a universal ‘conceptual language’ …

    https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medium=simple_bench&utm_campaign=ai_explained

    … and more. Plus practical tips, a note on security and Kling vs Veo 2 guest appearance.

    AI Insiders ($9!): https://www.patreon.com/AIExplained

    Chapters:
    00:00 - Introduction
    00:36 - Fiction Bench
    02:41 - Practicality - YouTube urls + Security - cut-off date
    03:42 - Coding
    06:22 - WeirdML Bench
    07:01 - Simple Bench Record High
    11:23 - Reverse Engineering!
    13:22 - Anthropic Paper
    17:49 - 3 Caveats

    Gemini 2.5 Updated: https://deepmind.google/technologies/gemini/
    Fiction Live Bench: https://fiction.live/stories/Fiction-liveBench-Feb-19-2025/oQdzQvKHw8JyXbN87
    https://simple-bench.com/
    WeirdML: https://htihle.github.io/weirdml.html
    https://x.com/htihle/status/1905014058228625542
    Anthropic Thoughts: https://www.anthropic.com/research/tracing-thoughts-language-model
    https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-cot
    https://aistudio.google.com/prompts/new_chat
    Search Study: https://www.cjr.org/tow_center/we-compared-eight-ai-search-engines-theyre-all-bad-at-citing-news.php
    Live bench: https://livebench.ai/#/
    Paper: https://arxiv.org/pdf/2406.19314
    LiveCode Bench: https://livecodebench.github.io/
    SWE-Verified: https://arxiv.org/pdf/2310.06770
    Non-hype Newsletter: https://signaltonoise.beehiiv.com/
    --------  
    21:21


About AI Explained Official Podcast

Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.

v7.16.2 | © 2007-2025 radio.de GmbH
Generated: 5/2/2025 - 4:19:54 AM