Haize Labs with Leonard Tang - Weaviate Podcast #121!
How do you ensure your AI systems actually do what you expect them to do? Leonard Tang takes us deep into the revolutionary world of AI evaluation with concrete techniques you can apply today. Learn how Haize Labs is transforming AI testing through "scaling judge-time compute" - stacking weaker models to effectively evaluate stronger ones. Leonard unpacks the game-changing Verdict library that outperforms frontier models by 10-20% while dramatically reducing costs. Discover practical insights on creating contrastive evaluation sets that extract maximum signal from human feedback, implementing debate-based judging systems, and building custom reward models that align with enterprise needs. The conversation reveals powerful nuggets like using randomized agent debates to achieve consensus and lightweight guardrail models that run alongside inference. Whether you're developing AI applications or simply fascinated by how we'll ensure increasingly powerful AI systems perform as expected, this episode delivers immediate value with techniques you can implement right away, philosophical perspectives on AI safety, and a glimpse into the future of evaluation that will fundamentally shape how AI evolves.
--------
54:15
Box AI with Ben Kus and Bob van Luijt
Ben walks us through Box's three-layer infrastructure puzzle: First, the mind-boggling base infrastructure (think millions of interactions per second and trillions of files). Second, their unique multi-tenant security challenge - unlike most SaaS platforms, Box users share content across company boundaries, making traditional tenant isolation impossible. And third, ensuring AI respects all these complex permissions while still delivering value. The podcast then dives further into how vector embeddings can balloon file sizes - a few hundred bytes of text can require 4-6KB of vector data storage! We also dig into why RAG remains essential despite growing context windows, and how Box is developing AI agents that transform painful enterprise processes like RFP responses.
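The 4-6KB figure is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes float32 values (4 bytes each) and embedding sizes of 1024 and 1536 dimensions; both dimensions are illustrative assumptions, since the episode description does not name a specific embedding model, and the helper `embedding_bytes` is hypothetical.

```python
# Raw storage needed for one dense embedding vector.
# Assumption: values are stored as float32, i.e. 4 bytes per dimension.
BYTES_PER_FLOAT32 = 4


def embedding_bytes(dimensions: int, bytes_per_value: int = BYTES_PER_FLOAT32) -> int:
    """Return raw vector size in bytes, ignoring index and metadata overhead."""
    return dimensions * bytes_per_value


# Assumed dimensions, chosen for illustration only.
for dims in (1024, 1536):
    size = embedding_bytes(dims)
    print(f"{dims} dims -> {size} bytes ({size / 1024:.0f} KiB)")
```

Under these assumptions a single vector lands at 4096 or 6144 bytes, so a text chunk of a few hundred bytes can be outweighed by its own embedding by roughly an order of magnitude.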
--------
55:32
Structured Outputs with Will Kurt and Cameron Pfiffer - Weaviate Podcast #119!
Hey everyone! Thanks so much for watching another episode of the Weaviate Podcast! Dive into the fascinating world of structured outputs with Will Kurt and Cameron Pfiffer, the brilliant minds behind Outlines, the revolutionary open-source library from .txt.ai that's changing how we interact with LLMs. In this episode, we explore how constrained decoding enables predictable, reliable outputs from language models, unlocking everything from perfect JSON generation to guided reasoning processes. Will and Cameron share their journey to founding .txt.ai, explain the technical magic behind Outlines (hint: it involves finite state machines!), and debunk misconceptions around structured generation performance. You'll discover practical applications like knowledge graph construction, metadata extraction, and report generation that simply weren't possible before this technology. Whether you're building AI systems or curious about where the field is heading, you'll gain valuable insights on how structured outputs integrate with inference engines like vLLM, why multi-task inference outperforms single-task approaches, and how this technology enables scalable agent systems that could transform software architecture forever. Join us for this mind-expanding conversation about one of AI's most important but underappreciated innovations, and discover why the future might belong to systems that combine freedom with structure.
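For readers new to the idea, here is a toy sketch of constrained decoding driven by a finite state machine. It is not the Outlines API or its implementation: the `FSM` table, the `VOCAB` list, and the random `fake_logits` stand-in for a language model are all hypothetical, chosen only to show how masking tokens by FSM state forces well-formed output.

```python
import random

# Toy finite state machine: state -> {allowed token: next state}.
# It accepts exactly '{"ok":true}' or '{"ok":false}'.
FSM = {
    0: {"{": 1},
    1: {'"ok"': 2},
    2: {":": 3},
    3: {"true": 4, "false": 4},
    4: {"}": 5},
}
FINAL_STATE = 5
VOCAB = ["{", "}", ":", '"ok"', "true", "false", "hello", "42"]


def fake_logits(rng: random.Random) -> dict:
    """Stand-in for a language model: a random score for every token."""
    return {token: rng.random() for token in VOCAB}


def constrained_decode(seed: int = 0) -> str:
    rng = random.Random(seed)
    state, output = 0, []
    while state != FINAL_STATE:
        logits = fake_logits(rng)
        allowed = FSM[state]
        # Mask step: only tokens with a transition out of the current
        # state are candidates, however the "model" scored the others.
        token = max(allowed, key=lambda t: logits[t])
        output.append(token)
        state = allowed[token]
    return "".join(output)


print(constrained_decode())
```

Whatever scores the stand-in model produces, tokens such as `hello` or `42` can never be emitted, so every run yields one of the two valid strings.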
--------
1:10:17
Synthetic Data with David Berenstein and Ben Burtenshaw - Weaviate Podcast #118!
Synthetic Data: The Building Blocks of AI's Future! Hey everyone! I am SUPER EXCITED to publish the 118th episode of the Weaviate Podcast featuring David Berenstein and Ben Burtenshaw from Hugging Face! This podcast explores the intricacies of synthetic data generation, detailing methodologies such as data augmentation, distillation, and instruction refinement. The conversation delves into persona-driven synthetic data, highlighting applications like Persona Hub, and discusses algorithms to enhance diversity, complexity, and quality of generated data. Additionally, they cover integration with Hugging Face’s ecosystem, including Argilla for annotation, AutoTrain for fine-tuning, and advanced data exploration tools like the Data Studio and SQL console. The podcast also touches upon the potential for synthetic image data generation and the exciting future of AI education and accessibility.
--------
1:02:01
Letta AI with Sarah Wooders - Weaviate Podcast #117!
Hey everyone! Thank you so much for watching the 117th episode of the Weaviate Podcast! In this episode, we dive deep into the cutting edge of AI agent development with Sarah Wooders, co-founder and CTO of Letta AI. Emerging from Berkeley's Sky Computing Lab, Sarah and her team have pioneered a revolutionary approach to stateful agents: AI systems that genuinely remember both you and themselves across extended conversations. The conversation explores how the groundbreaking MemGPT project evolved into Letta's comprehensive Agent Development Environment (ADE), which empowers developers to build truly persistent AI experiences. Sarah shares powerful insights on context management, memory prioritization, and the critical role of databases in agent architecture. Whether you're building AI systems or simply curious about where conversational AI is heading, this episode illuminates how the future of agents depends not just on their reasoning capabilities, but on their ability to maintain coherent identity and memory over time.