PodcastsCiênciaData Skeptic

Data Skeptic

Kyle Polich
Data Skeptic
Último episódio

596 episódios

  • Data Skeptic

    Healthy Friction in Job Recommender Systems

    02/2/2026 | 26min
    In this episode, host Kyle Polich speaks with Roan Schellingerhout, a fourth-year PhD student at Maastricht University, about explainable multi-stakeholder recommender systems for job recruitment. Roan discusses his research on creating AI-powered job matching systems that balance the needs of multiple stakeholders—job seekers, recruiters, HR professionals, and companies. The conversation explores different types of explanations for job recommendations, including textual, bar chart, and graph-based formats, with findings showing that lay users strongly prefer simple textual explanations over more technical visualizations. Roan shares insights from his "healthy friction" study, which tested whether users could distinguish between real AI-generated explanations and randomly generated ones, revealing that participants often used explanations as information sources rather than decision-making tools.
    The discussion delves into the technical architecture behind these systems, including the use of knowledge graphs built from tabular data, inference rules, and large language models to generate human-friendly explanations. Roan explains how his research aims to open the black box of recommender systems, making them more transparent and trustworthy for non-technical users. Looking forward, he discusses ongoing work on automated knowledge graph construction from resumes and job listings, research into fairness considerations around gender and location, and plans for real-world testing with actual job seekers. The episode concludes with Roan's vision for the future: AI systems that support rather than replace human recruiters, making the job search process less grueling while maintaining the essential human judgment that recruitment requires.
  • Data Skeptic

    Fairness in PCA-Based Recommenders

    26/1/2026 | 49min
    In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally.
    David introduces the concept of "power niche users" - highly active users with specialized interests who generate valuable data that can benefit the entire platform. We discuss his paper "When Collaborative Filtering Is Not Collaborative," which reveals how PCA can over-specialize on popular content while neglecting both niche items and even failing to properly recommend popular artists to new potential fans. David presents solutions through item-weighted PCA and thoughtful data upweighting strategies that can improve both fairness and performance simultaneously, challenging the common assumption that these goals must be in tension. The conversation spans from theoretical insights to practical applications at companies like Meta, offering a comprehensive look at the future of personalized recommendations.
  • Data Skeptic

    Video Recommendations in Industry

    26/12/2025 | 38min
    In this episode, Kyle Polich sits down with Cory Zechmann, a content curator working in streaming television with 16 years of experience running the music blog "Silence Nogood." They explore the intersection of human curation and machine learning in content discovery, discussing the concept of "algatorial" curation—where algorithms and editorial expertise work together. Key topics include the cold start problem, why every metric is just a "proxy metric" for what users actually want, the challenge of filter bubbles, and the importance of balancing familiarity with discovery. Cory shares insights on why TikTok's algorithm works so well (clean data and massive interaction volume), the crucial role of homepage curation, and how human curators help by contextualizing content, cleaning data, and identifying positive feedback loops that algorithms might miss.
    The conversation covers practical challenges like measuring "surprise and delight," the content deluge created by democratized creation tools, and why trust in tech companies is essential for better personalization. Cory emphasizes that discovery is "a good type of friction" and explains how the CODE framework (Capture, Organize, Distill, Express, plus Analysis) guides professional curation work. Looking to the future, they discuss the need for systems thinking that creates narrative connections between content, the potential for conversational AI to help users articulate preferences, and why diverse perspectives beyond engineering are crucial for building effective discovery systems. Resources mentioned include the newsletter "Top Information Retrieval Papers of the Week" and Notebook LM for synthesizing research.
  • Data Skeptic

    Eye Tracking in Recommender Systems

    18/12/2025 | 52min
    In this episode, Santiago de Leon takes us deep into the world of eye tracking and its revolutionary applications in recommender systems. As a researcher at the Kempelin Institute and Brno University, Santiago explains the mechanics of eye tracking technology—how it captures gaze data and processes it into fixations and saccades to reveal user browsing patterns. He introduces the groundbreaking RecGaze dataset, the first eye tracking dataset specifically designed for recommender systems research, which opens new possibilities for understanding how users interact with carousel interfaces like Netflix. Through collaboration between psychologists and AI researchers, Santiago's work demonstrates how eye tracking can uncover insights about positional bias and user engagement that traditional click data misses.
    Beyond the technical aspects, Santiago addresses the ethical considerations surrounding eye tracking data, particularly concerning pupil data and privacy. He emphasizes the importance of questioning assumptions in recommender systems and shares practical advice for improving recommendation algorithms by understanding actual user behavior rather than relying solely on click patterns. Looking forward, Santiago discusses exciting future directions including simulating user behavior using eye tracking data, addressing the cold start problem, and translating these findings to e-commerce applications. This conversation challenges researchers and practitioners to think more deeply about de-biasing clicks and leveraging eye tracking as a powerful tool to enhance user experience in recommendation systems.
  • Data Skeptic

    Cracking the Cold Start Problem

    08/12/2025 | 39min
    In this episode of Data Skeptic, we dive deep into the technical foundations of building modern recommender systems. Unlike traditional machine learning classification problems where you can simply apply XGBoost to tabular data, recommender systems require sophisticated hybrid approaches that combine multiple techniques. Our guest, Boya Xu, an assistant professor of marketing at Virginia Tech, walks us through a cutting-edge method that integrates three key components: collaborative filtering for dimensionality reduction, embeddings to represent users and items in latent space, and bandit learning to balance exploration and exploitation when deploying new recommendations.
    Boya shares insights from her research on how recommender systems impact both consumers and content creators across e-commerce and social media platforms. We explore critical challenges like the cold start problem—how to make good recommendations for brand new users—and discuss how her approach uses demographic information to create informative priors that accelerate learning. The conversation also touches on algorithmic fairness, revealing how her method reduces bias between majority and minority (niche preference) users by incorporating active learning through bandit algorithms. Whether you're interested in the mathematics of recommendation engines or the broader implications for digital platforms, this episode offers a comprehensive look at the state-of-the-art in recommender system design.

Mais podcasts de Ciência

Sobre Data Skeptic

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.
Site de podcast

Ouça Data Skeptic, PQU Podcast e muitos outros podcasts de todo o mundo com o aplicativo o radio.net

Obtenha o aplicativo gratuito radio.net

  • Guardar rádios e podcasts favoritos
  • Transmissão via Wi-Fi ou Bluetooth
  • Carplay & Android Audo compatìvel
  • E ainda mais funções

Data Skeptic: Podcast do grupo

Informação legal
Aplicações
Social
v8.4.0 | © 2007-2026 radio.de GmbH
Generated: 2/5/2026 - 3:51:26 AM