Designing Recommender Systems for Digital Humanities
In this episode of Data Skeptic, we explore the fascinating intersection of recommender systems and digital humanities with guest Florian Atzenhofer-Baumgartner, a PhD student at Graz University of Technology. Florian is working on Monasterium.net, Europe's largest online collection of historical charters, containing millions of medieval and early modern documents from across the continent. The conversation delves into why traditional recommender systems fall short in the digital humanities space, where users range from expert historians and genealogists to art historians and linguists, each with unique research needs and information-seeking behaviors. Florian explains the technical challenges of building a recommender system for cultural heritage materials, including dealing with sparse user-item interaction matrices, the cold start problem, and the need for multi-modal similarity approaches that can handle text, images, metadata, and historical context. The platform leverages various embedding techniques and gives users control over weighting different modalities—whether they're searching based on text similarity, visual imagery, or diplomatic features like issuers and receivers. A key insight from Florian's research is the importance of balancing serendipity with utility, collection representation to prevent bias, and system explainability while maintaining effectiveness. The discussion also touches on unique evaluation challenges in non-commercial recommendation contexts, including Florian's "research funnel" framework that considers discovery, interaction, integration, and impact stages. Looking ahead, Florian envisions recommendation systems becoming standard tools for exploration across digital archives and cultural heritage repositories throughout Europe, potentially transforming how researchers discover and engage with historical materials. The new version of Monasterium.net, set to launch with enhanced semantic search and recommendation features, represents an important step toward making cultural heritage more accessible and discoverable for everyone.