
The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI

Astronomer
Latest episode

85 episodes

  • How Snowflake Uses Airflow Sensors To Keep Financial Metrics Accurate with Ayush Pradhan

    08/1/2026 | 18min

    Clean financial reporting depends on orchestration that catches issues early, prevents bad data from spreading and helps teams react fast when something breaks. In this episode, Ayush Pradhan, Senior Analytics Engineer at Snowflake, joins us to explain how Snowflake’s finance data team relies on Apache Airflow, sensors and dbt to keep revenue, cost and accounting pipelines accurate.

    Key Takeaways:
    00:00 Introduction.
    02:25 Airflow is used to coordinate recurring finance workflows.
    04:20 Sensors enforce the timing for the end of the measurement period.
    06:50 Anomaly checks help catch issues near the source.
    08:30 Alerts route to owners quickly through common channels.
    10:38 Failed quality checks block downstream publishing.
    11:50 New features are adopted only when tied to business value.
    15:42 Vendor sensors can trigger workflows after upstream updates.
    16:52 Community learnings help teams keep pace with open-source change.

    Resources Mentioned:
    Ayush Pradhan: https://www.linkedin.com/in/ayush-pradhan-845a19194/
    Snowflake | LinkedIn: https://www.linkedin.com/company/snowflake-computing/
    Snowflake | Website: https://www.snowflake.com
    Apache Airflow: https://airflow.apache.org/
    dbt Labs: https://www.getdbt.com/
    Slack: https://slack.com/

    Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

    #AI #Automation #Airflow
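
    Below is a minimal, hypothetical Airflow sketch of the pattern this episode describes, not Snowflake’s actual pipeline: a sensor waits for the measurement period to close, an anomaly check runs near the source, and a failed check blocks the downstream publish step. The DAG name, delay and row-count check are illustrative assumptions.

```python
# Hypothetical sketch of the sensor-plus-quality-gate pattern discussed in the episode.
from datetime import datetime, timedelta

from airflow.decorators import dag, task
from airflow.exceptions import AirflowFailException
from airflow.sensors.time_delta import TimeDeltaSensor


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def finance_metrics():
    # Wait two extra hours after the end of the data interval so late-arriving
    # source records are included before anything is computed.
    wait_for_period_close = TimeDeltaSensor(
        task_id="wait_for_period_close",
        delta=timedelta(hours=2),
        mode="reschedule",
    )

    @task
    def run_anomaly_checks() -> None:
        # Placeholder assertion; a real pipeline would run dbt tests or SQL checks.
        revenue_rows = 1_000  # assume this count was fetched from the warehouse
        if revenue_rows == 0:
            # Raising here fails the task and blocks every downstream task,
            # so bad data never reaches the published reports.
            raise AirflowFailException("No revenue rows found for this period")

    @task
    def publish_metrics() -> None:
        print("Publishing validated revenue, cost and accounting metrics")

    wait_for_period_close >> run_anomaly_checks() >> publish_metrics()


finance_metrics()
```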

  • The Role of Airflow in Building Smarter ML Pipelines at Vivian Health with Max Calehuff

    11/12/2025 | 19min

    The integration of data orchestration and machine learning is critical to operational efficiency in healthcare tech. Vivian Health leverages Airflow to power both its ETL pipelines and ML workflows while maintaining strict compliance standards. Max Calehuff, Lead Data Engineer at Vivian Health, joins us to discuss how his team uses Airflow for ML ops, regulatory compliance and large-scale data orchestration. He also shares insights into upgrading to Airflow 3 and the importance of balancing flexibility with security in a healthcare environment.

    Key Takeaways:
    00:00 Introduction.
    04:21 The role of Airflow in managing ETL pipelines and ML retraining.
    06:23 Using AWS SageMaker for ML training and deployment.
    07:47 Why Airflow’s versatility makes it ideal for MLOps.
    10:50 The importance of documentation and best practices for engineering teams.
    13:44 Automating anonymization of user data for compliance.
    15:30 The benefits of remote execution in Airflow 3 for regulated industries.
    18:16 Quality-of-life improvements and desired features in future Airflow versions.

    Resources Mentioned:
    Max Calehuff: https://www.linkedin.com/in/maxwell-calehuff/
    Vivian Health | LinkedIn: https://www.linkedin.com/company/vivianhealth/
    Vivian Health | Website: https://www.vivian.com
    Apache Airflow: https://airflow.apache.org/
    Astronomer: https://www.astronomer.io/
    AWS SageMaker: https://aws.amazon.com/sagemaker/
    dbt Labs: https://www.getdbt.com/
    Cosmos: https://github.com/astronomer/astronomer-cosmos
    Split: https://www.split.io/
    Snowflake: https://www.snowflake.com/en/

    Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

    #AI #Automation #Airflow
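
    As a rough illustration of the Airflow-plus-SageMaker retraining pattern discussed here (not Vivian Health’s configuration), the sketch below chains a placeholder ETL task into a SageMaker training job via the Amazon provider. The bucket, IAM role, image URI and instance settings are hypothetical.

```python
# Hedged sketch: weekly ML retraining triggered from Airflow via Amazon SageMaker.
from datetime import datetime

from airflow.decorators import dag, task
from airflow.providers.amazon.aws.operators.sagemaker import SageMakerTrainingOperator


@dag(schedule="@weekly", start_date=datetime(2025, 1, 1), catchup=False)
def ml_retraining():
    @task
    def prepare_training_data() -> str:
        # Placeholder ETL step; a real DAG would export features to S3 here.
        return "s3://example-bucket/features/latest/"  # hypothetical path

    train = SageMakerTrainingOperator(
        task_id="train_model",
        wait_for_completion=True,
        config={
            # Abbreviated boto3 create_training_job payload with placeholder values.
            "TrainingJobName": "example-retrain-{{ ds_nodash }}",
            "AlgorithmSpecification": {
                "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
                "TrainingInputMode": "File",
            },
            "RoleArn": "arn:aws:iam::123456789012:role/example-sagemaker-role",
            "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/models/"},
            "ResourceConfig": {
                "InstanceType": "ml.m5.xlarge",
                "InstanceCount": 1,
                "VolumeSizeInGB": 30,
            },
            "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        },
    )

    prepare_training_data() >> train


ml_retraining()
```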

  • Scaling Airflow to 11,000 DAGs Across Three Regions at Intercom with András Gombosi and Paul Vickers

    04/12/2025 | 34min

    The evolution of Intercom’s data infrastructure reveals how a well-built orchestration system can scale to serve global needs. With thousands of DAGs powering analytics, AI and customer operations, the team’s approach combines technical depth with organizational insight. In this episode, András Gombosi, Senior Engineering Manager of Data Infra and Analytics Engineering, and Paul Vickers, Principal Engineer, both at Intercom, share how they built one of the largest Airflow deployments in production and enabled self-serve data platforms across teams.

    Key Takeaways:
    00:00 Introduction.
    04:24 Community input encourages confident adoption of a common platform.
    08:50 Self-serve workflows require consistent guardrails and review.
    09:25 Internal infrastructure support accelerates scalable deployments.
    13:26 Batch LLM processing benefits from a configuration-driven design.
    15:20 Standardized development environments enable effective AI-assisted work.
    19:58 Applied AI enhances internal analysis and operational enablement.
    27:27 Strong test coverage and staged upgrades protect stability.
    30:36 Proactive observability and on-call ownership improve outcomes.

    Resources Mentioned:
    András Gombosi: https://www.linkedin.com/in/andrasgombosi/
    Paul Vickers: https://www.linkedin.com/in/paul-vickers-a22b76a3/
    Intercom | LinkedIn: https://www.linkedin.com/company/intercom/
    Intercom | Website: https://www.intercom.com
    Apache Airflow: https://airflow.apache.org/
    dbt Labs: https://www.getdbt.com/
    Snowflake Cortex AI: https://www.snowflake.com/en/product/features/cortex/
    Datadog: https://www.datadoghq.com/

    Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

    #AI #Automation #Airflow
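
    The configuration-driven design mentioned at 13:26 can be sketched as a DAG factory that stamps out one batch-processing DAG per config entry, so teams add a config line instead of writing a new DAG. This is a generic Airflow pattern under assumed config fields, not Intercom’s implementation.

```python
# Hypothetical sketch of configuration-driven DAG generation for batch jobs.
from datetime import datetime

from airflow.decorators import dag, task

# Each entry describes one batch job; in practice this might be loaded from YAML.
BATCH_CONFIGS = [
    {"name": "summarize_conversations", "schedule": "@daily", "model": "example-llm-small"},
    {"name": "classify_tickets", "schedule": "@hourly", "model": "example-llm-large"},
]


def build_batch_dag(config: dict):
    @dag(
        dag_id=f"batch_{config['name']}",
        schedule=config["schedule"],
        start_date=datetime(2025, 1, 1),
        catchup=False,
    )
    def batch_dag():
        @task
        def process_batch() -> None:
            # Placeholder for the actual batch LLM call.
            print(f"Running {config['name']} with model {config['model']}")

        process_batch()

    return batch_dag()


# Register one DAG per config entry in the module namespace so Airflow discovers them.
for cfg in BATCH_CONFIGS:
    globals()[f"batch_{cfg['name']}"] = build_batch_dag(cfg)
```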

  • How Covestro Turns Airflow Into a Simulation Toolbox with Anja Mackenzie

    20/11/2025 | 23min

    Building scalable, reproducible workflows for scientific computing often requires bridging the gap between research flexibility and enterprise reliability. In this episode, Anja MacKenzie, Expert for Cheminformatics at Covestro, explains how her team uses Airflow and Kubernetes to create a shared, self-service platform for computational chemistry.

    Key Takeaways:
    00:00 Introduction.
    06:19 Custom scripts made sharing and reuse difficult.
    09:29 Workflows are manually triggered with user traceability.
    10:38 Customization supports varied compute requirements.
    12:48 Persistent volumes allow tasks to share large amounts of data.
    14:25 Custom operators separate logic from infrastructure.
    16:43 Modified triggers connect dependent workflows.
    18:36 UI plugins enable file uploads and secure access.

    Resources Mentioned:
    Anja MacKenzie: https://www.linkedin.com/in/anja-mackenzie/
    Covestro | LinkedIn: https://www.linkedin.com/company/covestro/
    Covestro | Website: https://www.covestro.com
    Apache Airflow: https://airflow.apache.org/
    Kubernetes: https://kubernetes.io/
    Airflow KubernetesPodOperator: https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html
    Astronomer: https://www.astronomer.io/
    Airflow Academy by Marc Lamberti: https://www.udemy.com/user/lockgfg/
    Airflow Documentation: https://airflow.apache.org/docs/
    Airflow Plugins: https://airflow.apache.org/docs/apache-airflow/1.10.9/plugins.html

    Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

    #AI #Automation #Airflow
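
    The persistent-volume pattern from 12:48 might look roughly like the sketch below: two KubernetesPodOperator tasks mount the same claim so large simulation outputs pass between them on disk rather than through XCom. The PVC name, container images and commands are placeholders, not Covestro’s setup.

```python
# Hypothetical sketch: sharing large intermediate data between pods via one PVC.
from datetime import datetime

from airflow.decorators import dag
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
from kubernetes.client import models as k8s

# One shared volume definition reused by both pods.
shared_volume = k8s.V1Volume(
    name="simulation-data",
    persistent_volume_claim=k8s.V1PersistentVolumeClaimVolumeSource(claim_name="example-sim-pvc"),
)
shared_mount = k8s.V1VolumeMount(name="simulation-data", mount_path="/data")


@dag(schedule=None, start_date=datetime(2025, 1, 1), catchup=False)
def chemistry_simulation():
    generate = KubernetesPodOperator(
        task_id="generate_structures",
        image="example-registry/chem-tools:latest",  # placeholder image
        cmds=["python", "generate.py", "--out", "/data/structures"],
        volumes=[shared_volume],
        volume_mounts=[shared_mount],
    )

    simulate = KubernetesPodOperator(
        task_id="run_simulation",
        image="example-registry/chem-tools:latest",  # placeholder image
        cmds=["python", "simulate.py", "--in", "/data/structures"],
        volumes=[shared_volume],
        volume_mounts=[shared_mount],
    )

    generate >> simulate


chemistry_simulation()
```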

  • Building Secure Financial Data Platforms at AgileEngine with Valentyn Druzhynin

    13/11/2025 | 21min

    The use of Apache Airflow in financial services demands a balance between innovation and compliance. AgileEngine’s approach to orchestration showcases how secure, auditable workflows can scale even within the constraints of regulatory environments. In this episode, Valentyn Druzhynin, Senior Data Engineer at AgileEngine, discusses how his team leverages Airflow for ETF calculations, data validation and workflow reliability within tightly controlled release cycles.

    Key Takeaways:
    00:00 Introduction.
    03:24 The orchestrator ensures secure and auditable workflows.
    05:13 Validations before and after computation prevent errors.
    08:24 Release freezes shape prioritization and delivery plans.
    11:14 Migration plans must respect managed service constraints.
    13:04 Versioning, backfills and event triggers increase reliability.
    15:08 UI and integration improvements simplify operations.
    18:05 New contributors should start small and seek help.

    Resources Mentioned:
    Valentyn Druzhynin: https://www.linkedin.com/in/valentyn-druzhynin/
    AgileEngine | LinkedIn: https://www.linkedin.com/company/agileengine/
    AgileEngine | Website: https://agileengine.com/
    Apache Airflow: https://airflow.apache.org/
    Astronomer: https://www.astronomer.io/
    AWS Managed Airflow: https://aws.amazon.com/managed-workflows-for-apache-airflow/
    Google Cloud Composer (Managed Airflow): https://cloud.google.com/composer
    Airflow Summit: https://airflowsummit.org/

    Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

    #AI #Automation #Airflow #MachineLearning
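
    One way to picture the pre- and post-computation validations and event triggers mentioned in this episode is the hedged sketch below, which uses Airflow 2.4+ data-aware scheduling so the DAG runs when an upstream dataset updates and fails fast when either validation trips. The dataset URI, checks and values are illustrative assumptions, not AgileEngine’s code.

```python
# Hypothetical sketch: dataset-triggered ETF calculation with checks on both sides.
from datetime import datetime

from airflow.datasets import Dataset
from airflow.decorators import dag, task
from airflow.exceptions import AirflowFailException

# An upstream DAG would declare this dataset in its outlets to trigger this one.
etf_prices = Dataset("warehouse://example/etf_prices")  # hypothetical dataset URI


@dag(schedule=[etf_prices], start_date=datetime(2025, 1, 1), catchup=False)
def etf_calculations():
    @task
    def validate_inputs() -> None:
        # Pre-computation check: fail fast if the source data is incomplete.
        missing_constituents = 0  # assume looked up from the source tables
        if missing_constituents > 0:
            raise AirflowFailException("Constituent prices are missing")

    @task
    def compute_etf_values() -> float:
        # Placeholder for the actual calculation logic.
        return 101.37

    @task
    def validate_outputs(value: float) -> None:
        # Post-computation check: guard against obviously wrong results.
        if value <= 0:
            raise AirflowFailException(f"Computed ETF value {value} is not plausible")

    validated = validate_inputs()
    value = compute_etf_values()
    validated >> value
    validate_outputs(value)


etf_calculations()
```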


About The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI

Welcome to The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI, the podcast where we keep you up to date with insights and ideas propelling the Airflow community forward. Join us each week as we explore the current state, future and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workflow management system to meet the ever-evolving needs of data engineering and AI ecosystems. Podcast webpage: https://www.astronomer.io/podcast/

