Session 16-E

How to Solve Your Data Integration Challenge


Data integration requires solving several distinct problems (global schema construction, extract-transform-load, data cleaning, feature extraction, deduplication, golden record selection, classification, etc.). To solve these challenges, you need to bring a variety of tools to bear, including traditional machine learning, deep learning, large language models (LLMs), rule systems, and conventional analysis techniques. In this talk, I explain the best technology for a number of these problems and conclude that “one size does not fit all”. Hence, end-to-end data integration will require a toolkit of different techniques.

I also believe that the greatest leverage in data integration comes from building up pretrained models in the popular semantic areas (products, suppliers, customers, etc.). This will give you the best head start on most data integration challenges.

Speaker

Michael Stonebraker

Adjunct Professor, MIT; CTO, Tamr

THE CDOIQ SYMPOSIUM HAS BEEN SUPPORTED BY THE SYMPOSIUM SPONSORS, CHIEF DATA OFFICERS AND DATA LEADERS.
