Academy & Industry Research Collaboration Center (AIRCC)

Volume 11, Number 24, December 2021

An Innovative Method to Extract Data in a Real-time Data Warehousing Environment

  Authors

Flavio de Assis Vilela¹ and Ricardo Rodrigues Ciferri², ¹Federal Institute of Goiás, Brazil, ²Federal University of São Carlos, Brazil

  Abstract

ETL (Extract, Transform, and Load) is an essential process for data extraction in knowledge discovery in databases and in data warehousing environments. The ETL process gathers data from operational sources, processes it, and stores it in an integrated data repository. The ETL process can also run in a real-time data warehousing environment, storing data in a data warehouse. This paper presents an innovative method, named Data Extraction Magnet (DEM), that performs the extraction phase of the ETL process in a real-time data warehousing environment, based on the concepts of non-intrusiveness, tags, and parallelism. DEM was validated in the dairy farming domain using synthetic data. The results showed a substantial performance gain over the traditional trigger technique, and DEM met the real-time requirements.

  Keywords

ETL, real-time, data warehousing, data extraction.