ELT

ELT (extract, load, transform) is similar to ETL (extract, transform, load), but the data is first moved into the target system (often a data lake or cloud data warehouse) in its raw form and then transformed there using the target system's processing power. It's like putting all the ingredients in a big fridge first and deciding how to cook them later.

TECHNICAL DEFINITION

ELT is a data integration pattern in which raw data is extracted from source systems, loaded directly into a scalable data lake or cloud data warehouse, and then transformed in place using the target system's compute resources. Because that compute is elastic, complex transformations can run where the data already lives.
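The extract, load, transform ordering can be illustrated with a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the target warehouse; the source rows, the table names (`raw_orders`, `daily_revenue`), and the function names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical source records, standing in for an API or operational database.
SOURCE_ROWS = [
    ("1001", "alice", " 19.99 ", "2024-01-05"),
    ("1002", "bob", "5.00", "2024-01-05"),
    ("1003", "alice", "7.50", "2024-01-06"),
]


def extract():
    """Extract: pull records from the source without changing them."""
    return list(SOURCE_ROWS)


def load(conn, rows):
    """Load: store the records as-is (all text) in a raw table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders "
        "(order_id TEXT, customer TEXT, amount TEXT, order_date TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)


def transform(conn):
    """Transform: clean and aggregate inside the target, using its SQL engine."""
    conn.execute("DROP TABLE IF EXISTS daily_revenue")
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT order_date,
               ROUND(SUM(CAST(TRIM(amount) AS REAL)), 2) AS revenue
        FROM raw_orders
        GROUP BY order_date
        """
    )


def run_pipeline():
    """Run the three stages in ELT order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, extract())   # raw data lands first
    transform(conn)         # cleaning happens afterwards, in the target
    return conn
```

Note that the untidy `amount` value is loaded untouched; trimming and type casting happen only in the SQL executed by the target. In an ETL pipeline, that cleaning would run before `load`.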

BACKGROUND

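ELT stands for extract, load, transform. It is a variant of the older ETL (extract, transform, load) pattern, in which data is transformed in a separate processing step before being loaded into the target. ELT reverses the last two steps, an ordering that became practical as data lakes and cloud data warehouses made it inexpensive to store raw data and run transformations at scale.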

SYNONYMS & ALIASES

  • Cloud Data Integration
  • Data Lake Processing
  • In-Database Transformation
  • Raw Data Loading

USAGE NOTE

ELT is favored in cloud environments, where storage and compute scale on demand, so raw data can be retained in full and transformed as needs arise.

DEVELOPERS

Organizations developing technology related to ELT.

  • Snowflake

    Snowflake provides a cloud data platform that enables data warehousing, data lakes, and data engineering, often serving as the target for ELT processes where raw data is loaded and then transformed for analytics and AI/ML workloads.

  • Databricks

    Databricks offers a Lakehouse Platform that unifies data, analytics, and AI. Its platform includes tools and frameworks like Delta Live Tables for building robust ELT pipelines, essential for preparing and managing data for machine learning models and AI applications.

  • dbt Labs

    dbt Labs develops dbt (data build tool), which is a key component for the 'Transform' stage in modern ELT pipelines. It enables data teams to transform data in their warehouse using SQL, creating reliable datasets for downstream analytics and AI applications.

  • Fivetran

    Fivetran automates the 'Extract' and 'Load' phases of ELT, connecting to various data sources and loading data into data warehouses or lakes. This streamlined data ingestion is crucial for ensuring fresh and comprehensive data is available for AI engineering.

  • Talend

    Talend provides data integration and data governance solutions, including robust ELT capabilities. Their platform helps organizations extract, load, and transform data from diverse sources, making it ready for advanced analytics and AI initiatives.

  • AWS Glue

    AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. It supports both ETL and ELT patterns, providing managed services for data processing.

  • Informatica

    Informatica offers a comprehensive suite of enterprise data management solutions, including powerful ELT tools. Their platform assists organizations in integrating, cleansing, and transforming data at scale to support AI and analytics initiatives.

  • Airbyte

    Airbyte provides an open-source data integration platform focused on the 'Extract' and 'Load' aspects of ELT. It offers a wide range of connectors to move data from various sources into data warehouses and lakes, serving as a foundation for AI data pipelines.
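The division of labor described above, with ingestion tools handling Extract and Load and SQL-based tools handling Transform, can be sketched without any vendor API. The example below is a hypothetical illustration, not the interface of dbt or any other product listed: it assumes raw data has already been loaded into a `raw_events` table (here in SQLite) and rebuilds a small chain of derived tables from named SELECT statements, in dependency order.

```python
import sqlite3

# Hypothetical transformation "models": a table name plus the SELECT that builds it.
# They are listed in dependency order, so later models may read earlier ones.
MODELS = [
    (
        "stg_events",
        "SELECT CAST(user_id AS INTEGER) AS user_id, "
        "LOWER(TRIM(event)) AS event "
        "FROM raw_events WHERE user_id <> ''",
    ),
    (
        "user_event_counts",
        "SELECT user_id, COUNT(*) AS n_events "
        "FROM stg_events GROUP BY user_id",
    ),
]


def build_models(conn, models=MODELS):
    """Rebuild every derived table from the raw data; safe to run repeatedly."""
    for name, select_sql in models:
        conn.execute(f"DROP TABLE IF EXISTS {name}")
        conn.execute(f"CREATE TABLE {name} AS {select_sql}")


def demo():
    """Simulate an already-loaded raw table, then run the transform layer twice."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_events (user_id TEXT, event TEXT)")
    conn.executemany(
        "INSERT INTO raw_events VALUES (?, ?)",
        [("1", " Click "), ("1", "VIEW"), ("2", "click"), ("", "click")],
    )
    build_models(conn)
    build_models(conn)  # a second run replaces the derived tables, not the raw data
    return conn.execute(
        "SELECT user_id, n_events FROM user_event_counts ORDER BY user_id"
    ).fetchall()
```

Because the raw table is never modified, the derived tables can be dropped and rebuilt whenever the transformation logic changes.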

RELATED TERMS IN MLOPS & DEPLOYMENT