edwinweber / dbt_duckdb_demo_public

Open Source data engineering demo project using dbt, DuckDB, dlt, Dagster and Metabase. Two storage modes for the delta tables are supported: local and Microsoft Fabric Onelake.

python open-data data-engineering dbt dlt scd2 delta-lake dagster duckdb medallion-architecture microsoft-fabric motherduck

Updated Jun 30, 2026
Python

KaterynaD / dbt_scd2_plus

Slowly Changing Dimension Type 2 (scd2) custom materialization

dbt scd2

Updated Apr 6, 2026
Shell

spatil6 / ETL-SCD2

SCD2 implementation using pyspark

pyspark datawarehouse scd2

Updated Mar 18, 2018
Jupyter Notebook

akshayush / SCD2-Implementation--using-pyspark

SCD2 implementation using pyspark

pyspark scd2 multiday multiday-scd2

Updated Mar 10, 2019
Jupyter Notebook

ai-tech-karthik / banking-data-pipeline

A modern banking data pipeline built with Dagster and DBT!

python data-engineering dbt databricks data-quality data-lineage scd2 dagster incremental-processing duckdb modendatastack

Updated Jan 31, 2026
Python

AlexMajiA / Ecommerce-analytics-platform

ELT pipeline with dbt & Snowflake on Olist dataset. Medallion architecture, dimensional modeling, SCD2, RFM and Cortex AI sentiment analysis.

aws-s3 snowflake dbt elt scd2 medallion-architecture cortexai

Updated Jun 10, 2026
PLpgSQL

emudamah0906 / polaris-claims-lakehouse

P&C insurance claims lakehouse: Azure ADLS + Databricks (PySpark/Delta) + Snowflake + dbt, real-time FNOL fraud signals via Kafka, Airflow-orchestrated, Terraform-provisioned, OIDC-secured, with data contracts, lineage, and ADRs throughout.

Updated Jun 29, 2026
Python

shivaranjanka / snowflake-healthcare-pipeline

Advanced Healthcare Claims Pipeline using Snowflake, Snowpipe, Streams, Tasks, SCD Type 2, and AWS S3. Automates ingestion, CDC, dimensional modeling, and data quality checks for healthcare patient and claims data.

aws cloud sql analytics tasks snowflake streams data-engineering healthcare cdc data-pipeline scd2 snowpipe

Updated Nov 10, 2025

Mohameddfxxcxx / global-horizon-bank-dwh-project

Fortune-500-grade banking analytics platform: OLTP -> medallion lakehouse -> Kimball star schema -> semantic layer -> 9-tab executive dashboard + 5 ML models (churn, fraud, segmentation, forecasting). Production-ready, governed, fully tested.

Updated Apr 30, 2026
Python

Mairondc21 / pipeline_delta_s3

Pipeline 100% Open Source

docker airflow s3 pyspark cicd boto3 ruff datahub scd2 delta-lake great-expectations sqlfluff

Updated Mar 19, 2026
Python

sushmakl95 / dbt-bigquery-analytics-platform

Modern data stack reference: dbt + BigQuery + Airflow (Cloud Composer) with medallion layering, SCD2 snapshots, exposures, freshness SLAs, and 45× cost reduction via partition + cluster + incremental tuning.

Updated Apr 23, 2026
Python

DustinPineau / cms_portfolio

End-to-end Medicare data engineering pipeline: API ingestion, PostgreSQL 17, dbt, dimensional modeling (Kimball/SCD2), Apache Airflow orchestration, and Evidence.dev dashboard. Built on a QEMU/KVM Rocky Linux VM.

python cms portfolio sql etl postgresql data-engineering dbt data-pipeline medicare evidence apache-airflow kimball scd2 dimensional-modeling

Updated Apr 28, 2026
PLpgSQL

shukla2015 / Travel_Booking_SCD2_Project

Production-grade parameterized ETL pipeline implementing SCD Type 2 for travel booking data using Databricks, Delta Lake, and ADLS — includes data quality checks, incremental fact table build, Z-Order optimization, and SQL reporting.

etl pyspark databricks scd2 delta-lake azure-data-engineering pydeequ

Updated Apr 6, 2026
Jupyter Notebook

EKOURAOGO / retail-data-warehouse-etl

Three-layer MySQL ETL pipeline: staging, star schema with SCD Type 2, and analytical marts. Python orchestrator, 18 cross-layer consistency tests.

python sql data-warehouse mysql-database star-schema etl-pipeline scd2 dimensional-modeling

Updated Jun 28, 2026
Shell

Cindy-txr / Employee-data-platform

Production-style Data Warehouse project using Airflow + PostgreSQL with CDC event layer, SCD2 modeling, checkpoint-based incremental loading, and idempotent pipelines.

python docker postgres airflow sql kafka analytics data-warehouse data-engineering cdc tel scd2

Updated May 21, 2026
Python

OsamaMustafa32 / Enterprise_Retail_Data_Lakehouse

Batch retail data lakehouse on Databricks: Delta Live Tables (bronze → silver → gold), Unity Catalog, synthetic data generator, and an executive analytics dashboard.

python sql pyspark databricks data-quality-checks etl-pipeline scd2 delta-lake data-lakehouse delta-live-tables unity-catalog medallion-architecture

Updated Apr 2, 2026
Python

arponbiswasanik / api-telemetry

An end-to-end analytics engineering pipeline that transforms raw API telemetry into actionable business metrics. Built with Python, DBT, and DuckDB to model usage, monitor latency, and calculate tiered billing.

python sql power-bi data-analytics dbt scd2 analytics-engineering duckdb

Updated Jun 28, 2026
Python

ZuhairBhati / travel_bookings_pipeline

This is a data engineering pipeline built on Databricks + Delta Lake + PySpark that ingests travel booking and customer master data, applies SCD Type 2 logic, and delivers analytics-ready tables. It includes data quality enforcement, dimension versioning, fact aggregation, and performance tuning.

python analytics travel pyspark data-engineering hospitality notebooks databricks bookings etl-pipeline scd2

Updated Oct 8, 2025
Jupyter Notebook

moniburnejko / snowflake-ingestion-patterns

Reference Snowflake ingestion patterns: streams and tasks, and dynamic tables with SCD2 and deduplication. Provisioned with Terraform, plus a dbt sandbox.

terraform snowflake data-engineering dbt dynamic-tables elt scd2 streams-and-tasks

Updated Jun 8, 2026
HCL

szkad / piquillo-bi-platform-peru

End-to-end BI platform for a fictional Peruvian agro-exporter of piquillo peppers. SQL Server DW with SCD2, ETL with stored procedures, Power BI dashboard with RLS.

python sql-server etl power-bi data-warehouse data-engineering business-intelligence dax peru kimball star-schema agroindustria scd2

Updated May 24, 2026
TSQL

scd2

38 public repositories match this topic; 20 of them are listed above.
