
India Reaches 27 Million GitHub Developers, Now the Platform's Fastest-Growing Community
Read the latest insights from the RepoRank editorial team.
Explore the most popular data engineering repositories, pipeline tools, and open source data infrastructure projects. From ETL workflows and orchestration systems to warehousing utilities, streaming platforms, and data platform tooling, discover which data engineering projects are gaining traction on GitHub.


Data engineering is the backbone of modern analytics and data-driven software, making it possible to collect, transform, move, and serve data reliably across systems. Open source repositories play a major role in this ecosystem by providing practical tooling for orchestration, pipelines, warehousing, streaming, and platform design.
The open source data engineering landscape includes ETL and ELT tools, workflow orchestration systems, transformation frameworks, stream processing projects, warehouse utilities, and infrastructure-focused repositories built for scalable data operations. RepoRank helps surface the repositories that are earning real attention and momentum.
This page helps you discover the data engineering tools developers, analytics teams, and platform engineers are actively using, evaluating, and watching.
RepoRank focuses on real GitHub growth signals, helping you identify data engineering repositories that are active, relevant, and gaining adoption across data platform and infrastructure workflows.
Whether you are building reliable pipelines, evaluating orchestration frameworks, or comparing tools, use this page to stay current with the open source projects shaping modern data infrastructure.
Data engineering repositories are open source codebases related to moving, transforming, orchestrating, storing, and serving data across modern systems and analytics workflows.
This page includes ETL and ELT tools, orchestration systems, transformation frameworks, streaming projects, warehouse utilities, and broader open source repositories for data infrastructure.
RepoRank uses real GitHub growth signals such as star growth, activity, and project momentum to surface data engineering projects that are gaining traction.
All featured repositories are open source projects sourced directly from GitHub.
Tracking trending data engineering repositories helps you discover new data platform workflows, compare infrastructure patterns, and evaluate the tools data teams are actively adopting.
Data engineering tools are also useful for startups, product teams, and growing organizations that need reliable pipelines, better analytics foundations, or scalable data workflows.
Data engineering tools focus on data movement, orchestration, transformation, infrastructure, and reliability, while data science tools are generally more focused on analysis, experimentation, modeling, and insight generation.
When choosing a data engineering tool, start with your data stack, scale needs, and workflow. Then weigh maintainability, ecosystem support, orchestration fit, operational complexity, documentation, and how well the repository aligns with your team and infrastructure.