Top Data Tools Repositories (2026)

Trending Data Repositories

#1 | King of the Hill | Data | C++
Fincept-Corporation/FinceptTerminal
Developer: Fincept Corporation
FinceptTerminal is a modern finance application offering advanced market analytics, investment research, and economic data tools, designed for interactive exploration and data-driven decision-making in a user-friendly environment.
28,307 GitHub stars | 0 boosts (24h) | +49 stars (24h)
RepoRank Score: 33

#2 | Data | Python
scrapy/scrapy
Developer: Scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
63,114 GitHub stars | 0 boosts (24h) | +16 stars (24h)
RepoRank Score: 29

#3 | Data | TypeScript
Openpanel-dev/openpanel
Developer: Openpanel Dev
OpenPanel is an open-source web and product analytics platform, an open-source alternative to Mixpanel with optional self-hosting.
6,119 GitHub stars | 0 boosts (24h) | +35 stars (24h)
RepoRank Score: 27

#4 | Data | TypeScript
resend/react-email
Developer: Alex Attinger
💌 Build and send emails using React
19,440 GitHub stars | 0 boosts (24h) | +6 stars (24h)
RepoRank Score: 26

#5 | Data | Python
adithya-s-k/omniparse
Developer: Adithya S K
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
7,630 GitHub stars | 0 boosts (24h) | +1 star (24h)
RepoRank Score: 23

#6 | Data | TypeScript
any4ai/AnyCrawl
Developer: Any4ai
AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
3,369 GitHub stars | 0 boosts (24h) | +12 stars (24h)
RepoRank Score: 23

#7 | Data | Python
yoheinakajima/activegraph
Developer: Yoheinakajima
An event-sourced reactive graph runtime for long-running, auditable, agentic systems.
394 GitHub stars | 0 boosts (24h) | +6 stars (24h)
RepoRank Score: 16

#8 | Data | Swift
heyderekj/dinky
Developer: Heyderekj
Dinky makes files smaller.
476 GitHub stars | 0 boosts (24h) | +1 star (24h)
RepoRank Score: 16

#9 | Data | Lean
Verified-zkEVM/ArkLib
Developer: Verified Zkevm
Formally Verified Arguments of Knowledge in Lean
312 GitHub stars | 0 boosts (24h) | +0 stars (24h)
RepoRank Score: 15

Recent data blogs

Finding Our Edge

Read the latest insights from the RepoRank editorial team.

Bull Markets Reward Attention. Weak Markets Reward Discovery.

The New Internet Economy For Builders

About Data on RepoRank

Data is at the core of modern software. Every application relies on data to operate, scale, and deliver value. From real-time analytics to large-scale data pipelines, the way data is collected, processed, and used defines how systems perform.

The data ecosystem is vast and constantly evolving. New tools are emerging to handle growing data volumes, improve performance, and simplify complex workflows. From data engineering platforms to analytics tools and machine learning pipelines, developers now have more options than ever.

RepoRank exists to make sense of it.

We surface data tools and open source projects that are gaining real momentum. By analysing GitHub activity, we highlight the technologies developers are actively using to build data-driven systems.

This includes tools for data processing, storage, analytics, visualization, and machine learning workflows.

Whether you are building pipelines, analysing datasets, or developing data-driven applications, this page helps you discover what is worth using.

What You Will Find Here

Trending data repositories with real growth signals
Tools for data processing and transformation
Data pipeline and workflow orchestration tools
Analytics and visualization platforms
Machine learning and data science tools

Every listing reflects real developer activity, not static recommendations.

Why RepoRank Is Different

Most data tool directories focus on established technologies. RepoRank focuses on momentum, which helps you:

Discover emerging data technologies early
Stay current with evolving data workflows
Identify tools that are being actively adopted
Build more modern and scalable data systems

Built for Builders

Data is about scale, insight, and reliability. This page is built for:

Data engineers building pipelines and infrastructure
Developers working with data-driven applications
Analysts exploring new tools and workflows
Teams scaling data systems and analytics

If your product depends on data, this is your discovery layer.

Data FAQs

What are data tools?

Data tools are platforms and frameworks used to collect, process, store, and analyse data. They are essential for building data pipelines and extracting insights.

What is data engineering?

Data engineering focuses on building systems that move and process data. This includes pipelines, storage systems, and infrastructure for handling large datasets.

What is a data pipeline?

A data pipeline is a system that moves data from one place to another, often transforming it along the way. Pipelines are used for analytics, reporting, and machine learning workflows.
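
As a minimal sketch of that definition, the Python example below chains three stages: extract rows from a source, transform them, and load them into a destination. The in-memory CSV string, the function names, and the `popular` threshold of 1,000 stars are all illustrative assumptions, not part of any particular tool; real pipelines typically read from and write to files, databases, or message queues.

```python
import csv
import io
import json

# Assumed sample input; a real pipeline would read from a file, API, or database.
RAW_CSV = """repo,stars
scrapy/scrapy,63114
heyderekj/dinky,476
"""

def extract(text):
    """Read rows from a CSV source (here an in-memory string)."""
    yield from csv.DictReader(io.StringIO(text))

def transform(rows):
    """Convert the star count to an integer and add a derived field."""
    for row in rows:
        stars = int(row["stars"])
        # The 1000-star threshold is an arbitrary value chosen for illustration.
        yield {"repo": row["repo"], "stars": stars, "popular": stars >= 1000}

def load(rows):
    """Serialize each record as one JSON line and return the combined text."""
    return "\n".join(json.dumps(record) for record in rows)

output = load(transform(extract(RAW_CSV)))
print(output)
```

Because each stage is a generator, rows flow through one at a time rather than being held in memory all at once, which is one common way to keep a pipeline scalable.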

What is the difference between data engineering and data science?

Data engineering focuses on building systems to handle data, while data science focuses on analysing data and building models to generate insights.

What are analytics tools?

Analytics tools are used to explore, visualize, and interpret data. They help teams understand trends, performance, and user behaviour.

How does RepoRank identify trending data tools?

RepoRank analyses GitHub data such as star growth, activity, and engagement to surface tools that are gaining real momentum.
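
RepoRank's actual scoring formula is not described on this page. Purely as an illustration of what a star-growth signal could look like, the sketch below computes 24-hour star gain relative to the prior star total, using figures copied from the listing above. The `relative_growth` function and the choice of metric are hypothetical assumptions and should not be read as how the RepoRank Score is calculated.

```python
# Figures copied from the listing: (repository, total stars, stars gained in 24h).
REPOS = [
    ("Fincept-Corporation/FinceptTerminal", 28307, 49),
    ("scrapy/scrapy", 63114, 16),
    ("Openpanel-dev/openpanel", 6119, 35),
    ("yoheinakajima/activegraph", 394, 6),
]

def relative_growth(total_stars, stars_24h):
    """Hypothetical metric: 24h star gain as a fraction of the prior star total."""
    previous = total_stars - stars_24h
    return stars_24h / previous if previous > 0 else 0.0

# Sort from fastest to slowest relative growth.
ranked = sorted(REPOS, key=lambda r: relative_growth(r[1], r[2]), reverse=True)
for name, total, gained in ranked:
    print(f"{name}: {relative_growth(total, gained):.4%}")
```

On these figures, the smallest repository ranks first by relative growth even though FinceptTerminal gained the most stars in absolute terms, which shows how a momentum signal can differ from raw popularity.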

Are open source data tools reliable?

Many open source data tools are widely used in production environments. RepoRank highlights tools with strong activity and community support.

Can I submit my data project?

Yes. You can submit your repository to RepoRank, and it will be tracked and ranked based on activity and growth.

How often are data listings updated?

Listings are updated regularly to reflect real-time GitHub activity, ensuring the most relevant tools are surfaced.

What should I consider when choosing a data tool?

Focus on scalability, performance, ease of integration, community support, and how well the tool fits your data workflows.
