541 Open Source Data Mining Software Projects
Free and open source data mining code projects including engines, APIs, generators, and tools.
Ml From Scratch 20767 ⭐
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Awesome Datascience 17836 ⭐
An awesome Data Science repository for learning data science and applying it to real-world problems.
Microsoft Lightgbm 13406 ⭐
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Python Machine Learning Book 11468 ⭐
The "Python Machine Learning (1st edition)" book code repository and info resource
Jaidedai Easyocr 13651 ⭐
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Awesome Production Machine Learning 10782 ⭐
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Catboost 6315 ⭐
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Mlxtend 3773 ⭐
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Ai Learn 4521 ⭐
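An artificial intelligence learning roadmap that collects nearly 200 hands-on case studies and projects, with free accompanying course materials, taking learners from zero background to job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, data mining, data science, deep learning, computer vision, natural language processing, algorithms, PyTorch, TensorFlow, TensorFlow 2, Caffe, Keras, NumPy, pandas, Matplotlib, and seaborn.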
Pdftabextract 1969 ⭐
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
Awesome Machine Learning Interpretability 2447 ⭐
A curated list of awesome machine learning interpretability resources.
Awesome Ts Anomaly Detection 2079 ⭐
List of tools & datasets for anomaly detection on time-series data.
Patmartin Dex 1272 ⭐
Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
Tsv Utils 1297 ⭐
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Ail Framework 1166 ⭐
AIL framework - Analysis Information Leak framework. Project moved to https://github.com/ail-project
Papers Literature Ml Dl Rl Ai 1619 ⭐
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Dataflowjavasdk 859 ⭐
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Clevercsv 957 ⭐
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
Awesome Ai Books 970 ⭐
Some awesome AI-related books and PDFs for learning and downloading, plus some playground models to learn with.
Interpretable_machine_learning_with_python 587 ⭐
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
Vectorbt 1602 ⭐
Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
Cookbook 2nd Code 613 ⭐
Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Book Socialmediaminingpython 489 ⭐
Companion code for the book "Mastering Social Media Mining with Python"
Feature Engineering And Feature Selection 676 ⭐
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
Cogcomp Nlp 432 ⭐
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
Artificial Adversary 360 ⭐
🗣️ Tool to generate adversarial text examples and test machine learning models against them
Text_mining_resources 452 ⭐
Resources for learning about Text Mining and Natural Language Processing
Knowage Server 317 ⭐
Knowage is the professional open source suite for modern business analytics over traditional sources and big data systems.
Graph Adversarial Learning Literature 525 ⭐
A curated list of adversarial attacks and defenses papers on graph-structured data.
Statistical Learning 230 ⭐
Lecture Slides and R Sessions for Trevor Hastie and Rob Tibshirani's "Statistical Learning" Stanford course
Scriptsmith Reaper 282 ⭐
Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Automlpipeline.jl 292 ⭐
A package that makes it trivial to create and evaluate machine learning pipeline architectures.
Game Datasets 344 ⭐
A curated list of awesome game datasets, and tools for artificial intelligence in games
Smartproxy Smartproxy 229 ⭐
HTTP(S) Rotating Residential proxies - Code examples & General information
Qminer 214 ⭐
Analytic platform for real-time large-scale streams containing structured and unstructured data.
Prefixspan Py 269 ⭐
The shortest yet efficient Python implementation of the sequential pattern mining algorithm PrefixSpan, closed sequential pattern mining algorithm BIDE, and generator sequential pattern mining algorithm FEAT.
Suod 305 ⭐
(MLSys '21) An Acceleration System for Large-scale Unsupervised Heterogeneous Outlier Detection (Anomaly Detection)
Urs 423 ⭐
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool written in Python.
Graph Fraud Detection Papers 563 ⭐
A curated list of fraud detection papers using graph information or graph neural networks
Estadistica Con R 258 ⭐
Personal notes on statistics, machine learning, and the R programming language
Pyss3 230 ⭐
A Python package implementing a new interpretable machine learning model for text classification (with visualization tools for Explainable AI)