1030 Open Source Data Science Software Projects
Free and open source data science code projects including engines, APIs, generators, and tools.
Probabilistic Programming And Bayesian Methods For Hackers 21976 ⭐
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
Data Science Ipython Notebooks 19637 ⭐
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Ml From Scratch 18174 ⭐
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Ipython 14431 ⭐
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
Awesome Datascience 14325 ⭐
📝 An awesome Data Science repository to learn from and apply to real-world problems.
Ray Project Ray 13495 ⭐
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Python Machine Learning Book 10914 ⭐
The "Python Machine Learning (1st edition)" book code repository and info resource
Awesome Pytorch List 10843 ⭐
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.
Dive Into Machine Learning 10177 ⭐
Dive into Machine Learning with Python Jupyter notebook and scikit-learn!
Awesome Bigdata 9344 ⭐
A curated list of awesome big data frameworks, resources and other awesomeness.
Mit Deep Learning 7989 ⭐
Tutorials, assignments, and competitions for MIT Deep Learning related courses.
Pytorch Lightning 8901 ⭐
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
Openrefine 7651 ⭐
OpenRefine is a free, open source power tool for working with messy data and improving it
Tpot 7554 ⭐
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
D2l En 7757 ⭐
Interactive deep learning book with code, math, and discussions. Available in multiple frameworks.
Numerical Linear Algebra 7384 ⭐
Free online textbook of Jupyter notebooks for the fast.ai Computational Linear Algebra course
Nni 7302 ⭐
An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Python Machine Learning Book 2nd Edition 6103 ⭐
The "Python Machine Learning (2nd edition)" book code repository and info resource
Industry Machine Learning 5671 ⭐
A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
Roughviz 5499 ⭐
Lazyprogrammer Machine_learning_examples 5533 ⭐
A collection of machine learning examples and tutorials.
Catboost 5446 ⭐
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Boltons 5259 ⭐
🔩 Like builtins, but boltons. 250+ constructs, recipes, and snippets which extend (and rely on nothing but) the Python standard library. Nothing like Michael Bolton.
H2o 3 5016 ⭐
Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Data Analysis And Machine Learning Projects 4836 ⭐
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
Imbalanced Learn 4773 ⭐
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
Blei Lab Edward 4556 ⭐
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
Knowledge Repo 4475 ⭐
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
Nteract Hydrogen 3617 ⭐
Run code interactively, inspect data, and plot. All the power of Jupyter kernels, inside your favorite text editor.
D2l Pytorch 3450 ⭐
This project reproduces the book Dive Into Deep Learning (www.d2l.ai), adapting the code from MXNet into PyTorch.
Machine Learning Roadmap 3696 ⭐
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
Mlxtend 3155 ⭐
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Tensorwatch 2952 ⭐
Debugging, monitoring and visualization for Python Machine Learning and Data Science
Aksnzhy Xlearn 2769 ⭐
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and CLI interfaces.
Rasbt Deep Learning Book 2672 ⭐
Repository for "Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python"
Telegram List 2665 ⭐
List of Telegram groups, channels & bots // List of interesting Telegram groups, channels, and bots // List of chats for programmers
Python Is Cool 2590 ⭐
Cool Python features for machine learning that I used to be too afraid to use. Will be updated as I have more time / learn more.
Dowhy 2305 ⭐
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
Akshare 2334 ⭐
AkShare is an elegant and simple financial data interface library for Python, built for human beings! An open source financial data interface library.
Ai Learn 2361 ⭐
An artificial intelligence learning roadmap that collects nearly 200 hands-on cases and projects, with free accompanying teaching materials, taking readers from zero background to job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, data mining, deep learning, computer vision, natural language processing, algorithms, PyTorch, TensorFlow, TensorFlow 2, Caffe, Keras, NumPy, pandas, Matplotlib, and seaborn.
Eli5 2158 ⭐
A library for debugging/inspecting machine learning classifiers and explaining their predictions
Eugeneyan Applied Ml 4130 ⭐
📚 Papers by organizations sharing their work on applied data science & machine learning.
Awesome Computer Science Opportunities 1981 ⭐
An awesome list of events and fellowship opportunities for Computer Science students