173 Open Source Feature Engineering Software Projects
Free and open source feature engineering code projects including engines, APIs, generators, and tools.
Tpot 8425 ⭐
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
Nni 10910 ⭐
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
Transmogrifai 2089 ⭐
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Autodl 935 ⭐
Automated Deep Learning without ANY human intervention. 1st-place solution for AutoDL.
Kaggle Quora Question Pairs 720 ⭐
Kaggle: Quora Question Pairs, 4th/3396 (https://www.kaggle.com/c/quora-question-pairs)
Hyperparameter_hunter 681 ⭐
Easy hyperparameter optimization and automatic result saving across machine learning algorithms and libraries
Sgx Full Orderbook Tick Data Trading Strategy 913 ⭐
Solutions for high-frequency trading (HFT) strategies using data science approaches (machine learning) on full order book tick data.
Feature Selection 580 ⭐
Feature selector based on a self-selected algorithm, loss function, and validation method
Mljar Supervised 1756 ⭐
Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
Feature Engineering And Feature Selection 676 ⭐
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
Open Solution Home Credit 403 ⭐
Open solution to the Home Credit Default Risk challenge
Awesome Feature Engineering 497 ⭐
A curated list of resources dedicated to Feature Engineering Techniques for Machine Learning
Nlpython 285 ⭐
This repository contains code for Natural Language Processing in Python. All of the code accompanies my book "Python Natural Language Processing".
Hanzi_char_featurizer 222 ⭐
A Chinese character feature extractor (featurizer) that extracts features of Chinese characters (pronunciation features, glyph features) for use as deep learning features
Deep Learning Machine Learning Stock 485 ⭐
Deep Learning and Machine Learning stocks represent a promising long-term or short-term opportunity for investors and traders.
Autofeat 275 ⭐
Linear Prediction Model with Automated Feature Engineering and Selection Capabilities
Remixautoml 179 ⭐
R package for automation of machine learning, forecasting, feature engineering, model evaluation, model interpretation, recommenders, and EDA.
The Building Data Genome Project 137 ⭐
A collection of non-residential buildings for performance analysis and algorithm benchmarking
Home Credit Default Risk 73 ⭐
Default risk prediction for Home Credit competition - Fast, scalable and maintainable SQL-based feature engineering pipeline
Autoencoders_keras 65 ⭐
Automatic feature engineering using deep learning and Bayesian inference with TensorFlow.
Dominance Analysis 105 ⭐
This package can be used for dominance analysis or Shapley Value Regression to find the relative importance of predictors on a given dataset. It can also be used for key driver analysis or marginal resource allocation models.
Gan_keras 48 ⭐
Automatic feature engineering using Generative Adversarial Networks with TensorFlow.
Xiaoganghan Awesome Feature Engineering 48 ⭐
A curated list of feature engineering techniques for image and text machine learning
Drugs Recommendation Using Reviews 41 ⭐
Analyzes drug descriptions, conditions, and reviews, then recommends drugs for each patient health condition using deep learning models.
Feagen 33 ⭐
(deprecated) A fast and memory-efficient Python data engineering framework for machine learning.
Bike Sharing Demand Kaggle 33 ⭐
Top 5th percentile solution to the Kaggle knowledge problem - Bike Sharing Demand
Predicting Transportation Modes Of Gps Trajectories 34 ⭐
Understanding transportation mode from GPS (Global Positioning System) traces is an essential topic in the data mobility domain. In this paper, a framework is proposed to predict transportation modes. This framework follows a sequence of five steps: (i) data preparation, where GPS points are grouped into trajectory samples; (ii) point feature generation; (iii) trajectory feature extraction; (iv) noise removal; (v) normalization. We show that extracting the new point features (bearing rate and the rate of rate of change of the bearing rate) together with global and local trajectory features, such as medians and percentiles, enables many classifiers to achieve high accuracy (96.5%) and F1 (96.3%) scores. We also show that the noise removal task affects the performance of all the models tested. Finally, empirical tests comparing this work against state-of-the-art transportation mode prediction strategies show that our framework is competitive and outperforms most of them.
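As a rough illustration of the bearing-rate point feature named in this description, the sketch below computes bearings between consecutive GPS points and divides the change in bearing by the elapsed time. It is not the project's code: the function names, the `(t_seconds, lat, lon)` input layout, the great-circle bearing formula, and the wrapping of bearing differences to [-180, 180) are all assumptions made for this example.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def wrapped_diff(a, b):
    """Signed difference b - a in degrees, wrapped to [-180, 180) (assumed convention)."""
    return (b - a + 180.0) % 360.0 - 180.0


def bearing_rates(points):
    """Bearing rate in degrees per second for a time-sorted trajectory.

    points: list of (t_seconds, lat, lon) tuples with strictly increasing times
    (hypothetical layout). Returns one value per pair of consecutive segments.
    """
    # Bearing of each segment, stamped with the segment's end time.
    bearings = [
        (t1, bearing_deg(la0, lo0, la1, lo1))
        for (_, la0, lo0), (t1, la1, lo1) in zip(points, points[1:])
    ]
    # Change in bearing between consecutive segments, divided by elapsed time.
    return [
        wrapped_diff(b0, b1) / (t1 - t0)
        for (t0, b0), (t1, b1) in zip(bearings, bearings[1:])
    ]
```

Applying the same differencing step to the output of `bearing_rates` would give a second-order quantity in the spirit of the "rate of rate of change" feature, though the exact definition used by the project is not stated here.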
Fifa 2019 Analysis 25 ⭐
A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries, and related aspects using data analysis and data visualization
Quora Paraphrase Question Identification 20 ⭐
Paraphrase question identification using Feature Fusion Network (FFN).
Predict Household Poverty 22 ⭐
Predict the poverty of households in Costa Rica using automated feature engineering.
Cortana Intelligence Customer360 22 ⭐
This repository contains instructions and code to deploy a customer 360 profile solution on Azure stack using the Cortana Intelligence Suite.
Bubble_plot 20 ⭐
Visualize linear and non-linear connections between numerical/categorical features (2D histogram with bubbles)
Pubmed Best Match 31 ⭐
Machine-learning based pipeline relying on LambdaMART currently used in PubMed for relevance (Best Match) searches
Feature Selection Techniques 37 ⭐
Python source code for the feature selection series on Medium. 📰
Exemplary Ml Pipeline 21 ⭐
Exemplary, annotated machine learning pipeline for any tabular data problem.
Disentangled Attribution Curves 20 ⭐
Using / reproducing DAC from the paper "Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees"
Diamonds In Depth Analysis 17 ⭐
Given a dataset of diamonds with features such as cut, carat, and clarity, I used Pandas, NumPy, Matplotlib, and Seaborn to analyze and estimate diamond prices based on those features, and implemented algorithms with Scikit-learn to increase the effective R2 score.
Titanic Survival In Depth Analysis 12 ⭐
Uses the Pandas, Matplotlib, and Seaborn libraries to analyze, visualize, and explore data on the people travelling on the Titanic, and Scikit-learn modelling algorithms to predict their probability of survival.
Clj Example Nlp Ml 13 ⭐
Example Project for Natural Language Processing and Machine Learning Libraries
Vulcan 15 ⭐
A high level deep learning framework for quickly prototyping networks with added tools in data visualisation, model interpretability and performance metrics
Kivyandroidclassification 18 ⭐
Image classification for Android using an artificial neural network, built with NumPy and Kivy.
Loan Prediction Analytics Vidhya 23 ⭐
The solution to the Loan Prediction Practice Problem on Analytics Vidhya (https://datahack.analyticsvidhya.com/contest/practice-problem-loan-prediction-iii/)
Marcnuth Genetics 15 ⭐
Genetic algorithm in Python that can be used for sampling, feature selection, model selection, etc. in machine learning
Ddiextraction 12 ⭐
Detecting drug-drug interactions (DDI) has become a vital part of public health safety. This project is an implementation of an NLP-based approach for extracting such relations between entities.
Rl_sutton Barto_solutions 18 ⭐
Solutions and figures for problems from "Reinforcement Learning: An Introduction" by Sutton & Barto
Contextfeatureextractor 11 ⭐
A neural text processing Python library for context-based feature extraction on sequence-tagging data.
World Food Production 12 ⭐
Comparing the top food and feed producers around the globe and seeking interesting answers, solutions, patterns, hints, and warnings through data analysis and data visualization using machine learning.
Feature Engineering For Fraud Detection 27 ⭐
Implementation of feature engineering from Feature engineering strategies for credit card fraud