This repository contains h2o.ai based machine learning project implementations and documentation.
1- h2o frame
Pandas is the de facto standard for data manipulation among data scientists. It is fast, but it runs on a single CPU core. The h2o frame is a powerful alternative to pandas: it supports multi-core computation and covers almost the same functions as pandas.
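A minimal sketch of the idea above, assuming the `h2o` Python package and a Java runtime are installed. The toy data and the function name `mean_units_by_city` are made up for illustration and are not part of this repository. `h2o` is imported inside the function so that the file can still be loaded on a machine without h2o.

```python
# Hypothetical toy data; not taken from this repository.
SALES = {
    "city": ["Ankara", "Izmir", "Ankara", "Izmir"],
    "units": [10, 20, 30, 40],
}


def mean_units_by_city(columns):
    """Build an h2o frame from a dict of columns and average units per city."""
    import h2o  # lazy import: starting h2o needs a Java runtime

    h2o.init()  # connect to a local h2o cluster, starting one if needed
    frame = h2o.H2OFrame(columns)  # dict keys become column names
    frame["city"] = frame["city"].asfactor()  # treat city as categorical
    # Rough counterpart of pandas' df.groupby("city")["units"].mean()
    grouped = frame.group_by("city").mean("units").get_frame()
    return grouped.as_data_frame()  # pandas DataFrame when pandas is installed
```

Calling `mean_units_by_city(SALES)` should return one row per city. For real data you would typically load a file with `h2o.import_file(path)` instead of building the frame from a dict.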
GBM dominates Kaggle competitions on tabular data nowadays. h2o covers both XGBoost and its own GBM. This is a gentle introduction to h2o GBM.
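As a rough sketch of what training an h2o GBM looks like, assuming you already have an `H2OFrame` and a running h2o cluster. The helper names, the 80/20 split and the hyperparameter values are illustrative assumptions, not the settings used in this repository.

```python
def feature_columns(all_columns, target):
    """Every column except the target is used as a predictor."""
    return [name for name in all_columns if name != target]


def train_gbm(frame, target):
    """Train an h2o GBM on an H2OFrame; return the model and its test metrics."""
    from h2o.estimators.gbm import H2OGradientBoostingEstimator

    # Hold out 20% of the rows for evaluation
    train, test = frame.split_frame(ratios=[0.8], seed=42)
    model = H2OGradientBoostingEstimator(
        ntrees=100, max_depth=5, learn_rate=0.1, seed=42  # illustrative values
    )
    model.train(
        x=feature_columns(frame.columns, target), y=target, training_frame=train
    )
    return model, model.model_performance(test_data=test)
```

For a classification task, convert the target column with `asfactor()` before training; otherwise h2o treats a numeric target as regression.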
The hottest topic in machine learning is AutoML. Even though manual model design is still regarded as the state of the art, AutoML can often design better models than we can. h2o AutoML covers linear models, tree-based models including random forest and gradient boosting (XGBoost and h2o GBM), and deep learning (regular fully connected neural networks).
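A minimal sketch of an AutoML run, assuming an `H2OFrame` and a running h2o cluster. The function name `run_automl` and the default time budget are assumptions made for this example.

```python
def run_automl(frame, target, max_runtime_secs=300):
    """Let h2o AutoML search for models within a time budget."""
    if max_runtime_secs <= 0:
        raise ValueError("max_runtime_secs must be positive")

    from h2o.automl import H2OAutoML  # lazy import: needs h2o installed

    predictors = [name for name in frame.columns if name != target]
    aml = H2OAutoML(max_runtime_secs=max_runtime_secs, seed=42)
    aml.train(x=predictors, y=target, training_frame=frame)
    # leaderboard ranks every model that was tried; leader is the best one
    return aml.leaderboard, aml.leader
```

The leaderboard is itself an h2o frame, so `leaderboard.head()` shows which algorithm families ended up on top for your data.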
Interpretability and accuracy tend to be inversely related. You cannot deploy unexplainable models to production even if they have high accuracy. Here, LIME explains individual predictions of the models you have built.
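One possible way to connect LIME to an h2o classifier is sketched below. It assumes the `lime`, `pandas` and `h2o` packages are installed, that all features are numeric, and that the model is a classifier. The helper names `make_predict_fn` and `explain_row` are hypothetical, and the notebook in this repository may do this differently.

```python
def make_predict_fn(model, feature_names):
    """Wrap an h2o classifier so LIME can call it on a 2D NumPy array."""

    def predict_fn(rows):
        import h2o
        import pandas as pd

        # Go through pandas so that rows and column names are unambiguous
        frame = h2o.H2OFrame(pd.DataFrame(rows, columns=list(feature_names)))
        predictions = model.predict(frame).as_data_frame()
        # h2o returns the predicted label plus one probability column per class;
        # LIME only wants the probabilities
        return predictions.drop(columns=["predict"]).values

    return predict_fn


def explain_row(model, training_data, feature_names, row, num_features=5):
    """Explain one prediction; returns (feature rule, weight) pairs."""
    from lime.lime_tabular import LimeTabularExplainer

    explainer = LimeTabularExplainer(
        training_data, feature_names=list(feature_names), mode="classification"
    )
    explanation = explainer.explain_instance(
        row, make_predict_fn(model, feature_names), num_features=num_features
    )
    return explanation.as_list()
```

Here `training_data` is a 2D NumPy array of the training features and `row` is a single 1D row from it. LIME calls the wrapped model on thousands of perturbed rows, so each explanation takes noticeable time.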
SHAP offers much deeper explanations of built models than LIME. Still, it comes with a time cost. Use SHAP if you have enough time to analyze your model.
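The sketch below uses h2o's built-in `predict_contributions`, which returns SHAP contributions for tree-based models in recent h2o releases. This is a substitute for the separate `shap` package and may not match what the repository's notebook does. The helper `top_contributions` is a hypothetical convenience for reading one row of the result.

```python
def top_contributions(row, k=3):
    """Rank the features of one contribution row by absolute impact."""
    # h2o appends a "BiasTerm" column; it is not a feature, so leave it out
    features = {name: value for name, value in row.items() if name != "BiasTerm"}
    ranked = sorted(features.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


def shap_contributions(model, frame):
    """Per-row SHAP contributions of an h2o tree model, as a pandas DataFrame."""
    # One column per feature plus "BiasTerm"
    return model.predict_contributions(frame).as_data_frame()
```

For example, `top_contributions(shap_contributions(model, frame).iloc[0].to_dict())` lists the three features that moved the first prediction the most.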
h2o offers faster XGBoost models than regular XGBoost. We will compare the performance of these two XGBoost distributions.
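A simple way to run such a comparison yourself is sketched here, assuming both `h2o` and `xgboost` are installed. The function names and hyperparameters are illustrative assumptions, and a single timed run is only a rough indication because timings vary with data size, hardware and thread settings.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fit_h2o_xgboost(frame, target, ntrees=100):
    """Train h2o's XGBoost on an H2OFrame."""
    from h2o.estimators import H2OXGBoostEstimator

    model = H2OXGBoostEstimator(ntrees=ntrees, max_depth=5, learn_rate=0.1, seed=42)
    predictors = [name for name in frame.columns if name != target]
    model.train(x=predictors, y=target, training_frame=frame)
    return model


def fit_plain_xgboost(features, labels, ntrees=100):
    """Train the regular xgboost classifier on array-like data."""
    import xgboost

    model = xgboost.XGBClassifier(n_estimators=ntrees, max_depth=5, learning_rate=0.1)
    return model.fit(features, labels)
```

Usage would look like `_, h2o_secs = time_call(fit_h2o_xgboost, frame, "label")` and `_, xgb_secs = time_call(fit_plain_xgboost, X, y)`, using the same data and the same number of trees for both.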
I have tested this repository with the following environment configuration. Make sure your environment matches it to avoid environment-related issues.
>>> !python --version
Python 3.6.3
>>> import h2o
>>> h2o.__version__
18.104.22.168
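If you prefer to check your environment from a script instead of a notebook, a small helper along these lines could be used. The function name `environment_report` is an assumption made for this example.

```python
import platform


def environment_report():
    """Collect the Python and h2o versions of the current environment."""
    try:
        import h2o

        h2o_version = h2o.__version__
    except ImportError:
        h2o_version = None  # h2o is not installed in this environment
    return {"python": platform.python_version(), "h2o": h2o_version}


if __name__ == "__main__":
    print(environment_report())
```

Compare the printed values with the versions listed above before running the notebooks.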
There are many ways to support a project - starring the GitHub repo is one.
This repository is licensed under the MIT license - see LICENSE for more details.