81 Open Source Dask Software Projects
Free and open source Dask code projects including engines, APIs, generators, and tools.
Swifter 1876 ⭐
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
Ironmussa Optimus 1173 ⭐
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Dask Knit 53 ⭐
Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead
Jsybrandt Agatha 48 ⭐
AGATHA: Automatic Graph-mining And Transformer based Hypothesis generation Approach
Opendataanalytics Gaia 29 ⭐
Gaia is a geospatial analysis library jointly developed by Kitware and Epidemico.
Daskperiment 26 ⭐
Reproducibility for Humans: A lightweight tool to perform reproducible machine learning experiments.
Esmlab 23 ⭐
Earth System Model Lab (esmlab). ⚠️⚠️ ESMLab functionality has been moved into <https://github.com/NCAR/geocat-comp>. ⚠️⚠️
Cesm Lens Aws 30 ⭐
Examples of analysis of CESM LENS data publicly available on Amazon S3 (us-west-2 region) using xarray and dask
Arboreto 32 ⭐
A scalable Python-based framework for gene regulatory network inference using tree-based ensemble regressors.
Dvc_dask_use_case 20 ⭐
A use case of a reproducible machine learning pipeline using Dask, DVC, and MLflow.
Mpes 24 ⭐
Distributed data processing routines for multidimensional photoemission spectroscopy (MPES)
Mercat 16 ⭐
MerCat: Python code for versatile k-mer counting and diversity estimation for database-independent property analysis of meta-ome data
Bumblebee 108 ⭐
🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
Mars Project Mars 2338 ⭐
Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.
Fugue 488 ⭐
A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask without any rewrites.
Aicsimageio 105 ⭐
Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Python
Dask Pytorch Ddp 42 ⭐
dask-pytorch-ddp is a Python package that makes it easy to train PyTorch models on dask clusters using distributed data parallel.
Eip1559_analysis 11 ⭐
Can we estimate the economic impact of EIP-1559 on miners? This repository tries to estimate the loss in miners' revenue from transaction fees, using historical Ethereum data.
Adaptive Scheduler 12 ⭐
Run many functions (adaptively) on many cores (>10k) using mpi4py.futures, ipyparallel, or dask-mpi. 🎉