Catboost
6315 ⭐
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Lingvo
2377 ⭐
Emu
1462 ⭐
The write-once-run-anywhere GPGPU library for Rust
Tf Quant Finance
2966 ⭐
High-performance TensorFlow library for quantitative finance.
Nyuziprocessor
1477 ⭐
GPGPU microprocessor architecture
Pycuda
1260 ⭐
CUDA integration for Python, plus shiny features
Neanderthal
961 ⭐
Fast Clojure Matrix Library
Kubernetes Gpu Guide
754 ⭐
This guide should help fellow researchers and hobbyists easily automate and accelerate their deep learning training with their own Kubernetes GPU cluster.
Accelerate
788 ⭐
Embedded language for high-performance array computations
Bindsnet
996 ⭐
Simulation of spiking neural networks (SNNs) using PyTorch.
Arraymancer
906 ⭐
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
Luxcore
768 ⭐
LuxCore source repository
18337
1225 ⭐
18.337 - Parallel Computing and Scientific Machine Learning
Picongpu
526 ⭐
Particle-in-Cell Simulations for the Exascale Era ✨
Stdgpu
640 ⭐
stdgpu: Efficient STL-like Data Structures on the GPU
Bayadera
350 ⭐
High-performance Bayesian Data Analysis on the GPU in Clojure
Trisycl
388 ⭐
Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group
Cuda API Wrappers
426 ⭐
Thin C++-flavored wrappers for the CUDA Runtime API
Hipsycl
488 ⭐
Multi-backend implementation of SYCL for CPUs and GPUs
Clojurecl
275 ⭐
ClojureCL is a Clojure library for parallel computations with OpenCL.
Opt
244 ⭐
Opt DSL
Blendluxcore
417 ⭐
Blender Integration for LuxCore
Gpur
222 ⭐
R interface to use GPUs
Huiscliu Tutorials
392 ⭐
Some basic programming tutorials
Vuh
301 ⭐
Vulkan compute for people
Smistad Fast
246 ⭐
A framework for GPU based high-performance medical image processing and visualization
Gpufit
211 ⭐
GPU-accelerated Levenberg-Marquardt curve fitting in CUDA
Clojurecuda
162 ⭐
Clojure library for CUDA development
Accelerate Llvm
136 ⭐
LLVM backend for Accelerate
Clvk
200 ⭐
Experimental implementation of OpenCL on Vulkan
Pelemay
174 ⭐
Pelemay is a native compiler for Elixir that generates SIMD instructions. Generating GPU code is planned.
Ginkgo
191 ⭐
Numerical linear algebra software package
Fastflow
172 ⭐
FastFlow pattern-based parallel programming framework (formerly on sourceforge)
Montecarlomeasurements.jl
204 ⭐
Propagation of distributions by Monte-Carlo sampling: Real number types with uncertainty represented by samples.
Deep Learning In Cloud
436 ⭐
List of Deep Learning Cloud Providers
Awesome Webgpu
442 ⭐
😎 Curated list of awesome things around WebGPU ecosystem.
Openclga
109 ⭐
A Python Library for Genetic Algorithm on OpenCL
Pysnn
163 ⭐
Efficient Spiking Neural Network framework, built on top of PyTorch for GPU acceleration
Goofit
107 ⭐
Code repository for the massively-parallel framework for maximum-likelihood fits, implemented in CUDA/OpenMP
Deepnet
101 ⭐
Deep.Net machine learning framework for F#
Cuda By Example Source Code For The Book's Examples
142 ⭐
CUDA by Example, written by two senior members of the CUDA software platform team, shows programmers how to employ this new technology. The authors introduce each area of CUDA development through working examples.
Tensile
109 ⭐
Stretching GPU performance for GEMMs and tensor contractions.
Cekirdekler
78 ⭐
Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated fast while using the same kernel on all devices (for simplicity).
Petibm
72 ⭐
PetIBM - toolbox and applications of the immersed-boundary method on distributed-memory architectures
Sushi2
60 ⭐
Matrix Library for JavaScript
Rtorch
79 ⭐
PyTorch bindings for R
Heteroflow
71 ⭐
Concurrent CPU-GPU Programming using Task Models
Opencl Examples
102 ⭐
Simple OpenCL examples for exploiting GPU computing
Etaler
63 ⭐
A flexible HTM (Hierarchical Temporal Memory) framework with full GPU support.
Kernelabstractions.jl
154 ⭐
Heterogeneous programming in Julia
Rbcuda
56 ⭐
CUDA bindings for Ruby
Gpuclothsimulationinunity
101 ⭐
Trying to replicate what this legend did: https://youtu.be/kCGHXlLR3l8
Kernel_tuner
93 ⭐
Kernel Tuner
Claymore
147 ⭐
Gpuowl
59 ⭐
GPU Mersenne primality test.
Raspberrypi_tempmon
66 ⭐
Raspberry Pi CPU temperature monitor with many functions such as logging, GPIO output, graphing, email, alarm, notifications and stress testing. Python 3.
Gcngemm
44 ⭐
Optimized half precision gemm assembly kernels (deprecated due to ROCm)
Sixtyfour
44 ⭐
How fast can we brute force a 64-bit comparison?
Svenssonjoel Obsidian
37 ⭐
Obsidian Language Repository
Fractional_differencing_gpu
45 ⭐
Rapid large-scale fractional differencing with RAPIDS to minimize memory loss while making a time series stationary. 6x-400x speed up over CPU implementation.
Autodock Gpu
131 ⭐
AutoDock for GPUs and other accelerators
Opensbli
49 ⭐
A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures.
Nvidia_libs_test
39 ⭐
Tests and benchmarks for cuDNN (and in the future, other NVIDIA libraries)
Gpu Cluster Config
41 ⭐
How to Configure a GPU Cluster Running Ubuntu Linux
Gpu Utils
73 ⭐
A set of utilities for monitoring and customizing GPU performance
Cuda_memtest
63 ⭐
Fork of CUDA GPU memtest 👓
Hiperc
31 ⭐
High Performance Computing Strategies for Boundary Value Problems
Fast Tsetlin Machine In Cuda With Imdb Demo
27 ⭐
A CUDA implementation of the Tsetlin Machine based on bitwise operators
Brian2genn
36 ⭐
Brian 2 frontend to the GeNN simulator
Aer Engine
26 ⭐
♒ An OpenGL 4.3 / C++ 11 rendering engine oriented towards animation.
Raytramp
39 ⭐
Shooting and bouncing rays method for radar cross-section calculations, accelerated with BVH algorithm running on GPU (C++ AMP).
Xlearning Gpu
22 ⭐
qihoo360 xlearning with GPU support; AI on Hadoop
Rlan Notebooks
23 ⭐
A docker-based starter kit for machine learning via jupyter notebooks. Designed for those who just want a runtime environment and get on with machine learning. Docker tags:
Vulkan Compute Example
39 ⭐
Simple example of using Vulkan for GPGPU computing
Euler2d_cudafortran
20 ⭐
2nd order Godunov solver for 2d Euler equations written in CUDA Fortran - Deprecated - see instead https://github.com/pkestene/euler2d_kokkos
Gpuvmem
25 ⭐
GPU Framework for Radio Astronomical Image Synthesis
Gpuhd
28 ⭐
Massively Parallel Huffman Decoding on GPUs
Gardenia
22 ⭐
GARDENIA: Graph Analytics Repository for Designing Efficient Next-generation Accelerators
G3
37 ⭐
G3: A Programmable GNN Training System on GPU
Orbital Framework
32 ⭐
Graphics / Video, Audio and Input frameworks. (Agnostic / Portable / Easy / Powerful / Fast)
Anydsl Runtime
17 ⭐
AnyDSL Runtime Library
Lvarray
20 ⭐
Portable HPC Containers (C++)
Llnl Care
19 ⭐
CHAI and RAJA provide an excellent base on which to build portable codes. CARE expands that functionality, adding new features such as loop fusion capability and a portable interface for many numerical algorithms. It provides all the basics for anyone wanting to write portable code.
Vulkanfft
26 ⭐
Fast Fourier Transform using the Vulkan API
Openph
14 ⭐
Parallel reduction of boundary matrices for Persistent Homology with CUDA
Lluvia
30 ⭐
A real-time computer vision engine implemented on top of Vulkan API.
Windflow
28 ⭐
A C++17 Data Stream Processing Parallel Library for Multicores and GPUs
Qcuda
25 ⭐
qCUDA: GPGPU Virtualization at a New API Remoting Method with Para-virtualization
Learn Gpgpu
29 ⭐
Algorithms implemented in CUDA + resources about GPGPU
Anvilkit
11 ⭐
AnvilKit tames Metal. Very much WIP.
Tensorforce Client
12 ⭐
TensorForce-Client: Running Parallelized Reinforcement Learning Experiments in the Cloud
Qmc
15 ⭐
A Quasi-Monte-Carlo Integrator Library with CUDA Support
Multians
22 ⭐
Massively Parallel ANS Decoding on GPUs
Spatialcl
15 ⭐
Library for the GPU-accelerated spatial indexing and processing of particles in 2D and 3D with OpenCL. Currently offers trees based on space-filling-curves.
Akarirender
57 ⭐
High Performance CPU/GPU Physically Based Renderer
Cppbenchgpu
19 ⭐
Comparing GPGPU approaches: OpenCL, GLSL, Vulkan, Halide, SYCL.
Semiproctex
21 ⭐
Semi-procedural textures using PPTBF (Point Process Texture Basis Functions).
Taskflow
6308 ⭐
A General-purpose Parallel and Heterogeneous Task Programming System
Pai
2248 ⭐
Resource scheduling and cluster management for AI
Vulkan Kompute
737 ⭐
General-purpose GPU compute framework for cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing use cases based on Vulkan compute. Backed by the Linux Foundation.