929 Open Source Bioinformatics Software Projects
Free and open source bioinformatics code projects including engines, APIs, generators, and tools.
Deepvariant 2087 ⭐
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
Awesome Single Cell 1498 ⭐
Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.
Fastp 850 ⭐
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Multiqc 655 ⭐
Aggregate results from bioinformatics analyses across many samples into a single report.
Cromwell 605 ⭐
Scientific workflow engine designed for simplicity & scalability. Transition trivially from one-off use cases to massive-scale production environments.
Getting Started With Genomics Tools And Resources 495 ⭐
Unix, R and python tools for genomics and data science
Biojava 410 ⭐
📖🔬☕ BioJava is an open-source project dedicated to providing a Java library for processing biological data.
Combine Lab Salmon 406 ⭐
🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
Edlib 267 ⭐
Lightweight, super fast C/C++ (& Python) library for sequence alignment using edit (Levenshtein) distance.
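For readers unfamiliar with the term, edit (Levenshtein) distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one sequence into another. The sketch below is a plain dynamic-programming illustration in Python. It is not Edlib's implementation or API, and the function name `levenshtein` is hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b, keeping only two DP rows."""
    # prev[j] = distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

This runs in time proportional to the product of the two sequence lengths, which is why dedicated libraries matter for long reads.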
Homebrew Bio 221 ⭐
🍺🔬 Bioinformatics formulae for the Homebrew package manager (macOS and Linux)
Single Cell Pseudotime 216 ⭐
An overview of algorithms for estimating pseudotime in single-cell RNA-seq data
Dnafeaturesviewer 213 ⭐
👁 Python library to plot DNA sequence features (e.g. from GenBank files)
Awesome Cancer Variant Databases 192 ⭐
A community-maintained repository of cancer clinical knowledge bases and databases focused on cancer variants.
Goleft 164 ⭐
goleft is a collection of bioinformatics tools distributed under MIT license in a single static binary
Seqan3 168 ⭐
The modern C++ library for sequence analysis. Contains version 3 of the library and API docs.
Clairvoyante 141 ⭐
Clairvoyante: a multi-task convolutional deep neural network for variant calling in Single Molecule Sequencing
Ragoo 143 ⭐
Fast Reference-Guided Scaffolding of Genome Assembly Contigs. RagTag, the successor to RaGOO, is now available here: https://github.com/malonge/RagTag
Mixcr 138 ⭐
MiXCR is a universal software for fast and accurate extraction of T- and B- cell receptor repertoires from any type of sequencing data. Free for academic use only.
Bioinformatics Centre Kaiju 130 ⭐
Fast taxonomic classification of metagenomic sequencing reads using a protein reference database
Dash.jl 173 ⭐
Sanger Pathogens Artemis 128 ⭐
Artemis is a free genome viewer and annotation tool that allows visualization of sequence features and the results of analyses within the context of the sequence, and its six-frame translation
Hgvs 128 ⭐
Python library to parse, format, validate, normalize, and map sequence variants. `pip install hgvs`
Krakenuniq 116 ⭐
🐙 KrakenUniq: Metagenomics classifier with unique k-mer counting for more specific results
Awesome Bioinformatics Benchmarks 125 ⭐
A curated list of bioinformatics bench-marking papers and resources.
Mrbayes 115 ⭐
MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. For documentation and downloading the program, please see the home page:
Somalier 115 ⭐
fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs... "like damn that is one smart wine guy"
Cgranges 109 ⭐
A C/C++ library for fast interval overlap queries (with a "bedtools coverage" example)
Pegasus Isi Pegasus 104 ⭐
Pegasus Workflow Management System - Automate, recover, and debug scientific computations.
Hicexplorer 103 ⭐
HiCExplorer is a powerful and easy to use set of tools to process, normalize and visualize Hi-C data.
Bioconvert 100 ⭐
Bioconvert is a collaborative project to facilitate the interconversion of life science data from one format to another.
Peddy 97 ⭐
genotype :: ped correspondence check, ancestry check, sex check. directly, quickly on VCF
Sorgerlab Indra 96 ⭐
INDRA (Integrated Network and Dynamical Reasoning Assembler) is an automated model assembly system interfacing with NLP systems and databases to collect knowledge, and through a process of assembly, produce causal graphs and dynamical models.
Buddysuite 93 ⭐
Bioinformatics toolkits for manipulating sequence, alignment, and phylogenetic tree files
Clusterflow 87 ⭐
A pipelining tool to automate and standardise bioinformatics analyses on cluster environments.
Awesome 10x Genomics 84 ⭐
List of tools and resources related to the 10x Genomics GEMCode/Chromium system
Decontam 82 ⭐
Simple statistical identification and removal of contaminants in marker-gene and metagenomics sequencing data
Biojupies 81 ⭐
Automated generation of tailored bioinformatics Jupyter Notebooks via a user interface.
Swne 83 ⭐
Similarity Weighted Nonnegative Embedding (SWNE), a method for visualizing high dimensional datasets
Plip 86 ⭐
Protein-Ligand Interaction Profiler - Analyze and visualize non-covalent protein-ligand interactions in PDB files according to 📝 Salentin et al. (2015), https://www.doi.org/10.1093/nar/gkv315
Obofoundry.github.io 82 ⭐
Metadata and website for the Open Bio Ontologies Foundry Ontology Registry
Saber 78 ⭐
Saber is a deep-learning based tool for information extraction in the biomedical domain. Pull requests are welcome! Note: this is a work in progress. Many things are broken, and the codebase is not stable.
Oswitch 75 ⭐
Provides access to complex Bioinformatics software (even BioLinux!) in just one command.
Immunarch 86 ⭐
🧬 Immunarch by ImmunoMind: R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
Coursera Specializations 72 ⭐
Solutions to assignments of Coursera Specializations - Deep learning, Machine learning, Algorithms & Data Structures, Image Processing and Python For Everybody
Edamontology 73 ⭐
EDAM is an ontology of bioinformatics types of data including identifiers, data formats, operations and topics.
Pydna 70 ⭐
Double stranded DNA & simulation of homologous recombination, Gibson assembly, cut&paste cloning in Python and Jupyter notebooks
Hickit 64 ⭐
TAD calling, phase imputation, 3D modeling and more for diploid single-cell Hi-C (Dip-C) and general Hi-C
Awesome Expression Browser 60 ⭐
😎 A curated list of software and resources for exploring and visualizing (browsing) expression data 😎
Gubbins 61 ⭐
Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins
Globalbioticinteractions 62 ⭐
Global Biotic Interactions provides access to existing species interaction datasets
Hahnlab Cafe 61 ⭐
Analyze changes in gene family size and provide a statistical foundation for evolutionary inferences.
Graph Network Explainability 65 ⭐
Explainability techniques for Graph Networks, applied to a synthetic dataset and an organic chemistry task. Code for the workshop paper "Explainability Techniques for Graph Convolutional Networks" (ICML19)
Opengene.jl 59 ⭐
(No maintenance) OpenGene, core libraries for NGS data analysis and bioinformatics in Julia
Epiviz 59 ⭐
EpiViz is a scientific information visualization tool for genetic and epigenetic data, used to aid in the exploration and understanding of correlations between various genome features.
Grabseqs 60 ⭐
A utility for easy downloading of reads from next-gen sequencing repositories like NCBI SRA
Llevar Butler 58 ⭐
Butler is a framework for running scientific workflows on public and academic clouds.
Snprelate 57 ⭐
R package: parallel computing toolset for relatedness and principal component analysis of SNP data (Development Version)
Clair 60 ⭐
Clair: Exploring the limit of using deep neural network on pileup data for germline variant calling
Honeybadger 58 ⭐
HMM-integrated Bayesian approach for detecting CNV and LOH events from single-cell RNA-seq data
Sigminer 55 ⭐
🌲 An easy-to-use and scalable toolkit for genomic alteration signature (a.k.a. mutational signature) analysis and visualization in R
Palimpsest 54 ⭐
An R package for studying mutational signatures and structural variant signatures along clonal evolution in cancer.
Fastv 60 ⭐
An ultra-fast tool for identification of SARS-CoV-2 and other microbes from sequencing data. This tool can be used to detect viral infectious diseases, like COVID-19.
Lanl Bioinformatics Edge 51 ⭐
EDGE is a highly adaptable bioinformatics platform that allows laboratories to quickly analyze and interpret genomic sequence data.
Reg Gen 54 ⭐
Regulatory Genomics Toolbox: Python library and set of tools for the integrative analysis of high throughput regulatory genomics data.
Openanno Bget 55 ⭐
Portable command-line tool to query bioinformatics APIs, data, databases and files.
Physicell 50 ⭐
PhysiCell: Scientist end users should use latest release! Developers please fork the development branch and submit PRs to the dev branch. Thanks!
Bioperl6 49 ⭐
Reimplementation of BioPerl classes in Raku (i.e. the language formerly known as Perl 6)
Tibanna 51 ⭐
Tibanna helps you run your genomic pipelines on Amazon cloud (AWS). It is used by the 4DN DCIC (4D Nucleome Data Coordination and Integration Center) to process data. Tibanna supports CWL/WDL (w/ docker), Snakemake (w/ conda) and custom Docker/shell command.
Sample 49 ⭐
Performs memory-efficient reservoir sampling on very large input files delimited by newlines
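Reservoir sampling draws a uniform random sample of k items from a stream of unknown length while holding only k items in memory. The sketch below shows the classic "Algorithm R" variant in Python. It is an illustration of the general technique, not the code of the tool listed above, and the name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(lines: Iterable[str], k: int,
                     seed: Optional[int] = None) -> List[str]:
    """Return up to k lines chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    reservoir: List[str] = []
    for n, line in enumerate(lines):
        if n < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Keep the new line with probability k / (n + 1).
            j = rng.randint(0, n)  # both bounds inclusive
            if j < k:
                reservoir[j] = line
    return reservoir
```

Because the input is consumed one line at a time, an open file handle can be passed directly, for example `reservoir_sample(open("reads.txt"), 1000)` (the file name is a placeholder).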
Terpene Profile Parser For Cannabis Strains 53 ⭐
Parser and database to index the terpene profile of different strains of Cannabis from online databases
Deepchrome 48 ⭐
Bioinformatics16: DeepChrome: Deep-learning for predicting gene expression from histone modifications
Cute Nucleotides 48 ⭐
Cute tricks for SIMD vectorized binary encoding and decoding of nucleotides, in Rust.
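As background on what "binary encoding of nucleotides" means, each of the four DNA bases can be stored in two bits, so a sequence packs into a quarter of its ASCII size. The Python sketch below shows a scalar version of that idea only. It is not the SIMD Rust code from the project, and the bit assignment (A=0, C=1, G=2, T=3) and function names are assumptions made for illustration.

```python
# Assumed 2-bit code; other assignments are equally valid.
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASE = "ACGT"


def encode_2bit(seq: str) -> int:
    """Pack an ACGT string into an integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | _CODE[base]  # KeyError on non-ACGT input
    return value


def decode_2bit(value: int, length: int) -> str:
    """Unpack `length` bases from an integer produced by encode_2bit."""
    bases = []
    # Walk from the most significant base down to the least significant.
    for shift in range(2 * (length - 1), -1, -2):
        bases.append(_BASE[(value >> shift) & 0b11])
    return "".join(bases)
```

The length has to be stored alongside the packed value, since leading `A` bases encode as zero bits.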
Awesome Bioie 46 ⭐
🧫 A curated list of resources relevant to doing Biomedical Information Extraction (including BioNLP)
Ococo 43 ⭐
Ococo: the first online variant and consensus caller. Call genomic consensus directly from an unsorted SAM/BAM stream.
Ucscxenatools 45 ⭐
📦 An R package for accessing genomics data from the UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq https://cran.r-project.org/web/packages/UCSCXenaTools/
Tskit 41 ⭐
tskit provides a C and Python API for accessing, analysing and creating tree sequences - a highly efficient way of storing a set of related DNA sequences.
Verifybamid 44 ⭐
VerifyBamID2: A robust tool for DNA contamination estimation from sequence reads using an ancestry-agnostic method.
Neodti 41 ⭐
NeoDTI: Neural integration of neighbor information from a heterogeneous network for discovering new drug-target interactions
Vr Spec 40 ⭐
Extensible specification for representing and uniquely identifying biological sequence variation
2020plus 39 ⭐
Classifies genes as an oncogene, tumor suppressor gene, or as a non-driver gene by using Random Forests
P2rank 43 ⭐
P2Rank: Protein-ligand binding site prediction tool based on machine learning. Stand-alone command line program / Java library for predicting ligand binding pockets from protein structure.
Gblastn 41 ⭐
G-BLASTN is a GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST.
Deep Learning For Clustering In Bioinformatics 41 ⭐
Deep Learning-based Clustering Approaches for Bioinformatics
Metaeuk 40 ⭐
MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics
Sigprofilermatrixgenerator 39 ⭐
SigProfilerMatrixGenerator creates mutational matrices for all types of somatic mutations. It allows downsizing the generated mutations to only parts of the genome (e.g., exome or a custom BED file). The tool seamlessly integrates with other SigProfiler tools.
Eutils 35 ⭐
simplified searching, fetching, and parsing records from NCBI using their E-utilities interface
Sigprofilerextractor 45 ⭐
SigProfilerExtractor allows de novo extraction of mutational signatures from data generated in a matrix format. The tool identifies the number of operative mutational signatures, their activities in each sample, and the probability for each signature to cause a specific mutation type in a cancer sample. The tool makes use of SigProfilerMatrixGenerator and SigProfilerPlotting.
Jannovar 37 ⭐
Annotation of VCF variants with functional impact and from databases (executable+library)
Picardmetrics 37 ⭐
🚦 Run Picard on BAM files and collate 90 metrics into one file.
Computationalgenomicsmanual 36 ⭐
Rob's manual for the computational genomics and bioinformatics class.
Tcr 34 ⭐
[DEPRECATED, see https://immunarch.com/] tcR: an R package for immune receptor repertoire advanced data analysis.
Wdlrunr 34 ⭐
Elastic, reproducible, and reusable genomic data science tools from R backed by cloud resources
Fluentdna 33 ⭐
FluentDNA allows you to browse sequence data of any size using a zooming visualization similar to Google Maps. You can use FluentDNA as a standalone program or as a python module for your own bioinformatics projects.
Docker Builds 36 ⭐
📦 🐳 Dockerfiles and documentation on tools for public health bioinformatics
Deepscreen 39 ⭐
DEEPScreen: Virtual Screening with Deep Convolutional Neural Networks Using Compound Images
Biostructures.jl 37 ⭐
A Julia package to read, write and manipulate macromolecular structures (particularly proteins)
Uta 34 ⭐
Universal Transcript Archive: comprehensive genome-transcript alignments; multiple transcript sources, versions, and alignment methods; available as a docker image
Run_dbcan 32 ⭐
Run_dbcan V2, using genomes/metagenomes/proteomes of any organisms (prokaryotes, fungi, plants, animals, viruses) to search for CAZymes.
Bindash 32 ⭐
Fast and precise comparison of genomes and metagenomes (in the order of terabytes) on a typical personal laptop
Cytometry Clustering Comparison 30 ⭐
R scripts to reproduce analyses in our paper comparing clustering methods for high-dimensional cytometry data
Skyhawk 30 ⭐
An Artificial Neural Network-based discriminator for validating clinically significant genomic variants
Peptides 30 ⭐
An R package to calculate indices and theoretical physicochemical properties of peptides and protein sequences.
Seq2science 39 ⭐
Automated preprocessing of Next-Generation Sequencing data, including full (sc)ATAC-seq, ChIP-seq, and RNA-seq workflows. Works equally easily with public and local data.
Seuratv3wizard 29 ⭐
A web-based interactive (wizard style) application to perform a guided single-cell RNA-seq data analysis and clustering based on Seurat v3
Qmflows 29 ⭐
This library tackles the construction and efficient execution of computational chemistry workflows
Montilab Pipeliner 28 ⭐
A flexible Nextflow-based framework for the definition of sequencing data processing pipelines
Biokeen 30 ⭐
A computational library for learning and evaluating biological knowledge graph embeddings
Sirius Libs 28 ⭐
sirius-libs - Metabolomics mass spectrometry framework for molecular formula identification of small molecules written in Java
Seqarray 28 ⭐
Data management of large-scale whole-genome sequence variant calls (Development Version)
Maui 26 ⭐
Multi-omics Autoencoder Integration: Deep learning-based heterogenous data analysis toolkit
Hotsub 25 ⭐
Command line tool to run batch jobs concurrently with ETL framework on AWS or other cloud computing resources
Flashweave.jl 26 ⭐
Inference of microbial interaction networks from large-scale heterogeneous abundance data
Htstream 26 ⭐
A high throughput sequence read toolset using a streaming approach facilitated by Linux pipes
Singlecellhaystack 33 ⭐
Finding surprising needles (=genes) in haystacks (=single cell transcriptome data).
Dieterich Lab Dcc 24 ⭐
DCC uses output from the STAR read mapper to systematically detect back-splice junctions in next-generation sequencing data. DCC applies a series of filters and integrates data across replicate sets to arrive at a precise list of circRNA candidates.
Soap3 Dp 24 ⭐
To tackle the exponentially increasing throughput of Next-Generation Sequencing (NGS), most of the existing short-read aligners can be configured to favor speed at the expense of accuracy and sensitivity. SOAP3-dp, by leveraging the computational power of both CPU and GPU with optimized algorithms, delivers high speed and sensitivity simultaneously.
2017_2018 Single Cell Rna Sequencing Workshop Ucd_ucb_ucsf 26 ⭐
2017_2018 single cell RNA sequencing Workshop UCD_UCB_UCSF
Ibrahimtanyalcin Lexicon 24 ⭐
Data visualization library for creating interactive graphs and dashboards for bioinformatics etc.
Msfragger 25 ⭐
Ultra fast and comprehensive peptide identification in mass spectrometry–based proteomics
Dram 33 ⭐
Distilled and Refined Annotation of Metabolism: A tool for the annotation and curation of function for microbial and viral genomes
Qtip 24 ⭐
Qtip: a tandem simulation approach for accurately predicting read alignment mapping qualities
Afternotes 24 ⭐
Afternotes for the attended courses at Ca' Foscari University, master in Data Management and Analytics.
Argparse2tool 23 ⭐
transparently build CWL and Galaxy XML tool definitions for any script that uses argparse
Find_differential_primers 26 ⭐
Code for design of diagnostic PCR primers, and metabarcoding markers.
Mindthegap 24 ⭐
MindTheGap performs detection and assembly of DNA insertion variants in NGS read datasets with respect to a reference genome.
Primerminer 24 ⭐
R-based batch sequence downloader, with primer development and in silico evaluation capabilities
Covid19kg 23 ⭐
COVID-19 Knowledge Graph: a computable, multi-modal, cause-and-effect knowledge model of COVID-19 pathophysiology
Wittelab Orchid 24 ⭐
A novel management, annotation, and machine learning framework for analyzing cancer mutations
Haploclique 22 ⭐
Viral quasispecies assembly via maximal clique finding. A method to reconstruct viral haplotypes and detect large insertions and deletions from NGS data.
Bioposdep 23 ⭐
Tokenization, sentence segmentation, POS tagging and dependency parsing for biomedical texts (BMC Bioinformatics 2019)
Metacache 23 ⭐
Memory-efficient, fast & precise taxonomic classification system for metagenomic read mapping
Orthoevolution 22 ⭐
An easy to use and comprehensive python package which aids in the analysis and visualization of orthologous genes. 🐵
Lightdock 26 ⭐
Protein-protein, protein-peptide and protein-DNA docking framework based on the GSO algorithm
Tadlib 21 ⭐
A Library to Explore Chromatin Interaction Patterns for Topologically Associating Domains
Platon 30 ⭐
Identification & characterization of bacterial plasmid-borne contigs from short-read draft assemblies.
Stochpy 22 ⭐
StochPy is a versatile stochastic modeling package which is designed for stochastic simulation of molecular control networks
Lightdock Python2.7 20 ⭐
Protein-protein, protein-peptide and protein-DNA docking framework based on the GSO algorithm
Gssplayground 21 ⭐
Lightweight, single-HTML-file Genome Segments playground for visualizing genome feature clusters (gene arrow maps or other features), adding synteny among genome fragments or crosslinks among features, and adding short-read (PE/MP) or long-read (PacBio or Nanopore) mappings or SNPs/indels from VCF (complex SVs are not yet supported). Supports all CIGAR operations in SAM alignments. Almost all features can be modified directly in Chrome by clicking on them.
Course2017 20 ⭐
Genomics lessons for week 4 of the Microbial Diversity course at the Marine Biological Lab in Woods Hole, MA.
Metaomgraph 22 ⭐
MetaOmGraph: a workbench for interactive exploratory data analysis of large expression datasets
Comprna Rattle 20 ⭐
Reference-free reconstruction and error correction of transcriptomes from Nanopore long-read sequencing
Plantinformatics Pretzel 20 ⭐
Tiledb Vcf 21 ⭐
Efficient variant-call data storage and retrieval library using the TileDB storage library.
Immunomind Covid19 22 ⭐
🦠 Regularly updated list of publicly available datasets with single-cell (scRNAseq) and T-cell/antibody immune repertoire (AIRR / RepSeq / immunosequencing) data of COVID-19 patients with SARS-CoV-2.
Seqmaker.jl 19 ⭐
(No maintenance) Next Generation Sequencing Simulation with SNP, Variation and Sequencing Error Integrated
Mageri 19 ⭐
MAGERI - Assemble, align and call variants for targeted genome re-sequencing with unique molecular identifiers
Soapdenovo Trans 20 ⭐
SOAPdenovo-Trans, a de novo transcriptome assembler designed specifically for RNA-Seq. We evaluated its performance on transcriptome datasets from rice and mouse.
Deepchem Workshop 19 ⭐
DeepChem 2017: Deep Learning & NLP for Computational Chemistry, Biology & Nano-materials
Rcpi 18 ⭐
Molecular informatics toolkit with a comprehensive integration of bioinformatics and cheminformatics tools for drug discovery.
Kraken Biom 19 ⭐
Create BIOM-format tables (http://biom-format.org) from Kraken output (http://ccb.jhu.edu/software/kraken/, https://github.com/DerrickWood/kraken).
Drug Drug Interaction Prediction 19 ⭐
Drug-Drug Interaction Prediction Based on Knowledge Graph Embeddings and Convolutional-LSTM Network
Homologene 20 ⭐
🐭 ↔️ 👫 An R package that works as a wrapper to homologene
Gor 20 ⭐
GORpipe is a tool based on a genomic ordered relational architecture and allows analysis of large sets of genomic and phenotypic tabular data using declarative query language, in a parallel execution engine.
Snpbinner 19 ⭐
SNPbinner is a utility for the generation of genotype crossover points and binmaps based on SNP data across recombinant inbred lines.
Team Rosalind Project 18 ⭐
This is the main repository for the HackBio'2020 Virtual Internship Experience ❤️
Introduction To Genomic Analysis 18 ⭐
Welcome to the website and github repository for the Genome Analysis Module. This website will guide the learning experience for trainees in the UBC MSc Genetic Counselling Training Program, as they embark on a journey to learn about analyzing genomes.
Martinsos Opal 17 ⭐
SIMD C/C++ library for massive optimal sequence alignment (local/SW, infix, overlap, global)
Doepipeline 17 ⭐
A python package for optimizing processing pipelines using statistical design of experiments (DoE).
Cwlab 17 ⭐
An open-source framework for simplified deployment of the Common Workflow Language using a graphical web interface
Adversarial Relation Classification 17 ⭐
Unsupervised domain adaptation method for relation extraction
Mview 16 ⭐
MView extracts and reformats the results of a sequence database search or multiple alignment.
Saffrontree 16 ⭐
SaffronTree: Reference free rapid phylogenetic tree construction from raw read data
Plant Disease Identification Using Cnn 21 ⭐
Plant Disease Identification Using Convolutional Neural Network
Aptasuite 16 ⭐
A full-featured bioinformatics software collection for the comprehensive analysis of aptamers in HT-SELEX experiments.
Taeper 16 ⭐
A small python program to simulate a real-time Nanopore sequencing run based on a previous experiment.
Wormanalysis 21 ⭐
Biological Engineering Test Code for Worm Analysis: https://stackoverflow.com/q/37820629/293195
Coursera Bioinformatics 18 ⭐
My solution to Bioinformatics Specialization (Finding Hidden Messages in DNA; Genome Sequencing; Comparing Genes, Proteins, and Genomes; Molecular Evolution; Genomic Data Science and Clustering; Finding Mutations in DNA and Proteins; Bioinformatics Capstone: Big Data in Biology)
Vargeno 15 ⭐
Towards fast and accurate SNP genotyping from whole genome sequencing data for bedside diagnostics.
Rnaseq_titration_results 15 ⭐
Cross-platform normalization enables machine learning model training on microarray and RNA-seq data simultaneously
Gcmodeller 15 ⭐
GCModeller: genomics CAD (Computer-Aided Design) Modeller system in .NET language
Gtaxon 15 ⭐
gTaxon - a fast cross-platform NCBI taxonomy data querying (gi2taxid, taxid2taxon, name2taxid, LCA) tool, with cmd client and REST API server for both local and remote server.
U Track 15 ⭐
Multiple-particle tracking designed to (1) track dense particle fields, (2) close gaps in particle trajectories resulting from detection failure, and (3) capture particle merging and splitting events resulting from occlusion or genuine aggregation and dissociation events
Nf Core Configs 16 ⭐
Config files used to define parameters specific to compute environments at different Institutions
Tailfindr 15 ⭐
An R package for estimating poly(A)-tail lengths in Oxford Nanopore RNA and DNA reads.
Bco_specification 15 ⭐
Repository for support of the IEEE 2791-2020 standard. Please see our home page for communications/publications:
Mapping Iterative Assembler 16 ⭐
Consensus calling (or "reference assisted assembly"), chiefly of ancient mitochondria
Corda 15 ⭐
An implementation of genome-scale model reconstruction using Cost Optimization Reaction Dependency Assessment by Schultz et al.
Bmi219 2017 Proteinfolding 15 ⭐
UCSF BMI219 Deep Learning (2017), Coding example (Prediction of protein folding with RNN and CNN)
Chise.js 14 ⭐
A web application to visualize and edit the pathway models represented by SBGN Process Description Notation
Pcg 14 ⭐
Phylogenetic Component Graph: Haskell program and libraries for general phylogenetic graph search
Nki Ccb Discover 14 ⭐
DISCOVER co-occurrence and mutual exclusivity analysis for cancer genomics data
Awesome Sequencing Tech Papers 19 ⭐
A collection of publications on comparison of high-throughput sequencing technologies.
Plink2 14 ⭐
pLink is a software dedicated for the analysis of chemically cross-linked proteins or protein complexes using mass spectrometry.
Biocommons.seqrepo 15 ⭐
non-redundant, compressed, journalled, file-based storage for biological sequences
Rnftools 13 ⭐
RNF framework for NGS: simulation of reads, evaluation of mappers, conversion of RNF-compliant data.
Nasqar 14 ⭐
NASQAR: A web-based platform for High-throughput sequencing data analysis and visualization
Hibag 13 ⭐
R package – HLA Genotype Imputation with Attribute Bagging (development, unstable version)
Iroki 13 ⭐
Super snazzy online phylogenetic tree viewer with automatic customization using simple, tab-separated text files.
Rna Playground 15 ⭐
Visualize the inner workings of RNA bioinformatics algorithms for structure prediction, interaction prediction and sequence alignment.
Pymod 18 ⭐
PyMod 3 - sequence similarity searches, multiple sequence/structure alignments, and homology modeling within PyMOL.
Phylotoast 12 ⭐
Tools for phylogenetic data analysis including visualization and cluster-computing support.
Scaff10x 13 ⭐
Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads
Partie 14 ⭐
PARTIE is a program to partition sequence read archive (SRA) metagenomics data into amplicon and shotgun data sets. The user-supplied annotations of the data sets cannot be trusted, so PARTIE separates the data automatically.
Gslcore 12 ⭐
Core library and basic plug-ins for the Amyris Genotype Specification Language (GSL) compiler.
Clihelpparser 12 ⭐
Library for reading the output from CLI help commands, and generating machine readable schemas (CWL etc)
Epigenomicstutorial Ismb2017 12 ⭐
Repository for the Epigenomics Tutorial held at ISMB 2017 in Prague
Longbow 13 ⭐
Longbow is a tool for automating simulations on a remote HPC machine. Longbow is designed to mimic the normal way an application is run locally but allows simulations to be sent to powerful machines.
Circompara 12 ⭐
🔬 A multi-method comparative bioinformatics pipeline to detect and study circRNAs from RNA-seq data
Seave 12 ⭐
Seave is a web platform that enables genetic variants to be easily filtered and annotated with in silico pathogenicity prediction scores and annotations from popular disease databases. Seave stores genomic variation of all types and sizes, and allows filtering for specific inheritance patterns, quality values, allele frequencies and gene lists. Seave is open source and deployable locally, or on a cloud computing provider, and works readily with gene panel, exome and whole genome data, scaling from single labs to multi-institution scale.
Circtools 12 ⭐
circtools: a modular, python-based framework for circRNA-related tools that unifies several functionalities in a single, command line driven software.
Libflagstats 12 ⭐
Efficient C functions to compute the summary statistics (flagstats) for sequencing read sets.
Cloudconductor 11 ⭐
CloudConductor is a workflow management system that generates and executes bioinformatics pipelines
Immunedb 11 ⭐
ImmuneDB - A system for the analysis and exploration of high-throughput adaptive immune receptor sequencing data
Deepcovidexplainer 12 ⭐
DeepCOVIDExplainer: Explainable COVID-19 Diagnosis from Chest Radiography Images
Klugem Watchdog 11 ⭐
Workflow management system for the automated and distributed analysis of large-scale experimental data.
Sapporo 11 ⭐
SAPPORO is a workflow and individual task execution system. It is also useful for continuous testing of workflows.
Conekt 11 ⭐
CoNekT (short for Co-expression Network Toolkit) is a platform to browse co-expression data and enable cross-species comparisons.
Gaussdca.jl 11 ⭐
Multivariate Gaussian Direct Coupling Analysis for residue contact prediction in protein families - Julia module
Sequencework 13 ⭐
programs and scripts, mainly python, for analyses related to nucleic or protein sequences
Tangram 11 ⭐
🔲⚛️ A collection of molecular modelling tools for UCSF Chimera
Boecker Lab Sirius 13 ⭐
SIRIUS is a software for discovering a landscape of de-novo identification of metabolites using tandem mass spectrometry. This repository contains the code of the SIRIUS Software (GUI and CLI)
Review 11 ⭐
🌎 Review materials for bioinformatics, computational biology, and science in general
Bio3Dview.jl 13 ⭐
A Julia package to view macromolecular structures in the REPL, in a Jupyter notebook/JupyterLab or in Pluto
Reductionwrappers 11 ⭐
R wrappers to connect Python dimensional reduction tools and single cell data objects (Seurat, SingleCellExperiment, etc...)
Peptools 10 ⭐
PepTools - An Immunoinformatics (Immunological Bioinformatics) R-package for working with peptide data
Dl Based Tumor Classification 12 ⭐
Deep Learning Based Tumor Type Classification Using Gene Expression Data
A Sparse Coding Based Approach For Class Specific Feature Selection 10 ⭐
A novel sparse-coding-based approach to feature selection that emphasizes joint l_1,2-norm minimization and class-specific feature selection.
Sopang 10 ⭐
SOPanG, a simple tool for pattern matching over an elastic-degenerate string, a recently proposed simplified model for the pan-genome.
Synopsys Project 2017 10 ⭐
A deep learning based bioinformatics project on epigenetics in Type 2 Diabetes.
Guix Bimsb 11 ⭐
Packages for GNU Guix that have not yet or will not be submitted upstream for various reasons
Kidera Atchley 10 ⭐
Load biophysicochemical properties of amino acids, Kidera factors and Atchley factors into R and Python in only a single line.
Neurobind 10 ⭐
Yet Another Model Using Neural Networks for Predicting Binding Preferences for Test DNA Sequences
Sequence Database Curator 10 ⭐
This program dereplicates and/or filters nucleotide and/or protein databases from a list of names or sequences (by exact match).
Causalpath 10 ⭐
A project for exploring differentially active signaling paths related to proteomics datasets
Spreading Correction 11 ⭐
Supplementary information to "Computational correction of index switching in multiplexed sequencing libraries", Anton J.M. Larsson, Geoff Stanley, Rahul Sinha, Irving L. Weissman and Rickard Sandberg
Scanpav 10 ⭐
Pipeline to detect PAVs (presence/absence variations) in genome comparison using whole genome alignment.
Srijan Gsoc 2020 12 ⭐
Healthcare-Researcher-Connector Package: Federated Learning tool for bridging the gap between Healthcare providers and researchers
Ilus 17 ⭐
A handy pipeline generator for whole genome re-sequencing (WGS) and whole exome sequencing (WES) data analysis.
Dmrichr 11 ⭐
An executable and package for the statistical analysis and visualization of differentially methylated regions (DMRs) from CpG count matrices (Bismark cytosine reports)
Managing_your_biological_data_with_python_3 11 ⭐
<<Managing Your Biological Data with Python>> written in Python 3
Guobioinfolab Catt 11 ⭐
An ultra-sensitive and precise tool for characterizing T cell CDR3 sequences in TCR-seq and RNA-seq data.
Deeppurpose 195 ⭐
A Deep Learning Toolkit for DTI, Drug Property, PPI, DDI, Protein Function Prediction
Npdtools 10 ⭐
Natural Product Discovery tools -- a toolkit containing various pipelines for in silico analysis of natural product mass spectrometry data
Deweylab Cello 10 ⭐
CellO: Gene expression-based hierarchical cell type classification using the Cell Ontology
bad-slug 15 ⭐
PanPhlAn is a strain-level metagenomic profiling tool for identifying the gene composition of individual strains in metagenomic samples
bad-slug 14 ⭐
Protwis is the backbone of the GPCRdb. The GPCRdb contains reference data, interactive visualisation and experiment design tools for G protein-coupled receptors (GPCRs).
bad-slug 12 ⭐
R/Pharma Conference 2020. Repository for the workshop "Artificial Neural Networks in R with Keras and TensorFlow" by Leon Eyrich Jessen
bad-slug 11 ⭐
Write PubMed search results with two display options (citation or listview) to PDF or Word
bad-slug 10 ⭐
Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes