Awesome Dataset Tools

A curated list of awesome dataset tools

Labeling Tools


Images

  • CVAT - Online, interactive video and image annotation tool for computer vision
  • COCO Annotator - Web-based image segmentation tool for object detection, localization and keypoints
  • Scalabel - Versatile and scalable tool that supports various kinds of annotations
  • EVA - Web-based tool for efficient annotation of videos and image sequences, with additional tracking capabilities
  • LOST - Design your own smart Image Annotation process in a web-based environment
  • Boobs - Fast and efficient BBox annotation for your images in YOLO, VOC/COCO formats
  • MuViLab - Tool to help you label videos for computer vision
  • Turkey - Web UI on Amazon Mechanical Turk to crowd-source image segmentation
  • React Image Annotation - An infinitely customizable image tool built on React
  • Point Cloud Annotation Tool - Annotate 3D boxes in point cloud
  • ImageTagger - Open source online platform for collaborative image labeling
  • DeepLabel - A cross-platform image annotation tool for machine learning
  • Visual Object Tagging Tool - An Electron app for building end-to-end object detection models
  • VGG Image Annotator - Standalone image annotator application packaged as a single HTML file
  • SMART - Efficiently build labeled training datasets for supervised machine learning tasks
  • Pixel Annotation Tool - Uses OpenCV's marker-based watershed algorithm to annotate images in directories
  • Pixie - GUI annotation tool that provides bounding box, polygon, and semantic segmentation labeling
  • Turktool - Modern React app for scalable bounding box annotation of images
  • LabelD - Simple image annotation tool for streamlining the overall process
  • Comma Coloring - Adult coloring book for image segmentation
  • LabelImg - Graphical image annotation tool for labeling object bounding boxes in images
  • LCs Finder - Image annotation and object detection tool written in C
  • js-segment-annotator - JavaScript image annotation tool based on image segmentation
  • Cytomine - Analysis of multi-gigapixel images
  • labelme - Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation)
  • SimpleAnnotate - Open source video and image annotation software, currently only for OSX
  • Sloth - Labeling image and video data for computer vision research
  • Fast Annotation Tool - Online platform for collaborative image annotation
  • Anno-Mage - Helps you annotate images by suggesting annotations for 80 object classes
  • MedTagger - Collaborative framework for annotating medical datasets using crowdsourcing
  • OpenLabeling - Labeling in multiple annotation formats
  • Alturos.ImageAnnotation - Collaborative tool for labeling image data for YOLO
  • Yolo_mark - GUI for marking bounding boxes of objects in images
  • imglab - Speed up and simplify the image labeling/annotation process with multiple supported formats
  • OpenLabeler - Open source desktop application for annotating objects
  • UltimateLabeling - A multi-purpose Video Labeling GUI with integrated SOTA detector and tracker

Closed Source

  • DataTorch - Platform for creating and sharing datasets.
  • Labelbox - Platform for data labeling, data management, and data science. Its features include image annotation, bounding boxes, text classification, and more
  • - Image annotation and data management tool that you can use to create image and video datasets
  • Prodigy - Annotation tool for various machine learning tasks such as image classification, entity recognition and intent detection
  • RectLabel - Label images for bounding box object detection and segmentation
  • Lionbridge AI - Quickly annotate thousands of images and videos with relevant tags
  • - Medical image annotation tool for data labeling. Supports DICOM image format for radiology AI
  • Spare5 - Crowdsourcing service for tasks such as data and image annotation, language assessment, and more
  • Hive - Text and image annotation service that helps you create training datasets
  • Figure Eight - Supports audio, computer vision, natural language processing, and other data tasks
  • Dataturks - Image segmentation, named entity recognition (NER) tagging in documents, and POS tagging
  • Playment - Services offered include bounding boxes, points and lines, polygons, semantic segmentation, and more
  • Cogito Tech - Image annotation, content moderation, sentiment analysis, chatbot training
  • OCLAVI - Annotate Bounding Box, Polygon, Circle, Point and Cuboidal annotations with precision
  • Humans in the Loop - Use cases include face recognition, autonomous vehicles, and figure detection
  • WorkAround - Host and annotate data, manage projects, and build datasets alongside top companies
  • TaQadam - On-demand annotation with agents-in-the-loop
  • Zillin - Image annotation service for classification, object detection and segmentation with API access and georeferenced images support.
  • IBM Cloud Annotations - Simple and collaborative image annotation tool for teams and individuals inside the IBM Cloud environment.


Audio

  • Audio Annotator - JavaScript interface for annotating and labeling audio files
  • Dynitag - Web-based collaborative audio annotator tool
  • EchoML - Play, visualize, and annotate your audio files for machine learning

Closed Source

  • Figure Eight - Supports audio, computer vision, natural language processing, and other data tasks

Time Series

  • Curve - An integrated experimental platform for time series data anomaly detection
  • TagAnomaly - Anomaly detection analysis and labeling tool, specifically for multiple time series
  • time-series-annotator - Implements classification tasks for time series.
  • WDK - Tools to facilitate the development of activity recognition applications with wearable devices


Text

  • brat - For all your textual annotation needs
  • doccano - Open source text annotation tool for machine learning practitioners.
  • Inception - A semantic annotation platform offering intelligent annotation assistance
  • NeuroNER - Named-entity recognition using neural networks
  • YEDDA - For annotating chunk/entity/event on text, symbol and even emoji
  • TALEN - Web-based tool for annotating word sequences
  • WebAnno - Web-based annotation tool for a wide range of linguistic annotations
  • MAE - Lightweight, general-purpose natural language annotation tool
  • Anafora - Web-based raw text annotation tool
  • TagEditor - Label dependencies, parts of speech, named entities, and text categories
  • ML-Annotate - Supports binary, multi-label and multi-class labeling of text

Closed Source

  • Hive - Text and image annotation service that helps you create training datasets
  • Figure Eight - Supports audio, computer vision, natural language processing, and other data tasks
  • LightTag - Text annotation tool for teams.



  • Muda - Python library for augmenting annotated audio data

Author: jsbroks