36 Open Source Data Generation Software Projects
Free and open source data generation code projects including engines, APIs, generators, and tools.
Awesome Ai Ml Dl 1005 ⭐
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources on these topics.
Data Augmentation Review 1095 ⭐
List of useful data augmentation resources. Here you will find some uncommon techniques, libraries, links to GitHub repos, papers, and more.
Datahelix 118 ⭐
The DataHelix generator allows you to quickly create data, based on a JSON profile that defines fields and the relationships between them, for the purpose of testing and validation
Smartcat Labs Ranger 55 ⭐
Ranger is a contextual data generator used to create sensible data for integration tests or for experimenting in a database.
Neuralyzer 47 ⭐
Neuralyzer is a library and a command line tool to anonymize databases (by updating existing data or populating a table with fake data)
Fake Data Generator 37 ⭐
Just a small open-source script to create fake data given a simple JSON model.
Imagedataaugmentor 73 ⭐
Custom image data generator for TF Keras that supports the modern augmentation module albumentations
Traffic Sign Recognition Basd On Synthesised Training Data 14 ⭐
Uses synthetic data in combination with deep learning to determine whether a system can be built that correctly recognises and classifies real traffic signs.
Datamaker 20 ⭐
Data generator command-line tool and library. Create JSON, CSV, XML data from templates.
Genalog 187 ⭐
Genalog is an open source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.
Rapiddweller Benerator CE 56 ⭐
Benerator is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes.
Dbldatagen 39 ⭐
Generate relevant data quickly for your projects. The Databricks data generator can be used to generate large simulated / synthetic data sets for testing, POCs, and other uses.
K6 Example Data Generation 21 ⭐
Example repository showing how to utilise k6 and faker to load test using generated data
Autofillr 17 ⭐
A browser extension that fills registration forms with randomly but consistently generated fake data.
Codemixed Text Generator 16 ⭐
This tool helps automatically generate grammatically valid synthetic code-mixed data by applying linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
Vesselextract 10 ⭐
U-Net-based CNN for segmenting blood vessels in fundus images and then removing the vessels from the image.
Optical Flow 2d Data Generation 11 ⭐
Caffe(v1)-compatible codebase to generate optical flow training data on-the-fly; used for the IJCV 2018 paper "What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation?" (http://dx.doi.org/10.1007/s11263-018-1082-6)