ETL Library for Machine Learning - data pipelines, data munging and wrangling

IMPORTANT: THIS REPOSITORY HAS BEEN ARCHIVED AND DATAVEC HAS BEEN MOVED.

DataVec has been migrated to a sub-module of the mono-repository here: https://github.com/deeplearning4j/deeplearning4j

All future development will continue

Related Repos



apache Apache Horn Apache Horn is an Apache Incubating project: a neuron-centric programming model and a hybrid sync and async distributed training framework that supports both data and model parallelism for training large models with mas
 

stratospark Accelerating Deep Learning with Multiprocess Image Augmentation in Keras Code available @ https://github.com/stratospark/keras-multiprocess-image-data-generator Introduction Benchmark: CIFAR10 - In Memory Performance, Ima
 

Jasonnor Backpropagation Uses Java Swing to implement a backpropagation neural network. For the learning algorithm, refer to this Wikipedia page. The input consists of several groups of multi-dimensional data sets. The data were cut into three part
 

taki0112 Vector_Similarity Python, Java implementation of TS-SS called from "A Hybrid Geometric Approach for Measuring Similarity Level Among Documents and Document Clustering" Also, I have summarized "A Hybrid Geometric Approach for
 

yuantiku Ytk-learn is a distributed machine learning library that implements most of the popular machine learning algorithms. It runs on single machines, multiple machines, and major distributed environments (Hadoop, Spark), and supports major operating systems (Li
 

Ldpe2G DeepLearning for fun Implementation of some interesting ideas of deeplearning with MXNet List of implementations Self-Normalizing Networks Pix2Pix CycleGAN Playing Flappy Bird Using Deep Reinforcement Learn
 

Azure Microsoft Machine Learning for Apache Spark MMLSpark is an ecosystem of tools aimed towards expanding the distributed computing framework Apache Spark in several new directions. MMLSpark adds many deep learning and
 

ICT-BDA Easy Machine Learning What is Easy Machine Learning Machine learning algorithms have become key components in many big data applications. However, the full potential of machine learning is still far from being reali