A Python tool that automatically cleans data sets and readies them for analysis.

datacleaner works with data in pandas DataFrames. datacleaner is not magic, and it won't...
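To illustrate the kind of cleaning a tool like this might automate on a pandas DataFrame, here is a minimal sketch. It is not datacleaner's own code: the helper name `simple_autoclean` and the chosen rules (median for numeric gaps, most common value for non-numeric gaps, integer codes for non-numeric columns) are assumptions made for this example.

```python
import pandas as pd


def simple_autoclean(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fill gaps, then encode non-numeric columns as integers."""
    cleaned = df.copy()  # leave the caller's DataFrame untouched
    for column in cleaned.columns:
        series = cleaned[column]
        if pd.api.types.is_numeric_dtype(series):
            # Numeric column: replace missing values with the column median.
            cleaned[column] = series.fillna(series.median())
        else:
            # Non-numeric column: replace missing values with the most common
            # value, then map each distinct value to an integer code.
            modes = series.mode()
            if not modes.empty:
                series = series.fillna(modes.iloc[0])
            cleaned[column] = pd.factorize(series, sort=True)[0]
    return cleaned


# Small made-up data set with one gap in each column.
raw = pd.DataFrame(
    {
        "age": [25.0, None, 40.0, 31.0],
        "city": ["Oslo", "Lima", None, "Oslo"],
    }
)
cleaned = simple_autoclean(raw)
print(cleaned)
```

Working on a copy keeps the raw data available for comparison, which helps when checking whether the automatic choices were sensible for a given column.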
Category: Python / Data Analysis
Watchers: 54
Stars: 954
Forks: 197
Last update: Oct 18, 2021

Related Repos

juliasilge tidytext: Text mining using tidy tools

khuyentran1401 Efficient Python Tricks and Tools for Data Scientists

avito-tech Framework for creating efficient data processing pipelines.

mdecrevoisier Set of Mindmaps providing a detailed overview of the different #Windows auditing capacities and event log files.

coolbutuseless numberwang will convert floating point numbers (and integers) to their word representations, and vice versa.

posthog PostHog is an open-source product analytics suite, built for developers. Automate the collection of every event on your website or app, with no need to send data to 3rd parties.

mapillary Mapillary Street-level Sequences (MSLS) is a large-scale long-term place recognition dataset that contains 1.6M street-level images.

awslabs DenseClus is a Python module for clustering mixed type data using UMAP and HDBSCAN. Allowing for both categorical and numerical data, DenseClus makes it possible to incorporate all features in clustering.

SciML PreallocationTools.jl is a set of tools for helping build non-allocating pre-cached functions for high-performance computing in Julia.

Amaguk2023 This is my first Data Engineering project. It extracts data from the user's recently played tracks using Spotify's API, transforms it, and then loads it into PostgreSQL using a SQLAlchemy engine. The data is shown as a Spark DataFrame before loading, and the whole ETL job is scheduled with crontab.

MoH-Malaysia Official data on the COVID-19 epidemic in Malaysia. Powered by CPRC, CPRC Hospital System, MKAK, and MySejahtera.

ryxcommar Implementation of Stata's tabulate command in Pandas for extremely easy to type one-way and two-way tabulations.

vopani 100 datatable exercises across different sections, structured as a course or set of tutorials for beginners, intermediates, and experts to teach and learn from.

daleroberts Implementation in R of the Black Scholes formula and some greeks.

CITF-Malaysia Official data on Malaysia's National Covid-19 Immunisation Programme (PICK). Powered by MySejahtera.