Text normalization library for Python

Normalizr is a Python library for text normalization that offers a range of operations for manipulating text. With normalizr you can replace symbols and punctuation, remove stop words, and much more.
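
To make the operations named above concrete, here is a minimal sketch using only the Python standard library. It is not normalizr's API: the `normalize` function and the tiny `STOP_WORDS` set are hypothetical names chosen for illustration, and a real stop-word list would be far longer and language-specific.

```python
import string
import unicodedata

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to"}


def normalize(text, stop_words=STOP_WORDS):
    """Strip accents and punctuation, lowercase, and drop stop words."""
    # Decompose accented characters, then discard the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace ASCII punctuation with spaces so adjacent words stay separated.
    table = str.maketrans({ch: " " for ch in string.punctuation})
    cleaned = stripped.translate(table).lower()
    # Split on whitespace, filter stop words, and rejoin with single spaces.
    return " ".join(word for word in cleaned.split() if word not in stop_words)


print(normalize("The café is open, and the menu is new!"))
```

Each step here is an independent transformation, so a sketch like this can be reordered or trimmed to match whichever normalizations a given text actually needs.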

Related Repos



hit9 img2txt Image to ASCII text; can output to HTML or an ANSI terminal. See also gif2txt for the animated version. Example: img2txt.py jiaozhu.jpg > without-color.html : demo img2txt.py jiaozhu.jpg --dith
 

erinxocon Requests-XML: XML Parsing for Humans This library intends to make parsing XML as simple and intuitive as possible. Requests-XML is related to the amazing Requests-HTML and delivers the same quality of user experience — wi
 

Hultner ♜ safemd A markdown renderer focusing on security first Building upon the strong foundation of GitHub's fork of cmark while adding additional security precautions to be safe out of the box. When auditing applications r
 

camelot-dev Excalibur: A web interface to extract tabular data from PDFs Excalibur is a web interface to extract tabular data from PDFs, written in Python 3! It is powered by Camelot. Note: Excalibur only works with text-based P
 

Acidham Alfred Markdown Notes Markdown Notes is a comprehensive note-taking tool embedded into Alfred with powerful full-text search (supports & and |), tag search, and search capabilities for todos (- [ ] or * [ ]). With MD Notes y
 

mrtzh Unbuch A simple pandoc setup to compile a book from Markdown sources into HTML pages and PDF, based on pandoc and Python filters. Features: Tufte-inspired layout with sidenotes; LaTeX formulas via KaTeX plugin; Environments
 

alan-turing-institute CleverCSV provides a drop-in replacement for the Python csv package with improved dialect detection for messy CSV files. It also provides a handy command line tool that can standardize a messy file or generate Python
 

dividuum Generates self-contained HTML files protecting secret text content.