CleverCSV provides a drop-in replacement for the Python csv package with improved dialect detection for messy CSV files. It also provides a handy command line tool that can standardize a messy file or genera
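For context on what "dialect detection" means here, the sketch below uses only the standard library's `csv.Sniffer`, the baseline that the description says CleverCSV improves on. It does not use CleverCSV's own API, and the sample data is made up for illustration.

```python
import csv
import io

# Hypothetical sample: a small semicolon-delimited file held in memory.
sample = "name;city\nAda;London\nLinus;Helsinki\n"

# Ask the standard-library sniffer to guess the dialect,
# restricting the candidate delimiters to a few common ones.
dialect = csv.Sniffer().sniff(sample, delimiters=";,\t|")

# Read the data back using the detected dialect.
rows = list(csv.reader(io.StringIO(sample), dialect))

print(dialect.delimiter)
print(rows)
```

The standard sniffer works on clean input like this. Messy files (inconsistent quoting, delimiters inside fields) are the case the CleverCSV description targets.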
A simple pandoc setup that compiles a book from markdown sources into HTML pages and PDF, using pandoc and Python filters.
Tufte-inspired layout with sidenotes
LaTeX formulas via KaTeX plugin
Alfred Markdown Notes
Markdown Notes is a comprehensive note-taking tool embedded into Alfred with powerful full-text search (supports & and |), tag search and search capabilities for todos (- [ ] or * [ ]). With
Excalibur: A web interface to extract tabular data from PDFs
Excalibur is a web interface to extract tabular data from PDFs, written in Python 3! It is powered by Camelot.
Note: Excalibur only works with te
A markdown renderer focusing on security first. Building upon the strong foundation of GitHub's fork of cmark while adding additional security precautions to be safe out of the box.
When auditing appl
Requests-XML: XML Parsing for Humans
This library intends to make parsing XML as simple and intuitive as possible. Requests-XML is related to the amazing Requests-HTML and delivers the same quality of user exper
macro_pack is a tool used to automate the obfuscation and generation of retro formats such as MS Office documents or VBS-like formats. Now it also handles various shortcuts for
A readability parser which can extract title, content, images from html pages
pip install jparser
from jparser import PageModel
html = urllib2.
Duckling is a Haskell library that parses text into structured data.
"the first Tuesday of October"
A Haskell environm
TextBlob: Simplified Text Processing
TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language p
Lark - a modern parsing library for Python
Parse any context-free grammar, FAST and EASY!
Beginners: Lark is not just another parser. It can parse any grammar you throw at it, no matter how complicated or ambiguous, an
This is a simple Markdown resumé template, LaTeX header, and pre-processing script that can be used with Pandoc to generate professional-looking PDF and HTML output.
The Markdown flavor supported is Pandoc mark
xart: generate art ascii texts.
xart is a pure Python library that provides an easy way to generate art ascii texts. Life is short, be cool.
██╗ ██╗ █████╗ ██████╗ ████████╗
Parse strings using a specification based on the Python format() syntax.
parse() is the opposite of
The module is set up to only export parse(), search(), findall(), and with_pattern() when import * is used:
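To illustrate the idea of parsing with a format()-style specification, here is a minimal standard-library sketch. It is not the parse library's implementation or API: `simple_parse` is a hypothetical helper that supports only bare `{name}` fields by translating them into named regular-expression groups.

```python
import re


def simple_parse(template, text):
    """Hypothetical helper: match text against a '{name}'-style template.

    Returns a dict of captured fields, or None if the text does not match.
    """
    pattern = ""
    pos = 0
    for field in re.finditer(r"\{(\w+)\}", template):
        # Literal text between fields must match exactly.
        pattern += re.escape(template[pos:field.start()])
        # Each field becomes a non-greedy named capture group.
        pattern += f"(?P<{field.group(1)}>.+?)"
        pos = field.end()
    pattern += re.escape(template[pos:])
    match = re.fullmatch(pattern, text)
    return match.groupdict() if match else None


result = simple_parse("{greeting}, {name}!", "Hello, World!")
print(result)
```

The sketch ignores format specifiers, type conversion, and positional fields.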
Create Elm type aliases and decoders based on JSON input
This project allows you to automate the creation of:
type aliases from JSON data
decoders from type aliases and some union types
Normalizr is a Python library for text normalization that offers a bunch of actions to manipulate your text as much as you want. With normalizr you can replace symbols, punctuation, remove stop words and much
Normality is a Python micro-package that contains a small set of text normalization functions for easier re-use. These functions accept a snippet of unicode or utf-8 encoded text and remove various classes o
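As a rough picture of what text normalization functions of this kind do, the following sketch uses only the standard library. It is not the API of Normalizr or Normality: `normalize_text` is a hypothetical function, and the choice of steps (strip accents, lowercase, collapse whitespace) is an assumption made for illustration.

```python
import unicodedata


def normalize_text(text):
    """Hypothetical normalizer: strip accents, lowercase, collapse whitespace."""
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base characters.
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Lowercase and collapse any run of whitespace into a single space.
    return " ".join(without_marks.lower().split())


cleaned = normalize_text("  Café   Déjà  Vu ")
print(cleaned)
```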
unp is a command line tool that can unpack archives easily. It mainly acts as a wrapper around other shell tools that you can find on various POSIX systems.
It figures out how to invoke an unpacker to achieve the
phonenumbers Python Library
This is a Python port of Google's libphonenumber library. It supports Python 2.5-2.7 and Python 3.x (in the same codebase, with no 2to3 conversion needed).
Original Java code is Copyright (
csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
It is inspired by pdftk, gdal and the original csvcut tool by Joe Germuska and Aaron Bycoffe.
This is a Python implementation of John Gruber's Markdown. It is almost completely compliant with the reference implementation, though there are a few known issues. See Features for information on
Markdown is a light text markup format and a processor to convert that to HTML. The originator describes it as follows:
Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-
PyPDF2 is a pure-python PDF library capable of splitting, merging together, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files. It can retr
Automated conversion and styling using LibreOffice
Universal Office Converter (unoconv) is a command line tool to convert any document format that LibreOffice can import to any document format that LibreOffi