A very naive classifier to figure out if a sentence contains dirty words

is-dirty A very naive classifier to figure out if a sentence contains dirty words. Works well for English, and can handle popular Spanish, Indonesian, Arabic, and Hindi explicit words. Motivation I quickly wrote up thi

Related Repos

babylonhealth Aligning the fastText vectors of 78 languages Facebook recently open-sourced word vectors in 89 languages. However, these vectors are monolingual, meaning that while similar words within a language share similar vectors, translati

facebookresearch InferSent InferSent is a sentence embeddings method that provides semantic sentence representations. It is trained on natural language inference data and generalizes well to many different tasks. We provide our pre-trained sente

indiejoseph Hierarchical Attention Networks for Chinese Sentiment Classification This is a HAN version of sentiment classification, with pre-trained character-level embeddings, using RAN instead of GRU. Dataset Downloaded from in

r9y9 nnmnkwii ([nanamin kawaii]) Library to build speech synthesis systems designed for easy and fast prototyping. Supported python versions: 2.7 and 3.6. Documentation STABLE — most recently tagged version o

pytorch Introduction FAIR Sequence-to-Sequence Toolkit (PyTorch) This is a PyTorch version of fairseq, a sequence-to-sequence learning toolkit from Facebook AI Research. The original authors of this reimplementation are (in no particula

Bohdan-Khomtchouk biosemble AI assembly of biological wordnets About biosemble is a Python natural language processing (NLP) software program for assembling biological wordnets from structured and unstructured

Kyubyong Natural Language Processing Tasks and Selected References I've been working on several natural language processing tasks for a long time. One day, I felt like drawing a map of the NLP field where I earn a living. I'm sure I'm not

johnsnowlabs Spark NLP: State of the Art Natural Language Processing Spark NLP is a Natural Language Processing library built on top of Apache Spark ML. It provides simple, performant & accurate NLP annotations for machine learning