NLP Recipes for Japanese
This repository contains sample code for natural language processing in Japanese. It is heavily inspired by microsoft/nlp-recipes.
Content
The following is a summary of the commonly used NLP scenarios covered in the repository. Each scenario is demonstrated in one or more scripts or Jupyter notebook examples that make use of the core code base of models and repository utilities.
Category | Methods |
---|---|
Basic | Normalization, Sentence Segmentation, Ruby |
Embeddings | Word2Vec |
Feature Engineering | Bag-of-Words, TF-IDF, BM25, SWEM, SCDV |
Morphological Analysis | Konoha, nagisa |
Sentence Similarity | Cosine Similarity |
Text Classification | TF-IDF & Logistic Regression, TF-IDF & LightGBM, BERT |
Visualization | Visualization with Japanese texts |
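
To give a rough idea of what the Basic, Feature Engineering, and Sentence Similarity rows involve, here is a minimal standard-library sketch. It is not taken from the repository's notebooks: the function names are hypothetical, and character bigrams stand in for the word tokens that a morphological analyzer such as Konoha or nagisa would normally produce.

```python
import math
import unicodedata
from collections import Counter


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization, e.g. full-width letters become ASCII."""
    return unicodedata.normalize("NFKC", text)


def char_bigrams(text: str) -> Counter:
    """Count overlapping character bigrams (an assumed stand-in for word tokens)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-n-grams count vectors."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction to compare
    return dot / (norm_a * norm_b)


def similarity(s1: str, s2: str) -> float:
    """Normalize both strings, then compare their bigram count vectors."""
    return cosine_similarity(char_bigrams(normalize(s1)),
                             char_bigrams(normalize(s2)))


if __name__ == "__main__":
    print(normalize("ＮＬＰ１２３"))
    print(similarity("自然言語処理の基礎", "自然言語処理の応用"))
```

Normalizing before vectorizing means full-width and half-width variants of the same text are treated as identical. Replacing `char_bigrams` with real tokenization, or the raw counts with TF-IDF weights, leaves the similarity step unchanged.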
Environment
docker-compose up -d --build
docker exec -it nlp-recipes-ja bash