Using speech-to-text to fully check out during con calls

Say What? This script listens to meetings I'm supposed to be paying attention to and pings me on HipChat when my name is mentioned. It sends me a transcript of what was said in the minute before my name was mentioned and some ti
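The core idea described above (keep roughly the last minute of transcript and raise an alert when a name appears) can be sketched as follows. This is a minimal illustration, not the script's actual code: the `Segment` and `MentionWatcher` names, the 60-second default window, and the word-boundary matching are all assumptions. The speech-to-text step and the HipChat notification are deliberately left out; the sketch assumes some recognizer already yields timestamped text segments.

```python
import re
from collections import deque
from dataclasses import dataclass


@dataclass
class Segment:
    timestamp: float  # seconds since the start of the meeting
    text: str         # text produced by some speech-to-text engine (assumed)


class MentionWatcher:
    """Hypothetical helper: rolling transcript window plus name detection."""

    def __init__(self, name: str, window_seconds: float = 60.0):
        # Match the name as a whole word, ignoring case.
        self.pattern = re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
        self.window_seconds = window_seconds
        self.buffer = deque()

    def feed(self, timestamp: float, text: str):
        """Add one transcribed segment.

        Returns the transcript of the current window if the name was
        mentioned in this segment, otherwise None.
        """
        self.buffer.append(Segment(timestamp, text))
        # Drop segments that fell out of the time window.
        while self.buffer and timestamp - self.buffer[0].timestamp > self.window_seconds:
            self.buffer.popleft()
        if self.pattern.search(text):
            return " ".join(segment.text for segment in self.buffer)
        return None


if __name__ == "__main__":
    watcher = MentionWatcher("Alex")
    watcher.feed(0.0, "Let's start with the budget.")
    watcher.feed(70.0, "Next, the release schedule.")
    alert = watcher.feed(95.0, "Alex, can you confirm the date?")
    if alert is not None:
        # A real script would send this text to a chat service instead.
        print(alert)
```

Using a deque keeps the eviction of old segments cheap, and returning the window text from `feed` leaves the choice of notification channel to the caller.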

Related Repos

zzw922cn awesome-speech-recognition-speech-synthesis-papers automatic speech recognition/speech synthesis paper roadmap, including HMM, DNN, RNN, CNN, Seq2Seq, Attention Introduction Automatic Speech Recognition has been inves

aofdev vue-speech A Vue2 Performs synchronous speech recognition with Google Cloud Speech on Progressive Web App Config Step 1 Enable the Cloud Speech API fo

aofdev vue speech streaming A Vue2 Performing Streaming Speech Recognition with Google Cloud Speech on Progressive Web App Authentication Visit the Google De

vishwajeetanand21 Text to speech is the process of converting text into voice. This text-to-speech project takes words on digital devices and converts them into audio. Here I have used the Google Text-to-Speech library, popularly known as gTTS, to convert a text file to an .mp3 file. Hope you like my project!

codeforequity-at Botium Speech Processing Botium Speech Processing is a unified, developer-friendly API to the best available free and Open-Source Speech-To-Text and Text-To-Speech services. What is it ? Botium Speech Processing i

espnet ESPnet is an end-to-end speech processing toolkit that mainly focuses on end-to-end speech recognition and end-to-end text-to-speech. ESPnet uses Chainer and PyTorch as its main deep learning engines, and also follows Kaldi-style data processing, feature extraction/format, and recipes to provide a complete setup for speech recognition and other speech processing experiments.

vadimkantorov Discord Speech-To-Text bot in Python using Google Cloud Speech-To-Text API

buriburisuri Speech-to-Text-WaveNet : End-to-end sentence level English speech recognition using DeepMind's WaveNet A tensorflow implementation of speech recognition based on DeepMind's WaveNet: A Generative Model for Raw Audio. (Hereafter th

LianjiaTech Athena is an open-source implementation of an end-to-end speech processing engine. Our vision is to empower both industrial applications and academic research on end-to-end models for speech processing. To make speech processing available to everyone, we're also releasing example implementations and recipes on some open-source datasets for various tasks (Automatic Speech Recognition, Speech Synthesis, Voice Conversion, Speaker Recognition, etc.).

tts-tutorial Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural speech given text, has been a hot research topic in speech, language, and machine learning communities and has become an important commercial service in the industry.

Kyubyong A TensorFlow Implementation of DC-TTS: yet another text-to-speech model I implement yet another text-to-speech model, dc-tts, introduced in Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Gui

rishikksh20 Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration This repo contains only model Implementation of Zero-Shot Text-to-Speech for Text

ttaoREtw Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation

speechbrain SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch. The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, multi-microphone signal processing and many others.

at16k at16k Pronounced as at sixteen k What is at16k? at16k is a Python library to perform automatic speech recognition or speech to text conversion. The goal of this project is to provide

Kyubyong Speech Recognition Using Tacotron Motivation Tacotron is an end-to-end speech generation model which was first introduced in Towards End-to-End Speech Synthesis. It takes as input text at the character level, and targe

boku7 Assembly HellGate implementation that directly calls Windows System Calls and displays the PPID of the explorer.exe process.

danielsun1106 Yet another extremely fast alternative for Java reflection, making dynamic calls as efficient as direct calls.

Picovoice Speech-to-Text Benchmark Made in Vancouver, Canada by Picovoice This is a minimalist and extensible framework for benchmarking different speech-to-text engines. It has been developed and tested on Ubuntu 18.04 (x86_64) using Pyt