Accent recognition is closely related to speech recognition, It is easy to fall into the overfitting situation if we only do simple accent classification, hence we introduce speech recognition task to build a multi-task model.
LWANet can segment surgical instruments in real-time while takes little computational costs. Based on 960×544 inputs, its inference speed can reach 39 fps with only 3.39 GFLOPs. Also, it has a small model size and the number of pa
Parametric UMAP embeddings for representation and semisupervised learning. From the paper "Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning" (Sainburg, McInnes, Gentner
In this paper, we propose Minimalist Transfer Learning (MinTL) to simplify the system design process of task-oriented dialogue systems and alleviate the over-dependency on annotated data. MinTL is a simple yet effective transfer l
We introduce a new problem of retrieving 3D models that are deformable to a given query shape and present a novel deep deformation-aware embedding to solve this retrieval task. 3D model retrieval is a fundamental operation for rec
This package helps users draw bounding boxes around objects, without doing the clumsy math that you'd need to do for positioning the labels. It also has a few different types of visualizations you can use for labeling objects afte