Twins: Revisiting the Design of Spatial Attention in Vision Transformers

Very recently, a variety of vision transformer architectures for dense prediction tasks have been proposed, and they show that the design of spatial attention is critical to their success in these tasks.
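To make the idea of spatial attention design concrete, here is a minimal NumPy sketch of window-restricted self-attention, in the spirit of the locally-grouped attention that Twins discusses. All names, shapes, and the identity projections are illustrative assumptions, not the paper's actual code, which uses learned query/key/value projections and multiple heads.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def locally_grouped_attention(x, window=2):
    """Self-attention restricted to non-overlapping window x window groups.

    x: (H, W, C) feature map. This is an illustrative simplification:
    real implementations add learned projections and multiple heads.
    """
    H, W, C = x.shape
    out = np.zeros_like(x)
    for i in range(0, H, window):
        for j in range(0, W, window):
            block = x[i:i + window, j:j + window]
            h, w, _ = block.shape
            g = block.reshape(-1, C)  # tokens within one local group
            # scaled dot-product attention with identity Q/K/V projections
            scores = softmax(g @ g.T / np.sqrt(C))
            out[i:i + window, j:j + window] = (scores @ g).reshape(h, w, C)
    return out
```

Because attention is computed only within each group, the cost is linear in the number of groups rather than quadratic in the number of tokens, which is the basic trade-off these spatial attention designs exploit.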

Related Repos

charliegerard 👀 Use machine learning in JavaScript to detect eye movements and build gaze-controlled experiences.

ucbds-infra Otter Grader is a lightweight, modular, open-source autograder developed by the Data Science Education Program at UC Berkeley.

neulab Heavy workload reviewing papers? ReviewAdvisor helps out.

HKUST-KnowComp TransOMCS is a commonsense knowledge resource transferred from ASER. It is in the format of OMCS but two orders of magnitude larger.

DonkeyShot21 A simple and complete implementation of Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning in PyTorch + PyTorch Lightning.

TUI-NICR ESANet: Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis

microsoft UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation.

PhilipZRH FERM is a framework that enables robots to learn tasks within an hour of real-time training.