This repo contains the inference code for two neural networks that were pre-trained on millions of videos. Both networks are rather small and should run smoothly in real time on a CPU.
The following steps have been confirmed to work on a Linux machine (Ubuntu 18.04 LTS). They will probably also work on macOS and Windows.
To begin, clone this repository to a local directory:
git clone git@github.com:TwentyBN/20bn-realtimenet.git
cd 20bn-realtimenet
Create a new conda environment:
conda create -y -n realtimenet python=3.6
conda activate realtimenet
Install Python dependencies:
pip install -r requirements.txt
Note that pip install -r requirements.txt installs the CPU-only version of PyTorch. To run inference on your GPU, a CUDA-enabled version of PyTorch should be installed instead (e.g. conda install pytorch torchvision cudatoolkit=10.2 -c pytorch; see all available options here).
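To check that PyTorch can actually see your GPU after installation, the standard torch.cuda API provides a quick sanity check:

```python
import torch

# Prints True only if a CUDA-enabled PyTorch build is installed
# and a compatible GPU is visible to it
print(torch.cuda.is_available())
```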
Pre-trained weights can be downloaded from here. After downloading, be sure to unzip the archive and place the contents of the directory into
Gesture Recognition

scripts/gesture_recognition.py applies our pre-trained models to hand gesture recognition. 30 gestures are supported (see full list here).
PYTHONPATH=./ python scripts/gesture_recognition.py
(full video can be found here)
Fitness Activity Tracking
scripts/fitness_tracker.py applies our pre-trained models to real-time fitness activity recognition and calorie estimation. In total, 80 different fitness exercises are recognized (see full list here).
(full video can be found here)
PYTHONPATH=./ python scripts/fitness_tracker.py --weight=65 --age=30 --height=170 --gender=female
Weight, age and height should be given in kilograms, years and centimeters, respectively. If not provided, default values will be used.
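For example, since all of these arguments have defaults, the script can also be launched without any of them:

PYTHONPATH=./ python scripts/fitness_tracker.py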
Some additional arguments can be used to grab frames from a different source:
--camera_id=CAMERA_ID   ID of the camera to stream from
--path_in=FILENAME      Video file to stream from. This assumes that the video was encoded at 16 fps.
It is also possible to save the display window to a video file using:
--path_out=FILENAME     Video file to stream to
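For example, to run the tracker on a pre-recorded clip (encoded at 16 fps) and save the annotated output, where workout.mp4 and workout_out.mp4 are hypothetical filenames:

PYTHONPATH=./ python scripts/fitness_tracker.py --path_in=workout.mp4 --path_out=workout_out.mp4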
Best performance is obtained in these conditions:
- Camera on the floor
- Body fully visible (head-to-toe)
- Clean background
In order to estimate burned calories, we trained a neural net to convert activity features to the corresponding MET value. We then post-process these MET values (see correction and aggregation steps performed here) and convert them to calories using the user's weight.
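The exact correction and aggregation steps are in the code linked above; as a rough illustration of the final conversion step only, here is a minimal sketch based on the standard MET formula (kcal/min = MET × 3.5 × weight in kg / 200). The helper name is hypothetical and this is not the repo's actual implementation:

```python
def met_to_kcal_per_minute(met: float, weight_kg: float) -> float:
    """Standard MET-to-calorie conversion (hypothetical helper, for
    illustration only): kcal/min = MET * 3.5 * weight_kg / 200."""
    return met * 3.5 * weight_kg / 200.0

# Example: a 65 kg user at an average MET of 6 for a 30-minute workout
total_kcal = met_to_kcal_per_minute(met=6.0, weight_kg=65.0) * 30
print(f"approx. {total_kcal:.0f} kcal")  # -> approx. 205 kcal
```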
If you're only interested in the calorie estimation part, you might want to use scripts/calorie_estimation.py, which has a slightly more detailed display (see the video here, which compares two videos produced by that script).
The calorie estimates are roughly in the range produced by wearable devices, although their accuracy has not been verified. In our experiments, the estimates correlated well with workout intensity (intense workouts burn more calories), so, regardless of absolute accuracy, it should be fair to use this metric to compare one workout to another.
The code is copyright (c) 2020 Twenty Billion Neurons GmbH under an MIT License. See the file LICENSE for details. Note that this license only covers the source code of this repo. Pre-trained weights come with a separate license available here.
This repo uses PyTorch, which is licensed under a 3-clause BSD License. See the file LICENSE_PYTORCH for details.