Title: A comprehensive video dataset for Multi-Modal Recognition Systems
Type: Dataset
Anand Handa, Dr. Rashi Agarwal, Prof. Narendra Kohli (2018): A comprehensive video dataset for Multi-Modal Recognition Systems. Zenodo. Dataset. https://zenodo.org/record/1492227
Links
- Item record in Zenodo: https://zenodo.org/record/1492227
- Digital object URL: https://doi.org/10.5281/zenodo.1492227
Summary
This fully labelled video dataset is intended as a resource for researchers and analysts in fields such as machine learning, computer vision, and deep learning. In each video, one of 67 different subjects recites the same text, the digits from 1 to 20, within the same experimental setup.
More information
- DOI: 10.5281/zenodo.1492227
- Language: en
Dates
- Publication date: 2018
- Issued: December 16, 2018
Notes
Other: The dataset folder contains the HD videos of 67 subjects. A sample for one video has been uploaded together with Python scripts, which can be customized to process every video in the dataset and produce the frames, the frames with bounding-box detection, the audio of the entire video, the split audio for the text being recited, and the waveforms for both the full-video audio and the split audio.
Uncompress the Video_Dataset_uploaded folder. It contains two folders:
1. Main_video_dataset: This folder contains all the HD videos of the 67 subjects.
2. Pre-Processed dataset and scripts: This folder contains samples for a single video, such as frames, audio .wav files, split audio .wav files, and waveforms for both. It also contains Python scripts that can be used to extract the same information for all the videos in the dataset.
Rights
- https://creativecommons.org/licenses/by/4.0/legalcode Creative Commons Attribution 4.0 International
- info:eu-repo/semantics/openAccess Open Access
Format
electronic resource
Related items
Description | Item type | Relationship | Uri |
---|---|---|---|
 | | IsVersionOf | https://doi.org/10.5281/zenodo.1492226 |
 | | IsPartOf | https://zenodo.org/communities/zenodo |
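
The Notes mention split audio .wav files derived from the audio of each full video. The dataset's own scripts are not reproduced in this record, so the following is only a minimal sketch of how such a split could be done with Python's standard `wave` module. The function name `split_wav`, the output file naming, and the time spans passed to it are all assumptions made for illustration; the record does not say how the authors chose the segment boundaries.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, spans):
    """Write one WAV file per (start_s, end_s) span and return the new paths.

    Hypothetical helper: the spans are supplied by the caller, because the
    record does not describe how the recited text was segmented.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        rate = reader.getframerate()
        total = reader.getnframes()
        for index, (start_s, end_s) in enumerate(spans, start=1):
            # Convert seconds to frame offsets and clamp them to the file.
            first = min(max(int(round(start_s * rate)), 0), total)
            last = min(max(int(round(end_s * rate)), first), total)
            reader.setpos(first)
            frames = reader.readframes(last - first)
            target = out_dir / f"{Path(src).stem}_part{index:02d}.wav"
            with wave.open(str(target), "wb") as writer:
                # Keep the source channel count, sample width, and rate.
                writer.setnchannels(reader.getnchannels())
                writer.setsampwidth(reader.getsampwidth())
                writer.setframerate(rate)
                writer.writeframes(frames)
            written.append(target)
    return written
```

A call such as `split_wav("sample.wav", "split_audio", [(0.0, 1.2), (1.2, 2.5)])` would write two segments into a `split_audio` folder; the file name and timestamps here are placeholders, not values taken from the dataset.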