Nonverbal Vocalization Data

Enter your email below for pricing and additional information. For faster reply, please contact - contact@deeplyinc.com


Summary

The Nonverbal Vocalization Dataset is a dataset of human nonverbal vocal sounds (a.k.a. vocal characterizers) consisting of 56.7 hours of short clips from 1419 speakers, crowdsourced from the general public in South Korea. The dataset also includes metadata such as age, sex, noise level, and quality of utterance. It covers 16 classes of human nonverbal sounds: ‘teeth-chattering’, ‘teeth-grinding’, ‘tongue-clicking’, ‘nose-blowing’, ‘coughing’, ‘yawning’, ‘throat-clearing’, ‘sighing’, ‘lip-popping’, ‘lip-smacking’, ‘panting’, ‘crying’, ‘laughing’, ‘sneezing’, ‘moaning’, and ‘screaming’.

Device : Android phones

Volume, full dataset (sample) : ~57 (~0.6) hours, ~70,000 (~800) utterances, ~18 (~0.1) GB, ~1,500 (~500) speakers

Format : wav/h5 (16/44.1kHz, 16-bit, mono)
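To check that a downloaded clip matches the format stated above, a minimal sketch using Python's standard `wave` module could look like the following. The helper names `describe_wav` and `matches_stated_format` are hypothetical, and treating 16 kHz and 44.1 kHz as the only valid sample rates is an assumption drawn from the "Format" line. The sketch covers the wav files only, not the h5 files.

```python
import wave

# Values taken from the "Format" line; treating them as the only valid
# combinations is an assumption of this sketch.
EXPECTED_RATES = {16000, 44100}
EXPECTED_BIT_DEPTH = 16
EXPECTED_CHANNELS = 1  # mono


def describe_wav(path):
    """Return basic header information for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def matches_stated_format(info):
    """True if the header values agree with the format listed on this page."""
    return (
        info["sample_rate"] in EXPECTED_RATES
        and info["bit_depth"] == EXPECTED_BIT_DEPTH
        and info["channels"] == EXPECTED_CHANNELS
    )
```

Calling `matches_stated_format(describe_wav(path))` on each clip is one way to catch files that were resampled or converted after download.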

Refer to the dataset description in 'docs' for details and statistics of the full dataset.

 

The sample audio data is a subset (approximately 1%) of a much larger dataset that was recorded under the same conditions as these open-source samples.

Please contact us (contact@deeplyinc.com) for pricing and licensing.

Featured Nonverbal Sound Sample

  • Coughing Sound

  • Crying Sound

  • Screaming Sound

  • Moaning Sound

  • Laughing Sound

    And 11 more sounds!

Click here to download entire sample data 


Dataset statistics

The illustrations below show statistics for the Deeply Nonverbal Vocalization dataset. The first two are from the sample audio data, and the others are from the full dataset. For more insight into the dataset, please refer to the detailed description in 'docs'.

[Figures, in page order: fig0, fig2, fig1, fig3, fig4]

Structure

dataset

├── dataset
│   ├── Nonverbal_Vocalization_metadata.json
│   ├── coughing
│   │   ├── 0C1S_4_8_0_27_0_1_1.wav
│   │   ├── ...
│   ├── crying
│   │   ├── 1TCO_11_10_0_20_0_0_0.wav
│   │   ├── ...
│   ├── ...
│   ├── ...
│   ├── tongue-clicking
│   │   ├── 06RU_2_7_1_38_0_0_0.wav
│   │   ├── ...
│   └── yawning
│       ├── 0DYI_5_10_1_12_0_1_0.wav
│       ├── ...
└── docs
    ├── Deeply\ Nonverbal\ Vocalization\ Dataset\ description_Eng.pdf
    └── Deeply\ Nonverbal\ Vocalization\ Dataset\ description_Kor.pdf
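
Given the layout above, with one folder per class, a short sketch for counting the clips in each class could look like this. The function name `count_clips` is hypothetical, and the sketch assumes it is pointed at the inner `dataset` folder that holds the class folders and the metadata file.

```python
from pathlib import Path


def count_clips(audio_root):
    """Map each class folder name to the number of .wav files it contains."""
    counts = {}
    for class_dir in sorted(Path(audio_root).iterdir()):
        # Skip plain files such as Nonverbal_Vocalization_metadata.json.
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for _ in class_dir.glob("*.wav"))
    return counts
```

For example, `count_clips("dataset/dataset")` would return a dictionary keyed by folder names such as `coughing` and `yawning`, assuming the archive was extracted into the current directory.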

Nonverbal_Vocalization_metadata.json

{
  "LAA7": {
    "sex": "Male",
    "age": 22,
    "class": ["teeth-chattering", "teeth-grinding", "lip-smacking"]
  },
  ...
  "WVST": {
    "sex": "Female",
    "age": 15,
    "class": ["nose-blowing", "coughing", "yawning", "throat-clearing", "sighing",
              "lip-popping", "sneezing", "screaming"]
  }
}
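
The metadata maps each speaker ID to that speaker's sex, age, and list of recorded classes. As an illustration, the sketch below inverts that mapping to list the speakers available for each class. The names `load_metadata` and `speakers_by_class` are hypothetical, and the code assumes the file on disk is valid JSON with the keys shown in the excerpt.

```python
import json
from collections import defaultdict


def load_metadata(path):
    """Read the metadata file; assumes it is valid UTF-8 JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def speakers_by_class(metadata):
    """Invert {speaker_id: {"class": [...]}} into {class_name: [speaker_id, ...]}."""
    index = defaultdict(list)
    for speaker_id, info in metadata.items():
        for class_name in info["class"]:
            index[class_name].append(speaker_id)
    return dict(index)
```

With the two speakers from the excerpt, `speakers_by_class` would list `LAA7` under `lip-smacking` and `WVST` under `coughing`. An index like this is one way to build train and test splits that do not share speakers.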

Filename convention

{speaker_ID}_{class}_{trial}_{sex}_{age}_{location}_{quality}_{noise}.wav

Class: {0: ‘teeth-chattering’, 1: ‘teeth-grinding’, 2: ‘tongue-clicking’, 3: ‘nose-blowing’,
        4: ‘coughing’, 5: ‘yawning’, 6: ‘throat-clearing’, 7: ‘sighing’,
        8: ‘lip-popping’, 9: ‘lip-smacking’, 10: ‘panting’, 11: ‘crying’,
        12: ‘laughing’, 13: ‘sneezing’, 14: ‘moaning’, 15: ‘screaming’}

Sex: {0: ‘Female’, 1: ‘Male’}

Location: {0: ‘indoor’, 1: ‘outdoor’}

Quality: {0: ‘High’, 1: ‘Low’}

Noise: {0: ‘Noiseless’, 1: ‘Noisy’}
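
The convention above can be decoded mechanically. The following is a minimal sketch of a filename parser that uses the code tables listed in this section. The names `ClipInfo` and `parse_clip_name` are hypothetical, and splitting from the right is a precaution of this sketch in case a speaker ID ever contains an underscore.

```python
from dataclasses import dataclass
from pathlib import Path

# Code tables copied from the filename convention.
CLASSES = [
    "teeth-chattering", "teeth-grinding", "tongue-clicking", "nose-blowing",
    "coughing", "yawning", "throat-clearing", "sighing", "lip-popping",
    "lip-smacking", "panting", "crying", "laughing", "sneezing",
    "moaning", "screaming",
]
SEX = {0: "Female", 1: "Male"}
LOCATION = {0: "indoor", 1: "outdoor"}
QUALITY = {0: "High", 1: "Low"}
NOISE = {0: "Noiseless", 1: "Noisy"}


@dataclass(frozen=True)
class ClipInfo:
    speaker_id: str
    label: str
    trial: int
    sex: str
    age: int
    location: str
    quality: str
    noise: str


def parse_clip_name(filename):
    """Decode {speaker_ID}_{class}_{trial}_{sex}_{age}_{location}_{quality}_{noise}.wav."""
    stem = Path(filename).stem
    # Split from the right so the last seven fields are always the numeric codes.
    parts = stem.rsplit("_", 7)
    if len(parts) != 8:
        raise ValueError(f"unexpected filename layout: {filename!r}")
    class_id, trial, sex, age, location, quality, noise = (int(p) for p in parts[1:])
    return ClipInfo(
        speaker_id=parts[0],
        label=CLASSES[class_id],
        trial=trial,
        sex=SEX[sex],
        age=age,
        location=LOCATION[location],
        quality=QUALITY[quality],
        noise=NOISE[noise],
    )
```

For the sample file `0C1S_4_8_0_27_0_1_1.wav`, this yields speaker `0C1S`, class `coughing`, trial 8, a 27-year-old female speaker, recorded indoors, with low quality and a noisy background.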

License

Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)


Copyright © Deeply, Inc. All rights reserved.

Office : E02, Space Sallim 2F, 10, Noryangjin-ro, Dongjak-gu, Seoul, Republic of Korea

Tel : +82 70-7459-0704

E-mail : contact@deeplyinc.com