Title: Reddit Mental Health Dataset
Type: Dataset
Low, Daniel M., Rumker, Laurie, Talkar, Tanya, Torous, John, Cecchi, Guillermo, Ghosh, Satrajit S. (2020): Reddit Mental Health Dataset. Zenodo. Dataset. https://zenodo.org/record/3941387
Links
- Item record in Zenodo
- Digital object URL
Summary
This dataset contains posts from 28 subreddits (15 of them mental health support groups) from 2018 to 2020. We used this dataset to understand the impact of COVID-19 on mental health support groups from January to April 2020, and included earlier timeframes to obtain baseline posts from before COVID-19.
Please cite if you use this dataset:
Low, D. M., Rumker, L., Torous, J., Cecchi, G., Ghosh, S. S., & Talkar, T. (2020). Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study. Journal of medical Internet research, 22(10), e22635.
@article{low2020natural,
  title={Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study},
  author={Low, Daniel M and Rumker, Laurie and Torous, John and Cecchi, Guillermo and Ghosh, Satrajit S and Talkar, Tanya},
  journal={Journal of medical Internet research},
  volume={22},
  number={10},
  pages={e22635},
  year={2020},
  publisher={JMIR Publications Inc., Toronto, Canada}
}
License
This dataset is made available under the Public Domain Dedication and License v1.0 whose full text can be found at: http://www.opendatacommons.org/licenses/pddl/1.0/
The data were downloaded using the Pushshift API. Re-use of this data is subject to the Reddit API terms.
Reddit Mental Health Dataset
Contains posts and text features for the following timeframes from 28 mental health and non-mental health subreddits:
- 15 specific mental health support groups (r/EDAnonymous, r/addiction, r/alcoholism, r/adhd, r/anxiety, r/autism, r/bipolarreddit, r/bpd, r/depression, r/healthanxiety, r/lonely, r/ptsd, r/schizophrenia, r/socialanxiety, and r/suicidewatch)
- 2 broad mental health subreddits (r/mentalhealth, r/COVID19_support)
- 11 non-mental health subreddits (r/conspiracy, r/divorce, r/fitness, r/guns, r/jokes, r/legaladvice, r/meditation, r/parenting, r/personalfinance, r/relationships, r/teaching)
Filenames and corresponding timeframes:
- post: Jan 1 to April 20, 2020 (called "mid-pandemic" in manuscript; r/COVID19_support appears). Unique users: 320,364.
- pre: Dec 2018 to Dec 2019. A full year, which provides more data for a baseline of Reddit posts. Unique users: 327,289.
- 2019: Jan 1 to April 20, 2019 (r/EDAnonymous appears). A control for seasonal fluctuations to match post data. Unique users: 282,560.
- 2018: Jan 1 to April 20, 2018. A control for seasonal fluctuations to match post data. Unique users: 177,089.
Unique users across all time windows (pre and 2019 overlap): 826,961.
See manuscript Supplementary Materials (https://doi.org/10.31234/osf.io/xvwcy) for more information.
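The four timeframes can be expressed as date ranges, which makes the overlap between pre and 2019 explicit. The following is a minimal sketch, not part of the dataset or its code. The names `WINDOWS` and `windows_for` are hypothetical, and the exact first and last days of the pre window are an assumption (the record only says "Dec 2018 to Dec 2019").

```python
from datetime import date

# Window boundaries follow the timeframes listed in this record.
# Assumption: "pre" covers the full months of Dec 2018 and Dec 2019;
# the record does not give exact start and end days for that window.
WINDOWS = {
    "post": (date(2020, 1, 1), date(2020, 4, 20)),
    "pre": (date(2018, 12, 1), date(2019, 12, 31)),
    "2019": (date(2019, 1, 1), date(2019, 4, 20)),
    "2018": (date(2018, 1, 1), date(2018, 4, 20)),
}


def windows_for(day):
    """Return the sorted names of every window containing `day` (bounds inclusive)."""
    return sorted(
        name for name, (start, end) in WINDOWS.items() if start <= day <= end
    )
```

Under these assumed boundaries, `windows_for(date(2019, 3, 1))` returns `["2019", "pre"]`, which is why user counts from the pre and 2019 windows cannot simply be added together.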
Note: if subsampling (e.g., to balance subreddits), we recommend bootstrapping analyses for unbiased results.
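One possible reading of this recommendation is sketched below: instead of drawing a single balanced subsample, draw many balanced resamples with replacement and report the spread of the statistic. This is an illustration under stated assumptions, not the procedure used in the manuscript. The function name `balanced_bootstrap_mean`, the percentile interval, and the toy values are all hypothetical; the numbers are made up and do not come from the dataset.

```python
import random
import statistics


def balanced_bootstrap_mean(values_by_group, n_per_group, n_boot=1000, seed=0):
    """Bootstrap the mean of per-group means with equal-sized group resamples.

    Each replicate draws n_per_group values with replacement from every
    group, so the result does not hinge on one particular subsample.
    Returns (mean of the replicates, 2.5th percentile, 97.5th percentile).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        group_means = [
            statistics.fmean(rng.choices(values, k=n_per_group))
            for values in values_by_group.values()
        ]
        estimates.append(statistics.fmean(group_means))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[min(n_boot - 1, int(0.975 * n_boot))]
    return statistics.fmean(estimates), lower, upper


# Toy feature values per subreddit (made up for illustration only).
toy = {
    "r/depression": [0.2, 0.4, 0.6, 0.8, 1.0, 0.5],
    "r/fitness": [0.1, 0.2, 0.3],
}
estimate, low, high = balanced_bootstrap_mean(toy, n_per_group=3, n_boot=500)
```

Fixing the seed keeps the resampling reproducible; with real data, `n_per_group` would typically be set to the size of the smallest subreddit being compared.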
More information
- URL: https://zenodo.org/record/3941387
- ISIDENTICALTO: https://doi.org/10.17605/OSF.IO/7PEYQ
- Language: en
Subjects
- Natural Language Processing, Mental Health, Psychiatry, COVID-19, Reddit, Social Media
Dates
- Publication date: 2020
- Issued: July 13, 2020
Rights
- http://www.opendefinition.org/licenses/odc-pddl Open Data Commons Public Domain Dedication and Licence 1.0
- info:eu-repo/semantics/openAccess Open Access
Format
electronic resource
Related items
Relationship | Uri |
---|---|
IsDocumentedBy | https://doi.org/10.31234/osf.io/xvwcy |
IsPartOf | https://zenodo.org/communities/covid-19 |
IsPartOf | https://zenodo.org/communities/medicalnlp |
IsPartOf | https://zenodo.org/communities/natural-language-processing |
IsPartOf | https://zenodo.org/communities/zenodo |