Twitter and Obesity
Data Reduction
Using an SQL query, we reduced our database of tweets to a more manageable size. The query matched keywords and hashtags relevant to our study, such as “workout” and #foodporn. A bounding box limited the results to the greater Vancouver area.
[Figure: Tweet Database (> 700 million tweets) → SQL Query → 54,000 Tweets]
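The reduction step described above can be sketched as a keyword filter combined with a bounding-box filter. This is only an illustration: the table layout (`tweets` with `text`, `lat`, `lon`), the sample rows, and the bounding-box coordinates are assumptions and do not come from the study, and only the two terms named in the text are used as keywords.

```python
import sqlite3

# Hypothetical schema and sample rows; the study's real table layout is not given.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (id INTEGER PRIMARY KEY, text TEXT, lat REAL, lon REAL)"
)
conn.executemany(
    "INSERT INTO tweets (text, lat, lon) VALUES (?, ?, ?)",
    [
        ("Morning workout done", 49.28, -123.12),    # inside box, keyword match
        ("Brunch time #foodporn", 49.25, -123.00),   # inside box, hashtag match
        ("Stuck in traffic again", 49.26, -123.10),  # inside box, no match
        ("Great workout today", 43.65, -79.38),      # keyword match, outside box
    ],
)

# The two terms mentioned in the text; the full study list is not reproduced here.
KEYWORDS = ["workout", "#foodporn"]

# Rough, illustrative box around greater Vancouver (assumed values).
BBOX = {"min_lat": 49.0, "max_lat": 49.4, "min_lon": -123.3, "max_lon": -122.5}


def filter_tweets(conn, keywords, bbox):
    """Return (id, text) rows that match any keyword and fall inside bbox."""
    keyword_clause = " OR ".join("LOWER(text) LIKE ?" for _ in keywords)
    sql = (
        "SELECT id, text FROM tweets "
        f"WHERE ({keyword_clause}) "
        "AND lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? "
        "ORDER BY id"
    )
    params = [f"%{k.lower()}%" for k in keywords]
    params += [bbox["min_lat"], bbox["max_lat"], bbox["min_lon"], bbox["max_lon"]]
    return conn.execute(sql, params).fetchall()


matches = filter_tweets(conn, KEYWORDS, BBOX)
```

Passing the terms as query parameters rather than pasting them into the SQL string makes it easy to add or remove terms when the query is revised.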
Getting Close to the Data
The next step was reading through the tweets to identify patterns and evaluate our original SQL query. We identified three main themes: Diet, Lifestyle/Activity, and Appearance. We removed erroneous terms from our query and added relevant terms. We then ran the new query on the database to retrieve the tweets we would analyze.
Qualitative Coding
Qualitative coding involved formally assigning themes and subthemes to tweets using NVivo. The parent themes we identified were Healthy and Unhealthy, and the subthemes were Diet, Lifestyle/Activity, and Appearance. Once the tweets were coded, we were able to analyze the distribution of tweets between themes. Additionally, since each tweet had latitude and longitude coordinates, we were able to analyze the spatial distribution as well.
[Figure: coding hierarchy. Healthy: Diet, Lifestyle/Activity, Appearance. Unhealthy: Diet, Lifestyle/Activity, Appearance.]
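Once tweets carry a parent theme, a subtheme, and coordinates, the two analyses mentioned above (distribution between themes and spatial distribution) can be sketched as simple tallies. The records below are invented for illustration, and the 0.1-degree grid is an assumed way of grouping locations; the study's actual spatial method is not stated.

```python
from collections import Counter

# Hypothetical coded tweets: (parent theme, subtheme, latitude, longitude).
coded = [
    ("Healthy", "Diet", 49.28, -123.12),
    ("Healthy", "Lifestyle/Activity", 49.26, -123.14),
    ("Unhealthy", "Diet", 49.21, -122.91),
    ("Unhealthy", "Diet", 49.28, -123.11),
    ("Unhealthy", "Appearance", 49.19, -122.84),
]

# Distribution between themes: count tweets per (parent, subtheme) pair.
theme_counts = Counter((parent, sub) for parent, sub, _, _ in coded)

# Share of all coded tweets that falls under each parent theme.
parent_counts = Counter(parent for parent, _, _, _ in coded)
parent_share = {parent: n / len(coded) for parent, n in parent_counts.items()}


def grid_cell(lat, lon):
    """Group a coordinate into a coarse cell by rounding to 0.1 degree (assumed)."""
    return (round(lat, 1), round(lon, 1))


# Spatial distribution: count tweets per parent theme within each grid cell.
cell_counts = Counter(
    (parent, grid_cell(lat, lon)) for parent, _, lat, lon in coded
)
```

Comparing `cell_counts` for Healthy and Unhealthy in the same cell gives a first look at where each theme concentrates.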
Coding Examples
[Figure: example tweets for each subtheme (Diet, Lifestyle, Appearance) under the Unhealthy and Healthy parent themes]