Joe Murphy is a senior survey methodologist with over 17 years of research and project management experience. Mr. Murphy has extensive experience developing and applying new technologies and modes of communication to improve the quality, relevance, and efficiency of survey research. His recent work has centered on the use and analysis of social media to supplement survey data, with a detailed focus on Twitter. Mr. Murphy also investigates optimal designs for multi-mode data collection platforms, data visualization, crowdsourcing, and social research in virtual worlds. Mr. Murphy is a demographer by training and survey methodologist by practice. His significant research experience includes the substantive topics of energy, hospitals and health care, and substance use and mental health. Mr. Murphy is also a proficient SAS programmer, experienced in the analysis and manipulation of large, complex data sets. @joejohnmurphy
1. Twitter is like a giant opt-in survey with one question.
Twitter started in 2006 with a simple prompt for its users: “what are you doing?” From a survey methodologist’s perspective, this isn’t really optimal question design. How people actually use Twitter is so varied, there might as well be no question at all. We aren’t used to working with answers to a question no one asked, and Twitter is a good example of what has been described as "organic data" – it just appears without our having designed for it. Tweets are limited to 140 characters in length. Pretty short, but a Tweet can capture a lot of information, and include links to other websites, photos, videos, and conversations.
2. Twitter is massive.
Every day, half a billion Tweets are posted. Half a billion! That means by the time you finish reading this, there will be approximately one million new Tweets. And the pace is only growing. With Twitter’s application programming interface (API) you can pull from a random 1% of Tweets. To get at all Tweets, or the Firehose (100% of Tweets), you need to go through one of a few vendors and pay a fee, though the Library of Congress is working on providing access in the future.
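To put that volume in perspective, here is a quick back-of-the-envelope calculation in Python. It takes the half-billion-a-day figure above at face value and assumes Tweets arrive at a constant rate around the clock, which is a simplification. The variable and function names are just illustrative.

```python
# Back-of-the-envelope arithmetic for Tweet volume.
# Assumption: 500 million Tweets per day, arriving at a constant rate.
SECONDS_PER_DAY = 24 * 60 * 60
tweets_per_day = 500_000_000


def tweets_per_second(per_day: int) -> float:
    """Average number of Tweets posted each second."""
    return per_day / SECONDS_PER_DAY


def seconds_to_reach(target: int, per_day: int) -> float:
    """Seconds needed for `target` new Tweets to accumulate."""
    return target / tweets_per_second(per_day)


rate = tweets_per_second(tweets_per_day)
minutes_for_million = seconds_to_reach(1_000_000, tweets_per_day) / 60

# A 1% random sample of one day's Tweets (integer arithmetic).
sample_per_day = tweets_per_day // 100

print(f"About {rate:,.0f} Tweets per second")
print(f"One million new Tweets every {minutes_for_million:.2f} minutes")
print(f"A 1% sample is {sample_per_day:,} Tweets per day")
```

Under these assumptions, a million new Tweets pile up in a little under three minutes, and even the 1% sample comes to five million Tweets a day.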
3. Twitter is increasingly popular on mobile devices like smartphones and tablets.
You’ll see people tweeting at events, as news is happening right in front of them, or where you don’t really expect or want to see them tweeting, like while they’re driving. Many use Twitter on mobile devices with another screen on at the same time. That’s called multiscreening, like when people tweet while watching television in a backchannel discussion with friends and fans of their favourite shows.
4. The user-base is large, but it doesn’t exactly reflect the general population.
It would be kind of weird if it did, honestly. There are surely many factors that influence the likelihood of adoption, and wouldn’t it be surprising if we saw no differences by demographics? The Pew Research Center estimates 16% of online Americans now use Twitter, and about half of those do so on a typical day. Users are younger, more urban, and disproportionately black non-Hispanic compared to the general population. This is interesting when thinking about new approaches for sometimes hard-to-reach populations.
5. It is made up of more than just people.
Twitter is not cleanly defined with one account per person or even just one person behind every account. Some people have multiple accounts and some accounts are inactive. Groups and organizations use Twitter to promote products and inform followers. They can purchase “promoted Tweets” that show up in users’ streams like a commercial. And watch out for robots! Some software applications run automated tasks to query or Retweet content, making it extra challenging when trying to interpret the data.
6. There are research applications beyond trying to supplant survey estimates.
Think about the survey lifecycle and where there may be needs for a large, cheap, timely source of data on behaviours and opinions or a standing network of users to provide information. In the design phase of a survey, can we use Twitter to help identify items to include? Can we identify and recruit subjects for a study using Twitter? How about a diary study when we need a more continuous data collection and want to let people work with a system they know instead of trying to train them to do something unfamiliar? Can Twitter be used to disseminate study results? What about network analysis? Is there information that can be gleaned from someone’s network of friends and followers, or the spread of tweets from one (or few) users to many? We often think of public opinion as characterizing sentiment at a specific place and time, but are there insights to be had from Twitter on opinion formation and influence?
7. Twitter is cheap and fast, but making sense of it may not be.
What’s the unit of analysis? Can we apply or adapt the total survey error framework when looking at Twitter? What does it mean when someone tweets as opposed to giving a response in a survey? Beyond demographics, how do Twitter users differ from other populations? How can we account for Twitter’s exponential growth when analysing the data? The best answer to each right now is “it depends” or “more research is needed.” We need a more solid understanding and some common metrics as we look to use Twitter for research. Work on this front is beginning but has a long way to go.
8. Naïve and general text mining methods for tweets can be severely lacking in quality.
The brevity of tweets, inclusion of misnomers, misspellings, slang, and sarcasm make sentiment analysis a real challenge. We’ve found the off-the-shelf systems pretty bad and inconsistent when coding sentiment on tweets. If you’re going to do automated sentiment analysis, be sure to account for nuances of your topic or population as much as possible and have a human coding component for validation. One approach we’ve found to be promising is to use crowdsourcing for human coding of tweet content.
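One way to make the human validation step concrete is to compare automated sentiment codes against human codes on the same tweets and compute a chance-corrected agreement score. The sketch below uses Cohen's kappa, which is our choice for illustration rather than a method named above. The labels are made-up toy data, and the function name is hypothetical.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement between two coders on the same items."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(codes_a)
    # Share of items on which the two coders gave the same label.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy data (invented for illustration): sentiment codes for eight tweets.
human = ["pos", "neg", "neu", "neg", "pos", "neu", "neg", "pos"]
machine = ["pos", "pos", "neu", "neg", "neu", "neu", "pos", "pos"]

kappa = cohens_kappa(human, machine)
print(f"Cohen's kappa, human vs. automated: {kappa:.2f}")
```

In this toy example the two sets of codes match on five of eight tweets, but kappa comes out at about 0.44 once chance agreement is removed. A low score on a validation sample like this is a signal to tune the automated coding for your topic before trusting it on the full data.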
9. Beware of the curse of Big Data and the file cabinet effect.
Searching for patterns in trillions of data points, you’re bound to find coincidences with no predictive power or that can’t be replicated. The file cabinet effect is when researchers publish exciting results about Twitter but hide away their null or negative findings.
10. Surveys aren’t perfect either.
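A small simulation shows how easily this happens. The sketch below is purely illustrative: every series is random noise, the sizes and seed are arbitrary choices, and nothing here comes from real Twitter data. It screens 2,000 unrelated "tweet volume" series against a random outcome, picks the best match on the first 20 days, then checks that same series on the next 20 days.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(2013)  # fixed seed so the run is repeatable
N_DAYS = 40       # days of data per series (arbitrary)
N_SERIES = 2000   # number of unrelated candidate series (arbitrary)
HALF = N_DAYS // 2

# The outcome and every candidate are independent noise: no true relationship.
outcome = [rng.gauss(0, 1) for _ in range(N_DAYS)]
candidates = [[rng.gauss(0, 1) for _ in range(N_DAYS)] for _ in range(N_SERIES)]


def first_half_r(series):
    """Absolute correlation with the outcome over the first 20 days."""
    return abs(pearson(series[:HALF], outcome[:HALF]))


# "Discover" the strongest pattern in the first half of the data...
best_series = max(candidates, key=first_half_r)
best_in_sample = first_half_r(best_series)

# ...then see whether it replicates in the second half.
holdout = pearson(best_series[HALF:], outcome[HALF:])

print(f"Best |r| found in first half: {best_in_sample:.2f}")
print(f"Same series in second half:   {holdout:.2f}")
```

With this many candidates, the best first-half correlation will look impressive even though it is pure coincidence, and there is no reason to expect it to hold up in the second half. Holding back data for replication is one simple guard against this.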
Surveys are getting harder to complete with issues like declining response rates and reduced landline coverage. Twitter isn’t a fix-all but it may be able to fill some gaps. It’ll take some focused study and creative thinking to get there.