Detecting political tweets based on hashtags (Conover et al. propose a single iteration; could multiple iterations improve it?):
- start by labeling one popular/predictive hashtag from each camp.
- label new hashtags if they co-occur with already labeled hashtags above a threshold rate (a new hashtag need not belong to the same camp as the hashtag it co-occurs with).
- manually remove the false positives.
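The expansion step above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the co-occurrence rate is measured as the Jaccard coefficient between the sets of tweets containing each hashtag, and the function name `expand_hashtags`, the `threshold` default, and the `max_iters` parameter are hypothetical. Setting `max_iters` above 1 gives the multi-iteration variant asked about in the heading.

```python
from collections import defaultdict

def expand_hashtags(tweets, seeds, threshold=0.5, max_iters=1):
    """Grow a set of labeled hashtags by co-occurrence.

    tweets: list of sets of hashtags, one set per tweet.
    seeds:  initially labeled hashtags (e.g. one per camp).
    A candidate is labeled when its Jaccard overlap with any
    already labeled hashtag reaches `threshold` (an assumed measure).
    """
    # Inverted index: hashtag -> ids of the tweets that contain it.
    index = defaultdict(set)
    for tweet_id, tags in enumerate(tweets):
        for tag in tags:
            index[tag].add(tweet_id)

    labeled = set(seeds)
    for _ in range(max_iters):
        new = set()
        for cand, cand_ids in index.items():
            if cand in labeled:
                continue
            for known in labeled:
                known_ids = index.get(known, set())
                union = cand_ids | known_ids
                if union and len(cand_ids & known_ids) / len(union) >= threshold:
                    new.add(cand)
                    break
        if not new:  # nothing left to add, stop early
            break
        labeled |= new
    return labeled
```

With more than one iteration, hashtags that only co-occur with hashtags added in an earlier round can also be pulled in, so the manual false-positive check matters more as `max_iters` grows.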
Constructing communication networks:
- vertices are tweeters of the political hashtags detected above.
- mention edge weights: number of mentions between the two users.
- retweet edge weights: number of retweets between the two users.
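A rough sketch of the network construction described above. The tweet record layout (`user`, `hashtags`, `retweet_of`, `mentions`) and the function name `build_networks` are assumptions made for illustration; edges are treated as undirected because the notes speak of counts "between the two users", and only users who tweeted a political hashtag are kept as vertices.

```python
from collections import Counter

def build_networks(tweets, political_tags):
    """Return (users, retweet_weights, mention_weights).

    tweets: list of dicts with hypothetical keys
            'user', 'hashtags', and optionally 'retweet_of', 'mentions'.
    Edge keys are sorted (u, v) tuples, so each pair is undirected.
    """
    political_tags = set(political_tags)
    # Keep only tweets that carry at least one political hashtag.
    political = [t for t in tweets if political_tags & set(t["hashtags"])]
    # Vertices: authors of those tweets.
    users = {t["user"] for t in political}

    retweets, mentions = Counter(), Counter()
    for t in political:
        src = t["user"]
        target = t.get("retweet_of")
        if target in users and target != src:
            retweets[tuple(sorted((src, target)))] += 1
        for m in t.get("mentions", ()):
            if m in users and m != src:
                mentions[tuple(sorted((src, m)))] += 1
    return users, retweets, mentions
```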
Clustering communication networks:
- starting with the retweet network constructed above, apply Newman’s modularity-based clustering algorithm.
- cluster by the label propagation method (Raghavan, 2007): iteratively assign each node the label shared by most of its neighbors. (I don’t understand why this step is needed.)
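To make the label propagation rule concrete, here is a small weighted sketch. It is an illustration under assumptions, not the paper's implementation: `init_labels` is a hypothetical parameter that lets the propagation start from an existing partition (for example, the output of the modularity step), nodes without an initial label start with their own name as a unique label, and ties are resolved by keeping the current label when it is among the best, otherwise by a random choice.

```python
import random
from collections import defaultdict

def label_propagation(edges, init_labels=None, max_iters=100, seed=0):
    """edges: dict mapping (u, v) -> weight for an undirected graph."""
    rng = random.Random(seed)
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0) + w
        adj[v][u] = adj[v].get(u, 0) + w

    init_labels = init_labels or {}
    labels = {n: init_labels.get(n, n) for n in adj}
    nodes = list(adj)

    for _ in range(max_iters):
        rng.shuffle(nodes)  # asynchronous updates in random order
        changed = False
        for n in nodes:
            # Sum edge weights per neighboring label.
            votes = defaultdict(float)
            for nb, w in adj[n].items():
                votes[labels[nb]] += w
            best = max(votes.values())
            top = [lab for lab, score in votes.items() if score == best]
            if labels[n] not in top:
                labels[n] = rng.choice(top)
                changed = True
        if not changed:  # every node already holds a majority label
            break
    return labels
```

In the test case used to check this sketch, two tightly linked triangles joined by one weak edge start from a partition that puts one node in the wrong group, and propagation moves that node back to its own triangle.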
Mentions form a communication bridge across which information flows between ideologically opposed users, whereas people with similar ideologies, especially propagandists, tend to retweet only each other’s messages:
- First, label one known popular user from each camp.
- At each iteration, relabel each user with argmax(assoc_1, …, assoc_n), where assoc_i is the fraction of the user’s retweet connections (users they retweeted or/∪ users who retweeted them) that belong to camp_i. Stop after some iterations.
- If at least a fraction f of a user’s connections are to users in the same cluster, then the user is a hyperadvocate; otherwise, the user is neutral.
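The three steps above can be sketched as below. Several choices are assumptions, since the notes leave them open: a user's connections are taken as the union of users they retweeted and users who retweeted them, seed users keep their labels, updates are synchronous, ties go to the alphabetically first camp, and the names `propagate_camps`, `classify`, `iters`, and the default for `f` are hypothetical.

```python
from collections import Counter, defaultdict

def propagate_camps(retweets, seeds, iters=5):
    """retweets: list of (retweeter, retweeted) pairs.
    seeds: dict user -> camp for the initially labeled users."""
    neighbors = defaultdict(set)
    for a, b in retweets:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    labels = dict(seeds)
    for _ in range(iters):
        updated = dict(labels)
        for user, nbrs in neighbors.items():
            if user in seeds:  # assumption: seeds stay fixed
                continue
            counts = Counter(labels[n] for n in nbrs if n in labels)
            if counts:
                # argmax over camps; the fractions share one denominator,
                # so comparing raw counts gives the same winner.
                updated[user] = max(sorted(counts), key=counts.__getitem__)
        if updated == labels:
            break
        labels = updated
    return labels, neighbors

def classify(labels, neighbors, f=0.8):
    """Mark a user 'hyperadvocate' if at least a fraction f of their
    connections share their camp, else 'neutral'."""
    roles = {}
    for user, camp in labels.items():
        nbrs = neighbors.get(user, ())
        if not nbrs:
            continue
        same = sum(1 for n in nbrs if labels.get(n) == camp)
        roles[user] = "hyperadvocate" if same / len(nbrs) >= f else "neutral"
    return roles
```

A user who retweets both seeds equally ends up with an arbitrary camp label under the tie rule, but the fraction test then flags them as neutral, which is the behavior the last bullet describes.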