Detecting political tweets based on hashtags (Conover et al. propose a single iteration; can multiple iterations improve on it?):
- start by labeling one popular/predictive hashtag from each camp manually.
- label new hashtags if they co-occur with the already labeled hashtags
- remove hashtags whose co-occurrence falls below a threshold value, and manually remove the false positives.
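The expansion steps above can be sketched as follows. This is a minimal illustration, not the procedure from Conover et al.: the function name `expand_hashtags`, the parameter `min_cooccurrence`, and the use of a raw co-occurrence count as the threshold are all assumptions made for the example. Setting `iterations` above 1 corresponds to the multi-iteration variant raised in the heading. The manual removal of false positives is not automated here.

```python
from collections import Counter

def expand_hashtags(tweets, seeds, min_cooccurrence=2, iterations=1):
    """Grow a set of labeled hashtags by co-occurrence.

    tweets: list of collections of hashtags, one per tweet (hypothetical input format).
    seeds: manually labeled starting hashtags.
    min_cooccurrence: assumed threshold, a raw count of co-occurrences.
    """
    labeled = set(seeds)
    for _ in range(iterations):
        counts = Counter()
        for tags in tweets:
            tags = set(tags)
            # Count unlabeled hashtags that appear next to a labeled one.
            if tags & labeled:
                counts.update(tags - labeled)
        # Keep only candidates at or above the threshold.
        new = {tag for tag, c in counts.items() if c >= min_cooccurrence}
        if not new:
            break
        labeled |= new
    return labeled
```

With one iteration only hashtags adjacent to the seeds are added; further iterations also pick up hashtags that co-occur with the newly labeled ones, which widens coverage but may admit more false positives.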
Constructing communication networks:
- vertices of this network are tweeters of the political hashtags detected above.
- mention edge weights: number of mentions between the two users.
- retweet edge weights: number of retweets between the two users.
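A minimal sketch of the network construction, assuming each mention or retweet is available as a `(source_user, target_user)` pair. The function name `build_network` and the input format are hypothetical. Edges are treated as undirected here because the weights are described as counts "between the two users"; a directed variant would key on the ordered pair instead.

```python
from collections import Counter

def build_network(interactions, political_users):
    """Return undirected edge weights for a mention or retweet network.

    interactions: iterable of (source_user, target_user) pairs, one per event.
    political_users: users who tweeted a detected political hashtag (the vertices).
    """
    weights = Counter()
    for src, dst in interactions:
        if src == dst:
            continue  # ignore self-mentions / self-retweets (an assumption)
        # Keep only edges whose endpoints are both vertices of the network.
        if src in political_users and dst in political_users:
            weights[frozenset((src, dst))] += 1
    return weights
```

Calling the function once with the mention pairs and once with the retweet pairs yields the two networks.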
Clustering communication networks:
- starting with the retweet network constructed above, apply Newman’s modularity-based clustering algorithm.
- cluster by the label propagation method (Raghavan et al., 2007): iteratively assign each node the label shared by most of its neighbors.
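The label propagation step can be sketched in plain Python as below. This is an unweighted, simplified version for illustration: the function name, the `max_sweeps` cap, and the rule of keeping the current label on ties are choices made for this example. It starts from unique labels rather than from a modularity-based assignment, and it ignores edge weights.

```python
import random
from collections import Counter

def label_propagation(adjacency, max_sweeps=100, seed=0):
    """adjacency: dict mapping each node to a list of its neighbors."""
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}  # every node starts in its own cluster
    nodes = list(adjacency)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # visit nodes in random order each sweep
        changed = False
        for node in nodes:
            if not adjacency[node]:
                continue
            counts = Counter(labels[nb] for nb in adjacency[node])
            top = max(counts.values())
            best = sorted(lbl for lbl, c in counts.items() if c == top)
            if labels[node] in best:
                continue  # current label is already a majority label
            labels[node] = rng.choice(best)  # break ties at random
            changed = True
        if not changed:
            break  # no node changed its label during a full sweep
    return labels
```

Nodes that end with the same label form one cluster. Because of the random visiting order and tie-breaking, different seeds can give different partitions on ambiguous graphs.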
Mentions form a communication bridge across which information flows between ideologically opposed users, whereas people with similar ideologies, especially propagandists, tend to retweet each other’s messages exclusively:
- First, label one known popular user from each camp.
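- At each iteration, relabel each user by argmax(assoc_1, …, assoc_n), where assoc_i is the fraction of the user's retweet connections (users they retweeted, users who retweeted them, or the union of both) that belong to camp_i. Stop after a set number of iterations.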
- If at least a fraction f of a user's connections are to users in the same cluster, then the user is a hyperadvocate; otherwise, the user is neutral.
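The three steps above might look like the following sketch. Several details are assumptions made for the example: the names `propagate_camps` and `classify`, the use of the union of "retweeted" and "retweeted by" as a user's connections, keeping the seed users fixed, synchronous updates, alphabetical tie-breaking, and the default values of `iterations` and `f`.

```python
from collections import Counter

def propagate_camps(retweet_neighbors, seeds, iterations=5):
    """retweet_neighbors: user -> set of users they retweeted or were retweeted by.
    seeds: user -> camp, for the manually labeled users."""
    labels = dict(seeds)
    for _ in range(iterations):
        updated = dict(labels)
        for user, nbrs in retweet_neighbors.items():
            if user in seeds:
                continue  # seed labels stay fixed (an assumption)
            counts = Counter(labels[n] for n in nbrs if n in labels)
            if counts:
                # All ratios share one denominator, so the camp with the
                # largest count is also the argmax of the ratios.
                updated[user] = max(sorted(counts), key=counts.get)
        labels = updated
    return labels

def classify(user, retweet_neighbors, labels, f=0.8):
    """Return 'hyperadvocate' if at least a fraction f of the user's
    connections share the user's camp, else 'neutral'."""
    nbrs = retweet_neighbors[user]
    if user not in labels or not nbrs:
        return "neutral"
    same = sum(1 for n in nbrs if labels.get(n) == labels[user])
    return "hyperadvocate" if same / len(nbrs) >= f else "neutral"
```

A user whose connections are split evenly between two camps gets a label from the tie-break but is classified as neutral for any f above 0.5.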