Monthly Archives: December 2013

Turkish Media in 2D (MDS & PCA Plots)

Multidimensional scaling (MDS) lets us visualize (dis)similarities in 2D while preserving the distances between the objects as closely as possible. The positions of the newspapers in the image below were generated with this technique, using the manifold.MDS class in scikit-learn. The label colors are the cluster colors obtained with the modularity measure, as described in a previous post.
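As a rough illustration of this step, here is a minimal sketch that feeds a precomputed distance matrix to manifold.MDS. The four outlet labels and the distance values are made up for the example and are not the study's data. The `dissimilarity="precomputed"` argument is how the scikit-learn releases assumed here accept a ready-made distance matrix; newer releases may rename it.

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical outlets and pairwise distances (0 = same audience, 1 = disjoint).
labels = ["A", "B", "C", "D"]
distances = np.array([
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
])

# "precomputed" tells MDS that the input is already a distance matrix,
# not a table of feature vectors.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(distances)

# One (x, y) position per outlet, ready to be drawn as a labeled scatter plot.
for name, (x, y) in zip(labels, coords):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

Outlets with a small pairwise distance (A and B here) should land near each other, which is what makes the 2D picture readable as a map of audiences.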

We also considered reducing the dimensionality with Principal Component Analysis (PCA). The resulting image is very similar to the MDS one and is also available here.
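A PCA projection could be sketched along the same lines. This example assumes each outlet is described by its row of similarities to every other outlet; the post does not say which representation was actually used, and the similarity values below are invented.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical similarity matrix: row i holds outlet i's similarity to each outlet.
similarity = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
])

# Treat each row as a feature vector and keep the two directions
# that capture the most variance.
pca = PCA(n_components=2)
coords = pca.fit_transform(similarity)

# Share of the total variance kept by each of the two retained components.
print(pca.explained_variance_ratio_)
```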

This visualization helps us see beyond the clusters. For example, along the X-axis, the newspapers on the right are right-wing while those on the left are more leftist. Moving from the center to the periphery, the more popular media sit close to the center while those at the periphery are more isolated or extreme.

Multidimensional Scaling applied to Turkish Media Follower Similarity on Twitter



Doctor Referral Network

[This study will be presented at the Sunbelt 2014 Conference in February 2014]

Examining the referral network characteristics of doctors and specialties may help us better understand the workflows in the healthcare system. In this study we reveal patterns as well as anomalies in a complex network. We call it complex because we are dealing with many networks at different scales. The referral network of a zip code area merges with others to form a city or county network, and county networks together form a state network. Moreover, the network of physicians is very different from that of specialties, where doctors grouped by the same specialty form a single vertex.
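To make the physician-to-specialty collapse concrete, here is a minimal sketch. The physician IDs, the specialty lookup and the patient counts are hypothetical, and summing patients per specialty pair is an assumed aggregation rule rather than the one confirmed for the study.

```python
from collections import defaultdict

# Hypothetical physician -> specialty lookup.
specialty_of = {
    "dr_a": "Cardiology",
    "dr_b": "Cardiology",
    "dr_c": "Radiology",
    "dr_d": "Internal Medicine",
}

# Hypothetical physician-level referrals: (from, to, number of patients).
physician_referrals = [
    ("dr_d", "dr_a", 5),
    ("dr_d", "dr_b", 3),
    ("dr_a", "dr_c", 4),
    ("dr_b", "dr_c", 2),
    ("dr_a", "dr_b", 1),
]

def collapse_by_specialty(referrals, specialty_of):
    """Merge physicians of the same specialty into one vertex, summing edge weights."""
    edges = defaultdict(int)
    for src, dst, patients in referrals:
        edges[(specialty_of[src], specialty_of[dst])] += patients
    return dict(edges)

specialty_edges = collapse_by_specialty(physician_referrals, specialty_of)
for (src, dst), patients in sorted(specialty_edges.items()):
    print(f"{src} -> {dst}: {patients}")
```

Referrals between two physicians of the same specialty become a self-loop in this sketch (Cardiology to Cardiology above). Whether to keep or drop such loops is a modeling choice.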

In the first phase of this project, weighted directed graphs of specialties were generated. The graph below was generated from National Claims History (NCH) records using Gephi with the Force Atlas 2 layout.

Edges curve clockwise to show direction, node size indicates the number of distinct specialties a node refers to, edge width is the patient traffic load, and node color denotes the total number of patients referred to that node:

Specialty Referral Network of a County
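The node attributes behind such a drawing could be computed roughly as follows. The edge list is hypothetical, and reading the color attribute as total incoming patients is an assumption about the figure's encoding.

```python
from collections import defaultdict

# Hypothetical weighted directed edges: (from specialty, to specialty) -> patients.
specialty_edges = {
    ("Internal Medicine", "Cardiology"): 8,
    ("Internal Medicine", "Radiology"): 2,
    ("Cardiology", "Radiology"): 6,
}

def node_metrics(edges):
    """Per node: distinct specialties referred to (size) and patients received (color)."""
    targets = defaultdict(set)
    incoming = defaultdict(int)
    for (src, dst), patients in edges.items():
        targets[src].add(dst)
        incoming[dst] += patients
    nodes = sorted(set(targets) | set(incoming))
    return {
        node: {
            "distinct_targets": len(targets[node]),
            "patients_in": incoming[node],
        }
        for node in nodes
    }

metrics = node_metrics(specialty_edges)
for node, values in metrics.items():
    print(node, values)
```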

Turkish Media Analysis Exploiting Twitter

[UPDATE on 1/11/2014]: Meeting the first item in the future work list below: Groups & Media in Turkey.

This study examines the relationship between the followers of Turkish media and their ideological and political positioning, using Twitter, a popular micro-blogging and social networking platform. Traditional media, i.e. newspapers, magazines and television channels, have accounts/profiles/pages on social media where they post the title or a short excerpt of a news item along with a link to the page with the full article/story. Therefore, the audience of a traditional news source is expected to resemble its followers on social media. Based on this assumption, we examine the relative positioning of traditional media, first by leveraging the characteristics of their followers. Second, we exploit the posts/status updates of the news sources.

We have just completed the first phase of the project. Here is an adjacency matrix diagram that visualizes the readership similarity/distance of Turkish media, built using the Twitter Followers API. We collected all of the follower IDs for 31 of the 35 Turkish media Twitter accounts; the other four newspapers hit the 1 million follower limit in our code. (Note: the Twitter API returns the most recent followers.) Here are the follower counts of Turkish media on Twitter:


Twitter Follower Counts of Turkish Media
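One way to turn collected follower IDs into a readership distance is the Jaccard distance between follower sets. The post does not name the measure that was used, so the choice of Jaccard, the outlet names and the follower IDs below are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical follower ID sets per outlet (real sets would hold millions of IDs).
followers = {
    "outlet_a": {1, 2, 3, 4},
    "outlet_b": {3, 4, 5},
    "outlet_c": {6, 7},
}

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; 0 means identical follower sets, 1 means no overlap."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)

# Pairwise distances, the kind of input an adjacency matrix diagram needs.
distance = {
    (x, y): jaccard_distance(followers[x], followers[y])
    for x, y in combinations(sorted(followers), 2)
}

for (x, y), d in distance.items():
    print(f"{x} vs {y}: {d:.2f}")
```

For accounts whose follower list was cut off at the collection limit, a set-based measure like this only reflects the retrieved (most recent) followers.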

Future Work:

According to the phenomenon of value homophily, people are inclined towards the worldviews of their sources of information. So, user profiling can be extended to other domains as well.

  • Common follower correlations in other dimensions such as politicians, celebrities, etc.
  • Hashtag/Text Similarity of newspapers/columnists/readers
  • First name, location, access (mobile devices/web), etc. distribution
  • Predicting readership from one’s old tweets
  • Visualization techniques for understanding the data and findings better