Learning about people’s interests is hard. Traditionally, this has been done with surveys, which work well when you have the audience and the resources to run them.
More recently, however, social media has become a public source of information about what people are interested in. Social listening has three dimensions: the audience, the topic and the time frame. In this case, we are interested in the audience of VCs in the US, and we want to know all the topics they have been interested in over the past year.
To better define the audience, we use Contexto.io to help us find the most relevant community for our topic. With Contexto we can build a network of all the accounts of VCs in the US. From this network, we can download the content they generate. In this case we are interested in the links they share to news or blog posts, since they are a great source of content. We download these links from Contexto and upload them to Graphext.
We used web scraping to download the full articles from the links, and then applied NLP techniques to build a network of the articles. This network allows us to identify clusters of articles that are about similar topics.
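To make the idea of an article network more concrete, here is a minimal sketch in Python. It is not how Graphext builds its network (that is not described here); it assumes a simple stand-in technique: TF-IDF vectors, cosine similarity, and connected components of the resulting similarity graph as clusters. The function names, the threshold value and the sample texts are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    # Terms that appear in every document get weight log(1) = 0.
    return [{t: tf * math.log(n / doc_freq[t]) for t, tf in c.items()}
            for c in counts]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_articles(docs, threshold=0.15):
    """Link articles whose similarity is at least `threshold` (an assumed,
    hand-picked value) and return the connected components as clusters."""
    vectors = tfidf_vectors(docs)
    n = len(docs)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(component))
    return clusters


# Made-up headlines standing in for scraped article text.
sample_articles = [
    "Trump signs immigration order affecting tech workers",
    "Tech leaders criticize Trump immigration order",
    "Women founders raise venture funding",
    "Venture firms add women partners and founders",
]
print(cluster_articles(sample_articles))
```

On real articles, the threshold and the tokenizer would need tuning, and a dedicated community-detection algorithm would likely separate topics better than plain connected components.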
In the left panel, the top shows the significant terms across all the articles, and the bottom shows the significant terms of the particular clusters that Graphext has identified automatically. Let’s explore some of these clusters. We can see that the green one is about Trump, and next to it there is a small cluster about “trump”, “tech leader” and “immigration”. We can click on them to see what they are about, along with their metrics.
Graphext has found that these two clusters are very similar, but that they cover two sub-narratives of the main Trump narrative. We can also see how they evolve over time in the “Stream” section. From the analysis, we see that the immigration cluster peaked in January and February 2017, while the cluster about Trump as president also has peaks in May 2017, with the Trump-Russia investigation.
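The kind of timeline described above can be approximated by counting articles per cluster and per month. The sketch below is only an illustration of that idea, not Graphext's "Stream" implementation; the function names, cluster labels and dates are invented toy data.

```python
from collections import Counter, defaultdict
from datetime import date


def monthly_counts(articles):
    """Count articles per cluster and per month.

    `articles` is an iterable of (cluster_label, publication_date) pairs.
    """
    counts = defaultdict(Counter)
    for cluster, published in articles:
        counts[cluster][published.strftime("%Y-%m")] += 1
    return counts


def peak_months(cluster_counts):
    """Return the month(s) with the highest article count for one cluster."""
    top = max(cluster_counts.values())
    return sorted(m for m, c in cluster_counts.items() if c == top)


# Toy data, invented for illustration only.
toy_articles = [
    ("immigration", date(2017, 1, 28)),
    ("immigration", date(2017, 1, 30)),
    ("immigration", date(2017, 2, 3)),
    ("president", date(2017, 3, 2)),
    ("president", date(2017, 5, 10)),
    ("president", date(2017, 5, 17)),
]

counts = monthly_counts(toy_articles)
for cluster in sorted(counts):
    print(cluster, peak_months(counts[cluster]))
```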
Another interesting cluster covers the topic of women in tech and in VC firms.
Graphext provides the option of “drilling down” into clusters to find sub-narratives. In this case, the sub-narratives found are about “Diversity in companies”, “Sexual Harassment” and “Females as founders and partners in VCs”. These sub-clusters are clearly very different, but all of them relate to the central theme of “women”.
We can also compare these clusters. For example, comparing the main red cluster about women with the main green cluster about Trump, we can see which media outlets are linked most often for each topic. The links in the Trump cluster come mainly from The New York Times, Bloomberg and Twitter, whereas the links in the women cluster come mainly from Medium, TechCrunch and Recode.
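A comparison like this one comes down to counting the domains of the links in each cluster. Here is a minimal sketch using only the Python standard library; the helper names and the example URLs are placeholders invented for illustration, not links from the actual dataset.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the host of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def top_domains(urls, k=3):
    """Return the k most frequently linked domains with their counts."""
    return Counter(domain_of(u) for u in urls).most_common(k)


# Placeholder URLs standing in for the links shared within one cluster.
example_links = [
    "https://www.nytimes.com/example-a",
    "https://nytimes.com/example-b",
    "https://www.bloomberg.com/example-c",
]
print(top_domains(example_links, k=2))
```

Running `top_domains` once per cluster and putting the results side by side gives the kind of comparison described above.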