Analyzing Medium posts to understand impact of Cambridge Analytica Scandal

Has there been a focus shift towards privacy since the scandal?

When the Cambridge Analytica-Facebook scandal emerged, articles about technology companies misusing user data were ubiquitous. The privacy issues raised made me want to understand how the scandal affected people’s views of Facebook and how their perception has changed since.

The Scandal — A brief outline

This analysis considers 27,317 posts published in English between Jan 1st 2016 and Jul 31st 2018 that had a “Facebook” tag and were not responses. The overall distribution of posts (see below) shows a huge spike between March and April 2018, which coincides with the time the scandal broke.
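As a rough illustration of the selection criteria above, the sketch below filters a handful of made-up post records and counts them per month. The record layout and field names (`published`, `language`, `tags`, `is_response`) are assumptions made for the example, not Medium's actual schema.

```python
from collections import Counter
from datetime import date

# Hypothetical post records; field names are assumptions, not Medium's schema.
posts = [
    {"published": date(2018, 3, 20), "language": "en", "tags": ["Facebook", "Privacy"], "is_response": False},
    {"published": date(2018, 3, 25), "language": "en", "tags": ["Facebook"], "is_response": False},
    {"published": date(2017, 6, 2), "language": "en", "tags": ["Facebook", "Marketing"], "is_response": False},
    {"published": date(2018, 4, 1), "language": "en", "tags": ["Facebook"], "is_response": True},
    {"published": date(2018, 4, 3), "language": "es", "tags": ["Facebook"], "is_response": False},
]

# Study window stated in the text.
START, END = date(2016, 1, 1), date(2018, 7, 31)


def in_scope(post):
    """English, tagged Facebook, not a response, inside the study window."""
    return (
        post["language"] == "en"
        and "Facebook" in post["tags"]
        and not post["is_response"]
        and START <= post["published"] <= END
    )


def posts_per_month(posts):
    """Count in-scope posts per calendar month, keyed as 'YYYY-MM'."""
    return Counter(p["published"].strftime("%Y-%m") for p in posts if in_scope(p))


monthly = posts_per_month(posts)
for month in sorted(monthly):
    print(month, monthly[month])
```

Plotting these monthly counts over the full dataset is one way a spike like the March to April 2018 one would show up.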

The scandal led to a serious debate on privacy and the misuse of user data. I wanted to check whether the scandal was a wake-up call about privacy issues on social media platforms like Facebook, which requires understanding what each post is about.

On Medium, one way to do this is to use the Tags associated with posts. Tags are topic identifiers on Medium. They allow users to retrieve all stories of a particular topic.

What the data said was in line with my expectation: pre-scandal, four of the top 10 tags associated with Facebook were related to advertising and marketing, one of Facebook’s primary sources of revenue. Between Jan 2016 and March 16th 2018, only 481 posts carried the “Privacy” tag; between March 17th and July 31st 2018 there were 1005 such posts.
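Because the two periods differ greatly in length, the raw counts understate the change. The sketch below converts them to posts per day, assuming the first period starts on Jan 1st 2016 (the start of the dataset) and counting both end dates inclusively. Under those assumptions the rate goes from about 0.6 to about 7.3 posts per day, roughly a twelvefold increase.

```python
from datetime import date


def posts_per_day(n_posts, start, end):
    """Average posts per day over an inclusive date range."""
    days = (end - start).days + 1
    return n_posts / days


# Counts come from the text; the Jan 1st start date is an assumption.
pre = posts_per_day(481, date(2016, 1, 1), date(2018, 3, 16))
post = posts_per_day(1005, date(2018, 3, 17), date(2018, 7, 31))

print(f"pre-scandal:  {pre:.2f} posts/day")
print(f"post-scandal: {post:.2f} posts/day")
print(f"ratio:        {post / pre:.1f}x")
```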

Distribution of Posts Tagged Privacy or Data on Medium

From the images above, it looks like the scandal was a wake-up call and that people had been unaware of, or negligent about, how their data was used and its potential impact. The number of posts per day related to Privacy is higher than it was before the scandal.

We saw above that the number of posts tagged Privacy or Data increased after the scandal, but that count says nothing about the central idea or context of each post. To get at this, we extracted the content of every post with Scrapy by passing it the post’s URL. Medium does not directly provide the link to a post, so we concatenated “https://medium.com/s/story/” and the post’s unique slug to get the URL.
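The URL construction described above can be sketched as follows. The slug shown is a made-up placeholder and `story_url` is a hypothetical helper name; only the base URL comes from the text.

```python
BASE_URL = "https://medium.com/s/story/"


def story_url(slug):
    """Join the base URL and a post slug, tolerating stray spaces or slashes."""
    return BASE_URL + slug.strip().lstrip("/")


# Hypothetical slug for illustration only.
slugs = ["example-post-title-1a2b3c4d5e6f"]
urls = [story_url(s) for s in slugs]
print(urls[0])
```

A list like `urls` could then be handed to a Scrapy spider, for example through its `start_urls` attribute.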

To understand the themes of posts with “Privacy” or “Data” tags, we calculated the TF-IDF matrix and clustered the documents using K-Means based on cosine distance. TF-IDF indicates how important a word is to a document within a collection of documents. Term Frequency (tf) counts the occurrences of a word in a document. Inverse Document Frequency (idf) measures how rare a word is across the entire corpus: a word that occurs in every document is neither rare nor significant, so it receives a low weight. Using this technique, we identified 12 clusters.
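To make the weighting concrete, here is a minimal pure-Python sketch of TF-IDF and cosine similarity on three toy documents. It uses whitespace tokenization and the plain `log(N / df)` form of idf, both of which are assumptions; the tokenizer and idf variant used in the actual analysis are not stated.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, scaled to unit length."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Plain idf = log(N / df); a term found in every document gets weight 0.
        vec = {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_similarity(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


# Toy documents for illustration only.
docs = [
    "facebook privacy data breach",
    "facebook gdpr privacy regulation",
    "facebook whatsapp encryption",
]
vectors = tfidf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

Here "facebook" appears in every document and so contributes nothing, while the shared word "privacy" makes the first two documents more similar to each other than to the third. Cosine distance is one minus this similarity, and for unit-length vectors the squared Euclidean distance is exactly twice the cosine distance, which is why standard K-Means on normalized TF-IDF vectors is a common stand-in for clustering by cosine distance.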

Once the clusters were identified, we needed to understand the theme of each, so for each cluster we took the top 10 tags, excluding “Facebook”, “Privacy” and “Data”. A few cluster themes stood out. Cluster 9, which had the majority of the posts, was about Social Media, Technology and Data Science and their influence on Politics. Cluster 0 was about the effect of the scandal and how Facebook’s advertising model was affecting privacy. The introduction of GDPR in the European Union was the central theme of Cluster 2. Cluster 7 had posts related to the data breach and the scandal, and Cluster 3 had posts related to Zuckerberg’s testimony. We also identified a cluster that had “Eid100” as a tag; this tag was about gaining working knowledge of digital technologies. Cluster 6 was about Privacy in newer technologies like Augmented Reality and Artificial Intelligence. Cluster 8 was about Privacy issues in WhatsApp and its encryption. Across clusters, every cluster except Cluster 6 had “Cambridge Analytica” in its top 10 tags. This is further evidence that we as users had been negligent towards Privacy until the scandal.
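The tag summary per cluster can be sketched with a counter per cluster label. The records below are invented and the field names (`cluster`, `tags`) are assumptions; only the three excluded tags and the top-10 cutoff come from the text.

```python
from collections import Counter, defaultdict

# Tags shared by nearly every post in the sample, so they are left out.
EXCLUDED = {"Facebook", "Privacy", "Data"}


def top_tags_by_cluster(posts, n=10):
    """Map each cluster label to its n most frequent tags, minus EXCLUDED."""
    counts = defaultdict(Counter)
    for post in posts:
        counts[post["cluster"]].update(t for t in post["tags"] if t not in EXCLUDED)
    return {c: [tag for tag, _ in counter.most_common(n)] for c, counter in counts.items()}


# Hypothetical clustered posts for illustration only.
posts = [
    {"cluster": 2, "tags": ["Facebook", "Privacy", "GDPR", "Europe"]},
    {"cluster": 2, "tags": ["Data", "GDPR", "Cambridge Analytica"]},
    {"cluster": 8, "tags": ["Facebook", "WhatsApp", "Encryption"]},
]
top = top_tags_by_cluster(posts)
for cluster in sorted(top):
    print(cluster, top[cluster])
```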

Tags Associated with Different Clusters

We looked at the different themes that emerged from posts related to Privacy or Data, and we saw an increase in the number of such posts after the scandal. It is now time to look at how user engagement (claps, recommends) has changed across these themes: what are users more interested in reading, and how does it differ before and after the scandal?

Cluster 0, which was about how Facebook’s advertising model was affecting privacy, has been losing user engagement. The introduction of GDPR (Cluster 2) has drawn high recommends and claps since the scandal. Cluster 8, which was about WhatsApp and its encryption, saw an increase in the number of claps and recommends post-scandal, even though 77% of its posts were published before the scandal. Clusters 3 and 7 were predominantly post-scandal. Overall, user interest in Privacy spiked after the scandal.
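A comparison like this can be sketched by splitting each cluster's posts at the scandal date and averaging an engagement measure on each side. The records and clap counts below are invented, the field names are assumptions, and the use of a mean is a choice made for the example; the text does not say which aggregate was used.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Split date taken from the tag counts earlier in the text.
SCANDAL_DATE = date(2018, 3, 17)


def mean_claps_by_period(posts):
    """Average claps per (cluster, 'pre' or 'post') group."""
    grouped = defaultdict(list)
    for post in posts:
        period = "post" if post["published"] >= SCANDAL_DATE else "pre"
        grouped[(post["cluster"], period)].append(post["claps"])
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical posts for illustration only.
posts = [
    {"cluster": 8, "published": date(2017, 5, 1), "claps": 10},
    {"cluster": 8, "published": date(2017, 9, 12), "claps": 20},
    {"cluster": 8, "published": date(2018, 4, 2), "claps": 90},
    {"cluster": 0, "published": date(2017, 2, 3), "claps": 40},
    {"cluster": 0, "published": date(2018, 5, 6), "claps": 10},
]
engagement = mean_claps_by_period(posts)
for key in sorted(engagement):
    print(key, engagement[key])
```

The same grouping works for recommends by swapping the field that is averaged.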
