Why and how I deleted 4000+ connections from LinkedIn

The Procession of the Trojan Horse in Troy by Domenico Tiepolo (1773)

 

First things first: could someone remind me of the main goal (or goals) of using Linkedin as a professional social network? 

It seems undeniable to me that, while I had several thousand connections and followers, every time I opened LinkedIn to check what was going on (or to reply to that desperate student looking for advice or an internship), I could not fully process what was posted, or the sheer quantity of things that were of absolutely no interest to me personally.


In this post, I will explain the reasons that led me to delete a big chunk of my LinkedIn network, and how I defined and identified the connections that had to go.


As with Facebook, Instagram or even Twitter, every place has its own use.

Keeping in touch with close friends and family on Facebook, following beloved artists and brands on Instagram, and enjoying the occasional discussion with like-minded people on Twitter is, in my view, the appropriate way of using each of these social networks.

However, with LinkedIn it has always been a relationship of love and not-so-much-love. You may wonder why...

I think that the way I view and use LinkedIn is far different from the way other people use it.

As with any product, we may each have different ways of using it:

A car can drive you from point A to point B, and it can also have a couple of subwoofers and loudspeakers to play music as if you were in a night club.

I don't mind having cars in our cities per se (apart from them being a source of pollution, accidents and a lot of waste); they solve the fundamental problem of transporting people and goods. However, in my personal opinion, playing loud music around urban areas is not what a car was mainly built for.

Therefore, living in an area where most people compete for the loudest bass coming out of their trucks at 2 am is not something I would welcome with a happy attitude, and moving to a calmer area with more civilized people would be at the top of my to-do list.

The same analogy could be applied to Linkedin. 

We all have our uses for this website, and despite other people's different understanding of why they are there, I do not intend to move out of it, especially since in real life we don't have the luxury of a "delete connection" or an "unfollow" button (yeah, I hear you wishing that were true).

So what is LinkedIn anyway? Let's break it down by asking more specific questions.

Is it:

  • a job posting site?
  • a professional network?
  • a place to share your kid's drawings?
  • a place to ask for donations?
  • a place to promote products?
  • a place to find love?
  • a place to scam / be scammed?
  • a place to post political views?
  • a place to talk about religion?
  • a place to gather information / spy on people?

Digging deeper calls for some much-needed enlightenment on this matter. We have all observed a certain change in this website. I believe it all started with someone posting a meme or a funny joke, people liking it, others adding them to their network, and BAM!

The change happened. I've seen it all... I mean all types of garbage...

In principle, I'm all for seeing a cute kid's drawing of his father's ugly face with spaghetti sauce on a dirty napkin. However, I am a true believer that LinkedIn is not the proper place for that.

In my view, LinkedIn has lost sight of its primary goal.

This is a personal-opinion type of post, because again, I'm all for your freedom to post, say, and share anything you feel like sharing. And I would love for you to extend me the same courtesy and accept that I have the freedom and the right not to want to see irrelevant posts there.

I will not be analyzing or talking about the change of Linkedin feed algos and how they affected what we see and what type of posts get the most visibility. 

I'm sure the folks over there are doing their best to maximize shareholder profits (and by the by, maximizing user engagement).

I'm here to talk a bit about the kinds of engagement that I did not like, and what I did to carry out the "grand ménage" of my network.

While we're at it, if someone from LinkedIn is reading this, I'm curious to know if you guys have thought about the engagement quality on the site.

Your abuse reporting system is an absolute joke in terms of UX, and you really need to figure out a way to offer help, the same way you are shoving the premium upgrade button everywhere, and the same way you made it a 3-step process to take our money.

I would love for you to apply the same UX simplifications to hearing out your users, instead of sending them on an infinite dance around the FAQ page. Yes, I tried to report someone posting really offensive stuff, and spent 35 minutes clicking from page to page, only to come back to the main FAQ page afterwards. This is done by design, and a very bad one.


Let's get to know the targets:

A disclaimer: this is not an attack on these people and their freedom to share whatever they want. This is a personal desire to clean up a personal space, a space where I do not tolerate low-value posts filling up my sight.


In order to identify potential candidates for deletion, I first needed to define the different target groups with clear and simple characterizing features, since deleting a few thousand connections all at once has been made almost impossible by LinkedIn (the UX team needs to look at the possibility of a "select all to delete" feature).

Let us start with the recent trendy garbage:


1- People tapping twice, and anyone sharing a "tap twice to see..."



This segment has been the most annoying one recently, and I don't have to sell you on this: no justification is needed for deleting anyone participating in this wave of spamming. It was a trick to get more exposure by applying the same method first used on Instagram (Insta has a double-tap-to-like feature that was not available until recently on LinkedIn).

I understand if a social media influencer uses those silly methods to collect likes and get more exposure. Doing this on Linkedin is a sign of stupidity. STOP THIS!


2- No personal photo whatsoever



This is self-explanatory. If I have never met you and I already have you on LinkedIn, the bare minimum is to put up your full name and a photo.

If you are afraid of showing the world what you look like, fine, but it is hard to accept being connected with actual faceless ghosts. Unless I personally know you and you somehow decided to delete your profile photo, "you be gooooone" too, buddy.


3- Photos of cats, dogs, cars, the beach, nature, food... instead of a human being

Well, if you have no respect for yourself, how do you expect others to have any for you? Imagine having an ID with something other than your personal photo on it.

Since I believe that an online profile is something we have control over, why not put up your best professional photo, or at least a photo that shows a nice smile, without showing the world your bathing suit?

Are you a fan of cats or dogs? We all are! Who doesn't love those cute human companions? But as I have mentioned above, there are much more appropriate places for that. LinkedIn is definitely not one of them.


4- No name, or only initials


It is like sitting at Starbucks when someone introduces themselves to you and asks for your business card, and when they hand you theirs, it is blank, or it has just a couple of initials. No pal, not interested.


5- Natural spammers 

This segment is both tricky and easy to get rid of. I will explain in the "How" part how I got rid of most of them. However, my specific rule may not generalize well to others.

It is composed of those who do the following things repeatedly:

- Posting more than 5 times / day

- Liking more than 5 posts / day

- Commenting on more than 5 posts / day

Combined with:

- Showing up in my feed more than 5 times / day for any of the above reasons

- Never having sent me a message or an InMail

- Having sent me a message or an InMail that was totally irrelevant

- Scraping profiles for contact info, AKA having sent me an unsolicited email at least once

- Shoving their political / religious views down our throats: one mistake & "you be goooone"


Doing the filtering and selection for this segment took a bit of time too. It required a massive amount of historical data and manual labeling of some of the content in advance. It was worth every second!

 

6- Romeos... Juliets... the other type

This segment annoys most people. The definition is pretty straightforward: those who treat LinkedIn as a dating website, all those who post inappropriate content, and those who have been publicly named and shamed by others.

I admit that the work I did to remove this segment of people from my network was done purely manually. I did not have enough data points to build something that wouldn't miss.

And since I was luckily not the target of marriage proposals on LinkedIn, I had to rely on a basic rule, which I choose not to disclose here.


7- I am recruiting for UAE, DUBAI, CANADA, MARS...


It is a sad feeling when, while browsing through the posts, you see a well-respected connection falling victim to the scam of those trying to be the wiseass. I do not believe that copy/pasting this type of post proves any point whatsoever. As much as I don't appreciate scammers, I have less appreciation for lesson-givers. So this group is also an identified target for deletion.


Let's summarize and move forward:


The potential target list can get larger with special cases. I decided to keep it as short as possible for the time being in order to test the actual benefits of this first iteration. 

All in all, 7 target groups have been identified for immediate removal. Some of them have direct identifying characteristics, and others are only identified by behavioral traits.




This simplification will help in the execution phase, since profile-identifying features are easy to spot once well defined and do not require any activity history to be processed. The behavioral features, on the other hand, are harder to come by: they require complex definitions and an actual analysis of posts, comments and user activity to be able to flag those suckers and delete them.
 
I must note that I was extremely careful while implementing this project for 2 main reasons:

  • Mistakes happen, and I needed to make sure that my code does not delete based on false interpretations

  • Acquiring user activity data was extremely tricky. It needed some patience and a lot of scraping. Doing so meant that I could be detected by LinkedIn for having unusual activity. So my work needed to be as human-like as possible, or else I risked being flagged by the website.

The breakdown of the work went like this:

Profile features identification


The easiest one was identifying the first segment based on predefined profile features. As explained above, there are mainly 3 features I am interested in:
  • The name
  • The profile picture
  • Profile description
Since LinkedIn allows for a full download of network information, this step was done manually. However, the download did not include some of the data I needed to analyze and apply the identification rules. Yes, you guessed it: the automatically provided archive dump does not include profile photos.
This is why an additional script was added to go through all the profiles one by one, after retrieving the corresponding URLs.
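
To make the name rule from target group 4 a bit more concrete, here is a minimal sketch of how a "no name, or only initials" check could be run over an exported connection list. It is an illustration only, not the actual script: the column names `First Name` and `Last Name`, the helper names, and the sample rows are all assumptions.

```python
import csv
import io
import re
from typing import Optional

# Assumed column names; a real export may label them differently.
FIRST, LAST = "First Name", "Last Name"


def has_full_word(part: str) -> bool:
    """True if the name part contains at least one token longer than one character."""
    tokens = re.split(r"[\s.\-]+", part.strip())
    return any(len(token) > 1 for token in tokens)


def name_flag(first: str, last: str) -> Optional[str]:
    """Return a flag label for a suspicious name, or None if the name looks complete."""
    if not first.strip() and not last.strip():
        return "no name"
    if not (has_full_word(first) and has_full_word(last)):
        return "incomplete name"  # initials only, or one part missing
    return None


def flag_connections(rows):
    """rows: iterable of dicts keyed by the assumed column names."""
    flagged = []
    for row in rows:
        first = row.get(FIRST) or ""
        last = row.get(LAST) or ""
        flag = name_flag(first, last)
        if flag is not None:
            flagged.append((first, last, flag))
    return flagged


# Made-up sample data standing in for an exported file.
SAMPLE = "First Name,Last Name\nJane,Doe\nJ.,D.\n,\nJean-Luc,P\n"

if __name__ == "__main__":
    for entry in flag_connections(csv.DictReader(io.StringIO(SAMPLE))):
        print(entry)
```

A check like this only produces candidates; in keeping with the caution described above, anything it flags would still need a manual look before deletion.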









Behavioral features identification


Behavioral features identification was the longest part to work on for obvious reasons.
The challenging part was getting the data needed, AKA the signals required to flag targeted profiles.
Without going into more detail, it took around 90 days of monitoring to collect a decent amount of data, which allowed me to identify the largest group.
Below is a simplified diagram explaining how I split the targets, and what went into doing the identification.

I insist on "simplified" because I do not intend to reveal the actual work and the details of the various steps, for obvious reasons.


This simplified analysis led to the implementation of 2 modules:

- Activity module

- Text analysis module

For the first module, one of the criteria for identifying potential targets was a limit on how often they post, like or comment per day. So during the monitoring period I had to capture this information, and for every connection going beyond that limit, a flag was set on the corresponding profile.
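
As a rough illustration of that flagging step, the sketch below counts events per connection, per day and per activity type, and flags anyone who goes over the limit of 5. The event format (profile id, day, kind), the function name, and the choice to apply the limit to each activity type separately are assumptions made for the example, not the actual implementation.

```python
from collections import Counter
from datetime import date

DAILY_LIMIT = 5  # "more than 5 per day", as in the natural spammers rules


def flag_over_limit(events, limit=DAILY_LIMIT):
    """Return the set of profile ids that exceeded the limit.

    events: iterable of (profile_id, day, kind) tuples, one per observed
    action, where kind is "post", "like" or "comment" (assumed format).
    A profile is flagged if any single kind exceeds `limit` on any single day.
    """
    counts = Counter(events)  # identical (profile, day, kind) triples are tallied
    return {profile for (profile, _day, _kind), n in counts.items() if n > limit}


# Made-up monitoring data for three hypothetical connections.
DAY1, DAY2 = date(2020, 1, 6), date(2020, 1, 7)
EVENTS = (
    [("alice", DAY1, "like")] * 6      # 6 likes in one day: over the limit
    + [("bob", DAY1, "post")] * 3      # 3 posts and 3 likes: under the limit
    + [("bob", DAY1, "like")] * 3
    + [("carol", DAY1, "post")] * 3    # 6 posts spread over two days: under
    + [("carol", DAY2, "post")] * 3
)

if __name__ == "__main__":
    print(flag_over_limit(EVENTS))
```

Applying the limit to the combined daily total instead would only require dropping the kind from the counted key.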

The text analysis module was at first a sub-part of the activity module. Since I was already getting post contents, I figured I could download it all. However, it turned out to be a very complex task.
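
To give an idea of the simplest kind of text rule, here is a hedged sketch of regex-based labeling for two of the post types described earlier ("tap twice" bait and the "I am recruiting for..." chain posts). The patterns, labels and function name are invented for this example, and a real text analysis module would need far more than two regular expressions.

```python
import re

# Hypothetical patterns, written for this example only.
PATTERNS = {
    "tap twice": re.compile(
        r"\b(?:double[\s-]?tap|tap\s+(?:twice|two\s+times))\b", re.IGNORECASE
    ),
    "recruiting chain post": re.compile(
        r"\bI\s+am\s+recruiting\s+for\b", re.IGNORECASE
    ),
}


def label_post(text: str) -> list:
    """Return the labels of every pattern that matches the post text."""
    return [label for label, pattern in PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    for post in (
        "Tap twice to see the magic!",
        "I am recruiting for UAE, DUBAI, CANADA, MARS...",
        "Quarterly results are out.",
    ):
        print(post, "->", label_post(post))
```

Rules like these are cheap to run, but they miss reworded posts and can misfire on innocent ones, which is one reason manual validation of every candidate still matters.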

Honestly, the whole project was, as you would have imagined, full of trial and error. Figuring out the best way to capture relevant information without consuming much time or resources took a while.


After the whole thing was set in motion, I was able to gather and flag a large amount of information. It was unexpectedly mind-blowing!


The final fun part: Results!


As the title gives away, I was able to successfully and happily delete more than 4k useless connections on LinkedIn.

Most of them were profiles of people I had never met, people outside of my work network, working in industries and positions that were not relevant to me. So the actual loss was not huge in terms of exposure to the world, going by the six degrees of separation theory.

Was it worth the effort? YES! A thousand times YES!

I can actually feel the difference in the quality of posts in my feed. I got rid of many spammers, and the overall garbage-to-gold ratio has declined tremendously.

Other than the approximate total number of deleted connections, I will keep the rest of the details to myself. No personally identifying information or code will be posted anywhere.

This project was a very self-indulgent exercise, where I applied some of the stuff I usually do at work.

Regex, image processing and Deep Learning were heavily used in the behavioral-targeting part of the project.  

I had to analyze a large volume of texts, activity flags, posts and content ranging from photos and links to videos. The content of comments was out of scope for this project; however, comment activity was taken into account for obvious reasons.


I personally went through each and every deletion candidate for manual validation. Since the data volumes and the mix of different techniques made it a bit tricky to pre-label posts and profiles, some manual labour went into achieving the final goal.


My final advice for anyone complaining about LinkedIn: you should clean up and delete those who annoy you. Simple.


This work took around 4 months to complete: about 90 days went into gathering the necessary data, and 4 weeks of trial and error spread over multiple weekends.

Doing this write-up took a couple of days, and I went back and forth on the use of certain terms, especially the "not so politically correct" ones. So what you have here is a mix of both worlds and the fruit of some post-meditation writing.


Below are some other sources of fellow humans complaining about how sucky LinkedIn has become. I feel you guys!


Quora : Why does LinkedIn suck?

Quora : What are the things that suck on LinkedIn?

Quora again : How bad is Linkedin?


Legal disclaimer: For apparent reasons, I would like to say that this post, and all the stuff behind it, are a work of fiction. I love sci-fi stuff and this write-up is part of an imaginary scenario that never happened. It was all a dream, probably. No one is responsible for this. And if it bothers you in any way, feel free to take a chill pill. This never happened, okay?
