Scraping Connections' Info on LinkedIn

Overview

Scrape it!

! Disclaimer:

  1. THIS CODE HAS BEEN IMPLEMENTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE INTERVIEW PROCESS OF MCI.IR, AND INTERVIEWEES WERE SUPPOSED TO PUSH THE CODE TO THEIR GITHUB. CONTACT ME TO REMOVE THIS REPOSITORY IF IT IS AGAINST YOUR TOS.
  2. IF ANY CONNECTION IS NOT OK WITH THEIR CONTACT INFO BEING HERE, CONTACT ME AND I WILL REMOVE IT ASAP.

Functionalities:

This script automatically (a minimal sketch of the flow follows the list):

  • opens your LinkedIn profile
  • accesses your connections page
  • crawls that page to grab each connection's profile link
  • scrapes each person's information and dumps it into an SQLite database
  • and simultaneously logs all the necessary levels of info into Linkedin.log
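
The sketch below shows one way that flow can look with Selenium 4, sqlite3, and the standard logging module. It is not the exact implementation in this repository: the login selectors, the connections-page URL, and the connections.db / connections table names are assumptions that will likely need adjusting.

```python
# Minimal sketch of the workflow above (not the repository's exact code).
import logging
import sqlite3

from selenium import webdriver
from selenium.webdriver.common.by import By

logging.basicConfig(filename="Linkedin.log", level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

driver = webdriver.Chrome()

# 1. Log in and open the connections page (selectors/URL may change over time).
driver.get("https://www.linkedin.com/login")
driver.find_element(By.ID, "username").send_keys("YOUR_EMAIL")
driver.find_element(By.ID, "password").send_keys("YOUR_PASSWORD")
driver.find_element(By.XPATH, "//button[@type='submit']").click()
driver.get("https://www.linkedin.com/mynetwork/invite-connect/connections/")
logging.info("Opened connections page")

# 2. Crawl the page and collect the connections' profile links.
links = [a.get_attribute("href")
         for a in driver.find_elements(By.TAG_NAME, "a")
         if a.get_attribute("href") and "/in/" in a.get_attribute("href")]
logging.info("Collected %d profile links", len(links))

# 3. Visit each profile and dump the scraped fields into SQLite.
conn = sqlite3.connect("connections.db")
conn.execute("CREATE TABLE IF NOT EXISTS connections (url TEXT, name TEXT)")
for url in links:
    driver.get(url)
    name = driver.find_element(By.TAG_NAME, "h1").text  # selector will vary
    conn.execute("INSERT INTO connections VALUES (?, ?)", (url, name))
    logging.info("Scraped %s", url)
conn.commit()
conn.close()
driver.quit()
```

With Selenium 4.6+ a separate chromedriver download is usually unnecessary, since Selenium Manager resolves the driver automatically.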

Data Flow Diagram

(data flow diagram and database images)

The design patterns applied include (but are not limited to) the following; a small illustration follows the list:

  • Creator
  • Low Coupling
  • High Cohesion
  • Indirection
  • Modularization
  • Information Expert

Log/DB files:

(screenshots of the SQLite database file and the Linkedin.log output)

Further development notes:

  • Check out other databases that support multithreaded writes, which would let us dump all information rows at once (see the sketches after this list)
  • Change the IP per request (the code for this is in my "Social Media Computing course" repository)
  • Sometimes you need to scroll down manually while the "connections" page is loading; a single line of code can be added to do the scrolling for you (also sketched below)
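
The snippets below sketch the first and third notes. They assume the driver object and the connections.db schema from the workflow sketch earlier in this README, and rows is a hypothetical buffer of scraped tuples.

```python
import sqlite3

# Third note: scroll the connections page programmatically so all contacts load,
# instead of scrolling by hand (standard Selenium execute_script call).
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# First note: even with plain sqlite3, buffered rows can be written in a single
# call rather than one INSERT per profile.
rows = []  # hypothetical buffer filled while scraping, e.g. [(url, name), ...]
conn = sqlite3.connect("connections.db")
conn.executemany("INSERT INTO connections VALUES (?, ?)", rows)
conn.commit()
conn.close()
```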

References:

https://www.linkedin.com/pulse/how-easy-scraping-data-from-linkedin-profiles-david-craven

https://www.geeksforgeeks.org/scrape-linkedin-using-selenium-and-beautiful-soup-in-python/

https://stackoverflow.com/questions/28883769/remove-odd-indexed-elements-from-list-in-python#:~:text=Fun%20fact%3A%20to%20remove%20all,remove(x)%20.

https://stackoverflow.com/questions/34759787/fetch-all-href-link-using-selenium-in-python

https://www.tutorialspoint.com/fetch-all-href-link-using-selenium-in-python

https://stackoverflow.com/questions/64717302/deprecationwarning-executable-path-has-been-deprecated-selenium-python

https://chromedriver.chromium.org/home

https://www.youtube.com/watch?v=-ARI4Cz-awo

Owner
MohammadReza Ardestani
CS @ AUT