Using Selenium with Python to Web Scrape Popular YouTube Tech Channels.

Last update: Aug 18, 2021

Overview

Web Scraping Popular YouTube Tech Channels with Selenium

Data Mining, Data Wrangling, and Exploratory Data Analysis

About the Data

Web scraping was performed on the Top 10 Tech Channels on YouTube using Selenium, an automated browser driver controlled from Python that is often used for web scraping and web testing. The scraped channels were chosen from a Top 10 Tech YouTubers list on blog.bit.ai.
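A minimal sketch of how such a scrape could look, not the project's actual code. The `#video-title` selector and the helper names `open_browser` and `collect_video_titles` are assumptions for illustration; YouTube's markup changes often, so the selector would likely need adjusting. Selenium is imported lazily so the title-collecting helper can be tried with any driver-like object.

```python
# Selenium's By.CSS_SELECTOR constant is the string "css selector";
# it is spelled out here so this module imports without Selenium installed.
CSS_SELECTOR = "css selector"


def open_browser():
    """Start a Chrome session (assumes Selenium 4, Chrome, and a matching driver)."""
    from selenium import webdriver  # imported lazily on purpose

    return webdriver.Chrome()


def collect_video_titles(driver, videos_url, selector="#video-title"):
    """Load a channel's videos page and return the non-empty title strings.

    `selector` is a hypothetical CSS selector for the title elements.
    """
    driver.get(videos_url)
    elements = driver.find_elements(CSS_SELECTOR, selector)
    titles = [element.text.strip() for element in elements]
    return [title for title in titles if title]
```

A typical call would create the driver with `open_browser()`, pass it to `collect_video_titles` together with the URL of a channel's videos page, and call `driver.quit()` in a `finally` block so the browser always closes.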

All data was saved to multiple CSV files to aid further analysis in a Google Colab notebook. Please see the Project Links section for more details.
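As a rough illustration of the CSV step, the sketch below writes scraped rows with Python's standard `csv` module and reads them back. The helper names, the file name, and the column names are hypothetical; the project's real files may be laid out differently.

```python
import csv


def save_rows(path, fieldnames, rows):
    """Write a list of dicts to `path` as CSV with a header row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def load_rows(path):
    """Read the CSV back into a list of dicts (all values come back as strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `save_rows("videos.csv", ["channel", "title", "views"], rows)` would produce one file of video records; a second call with different columns could hold the per-channel data.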

Sample of Data Collected

The average number of videos per channel was around 200. In total, data from 2000 videos was scraped.

Word Cloud of Word Frequency in Video Titles
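A word cloud sizes each word by how often it occurs, so the underlying step is a word count over the titles. The sketch below shows one way to do that count with the standard library; the stop-word list and the function name are assumptions, and the project may have tokenized titles differently.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "vs", "for", "on", "with"}


def title_word_counts(titles):
    """Count how often each non-stop word appears across all titles."""
    counts = Counter()
    for title in titles:
        # Lowercase first so "iPhone" and "iphone" are counted together.
        words = re.findall(r"[a-z0-9']+", title.lower())
        counts.update(word for word in words if word not in STOP_WORDS)
    return counts
```

Calling `title_word_counts(titles).most_common(20)` would list the twenty most frequent words, which is the kind of input a word-cloud renderer needs.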

Takeaways

Video comment counts have very little correlation with any other data obtained in this project.
The following seem to be highly correlated:
- Channel Views and Subscribers
- Interactions and Video Views
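The source does not say which correlation measure was used. Assuming the common Pearson coefficient, a pair of columns such as channel views and subscribers could be compared with a small helper like the one below (the function name is hypothetical).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)
```

Values near 1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear relationship, which is how a statement like "very little correlation" would typically be read.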
Video titles fall into 5 topic groups.

K-means and PCA used to create clusters for video titles
- iPhone (kmeans 0)
- Samsung (kmeans 1)
- Reviews (kmeans 2)
- Unboxing (kmeans 3)
- How-to (kmeans 4)
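To make the clustering idea concrete, here is a dependency-free sketch of k-means over simple word-count vectors. It is not the project's code: the vectorization, the deterministic farthest-point seeding, and all function names are assumptions, and the PCA step is left out to keep the example short. In a notebook, a library such as scikit-learn would be the more usual choice.

```python
from collections import Counter


def vectorize(titles):
    """Turn titles into word-count vectors over a shared, sorted vocabulary."""
    tokenized = [title.lower().split() for title in titles]
    vocabulary = sorted({word for words in tokenized for word in words})
    vectors = []
    for words in tokenized:
        counts = Counter(words)
        vectors.append([float(counts[word]) for word in vocabulary])
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, centroids):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)), key=lambda i: squared_distance(vector, centroids[i]))


def kmeans(vectors, k, max_iterations=50):
    """Cluster `vectors` into `k` groups and return one label per vector."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    # Deterministic farthest-point seeding (an assumption, chosen for repeatability):
    # start from the first vector, then keep adding the vector that is farthest
    # from every centroid picked so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(squared_distance(v, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iterations):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for index in range(k):
            members = [v for v, label in zip(vectors, labels) if label == index]
            if members:  # keep the old centroid if a cluster went empty
                centroids[index] = [sum(column) / len(members) for column in zip(*members)]
    return labels
```

With the scraped titles, `kmeans(vectorize(titles), 5)` would return a label from 0 to 4 for each title. The label numbers are arbitrary indices; topic names such as "Reviews" or "Unboxing" come from inspecting the titles in each cluster afterwards.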
70% of the most viewed videos are about phones.
Join Date (Date a Youtube Channel was created) does not seem to have any relationship to number of subscribers or overall cha

Project Links

"Data Analysis of Youtube Tech Channels"

Owner

David Rusho

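Uses Python to crawl the employment websites of several major universities in Jiangsu and notifies users in three ways: WeChat messages, direct command-line output, and Windows balloon notifications.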

A Telegram crawler to search groups and channels automatically and collect any type of data from them.

A web Scraper for CSrankings.com that scrapes University and Faculty list for a particular country

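Latest optimized version of the JD.com Moutai rush-purchase script (JD.com Moutai flash sale), with an optimized purchase process queue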

CPF and CNPJ lookup at the Receita Federal (Brazil's federal revenue service) using web scraping

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Pro Football Reference Game Data Webscraper

This is a module that I had created along with my friend. It's a basic web scraping module

Scheduled HITsz epidemic reporting script based on GitHub Actions, ready to use out of the box

🐞 Douban Movie / Douban Book Scarpy

JD.com Moutai rush purchase

for those who dont want to pay $10/month for high school game footage with ads

Scrape all the media from an OnlyFans account - Updated regularly

Scrapy-based cyber security news finder

Poolbooru gelscraper - a simple python script for scraping images off gelbooru pools.

This Spider/Bot is developed using Python and based on Scrapy Framework to Fetch some items information from Amazon

Google Developer Profile Badge Scraper

Instagram_scrapper - This project allows you to scrape the list of followers, following, or both from a public Instagram account, and easily create a CSV or Excel file.

Amazon web scraping using Scrapy Framework

A helper library to scrape data from Instagram effortlessly, using the Influencer Hunters APIs.