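A collection of web crawler examples. Covers, among others: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku, iQIYI, Ctrip, 12306, 58, Sohu, Baidu Index, CQVIP and Wanfang, Z-Library, Oalib, novel sites, tendering sites, procurement sites, and Xiaohongshu.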

Last update: Jan 05, 2023

Overview

lxSpider

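Introduction:

Time flies, and I have lost count of how many examples I have written. The author's articles are published on CSDN, and the code is pushed to GitHub afterwards. Some of the CSDN articles are paid examples; subscribe as you see fit.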

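Disclaimer:

This repository is intended for teaching. Nothing it provides may be used for commercial purposes or in any illegal or non-compliant scenario.
The author accepts no liability for any loss or harm of any kind that using the code and strategies in this repository may cause, for whatever reason, to the user or to others.
Any dispute arising from or related to this repository should be resolved by the parties through friendly negotiation; if negotiation fails, the author bears no responsibility for the consequences.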

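Columns

Web crawler basics: for readers who know basic Python syntax and are getting ready to learn crawling

Web reverse-engineering basics: some crawling experience is enough (includes walkthroughs of the Yuanrenxue crawler challenges)

Android reverse-engineering basics: tool introductions, reverse-engineering notes, case studies

Crawler example collection: paid column, classic examples, continuously updated

Blog

Contact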

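Releases (Kuaishou live-comment collection tool)

Kuaishou live-comment (danmaku) collection tool (Jan 30, 2021)
Usage:

1. Run run.exe in the dist directory.

2. Enter the streamer's uid, your cookie, and the room id.

3. Click Start and wait. Do not click it a second time.

4. Make sure the streamer is still live.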

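Getting the parameters:

Streamer uid: the last path segment of the URL shown in the browser.

For example, if the URL is: https://live.kuaishou.com/u/yingjia2019

the streamer's uid is: yingjia2019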
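
The rule above can be sketched in a few lines of Python. This is only an illustration of "take the last path segment"; the function name `extract_uid` is hypothetical and is not part of the released tool.

```python
from urllib.parse import urlparse


def extract_uid(live_url: str) -> str:
    """Return the last non-empty path segment of a live-room URL."""
    path = urlparse(live_url).path
    # Drop empty pieces so a trailing slash does not yield an empty uid.
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segment found in {live_url!r}")
    return segments[-1]


if __name__ == "__main__":
    print(extract_uid("https://live.kuaishou.com/u/yingjia2019"))
```

Using `urlparse` rather than a plain string split keeps any query string or fragment out of the result.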

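Your cookie:

1. Open the developer tools: right-click and choose Inspect, or press F12.

2. Click the Network tab.

3. Refresh the page (you can press F5).

4. Find the HTML document whose name matches the streamer's uid, then click Headers on the right.

5. Scroll to the bottom and find the cookie line. Copy the did=web_xxxxxxxxxxxxxx; part.

6. The cookie to enter in the tool is web_xxxxxxxxxxxxxx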
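
If you copy the whole cookie line instead of picking the did pair out by hand, a small helper can isolate the value. This is a minimal sketch; `extract_did` is a hypothetical name, and the other cookie names in the sample string are made up for illustration.

```python
def extract_did(cookie_header: str) -> str:
    """Return the value of the `did` cookie from a raw Cookie header string."""
    for pair in cookie_header.split(";"):
        # Each pair looks like "name=value"; partition splits on the first "=".
        name, sep, value = pair.strip().partition("=")
        if sep and name == "did":
            return value
    raise KeyError("did cookie not found")


if __name__ == "__main__":
    # Placeholder cookie line; only the did pair mirrors the steps above.
    sample = "clientid=3; did=web_xxxxxxxxxxxxxx; kpf=PC_WEB"
    print(extract_did(sample))
```

The returned string, without the `did=` prefix or the trailing semicolon, is what step 6 asks for.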

Room id:

1. Click the Elements tab, press Ctrl+F to open the search box, and enter: live-stream-id

2. Copy live-stream-id="Zo9Upaz8w90"

3. The room id to enter is Zo9Upaz8w90
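
The same lookup can be done on HTML copied out of the Elements tab. The sketch below assumes the attribute appears exactly as `live-stream-id="..."`; the function name `extract_room_id` and the surrounding `<div>` in the sample are hypothetical. It works on markup you have already copied and does not claim that a plain HTTP fetch of the page would contain the attribute.

```python
import re

# Matches the attribute value shown in step 2, e.g. live-stream-id="Zo9Upaz8w90".
LIVE_STREAM_ID = re.compile(r'live-stream-id="([^"]+)"')


def extract_room_id(page_html: str) -> str:
    """Return the value of the first live-stream-id attribute in the markup."""
    match = LIVE_STREAM_ID.search(page_html)
    if match is None:
        raise ValueError("live-stream-id attribute not found")
    return match.group(1)


if __name__ == "__main__":
    sample = '<div class="player" live-stream-id="Zo9Upaz8w90"></div>'
    print(extract_room_id(sample))
```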

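Keep the page open while the tool is running; some time after the page is closed, the cookie expires.

This tool is for learning purposes; do not abuse it.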
Source code(tar.gz)
Source code(zip)
default.rar(21.47 MB)
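Novel downloader (Feb 2, 2021)
Introduction

1. Novel download (advantage: fast, because it fetches complete txt files directly from the web).

2. Online novel crawling (advantage: broad coverage; almost any published novel can be found).

Special notice:

This script is for testing, learning, and research only. Commercial use is prohibited. Its legality, accuracy, completeness, and validity are not guaranteed; use your own judgment.

No WeChat official account or self-media outlet may repost or publish any resource file in this project in any form.

The author accepts no responsibility for any problem with any script in this project, including but not limited to any loss or damage caused by script errors.

Do not use any content of this project for commercial or illegal purposes; you bear the consequences if you do.

This project is licensed under the GPL-3.0 License. Where this special notice conflicts with the GPL-3.0 License, this special notice prevails.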

Source code(tar.gz)
Source code(zip)
default.zip(44.16 MB)

Owner

lx

Every noble work is at first impossible.

GitHub Repository
