A dead simple crawler to get book information from Douban.

Last update: Jan 10, 2022

Overview

Introduction

A dead simple crawler to get book information from Douban.

Prerequisites

Python 3
Install dependencies from requirements.txt
(Optional) Install Anaconda to manage the environment

Usage

Run get_tags to fetch all the trending tags.

# This will generate a file tags.csv
python app.py get_tags

Run crawl_books to crawl books for the tags collected in the previous step.

python app.py crawl_books -i tags.csv
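The two steps can also be chained from a small Python script. This is only a sketch: the helper names build_commands and run_pipeline are hypothetical and not part of the project, and it assumes the script is run from the repository root where app.py lives. Only the two command lines shown above are used.

```python
import subprocess
import sys


def build_commands(tags_file="tags.csv", fetch_tags=True):
    """Return the documented commands as argument lists, in run order.

    get_tags writes tags.csv (per the usage notes), so a custom tags_file
    only makes sense together with fetch_tags=False.
    """
    python = sys.executable  # reuse the interpreter running this script
    commands = []
    if fetch_tags:
        commands.append([python, "app.py", "get_tags"])
    commands.append([python, "app.py", "crawl_books", "-i", tags_file])
    return commands


def run_pipeline(tags_file="tags.csv", fetch_tags=True):
    """Run each command in turn; check=True stops at the first failure."""
    for command in build_commands(tags_file, fetch_tags):
        subprocess.run(command, check=True)
```

Calling run_pipeline() runs get_tags and then crawl_books against tags.csv; run_pipeline("my_tags.csv", fetch_tags=False) skips the first step and crawls a file you prepared yourself.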

You can also create tags.csv yourself instead of running get_tags. Make sure every tag you list actually returns books on Douban, otherwise the crawler has nothing to fetch for it.
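A hand-written tags.csv could be produced with a few lines of Python. The file layout is an assumption here (one tag per row, single column, no header), because the format is not documented above; compare against a file generated by get_tags before relying on it. The function name write_tags is hypothetical.

```python
import csv
from pathlib import Path


def write_tags(tags, path="tags.csv"):
    """Write one tag per row, skipping blanks and duplicates.

    Assumed layout: a single column with no header row.
    Returns the number of tags written.
    """
    seen = set()
    rows = []
    for tag in tags:
        tag = tag.strip()
        if tag and tag not in seen:
            seen.add(tag)
            rows.append([tag])
    # newline="" lets the csv module control line endings itself
    with Path(path).open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

For example, write_tags(["小说", "历史"]) writes two rows to tags.csv, which can then be passed to crawl_books with -i.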

License

MIT © mogita

Owner

Yun Wang

A Spider for BiliBili comments with a simple API server.

BiliComment A spider for BiliBili comment. Spider Usage Put config.json into config directory, and then python . ./config/config.json. A example confi

3 Jul 05, 2021

ChromiumJniGenerator - Jni Generator module extracted from Chromium project

4 Jun 12, 2022

simple http & https proxy scraper and checker

11 Nov 15, 2021

Web3 Pancakeswap Sniper bot written in python3

Pancakeswap_BSC_Sniper_Bot Web3 Pancakeswap Sniper bot written in python3, Please note the license conditions! The first Binance Smart Chain sniper bo

295 Dec 31, 2022

🐞 Douban Movie / Douban Book Scarpy

Python3-based Douban Movie/Douban Book Scarpy crawler for cover downloading + data crawling + review entry.

1 Dec 03, 2022

Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par

8.4k Jan 08, 2023

Crawler for the Fundamentus.com site using the Scrapy framework, covering both the detailed tab and the summary tab.

Crawler for the Fundamentus.com site using the Scrapy framework, covering both the detailed tab and the summary tab. (All the information)

3 Oct 04, 2022

Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages.

Video Games Web Scraper Video Games Web Scraper is a project that crawls websites and APIs and extracts video game related data from their pages. This

1 Jan 12, 2022

Instagram profile scrapper with python

IG Profile Scrapper Instagram profile Scrapper Just type the username, and boo! :D Instalation clone this repo to your computer git clone https://gith

6 Nov 07, 2022

Nekopoi scraper using python3

Features Scrap from url Todo [+] Search by genre [+] Search by query [+] Scrap from homepage Example # Hentai Scraper from nekopoi import Hent

9 Apr 06, 2022

This program scrapes information and images for movies and TV shows.

Media-WebScraper This program scrapes information and images for movies and TV shows. Summary For more information on the program, read the WebScrape_

1 Dec 05, 2021

Find thumbnails and original images from URL or HTML file.

Haul Find thumbnails and original images from URL or HTML file. Demo Hauler on Heroku Installation on Ubuntu $ sudo apt-get install build-essential py

150 Oct 15, 2022

Scraping followers of an instagram account

ScrapInsta A script to scraping data from Instagram Install First of all you can run: pip install scrapinsta After that you need to install these requ

1 Sep 05, 2021

Comment Webpage Screenshot is a GitHub Action that captures screenshots of web pages and HTML files located in the repository

Comment Webpage Screenshot is a GitHub Action that helps maintainers visually review HTML file changes introduced on a Pull Request by adding comments with the screenshots of the latest HTML file cha

21 Sep 29, 2022

crypto currency scraping

This was supposed to be a web scraping project, but somehow I've turned it into a spamming project

This is a sport analytics project that combines the knowledge of OOP and Webscraping

Scrape all the media from an OnlyFans account - Updated regularly

This is python to scrape overview and reviews of companies from Glassdoor.

Transistor, a Python web scraping framework for intelligent use cases.