Amazon scraper using scrapy, a python framework for crawling websites.

Last update: Dec 26, 2021

Overview

#Amazon-web-scraper

This is a python program, which use scrapy python framework to crawl all pages of the product and scrap products data. This program scrap product data, such as price, discounted price and link to the product. And store each data into the sql database, for easily accessible.

#How to run the program.

To run this program, open the terminal on the program location

And type -

         1. cd amazonscraper
         2. scrapy crawl amazon -a product=laptops -O laptops.json
                       
                       #OR
                       
         1. cd amazonscraper
         2. scrapy crawl amazon -a product=laptops

Where '-a' is a argument and '-O' is the output file type.

#Method

In the first method, the products data are stored in database as well as in a json file.

And in the second method, the product data are stored only in the database.

Owner

Akash Das

GitHub Repository

script to scrape direct download links (ddls) from google drive index.

bhadoo Google Personal/Shared Drive Index scraper. A small script to scrape direct download links (ddls) of downloadable files from bhadoo google driv

53 Dec 16, 2022

PyQuery-based scraping micro-framework.

demiurge PyQuery-based scraping micro-framework. Supports Python 2.x and 3.x. Documentation: http://demiurge.readthedocs.org Installing demiurge $ pip

109 Jul 20, 2022

Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

68 Oct 08, 2022

Web scraper build using python.

Web Scraper This project is made in pyhthon. It took some info. from website list then add them into data.json file. The dependencies used are: reques

2 Jul 22, 2022

Bulk download tool for the MyMedia platform

MyMedia Bulk Content Downloader This is a bulk download tool for the MyMedia platform. USE ONLY WHERE ALLOWED BY THE COPYRIGHT OWNER. NOT AFFILIATED W

3 Oct 14, 2022

A webdriver-based script for reserving Tsinghua badminton courts.

AutoReserve A webdriver-based script for reserving badminton courts. 使用说明下载 chromedriver 选择当前Chrome对应版本安装 selenium pip install selenium 更改场次、金额信息dat

4 Nov 09, 2021

Scraping Top Repositories for Topics on GitHub,

0.-Webscrapping-using-python Scraping Top Repositories for Topics on GitHub, Web scraping is the process of extracting and parsing data from websites

2 Mar 18, 2022

A training task for web scraping using python multithreading and a real-time-updated list of available proxy servers.

Parallel web scraping The project is a training task for web scraping using python multithreading and a real-time-updated list of available proxy serv

1 Feb 10, 2022

A web scraping pipeline project that retrieves TV and movie data from two sources, then transforms and stores data in a MySQL database.

New to Streaming Scraper An in-progress web scraping project built with Python, R, and SQL. The scraped data are movie and TV show information. The go

1 Mar 28, 2022

Libextract: extract data from websites

Libextract is a statistics-enabled data extraction library that works on HTML and XML documents and written in Python

499 Dec 09, 2022

Binance Smart Chain Contract Scraper + Contract Evaluator

Pulls Binance Smart Chain feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.Returns only those with socials information included, and then submit

14 Dec 09, 2022

A social networking service scraper in Python

snscrape snscrape is a scraper for social networking services (SNS). It scrapes things like user profiles, hashtags, or searches and returns the disco

2.4k Jan 01, 2023

This script is intended to crawl license information of repositories through the GitHub API.

GithubLicenseCrawler This script is intended to crawl license information of repositories through the GitHub API. Taking a csv file with requirements.

4 Oct 25, 2022

AssistScraper - program for /r/nba to use to find list of all players a player assisted and how many assists each player recieved

5 Nov 25, 2021

Twitter Scraper

Twitter's API is annoying to work with, and has lots of limitations — luckily their frontend (JavaScript) has it's own API, which I reverse–engineered. No API rate limits. No restrictions. Extremely

45 Dec 30, 2022

Demonstration on how to use async python to control multiple playwright browsers for web-scraping

Playwright Browser Pool This example illustrates how it's possible to use a pool of browsers to retrieve page urls in a single asynchronous process. i

8 Oct 27, 2022

This is a module that I had created along with my friend. It's a basic web scraping module

QuickInfo PYPI link : https://pypi.org/project/quickinfo/ This is the library that you've all been searching for, it's built for developers and allows

2 Dec 13, 2021

a Scrapy spider that utilizes Postgres as a DB, Squid as a proxy server, Redis for de-duplication and Splash to render JavaScript. All in a microservices architecture utilizing Docker and Docker Compose

This is George's Scraping Project To get started cd into the theZoo file and run: chmod +x script.sh then: ./script.sh This will spin up a Postgres co

7 Nov 27, 2022

Linkedin webscraping - Linkedin web scraping with python

linkedin_webscraping This is the first step of a full project called "LinkedIn J

4 Apr 24, 2022

A scrapy pipeline that provides an easy way to store files and images using various folder structures.

scrapy-folder-tree This is a scrapy pipeline that provides an easy way to store files and images using various folder structures. Supported folder str

7 Oct 23, 2022