Amazon Scraper: A command-line tool for scraping Amazon product data

Overview

Amazon Product Scraper: 2021

Description

A command-line tool for scraping Amazon product data to CSV or JSON format(s).

Requirements

  • Python 3
  • pip3

Installation

Using git clone (you'll need git installed for this):

git clone https://github.com/scrapewalrus/amazon-scraper-python-2021

Or download and extract the zip file of the project manually

You'll also need to install requirements for the project to run. Locate amazon-product-scraper folder via terminal and type pip install -r requirements.txt:

Usage

To launch the Amazon scraper locate the amazon-product-scraper folder via terminal and type python amazon_scraper.py -k "your keyword". This will start the program.

NOTE: you must declare either -k or --keyword before entering your keyword. It's a required argument.

Example:

amazon_scraper.py - the name of a scraper file.

-k or --keyword - required argument to pass before entering your keyword.

-p or --proxies - optional argument to enable proxies. To avoid getting blocked I highly recommend using proxies. I'm using Residential Proxies from Oxylabs. For highest success rate, I suggest Residential Proxies over Datacenter as they're almost impossible to detect and have the smallest footprint. If you decide to use different proxy provider services keep in mind that you'll have to make some minor adjustments in get-proxies.py file.

-j or --json - optional argument for storing extracted data in .json format. Default output format is .csv.

Example of product data #1: JSON

[
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/Mostly-T-Shirt-Womens-Letter-Printed/dp/B07QN2NQ59/ref=sr_1_3?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-3",
        "PRODUCT_NAME": "I'm Mostly Peace Love and Light Funny T-Shirt Womens Graphic Printed Short Sleeve Tops Tee",
        "PRICE": "$21.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "1,637"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/YITAN-Women-Graphic-Funny-X-Large/dp/B074QMG4D7/ref=sr_1_4?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-4",
        "PRODUCT_NAME": "YITAN Women's Cute Juniors Tops Teen Girl Tee Funny T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "12,281"
    },
    {
        "SOURCE_URL": "https://www.amazon.com/s?k=funny+t+shirt+for+women&page=1",
        "PAGE": 1,
        "KEYWORD": "funny t shirt for women",
        "PRODUCT_LINK": "https://www.amazon.com/DANVOUY-Womens-V-Neck-Doesnt-Definitely/dp/B07V55ZXVS/ref=sr_1_5?dchild=1&keywords=funny+t+shirt+for+women&qid=1627833682&sr=8-5",
        "PRODUCT_NAME": "DANVOUY Womens If My Mouth Doesn't Say It My Face Definitely Will T Shirt",
        "PRICE": "$12.99",
        "PRODUCT_RATING": "4.6",
        "NUMBER_OF_RATINGS": "6,787"
    }]

Example of product data #2: CSV

amazon-csv-product-data-example

Tncli - TON smart contract command line interface

Tncli TON smart contract command line interface State Not working, in active dev

Disintar IO 100 Dec 18, 2022
Python Library and CLI for exporting MySQL databases

expdb Python library and CLI for exporting MySQL databases Installation Pre-requisites MySQL server Python 3.9+ Using git Clone the repository to your

Devansh Singh 1 Nov 29, 2021
🐍The nx-python plugin allows users to create a basic python application using nx commands.

🐍 NxPy: Nx Python plugin This project was generated using Nx. The nx-python plugin allows users to create a basic python application using nx command

StandUP Communications 74 Aug 31, 2022
Command line tool for interacting and testing warehouse components

Warehouse debug CLI Example usage for Zumo debugging See all messages queued and handled. Enable by compiling the zumo-controller with -DDEBUG_MODE_EN

1 Jan 03, 2022
A simple yet powerful timer and time tracker from the command line.

Focus Phase Focus Phase (FP) is a simple yet powerful timer and time tracker. It is a command-line application written in Python and can be installed

Ammar Alyousfi 13 Jan 13, 2022
Enriches Click with option groups, constraints, command aliases, help sections for subcommands, themes for --help and other stuff.

Enriches Click with option groups, constraints, command aliases, help sections for subcommands, themes for --help and other stuff.

Gianluca Gippetto 62 Dec 22, 2022
NudeNet wrapper made to provide a simple cli interface to the library

Nudenet Wrapper. Small warpper script for NudeNet Made to provide a small and easy to use cli interface with the library. You can indicate a single im

1 Oct 20, 2021
πŸ•° The command line tool for scheduling Python scripts

hickory is a simple command line tool for scheduling Python scripts.

Max Humber 146 Dec 07, 2022
Easily turn single threaded command line applications into a fast, multi-threaded application with CIDR and glob support.

Easily turn single threaded command line applications into a fast, multi-threaded application with CIDR and glob support.

Michael Skelton 1k Jan 07, 2023
EODAG is a command line tool and a plugin-oriented Python framework for searching, aggregating results and downloading remote sensed images while offering a unified API for data access regardless of the data provider

EODAG (Earth Observation Data Access Gateway) is a command line tool and a plugin-oriented Python framework for searching, aggregating results and downloading remote sensed images while offering a un

CS GROUP 205 Jan 03, 2023
instant coding answers via the command line

howdoi instant coding answers via the command line Sherlock, your neighborhood command-line sloth sleuth. Are you a hack programmer? Do you find yours

Benjamin Gleitzman 9.8k Jan 08, 2023
Standalone Tailwind CSS CLI, installable via pip

Standalone Tailwind CSS CLI, installable via pip Use Tailwind CSS without Node.j

Tim Kamanin 144 Dec 22, 2022
ghfetch is ai customizable CLI GitHub personal README generator.

ghfetch is ai customizable CLI GitHub personal README generator. Inspired by famous fetch such as screenfetch, neofetch and ufetch, the purpose of this tool is to introduce yourself as if you were a

Alessio Celentano 3 Sep 10, 2021
flora-dev-cli (fd-cli) is command line interface software to interact with flora blockchain.

Install git clone https://github.com/Flora-Network/fd-cli.git cd fd-cli python3 -m venv venv source venv/bin/activate pip install -e . --extra-index-u

14 Sep 11, 2022
Commandline Python app to Autodownload mediafire folders and files.

Commandline Python app to Autodownload mediafire folders and files.

Tharuk Renuja 3 May 12, 2022
You'll never want to use cd again.

Jmp Description Have you ever used the cd command? You'll never touch that outdated thing again when you try jmp. Navigate your filesystem with unprec

Grant Holmes 21 Nov 03, 2022
commandline version of wordle game and my auto solver.

Wordle Machine (and Wordle Game) (in commandline) My implementation of the Wordle game (inspired by https://www.powerlanguage.co.uk/wordle/) and my in

Kevin Xu 11 Jan 03, 2023
Simple tool, to update linux kernel on ubuntu

Kerbswap Simple tool, to update linux kernel on ubuntu Information At the moment, this tool only supports "Ubuntu" distributions, but will be expanded

dword 1 Oct 31, 2021
GetRepo-py is a command line client that queries GitHub API and searches repositories by given arguments

GetRepo-py is a command line client that queries GitHub API and searches repositories by given arguments

Davidcin 3 Feb 14, 2022
A command line tool to remove background from video and image

A command line tool to remove background from video and image, brought to you by BackgroundRemover.app which is an app made by nadermx powered by this tool

Johnathan Nader 1.7k Jan 01, 2023