webScrap

Web scraping: first steps.

Authors: Paulo, Claudio M.

First steps in web scraping, carried out as a training project. Pages with known addresses are downloaded with the requests library, the HTML is parsed with BeautifulSoup using the 'lxml' parser, the relevant information is collected into a structured table (a Pandas DataFrame), and the result is exported in CSV format.

How can the search for related words in OLX ads be automated?
Can I use quartile analysis to find the best product at the best price?
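
One possible reading of the quartile question, as a minimal sketch: treat ads priced at or below the first quartile (Q1) of a search result as candidates for a good deal. The function name and the sample prices are made up for illustration and are not taken from the project code. A low price alone says nothing about the condition of the product, so this only narrows down which ads to inspect.

```python
import statistics


def bargain_candidates(prices):
    """Return the prices at or below the first quartile (Q1) of the sample."""
    # With n=4, statistics.quantiles returns the three cut points Q1, Q2, Q3.
    q1, _median, _q3 = statistics.quantiles(prices, n=4, method="inclusive")
    return [p for p in prices if p <= q1]


# Made-up prices (in BRL) for one search term, not real OLX data.
sample_prices = [900, 1000, 1100, 1200, 1300, 1400, 1500, 5000]
print(bargain_candidates(sample_prices))
```

Quartiles are less sensitive than the mean to a single very expensive ad (the 5000 entry here), which is one reason to prefer them for this kind of comparison.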

Our Plan

Select the list of related words.
Use requests to download the page.
Use BeautifulSoup to parse the downloaded page with the lxml parser.
Create a structured database with date and time of posting, ad title, product value, city and neighborhood where it is being advertised.
Filter the database by removing ads whose ad title does not contain the desired words.
Use percentiles and the mean to find the typical price of advertisements by city (within Brazilian states).
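
The download, parse, and filter steps of the plan could look roughly like the sketch below. It is not the project's actual code: the CSS classes (li.ad, .ad-title, and so on), the column names, and the price format "R$ 1.200" are assumptions, and the real selectors have to be read from the OLX page source. The parser is a parameter so that Python's built-in "html.parser" can be used where lxml is not installed.

```python
import re

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Hypothetical column names for the structured table.
COLUMNS = ["posted_at", "title", "price", "city", "neighborhood"]


def download_page(url):
    """Download one page and return its HTML text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def parse_price(text):
    """Convert a price such as 'R$ 1.200,50' to a float (assumed format)."""
    cleaned = text.replace("R$", "").replace(".", "").replace(",", ".")
    return float(cleaned.strip())


def parse_ads(html, parser="lxml"):
    """Build a DataFrame with one row per ad. All selectors are hypothetical."""
    soup = BeautifulSoup(html, parser)
    rows = []
    for ad in soup.select("li.ad"):
        rows.append({
            "posted_at": ad.select_one(".ad-date").get_text(strip=True),
            "title": ad.select_one(".ad-title").get_text(strip=True),
            "price": parse_price(ad.select_one(".ad-price").get_text()),
            "city": ad.select_one(".ad-city").get_text(strip=True),
            "neighborhood": ad.select_one(".ad-neighborhood").get_text(strip=True),
        })
    return pd.DataFrame(rows, columns=COLUMNS)


def filter_by_keywords(df, keywords):
    """Keep only ads whose title contains at least one keyword."""
    pattern = "|".join(re.escape(word) for word in keywords)
    mask = df["title"].str.contains(pattern, case=False, regex=True)
    return df[mask]


# Usage, with a search-page URL of your own:
#   ads = parse_ads(download_page(url))
#   filter_by_keywords(ads, ["bicicleta", "bike"]).to_csv("ads.csv", index=False)
```

The keyword match is case-insensitive and the keywords are escaped, so words containing characters such as "+" or "." are matched literally.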

Current progress

Data scraping was carried out and the database was created to analyze the average value by city.
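
As an illustration of the per-city analysis, the sketch below summarizes prices with pandas. It assumes a DataFrame with columns named city and price; those names and the sample values are hypothetical and may differ from the ones used in the project code.

```python
import pandas as pd


def price_summary_by_city(df):
    """Return ad count, mean, and quartiles of 'price' for each 'city'."""
    grouped = df.groupby("city")["price"]
    summary = grouped.agg(["count", "mean"])
    # quantile() with a list returns one row per (city, quantile) pair;
    # unstack() turns the quantile level into columns.
    quartiles = grouped.quantile([0.25, 0.5, 0.75]).unstack()
    quartiles.columns = ["q25", "median", "q75"]
    return summary.join(quartiles)


# Made-up data for illustration only.
ads = pd.DataFrame({
    "city": ["Recife", "Recife", "Recife", "Olinda", "Olinda"],
    "price": [1000.0, 1200.0, 1400.0, 800.0, 1000.0],
})
print(price_summary_by_city(ads))
```

Reporting the count next to the statistics helps to spot cities where the mean rests on only a handful of ads.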

The database is built from advertisements on the OLX Brasil website.

The code uses variable names and comments in Portuguese, and the search for advertisements is carried out with Portuguese words.

Web scraping OLX with Python and BeautifulSoup.

