Dodging ice cream assassins? Scraping ice cream prices with a crawler
2022-07-30 17:44:00 【m0_54850825】
Requirements Analysis
Summer is so hot that nobody wants to move, and only an air-conditioned room brings a little comfort. And of course, ice cream is a must.
Ice cream is not cheap these days, though. A certain Aido chocolate ice cream, for example, now retails for 5 yuan, and I remember it costing only 3 yuan before. Either way, it feels a little expensive. What I didn't expect was a friend suddenly telling me today, "I got fooled today. I picked up a medium ice cream at the convenience store and they charged me 16 yuan! I was ambushed by an ice cream assassin!"
A medium ice cream? 16 yuan? Oh my God. I asked him, "If it was that expensive, why didn't you put it back? You could buy more than three ordinary ones for that."
My friend was helpless: "I had already taken it, so I had to pay. I was too embarrassed to put it back..."
Alas, I'm afraid this is a matter of life and death. What can be done? Is there a way to help my friend avoid picking up an overpriced ice cream next time? Of course, plenty of experts have already put together cheat sheets, such as "imported ice cream is more expensive" or "ice cream with a certain kind of chocolate is more expensive", but these rules are too complicated and not direct enough. We should take a faster route: crawl the ice cream prices directly, so we can see at a glance which ones are expensive.
Implementation Analysis
This requirement is not very difficult: crawl the ice cream prices, which means picking a store and saving the name and price of each ice cream. I quickly found a target and sent it a request with requests.
As usual, though, the unexpected happened: why is the data missing from the response?

The price should come right after the currency symbol, but nothing is there. This is really strange. The page clearly shows a price in the browser, so why is there none in the response that requests gets back? What is going on here?
Well, all I can do is go looking for it. It isn't hard, and if I guessed right, I think I have found the price.
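Here is a small offline sketch of that symptom. The two HTML snippets are made up for illustration; only the price-rmb class name is taken from the page. In practice the first would be roughly what requests.get(url).text returns, and the second what a browser holds after the page's scripts have run.

```python
from html.parser import HTMLParser


class PriceSpanParser(HTMLParser):
    """Collect the text inside every <span class="price-rmb"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("class") == "price-rmb":
            self._in_price = True
            self.prices.append("")  # reserve a slot even if the span stays empty

    def handle_data(self, data):
        if self._in_price:
            self.prices[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


def extract_prices(html):
    """Return the text of each price span found in the given HTML string."""
    parser = PriceSpanParser()
    parser.feed(html)
    return parser.prices


# Hypothetical markup: static response vs. the same fragment after JavaScript ran.
STATIC_HTML = '<div class="detail"><a>Chocolate ice cream</a></div><span class="price-rmb"></span>'
RENDERED_HTML = '<div class="detail"><a>Chocolate ice cream</a></div><span class="price-rmb">16.00</span>'

print(extract_prices(STATIC_HTML))    # the price slot exists but is empty
print(extract_prices(RENDERED_HTML))  # the price has been filled in
```

An empty string for every price is a good hint that the value is filled in by a script after the page loads, not that the selector is wrong.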

The price turns out to be in this request, under a field named p, which presumably stands for price. So what is this request? It is a jquery file, which means the ice cream prices are written into the page through jquery. That is why they are not visible in the plain response that requests gets back.
Okay, that basically settles it. No need to think further: it's time to bring out selenium again. Some readers may not quite follow. Isn't it just a jquery file? Why not crawl that file and parse it? Why does it have to be selenium?
That line of thinking is correct, but to work out which jquery file belongs to which page, you would probably have to figure out an encrypted parameter along the way, and it is easy to see that this would eat up too much time. Unless there is a special requirement, just use selenium. It is also very simple to use: open a browser, get the page, and read the page's code through driver.page_source, which can then be used just like the response of an ordinary requests call.
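For readers curious what parsing such a file would involve, here is a minimal sketch. It assumes the file is a JSONP-style response, that is, JSON wrapped in a jQuery callback call. The callback name, the "id" field, and the values are invented for illustration; only the p field comes from what I saw in the request.

```python
import json
import re


def parse_jsonp(payload):
    """Strip a JSONP wrapper such as jQuery123(...); and decode the JSON inside."""
    match = re.search(r"^[^(]*\((.*)\)\s*;?\s*$", payload.strip(), re.DOTALL)
    if match is None:
        raise ValueError("payload does not look like JSONP")
    return json.loads(match.group(1))


# Hypothetical payload: the real callback name and fields may differ.
sample = 'jQuery1234567([{"id": "J_100", "p": "16.00"}, {"id": "J_200", "p": "5.00"}]);'

# Map each (assumed) item id to its price as a float.
prices = {item["id"]: float(item["p"]) for item in parse_jsonp(sample)}
print(prices)
```

The parsing itself is the easy part. The hard part is building the right request URL for each page, and this sketch does not attempt that.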
Full code demo
from base64 import b64decode

from lxml import etree
from selenium import webdriver

# The target URL is stored base64-encoded
url = b64decode("aHR0cHM6Ly93d3cuamQuY29tL3BoYi8xMjIxODU1MTY0MzIxMmY1MDE5NTkuaHRtbA==").decode()

# Let a real browser render the page, then parse the rendered source
driver = webdriver.Chrome()
driver.get(url)
html = etree.HTML(driver.page_source)
driver.quit()

i_name = html.xpath("//div[@class='detail']/a/text()")
i_price = html.xpath("//span[@class='price-rmb']/text()")
i_comment = html.xpath("//div[@class='evaluate-detail']/a/text()")

# zip stops at the shortest list, so a missing field cannot raise an IndexError
text = ""
for name, price, comment in zip(i_name, i_price, i_comment):
    text += "Name: " + name + "\n"
    text += "Price: " + price + " yuan\n"
    text += "Comments: " + comment + "\n"
print(text)

The result of running the program is as follows

In general, when you run into a page that is rendered dynamically or needs to execute js, and you have no special requirement such as fast execution and are not willing to pay a high cost to upgrade the program, it is recommended to go straight to a dynamic-rendering tool such as selenium.
It is also clear that this program cannot calculate the unit price of each ice cream. The page I picked is a general listing, and it is hard to extract the quantity from it. If you want to solve this problem, it is better to switch to a more suitable product page.
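If you do want to try on the listing page anyway, one rough heuristic is to look for a piece count in the product name. The sketch below assumes names contain something like "*6", "x6", "×6", "6支" or "6 pieces". These patterns are my guess, not something I verified against the page, and names without such a hint simply return None.

```python
import re

# Assumed patterns for a piece count inside a product name (a heuristic only).
COUNT_PATTERN = re.compile(r"[x×*]\s*(\d+)|(\d+)\s*(?:支|pieces?|pcs)", re.IGNORECASE)


def unit_price(name, price):
    """Return the price per piece, or None when no piece count is found in the name."""
    match = COUNT_PATTERN.search(name)
    if match is None:
        return None
    count = int(match.group(1) or match.group(2))
    if count == 0:
        return None
    return round(float(price) / count, 2)


# Hypothetical product names and prices, for illustration only.
print(unit_price("Chocolate ice cream 75g*6", "30.00"))
print(unit_price("Matcha bar 8支", "40.00"))
print(unit_price("Vanilla cone", "16.00"))
```

A heuristic like this will misread some names, for example a flavor name that happens to contain "x" followed by a number, so treat the result as a hint, not a fact.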