How a Programmer Without a Partner Spends Qixi
2022-08-05 01:50:00 【Z_Xshan】

Run the script, enter a video link, and it downloads the video. Here is the code:
# encoding: utf-8
'''
Download a Bilibili video
'''
import json
import os
import re
import subprocess

import requests
from requests.exceptions import RequestException


class BilibiliVideoSpider(object):
    def __init__(self, url, output_root=''):
        self.url = url
        # Fall back to the script's directory when output_root is not an existing directory
        if not os.path.isdir(output_root):
            output_root = os.path.abspath(os.path.dirname(__file__))
        self.output_root = output_root
        self.headers = {
            'Accept': '*/*',
            'Accept-Language': 'en-US,en;q=0.5',
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36'
        }  # request headers

    def _match(self, text, pattern):
        '''Return the JSON captured by group 1 of pattern'''
        match = re.search(pattern, text)
        if match is None:
            # Fail with a clear error instead of crashing on match.group(1)
            raise ValueError(f'pattern was not matched: {pattern}')
        return json.loads(match.group(1))

    def getHtml(self):
        try:
            response = requests.get(url=self.url, headers=self.headers)  # send the request and get the response object
            print(f'status_code: {response.status_code}')
            if response.status_code == 200:
                return response
        except RequestException as e:
            print(f'html request error: {e}')
        return None

    def parseHtml(self, response):
        playinfo = self._match(response.text, '__playinfo__=(.*?)</script><script>')  # playback info JSON
        initial_state = self._match(response.text, r'__INITIAL_STATE__=(.*?);\(function\(\)')  # video metadata JSON
        # The video comes in several formats; take the first entry (the highest resolution, 1080p)
        video_url = playinfo['data']['dash']['video'][0]['baseUrl']
        audio_url = playinfo['data']['dash']['audio'][0]['baseUrl']  # audio stream URL
        video_name = initial_state['videoData']['title']  # video title
        return video_url, audio_url, video_name

    def _download(self, url, path, label):
        '''Stream one media file to disk in chunks instead of fetching it twice'''
        with requests.get(url, headers=self.headers, stream=True) as response:
            response.raise_for_status()
            print(f'{label} size:', response.headers.get('content-length'))
            with open(path, 'wb') as output:
                for chunk in response.iter_content(chunk_size=1024 * 1024):
                    output.write(chunk)

    def downloadVideo(self, video_url, audio_url, video_name):
        self.headers.update({"Referer": self.url})
        print('Starting download: ')
        video = os.path.join(self.output_root, 'video.mp4')
        audio = os.path.join(self.output_root, 'audio.mp4')
        self._download(video_url, video, f'{video_name} video')
        self._download(audio_url, audio, f'{video_name} audio')
        print('Download finished')
        video_dst = os.path.join(self.output_root, 'download.mp4')
        self.video_audio_merge(video, audio, video_dst)
        print(f'Downloaded video: {video_dst}')
        # Remove the intermediate files once the merge has finished
        os.remove(video)
        os.remove(audio)

    def video_audio_merge(self, video_src, audio_src, video_dst):
        '''Merge a single video stream and audio stream with ffmpeg'''
        # Passing a list avoids shell quoting problems with paths that contain spaces
        cmd = ['ffmpeg', '-y', '-i', audio_src, '-i', video_src,
               '-vcodec', 'copy', '-acodec', 'aac', '-strict', '-2', '-q:v', '1', video_dst]
        print('execute cmd:', ' '.join(cmd))
        subprocess.run(cmd, check=True)

    def run(self):
        response = self.getHtml()
        if response is None:
            print('could not fetch the page, aborting')
            return
        video_url, audio_url, video_name = self.parseHtml(response)
        self.downloadVideo(video_url, audio_url, video_name)


def demo():
    # url = 'https://www.bilibili.com/video/BV1Q5411p7bz?from=search&seid=14643382716113842219'
    url = input('Enter the video URL: ')
    b = BilibiliVideoSpider(url)
    b.run()


if __name__ == '__main__':
    demo()
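The parsing step in the script depends on pulling a JSON object out of an inline script tag with a non-greedy regex. Here is a minimal sketch of that idea on its own, run against a hand-written HTML fragment. The fragment, the `example.com` URL, and the helper name `extract_json` are assumptions made for illustration; the real page layout may differ and can change at any time.

```python
import json
import re

# Hypothetical page fragment, written by hand for illustration only.
SAMPLE_HTML = (
    '<script>window.__playinfo__='
    '{"data": {"dash": {"video": [{"baseUrl": "https://example.com/v.m4s"}]}}}'
    '</script><script>window.other = 1;</script>'
)


def extract_json(text, pattern):
    """Return the JSON value captured by group 1 of pattern, or raise ValueError."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f'pattern not matched: {pattern}')
    return json.loads(match.group(1))


# Same kind of pattern as the script: capture everything up to the next script tag.
playinfo = extract_json(SAMPLE_HTML, r'__playinfo__=(.*?)</script><script>')
first_video_url = playinfo['data']['dash']['video'][0]['baseUrl']
print(first_video_url)
```

Raising an error when the pattern does not match makes a changed page layout show up as a clear message rather than as an `AttributeError` on `None`.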
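When the script finishes, the merged video is saved as download.mp4, by default in the directory that contains the script.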
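Because the output name is fixed, each run overwrites the previous download. One possible variation, not part of the original script, is to name the file after the video title. The sketch below shows a conservative way to turn a title into a file name; the helper `safe_filename` and its defaults are hypothetical.

```python
import re


def safe_filename(title, default='download', max_length=80):
    """Turn a video title into a conservative file name (without extension)."""
    # Collapse characters that are invalid on Windows or awkward in shells into "_".
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', '_', title).strip('._')
    # Fall back to the default name if nothing usable is left.
    return cleaned[:max_length] or default


print(safe_filename('My video: part 1/2?') + '.mp4')
```

The result could replace the hard-coded `download.mp4` when building the destination path, assuming the title extracted from the page is passed through this helper first.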