When Pandas Meets SQL: A Powerful Tool Library Is Born
2022-06-23 10:15:00 【Python数据挖掘】
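All demo data in this article comes from four tables: student, sc, course, and teacher. They should look familiar, since they are the source tables used in the widely circulated "50 classic MySQL interview questions". The relationships between the tables are not spelled out here; they are apparent from the column names.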

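Introduction to pandasql
A pandas DataFrame is a two-dimensional table, and so is a database table, so querying DataFrames with SQL statements is a natural fit. pandasql uses SQLite as its query engine. Python ships with the sqlite3 module, so no separate database has to be installed; pandasql itself is a third-party package and is installed with pip install pandasql.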
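To make the idea concrete, the sketch below mimics what a tool like pandasql has to do: copy a DataFrame into an in-memory SQLite database, run the query, and read the result back as a DataFrame. It uses only pandas and the standard-library sqlite3 module. The helper name `run_sql` and the sample rows are made up for illustration, and pandasql's real implementation may differ.

```python
import sqlite3

import pandas as pd


def run_sql(query, tables):
    """Run `query` against DataFrames given as {table_name: DataFrame}.

    Hypothetical helper for illustration; not part of pandasql.
    """
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        for name, frame in tables.items():
            # Copy each DataFrame into a SQLite table of the same name
            frame.to_sql(name, conn, index=False)
        # Execute the SQL and return the rows as a new DataFrame
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


# Made-up sample rows, not the article's data
student = pd.DataFrame({"sid": [1, 2, 3], "sname": ["Ann", "Bob", "Cid"]})
result = run_sql(
    "select sname from student where sid >= 2 order by sid",
    {"student": student},
)
print(result)
```

Passing the tables as an explicit dict keeps the sketch simple; pandasql instead resolves table names from Python variables, as the following sections show.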
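One thing to note: when pandasql reads a date column from a DataFrame, the values come back with both the date and the time (year, month, day, hours, minutes, seconds) by default. It is therefore worth learning SQLite's date functions, such as strftime, to convert the values to the format you need. A reference of commonly used SQLite functions is linked below.
SQLite function reference: http://suo.im/5DWraE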
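As a quick illustration of those date functions, the snippet below runs `date` and `strftime` on a timestamp string through the standard-library sqlite3 module, without pandasql. The timestamp value is a made-up example of how a datetime column may look once it is stored as text in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A timestamp stored as text (made-up value for illustration)
stamp = "1990-12-21 00:00:00"

row = conn.execute(
    "select date(?), strftime('%Y-%m', ?), strftime('%Y', ?)",
    (stamp, stamp, stamp),
).fetchone()
conn.close()

print(row)  # ('1990-12-21', '1990-12', '1990')
```

The same `strftime` patterns are used in the where and union queries later in the article.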
Import the required libraries:
import pandas as pd
from pandasql import sqldf
Two ways to declare global variables
① Declare each global variable before it is used;
② Declare all global variables in one go.
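Declaring the global variables before use
# Declare the names as global before assigning them. In a single script, a `global`
# statement placed after the assignment raises a SyntaxError; at module level the
# statement is redundant anyway, because module-level names are already global.
global df1, df2, df3, df4
# pd.read_excel needs an Excel engine such as openpyxl to read .xlsx files
df1 = pd.read_excel("student.xlsx")
df2 = pd.read_excel("sc.xlsx")
df3 = pd.read_excel("course.xlsx")
df4 = pd.read_excel("teacher.xlsx")
query1 = "select * from df1 limit 5"
query2 = "select * from df2 limit 5"
query3 = "select * from df3"
query4 = "select * from df4"
# With no second argument, sqldf looks up df1..df4 in the calling scope
sqldf(query1)
sqldf(query2)
sqldf(query3)
sqldf(query4)
Partial results: [screenshot of the query output]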

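Declaring all global variables in one go
df1 = pd.read_excel("student.xlsx")
df2 = pd.read_excel("sc.xlsx")
df3 = pd.read_excel("course.xlsx")
df4 = pd.read_excel("teacher.xlsx")
# Wrap sqldf so that every query is resolved against the module's global variables
pysqldf = lambda q: sqldf(q, globals())
query1 = "select * from df1 limit 5"
query2 = "select * from df2 limit 5"
query3 = "select * from df3"
query4 = "select * from df4"
# Call the wrapper defined above rather than sqldf directly
pysqldf(query1)
pysqldf(query2)
pysqldf(query3)
pysqldf(query4)
Partial results: [screenshot of the query output]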
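The two approaches differ in how the DataFrames reach the query function: either the caller hands over a namespace such as `globals()`, or the function looks the names up itself in the calling scope. The sketch below illustrates both options using only the standard library. `find_tables` is a hypothetical helper and lists stand in for DataFrames; pandasql's own lookup may be implemented differently.

```python
import inspect


def find_tables(names, env=None):
    """Return {name: object} for the requested variable names.

    Hypothetical helper: if no namespace is passed, fall back to the
    caller's local and global variables.
    """
    if env is None:
        caller = inspect.currentframe().f_back
        # Locals take precedence over globals, as in normal name lookup
        env = {**caller.f_globals, **caller.f_locals}
    return {name: env[name] for name in names if name in env}


df1 = ["stand-in for a DataFrame"]


def inside_function():
    df2 = ["local stand-in"]
    # Explicit namespace: only what locals() contains is visible
    explicit = find_tables(["df1", "df2"], locals())
    # Implicit lookup: both the local df2 and the global df1 are found
    implicit = find_tables(["df1", "df2"])
    return explicit, implicit


explicit, implicit = inside_function()
print(sorted(explicit))  # ['df2']
print(sorted(implicit))  # ['df1', 'df2']
```

Passing `globals()` explicitly, as the lambda wrapper does, makes it obvious which variables a query can see, at the cost of hiding DataFrames that exist only inside a function.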

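A few simple SQL statements
Checking the SQLite version
student = pd.read_excel("student.xlsx")
pysqldf = lambda q: sqldf(q, globals())
# sqlite_version() takes no arguments and needs no table
query1 = """ select sqlite_version() """
pysqldf(query1)
Result: [screenshot of the query output]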

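Filtering with where
student = pd.read_excel("student.xlsx")
pysqldf = lambda q: sqldf(q, globals())
# sage carries date and time, so format it as year-month-day before comparing
query1 = """ select * from student where strftime('%Y-%m-%d',sage) = '1990-01-01' """
pysqldf(query1)
Result: [screenshot of the query output]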

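Joining multiple tables
student = pd.read_excel("student.xlsx")
sc = pd.read_excel("sc.xlsx")
pysqldf = lambda q: sqldf(q, globals())
# Inner join on sid; select * returns the sid column of both tables
query2 = """ select * from student s join sc on s.sid = sc.sid """
pysqldf(query2)
Partial results: [screenshot of the query output]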
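For comparison, the same inner join can be written in pandas with `DataFrame.merge`. The rows below are made up for illustration; only the column names `sid`, `sname`, and `score` come from the queries in this article.

```python
import pandas as pd

# Made-up sample rows; column names follow those used in the queries
student = pd.DataFrame({"sid": [1, 2, 3], "sname": ["Ann", "Bob", "Cid"]})
sc = pd.DataFrame({"sid": [1, 1, 2], "score": [80, 90, 70]})

# Inner join on sid, the counterpart of `student s join sc on s.sid = sc.sid`
joined = student.merge(sc, on="sid", how="inner")
print(joined)
```

One visible difference: `merge` keeps a single `sid` column, whereas `select *` on the SQL join returns the key column from both tables. Student 3 has no scores in this sample and therefore drops out of the inner join.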

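Grouping and aggregation
student = pd.read_excel("student.xlsx")
sc = pd.read_excel("sc.xlsx")
pysqldf = lambda q: sqldf(q, globals())
# Total score per student; the column aliases 姓名 and 总分 mean "name" and "total score"
query2 = """ select s.sname as 姓名,sum(sc.score) as 总分 from student s join sc on s.sid = sc.sid group by s.sname """
pysqldf(query2)
Result: [screenshot of the query output]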
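Grouping by name alone merges students who happen to share a name. The pandas sketch below shows the effect on made-up rows in which two different students are both called "Ann", and how adding `sid` to the grouping key keeps them apart. The data is hypothetical and is not taken from the article's tables.

```python
import pandas as pd

# Made-up rows: two different students share the name "Ann"
student = pd.DataFrame({"sid": [1, 2, 3], "sname": ["Ann", "Ann", "Bob"]})
sc = pd.DataFrame({"sid": [1, 2, 3, 3], "score": [80, 60, 70, 90]})

joined = student.merge(sc, on="sid")

# Counterpart of `group by s.sname`: both Anns collapse into one row
by_name = joined.groupby("sname", as_index=False)["score"].sum()

# Grouping by sid as well keeps the two Anns apart
by_student = joined.groupby(["sid", "sname"], as_index=False)["score"].sum()

print(by_name)
print(by_student)
```

In SQL the equivalent change would be `group by s.sid, s.sname`.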

Union queries
student = pd.read_excel("student.xlsx")
pysqldf = lambda q: sqldf(q, globals())
# Students born in January 1990 combined with those born in December 1990
query1 = """ select * from student where strftime('%Y-%m',sage) = '1990-01' union select * from student where strftime('%Y-%m',sage) = '1990-12' """
pysqldf(query1)
Result: [screenshot of the query output]
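Because both branches of that union read the same table, the query can also be expressed with a single `in` filter. The sketch below checks this on a small made-up table using the standard-library sqlite3 module. Keep in mind that `union` removes duplicate rows, while `union all` keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (sid integer, sage text)")
# Made-up rows for illustration
conn.executemany(
    "insert into student values (?, ?)",
    [
        (1, "1990-01-01 00:00:00"),
        (2, "1990-12-21 00:00:00"),
        (3, "1991-05-20 00:00:00"),
    ],
)

month = "strftime('%Y-%m', sage)"
union_rows = conn.execute(
    f"select sid from student where {month} = '1990-01' "
    f"union select sid from student where {month} = '1990-12' order by sid"
).fetchall()
in_rows = conn.execute(
    f"select sid from student where {month} in ('1990-01', '1990-12') order by sid"
).fetchall()
conn.close()

print(union_rows)  # [(1,), (2,)]
print(in_rows)     # [(1,), (2,)]
```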
