Pyspark Machine Learning: Vectors and Common Operations
2022-08-01 04:28:00 【Sun_Sherry】
Spark version: 3.2.1
This post introduces the vector operations in pyspark.ml.linalg.
1. DenseVector (dense vector)
1.1 Creation
A dense vector is similar to an ordinary array. It is created as follows:
from pyspark.ml import linalg
import numpy as np
dvect1=linalg.Vectors.dense([1,2,3,4,5])
dvect2=linalg.Vectors.dense(1.2,3,3,4,5)
print(dvect1)
print(dvect2)
In the printed output, note that the elements are floats, even though integers were passed in.
1.2 Common operations
- Element-wise addition, subtraction, multiplication and division of two vectors of the same length:
res1=dvect1+dvect2
res2=dvect1-dvect2
res3=dvect1*dvect2
res4=dvect1/dvect2
print(res1)
print(res2)
print(res3)
print(res4)
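Each result is again a DenseVector, computed element by element.
- Some numpy.ndarray attributes can be used through the array attribute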
dvec1_shape=dvect1.array.shape
dvec1_size=dvect1.array.size
print(dvec1_shape)  # result: (5,)
print(dvec1_size)  # result: 5
- dot(): dot product
res_1=dvect1.dot([1,2,3,4,5])
res_2=dvect1.dot([0,1,0,0,0])
res_3=dvect1.dot(dvect2)
print(res_1)  # result: 55
print(res_2)  # result: 2
print(res_3)  # result: 57.2
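- norm(p): compute the norm of a vector
dvect1=linalg.Vectors.dense([1,2,3,4,5])
norm_0=dvect1.norm(0)
norm_1=dvect1.norm(1)
norm_2=dvect1.norm(2)  # L2 norm of dvect1
print('L0 norm of dvect1: {}'.format(norm_0))
print('L1 norm of dvect1: {}'.format(norm_1))
print('L2 norm of dvect1: {:.3f}'.format(norm_2))
For dvect1 the L0 norm is 5 (the number of non-zero elements), the L1 norm is 15 and the L2 norm is about 7.416.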
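To make the meaning of the three orders concrete, here is a minimal pure-Python sketch of what a norm of order 0, 1 and 2 computes. The helper name p_norm is hypothetical and is not part of pyspark; it assumes the numpy.linalg.norm convention in which order 0 counts the non-zero elements.

```python
def p_norm(values, p):
    """Hypothetical helper: norm of order p for a flat sequence of numbers."""
    if p == 0:
        # "L0 norm": the number of non-zero entries (not a true norm).
        return float(sum(1 for v in values if v != 0))
    # General case: (sum of |v|^p)^(1/p); p=1 and p=2 give the L1 and L2 norms.
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

vec = [1, 2, 3, 4, 5]
l0 = p_norm(vec, 0)
l1 = p_norm(vec, 1)
l2 = p_norm(vec, 2)
print(l0, l1, round(l2, 3))
```

The L2 value is the square root of 1+4+9+16+25 = 55, which is why it is the only one of the three that is not a whole number here.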
- numNonzeros(): count the non-zero elements
dvect1=linalg.Vectors.dense([1,0,3,0,5])
num_nonzero=dvect1.numNonzeros()
print(num_nonzero)  # result: 3
- squared_distance(): squared distance between two vectors of the same dimension
dvect1=linalg.Vectors.dense([1,0,3])
dvect2=linalg.Vectors.dense([1,1,1])
dist=dvect1.squared_distance(dvect2)  # value: 5
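The squared distance is the sum of the squared element-wise differences, which also equals a·a - 2a·b + b·b. The following pure-Python sketch checks both forms on the same two vectors; the function names are hypothetical and only illustrate the arithmetic, not how pyspark implements it.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

def squared_distance(a, b):
    """Sum of squared element-wise differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))

a = [1.0, 0.0, 3.0]
b = [1.0, 1.0, 1.0]
direct = squared_distance(a, b)                   # 0 + 1 + 4
via_dot = dot(a, a) - 2 * dot(a, b) + dot(b, b)   # 10 - 8 + 3
print(direct, via_dot)
```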
- Get the values of a vector
dvect1=linalg.Vectors.dense([1,0,3])
print(dvect1.toArray())
print(dvect1.values)
2. SparseVector (sparse vector)
2.1 Creation
There are several ways to create a sparse vector:
- Vectors.sparse(size, indices, values): an array of indices and an array of the corresponding values; indices start at 0 (the same applies below);
- Vectors.sparse(size, {index: value, index: value, ...})
- Vectors.sparse(size, [(index, value), (index, value), ...])
For example:
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
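All three input forms describe the same information: a size plus a set of (index, value) pairs. The pure-Python sketch below reduces each form to one (size, indices, values) triple to show that they are interchangeable. The helper normalize_sparse_args is hypothetical and is a simplified illustration, not pyspark's actual constructor logic.

```python
def normalize_sparse_args(size, *args):
    """Hypothetical helper: reduce the three input forms to (size, indices, values)."""
    if len(args) == 1:
        # Either a dict {index: value} or a list of (index, value) pairs.
        pairs = args[0].items() if isinstance(args[0], dict) else args[0]
        pairs = sorted(pairs)  # keep indices in increasing order
        indices = [int(i) for i, _ in pairs]
        values = [float(v) for _, v in pairs]
    elif len(args) == 2:
        # Two parallel sequences: the indices and their values.
        indices = [int(i) for i in args[0]]
        values = [float(v) for v in args[1]]
    else:
        raise TypeError("expected a dict, a list of pairs, or two sequences")
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    if any(i < 0 or i >= size for i in indices):
        raise ValueError("index out of range for the given size")
    return size, indices, values

form_a = normalize_sparse_args(3, [0, 2], [3.4, 4.5])
form_b = normalize_sparse_args(3, {2: 4.5, 0: 3.4})
form_c = normalize_sparse_args(3, [(2, 4.5), (0, 3.4)])
print(form_a == form_b == form_c)
```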
2.2 Common operations
Some operations on sparse vectors are the same as those on dense vectors and are not repeated here. Only the following two are introduced:
- toArray(): show all values of a sparse vector
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
print(svect1.toArray())
print(svect2.toArray())
print(svect3.toArray())
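Each call returns a numpy array of the full length, with 0 at every position that has no stored value; for example, svect1 expands to the three values 3.4, 4.5 and 0.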
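The expansion from the sparse form to the full array can be sketched in pure Python as follows. The helper to_dense is hypothetical; it only illustrates the idea of filling a zero-initialised list from the stored indices and values, and it returns a plain list rather than a numpy array.

```python
def to_dense(size, indices, values):
    """Hypothetical helper: expand (size, indices, values) into a full list."""
    dense = [0.0] * size          # every position starts as 0
    for i, v in zip(indices, values):
        dense[i] = float(v)       # overwrite only the stored positions
    return dense

dense1 = to_dense(3, [0, 1], [3.4, 4.5])
dense3 = to_dense(4, [2, 3], [3, 2.3])
print(dense1)
print(dense3)
```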
- indices: the indices of the non-zero (stored) elements of a sparse vector; this is an attribute, not a method
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
print(svect1.indices)  # returns [0 1] (an array; the same applies below)
print(svect2.indices)  # returns [0 2]
print(svect3.indices)  # returns [2 3]