Pyspark Machine Learning: Vectors and Common Operations
2022-08-01 04:28:00 【Sun_Sherry】
Spark version: 3.2.1
This article introduces the vector operations in pyspark.ml.linalg.
1. DenseVector (dense vector)
1.1 Creation
A dense vector is similar to an ordinary array. It can be created as follows:
from pyspark.ml import linalg
import numpy as np
dvect1=linalg.Vectors.dense([1,2,3,4,5])
dvect2=linalg.Vectors.dense(1.2,3,3,4,5)
print(dvect1)
print(dvect2)
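Printing them shows that the elements are stored as floats: dvect1 prints as [1.0,2.0,3.0,4.0,5.0] and dvect2 as [1.2,3.0,3.0,4.0,5.0].
1.2 Common operations
- Element-wise addition, subtraction, multiplication and division of two vectors of the same length: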
res1=dvect1+dvect2
res2=dvect1-dvect2
res3=dvect1*dvect2
res4=dvect1/dvect2
print(res1)
print(res2)
print(res3)
print(res4)
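Each result is a DenseVector holding the element-wise sum, difference, product and quotient respectively.
- Some numpy.ndarray attributes are available through the vector's array attribute: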
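The element-wise behaviour of these four operators can be sketched in plain Python. This is only an illustration of the arithmetic, not the pyspark implementation; `elementwise` is a hypothetical helper introduced here, and the two lists mirror the values of dvect1 and dvect2.

```python
import operator

def elementwise(op, a, b):
    """Hypothetical helper: apply a binary operator position by position."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [op(x, y) for x, y in zip(a, b)]

a = [1.0, 2.0, 3.0, 4.0, 5.0]  # same values as dvect1
b = [1.2, 3.0, 3.0, 4.0, 5.0]  # same values as dvect2

added = elementwise(operator.add, a, b)
subtracted = elementwise(operator.sub, a, b)
multiplied = elementwise(operator.mul, a, b)
divided = elementwise(operator.truediv, a, b)

print(added)
print(subtracted)
print(multiplied)
print(divided)
```

The length check makes the "same length" requirement explicit: each output position depends only on the two inputs at that position.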
dvec1_shape=dvect1.array.shape
dvec1_size=dvect1.array.size
print(dvec1_shape)  # result: (5,)
print(dvec1_size)  # result: 5
- dot(): dot product
res_1=dvect1.dot([1,2,3,4,5])
res_2=dvect1.dot([0,1,0,0,0])
res_3=dvect1.dot(dvect2)
print(res_1)  # result: 55.0
print(res_2)  # result: 2.0
print(res_3)  # result: 57.2
- Computing the norm of a vector
dvect1=linalg.Vectors.dense([1,2,3,4,5])
norm_0=dvect1.norm(0)
norm_1=dvect1.norm(1)
norm_2=dvect1.norm(2)
print('L0 norm of dvect1: {}'.format(norm_0))
print('L1 norm of dvect1: {}'.format(norm_1))
print('L2 norm of dvect1: {:.3f}'.format(norm_2))
The results are 5.0 (the number of non-zero elements), 15.0 and 7.416 respectively.
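To make clear what the three norms measure, here is a plain-Python sketch. It assumes the usual NumPy convention for vector norms, in which p = 0 counts the non-zero elements; `p_norm` is a hypothetical helper, not a pyspark function.

```python
def p_norm(values, p):
    """Hypothetical helper: p-norm of a vector, with p = 0 counting non-zeros."""
    if p == 0:
        return float(sum(1 for v in values if v != 0))
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

v = [1.0, 2.0, 3.0, 4.0, 5.0]  # same values as dvect1

l0 = p_norm(v, 0)  # number of non-zero elements
l1 = p_norm(v, 1)  # sum of absolute values
l2 = p_norm(v, 2)  # Euclidean length, sqrt(1 + 4 + 9 + 16 + 25)

print(l0, l1, round(l2, 3))
```

Note that the "L0 norm" is not a norm in the strict mathematical sense. It is just a count, which is why it gives the same number as a non-zero element count on the same vector.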
- numNonzeros(): counts the non-zero elements
dvect1=linalg.Vectors.dense([1,0,3,0,5])
num_nonzero=dvect1.numNonzeros()
print(num_nonzero)  # result: 3
- squared_distance(): squared Euclidean distance between two vectors of the same dimension
dvect1=linalg.Vectors.dense([1,0,3])
dvect2=linalg.Vectors.dense([1,1,1])
dist=dvect1.squared_distance(dvect2)  # value: 5.0
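The value 5 can be checked by hand: the squared distance is the sum of the squared differences at each position. A minimal plain-Python sketch follows; the function below is an illustrative stand-in written for this article, not the pyspark method itself.

```python
def squared_distance(a, b):
    """Illustrative stand-in: sum of squared element-wise differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))

# (1-1)^2 + (0-1)^2 + (3-1)^2 = 0 + 1 + 4
dist = squared_distance([1.0, 0.0, 3.0], [1.0, 1.0, 1.0])
print(dist)  # 5.0
```

Taking the square root of this value gives the ordinary Euclidean distance, which equals the L2 norm of the difference between the two vectors.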
- Getting the values of a vector
dvect1=linalg.Vectors.dense([1,0,3])
print(dvect1.toArray())
print(dvect1.values)
2. SparseVector (sparse vector)
2.1 Creation
There are several ways to create a sparse vector:
- Vectors.sparse(size, indices, values), where indices and values are parallel arrays and indices are numbered from 0 (the same applies below);
- Vectors.sparse(size, {index: value, index: value, ...})
- Vectors.sparse(size, [(index, value), (index, value), ...])
For example:
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
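All three forms carry the same information: a size plus a set of (index, value) pairs. The plain-Python sketch below shows how they can be reduced to one common representation. `normalize_sparse` is a hypothetical helper written for illustration and makes no claim about how pyspark parses its arguments internally.

```python
def normalize_sparse(size, *args):
    """Hypothetical helper: reduce any of the three input forms to
    (size, sorted indices, matching values)."""
    if len(args) == 2:                 # form 1: indices array, values array
        pairs = list(zip(args[0], args[1]))
    elif isinstance(args[0], dict):    # form 2: {index: value}
        pairs = list(args[0].items())
    else:                              # form 3: [(index, value), ...]
        pairs = list(args[0])
    pairs.sort()                       # order by index
    for index, _ in pairs:
        if not 0 <= index < size:
            raise ValueError("index {} out of range".format(index))
    indices = [i for i, _ in pairs]
    values = [float(v) for _, v in pairs]
    return size, indices, values

form1 = normalize_sparse(3, [0, 1], [3.4, 4.5])
form2 = normalize_sparse(3, {0: 3.4, 1: 4.5})
form3 = normalize_sparse(3, [(0, 3.4), (1, 4.5)])
print(form1 == form2 == form3)  # True
```

Which form to use is mostly a matter of convenience: parallel arrays when indices and values already come as separate sequences, a dict or a list of pairs when they are produced together.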
2.2 Common operations
Some operations on sparse vectors are the same as those on dense vectors and are not repeated here. Only the following two are introduced:
- toArray(): returns all elements of a sparse vector, zeros included
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
print(svect1.toArray())
print(svect2.toArray())
print(svect3.toArray())
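The results are dense NumPy arrays with the zeros filled in: svect1 becomes (3.4, 4.5, 0), svect2 becomes (3.4, 0, 4.5), and svect3 becomes (0, 0, 3, 2.3).
- indices: an attribute (not a method) holding the indices of the stored, non-zero elements of a sparse vector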
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
print(svect1.indices)  # returns [0 1] (a NumPy array; the same applies below)
print(svect2.indices)  # returns [0 2]
print(svect3.indices)  # returns [2 3]
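The indices only become meaningful together with the vector's size and the values stored at those positions. As a plain-Python sketch of that relationship, the hypothetical helper `to_dense` below rebuilds the full element list from the three pieces. It illustrates the idea only and is not pyspark code.

```python
def to_dense(size, indices, values):
    """Hypothetical helper: expand (size, indices, values) into a full list."""
    dense = [0.0] * size               # every position starts at zero
    for i, v in zip(indices, values):
        dense[i] = float(v)            # fill in only the stored entries
    return dense

# Same data as svect3: size 4, value 3 at index 2 and value 2.3 at index 3.
print(to_dense(4, [2, 3], [3, 2.3]))  # [0.0, 0.0, 3.0, 2.3]
```

Only two of the four positions are stored here, which is where a sparse vector saves space compared with the dense form.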