Pyspark Machine Learning: Vectors and Common Operations
2022-08-01 04:28:00 【Sun_Sherry】
Spark version: 3.2.1
This article introduces the vector operations in pyspark.ml.linalg.
1. DenseVector (dense vector)
1.1 Creation
A dense vector is similar to an ordinary array. It can be created as follows:
from pyspark.ml import linalg
import numpy as np
dvect1=linalg.Vectors.dense([1,2,3,4,5])
dvect2=linalg.Vectors.dense(1.2,3,3,4,5)
print(dvect1)
print(dvect2)
Printing the two vectors gives [1.0,2.0,3.0,4.0,5.0] and [1.2,3.0,3.0,4.0,5.0]. Note that the elements are stored as floats.
1.2 Common operations
- Element-wise addition, subtraction, multiplication, and division of two vectors of the same length:
res1=dvect1+dvect2
res2=dvect1-dvect2
res3=dvect1*dvect2
res4=dvect1/dvect2
print(res1)
print(res2)
print(res3)
print(res4)
Each result is itself a DenseVector, computed element by element.
- Some numpy.ndarray attributes are available through the array attribute
dvec1_shape=dvect1.array.shape
dvec1_size=dvect1.array.size
print(dvec1_shape)  # result: (5,)
print(dvec1_size)  # result: 5
- dot: the dot product
res_1=dvect1.dot([1,2,3,4,5])
res_2=dvect1.dot([0,1,0,0,0])
res_3=dvect1.dot(dvect2)
print(res_1)  # result: 55.0
print(res_2)  # result: 2.0
print(res_3)  # result: approximately 57.2
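- norm: computing vector norms
dvect1=linalg.Vectors.dense([1,2,3,4,5])
norm_0=dvect1.norm(0)  # number of non-zero elements
norm_1=dvect1.norm(1)  # sum of absolute values
norm_2=dvect1.norm(2)  # Euclidean length
print('L0 norm of dvect1: {}'.format(norm_0))
print('L1 norm of dvect1: {}'.format(norm_1))
print('L2 norm of dvect1: {:.3f}'.format(norm_2))
The results are 5.0, 15.0, and 7.416, respectively.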
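To make the three norms concrete, the following NumPy-only sketch computes the same quantities by hand for the vector [1, 2, 3, 4, 5]. It illustrates the definitions and is not meant to reproduce PySpark's internal implementation.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

l0 = float(np.count_nonzero(x))       # "L0 norm": count of non-zero elements
l1 = float(np.sum(np.abs(x)))         # L1 norm: sum of absolute values
l2 = float(np.sqrt(np.sum(x ** 2)))   # L2 norm: square root of the sum of squares

print(l0, l1, round(l2, 3))
```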
- numNonzeros() counts the non-zero elements
dvect1=linalg.Vectors.dense([1,0,3,0,5])
num_nonzero=dvect1.numNonzeros()
print(num_nonzero)  # result: 3
- squared_distance() returns the squared Euclidean distance between two vectors of the same dimension
dvect1=linalg.Vectors.dense([1,0,3])
dvect2=linalg.Vectors.dense([1,1,1])
dist=dvect1.squared_distance(dvect2)  # value: 5.0
- Getting the values of a vector (toArray() and values both return a NumPy array)
dvect1=linalg.Vectors.dense([1,0,3])
print(dvect1.toArray())
print(dvect1.values)
2. SparseVector (sparse vector)
2.1 Creation
A sparse vector can be created in several ways:
- Vectors.sparse(size, index array, array of values corresponding to the indices), where indices start at 0 (same below);
- Vectors.sparse(size, {index: value, index: value, ...})
- Vectors.sparse(size, [(index, value), (index, value), ...])
For example:
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
2.2 Common operations
Some operations on sparse vectors are the same as those on dense vectors and are not repeated here. Only the following two are introduced:
- toArray() returns all values of a sparse vector, including the zeros
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
print(svect1.toArray())
print(svect2.toArray())
print(svect3.toArray())
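The resulting arrays hold the values [3.4, 4.5, 0.0], [3.4, 0.0, 4.5], and [0.0, 0.0, 3.0, 2.3], respectively.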
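Conceptually, expanding a sparse vector means starting from all zeros and writing each stored value at its index. The pure-Python sketch below illustrates that idea; densify is a hypothetical helper written for this explanation and is not part of PySpark.

```python
def densify(size, indices, values):
    """Expand a (size, indices, values) sparse description into a full list.

    Hypothetical helper for illustration only; not a PySpark function.
    """
    dense = [0.0] * size              # every position defaults to zero
    for i, v in zip(indices, values):
        dense[i] = float(v)           # write each stored value at its index
    return dense

print(densify(3, [0, 1], [3.4, 4.5]))   # [3.4, 4.5, 0.0]
print(densify(4, [2, 3], [3, 2.3]))     # [0.0, 0.0, 3.0, 2.3]
```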
- indices returns the indices of the non-zero elements stored in a sparse vector
svect1=linalg.Vectors.sparse(3,[0,1],[3.4,4.5])
svect2=linalg.Vectors.sparse(3,{0:3.4,2:4.5})
svect3=linalg.Vectors.sparse(4,[(2,3),(3,2.3)])
print(svect1.indices)  # prints [0 1] (a NumPy array; same below)
print(svect2.indices)  # prints [0 2]
print(svect3.indices)  # prints [2 3]