MRS Offline Data Analysis: Processing OBS Data with a Flink Job
2022-07-07 15:36:00 【InfoQ】
Creating an MRS Cluster
Preparing Test Data
This is a test demo for MRS Flink. Flink is a unified computing framework that supports both batch processing and stream processing. It provides a stream data processing engine that supports data distribution and parallel computing.
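The test data can be prepared as a local file before it is uploaded to OBS. The following is a minimal sketch: the file name mrs_flink_test.txt and the OBS path in the comment are taken from the commands later in this article, while the local working directory and the upload step are assumptions.

```shell
# Write the sample text to a local file (the current directory is an assumption).
cat > mrs_flink_test.txt <<'EOF'
This is a test demo for MRS Flink. Flink is a unified computing framework that supports both batch processing and stream processing. It provides a stream data processing engine that supports data distribution and parallel computing.
EOF

# Show the number of words in the file as a quick sanity check.
wc -w < mrs_flink_test.txt

# Assumption: on a cluster client with bigdata_env sourced and OBS access configured,
# the file could then be uploaded with the standard HDFS shell, for example:
#   hdfs dfs -put mrs_flink_test.txt obs://mrs-demo-data/flink/
```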
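Creating and Running a Flink Job
Method 1: Submit the job online from the console.
- Log in to the MRS management console and click the name of the MRS cluster to open the cluster details page.
- On the "Overview" tab of the cluster details page, click "Click to synchronize" to the right of "IAM User Sync" to synchronize IAM users.
- Click "Job Management" to open the "Job Management" tab.
- Click "Add" to add a Flink job with the following settings:
  - Job type: Flink
  - Job name: user-defined, for example flink_obs_test.
  - Program path: this example uses the WordCount program shipped with the Flink client.
  - Runtime parameters: use the default values.
  - Program arguments: the input parameters of the application. "input" is the test data to analyze, and "output" is the file the results are written to.
  - Service configuration parameters: the default values are sufficient. To configure job parameters manually, see "Running a Flink Job" (运行Flink作业).
Method 2: Submit the job from the cluster client.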
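As a rough illustration of the program arguments described above, the sketch below assembles an "input"/"output" argument string. The source does not show the exact value entered in the console: the input path is the one used with the client command in this article, and the output path is a hypothetical example.

```shell
# Input file: the OBS path used elsewhere in this article.
INPUT_PATH="obs://mrs-demo-data/flink/mrs_flink_test.txt"
# Output location: hypothetical, pick any OBS path that does not exist yet.
OUTPUT_PATH="obs://mrs-demo-data/flink/output"

# Argument string in the --input/--output form accepted by the WordCount example.
PROGRAM_ARGS="--input ${INPUT_PATH} --output ${OUTPUT_PATH}"
echo "${PROGRAM_ARGS}"
```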
su - omm
cd /opt/client
source bigdata_env
hdfs dfs -ls obs://mrs-demo-data/flink
flink run -m yarn-cluster /opt/client/Flink/flink/examples/batch/WordCount.jar --input obs://mrs-demo-data/flink/mrs_flink_test.txt --output obs://mrs-demo-data/flink/output2
...
Cluster started: Yarn cluster with application id application_1654672374562_0011
Job has been submitted with JobID a89b561de5d0298cb2ba01fbc30338bc
Program execution finished
Job with JobID a89b561de5d0298cb2ba01fbc30338bc has finished.
Job Runtime: 1200 ms
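The YARN application ID printed in the log above is useful for follow-up checks. The sketch below shows one way to pull it out of saved client output; the helper name extract_app_id is hypothetical, and the sample line is copied from the log above.

```shell
# extract_app_id (hypothetical helper): print the first YARN application ID found on stdin.
extract_app_id() {
  grep -o 'application_[0-9]*_[0-9]*' | head -n 1
}

# Sample line copied from the client log.
sample_log='Cluster started: Yarn cluster with application id application_1654672374562_0011'
app_id=$(printf '%s\n' "$sample_log" | extract_app_id)
echo "$app_id"

# On the cluster client, the ID could then be passed to the standard YARN CLI, for example:
#   yarn application -status "$app_id"
#   yarn logs -applicationId "$app_id"
```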
Viewing the Job Execution Result
a 3
and 2
batch 1
both 1
computing 2
data 2
demo 1
distribution 1
engine 1
flink 2
for 1
framework 1
is 2
it 1
mrs 1
parallel 1
processing 3
provides 1
stream 2
supports 2
test 1
that 2
this 1
unified 1
Job with JobID xxx has finished.
Job Runtime: xxx ms
Accumulator Results:
- e6209f96ffa423974f8c7043821814e9 (java.util.ArrayList) [31 elements]
(a,3)
(and,2)
(batch,1)
(both,1)
(computing,2)
(data,2)
(demo,1)
(distribution,1)
(engine,1)
(flink,2)
(for,1)
(framework,1)
(is,2)
(it,1)
(mrs,1)
(parallel,1)
(processing,3)
(provides,1)
(stream,2)
(supports,2)
(test,1)
(that,2)
(this,1)
(unified,1)
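The counts above can be cross-checked locally without a cluster. The following sketch approximates the WordCount logic with standard shell tools, assuming the example lowercases the text and splits it on non-letter characters, which is consistent with the results shown. The function name word_count is hypothetical.

```shell
# The sample text from the "Preparing Test Data" section.
text='This is a test demo for MRS Flink. Flink is a unified computing framework that supports both batch processing and stream processing. It provides a stream data processing engine that supports data distribution and parallel computing.'

# word_count (hypothetical helper): lowercase stdin, split on non-letters,
# and print "word count" pairs sorted by word.
word_count() {
  tr '[:upper:]' '[:lower:]' \
    | tr -cs '[:alpha:]' '\n' \
    | sed '/^$/d' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{print $2, $1}'
}

counts=$(printf '%s\n' "$text" | word_count)
printf '%s\n' "$counts"
```

Each printed line has the same "word count" form as the output file, so the two can be compared directly, for example with diff.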