Alluxio Empowers Presto with Cross-Cloud Self-Service Capabilities
2022-07-30 15:32:00 【Alluxio】
Table of Contents
What kind of architecture is self-service capable?
Considerations for designing a data platform
This article highlights the synergy between two popular open source projects, Alluxio and Presto, and shows how to leverage both to enable cross-cloud self-service data architectures.
About the authors
Fan Bin, Alluxio VP of Open Source and Founding Member
Adit Madan, Alluxio Senior Product Manager
Jasmine Wang, Alluxio Community Manager
What kind of architecture is self-service capable?
Let's start with a question: what conditions must an architecture meet to be called self-service?
Condition 1: As the data platform is updated, the architecture does not need to be modified
All data platforms evolve over time: new data stores or compute engines are added, or new teams need access to shared data. In either case, the platform is self-service capable if these changes do not require modifying the existing architecture.
Condition 2: Data isolation across teams
With a self-service platform, business units do not interfere with one another. When a new team joins, data can be shared with it, and its data access does not affect existing users of the platform.
Agility is achieved when both conditions are met. When designing an architecture, enabling self-service matters more than the cost of the physical architecture.
Considerations for designing a data platform
Below, we describe some of the considerations when designing a self-service platform, along with simplified architectural patterns and solutions.
Consideration 1: Data is shared
Share data between different computing frameworks
- Enterprises use a variety of compute engines in their data platform, each handling a specific task. For example, ETL batch processing runs first, and Presto is then used for interactive queries. This means data is shared between different engines and between different teams
- For example, one team is responsible for collecting business data and sharing it with multiple business units
Data sharing across regional data centers and across cloud vendors
- This allows the flexibility to choose the optimal storage environment and cloud service
To solve the data-sharing problem, we propose an abstraction layer that enables heterogeneous computing across environments. Alluxio provides such a cross-cloud abstraction layer, enabling seamless data sharing between Presto and other compute engines, no matter where the data is stored.
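The idea of an abstraction layer can be illustrated with a toy example. The sketch below is not Alluxio's API; `MountTable`, `mount`, `resolve`, and all paths and URIs are hypothetical names chosen for illustration. It assumes the layer exposes one logical namespace and maps each logical path to whichever storage system holds the data, using the longest matching mount prefix.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MountTable:
    """Toy unified namespace: maps logical path prefixes to storage URIs."""

    mounts: dict[str, str] = field(default_factory=dict)

    def mount(self, logical_prefix: str, storage_uri: str) -> None:
        # Normalize so that "/sales/" and "/sales" name the same mount point.
        key = logical_prefix.rstrip("/") or "/"
        self.mounts[key] = storage_uri.rstrip("/")

    def resolve(self, logical_path: str) -> str:
        # Collect every mount point that contains the path, then pick the
        # longest one so nested mounts override their parents.
        candidates = [
            p for p in self.mounts
            if logical_path == p or logical_path.startswith(p.rstrip("/") + "/")
        ]
        if not candidates:
            raise KeyError(f"no mount covers {logical_path}")
        best = max(candidates, key=len)
        suffix = logical_path[len(best):].lstrip("/")
        base = self.mounts[best]
        return f"{base}/{suffix}" if suffix else base


# Hypothetical layout: a warehouse on HDFS plus one team's data in S3.
table = MountTable()
table.mount("/", "hdfs://namenode:8020/warehouse")
table.mount("/sales", "s3://sales-bucket/data")

# A query engine only ever sees logical paths such as this one.
physical = table.resolve("/sales/2022/q2.parquet")
```

With this kind of indirection, adding a new storage system is a matter of adding a mount, and the engines that read logical paths do not need to change.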

Consideration 2: Data belongs to a business domain, and the simplest approach is to leave it in place
- Although copying can achieve data isolation, when data access policies are very strict, the data producer must tightly control how the data is used, and overall data governance becomes very complicated.
- Copying data wastes storage on redundant copies, is error-prone, and consumes a lot of resources.
Copying data is clearly not an ideal solution, but how can heterogeneous data be accessed with high performance without moving it? This requires an abstraction layer that addresses data governance, performance, and moving data across business units.
The architecture below shows how Presto utilizes Alluxio as an abstraction layer to access data located in different storage environments.

Generally, there are two situations:
- All data in a single cloud or single data center
- Data is shared across multiple data centers or hybrid clouds
In either case, Alluxio acts as an abstraction layer that isolates data consumers from data producers. The abstraction layer is more than a cache: its ability to preload data and to write data ahead keeps the SLA consistent even when data is separated from compute.
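A minimal sketch of why preloading and write-ahead matter when storage is remote. This is a toy model under stated assumptions, not Alluxio's implementation: `CachingLayer` and its methods are hypothetical, a plain dict stands in for remote storage, and "write ahead" is interpreted here as accepting writes locally and persisting them to the remote store later.

```python
from __future__ import annotations


class CachingLayer:
    """Toy cache in front of a slow remote store (a plain dict here)."""

    def __init__(self, remote: dict[str, bytes]) -> None:
        self.remote = remote
        self.cache: dict[str, bytes] = {}
        self.pending: list[str] = []  # keys written locally, not yet persisted
        self.remote_reads = 0         # counts trips to the remote store

    def _fetch(self, key: str) -> bytes:
        self.remote_reads += 1
        data = self.remote[key]
        self.cache[key] = data
        return data

    def preload(self, keys: list[str]) -> None:
        # Pull data close to compute before any query asks for it.
        for key in keys:
            if key not in self.cache:
                self._fetch(key)

    def read(self, key: str) -> bytes:
        # Serve from the cache when possible, otherwise read through.
        if key in self.cache:
            return self.cache[key]
        return self._fetch(key)

    def write(self, key: str, data: bytes) -> None:
        # The write is visible to readers immediately; persistence is deferred.
        self.cache[key] = data
        if key not in self.pending:
            self.pending.append(key)

    def flush(self) -> int:
        # Persist all deferred writes and report how many were written.
        count = len(self.pending)
        for key in self.pending:
            self.remote[key] = self.cache[key]
        self.pending.clear()
        return count


# Hypothetical usage: warm the cache, then query without touching remote storage.
layer = CachingLayer({"orders.parquet": b"order rows"})
layer.preload(["orders.parquet"])
rows = layer.read("orders.parquet")
```

In this model a query that arrives after `preload` never pays the remote round trip, which is the property that keeps latency predictable when data and compute live in different places.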

Conclusion
Alluxio empowers Presto with self-service capabilities. With Alluxio, a cross-cloud self-service data architecture can be realized, and the architecture as a whole adapts better to the evolution of the data platform. To learn more, see the white paper "Alluxio + Presto Overview: Architecture Evolution of Interactive Queries" for how Facebook, TikTok, Electronic Arts, Walmart, Tencent, Comcast, and others use Alluxio to optimize their Presto platforms.