Self-Summary
2022-06-13 05:59:00 【Yushijuj】
My Experience Learning Spark
Since I began studying big data in my sophomore year, I have learned what big data is: it is a phenomenon, not a technology. The data volume is very large and there are many categories of data. In short, big data is massive data plus complex data types.
Big data addresses three problems: 1. fast data flow (stream processing, real-time processing, batch processing); 2. a variety of data types (structured, semi-structured, unstructured); 3. massive data scale (TB, PB, EB).
Definition of big data technology: the technologies for collecting, transmitting, storing, analyzing, and applying big data. It is a family of data processing and analysis techniques that use non-traditional tools to process large amounts of structured, semi-structured, and unstructured data in order to obtain analysis and prediction results.
Big data technology is applied in the following areas: data collection, data storage and management, data processing and analysis, and data privacy and security. The big data computing modes are batch computing, stream computing, graph computing, and query and analysis computing.
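To make the difference between the batch and stream (flow) computing modes concrete, here is a minimal sketch in plain Python. It is only an illustration: the function names and the sample readings are made up for this example and do not come from any big data framework.

```python
from typing import Iterable, Iterator, Sequence


def batch_average(values: Sequence[float]) -> float:
    """Batch mode: the whole dataset is available before computing starts."""
    return sum(values) / len(values)


def streaming_average(values: Iterable[float]) -> Iterator[float]:
    """Stream mode: emit an updated average each time a new record arrives."""
    count = 0
    total = 0.0
    for value in values:
        count += 1
        total += value
        yield total / count


# Hypothetical sample data used only for this illustration.
readings = [10.0, 20.0, 30.0, 40.0]

batch_result = batch_average(readings)
stream_results = list(streaming_average(readings))

print(batch_result)    # one answer after all data has been read
print(stream_results)  # one intermediate answer per incoming record
```

The batch version gives a single result once all the data is in, while the streaming version keeps only a running count and total, so it can report a result after every record without storing the whole input.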
We also learned how to build a Hadoop platform; the core skills covered were HDFS, MapReduce, the Hive data warehouse, and so on.
Through this semester's study, I learned what Spark is. Apache Spark is an open source, distributed processing system for big data workloads. It uses in-memory caching and optimized query execution to run fast analytic queries against data of any size. It provides development APIs in Java, Scala, Python, and R, and supports code reuse across multiple workloads: batch processing, interactive queries, real-time analytics, machine learning, and graph processing. Many organizations across industries use it, including FINRA, Yelp, Zillow, DataXu, Urban Institute, and CrowdStrike.
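The benefit of in-memory caching can be shown with a toy example in plain Python. This is not Spark's API: the `CachedDataset` class, its `cache` and `collect` methods, and `expensive_load` are hypothetical names invented for this sketch, which only assumes that a dataset is recomputed on every query unless its result is kept in memory.

```python
class CachedDataset:
    """Toy dataset (hypothetical, not a Spark class) that is computed lazily
    and can optionally keep its records in memory after the first query."""

    def __init__(self, compute):
        self._compute = compute      # function that produces the records
        self._cache_enabled = False
        self._cached = None
        self.compute_calls = 0       # how many times the data was recomputed

    def cache(self):
        """Ask the dataset to keep its records in memory once computed."""
        self._cache_enabled = True
        return self

    def collect(self):
        """Return all records, reusing the in-memory copy when available."""
        if self._cache_enabled and self._cached is not None:
            return self._cached
        self.compute_calls += 1
        records = list(self._compute())
        if self._cache_enabled:
            self._cached = records
        return records


def expensive_load():
    """Stand-in for a slow read from disk or a long chain of transformations."""
    return (n * n for n in range(5))


# Without caching, each query recomputes the data.
uncached = CachedDataset(expensive_load)
uncached.collect()
uncached.collect()

# With caching, the second query reuses the records held in memory.
cached = CachedDataset(expensive_load).cache()
total = sum(cached.collect())
largest = max(cached.collect())

print(uncached.compute_calls, cached.compute_calls)
```

When several queries reuse the same intermediate data, the cached dataset is computed once, which is the general idea behind keeping working data in memory between queries.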
Hadoop is an open source framework that uses the Hadoop Distributed File System (HDFS) for storage, YARN to manage the computing resources used by different applications, and an implementation of the MapReduce programming model as its execution engine. A typical Hadoop deployment also includes other execution engines, such as Spark, Tez, and Presto.
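The MapReduce programming model mentioned here can be sketched as a word count in plain Python. This runs in a single process and only imitates the map, shuffle, and reduce steps; the function names and input lines are illustrative assumptions, not Hadoop's actual API.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


# Hypothetical input; on a real cluster each line could live on a different node.
lines = ["big data is a phenomenon", "big data is not a technology"]

mapped = [pair for line in lines for pair in map_phase(line)]
grouped = shuffle_phase(mapped)
word_counts = dict(reduce_phase(key, values) for key, values in grouped.items())

print(word_counts)
```

In a real Hadoop job the map and reduce functions run in parallel on many machines and the framework performs the shuffle, but the division of work follows the same three steps.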
Spark is an open source framework focused on interactive queries, machine learning, and real-time workloads. It does not have its own storage system; instead it runs analytics on other storage systems such as HDFS, or on other popular stores such as Amazon Redshift, Amazon S3, Couchbase, and Cassandra. Spark on Hadoop uses YARN to share a common cluster and datasets with the other Hadoop engines, which ensures a consistent level of service and response.

In practice I still had many problems: syntax errors, several statements crammed onto one line, program logic that was not rigorous enough, and unfamiliarity with the error messages that logic errors produce. I also need to learn more English, because my vocabulary is small. I had many doubts while learning Spark, but I did not have the courage to ask the teacher, so it took me a long time to understand things, which was really a little foolish. There is still a lot I do not know; sometimes I feel I have learned something, and sometimes I feel I have learned nothing. Overall I feel terrible: I have become a person I dislike, always looking for excuses for my own failures. Learning is one thing and growth is another. People always like to be praised, without realizing that praise is a sword aimed at their weakness. In fact it is not other people who push us into the abyss but our own weak selves. Greed, indulgence, cowardice, and dissoluteness are always around us, and only by overcoming many difficulties can we succeed. Stand in front of your past self and say: “Goodbye to the old me”!
That way you can keep learning, know why you are moving forward, and know what makes you retreat or stagnate.