How to find the largest K items in massive data
2022-06-30 20:35:00 【Rabbit cloud program】
This can feel like an almost mystical algorithm problem: people who know the trick can design an excellent solution, while people who don't may have no idea where to start. The top-K problem over massive data shows up everywhere in the products of Internet companies. For example, WeChat's step counter finds the top K users and then ranks them.
A similar question: given 100 million floating-point numbers, how do you find the largest 10,000? The skills involved are really memory management and data deduplication: handling, by various means, a data set that would otherwise occupy a large amount of memory. There are several approaches, ranging from the most naive to the smartest, and all of them are good material to talk through in an interview.
If memory allows: sort everything directly
This is probably the most direct, simple, and crude approach. But remember that our premise is massive data, so although it is a valid option, it is not a reliable one. We sort all the data in descending order and then take the first K items. This consumes a lot of memory and is inefficient because most of the sorting work is wasted, so it is not recommended. Of course, if your mind goes blank in an oral interview, you can offer this one first.
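As a baseline for comparison, the sort-everything approach can be sketched in a couple of lines of Python. The function name `top_k_by_sorting` is made up for this illustration, and the sketch assumes the whole data set fits in memory as a list.

```python
def top_k_by_sorting(values, k):
    """Return the k largest values, largest first, by sorting everything.

    Costs O(n log n) time and O(n) memory, even though only k items are kept.
    """
    return sorted(values, reverse=True)[:k]


print(top_k_by_sorting([3.5, 1.0, 4.25, 1.5, 9.0, 2.75], 3))
```

Every element gets fully ordered even though all but K of them are thrown away, which is the idle work mentioned above.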
If memory allows: divide and conquer
The divide-and-conquer idea is the same one behind quicksort and merge sort. The "divide" step splits the data into N parts; the "conquer" step finds the largest K numbers in each part. The global top K must be among those N × K candidates, so a final pass over the candidates yields the answer.
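One way to picture this is the following minimal sketch, which is an assumption about how the split could be done rather than the only reading of the idea. It cuts the input into fixed-size chunks, keeps the K largest numbers of each chunk, and then picks the K largest among those survivors. The names `iter_chunks`, `top_k_divide_and_conquer`, and the `chunk_size` parameter are invented for the example; `heapq.nlargest` is from the Python standard library.

```python
import heapq
from itertools import islice


def iter_chunks(values, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    it = iter(values)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def top_k_divide_and_conquer(values, k, chunk_size):
    """Return the k largest values, largest first, one chunk at a time."""
    candidates = []
    for chunk in iter_chunks(values, chunk_size):
        # Conquer: only the k largest of each chunk can be in the final answer.
        candidates.extend(heapq.nlargest(k, chunk))
    # Combine: the global top k is the top k of all per-chunk candidates.
    return heapq.nlargest(k, candidates)


print(top_k_divide_and_conquer(range(1000), 5, chunk_size=100))
```

Only one chunk plus the candidate list has to be in memory at a time, and the chunks are independent of each other, so in principle they could be processed on different machines.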
Min-heap method (also called partial elimination)
A partial elimination method. Read the first K numbers and build a min-heap from them. Then compare each remaining number with the top of the heap in turn. If it is less than or equal to the heap top, move on to the next number; otherwise, remove the heap top, insert the new number, and restore the heap property. After traversing all the data, the K numbers in the min-heap are the largest K.
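The time complexity is O(n log m), where m is K (for example, 10,000): building the heap costs O(m), and each of the remaining n − m numbers costs at most O(log m).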
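The min-heap steps above can be sketched with Python's standard `heapq` module. The function name `top_k_min_heap` is made up for this illustration, and the input is assumed to be any iterable of numbers, which in practice could be a stream read from disk.

```python
import heapq
from itertools import islice


def top_k_min_heap(values, k):
    """Return the k largest values, largest first, using a size-k min-heap."""
    if k <= 0:
        return []
    it = iter(values)
    heap = list(islice(it, k))   # read the first k numbers
    heapq.heapify(heap)          # build the min-heap; heap[0] is the smallest
    for x in it:
        if x > heap[0]:
            # Drop the current heap top and insert x in a single step.
            heapq.heapreplace(heap, x)
        # Otherwise x is less than or equal to the heap top and is skipped.
    return sorted(heap, reverse=True)


print(top_k_min_heap([3.5, 1.0, 4.25, 1.5, 9.0, 2.75], 3))
```

Only K numbers are held in memory at any time, so the input never has to be loaded in full.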