How to find the largest K values in massive data
2022-06-30 20:35:00 【Rabbit cloud program】
This may look like an algorithm problem of almost mystical depth: people who know the trick can design excellent algorithms, while people who don't may have no idea where to start. The top-K problem over massive data shows up everywhere in the products of internet companies. For example, WeChat's step-count leaderboard collects the top K users and then sorts them.
A similar problem: given 100 million floating-point numbers, how do you find the largest 10,000? The techniques involved are really about memory handling and data deduplication, that is, using various methods to process massive data that would otherwise occupy a large amount of memory. There are several approaches, ranging from the most naive to the smartest, and all of them are worth talking through in an interview.
If memory allows: sort everything directly
This is probably the most direct, brute-force approach. But remember that our premise is massive data: this method is an option, but definitely not a reliable one. Sort everything in descending order, then take the first K elements. This consumes a lot of memory and is inefficient, because it does a lot of wasted work ordering elements that will never appear in the result, so it is not recommended. Of course, in an oral interview, if your mind goes blank, you can propose this first.
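As a baseline, the sort-everything approach can be sketched in a few lines of Python. The function name `top_k_by_sorting` is only an illustrative choice, and the sketch assumes the whole input fits in memory.

```python
def top_k_by_sorting(values, k):
    """Return the k largest values by sorting the entire input (baseline only)."""
    # sorted() builds a full copy of the input, so this needs O(n) extra memory
    # and O(n log n) time, even though only k elements are kept.
    return sorted(values, reverse=True)[:k]


if __name__ == "__main__":
    print(top_k_by_sorting([3.5, 1.0, 9.2, 4.4, 7.1], 2))
```

Every element gets fully ordered here, including the ones that are thrown away, which is the wasted work mentioned above.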
If memory allows: divide and conquer
The idea of divide and conquer underlies both quicksort and merge sort. The divide step repeatedly splits the data into N parts; the conquer step finds the largest K numbers in each part. The global top K must be among these per-part candidates, so a final pass over the candidates yields the answer.
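One possible reading of this divide-and-conquer idea is sketched below: split the input into fixed-size chunks, keep the K largest of each chunk, then pick the K largest among those candidates. The names `chunks`, `top_k_divide_and_conquer` and `chunk_size` are hypothetical, and `heapq.nlargest` from the standard library stands in for the per-chunk step.

```python
import heapq
from itertools import islice


def chunks(iterable, chunk_size):
    """Yield successive lists of at most chunk_size items from iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def top_k_divide_and_conquer(values, k, chunk_size=100_000):
    """Return the k largest values, in descending order, chunk by chunk."""
    candidates = []
    for chunk in chunks(values, chunk_size):
        # Conquer: each chunk contributes at most k candidates.
        candidates.extend(heapq.nlargest(k, chunk))
    # Combine: the global top k must be among the per-chunk winners.
    return heapq.nlargest(k, candidates)


if __name__ == "__main__":
    data = [5, 1, 9, 3, 7, 8, 2, 6, 4, 0]
    print(top_k_divide_and_conquer(data, 3, chunk_size=4))
```

Only one chunk plus the candidate list is held in memory at a time, and the per-chunk step could in principle run on separate machines. Note that the candidate list itself grows with the number of chunks, so `chunk_size` should be much larger than K for this to pay off.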
Min-heap method (also called the partial elimination method)
This is a partial elimination method. Read the first K numbers and build a min-heap from them. Then compare each remaining number with the top of the min-heap in turn. If the number is less than or equal to the heap top, move on to the next number; otherwise, remove the heap top, insert the new number into the heap, and restore the min-heap property. After traversing all the data, the numbers in the min-heap are the largest K numbers.
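The time complexity is O(n log m), where m is K (for example, 10,000): building the initial heap costs O(m), and each of the remaining n - m numbers costs at most O(log m) to process. Only m numbers need to be held in memory at any time.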
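The min-heap procedure described above can be sketched with Python's standard `heapq` module. The function name `top_k_min_heap` is an illustrative choice; the input may be any iterable, so the data can be streamed from a file rather than loaded into memory all at once.

```python
import heapq
from itertools import islice


def top_k_min_heap(values, k):
    """Return the k largest values, in descending order, using a size-k min-heap."""
    if k <= 0:
        return []
    it = iter(values)
    # Read the first k numbers and build a min-heap from them.
    heap = list(islice(it, k))
    heapq.heapify(heap)
    for x in it:
        # heap[0] is the smallest of the current k candidates.
        if x > heap[0]:
            # Pop the heap top and push x in one step, costing O(log k).
            heapq.heapreplace(heap, x)
    # The heap now holds the k largest values; sort them for presentation.
    return sorted(heap, reverse=True)


if __name__ == "__main__":
    print(top_k_min_heap([3.5, 1.0, 9.2, 4.4, 7.1, 9.2], 3))
```

Numbers that are less than or equal to the heap top are skipped with a single comparison, so on typical data most of the input never touches the heap.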