当前位置:网站首页>Find the first k small element select_ k
Find the first k small element select_ k
2022-07-05 18:00:00 【Jingling cat】
preface
Don't offer Solutions for : Sort all elements first , Then I'll find , At this time, the time complexity is O(nlgn)
Fast selection algorithm - Quick select
Quick select and Quick sort It's all by Tony Hoare Invented
Quick select Of The average time complexity is zero O(n)
The worst time complexity is O(n^2)
The algorithm does not use extra space ,swap Operation is inplace operation , So the space complexity is zero O(1).

Quick select Adopt and Quick sort Similar steps . First choose one pivot, And then according to Each number is related to this pivot The size of the relationship Divide the entire array into Two parts .
that And Quick sort The difference is ,Quick select Only consider The molecular array where the target is sought , and Non image Quick sort And then separate the two sides . That's why ,Quick select take The average time complexity ranges from O(nlogn) Down to O(n)
public static int quickSelectK(int[] arr, int l, int r, int k) {
if (arr == null || arr.length == 0) {
return -1;
}
if (k > arr.length) return -1;
if (l == r) return arr[l];
int smallIndex = partitionForQuickSelect(arr, l, r);
int num = smallIndex - l + 1;
if (k == num) return arr[smallIndex];
else if (k < num) return quickSelectK(arr, l, smallIndex - 1, k);
else return quickSelectK(arr, smallIndex + 1, r, k - num);
}
private static int partitionForQuickSelect(int[] arr, int l, int r) {
int pivot = (int) (l + Math.random() * (r - l + 1));
swap(arr, pivot, r);
int less = l - 1;
for (int i = l; i < r; i++) {
if (arr[i] <= arr[r]) swap(arr, i, ++less);
}
swap(arr, less + 1, r);
return less + 1;
}
Linear time selection algorithm
Expected linear time - Randomized_Select Random partition linear selection
The input data are different .Randomized_Select Algorithm to Fast sorting algorithm For the model , You can ask for The... In an array k Small elements .
Unlike quick sorting , Quick sort Meeting Recursively handle both sides of the partition , and Randomized_Select Only deal with Divided side
Average time complexity : O(n+k)=O(n) ( When k More hours )
Due to the partition operation randomized_partition The limitations of , Of the selection algorithm The worst Time complexity by O(n^2)
The quick sort algorithm needs to deal with both sides of the partition recursively , And the choice algorithm is to deal with only one side .
This difference leads to differences in performance :
The expected running time of the quick sort algorithm is Θ(nlgn), The expected running time of the selection algorithm is Θ(n)
The core :
Every time Divide the array by the hub of the quick row [ begin....mid...end ]
If target<mid, It's in [ begin....mid-1 ] Recursive search The first k Elements
If target>mid, It's in [ mid+1...end ] Recursive search The first k - (mid-begin+1) Elements
public static int randomizedSelectK(int[] arr, int l, int r, int k) {
if (k > arr.length) return -1; // The first k individual Greater than The length of the array
if (l == r) return arr[l]; //l == r For example, an array with only one element ,l==r
int lessIndex = randomizedPartition(arr, l, r);
int num = lessIndex - l + 1; // Less than or equal to the last subscript of the area It is the present. [l,r] Which element of the region
if (num == k) {
// It happens to be the k individual
return arr[lessIndex];
} else if (k < num) {
// In the area less than or equal to , stay [l,lessIndex - 1] Find in the area
return randomizedSelectK(arr, l, lessIndex - 1, k);
} else {
return randomizedSelectK(arr, lessIndex + 1, r, k - num); // In greater than area , stay [lessIndex + 1,r] Find in the area , Pay attention to the last k - num
}
}
// Partition , find Less than or equal to The last subscript
public static int randomizedPartition(int[] array, int l, int r) {
int pivot = (int) (l + Math.random() * (r - l + 1)); // Find one. [l,r] The random number in
int lessIndex = l - 1; //l The previous element of
swap(array, pivot, r); // Swap with the last element
for (int i = l; i < r; i++) {
if (array[i] <= array[r]) {
// Less than or equal to
swap(array, ++lessIndex, i); // Swap with the last number smaller than the region And self increase
}
}
swap(array, lessIndex + 1, r); // Less than or equal to the next number of the zone , and r In exchange for , bring lessIndex + 1 The number of positions is less than or equal to the last number of areas
return lessIndex + 1;
}
public static void main(String[] args) {
// int[] arr = {7, 1, 3, 4, 2, 6, 2, 8, 10, 9, 5};
int[] arr = {
30, 0, 61, 34, 69, 4, 75, 67, 23, 59, 16, 41, 33, 52, 3, 51, 23, 12, 28, 24, 12, 0, 44, 45, 14, 40, 28, 8, 4, 24, 2, 98, 32, 39, 73, 26, 2, 3, 1, 59, 82, 19, 4, 39, 33, 35, 6, 89, 70, 32, 16, 24, 24, 54, 13, 27, 25, 44, 79, 1, 38, 55, 49, 9, 6, 75, 86, 7, 20, 33, 46, 77, 35, 68, 9, 14, 2, 17, 12, 52, 13, 59, 88, 71, 6, 32, 28, 49, 11, 59, 66, 4, 16, 18, 6, 63, 17, 0};
for (int j = 1; j <= arr.length; j++) {
System.out.print(randomizedSelectK(arr, 0, arr.length - 1, j) + ", ");
}
System.out.println();
Arrays.sort(arr);
System.out.println(Arrays.toString(arr));
}
Use Divided into three areas partition
Fast selection algorithm ( Find the... In the array K Big element ) - It's divided into three areas
public static int randomizedSelectK(int[] arr, int l, int r, int k) {
if (k > arr.length) return -1;
if (l == r) return arr[l];
int[] p = randomizedPartition(arr, l, r); // It's divided into three areas
/** * From the first element of the equal zone to l The number of the first element in the area , Not from Smaller than the last to l The first element of * for example * item 0 1 2 2 3 4 * index 0 1 2 3 4 5 * If p = [2,3] * * k=2 when ,res Should be 1, It's time to enter if (k < lessPart) * k=3 when ,res Should be 2, It's time to enter if (k >= lessPart && k <= morePart) | The border lessPart = 3 = p[0] - l + d ==》 d = 1 * k=4 when ,res Should be 2, It's time to enter if (k >= lessPart && k <= morePart) | The border morePart = 4 = p[1] - l + c ==》 c = 1 * k=5 when ,res Should be 3, It's time to enter if (k > morePart) */
int lessPart = p[0] - l + 1;
int morePart = p[1] - l + 1;
if (k < lessPart) {
return randomizedSelectK(arr, l, p[0] - 1, k);
} else if (k > morePart) {
return randomizedSelectK(arr, p[1] + 1, r, k - morePart); // hypothesis k = 5, That should be in morePart Area find No 1 A small number ,1 = 5 - 4 = k - morePart
} else {
return arr[p[0]]; //arr[p[0]] perhaps arr[p[1]]
}
}
public static int[] randomizedPartition(int[] arr, int l, int r) {
int rand = (int) (l + Math.random() * (r - l + 1));
swap(arr, rand, r);
int less = l - 1;
int more = r;
while (l < more) {
if (arr[l] < arr[r]) swap(arr, ++less, l++);
else if (arr[l] > arr[r]) swap(arr, --more, l);
else l++;
}
swap(arr, more, r);
return new int[]{
less + 1, more};
}
The worst case is linear time - SELECT Median linear time selection
On average, it can O(N) Complexity
and randomizedSelect equally ,select The algorithm finds the required elements by recursively dividing the input array
however , In the process of division select The algorithm can always guarantee to get a “ good ” Division
select The deterministic partition algorithm from quick sorting is also used partition , But it was modified ------ The principal component of the partition is also used as the input parameter .
adopt select Algorithm , It can be determined that one has n>1 The... In the input array of different elements p Big elements
With 5 Elements are 1 Group Divide aggregate
Find each group 5 The median of the elements
recursive select The algorithm finds the median of these median k
The time complexity of the above process is essentially 0(n) Grade
And then to k Operate for the hub Randomized Select Recursive part of the algorithm
because k The lowest lower limit of the partition can be guaranteed ( Each median ensures that half of the elements are greater than or less than it , Therefore, the median of the median is guaranteed to be as close as possible to half of the nodes that are greater than or less than ), So the algorithm is also in the worst case 0(n) Level of complexity
First step : grouping
(1) When the scale is less than a certain threshold , Directly use the sorting algorithm to return the results .
(2) When n Greater than threshold when , hold n Elements are divided into n/5 Divided three intervals , if n No 5 Multiple , Then exclude the remaining elements ( No impact , Here is just for the middle mm), Only up to 1 The group consists of the rest n%5 Elements make up .
Sort them separately
Then pick out The middle value of each group of elements
Right again All intermediate values , Recursively call This algorithm , Pick out the middle value mm,mm That is to say The middle of the middle
(3) Divide all elements into A1、A2、A3 Three groups of , Each contains Less than 、 be equal to 、 Greater than mm The elements of
The second step : Judge k stay Which group
There are three situations , Find out The first k Which of these three arrays is the small element :
a. if A1 The number of elements Greater than or equal to k , namely The first K The elements are A1 Group Inside : stay A1 in Recursive search The first k Minor elements
p<k , Then recursively call... In the low area select Find the first k Big elements
b. if A1 The number of elements Less than k , also A1 and A2 Element number The sum is greater than or equal to k, namely The first k Elements stay A2 in ,A2 in The elements are mm: return mm
p=k (p yes partition Return value ,k Is the number to choose k Big elements ), Then return to a[p] // The following code follows here
c. otherwise , The first k The elements are A3 Group : stay A3 in Recursively find the number (k-|A1、A2 Sum of elements |) Minor elements
if p>k , Recursively call in the high area select Find the first p-k Big elements


public static int select(int[] arr, int l, int r, int k) {
if (r - l < 75) {
quickSort(arr, l, r); // Sort with insert sort
return arr[l + k - 1];
}
// grouping
int group = (r - l + 5) / 5; // Each group 5 Number common group Group
for (int i = 0; i < group; i++) {
int left = l + 5 * i;
int right = (l + i * 5 + 4) > r ? r : l + i * 5 + 4; // If it exceeds the right boundary, assign a value with the right boundary
int mid = (left + right) / 2;
quickSort(arr, left, right); // Each group Sort
swap(arr, l + i, mid); // Compare the median of each group with The first i individual In exchange for
}
int pivot = select(arr, l, l + group - 1, (group + 1) / 2); // Find the median of the median
int p = partition(arr, l, r, pivot); // Use the median of the median as the reference position
int leftNum = p - l + 1; //leftNum Used to record In front of the reference position Number of elements of
if (k == leftNum)
return arr[p];
else if (k < leftNum)
return select(arr, l, p - 1, k);
else // if k Behind the reference position , Count from the back of the reference position , That is to say (k - leftNum - 1) individual
return select(arr, p + 1, r, k - leftNum - 1);
}
private static int partition(int[] array, int start, int end, int pivot) {
int smallIndex = start - 1;
swap(array, pivot, end);
for (int i = start; i <= end; i++)
if (array[i] <= array[end]) {
smallIndex++;
if (i > smallIndex)
swap(array, i, smallIndex);
}
return smallIndex;
}
边栏推荐
- Tencent music launched its new product "quyimai", which provides music commercial copyright authorization
- VBA drives SAP GUI to realize office automation (II): judge whether elements exist
- Is it safe to open an account online? What is the general interest rate of securities financing?
- 深拷贝与浅拷贝【面试题3】
- EasyCVR接入设备开启音频后,视频无法正常播放是什么原因?
- Seven Devops practices to improve application performance
- Star ring technology data security management platform defender heavy release
- Unicode processing in response of flash interface
- matlab内建函数怎么不同颜色,matlab分段函数不同颜色绘图
- 为什么阳历中平年二月是28天
猜你喜欢

ISPRS2022/雲檢測:Cloud detection with boundary nets基於邊界網的雲檢測

GFS分布式文件系统
![含重复元素取不重复子集[如何取子集?如何去重?]](/img/b2/d019c3f0b85a6c0d334a092fa6c23c.png)
含重复元素取不重复子集[如何取子集?如何去重?]

使用Jmeter虚拟化table失败

「运维有小邓」用于云应用程序的单点登录解决方案

寻找第k小元素 前k小元素 select_k

Star Ring Technology launched transwarp Navier, a data element circulation platform, to help enterprises achieve secure data circulation and collaboration under privacy protection

十个顶级自动化和编排工具
![[JMeter] advanced writing method of JMeter script: all variables, parameters (parameters can be configured by Jenkins), functions, etc. in the interface automation script realize the complete business](/img/a6/aa0b8d30913dc64f3c0cd891528c40.png)
[JMeter] advanced writing method of JMeter script: all variables, parameters (parameters can be configured by Jenkins), functions, etc. in the interface automation script realize the complete business

论文阅读_医疗NLP模型_ EMBERT
随机推荐
求解为啥all(())是True, 而any(())是FALSE?
tkinter窗口预加载
[TestLink] testlink1.9.18 solutions to common problems
7 pratiques devops pour améliorer la performance des applications
Easynmon Usage Summary
Server configuration jupyter environment
Sophon base 3.1 launched mlops function to provide wings for the operation of enterprise AI capabilities
VBA drives SAP GUI to realize office automation (II): judge whether elements exist
EPM相关
提高应用程序性能的7个DevOps实践
Sentinel flow guard
What are the requirements for PMP certification? How much is it?
Cmake tutorial Step2 (add Library)
Leetcode daily practice: rotating arrays
为什么阳历中平年二月是28天
使用QT遍历Json文档及搜索子对象
论文阅读_中文NLP_LTP
Cmake tutorial Step4 (installation and testing)
登录连接 CDB 和 PDB
[performance test] full link voltage test