当前位置:网站首页>Repeated DNA sequence [hash determination repetition + sliding window + bit operation of binary coding]
Repeated DNA sequence [hash determination repetition + sliding window + bit operation of binary coding]
2022-07-27 20:58:00 【REN_ Linsen】
Binary bit operation
Preface
When determining the repeated numbers in the array , You can use hash Record ; Turn an array into a string , When determining whether the string is repeated , can hash Record the string in the sliding window ; But string hashcode To get, you need to scan the string , and Character/Integer Is to directly obtain the original value as hashcode(hashcode After interference, act as hash value .), So when string characters are binary encoded , Let the string map to a number , Get... Quickly hashcode.
One 、“ repeat ” Of DNA Sequence
Two 、hash Matching repetition + Binary code

1、hash Look for repetition
public List<String> findRepeatedDnaSequences(String s) {
int n = s.length();
Set<String> pool = new HashSet<>();
Set<String> ans = new HashSet<>();
for(int i = 0;i < n - 9;i++){
String str = s.substring(i,i + 10);
if(pool.contains(str) && !ans.contains(str)) ans.add(str);
pool.add(str);
}
return new ArrayList<>(ans);
}
2、 Bit operation practice of binary code
package everyday.bitManipulation;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
// Repetitive DNA Sequence .
public class FindRepeatedDnaSequences {
/* Because of the hashcode You need to scan the entire string , So you can encode a few strings , Just for 10 Encoding of substring . */
public List<String> findRepeatedDnaSequences(String s) {
int n = s.length();
List<String> ans = new ArrayList<>();
if (n <= L) return ans;
// Before initialization 9 Number .
int x = 0, i = 0;
while (i < L - 1) x = x << 2 | fx.get(s.charAt(i++));
// Start using the sliding window to get the substring integer Encoding as hash Of key
Map<Integer, Integer> cnt = new HashMap<>();
while (i < n) {
x = (x << 2 | fx.get(s.charAt(i))) & (1 << L * 2) - 1;
cnt.put(x, cnt.getOrDefault(x, 0) + 1);
if (cnt.get(x) == 2) ans.add(s.substring(i - L + 1, i + 1));
++i;
}
return ans;
}
final static int L = 10;
static Map<Character, Integer> fx = new HashMap<>();
static {
fx.put('A', 0);
fx.put('C', 1);
fx.put('G', 2);
fx.put('T', 3);
}
}
summary
1) Binary encoding can be more than string encoding , It is also common in subset scenarios , Binary mapping to subset ; Binary can also put 0/1 hash Dimension reduction , For example, a value can replace 31 position 0-1hash, deal with 26 Letters are more than enough .
2) Binary system / Bit operations are important , Practice more , It is helpful for many problem optimization .
3) A string of hashcode Acquisition is not O(1), It is O(n), character / The whole number is O(1).
reference
边栏推荐
- 金仓数据库 KingbaseES 异构数据库移植指南 (4. 应用迁移流程)
- Recommend a powerful search tool listary
- [numpy] array index and slice
- R语言使用epiDisplay包的lroc函数可视化logistic回归模型的ROC曲线并输出诊断表(diagnostic table)、可视化多条ROC曲线、使用legend函数为可视化图像添加图例
- VI working mode (3 kinds) and mode switching (conversion)
- NATAPP内网穿透工具外网访问个人项目
- [Numpy] 广播机制(Broadcast)
- 征服所有程序员的3件IT装备 →
- 重复的DNA序列[hash判定重复+滑动窗口+二进制编码之位运算]
- 品牌列表案例
猜你喜欢
RK3399平台开发系列讲解(进程篇)15.36、理解进程和协程

【深度学习】Pytorch Tensor 张量

Hexagon_V65_Programmers_Reference_Manual(7)

Xdc 2022 Intel technology special session: Intel Software and hardware technology builds the cornerstone of cloud computing architecture

User login switching case
![[Numpy] 数组索引和切片](/img/ce/34db7aef3fefe8a03e638d0838492f.png)
[Numpy] 数组索引和切片

Why does Alibaba prohibit more than three forms from joining?

Brand list cases
![[deep learning] pytoch torch Autograd automatic differential engine](/img/c8/2ce1e5c02283965f8690ac5a9971a9.png)
[deep learning] pytoch torch Autograd automatic differential engine
![Laboratory management system implemented by SSM framework +jsp [source code + database + system paper]](/img/2e/64af546c58f3dc517cdae304daa671.png)
Laboratory management system implemented by SSM framework +jsp [source code + database + system paper]
随机推荐
Technology blog and tutorial
好开不贵,舒适安全!深度体验比亚迪宋Pro DM-i
Introduction to source insight 4.0
Programmer growth Chapter 18: project launch
Hexagon_V65_Programmers_Reference_Manual(9)
【深度学习】Pytorch Tensor 张量
自定义学习率
Laboratory management system implemented by SSM framework +jsp [source code + database + system paper]
你了解数据同步吗?
Hexagon_V65_Programmers_Reference_Manual(8)
Vant component library
Introduction to rk3399 platform introduction to proficient series (Introduction) 21 day learning challenge
程序中的地址如何转换?
Recommend a powerful search tool listary
Force deduction solution summary 592 fraction addition and subtraction
2022-07-19 advanced network engineering (XX) BGP route optimization, route optimization analysis one by one
js中数组与字符串常用方法属性总结
UE5使用DLSS(超级采样)提升场景的 FPS 远离卡顿的优化方案
记一次restTemplate.getForEntity携带headers失败,restTemplate. exchange
Tencent jumped out with 38K and saw the real test ceiling