当前位置:网站首页>Interpretation of iterator related "itertools" module usage
Interpretation of iterator related "itertools" module usage
2022-07-06 13:58:00 【Algorithm channel】
0 Preface
When it comes to processing cycles , We are used to using for, while etc. , For example, print the characters in each list in turn :
lis = ['I', 'love', 'python']
for i in lis:
print(i)
I
love
python
When the number of bytes of printed content is small , When all is loaded in memory , Re print , No problem . But , If there are millions of vehicle tracks now , Ask you to analyze the travel rules of each of them , Traffic jams, etc , If it's on a single computer .
You may have to face... First , It can also be ignored by you , After the final code is written , A problem that could be exposed :outofmemory
, This is often encountered in practical projects .
This question reminds us of , While processing data , How to write programs that make efficient use of memory , It's very important . today , Let's explore how to use memory efficiently , Save memory and get things done .
Actually ,Python There is a module ready to deal with this , It is itertools
modular , The functions of these functions are actually well understood .
I'm not going to give you a general idea of what they can do , I want to analyze the implementation code behind these functions , How to save memory efficiently ,Python How did the kernel contributors write beautiful code , This is very interesting. , isn't it? ?
OK,let's go. Hope you enjoy the journey!
1 Patchwork elements
itertools Medium chain Function implementation element splicing , The prototype is as follows , Parameters * Represents a variable number of parameters
chain
(iterables)
The application is as follows :
In [33]: list(chain(['I','love'],['python'],['very', 'much']))
Out[33]: ['I', 'love', 'python', 'very', 'much']
wow , It's no longer easy to use , It's a little join The smell of , But compared to join strong , It focuses on the fact that the parameters are all iteratable instances .
that ,chain How to achieve efficient memory saving ?chain The general implementation code is as follows :
def chain(*iterables):
for it in iterables:
for element in it:
yield element
The above code is not difficult to understand ,chain Essence returns a generator , So it actually reads one element at a time into memory , So save the most memory
.
2 One by one
Returns the cumulative total value of the list , Prototype :
accumulate
(iterable[, func, *, initial=None])
The application is as follows :
In [36]: list(accumulate([1,2,3,4,5,6],lambda x,y: x*y))
Out[36]: [1, 2, 6, 24, 120, 720]
accumulate The general implementation code is as follows :
def accumulate(iterable, func=operator.add, *, initial=None):
it = iter(iterable)
total = initial
if initial is None:
try:
total = next(it)
except StopIteration:
return
yield total
for element in it:
total = func(total, element)
yield total
Above code , Are you all right? ? And chain ordinary yield Different , It's a little more complicated here ,yield It's kind of like return, therefore yield total
That line directly returns an element , That is to say iterable The first element of , Because the first element returned by this function at any time is its first . Again because yield Back to a generator object , Such as name gen, therefore next(gen) when , The code will execute to for element in it:
This line , And the iterator at this time it Has pointed to iterable The second element of ,OK, I believe you understand !
3 Funnel screening
It is compress function , Function is similar to funnel function , So I call it funnel screening , Prototype :
compress
(data, selectors)
In [38]: list(compress('abcdefg',[1,1,0,1]))
Out[38]: ['a', 'b', 'd']
Easy to see ,compress The number of elements returned is equal to the shorter list length of the two parameters .
Its general implementation code :
def compress(data, selectors):
return (d for d, s in zip(data, selectors) if s)
This function is very useful
4 Segment screening
Scan list , Start to hold back if you don't meet the conditions , The prototype is as follows :
dropwhile
(predicate, iterable)
Application examples :
In [39]: list(dropwhile(lambda x: x<3,[1,0,2,4,1,1,3,5,-5]))
Out[39]: [4, 1, 1, 3, 5, -5]
The general code to implement it is as follows :
def dropwhile(predicate, iterable):
iterable = iter(iterable)
for x in iterable:
if not predicate(x):
yield x
break
for x in iterable:
yield x
5 Segment screening 2
Scan list , Return elements from iteratable objects as long as conditions are met , Until the conditions are not met , The prototype is as follows :
takewhile
(predicate, iterable)
Application examples :
In [43]: list(takewhile(lambda x: x<5, [1,4,6,4,1]))
Out[43]: [1, 4]
The general code to implement it is as follows :
def takewhile(predicate, iterable):
for x in iterable:
if predicate(x):
yield x
else:
break # Return immediately
6 Selection of defective products
Scan list , Keep... As long as you don't meet the conditions , The prototype is as follows :
dropwhile
(predicate, iterable)
Application examples :
In [40]: list(filterfalse(lambda x: x%2==0, [1,2,3,4,5,6]))
Out[40]: [1, 3, 5]
The general code to implement it is as follows :
def dropwhile(predicate, iterable):
iterable = iter(iterable)
for x in iterable:
if not predicate(x):
yield x
break
for x in iterable:
yield x
7 Slice screening
Python General slicing operation in , such as :
lis = [1,3,2,1]
lis[:1]
Their defect is still lis You have to load all the memory , So more memory saving operations islice, The prototype is as follows :
islice
(iterable, start, stop[, step])
Application examples :
In [41]: list(islice('abcdefg',1,4,2))
Out[41]: ['b', 'd']
The general code to implement it is as follows :
def islice(iterable, *args):
s = slice(*args)
start, stop, step = s.start or 0, s.stop or sys.maxsize, s.step or 1
it = iter(range(start, stop, step))
try:
nexti = next(it)
except StopIteration:
for i, element in zip(range(start), iterable):
pass
return
try:
for i, element in enumerate(iterable):
if i == nexti:
yield element
nexti = next(it)
except StopIteration:
for i, element in zip(range(i + 1, stop), iterable):
pass
Clever use of the generator will throw an exception at the end of the iteration StopIteration
, Do something about boundary handling .
8 cell division
tee The function is similar to the well-known cell division , It can replicate the original iterator n individual , The prototype is as follows :
tee
(iterable, n=2)
The application is as follows , It can be seen that the two iterators are independent
a = tee([1,4,6,4,1],2)
In [51]: next(a[0])
Out[51]: 1
In [52]: next(a[1])
Out[52]: 1
The code to implement it is as follows :
def tee(iterable, n=2):
it = iter(iterable)
deques = [collections.deque() for i in range(n)]
def gen(mydeque):
while True:
if not mydeque:
try:
newval = next(it)
except StopIteration:
return
for d in deques:
d.append(newval)
yield mydeque.popleft()
return tuple(gen(d) for d in deques)
tee The implementation uses a queue type internally deques, At first, an empty queue is generated , Add elements... To each of the copied queues newval, meanwhile yield Currently called mydeque The leftmost element in .
9 map variant
starmap You can view it as map A variation of the , It can save more memory , meanwhile iterable The element of must also be an iteratable object , The prototype is as follows :
starmap
(function, iterable)
Apply it :
In [63]: list(starmap(lambda x,y: str(x)+'-'+str(y), [('a',1),('b',2),('c',3)]))
Out[63]: ['a-1', 'b-2', 'c-3']
starmap The implementation details are as follows :
def starmap(function, iterable):
for args in iterable:
yield function(*args)
10 Copy elements
repeat Implement copy elements n Time , The prototype is as follows :
repeat
(object[, times])
The application is as follows :
In [66]: list(repeat(6,3))
Out[66]: [6, 6, 6]
In [67]: list(repeat([1,2,3],2))
Out[67]: [[1, 2, 3], [1, 2, 3]]
Its implementation details are as follows :
def repeat(object, times=None):
if times is None:# If times Not set up , Will always repeat down
while True:
yield object
else:
for i in range(times):
yield object
11 The cartesian product
The effect of Cartesian product is the same as that of :
((x,y) for x in A for y in B)
therefore , The effect of Cartesian product is as follows :
In [68]: list(product('ABCD', 'xy'))
Out[68]:
[('A', 'x'),
('A', 'y'),
('B', 'x'),
('B', 'y'),
('C', 'x'),
('C', 'y'),
('D', 'x'),
('D', 'y')]
Its implementation details :
def product(*args, repeat=1):
pools = [tuple(pool) for pool in args] * repeat
result = [[]]
for pool in pools:
result = [x+[y] for x in result for y in pool]
for prod in result:
yield tuple(prod)
12 Enhanced Edition zip
Combination value . If the length of the iteratable object is not aligned , Based on the fillvalue Fill in missing values , Be careful : Iterations continue to the longest consuming iteratable object
, The effect is as follows :
In [69]: list(zip_longest('ABCD', 'xy', fillvalue='-'))
Out[69]: [('A', 'x'), ('B', 'y'), ('C', '-'), ('D', '-')]
Its implementation details :
def zip_longest(*args, fillvalue=None):
iterators = [iter(it) for it in args]
num_active = len(iterators)
if not num_active:
return
while True:
values = []
for i, it in enumerate(iterators):
try:
value = next(it)
except StopIteration:
num_active -= 1
if not num_active:
return
iterators[i] = repeat(fillvalue)
value = fillvalue
values.append(value)
yield tuple(values)
It uses repeat, That is, when the length of the iteratable object is not aligned , according to fillvalue Fill in missing values . The key to understanding the code above is the iterator object (iter),next The particularity of the method :
In [74]: for i, it in enumerate([iter([1,2,3]),iter(['x','y'])]):
...: print(next(it))
# Output :
1
x
Combined with this prompt to understand the above code , It's not going to work .
summary
Python Of itertools Memory efficient iterator provided by the module , Basically, the implementation inside is based on the generator , So on the one hand, understand this 12 The basic functions implemented by functions , At the same time, it can also deepen the generator (generator) The understanding of the , Write for us more efficiently 、 concise 、 Beautiful code laid a solid foundation .
Read more :
The most straightforward explanation of machine learning , Just look at this article !
Python List generator 12 A little function , Which ones do you often use ?
Python The most underrated Library , Use it well and improve efficiency 10 times !
Long press QR code , Follow my public number
边栏推荐
- Zatan 0516
- 实验七 常用类的使用
- [the Nine Yang Manual] 2022 Fudan University Applied Statistics real problem + analysis
- Meituan dynamic thread pool practice ideas, open source
- Programme de jeu de cartes - confrontation homme - machine
- 7-8 7104 约瑟夫问题(PTA程序设计)
- 【VMware异常问题】问题分析&解决办法
- (original) make an electronic clock with LCD1602 display to display the current time on the LCD. The display format is "hour: minute: Second: second". There are four function keys K1 ~ K4, and the fun
- TypeScript快速入门
- Using spacedesk to realize any device in the LAN as a computer expansion screen
猜你喜欢
扑克牌游戏程序——人机对抗
使用Spacedesk实现局域网内任意设备作为电脑拓展屏
Renforcer les dossiers de base de l'apprentissage
A comprehensive summary of MySQL transactions and implementation principles, and no longer have to worry about interviews
[hand tearing code] single case mode and producer / consumer mode
仿牛客技术博客项目常见问题及解答(二)
PriorityQueue (large root heap / small root heap /topk problem)
.Xmind文件如何上传金山文档共享在线编辑?
Strengthen basic learning records
Strengthen basic learning records
随机推荐
Intensive literature reading series (I): Courier routing and assignment for food delivery service using reinforcement learning
PriorityQueue (large root heap / small root heap /topk problem)
It's never too late to start. The tramp transformation programmer has an annual salary of more than 700000 yuan
7-11 机工士姆斯塔迪奥(PTA程序设计)
Safe driving skills on ice and snow roads
【MySQL-表结构与完整性约束的修改(ALTER)】
实验四 数组
A piece of music composed by buzzer (Chengdu)
. How to upload XMIND files to Jinshan document sharing online editing?
Strengthen basic learning records
Thoroughly understand LRU algorithm - explain 146 questions in detail and eliminate LRU cache in redis
[graduation season · advanced technology Er] goodbye, my student days
[VMware abnormal problems] problem analysis & Solutions
撲克牌遊戲程序——人機對抗
强化学习基础记录
7-4 hash table search (PTA program design)
Miscellaneous talk on May 27
MySQL锁总结(全面简洁 + 图文详解)
实验五 类和对象
js判断对象是否是数组的几种方式