当前位置:网站首页>Analysis of glibc strlen implementation mode
Analysis of glibc strlen implementation mode
2022-07-05 03:38:00 【Maple tree】
Abstract
This paper deals with glibc 2.35 In the source strlen Research on function implementation , The principle of using bit operation to realize faster length calculation is analyzed .
Function position
glibc-2.35/string/strlen.c
Original implementation
Here is glibc strlen.c File contents of :
/* Copyright (C) 1991-2022 Free Software Foundation, Inc. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. The GNU C Library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with the GNU C Library; if not, see <https://www.gnu.org/licenses/>. */
#include <string.h>
#include <stdlib.h>
#undef strlen
#ifndef STRLEN
# define STRLEN strlen
#endif
/* Return the length of the null-terminated string STR. Scan for the null terminator quickly by testing four bytes at a time. */
size_t
STRLEN (const char *str)
{
const char *char_ptr;
const unsigned long int *longword_ptr;
unsigned long int longword, himagic, lomagic;
/* Handle the first few characters by reading one character at a time. Do this until CHAR_PTR is aligned on a longword boundary. */
for (char_ptr = str; ((unsigned long int) char_ptr
& (sizeof (longword) - 1)) != 0;
++char_ptr)
if (*char_ptr == '\0')
return char_ptr - str;
/* All these elucidatory comments refer to 4-byte longwords, but the theory applies equally well to 8-byte longwords. */
longword_ptr = (unsigned long int *) char_ptr;
/* Bits 31, 24, 16, and 8 of this number are zero. Call these bits the "holes." Note that there is a hole just to the left of each byte, with an extra at the end: bits: 01111110 11111110 11111110 11111111 bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD The 1-bits make sure that carries propagate to the next 0-bit. The 0-bits provide holes for carries to fall into. */
himagic = 0x80808080L;
lomagic = 0x01010101L;
if (sizeof (longword) > 4)
{
/* 64-bit version of the magic. */
/* Do the shift in two steps to avoid a warning if long has 32 bits. */
himagic = ((himagic << 16) << 16) | himagic;
lomagic = ((lomagic << 16) << 16) | lomagic;
}
if (sizeof (longword) > 8)
abort ();
/* Instead of the traditional loop which tests each character, we will test a longword at a time. The tricky part is testing if *any of the four* bytes in the longword in question are zero. */
for (;;)
{
longword = *longword_ptr++;
if (((longword - lomagic) & ~longword & himagic) != 0)
{
/* Which of the bytes was the zero? If none of them were, it was a misfire; continue the search. */
const char *cp = (const char *) (longword_ptr - 1);
if (cp[0] == 0)
return cp - str;
if (cp[1] == 0)
return cp - str + 1;
if (cp[2] == 0)
return cp - str + 2;
if (cp[3] == 0)
return cp - str + 3;
if (sizeof (longword) > 4)
{
if (cp[4] == 0)
return cp - str + 4;
if (cp[5] == 0)
return cp - str + 5;
if (cp[6] == 0)
return cp - str + 6;
if (cp[7] == 0)
return cp - str + 7;
}
}
}
}
libc_hidden_builtin_def (strlen)
analysis
It is different from comparing whether each character is 0,glibc Of strlen Use every 8 byte (64 Bit program ) A way of comparison , Find it quickly as 0 Bytes of . after , To contain 0 Of 8 Byte by byte comparison within bytes , find 0 The location of .
alignment
To improve efficiency , First, compare the first few characters one by one , Until the pointer and long Type address alignment .
const char *char_ptr;
const unsigned long int *longword_ptr;
unsigned long int longword, himagic, lomagic;
/* Handle the first few characters by reading one character at a time. Do this until CHAR_PTR is aligned on a longword boundary. */
for (char_ptr = str; ((unsigned long int) char_ptr
& (sizeof (longword) - 1)) != 0;
++char_ptr)
if (*char_ptr == '\0')
return char_ptr - str;
/* All these elucidatory comments refer to 4-byte longwords, but the theory applies equally well to 8-byte longwords. */
longword_ptr = (unsigned long int *) char_ptr;
Comparison principle
after , Each 8 The principle of bit one comparison .
If a byte is all 0, That is to say 00000000( Binary system ). that , If we subtract it 1 The operation of , It will become 11111111.
Pay attention to the highest position . If we put the original byte 0 To reverse , What you get is 11111111, Then take and operate the two results , The resulting value will be the highest bit 1 Value . Be careful , Even if the byte immediately following it needs to borrow when subtracting , This byte will only become 11111110, The highest position is still 1.
If this byte is not all 0, There are two situations : The first is 1 And the first is 0.
If the first byte is 0, Then it is 0abcdef,a-f At least there is 1 A for 1, So subtract... From the lowest order 1 after , The first is still 0.
If the first byte is 1, And it is the only binary representation of this byte 1, Then subtract 1 In operation , The first place will become 0. If it's not the only one 1, Then the first place is still 1.
Above all , It can be seen that , If the lowest bit of a byte is subtracted 1 after , The highest bit is 1, And the highest bit after byte inversion is also 1, Then this byte must be 0. Based on this , We can compare bytes in batches .
for (;;) {
longword = *longword_ptr++;
if (((longword - lomagic) & ~longword & himagic) != 0) {
/* Which of the bytes was the zero? If none of them were, it was a misfire; continue the search. */
const char *cp = (const char *) (longword_ptr - 1);
if (cp[0] == 0)
return cp - str;
if (cp[1] == 0)
return cp - str + 1;
if (cp[2] == 0)
return cp - str + 2;
if (cp[3] == 0)
return cp - str + 3;
if (sizeof (longword) > 4) {
if (cp[4] == 0)
return cp - str + 4;
if (cp[5] == 0)
return cp - str + 5;
if (cp[6] == 0)
return cp - str + 6;
if (cp[7] == 0)
return cp - str + 7;
}
}
}
边栏推荐
- [luat-air105] 4.1 file system FS
- Leetcode42. connect rainwater
- In MySQL Association query, the foreign key is null. What if the data cannot be found?
- Qrcode: generate QR code from text
- [groovy] groovy environment setup (download groovy | install groovy | configure groovy environment variables)
- SPI and IIC communication protocol
- How to define a unified response object gracefully
- Anchor free series network yolox source code line by line explanation Part 2 (a total of 10, ensure to explain line by line, after reading, you can change the network at will, not just as a participan
- 腾讯云,实现图片上传
- 【PHP特性-变量覆盖】函数的使用不当、配置不当、代码逻辑漏洞
猜你喜欢

程序员的视力怎么样? | 每日趣闻
![[system security] ten thousand words summary system virtualization container bottom layer principle experiment](/img/c6/1bdb29a0acb0739f67b882fa6b3b47.jpg)
[system security] ten thousand words summary system virtualization container bottom layer principle experiment

New interesting test applet source code_ Test available
![[安洵杯 2019]不是文件上传](/img/f1/736eb5fe51c299e3152ca87895ee99.png)
[安洵杯 2019]不是文件上传

Hot knowledge of multithreading (I): introduction to ThreadLocal and underlying principles

Share the newly released web application development framework based on blazor Technology

Tencent cloud, realize image upload

Some enterprise interview questions of unity interview

腾讯云,实现图片上传

Azkaban overview
随机推荐
Leetcode92. reverse linked list II
Pat grade a 1119 pre- and post order traversals (30 points)
Jd.com 2: how to prevent oversold in the deduction process of commodity inventory?
001 chip test
Flex flexible layout
Basic knowledge of tuples
Tencent cloud, realize image upload
SQL performance optimization skills
Necessary fonts for designers
[105] Baidu brain map - Online mind mapping tool
SQL performance optimization skills
Talk about the SQL server version of DTM sub transaction barrier function
Anchor free series network yolox source code line by line explanation Part 2 (a total of 10, ensure to explain line by line, after reading, you can change the network at will, not just as a participan
About authentication services (front and back, login, registration and exit, permission management)
【软件逆向-基础知识】分析方法、汇编指令体系结构
De debugging (set the main thread as hidden debugging to destroy the debugging Channel & debugger detection)
LeetCode 237. Delete nodes in the linked list
[groovy] loop control (number injection function implements loop | times function | upto function | downto function | step function | closure can be written outside as the final parameter)
2. Common request methods
Devtools的簡單使用