Preface

Earlier, while exploring the principles of animation and rendering, we published several articles answering how iOS animations are rendered and how visual effects work. Along the way we came to appreciate how broadly the system designers were thinking when they created these frameworks, and how much it matters to understand a technology's underlying principles when working in this field.
So we decided to dig further into the underlying principles of iOS. This article focuses on the Runtime, and we will explore, one by one: isa in detail, the class structure, the method cache cache_t, objc_msgSend, message forwarding, dynamic method resolution, the essence of super, and applications of the Runtime.

One. Introduction to Runtime

1. A review of the essence of the OC language

When we explored the essence of the OC language earlier, we read the introduction to OC on Apple's official site:

Objective-C is the primary programming language (one of them, now that Swift exists) used to write software for OS X and iOS
It is a superset of the C programming language and provides object-oriented capabilities and a dynamic runtime
Objective-C inherits the syntax, primitive types, and flow control statements of C, and adds syntax for defining classes and methods (OC is fully compatible with standard C)
It also adds language-level support for object graph management and object literals, while providing dynamic typing and binding, deferring many responsibilities until runtime

2. Runtime

The dynamic runtime, dynamic typing and binding, and the deferral of many responsibilities to runtime mentioned in the official introduction are all implemented through the Runtime, a set of low-level APIs.

Although Objective-C is a closed-source language, Apple has open-sourced parts of it. Apple's official open source code can usually be found at: opensource.apple.com/tarballs/

Searching that page for objc turns up objc4; download the latest released version. The official open source code contains part of the implementation, including part of the runtime.

To sum up, it is not hard to conclude:

Objective-C is a dynamic programming language, very different from languages such as C and C++;
Objective-C's dynamism is supported by the Runtime API
The interfaces the Runtime API provides are basically C interfaces; the source code is written in C, C++, and assembly

Two. isa in detail

Earlier, when exploring the kinds of OC objects and their isa pointers, we reached some conclusions. A brief review: Objective-C objects, OC objects for short, fall into 3 main kinds

instance object
class object
meta-class object

1. `instance` object

An instance object is an object created by calling alloc on a class; every call to alloc produces a new instance object

object1 and object2 are instance objects of NSObject
They are two different objects occupying two different blocks of memory
The information an instance object stores in memory includes
- the isa pointer
- other member variables

2. `class` object

objectClass1 through objectClass5 are all the class object of NSObject
They are the same object. Each class has exactly one class object in memory

The information a class object stores in memory mainly includes:
- the isa pointer
- the superclass pointer
- the class's property information (@property) and instance method information (instance method)
- the class's protocol information (protocol) and member variable information (ivar)
- ......

3. `meta-class` object

objectMetaClass is the meta-class object of NSObject
Each class has one and only one meta-class object in memory

A meta-class object has the same memory structure as a class object but serves a different purpose. The information it stores in memory mainly includes
- the isa pointer
- the superclass pointer
- the class's class method information (class method)
- ......

4. The `isa` pointer

The isa of an instance points to its class
- When an instance method is called, the class is found through the instance's isa, and then the implementation of the instance method is found and called
The isa of a class points to its meta-class
- When a class method is called, the meta-class is found through the class's isa, and then the implementation of the class method is found and called

The superclass pointer of a class object

When an instance of Student calls an instance method of Person, it first finds Student's class through isa
It then finds Person's class through superclass, and finally finds the implementation of the instance method and calls it

The superclass pointer of a meta-class object

When the Student class calls a class method of Person, it first finds Student's meta-class through isa
It then finds Person's meta-class through superclass, and finally finds the implementation of the class method and calls it

5. Summary of `isa` and `superclass`

isa

The isa of an instance points to its class
The isa of a class points to its meta-class
The isa of a meta-class points to the base class's meta-class

superclass

The superclass of a class points to the parent class's class
- If there is no parent class, the superclass pointer is nil
The superclass of a meta-class points to the parent class's meta-class
- The superclass of the base class's meta-class points to the base class's class

Method calls

The path an instance follows when calling an instance method
- isa finds the class; if the method does not exist there, superclass is followed to the parent
The path a class follows when calling a class method
- isa finds the meta-class; if the method does not exist there, superclass is followed to the parent
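
The lookup paths summarized above can be sketched as a toy model in plain C. This is only an illustration under stated assumptions: `ToyClass`, `ToyObject`, and the `toy_*` functions are hypothetical names invented here, methods are plain strings, and the real runtime structures (method caches, forwarding, and so on) are far richer.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: names and layout are hypothetical, not the real objc_class. */
typedef struct ToyClass {
    struct ToyClass *isa;         /* class -> meta-class */
    struct ToyClass *superclass;  /* parent, or NULL at the top of the chain */
    const char *name;
    const char *methods[4];       /* selector names, NULL-terminated */
} ToyClass;

typedef struct {
    ToyClass *isa;                /* instance -> class */
} ToyObject;

/* Search start, then each superclass in turn; return the class that has sel. */
const ToyClass *toy_lookup(const ToyClass *start, const char *sel) {
    for (const ToyClass *cls = start; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < 4 && cls->methods[i] != NULL; i++) {
            if (strcmp(cls->methods[i], sel) == 0) {
                return cls;
            }
        }
    }
    return NULL;  /* not found anywhere in the chain */
}

/* Instance method: the search starts at the instance's isa (its class). */
const ToyClass *toy_find_instance_method(const ToyObject *obj, const char *sel) {
    return toy_lookup(obj->isa, sel);
}

/* Class method: the search starts at the class's isa (its meta-class). */
const ToyClass *toy_find_class_method(const ToyClass *cls, const char *sel) {
    return toy_lookup(cls->isa, sel);
}
```

With a hypothetical Student : Person hierarchy wired up through these structs, looking up an instance method that only Person lists returns the Person class, and looking up a class method that only Person's meta-class lists returns that meta-class.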

6. Summary

Combining the conclusions above, it is not hard to see that the three kinds of objects in OC are connected through isa pointers, and that the runtime features of OC rely on the Runtime API, which builds on the connections the isa pointers establish among the three kinds of objects to realize the dynamic runtime.

Therefore, to study the Runtime, we first need to understand some of its common low-level data structures, such as the isa pointer

Before the arm64 architecture, isa was just an ordinary pointer that stored the memory address of the Class or Meta-Class object
Starting with arm64, isa was optimized into a union that also uses bit fields to store more information. The class address can only be obtained by applying a bit operation with ISA_MASK

7. The essence of `isa`

Since the arm64 architecture, the isa of an OC object no longer points directly to the class object or meta-class object; the address of the class object or meta-class object is obtained by a bit operation, & ISA_MASK.
Today we explore why & ISA_MASK is needed to get the address of the class object or meta-class object, and what the benefits are. (Why did Apple make this optimization? Let's explore step by step!)

First, find the isa pointer in the source code and look at what it really is.

// Excerpt from the internals of objc_object
struct objc_object {
private:
    isa_t isa;
};

The isa pointer is actually a union of type isa_t. Let's step into isa_t and look at its structure

// Simplified isa_t union
union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;

#if SUPPORT_PACKED_ISA
# if __arm64__ 
# define ISA_MASK 0x0000000ffffffff8ULL
# define ISA_MAGIC_MASK 0x000003f000000001ULL
# define ISA_MAGIC_VALUE 0x000001a000000001ULL
    struct {
        uintptr_t nonpointer        : 1;
        uintptr_t has_assoc         : 1;
        uintptr_t has_cxx_dtor      : 1;
        uintptr_t shiftcls          : 33; // MACH_VM_MAX_ADDRESS 0x1000000000
        uintptr_t magic             : 6;
        uintptr_t weakly_referenced : 1;
        uintptr_t deallocating      : 1;
        uintptr_t has_sidetable_rc  : 1;
        uintptr_t extra_rc          : 19;
    # define RC_ONE (1ULL<<45)
    # define RC_HALF (1ULL<<18)
    };

# elif __x86_64__ 
# define ISA_MASK 0x00007ffffffffff8ULL
# define ISA_MAGIC_MASK 0x001f800000000001ULL
# define ISA_MAGIC_VALUE 0x001d800000000001ULL
    struct {
        uintptr_t nonpointer        : 1;
        uintptr_t has_assoc         : 1;
        uintptr_t has_cxx_dtor      : 1;
        uintptr_t shiftcls          : 44; // MACH_VM_MAX_ADDRESS 0x7fffffe00000
        uintptr_t magic             : 6;
        uintptr_t weakly_referenced : 1;
        uintptr_t deallocating      : 1;
        uintptr_t has_sidetable_rc  : 1;
        uintptr_t extra_rc          : 8;
# define RC_ONE (1ULL<<56)
# define RC_HALF (1ULL<<7)
    };

# else
# error unknown architecture for packed isa
# endif
#endif
};

In the source above, isa_t is a union type.

From the source we can see that:

The union contains a struct
The struct defines a number of variables
The number after each variable is how many bits (not bytes) the variable occupies; this is the bit field technique

Understanding unions

In some C programs, values of several different types need to be stored in the same memory unit;
a structure in which several different variables occupy the same memory is called a "union" type in C

Next, we use a union ourselves to understand in depth why Apple uses one and what the benefits are.

7.1 The exploration process

7.1.1 Imitating the underlying data storage

Next, we imitate the underlying approach in code: create a Person class containing three member variables of type BOOL.

@interface Person : NSObject
@property (nonatomic, assign, getter = isTall) BOOL tall;
@property (nonatomic, assign, getter = isRich) BOOL rich;
@property (nonatomic, assign, getter = isHandsome) BOOL handsome;
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSLog(@"%zd", class_getInstanceSize([Person class]));
    }
    return 0;
}
// Output
// Runtime - union seek [52235:3160607] 16

In the code above Person has 3 properties of type BOOL, and the printed instance size of Person is 16

That is, (isa pointer = 8) + (BOOL tall = 1) + (BOOL rich = 1) + (BOOL handsome = 1) = 11
Because of memory alignment, a Person instance occupies 16 bytes (we covered memory alignment in an earlier article)
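
The arithmetic above can be checked with a small C sketch. It is an approximation under stated assumptions: `struct PersonLayout` is a hypothetical stand-in for the instance layout (a pointer followed by three one-byte flags), and `round_up` is a helper written for this example, not a runtime function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the instance layout: isa plus three one-byte flags. */
struct PersonLayout {
    void *isa;
    bool tall;
    bool rich;
    bool handsome;
};

/* Bytes the members need before any padding is added. */
size_t person_payload_size(void) {
    return sizeof(void *) + 3 * sizeof(bool);
}

/* Round size up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t size, size_t align) {
    return (size + align - 1) & ~(align - 1);
}
```

On a typical 64-bit platform the payload is 11 bytes, and rounding 11 up to a multiple of 8 gives 16, which matches `sizeof(struct PersonLayout)` there.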

With a union, several different variables can be stored in the same memory, which can save a great deal of space

Try storing the values of three BOOL variables in one byte

We know a BOOL has only two possible values, 0 or 1, yet it occupies a whole byte of memory
A byte has 8 binary bits, and each bit is either 0 or 1
- So can 1 bit represent one BOOL value?
- In other words, 3 BOOL values need only 3 bits, that is, a single byte?
- How can this be implemented?

First, this approach requires writing the setter and getter declarations and implementations by hand:

Property declarations cannot be used, because once a property is declared the compiler automatically adds a member variable for us (allocating memory and generating the setter and getter; for this exploration we want to avoid that automatic generation)

To store the three BOOL values in one byte, we can add a member variable of type char

A char occupies one byte of memory, that is, 8 bits
The lowest three bits can be used to store the 3 BOOL values.

@interface Person()
{
    char _tallRichHandsome;
}

For example, if the value of _tallRichHandsome is 0b 0000 0010, we use only the lowest 3 of the 8 bits, setting each to 0 or 1 to represent the values of tall, rich, and handsome. As shown in the figure below:

[Figure: storage]

The question now is how to read the value of one particular bit out of the 8, or assign a value to a particular bit.

a.) Reading a value

Suppose the values of the three BOOL variables are already stored in one byte. Let's first discuss how to take the three values out of that byte.

One bit represents one BOOL value, starting from the low end, one bit per value.

Suppose the binary value stored in the char member variable is 0b 0000 0010
To take out the second bit from the right, the value of rich, we need a bit operation
We can use & to perform a bitwise AND and obtain the value at that position

Understanding [&: bitwise AND]: the result bit is 1 only when both bits are 1; otherwise it is 0

// Example
// Take the third bit from the right: tall
  0000 0010
& 0000 0100
------------
  0000 0000  // the third bit from the right is 0; all other bits are cleared to 0

// Take the second bit from the right: rich
  0000 0010
& 0000 0010
------------
  0000 0010 // the second bit from the right is 1; all other bits are cleared to 0

Conclusion: bitwise AND can extract the value of a specific bit

To take out a given bit, set that bit to 1 in the mask and all other bits to 0
Then AND the mask with the original data to extract that bit

Implementing the get methods with bitwise AND

#define TallMask 0b00000100 // 4
#define RichMask 0b00000010 // 2
#define HandsomeMask 0b00000001 // 1

- (BOOL)tall
{
    return !!(_tallRichHandsome & TallMask);
}
- (BOOL)rich
{
    return !!(_tallRichHandsome & RichMask);
}
- (BOOL)handsome
{
    return !!(_tallRichHandsome & HandsomeMask);
}

The code above uses two ! (logical NOT) operators to convert the value to a BOOL. Using the same example as above

// Take the second bit from the right: rich
  0000 0010  // _tallRichHandsome
& 0000 0010 // RichMask
------------
  0000 0010 // rich's bit is 1; all other bits are cleared to 0

In the code above the value of (_tallRichHandsome & RichMask) is 0000 0010, that is 2, but what we need is a BOOL value, 0 or 1

So !!2 first converts 2 to 0 and then to 1
Conversely, when the bitwise AND yields 0, !!0 first converts 0 to 1 and then to 0
Therefore two ! operations convert the value to 0 or 1 to represent the corresponding flag.
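
The difference between the raw AND result and the normalized one can be seen in a minimal C sketch. The function names `raw_bit` and `flag_is_set` are made up for this example, and the masks are written in hexadecimal because binary literals are not available in every C dialect.

```c
#define TALL_MASK     0x04  /* 0000 0100 */
#define RICH_MASK     0x02  /* 0000 0010 */
#define HANDSOME_MASK 0x01  /* 0000 0001 */

/* Raw result of the AND: either 0 or the mask value itself (for example 2 or 4). */
int raw_bit(unsigned char bits, unsigned char mask) {
    return bits & mask;
}

/* Normalized result: !! collapses any non-zero value to 1 and leaves 0 as 0. */
int flag_is_set(unsigned char bits, unsigned char mask) {
    return !!(bits & mask);
}
```

For the byte 0000 0010, `raw_bit` with `RICH_MASK` yields 2 while `flag_is_set` yields 1; with `TALL_MASK` both yield 0.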

7.1.2 Optimizing the masks for readability

Mask: a value used as an operand of a bitwise AND (&) is generally called a mask

The code above defines three macros, each used in a bitwise AND to obtain the corresponding value
The values of the three macros are masks
To show more clearly which bit each mask extracts, the three macros can be rewritten with the left shift operator <<

Left shift operator: A<<n shifts the binary representation of A left by n bits

[Figure: example of the << left shift operator]

So the macro definitions above can be rewritten with << (left shift) as follows

#define TallMask (1<<2) // 0b00000100 4
#define RichMask (1<<1) // 0b00000010 2
#define HandsomeMask (1<<0) // 0b00000001 1
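
One way to see the readability gain is to name the bit positions and derive the masks from them. This is a sketch only; the enum `FlagBit` and the helper `bit_mask` are illustrative names that do not appear in the article's code.

```c
/* Bit positions, counted from the lowest bit (hypothetical names). */
enum FlagBit {
    FLAG_BIT_HANDSOME = 0,
    FLAG_BIT_RICH     = 1,
    FLAG_BIT_TALL     = 2
};

/* Build a one-bit mask for position n: 1 shifted left by n places. */
unsigned int bit_mask(unsigned int n) {
    return 1u << n;
}
```

Here `bit_mask(FLAG_BIT_TALL)` is 4, `bit_mask(FLAG_BIT_RICH)` is 2, and `bit_mask(FLAG_BIT_HANDSOME)` is 1, the same values as the three macros.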

b.) Setting a value

To assign 0 or 1 to a bit, we can again use bit operations

To set a bit to 1, simply OR (|) that bit with 1

Bitwise OR |: the result bit is 1 if either bit is 1; otherwise it is 0.

In the current case, this means:
To set a BOOL to YES, OR the original value with the mask (whose bit at that position is 1).
For example, to set tall to 1

// Set the third bit from the right, tall, to 1
  0000 0010  // _tallRichHandsome
| 0000 0100  // TallMask
------------
  0000 0110 // tall is set to 1; the other bits are unchanged

Bitwise AND &: the result bit is 1 only when both bits are 1; otherwise it is 0

In the current case, this means:
To set a BOOL to NO, invert the mask bit by bit (~: bitwise NOT) so that the bit at that position becomes 0, then AND it with the original value.

// Set the second bit from the right, rich, to 0
  0000 0010  // _tallRichHandsome
& 1111 1101  // ~RichMask
------------
  0000 0000 // rich is set to 0; the other bits are unchanged
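
The two worked examples above can be reproduced with a short C sketch. The helper names `set_flag` and `clear_flag` are invented for this illustration.

```c
/* Turn on the bit selected by mask; all other bits keep their values. */
unsigned char set_flag(unsigned char bits, unsigned char mask) {
    return (unsigned char)(bits | mask);
}

/* Turn off the bit selected by mask by ANDing with the inverted mask. */
unsigned char clear_flag(unsigned char bits, unsigned char mask) {
    return (unsigned char)(bits & ~mask);
}
```

Starting from 0000 0010, `set_flag` with mask 0000 0100 gives 0000 0110, and `clear_flag` with mask 0000 0010 gives 0000 0000.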

The set methods are then implemented as follows

- (void)setTall:(BOOL)tall
{
    if (tall) { // set the bit to 1: bitwise OR with the mask
        _tallRichHandsome |= TallMask;
    }else{ // set the bit to 0: bitwise AND with the inverted mask
        _tallRichHandsome &= ~TallMask; 
    }
}
- (void)setRich:(BOOL)rich
{
    if (rich) {
        _tallRichHandsome |= RichMask;
    }else{
        _tallRichHandsome &= ~RichMask;
    }
}
- (void)setHandsome:(BOOL)handsome
{
    if (handsome) {
        _tallRichHandsome |= HandsomeMask;
    }else{
        _tallRichHandsome &= ~HandsomeMask;
    }
}

With the set and get methods written, check in code whether values can be set and read successfully.

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Person *person  = [[Person alloc] init];
        person.tall = YES;
        person.rich = NO;
        person.handsome = YES;
        NSLog(@"tall : %d, rich : %d, handsome : %d", person.tall,person.rich,person.handsome);
    }
    return 0;
}

Output

Runtime - union seek [58212:3857728] tall : 1, rich : 0, handsome : 1

As shown, the code above assigns and reads values correctly. But it still has limitations:

Adding a new property means repeating all of the work above, and the code is not very readable
Next, we use struct bit fields to improve the code

7.1.3 Using bit fields to access the variables

Using struct bit fields makes the code above more readable. A bit field is declared as name : width;

Note the following 3 points when using bit fields:

1. If the space remaining in a storage unit is not enough to hold another bit field, that bit field is stored starting from the next unit.
- A bit field can also be made to start from the next unit deliberately
2. The width of a bit field cannot exceed the width of its data type
- For example, an int bit field cannot exceed 32 bits.
3. A bit field may be unnamed, in which case it serves only as padding or to adjust position
- Unnamed bit fields cannot be referenced
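
The three points can be illustrated with a small C sketch. It assumes a compiler such as clang or GCC that accepts `unsigned char` as a bit field type; the exact layout and sizes of bit fields are implementation-defined, and the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Two named fields packed into one unit; the unnamed field is 2 bits of padding. */
struct Packed {
    unsigned char low  : 3;
    unsigned char      : 2;  /* unnamed: occupies bits but cannot be referenced */
    unsigned char high : 3;
};

/* A zero-width unnamed field deliberately starts the next field in a new unit. */
struct Split {
    unsigned char low  : 3;
    unsigned char      : 0;
    unsigned char high : 3;
};

size_t packed_size(void) { return sizeof(struct Packed); }
size_t split_size(void)  { return sizeof(struct Split); }

/* An unsigned 3-bit field keeps only the low 3 bits of the value stored in it. */
unsigned int store_in_3_bits(unsigned int value) {
    struct Packed p = { 0 };
    p.low = value;
    return p.low;
}
```

With clang or GCC on common platforms, `struct Split` is larger than `struct Packed` because `high` is pushed into a second byte, and storing 9 (1001) in the 3-bit field reads back as 1 (001).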

The code above, rewritten with struct bit fields:

@interface Person()
{
    struct {
        char handsome : 1; // bit field: occupies 1 bit
        char rich : 1;  // occupies the next bit, in declaration order
        char tall : 1; 
    }_tallRichHandsome;
}
@end

The set and get methods can assign and read values directly through the struct

- (void)setTall:(BOOL)tall
{
    _tallRichHandsome.tall = tall;
}
- (void)setRich:(BOOL)rich
{
    _tallRichHandsome.rich = rich;
}
- (void)setHandsome:(BOOL)handsome
{
    _tallRichHandsome.handsome = handsome;
}
- (BOOL)tall
{
    return _tallRichHandsome.tall;
}
- (BOOL)rich
{
    return _tallRichHandsome.rich;
}
- (BOOL)handsome
{
    return _tallRichHandsome.handsome;
}

Verify in code that assigning and reading work correctly

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Person *person  = [[Person alloc] init];
        person.tall = YES;
        person.rich = NO;
        person.handsome = YES;
        NSLog(@"tall : %d, rich : %d, handsome : %d", person.tall,person.rich,person.handsome);
    }
    return 0;
}

First, set a breakpoint at the NSLog line and inspect the memory of _tallRichHandsome

[Figure: memory contents of _tallRichHandsome]

Because _tallRichHandsome occupies one byte, that is, 8 bits, we convert the hexadecimal value 05 to binary

[Figure: 05 converted to binary]

As the figure shows, the third bit from the right, tall, is 1; the second bit from the right, rich, is 0; and the last bit, handsome, is 1. This matches the values set in the code above, so the assignment succeeded.

Then continue and look at the output: Runtime - union seek [59366:4053478] tall : -1, rich : 0, handsome : -1

Here a problem appears: we set tall and handsome to YES, so the output should be 1. Why is it -1?

The memory of _tallRichHandsome printed above also confirmed that tall and handsome are both 1. Let's print the values of the fields inside the _tallRichHandsome struct.

[Figure: the _tallRichHandsome struct variable inside person]

As the figure shows, the value of handsome is 0x01. Convert it to binary with a calculator

[Figure: 0x01 in binary]

The value is indeed 1, so why is -1 printed? This suggests the problem is inside the get method. We set a breakpoint inside the get method and print the value it obtains.

- (BOOL)handsome
{
    BOOL ret = _tallRichHandsome.handsome;
    return ret;
}

Print the value of ret

[Figure: po ret output]

Printing ret shows 255, that is, 1111 1111, which explains why -1 was printed. The handsome value read from the struct is 0b1 and occupies only 1 bit, but a BOOL occupies one byte, that is, 8 bits. When a 1-bit value is widened to 8 bits, the remaining high bits are filled with copies of that bit (sign extension, since the field is treated as signed here), so ret becomes 0b 1111 1111.

In one byte, 1111 1111 is -1 as a signed number and 255 as an unsigned number. That is why -1 is printed

To verify that widening a 1-bit value to 8 bits fills all the high bits with that bit, we make tall, rich, and handsome each occupy two bits.

@interface Person()
{
    struct {
        char tall : 2;
        char rich : 2;
        char handsome : 2;
    }_tallRichHandsome;
}

Now the values print correctly. Runtime - union seek [60827:4259630] tall : 1, rich : 0, handsome : 1

This is because the _tallRichHandsome.handsome value obtained in the get method is now two bits, 0b 01. When it is assigned to the 8-bit BOOL, the high bits are filled with the leading 0, so the returned value is 0b 0000 0001 and 1 is printed.

The problem above can therefore also be solved with the !! double negation. How !! works was explained above and is not repeated here.
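
The -1 effect and its two remedies can be reproduced in plain C. This sketch rests on an assumption worth stating: storing 1 in a 1-bit signed field is implementation-defined, and the result of -1 shown here is what clang and GCC produce. The struct and function names are hypothetical.

```c
struct SignedFlag   { signed char   value : 1; };  /* can hold only 0 and -1 */
struct UnsignedFlag { unsigned char value : 1; };  /* can hold 0 and 1 */

/* Reads the signed 1-bit field back as an int: the single bit is sign-extended. */
int read_signed_flag(int on) {
    struct SignedFlag f = { 0 };
    f.value = (on != 0);
    return f.value;
}

/* Same field, normalized with !! so that the caller sees 0 or 1. */
int read_signed_flag_normalized(int on) {
    struct SignedFlag f = { 0 };
    f.value = (on != 0);
    return !!f.value;
}

/* Alternative: an unsigned 1-bit field is zero-extended, so no fix-up is needed. */
int read_unsigned_flag(int on) {
    struct UnsignedFlag f = { 0 };
    f.value = (on != 0);
    return f.value;
}
```

Under those assumptions, `read_signed_flag(1)` is -1, while `read_signed_flag_normalized(1)` and `read_unsigned_flag(1)` are both 1. Declaring the field unsigned is an alternative to !! that the article does not use.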

The code rewritten with struct bit fields

@interface Person()
{
    struct {
        char tall : 1;
        char rich : 1;
        char handsome : 1;
    }_tallRichHandsome;
}
@end

@implementation Person

- (void)setTall:(BOOL)tall
{
    _tallRichHandsome.tall = tall;
}
- (void)setRich:(BOOL)rich
{
    _tallRichHandsome.rich = rich;
}
- (void)setHandsome:(BOOL)handsome
{
    _tallRichHandsome.handsome = handsome;
}
- (BOOL)tall
{
    return !!_tallRichHandsome.tall;
}
- (BOOL)rich
{
    return !!_tallRichHandsome.rich;
}
- (BOOL)handsome
{
    return !!_tallRichHandsome.handsome;
}

The struct bit fields used above no longer require masks, which makes the code more readable, but access through them can be less efficient than hand-written bit operations. To read and store data efficiently and keep the code highly readable at the same time, a union is needed.

7.1.4 Using a union and bit operations to store the variables' values

To make the code store data efficiently and remain readable, a union can be used to improve readability while bit operations keep data access efficient.

The code rewritten with a union

#define TallMask (1<<2) // 0b00000100 4
#define RichMask (1<<1) // 0b00000010 2
#define HandsomeMask (1<<0) // 0b00000001 1

@interface Person()
{
    union {
        char bits;
       // The struct only improves readability; it is never accessed
        struct {
            char handsome : 1; // lowest bit, matches HandsomeMask
            char rich : 1;     // matches RichMask
            char tall : 1;     // matches TallMask
        };
    }_tallRichHandsome;
}
@end

@implementation Person

- (void)setTall:(BOOL)tall
{
    if (tall) {
        _tallRichHandsome.bits |= TallMask;
    }else{
        _tallRichHandsome.bits &= ~TallMask;
    }
}
- (void)setRich:(BOOL)rich
{
    if (rich) {
        _tallRichHandsome.bits |= RichMask;
    }else{
        _tallRichHandsome.bits &= ~RichMask;
    }
}
- (void)setHandsome:(BOOL)handsome
{
    if (handsome) {
        _tallRichHandsome.bits |= HandsomeMask;
    }else{
        _tallRichHandsome.bits &= ~HandsomeMask;
    }
}
- (BOOL)tall
{
    return !!(_tallRichHandsome.bits & TallMask);
}
- (BOOL)rich
{
    return !!(_tallRichHandsome.bits & RichMask);
}
- (BOOL)handsome
{
    return !!(_tallRichHandsome.bits & HandsomeMask);
}

The code above accesses values with bit operations, which is the more efficient way, and stores the data in a union, improving access efficiency while keeping the code readable.

The _tallRichHandsome union occupies only one byte: tall, rich, and handsome each take only one bit, so the struct occupies one byte; bits, of type char, also occupies one byte; and since they are members of the same union, they share that single byte of memory.

The get and set methods never use the struct. The struct exists only for readability, indicating which values are stored in the union and how many bits each occupies. Reading and writing still use bit operations for efficiency: storage goes through the union, and the position of each value is controlled by bit operations with the masks.
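
The same design can be tried outside Objective-C with a plain C sketch. It assumes a C11 compiler (for the anonymous struct) that accepts `unsigned char` bit fields, as clang and GCC do; `PersonFlags` and the `flags_*` helpers are hypothetical names.

```c
#include <stddef.h>

#define TALL_MASK     (1u << 2)
#define RICH_MASK     (1u << 1)
#define HANDSOME_MASK (1u << 0)

typedef union {
    unsigned char bits;          /* every read and write goes through this member */
    struct {                     /* documentation only: flag names and widths */
        unsigned char handsome : 1;
        unsigned char rich     : 1;
        unsigned char tall     : 1;
    };
} PersonFlags;

/* Set or clear the flag selected by mask. */
void flags_set(PersonFlags *f, unsigned int mask, int on) {
    if (on) {
        f->bits = (unsigned char)(f->bits | mask);
    } else {
        f->bits = (unsigned char)(f->bits & ~mask);
    }
}

/* Read the flag selected by mask as 0 or 1. */
int flags_get(const PersonFlags *f, unsigned int mask) {
    return !!(f->bits & mask);
}

size_t flags_size(void) {
    return sizeof(PersonFlags);
}
```

Setting tall and handsome on a zeroed `PersonFlags` leaves `bits` at 0x05, and on clang or GCC the whole union occupies a single byte.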

At this point the code is both efficient and readable. Now look back at the source of the isa_t union

7.2 The isa_t source code

Now let's look at the isa_t source code again

// Simplified isa_t union
union isa_t 
{
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;

# if __arm64__
# define ISA_MASK 0x0000000ffffffff8ULL
# define ISA_MAGIC_MASK 0x000003f000000001ULL
# define ISA_MAGIC_VALUE 0x000001a000000001ULL
    struct {
        uintptr_t nonpointer        : 1;
        uintptr_t has_assoc         : 1;
        uintptr_t has_cxx_dtor      : 1;
        uintptr_t shiftcls          : 33; // MACH_VM_MAX_ADDRESS 0x1000000000
        uintptr_t magic             : 6;
        uintptr_t weakly_referenced : 1;
        uintptr_t deallocating      : 1;
        uintptr_t has_sidetable_rc  : 1;
        uintptr_t extra_rc          : 19;
# define RC_ONE (1ULL<<45)
# define RC_HALF (1ULL<<18)
    };
#endif
};

With the introduction to bit operations, bit fields, and unions behind us, the source code is now easy to understand:

The union stores 64 bits of data. The struct shows what those bits mean, and the value at each position is obtained by performing bit operations on bits
shiftcls
- shiftcls stores the memory address of the Class or Meta-Class object
- As mentioned when we covered the essence of OC objects, an object's isa must be ANDed (&) with ISA_MASK to obtain the real Class object address

[Figure: ANDing the isa pointer to obtain the Class object address]

Now look again at the value of ISA_MASK, 0x0000000ffffffff8ULL, and convert it to binary

[Figure: 0x0000000ffffffff8ULL in binary]

As the figure shows, ISA_MASK in binary has 33 bits set to 1. As explained earlier, a bitwise AND extracts the values of exactly those 33 bits
So it is now clear that ANDing with ISA_MASK yields the Class or Meta-Class address.

We can also see that the lowest three bits of ISA_MASK are 0, so any number ANDed with ISA_MASK has its lowest three bits equal to 0. Therefore the lowest three bits of the memory address of any class object or meta-class object must be 0, and the last hexadecimal digit of the address must be 8 or 0.
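
The properties of ISA_MASK described above can be checked with a short C sketch. The mask value is the arm64 one quoted in the source; the helper names are invented for this example, and any isa value passed in is just a number, not a real object's isa.

```c
#include <stdint.h>

#define ISA_MASK 0x0000000ffffffff8ULL  /* arm64 value quoted in the source above */

/* Count the bits that are set to 1. */
int count_ones(uint64_t v) {
    int n = 0;
    while (v != 0) {
        v &= v - 1;  /* clears the lowest set bit */
        n++;
    }
    return n;
}

/* Index of the lowest bit that is set to 1 (v must be non-zero). */
int lowest_set_bit(uint64_t v) {
    int i = 0;
    while ((v & 1) == 0) {
        v >>= 1;
        i++;
    }
    return i;
}

/* Keep only the 33 class-address bits of a raw isa value. */
uint64_t class_address_from_isa(uint64_t isa_bits) {
    return isa_bits & ISA_MASK;
}
```

`count_ones(ISA_MASK)` is 33 and `lowest_set_bit(ISA_MASK)` is 3, so the mask keeps bits 3 through 35 and always clears the lowest three bits, which is why every class address ends in hexadecimal 0 or 8.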

7.3 Information stored in `isa` and its purpose

Take the struct out and annotate what each field is for.

struct {
    // 0: an ordinary pointer that stores the memory address of the Class or Meta-Class object.
    // 1: optimized; bit fields are used to store more information.
    uintptr_t nonpointer        : 1; 

    // Whether an associated object has ever been set; if not, deallocation is faster
    uintptr_t has_assoc         : 1;

    // Whether there is a C++ destructor; if not, deallocation is faster
    uintptr_t has_cxx_dtor      : 1;

    // Stores the memory address information of the Class or Meta-Class object
    uintptr_t shiftcls          : 33; 

    // Used during debugging to tell whether the object has finished initialization
    uintptr_t magic             : 6;

    // Whether the object has ever been pointed to by a weak reference; if not, deallocation is faster
    uintptr_t weakly_referenced : 1;

    // Whether the object is being deallocated
    uintptr_t deallocating      : 1;

    // Whether the reference count is too large to be stored in isa
    // If 1, the reference count is stored in a property of a class called SideTable
    uintptr_t has_sidetable_rc  : 1;

    // Stores the reference count minus 1
    uintptr_t extra_rc          : 19;
};

7.3.1 Verifying the information stored in `isa`

Use the following code to verify where this information is stored and what it does

// The following code must be run on a real device, because real devices use the __arm64__ architecture
- (void)viewDidLoad {
    [super viewDidLoad];
    Person *person = [[Person alloc] init];
    NSLog(@"%p",[person class]);
    NSLog(@"%@",person);
}

First print the address of person's class object, then use a breakpoint to print the isa pointer of the person object.

First, look at the output

[Figure: output]

Convert the class object address to binary

[Figure: class object address]

Convert person's isa pointer value to binary

[Figure: isa pointer value of the person object]

shiftcls: shiftcls stores the class object address. Comparing the two figures above, the 33 bits that store the class object address are exactly the same.

extra_rc: the 19 bits of extra_rc store the reference count minus one. Because person's reference count is 1 at this point, the 19 bits of extra_rc store 0.

magic: the 6 bits of magic are used during debugging to tell whether the object has finished initialization. In the code above person has been initialized, so the 6 bits hold 011010, the magic bits of the macro defined in the union: # define ISA_MAGIC_VALUE 0x000001a000000001ULL.

nonpointer: this is an optimized isa, so nonpointer must be 1
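
The manual comparison above can be approximated with a C sketch that splits a 64-bit isa value into the arm64 fields using shifts and masks. The bit positions assume the fields are laid out from the lowest bit upward in the order shown in the source (consistent with ISA_MASK starting at bit 3 and RC_ONE being 1ULL<<45); `IsaFields`, `bit_range`, and `decode_isa_arm64` are hypothetical names, and any input is just a number rather than a value read from a device.

```c
#include <stdint.h>

/* Decoded view of an arm64 isa value (hypothetical helper type). */
typedef struct {
    uint64_t nonpointer;
    uint64_t has_assoc;
    uint64_t has_cxx_dtor;
    uint64_t shiftcls;
    uint64_t magic;
    uint64_t weakly_referenced;
    uint64_t deallocating;
    uint64_t has_sidetable_rc;
    uint64_t extra_rc;
} IsaFields;

/* Extract `width` bits starting at bit `start` (bit 0 is the lowest bit). */
uint64_t bit_range(uint64_t bits, unsigned start, unsigned width) {
    return (bits >> start) & ((1ULL << width) - 1);
}

/* Split raw isa bits into fields: widths 1,1,1,33,6,1,1,1,19 from the low end. */
IsaFields decode_isa_arm64(uint64_t bits) {
    IsaFields f;
    f.nonpointer        = bit_range(bits, 0, 1);
    f.has_assoc         = bit_range(bits, 1, 1);
    f.has_cxx_dtor      = bit_range(bits, 2, 1);
    f.shiftcls          = bit_range(bits, 3, 33);
    f.magic             = bit_range(bits, 36, 6);
    f.weakly_referenced = bit_range(bits, 42, 1);
    f.deallocating      = bit_range(bits, 43, 1);
    f.has_sidetable_rc  = bit_range(bits, 44, 1);
    f.extra_rc          = bit_range(bits, 45, 19);
    return f;
}
```

For a made-up value such as 0x000001a1000011c9, the decoder reports nonpointer 1, magic 011010 (26), extra_rc 0, and a shiftcls that, shifted left by 3, gives the class address 0x1000011c8.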

Because at this time person Object has no associated object and no weak pointer references , It can be seen that has_assoc and weakly_referenced Values are 0, Then we're going to person Object to add weak references and associated objects , Let's see has_assoc and weakly_referenced The change of .

- (void)viewDidLoad {
    [super viewDidLoad];
    Person *person = [[Person alloc] init];
    NSLog(@"%p",[person class]);
    // Add a weak reference to person
    __weak Person *weakPerson = person;
    // Add an associated object to person
    objc_setAssociatedObject(person, @"name", @"xx_cc", OBJC_ASSOCIATION_RETAIN_NONATOMIC);
    NSLog(@"%@",person);
}

Print person's isa pointer again and convert it to binary: has_assoc and weakly_referenced have both become 1

Figure: has_assoc and weakly_referenced after the change

Note: once an associated object has been set or a weak reference has pointed at the object, has_assoc and weakly_referenced become 1 and stay 1, even if the associated object is later set to nil or the weak reference is broken.

An object that has never had an associated object is released faster, because destruction checks whether the object has associated objects and removes them if it does. Here is the source code of object destruction

void *objc_destructInstance(id obj) {
    if (obj) {
        Class isa = obj->getIsa();
        // Whether there is a C++ destructor
        if (isa->hasCxxDtor()) {
            object_cxxDestruct(obj);
        }
        // Whether there are associated objects; if so, remove them
        if (isa->instancesHaveAssociatedObjects()) {
            _object_remove_assocations(obj);
        }
        objc_clear_deallocating(obj);
    }
    return obj;
}

By now we should have a new understanding of the isa pointer:

Since the arm64 architecture, the isa pointer no longer stores only the address of the Class or Meta-Class; it uses a union to store more information
The shiftcls field stores the address of the Class or Meta-Class, and a bitwise AND (&) with ISA_MASK yields that memory address.
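
As a minimal sketch of the mask operation described above, the plain C function below splits a raw 64-bit isa value into a few of its fields. The bit positions and the ISA_MASK value are assumptions taken from the arm64 layout in the objc4 sources of this era, and decode_isa and isa_fields_t are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Assumed arm64 constants (objc4 sources of this era). */
#define ISA_MASK        0x0000000ffffffff8ULL  /* selects the 33 shiftcls bits */
#define ISA_MAGIC_SHIFT 36                      /* magic sits right after shiftcls */
#define ISA_WEAK_SHIFT  42                      /* weakly_referenced follows magic */

/* Hypothetical container for the decoded fields. */
typedef struct {
    unsigned nonpointer;
    unsigned has_assoc;
    unsigned magic;
    unsigned weakly_referenced;
    uint64_t class_address;
} isa_fields_t;

/* Split a raw isa value into some of its bit fields. */
isa_fields_t decode_isa(uint64_t isa) {
    isa_fields_t f;
    f.nonpointer        = (unsigned)(isa & 1);
    f.has_assoc         = (unsigned)((isa >> 1) & 1);
    f.class_address     = isa & ISA_MASK;  /* same operation as isa & ISA_MASK in the text */
    f.magic             = (unsigned)((isa >> ISA_MAGIC_SHIFT) & 0x3f);
    f.weakly_referenced = (unsigned)((isa >> ISA_WEAK_SHIFT) & 1);
    return f;
}
```

Feeding it the isa value printed at the breakpoint should give the same class address that [person class] prints, provided the layout assumptions hold on the device in question.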

Three 、class structure

1. A look back at the internal structure of Class

When we explored the three kinds of OC objects earlier, we took a brief look at the internal structure of Class and summarized our understanding of it in a diagram:

With the new understanding of the isa pointer from the previous section, we now explore Class further and revisit its internal structure:

First, review the source code related to the internal structure of Class:

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() { 
        return bits.data();
    }
    void setData(class_rw_t *newData) {
        bits.setData(newData);
    }
}

class_rw_t* data() {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}

1.1 class_rw_t

The source code tells us:

The bit operation bits & FAST_DATA_MASK yields class_rw_t
class_rw_t stores the method list, the property list, the protocol list, and so on

Here is part of the class_rw_t source:

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods; // Method list
    property_array_t properties; // Property list
    protocol_array_t protocols; // Protocol list

    Class firstSubclass;
    Class nextSiblingClass;

    char *demangledName;
};

The class_rw_t members method_array_t, property_array_t, and protocol_array_t are all two-dimensional arrays

Let's look at the internal structure of method_array_t, property_array_t, and protocol_array_t:

class method_array_t : 
    public list_array_tt<method_t, method_list_t> 
{
    typedef list_array_tt<method_t, method_list_t> Super;

 public:
    method_list_t **beginCategoryMethodLists() {
        return beginLists();
    }

    method_list_t **endCategoryMethodLists(Class cls);

    method_array_t duplicate() {
        return Super::duplicate<method_array_t>();
    }
};


class property_array_t : 
    public list_array_tt<property_t, property_list_t> 
{
    typedef list_array_tt<property_t, property_list_t> Super;

 public:
    property_array_t duplicate() {
        return Super::duplicate<property_array_t>();
    }
};


class protocol_array_t : 
    public list_array_tt<protocol_ref_t, protocol_list_t> 
{
    typedef list_array_tt<protocol_ref_t, protocol_list_t> Super;

 public:
    protocol_array_t duplicate() {
        return Super::duplicate<protocol_array_t>();
    }
};

Taking method_array_t as an example, its two-dimensional array is composed as follows:
- method_array_t is itself an array, and each of its elements is a method_list_t array
- Each method_list_t ultimately stores method_t
- method_t is a method object

The methods, properties, and protocols members of class_rw_t are two-dimensional arrays that are readable and writable, and they contain both the initial content of the class and the content of its categories. (methods is used as the example here; properties and protocols are composed in the same way.)

1.2 class_ro_t

As mentioned earlier, class_ro_t also stores method, property, and protocol lists, plus the member variable list.

Here is part of the class_ro_t source

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;
    
    const char * name; // Class name
    method_list_t * baseMethodList; // Method list
    protocol_list_t * baseProtocols; // Protocol list
    const ivar_list_t * ivars; // Member variable list

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties; // Property list

    method_list_t *baseMethods() const {
        return baseMethodList;
    }
};

The class_rw_t source contains the class_ro_t *ro member, which is declared const: it is read-only and cannot be modified. A closer look at the internal structure of class_ro_t shows:

It directly stores one-dimensional arrays of type method_list_t, protocol_list_t, and property_list_t
These arrays hold the initial information of the class
Taking method_list_t as an example, it stores method_t directly, but it is read-only: entries cannot be added, deleted, or modified.

1.3 Summary

Taking the method list as an example, methods in class_rw_t is a two-dimensional array that is both readable and writable

This makes it possible to add methods dynamically, and it makes adding category methods more convenient
As described in the article on the nature of Category, the attachList function uses memmove and memcpy to merge the category method lists into the method list of the class itself (that is, into methods in class_rw_t)
At that point the category methods and the class's own methods are integrated into one structure

In fact, the class's methods, properties, member variables, protocols, and so on are initially stored in class_ro_t

At runtime, when the lists from the categories need to be merged with the initial lists of the class, the lists in class_ro_t and the lists in the categories are merged and stored in class_rw_t
In other words, part of each list in class_rw_t comes from class_ro_t, and it is then merged with the category lists
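
The merge step can be sketched in plain C as follows. This is an illustration under assumptions rather than the runtime's code: method_list, list_array, and attach_lists are hypothetical stand-ins, and the sketch assumes that the added (category) lists are placed in front of the existing ones, using the same memmove and memcpy pair mentioned above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for method_list_t. */
typedef struct { const char *name; } method_list;

/* Hypothetical stand-in for the outer array of a two-dimensional list. */
typedef struct {
    const method_list **lists;
    size_t count;
} list_array;

/* Attach added_count lists in front of the lists already in the array.
   Returns 0 on success and -1 if memory cannot be allocated. */
int attach_lists(list_array *array, const method_list **added, size_t added_count) {
    if (added_count == 0) return 0;
    size_t old_count = array->count;
    size_t new_count = old_count + added_count;
    const method_list **grown = realloc(array->lists, new_count * sizeof *grown);
    if (grown == NULL) return -1;
    /* Shift the existing lists toward the end (the ranges may overlap, hence memmove). */
    memmove(grown + added_count, grown, old_count * sizeof *grown);
    /* Copy the new lists into the freed space at the front. */
    memcpy(grown, added, added_count * sizeof *grown);
    array->lists = grown;
    array->count = new_count;
    return 0;
}
```

Under that assumption, a lookup that scans the outer array from index 0 reaches the category lists before the class's own list.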

The source code shows how this part is implemented:

static Class realizeClass(Class cls) {
    runtimeLock.assertWriting();

    const class_ro_t *ro;
    class_rw_t *rw;
    Class supercls;
    Class metacls;
    bool isMeta;

    if (!cls) return nil;
    if (cls->isRealized()) return cls;
    assert(cls == remapClass(cls));

    // Initially, cls->data() points to ro
    ro = (const class_ro_t *)cls->data();

    if (ro->flags & RO_FUTURE) { 
        // rw has already been initialized and allocated
        rw = cls->data();  // cls->data() points to rw
        ro = cls->data()->ro;  // cls->data()->ro points to ro
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else { 
        // rw does not exist yet, so allocate space for it
        rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1); // Allocate space
        rw->ro = ro;  // rw->ro points to ro
        rw->flags = RW_REALIZED|RW_REALIZING;
        // Pass rw to setData, so cls->data() now points to rw
        cls->setData(rw); 
    }
    // ... the rest of realizeClass is omitted in this excerpt
}

Source code interpretation:

From the source above we can see:

The initial information of the class is stored in class_ro_t
- Initially, cls->data() points to ro
- In other words, bits.data() returns ro
At runtime a class_rw_t is created, and cls->data() is pointed at rw
- The initial information ro is assigned to the ro member of rw
- Finally, data is set through setData(rw)
From this point on, bits.data() returns rw
The runtime then checks whether there are categories and merges their method, property, and protocol lists into the method, property, and protocol lists of class_rw_t
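
The pointer swap summarized above can be sketched in plain C. This is a simplified model, not the runtime's layout: the realized field, the RW_REALIZED value, and the names klass, klass_ro, and realize are all hypothetical, and the real code tracks its state in flag bits instead.

```c
#include <stdlib.h>

typedef struct { const char *name; } class_ro;                    /* read-only initial data */
typedef struct { unsigned flags; const class_ro *ro; } class_rw;  /* writable runtime data */

#define RW_REALIZED 0x1u  /* hypothetical flag value */

typedef struct {
    void *data;      /* points to a class_ro before realization and to a class_rw after */
    int   realized;  /* hypothetical marker used by this sketch */
} klass;

/* Return the read-only data no matter which state the class is in. */
const class_ro *klass_ro(const klass *cls) {
    if (cls->realized) return ((const class_rw *)cls->data)->ro;
    return (const class_ro *)cls->data;
}

/* Allocate rw, point rw->ro back at the original ro, then point data at rw. */
int realize(klass *cls) {
    if (cls->realized) return 0;
    const class_ro *ro = cls->data;
    class_rw *rw = calloc(1, sizeof *rw);
    if (rw == NULL) return -1;
    rw->ro = ro;
    rw->flags = RW_REALIZED;
    cls->data = rw;
    cls->realized = 1;
    return 0;
}
```

The initial information stays reachable through rw->ro after the swap, which is why the later category merge can still start from the class's original lists.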

With this analysis of the source code, we have a clearer picture of how class_rw_t comes to hold the method, property, and protocol lists. Next, let's explore how methods are stored in class_rw_t.

2. How are methods stored in class_rw_t?

2.1 method_t

We know that method_array_t ultimately stores method_t

method_t is a wrapper around a method or function; every method object is a method_t
Here is the method_t struct from the source code:

struct method_t {
    SEL name;  // Method name (selector)
    const char *types;  // Encoding (return type, parameter types)
    IMP imp; // Pointer to the function (function address)
};

The method_t struct has three members. Let's look at what each of them represents in turn:

2.1.1 SEL

SEL represents the method or function name and is called a selector; its underlying structure is similar to char *

A SEL can be obtained through @selector() and sel_registerName()

SEL sel1 = @selector(test);
SEL sel2 = sel_registerName("test");

A SEL can also be converted to a string through sel_getName() and NSStringFromSelector()
```
const char *string = sel_getName(sel1);
NSString *string2 = NSStringFromSelector(sel2);
```
Methods with the same name in different classes share the same selector
- A SEL represents only the method name, and the SEL for a given method name is globally unique, no matter which class declares the method.
```
NSLog(@"%p,%p", sel1,sel2);
Runtime-test[23738:8888825] 0x1017718a3,0x1017718a3
```

typedef struct objc_selector *SEL; so a SEL can be thought of as a method-name string.
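
Why two calls with the same name yield the same pointer can be illustrated with a toy interning table in plain C. This is only a sketch of the idea: register_name, g_names, and the fixed-size linear table are hypothetical, and the runtime's own selector table is organized differently.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SELECTORS 64  /* arbitrary limit for this sketch */

static char  *g_names[MAX_SELECTORS];
static size_t g_count = 0;

/* Return the one shared copy of name, registering it on first use.
   Returns NULL if the table is full or memory cannot be allocated. */
const char *register_name(const char *name) {
    for (size_t i = 0; i < g_count; i++) {
        if (strcmp(g_names[i], name) == 0) return g_names[i];  /* already registered */
    }
    if (g_count == MAX_SELECTORS) return NULL;
    char *copy = malloc(strlen(name) + 1);
    if (copy == NULL) return NULL;
    strcpy(copy, name);
    g_names[g_count++] = copy;
    return copy;
}
```

Because every name maps to a single stored copy, two selectors can be compared by pointer instead of character by character, which matches the identical addresses printed above.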

2.1.2 types

types is a string that encodes the function's return value and parameters

The return value and parameters are concatenated into one string
This string represents the function's return value and parameters

Let's check in code how types represents the return value and parameters of a function:

First, write a few structs that have the same layout as the runtime's underlying class implementation, to simulate the internals of Class
We did the same thing when exploring the nature of Class: inspect the internal data through a forced type cast

Person *person = [[Person alloc] init];
xx_objc_class *cls = (__bridge xx_objc_class *)[Person class];
class_rw_t *data = cls->data();

At the breakpoint, the types value can be found inside data

Figure: the types value in data

The figure above shows that the value of types is v16@0:8
So what does this value mean?
To represent a method and its return value clearly as a string, Apple defined a set of encoding rules; the one-to-one correspondence can be seen in the table below

Matching each part of the types value against the table shows what v16@0:8 means

- (void) test;

 v    16      @     0     :     8
void         id          SEL
// 16 is the total size in bytes of the parameters; the 0 after id means it is stored starting at byte 0, and id occupies 8 bytes.
// The 8 after SEL means it is stored starting at byte 8; SEL also occupies 8 bytes.

We know that every method has two default parameters, self of type id and _cmd of type SEL, and the analysis of types above confirms this.

To see this more clearly, add a return value and parameters to test and check the types value again.

Figure: the types value

Again, match each value against the table above to see which method the types value describes

- (int)testWithAge:(int)age Height:(float)height
{
    return 0;
}
  i    24    @    0    :    8    i    16    f    20
int         id        SEL       int        float
// The parameters occupy 8 + 8 + 4 + 4 = 24 bytes in total
// id starts at byte 0 and occupies 8 bytes
// SEL starts at byte 8 and occupies 8 bytes
// int starts at byte 16 and occupies 4 bytes
// float starts at byte 20 and occupies 4 bytes
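
The manual decoding above can be reproduced with a small parser in plain C. It is a sketch under a loud assumption: every type code is a single character, which holds for the codes used in this section (v, i, f, @, :) but not for structs, pointers, or qualified types. parse_types and type_entry are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

typedef struct {
    char code;    /* type code such as 'v', 'i', 'f', '@' or ':' */
    int  number;  /* total parameter size for the first entry, byte offset for the others */
} type_entry;

/* Split an encoding such as "i24@0:8i16f20" into (code, number) pairs.
   Assumes single-character type codes. Returns the number of entries written. */
size_t parse_types(const char *types, type_entry *out, size_t max) {
    size_t n = 0;
    while (*types != '\0' && n < max) {
        out[n].code = *types++;
        int value = 0;
        while (isdigit((unsigned char)*types)) {
            value = value * 10 + (*types - '0');
            types++;
        }
        out[n].number = value;
        n++;
    }
    return n;
}
```

For "i24@0:8i16f20" the first entry is the return type with the total size 24, followed by id at offset 0, SEL at offset 8, int at offset 16, and float at offset 20.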

iOS provides the @encode directive, which converts a given type into its string encoding.

NSLog(@"%s",@encode(int));
NSLog(@"%s",@encode(float));
NSLog(@"%s",@encode(id));
NSLog(@"%s",@encode(SEL));

// Output
Runtime-test[25275:9144176] i
Runtime-test[25275:9144176] f
Runtime-test[25275:9144176] @
Runtime-test[25275:9144176] :

As the code above shows, the correspondence is indeed the one listed in the table above.

2.1.3 IMP

IMP represents the concrete implementation of the function

It stores the address of the function
That is, once the IMP is found, the function implementation can be found and called

Print the IMP value from the code above

Printing description of data->methods->first.imp:
(IMP) imp = 0x000000010c66a4a0 (Runtime-test`-[Person testWithAge:Height:] at Person.m:13)

Setting a breakpoint inside the method and stepping into it shows that the address stored in IMP is the address of the method implementation.
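
The three members can be tied together in a plain C sketch: a table of (name, types, imp) entries is searched by name, and the function pointer that comes back is then called. All names here (method_entry, imp_t, find_imp, person_test_with_age_height) are hypothetical, and the string comparison is a simplification of this sketch, since selectors are unique and can be compared by pointer.

```c
#include <stddef.h>
#include <string.h>

/* Function pointer type with the two implicit parameters, self and _cmd. */
typedef int (*imp_t)(void *self, const char *cmd, int age, float height);

/* Hypothetical stand-in for method_t. */
typedef struct {
    const char *name;   /* method name */
    const char *types;  /* type encoding */
    imp_t       imp;    /* address of the implementation */
} method_entry;

/* Example implementation: returns the age it was given. */
int person_test_with_age_height(void *self, const char *cmd, int age, float height) {
    (void)self; (void)cmd; (void)height;
    return age;
}

/* Linear search through a method list; returns NULL if the name is not found. */
imp_t find_imp(const method_entry *methods, size_t count, const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(methods[i].name, name) == 0) return methods[i].imp;
    }
    return NULL;
}
```

Once find_imp returns a non-NULL pointer, calling it is an ordinary C function call, which is what "find the IMP, then call the function" amounts to.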

Four 、cache_t method cache

The previous sections showed how the method lists are stored in the Class object

But when a subclass several levels down the inheritance chain calls a base class method, the runtime has to follow the superclass pointer level by level up to the base class and search each method list until it finds the method to call
If the base class method is called many times, the method list of every parent class has to be traversed many times, which clearly hurts performance

Apple solves this problem with method caching. Next, let's explore how the Class object caches methods

Go back to the class object structure objc_class. It has a member variable named cache

This cache member is what supports method caching

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache; //  Method cache  // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() { 
        return bits.data();
    }
    void setData(class_rw_t *newData) {
        bits.setData(newData);
    }
}

The internal structure of Class contains a method cache (cache_t), which uses a hash table to cache methods that have been called and so speeds up method lookup

Review of the method call process:

When a method is called, the method list has to be searched
- If the method is not in the list, the runtime follows superclass to the class object of the parent class and searches that class object's method list.
If the method is called many times, every call has to traverse several method lists

cache_t

To find methods quickly, Apple designed cache_t to cache them:
Whenever a method is called, the cache is checked first:
- If the method is not cached, the runtime searches the class object's method list, then the parent classes in turn, until the method is found; the method is then stored in the cache
- The next time the method is called, it is found in the class object's cache and called directly

1. How cache_t caches methods

So how does cache_t cache methods? First look at the internal structure of cache_t.

struct cache_t {
    struct bucket_t *_buckets; // Hash table (array)
    mask_t _mask; // Length of the hash table - 1
    mask_t _occupied; // Number of methods already cached
};

_buckets is an array of bucket_t that holds the cached methods. Here is the internal structure of bucket_t

struct bucket_t {
private:
    cache_key_t _key; // The SEL, used as the key
    IMP _imp; // Memory address of the function
};

From the source code we can see:

bucket_t stores a SEL and an _imp
as a key->value pair:
- The SEL is the key
- The memory address of the function implementation, _imp, is the value

A diagram of the cache_t structure

Figure: the method hash table of bucket_t

The bucket_t list above is a hash table
A hash table is a data structure that is accessed directly by key
In other words, it speeds up lookup by mapping each key to a position in a table and accessing the record there.
The mapping function is called the hash function, and the array of records is called the hash table

So how does Apple find the key and its function implementation in the hash table quickly and accurately?
To answer that, we need to read the source code and see how Apple designed the hash function:

2. Hash function and hash table principle

First, the method cache source code (only a few functions are examined; the key code is commented, so no further introduction is given)

2.1 The cache_fill and cache_fill_nolock functions

void cache_fill(Class cls, SEL sel, IMP imp, id receiver) {
#if !DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}

static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver) {
    cacheUpdateLock.assertLocked();
    // Return immediately if the class has not been initialized
    if (!cls->isInitialized()) return;
    // Make sure no other thread has already added this entry to the cache
    if (cache_getImp(cls, sel)) return;
    // Get the cache from the class object
    cache_t *cache = getCache(cls);
    // Wrap the SEL as a key
    cache_key_t key = getKey(sel);
    // Occupied count + 1
    mask_t newOccupied = cache->occupied() + 1;
    // Get the capacity of the cache: how many key-value pairs it can hold
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // The cache is empty, so allocate space; the initial allocation is 4 slots
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // The occupied count is at most 3/4 of the capacity, so keep using the current space
    }
    else {
        // The occupied count exceeds 3/4 of the capacity, so expand
        cache->expand();
    }
    // Use the key to find a suitable bucket
    bucket_t *bucket = cache->find(key, receiver);
    // If key == 0, this key has not been stored before, so increment the occupied count
    if (bucket->key() == 0) cache->incrementOccupied();
    // Store key and imp
    bucket->set(key, imp);
}

2.2 The expand() function

When more than 3/4 of the hash table is occupied, the hash table calls expand() to grow. Let's see how expand() grows the hash table.

void cache_t::expand() {
    cacheUpdateLock.assertLocked();
    // Get the capacity of the old hash table
    uint32_t oldCapacity = capacity();
    // Double the capacity of the old hash table
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;
    // If the new capacity does not fit in mask_t, it cannot grow further: keep the old capacity
    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        newCapacity = oldCapacity;
    }
    // Call reallocate to create the new storage
    reallocate(oldCapacity, newCapacity);
}

2.3 The reallocate function

The source above shows that reallocate is responsible for allocating the hash table's storage. Here is the inside of reallocate.

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity) {
    // Whether the old hash table can be freed
    bool freeOld = canBeFreed();
    // Get the old hash table
    bucket_t *oldBuckets = buckets();
    // Create a new hash table with the new capacity
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);
    // Set the buckets and the mask; the mask is the hash table length - 1
    setBucketsAndMask(newBuckets, newCapacity - 1);
    // Free the old hash table
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}

In cache_fill_nolock above, the first call to reallocate passes INIT_CACHE_SIZE as newCapacity. INIT_CACHE_SIZE is an enum value equal to 4, so the hash table initially has 4 slots.

enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2)
};

The source above also shows that each expansion doubles the capacity of the hash table.

2.4 The find function

Finally, how does the hash table quickly find the right bucket for a key? Here is the inside of find

bucket_t * cache_t::find(cache_key_t k, id receiver) {
    assert(k != 0);
    // Get the hash table
    bucket_t *b = buckets();
    // Get the mask
    mask_t m = mask();
    // Use the key to compute its index in the hash table
    mask_t begin = cache_hash(k, m);
    // Start probing at that index
    mask_t i = begin;
    // If the bucket at index i has key == 0, this key is not stored yet: return b[i] so the caller can store into it
    // If the bucket at index i has key == k, this key is already stored there: return b[i]
    do {
        if (b[i].key() == 0  ||  b[i].key() == k) {
            // Either condition is met, so return directly
            return &b[i];
        }
    // Otherwise cache_next moves to the next slot to check again, until a bucket can be returned
    } while ((i = cache_next(i, m)) != begin);

    // hack
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)k, cls);
}

cache_hash(k, m) uses the key to compute the index at which it is stored in the hash table. Here is the inside of cache_hash(k, m)

static inline mask_t cache_hash(cache_key_t key, mask_t mask) {
    return (mask_t)(key & mask);
}

cache_hash(k, m) does nothing more than a bitwise AND, key & mask, to get the index at which the entry is stored. Bitwise AND was explained in detail earlier, so it is not repeated here.

2.5 _mask

From the analysis above, _mask is the length of the hash table minus one, so any number ANDed with _mask yields a value less than or equal to _mask, and the array index can never go out of bounds.

For example, if the length of the hash table is 8, then mask is 7

  0101 1011  // Any value
& 0000 0111  // mask = 7
------------
  0000 0011  // The result is always less than or equal to mask
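
The arithmetic above can be checked with two small helpers in plain C. cache_hash mirrors the one-line function quoted earlier; cache_next is an assumption that follows the step-down-and-wrap behaviour described in this article (as far as I know, other architectures step upward instead). mask_t is declared locally as an assumption of this sketch.

```c
#include <stdint.h>

typedef uint32_t mask_t;  /* assumed width for this sketch */

/* Index of a key: its low bits, selected by mask (table length - 1). */
mask_t cache_hash(uintptr_t key, mask_t mask) {
    return (mask_t)(key & mask);
}

/* Next slot to probe: one step down, wrapping from 0 back to mask. */
mask_t cache_next(mask_t i, mask_t mask) {
    return i ? i - 1 : mask;
}
```

With a table length of 8, cache_hash(0x5b, 7) gives 3, the same result as the binary example; probing then continues at 2, 1, 0 and wraps around to 7.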

3. Method call summary

First method lookup and caching:
- First lookup: when a method is used for the first time, the messaging mechanism follows the isa pointer to the class/meta-class
- Caching: after the method is found by traversing the method list (following superclass to the parent class if it is not found), it is cached in the _buckets of cache with the SEL as key and the IMP as value
- Hash table index: on the first store, a hash table with 4 slots is created and _mask is set to the table length minus one; SEL & mask then gives the index, and the method is stored at that index
  - For example, if the computed index is 3, the method is stored directly in slot 3 and the slots before it stay empty
Hash table expansion:
- When the number of methods stored in the hash table exceeds 3/4 of its length, the table is expanded:
  - A new hash table with twice the original capacity is created
  - _mask is reset
  - Finally, the old hash table is released
- A method cached after this point has its index computed again with SEL & mask and is stored at that index
Hash table index collisions:
- If a class has many methods, several of them are likely to produce the same index from SEL & mask
- If the computed index is already occupied, cache_next is called to move to the slot at index - 1
- If the slot at index - 1 also holds a method whose key differs from the key being stored, the search keeps moving to the previous slot until it finds an empty slot or a slot whose key equals the key being stored
- If index 0 is reached, the search wraps around to index _mask, the last slot, and continues from there.
Subsequent method lookups:
- Looking up a method does not require traversing the hash table; SEL & mask gives the index, and the value is read directly from that slot
- As above, if the key stored at that index differs from the key being looked up, the search moves to the previous slot.
- This costs a small amount of extra space but saves a great deal of time; in other words, Apple trades space for time to optimize method lookup.
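
The whole process summarized above can be condensed into a toy cache in plain C. This is a sketch under assumptions, not the runtime's implementation: method_cache, bucket, cache_insert, and cache_lookup are hypothetical names, keys are plain nonzero integers standing in for selectors, probing steps downward as described in this article, and expansion throws away the old entries instead of copying them.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { uintptr_t key; void *imp; } bucket;  /* key 0 marks an empty slot */

typedef struct {
    bucket  *buckets;   /* NULL until the first insert */
    uint32_t mask;      /* capacity - 1 */
    uint32_t occupied;  /* number of cached entries */
} method_cache;

static uint32_t next_index(uint32_t i, uint32_t mask) { return i ? i - 1 : mask; }

/* Return the slot that holds key, or the first empty slot on its probe path. */
static bucket *cache_find(method_cache *c, uintptr_t key) {
    uint32_t begin = (uint32_t)(key & c->mask);
    uint32_t i = begin;
    do {
        if (c->buckets[i].key == 0 || c->buckets[i].key == key) return &c->buckets[i];
    } while ((i = next_index(i, c->mask)) != begin);
    return NULL;  /* table full; the 3/4 rule below prevents this */
}

/* Return the cached imp for key, or NULL on a cache miss. */
void *cache_lookup(method_cache *c, uintptr_t key) {
    if (c->buckets == NULL) return NULL;
    bucket *b = cache_find(c, key);
    return (b != NULL && b->key == key) ? b->imp : NULL;
}

/* Cache (key, imp); key must be nonzero. Returns 0 on success, -1 on failure. */
int cache_insert(method_cache *c, uintptr_t key, void *imp) {
    if (cache_lookup(c, key) != NULL) return 0;  /* already cached */
    uint32_t capacity = c->buckets ? c->mask + 1 : 0;
    if (capacity == 0 || c->occupied + 1 > capacity / 4 * 3) {
        /* First use: 4 slots. Otherwise double the capacity and start empty. */
        uint32_t new_capacity = capacity ? capacity * 2 : 4;
        bucket *fresh = calloc(new_capacity, sizeof *fresh);
        if (fresh == NULL) return -1;
        free(c->buckets);
        c->buckets = fresh;
        c->mask = new_capacity - 1;
        c->occupied = 0;
    }
    bucket *b = cache_find(c, key);
    if (b == NULL) return -1;
    if (b->key == 0) c->occupied++;
    b->key = key;
    b->imp = imp;
    return 0;
}
```

Inserting three entries fills the initial 4-slot table up to its 3/4 limit; a fourth insert replaces it with an empty 8-slot table, so the earlier entries miss on their next lookup and have to be cached again.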

A diagram makes the process clearer:

Figure: access logic inside the hash table

4. Verify the above process

Let's demonstrate it with code. As before, define custom structs that mirror objc_class and cast to them to inspect the internal data; custom structs have been used many times in previous articles, so the details are not repeated here.

We create a Person class that inherits from NSObject, a Student class that inherits from Person, and a CollegeStudent class that inherits from Student. The three classes have the methods personTest, studentTest, and colleaeStudentTest respectively

Use breakpoints and printing to observe the method caching process

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        CollegeStudent *collegeStudent = [[CollegeStudent alloc] init];
        xx_objc_class *collegeStudentClass = (__bridge xx_objc_class *)[CollegeStudent class];
        
        cache_t cache = collegeStudentClass->cache;
        bucket_t *buckets = cache._buckets;
        
        [collegeStudent personTest];
        [collegeStudent studentTest];
        
        NSLog(@"----------------------------");
        for (int i = 0; i <= cache._mask; i++) {
            bucket_t bucket = buckets[i];
            NSLog(@"%s %p", bucket._key, bucket._imp);
        }
        NSLog(@"----------------------------");
        
        [collegeStudent colleaeStudentTest];

        cache = collegeStudentClass->cache;
        buckets = cache._buckets;
        NSLog(@"----------------------------");
        for (int i = 0; i <= cache._mask; i++) {
            bucket_t bucket = buckets[i];
            NSLog(@"%s %p", bucket._key, bucket._imp);
        }
        NSLog(@"----------------------------");
        
        NSLog(@"%p",@selector(colleaeStudentTest));
        NSLog(@"----------------------------");
    }
    return 0;
}

We set breakpoints where the collegeStudent instance calls personTest, studentTest, and colleaeStudentTest to watch how the cache changes.

Before personTest is called:

Figure: the cache before personTest is called

The figure above shows:

Before personTest is called, the cache contains only the init method
In the figure, init happens to be stored at index 0, so it can be seen directly
_mask is 3, which confirms that the hash table is allocated 4 slots on the first store, as the source above showed
_occupied is 1, which shows that _buckets holds only one method at this point.

When collegeStudent calls personTest:

The cache of the CollegeStudent class object is checked first; personTest is not there, so the method list of the CollegeStudent class object is searched
It is not there either, so the superclass pointer is followed to the Student class object
Neither the cache nor the method list of the Student class object contains it, so the superclass pointer is followed again to the Person class object
Finally the method is found in the method list of the Person class object, called, and cached in the cache of the CollegeStudent class object.

After personTest executes, check how the cache has changed:

The figure above shows:

_occupied is 2, which means personTest has been cached in the cache of the CollegeStudent class object

Likewise, after studentTest executes, print the contents of the cache

Figure: the contents of the cache in memory

The figure above shows that the cache does hold three methods: init, personTest, and studentTest.

After colleaeStudentTest executes, the cache should hold colleaeStudentTest as well.

As the source above showed, when the number of stored methods exceeds 3/4 of the hash table length, the runtime creates a new hash table with twice the original capacity to replace the old one.
Step over colleaeStudentTest and print the contents of the cache again:

The figure shows:

After the expansion, the _buckets hash table contains only colleaeStudentTest
And the index printed in the figure, obtained from SEL & _mask, is indeed the position at which colleaeStudentTest is stored in _buckets

We now have a new understanding of the structure of Class and of the method cache:

Apple caches methods in a hash table, trading a small amount of space for a large saving in method lookup time
