
Studying and Imitating NumPy, Part 1

2022-06-29 21:43:00 InfoQ

I worked on a project that needed to extract a voiceprint feature vector from audio for speech recognition. This requires some numerical computation, mainly the fast Fourier transform, high-frequency filtering, and the computation of Mel cepstral coefficients. The useful references I found online were implemented in Python, and NumPy was by far the most heavily used library among them. Our project, however, has to extract voiceprints on mobile phones, so the extraction had to be implemented in C. The hard part is writing a subset of NumPy's functionality in C and then porting the Python code to a C version of the voiceprint extraction.

This article records how, in order to implement voiceprint extraction, I had to reimplement part of NumPy in C, along with some interesting things I ran into while studying NumPy.

01 
 
Analysis of the N-dimensional array

NumPy is the fundamental package for scientific computing in Python. One of its most important features is the N-dimensional array object ndarray: a collection of elements of the same type, indexed from 0. Every element of an ndarray occupies a block of the same size in memory. So how would one design an imitation of NumPy in C?

The key is the N-dimensional array. N is not known in advance, so the object's internal array cannot have a fixed number of dimensions, which is a challenge for a statically typed language. One thing is certain, though: once the shape is given, the number of elements is fixed, and therefore so is the amount of memory needed. For example, a 3*5*5 three-dimensional array of float needs 3*5*5*sizeof(float) = 75 * 4 = 300 bytes (assuming a 4-byte float). So internally we allocate one contiguous memory block of that size, and externally the interface presents it as a three-dimensional array.
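The size calculation above can be sketched in a few lines of C. The helper names element_count and byte_size are hypothetical and do not come from the ultra_array source; the sketch assumes float elements, as in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of elements in an array with the given shape. */
static size_t element_count(const size_t shape[], int axis_n)
{
    assert(axis_n > 0);
    size_t count = 1;
    for (int i = 0; i < axis_n; ++i) {
        count *= shape[i];
    }
    return count;
}

/* Hypothetical helper: bytes needed to store that many float elements. */
static size_t byte_size(const size_t shape[], int axis_n)
{
    return element_count(shape, axis_n) * sizeof(float);
}
```

For the shape [3, 5, 5], element_count gives 75, and byte_size gives 75 * sizeof(float), which is 300 on platforms where float is 4 bytes.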

Let's first give this C imitation of NumPy an imposing name: ultra_array. Its prototype is as follows:
struct _u_array {
 char *start[2];
 int axis_n;
};

Yes, it is that simple. start is an array of two pointers. start[0] points to the array that stores the dimension information, for example [3, 5, 5] for a 3*5*5 array, and start[1] points to the memory block that stores the element data. axis_n is the number of dimensions: 3 for a three-dimensional array, 4 for a four-dimensional one. axis_n determines the bounds of start[0], and start[0] in turn determines the bounds of start[1].
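To make the relationship between the two pointers concrete, here is a minimal sketch of how such a structure might be allocated and released. It assumes the shape is stored as size_t values and the data as float; the names sketch_array, sketch_init, sketch_shape, sketch_data and sketch_free are hypothetical and are not the actual ultra_array API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as the prototype above, under a hypothetical name. */
struct sketch_array {
    char *start[2];   /* start[0]: shape, start[1]: element data */
    int axis_n;       /* number of dimensions */
};

static size_t *sketch_shape(const struct sketch_array *a) { return (size_t *)a->start[0]; }
static float  *sketch_data(const struct sketch_array *a)  { return (float *)a->start[1]; }

/* Returns 0 on success, -1 on invalid input or allocation failure. */
static int sketch_init(struct sketch_array *a, int axis_n, const size_t shape[])
{
    size_t count = 1;
    if (axis_n <= 0) return -1;
    for (int i = 0; i < axis_n; ++i) count *= shape[i];

    a->axis_n = axis_n;
    a->start[0] = malloc((size_t)axis_n * sizeof(size_t));  /* bounded by axis_n */
    a->start[1] = calloc(count, sizeof(float));             /* bounded by the shape */
    if (a->start[0] == NULL || a->start[1] == NULL) {
        free(a->start[0]);
        free(a->start[1]);
        return -1;
    }
    memcpy(a->start[0], shape, (size_t)axis_n * sizeof(size_t));
    return 0;
}

static void sketch_free(struct sketch_array *a)
{
    free(a->start[0]);
    free(a->start[1]);
    a->start[0] = a->start[1] = NULL;
}
```

Storing both blocks behind char pointers keeps the struct itself a fixed size, whatever the number of dimensions turns out to be at run time.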


02
 
Multidimensional array element access


How do we access an element of this N-dimensional array? The data is stored in a one-dimensional array, so when the interface receives the coordinates of an element, we need a transformation matrix to convert them into an index into the one-dimensional array. For example, given a 3*4*5*6 array, to access the element at coordinates (2,2,2,3) we first have to compute the transformation matrix for the 3*4*5*6 shape.

You might think the transformation matrix is (3,4,5,6). Wrong! That would be an error of intuition. The transformation matrix is built by taking, for each dimension, the product of all the dimensions after it; the last dimension has nothing after it, so its entry is 1. The result is a vector of per-dimension multipliers (analogous to NumPy's strides, but counted in elements rather than bytes) that maps N-dimensional coordinates to a one-dimensional index. For a 3*4*5*6 array the transformation matrix is [4*5*6, 5*6, 6, 1] => [120, 30, 6, 1].
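The construction just described can be sketched as a single backward pass over the shape. The function name compute_strides is hypothetical; it fills an output array with the product of all later dimensions for each axis.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: strides[i] becomes the product of shape[i+1..axis_n-1],
 * so the last entry is always 1.
 */
static void compute_strides(const size_t shape[], int axis_n, size_t strides[])
{
    assert(axis_n > 0);
    size_t running = 1;
    for (int i = axis_n - 1; i >= 0; --i) {
        strides[i] = running;   /* product of everything after axis i */
        running *= shape[i];    /* include axis i for the next step */
    }
}
```

Walking from the last axis to the first means each product is computed once, instead of being recomputed for every axis.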

The coordinates (2,2,2,3) are then converted to a one-dimensional index by a dot product: (2,2,2,3) · (120,30,6,1)^T = 2*120 + 2*30 + 2*6 + 3*1 = 315. The code implementation is as follows:
static size_t
__xd_coord_to_1d_offset(size_t coord[], size_t axes[], int axis_n) {

 size_t offset = 0, axis_mulitply;
 for (int i=0; i<axis_n; ++i) {
 size_t co = coord[i];
 axis_mulitply = __axis_mulitply(axes, axis_n, i+1);
 offset += co * axis_mulitply;
 }
 return offset;
}
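The listing above calls __axis_mulitply, which the article does not show. Judging only from its call sites, it appears to return the product of the dimensions from a given axis to the last one. The sketch below is a guess at that behaviour under the hypothetical name axis_product, together with a self-contained version of the coordinate-to-offset conversion; it is not the code from the repository.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the helper used above: product of
 * axes[from..axis_n-1]. Returns 1 when from == axis_n (empty product).
 */
static size_t axis_product(const size_t axes[], int axis_n, int from)
{
    assert(from >= 0 && from <= axis_n);
    size_t product = 1;
    for (int i = from; i < axis_n; ++i) {
        product *= axes[i];
    }
    return product;
}

/* Dot product of the coordinates with the per-axis multipliers. */
static size_t coord_to_offset(const size_t coord[], const size_t axes[], int axis_n)
{
    size_t offset = 0;
    for (int i = 0; i < axis_n; ++i) {
        offset += coord[i] * axis_product(axes, axis_n, i + 1);
    }
    return offset;
}
```

With axes [3, 4, 5, 6] and coordinates (2,2,2,3), coord_to_offset reproduces the value 315 from the worked example.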

How do we convert a one-dimensional index back into N-dimensional coordinates? Starting from the first dimension, divide the one-dimensional index by the product of all the following dimensions. The quotient is the coordinate in the current dimension, and the remainder is carried over to the next dimension, where it is divided by the product of the dimensions after that one, and so on until the last dimension. For example, for the one-dimensional index 315 computed above:
315 / (4*5*6) = 2 remainder 75
 75 / (5*6)   = 2 remainder 15
 15 / 6       = 2 remainder 3
  3 / 1       = 3

So the coordinates are  [2,2,2,3]. The code implementation is :
static void
__1d_offset_to_xd_coord( size_t offset, size_t axes[], int axis_n, size_t coord[])
{
 size_t axis_mulitply, middle_value;
 int i;
 middle_value = offset;
 for(i=0; i<axis_n-1; ++i) {
  axis_mulitply = __axis_mulitply(axes, axis_n, i+1);
  // The quotient is the coordinate on axis i; the remainder moves on to the next axis.
  coord[i] = middle_value / axis_mulitply;
  middle_value = middle_value % axis_mulitply;
 }
 // Whatever is left is the coordinate on the last axis (this also covers axis_n == 1).
 coord[i] = middle_value;
 return;
}
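As a cross-check on the two conversions, here is a sketch of an alternative formulation that needs no product helper at all. This is a swapped-in technique, not the author's method: the offset is built with Horner's rule, and the coordinates are recovered by taking remainders from the last axis backwards. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Horner's rule: ((c0 * s1 + c1) * s2 + c2) * s3 + c3, and so on. */
static size_t coord_to_offset_horner(const size_t coord[], const size_t shape[], int axis_n)
{
    size_t offset = 0;
    for (int i = 0; i < axis_n; ++i) {
        assert(coord[i] < shape[i]);          /* coordinate must be in range */
        offset = offset * shape[i] + coord[i];
    }
    return offset;
}

/* Inverse: peel off one coordinate per axis, starting from the last axis. */
static void offset_to_coord(size_t offset, const size_t shape[], int axis_n, size_t coord[])
{
    for (int i = axis_n - 1; i >= 0; --i) {
        coord[i] = offset % shape[i];
        offset /= shape[i];
    }
}
```

Both directions give the same results as the quotient-and-remainder procedure described above, so converting any valid offset to coordinates and back should return the original offset.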

03
Code implementation

  • Initialization
/**
 * axis_n: the number of dimensions, for example 3
 * shape:  the size of each dimension, for example [3, 3, 3]
 */
u_array_t UArray_create(int axis_n, size_t shape[]) 
{
 if (axis_n >= 0) {
  u_array_t n_array;
  n_array.axis_n = axis_n;
  // start[0] holds the shape, start[1] holds the element data.
  n_array.start[0] = __alloc_shape(axis_n, shape);
  n_array.start[1] = __alloc_data(__axis_mulitply(shape, axis_n, 0));

  return n_array;
 }
 return ua_unable;
}

  • Load data
u_array_t* UArray_load(u_array_t* arr, vfloat_t data[])
{
 size_t size_arr = UA_size(arr);
 vfloat_t* ptr = UA_data_ptr(arr);
 memcpy(ptr, data, size_arr); 
 return arr;
}

  • Access data
float UArray_get(u_array_t* arr, ...) 
{
 va_list valist;
 va_start(valist, arr);
 size_t coord[UA_axisn(arr)];
 for (int i=0; i<UA_axisn(arr); ++i) {
  // Callers pass plain integer literals, which arrive as int, so read them as int.
  coord[i] = (size_t) va_arg(valist, int);
 }
 va_end(valist);
 size_t offset = UA_cover_coordinate(arr, coord);
 return ((float*)(UA_data_ptr(arr)))[offset];
}

void UArray_set(u_array_t* arr, ...)
{
 va_list valist;
 va_start(valist, arr);
 size_t coord[UA_axisn(arr)];
 vfloat_t value;
 for (int i=0; i<UA_axisn(arr); ++i) {
  coord[i] = (size_t) va_arg(valist, int);
 }
 // A floating-point argument is promoted to double when passed through "...".
 value = va_arg(valist, double);
 va_end(valist);
 size_t offset = UA_cover_coordinate(arr, coord);
 ((float*)(UA_data_ptr(arr)))[offset] = value;
 return;
}
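The variadic accessors rely on C's default argument promotions, which are easy to get wrong. The following self-contained sketch shows the same pattern on a plain float buffer. The name sketch_get is hypothetical, and it assumes the caller passes exactly axis_n coordinates of type int.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical variadic getter over a flat float buffer.
 * The variadic arguments must be axis_n values of type int.
 */
static float sketch_get(const float *data, const size_t shape[], int axis_n, ...)
{
    va_list args;
    size_t offset = 0;

    va_start(args, axis_n);
    for (int i = 0; i < axis_n; ++i) {
        int c = va_arg(args, int);            /* integer literals arrive as int */
        assert(c >= 0 && (size_t)c < shape[i]);
        offset = offset * shape[i] + (size_t)c;
    }
    va_end(args);

    return data[offset];
}
```

Reading each coordinate with the type the caller actually passed matters here: fetching an int argument as size_t is undefined behaviour on platforms where the two types differ in size.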


04
Test

int main()
{
 // Define a 3-dimensional ultra_array with shape 2*3*4.
 u_array_t arr3 = UArray3d(2, 3, 4);
 // Fill it with the numbers 0 to 23.
 UA_arange(&arr3, 2*3*4);
 // Read the element at coordinates (1, 2, 3).
 float v = UA_get(&arr3, 1, 2, 3);
 // v == 23
 UA_set(&arr3, 1, 2, 3, 5.5);
 v = UA_get(&arr3, 1, 2, 3);
 // v == 5.5
 return 0;
}

That completes a simple multidimensional array in C. The code above comes from:
https://github.com/zuweie/boring-code/tree/main/src/ultra_array

The end!