Subtle differences between np.arange and np.linspace (a data overflow problem)

2022-06-22 08:58:00 【Ghost crap】

TL;DR

x = np.arange(start, end, steps)

Values are generated within the half-open interval [start, end)
(in other words, the interval including start but excluding end).
That is, the interval is closed on the left and open on the right, so the value of end is not included.

x = np.linspace(start, end, num, endpoint=True)

There are num equally spaced samples in the closed interval [start, end] or the half-open interval [start, end) (depending on whether endpoint is True or False).
With endpoint=True the interval is closed on both sides and includes the value of end.
With endpoint=False the interval is closed on the left and open on the right, and does not include the value of end.

Conclusion:

When linspace is called with endpoint=False and num equal to (end - start) / steps, the two functions produce the same sample values.
When start, end and steps are all integers, arange returns an integer array. On my machine the element type was numpy.int32 (the default integer type depends on the platform and the NumPy version),
while linspace returns numpy.float64. NumPy's types correspond to C data types, and each fixed-width integer type has its own limited range, so integer arithmetic can overflow silently. In numerical analysis this means the plotted curve of a function can simply be wrong.
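The dtype difference described above can be checked directly. This is a minimal sketch; the variable names are only illustrative, and the integer dtype it prints depends on the platform and NumPy version, so it may be int64 rather than the int32 seen in this post.

```python
import numpy as np

ints = np.arange(0, 10, 1)                        # integer start, end and step
floats = np.linspace(0, 10, 10, endpoint=False)   # always floating point

print(ints.dtype)                    # a platform-dependent integer type
print(floats.dtype)                  # float64
print(np.array_equal(ints, floats))  # the sample values themselves agree
print(np.iinfo(ints.dtype))          # representable range of the integer type
```

The values match, so any difference in later results comes from the element type, not from the sample positions.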

Solution:

Prefer np.linspace over np.arange: it always returns floating-point values, which avoids the integer overflow.
When using np.arange, pass a floating-point step such as 1.0.
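Besides a floating-point step, the element type can also be requested explicitly. The sketch below shows three ways that should give the same floating-point array; the dtype argument and astype are standard NumPy features, but using them here is a suggestion of this example rather than something the original code did.

```python
import numpy as np

x_step = np.arange(1000, 1500, 1.0)                   # floating-point step
x_dtype = np.arange(1000, 1500, 1, dtype=np.float64)  # explicit dtype
x_cast = np.arange(1000, 1500, 1).astype(np.float64)  # convert afterwards

print(x_step.dtype, x_dtype.dtype, x_cast.dtype)
print(np.array_equal(x_step, x_dtype) and np.array_equal(x_step, x_cast))
```

Any of the three keeps x**4 in floating point, where values around 10^12 are far from the limits of float64.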

Curve-plotting code:

x = np.arange(0, 100, 1.0)                         # range of the independent variable
x = np.linspace(0, 100, 100, endpoint=False)       # equivalent to the line above
y = alpha*(T-x) + beta*(T**4-x**4) + gamma*(u**2)  # dependent variable
plt.plot(x, y, x, y*0.0)                           # plot the curve and the zero line y = 0

How the problem came up

For a numerical analysis assignment I could not get the correct curve with Python's matplotlib, while MATLAB could. It was the same function, yet the zeros shown by the two plots were wildly different. As a weak student who does not know MATLAB, did I really have to grit my teeth and learn its syntax just to get the right answer? I asked a classmate, and he had obtained the correct answer with Python. After two hours of fruitless debugging, shedding tears of envy at the classmates who had switched to MATLAB, I hurried to run a comparison against his code and finally found the key to the problem. Read on to see what happened.

My code

import numpy as np
import matplotlib.pyplot as plt

u = 21.0
T = 293.15
alpha = 50.0
beta = 2*(10**(-7))
gamma = 800.0

x = np.arange(0,10000,1)                           # range of the independent variable
y = alpha*(T-x) + beta*(T**4-x**4) + gamma*(u**2)  # dependent variable
plt.plot(x, y, x, y*0.0)                           # the curve and the zero line y = 0

Result

[Figure: plot produced by my code]
The plot suggests that the zero of the function lies in the range [7000, 8000].
So here is the question: why does an expression that is not a linear function plot as a straight line? A term in x^4 cannot be a straight line. With that in mind, I asked my classmate for his answer and found that his function has only one zero, in the range [1000, 1200]. I compared the two formulas carefully and thought for two hours without finding the answer.

My classmate's code

u = 21.0
T = 293.15
alpha = 50.0
beta = 2*(10**(-7))
gamma = 800.0

start = 1000
end = 1500
step = 1
num = (end - start) // step
x = np.linspace(start, end, num)
y = alpha*(T-x) + beta*(T**4-x**4) + gamma*(u**2)

fig = plt.figure(figsize=(6, 6))
plt.plot(x, y, label='Numerical Analysis')
plt.grid(True)
plt.xlim((800, 1500))  # x-axis range (chosen automatically if not set)
plt.ylim((-10, 10))    # y-axis range
plt.legend()           # show the legend
plt.show()             # display all open figures

Result

[Figure: plot produced by my classmate's code]
The plot shows that my classmate's code is correct, and the line is actually curved. I thought the extra plt settings might be responsible, so I copied his plotting code into mine, and the result was still wrong! After a step-by-step analysis, I found that the whole difference came down to a single line of code, shown below:

Code comparison [the only difference is how the independent variable x is generated]

u = 21.0
T = 293.15
alpha = 50.0
beta = 2*(10**(-7))
gamma = 800.0

x = np.arange(1000,1500,1)           # the only difference is the function used to generate x
#x = np.linspace(1000, 1500, 500)    # the only difference is the function used to generate x
y = alpha*(T-x) + beta*(T**4-x**4) + gamma*(u**2)
plt.plot(x, y)

Plot for x = np.arange(1000,1500,1)

[Figure]

Plot for x = np.linspace(1000, 1500, 500)

[Figure]

Plot for x = np.arange(1000,1500,0.1)

[Figure]

At this point I could not help wondering why. With np.arange I also cover 1000 to 1500 with a step of 1, so the independent variable takes the values

[1000, 1001, 1002, …, 1499]

while np.linspace also divides 1000 to 1500 into 500 samples:

[1000, 1001.00200401, 1002.00400802, …, 1498.99799599, 1500]

The results should be similar.
So why does arange need a step of 0.1 to match what linspace achieves with a spacing of about 1?
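To confirm that the two grids really are this close, the spacing used by linspace can be read back with its retstep option. This is only a quick check with illustrative variable names, not part of the original assignment code.

```python
import numpy as np

x_arange = np.arange(1000, 1500, 1)
x_lin, spacing = np.linspace(1000, 1500, 500, retstep=True)

print(len(x_arange), x_arange[0], x_arange[-1])  # 500 samples, 1000 to 1499
print(len(x_lin), x_lin[0], x_lin[-1])           # 500 samples, 1000.0 to 1500.0
print(spacing)                                   # 500/499, slightly more than 1
```

Both calls produce 500 samples about 1 apart, so the sampling density cannot explain the different curves.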

Checking the official API

x = np.arange(0,10,1)
# Output: [0 1 2 3 4 5 6 7 8 9]
x1 = np.linspace(0, 10, 10, endpoint=False)
# Output: [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
x1 = np.linspace(0, 10, 10)
# Output: [ 0.          1.11111111  2.22222222  3.33333333  4.44444444  5.55555556
#           6.66666667  7.77777778  8.88888889 10.        ]

The example above shows that with endpoint=False the two functions produce the same values. But here I noticed something odd: one output consists of integers and the other of floating-point numbers. Could that affect the result? Without much hope, I tried changing the integer step to a floating-point one.

Plot for x = np.arange(1000,1500,1.0)

[Figure]
Good grief: changing the sampling step from 1 to 1.0 makes this much difference. My doubts did not lessen; if anything I was more confused. Why would the difference between integers and floating-point numbers matter here? After all, the course is called Numerical Analysis, so I ought to have some idea. Maybe the expression is too complicated: the integer input goes through a jumble of additions, subtractions, multiplications and divisions, and a difference of +1 or -1 in the argument is negligible. Or maybe, since the dependent variable is on the order of 10^5, small changes are not visible? But my constants are all floating-point numbers.

From MATLAB I knew that the zero lies in [1118, 1119], so I printed the values of the dependent variable at x=1118 and x=1119:

x1 = 1118      # corresponding dependent variable y=572.5297745541902
x2 = 1119      # corresponding dependent variable y=-596.9030544458074
x = np.arange(1110,1120,1.0)
# Output:
# [9820.44892975 8674.86472155 7526.31814255 6374.80385755 5220.31652655
#  4062.85080475 2902.40134255 1738.96278555  572.52977455 -596.90305445]
x = np.arange(1110,1120,1)
# Output:
# [313045.14002735 313617.54273755 313327.98961775 313035.46879195
#  313598.96837935 313300.49611675 312999.04011375 312694.59501595
#  313246.14892335 312935.70955355]
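The two outputs above can be reproduced on any platform by forcing the integer array to 32 bits. This is a sketch under one assumption: that the default integer type in my environment was int32, which dtype=np.int32 imitates here. The helper name f is made up for the example.

```python
import numpy as np

u, T = 21.0, 293.15
alpha, beta, gamma = 50.0, 2 * (10 ** (-7)), 800.0

def f(x):
    """The function from the assignment, evaluated element-wise."""
    return alpha * (T - x) + beta * (T**4 - x**4) + gamma * u**2

x_int32 = np.arange(1110, 1120, 1, dtype=np.int32)  # imitates the failing case
x_float = np.arange(1110, 1120, 1.0)                # floating-point step

wrapped = x_int32 ** 4                 # stays int32 and wraps around silently
exact = x_int32.astype(np.int64) ** 4  # enough room for values near 1.5e12

print(wrapped)
print(exact)
print(f(x_float))   # changes sign between x = 1118 and x = 1119
print(f(x_int32))   # stays large and positive, so the zero disappears
```

The only term that goes wrong is x**4: it is computed in int32 before it ever meets the floating-point constants, so making the constants floats does not help.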

The values for x1 and x2 confirm that the zero is indeed in this range. When the argument is an integer array, however, the computation overflows, and the calculated values are far from the true ones: not only are the positive values wrong, the zero disappears altogether. But the 1119 I typed in by hand is also an integer, so why does it give the correct value? Python 3 integers have arbitrary precision and cannot overflow (9223372036854775807 is only the largest value of a 64-bit signed integer, not a limit on Python's int). NumPy's integer types, on the other hand, correspond to C data types, each with its own fixed range. To avoid overflow you have to specify a larger data type (dtype)!
So the x I typed in by hand was a Python int, which grows as needed, and the function value came out correctly. For x = np.arange(1110,1120,1), however, the type of each element is:

<class 'numpy.int32'>

which can only represent integers from -2147483648 to 2147483647. When x is around 10^3, its fourth power is about 1000000000000 (10^12), far outside that range. That explains the pit I fell into.
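The limits mentioned here can be turned into concrete thresholds for x. The sketch below uses Python's own integers (which do not overflow) to find the largest x whose fourth power still fits in each NumPy integer type; the helper name largest_safe_x is hypothetical and math.isqrt needs Python 3.8 or later.

```python
import math
import numpy as np

def largest_safe_x(dtype):
    """Largest non-negative integer x such that x**4 fits in the given integer dtype."""
    limit = int(np.iinfo(dtype).max)
    return math.isqrt(math.isqrt(limit))  # floor of the fourth root of limit

for dtype in (np.int32, np.int64):
    x_max = largest_safe_x(dtype)
    print(dtype.__name__, np.iinfo(dtype).max, x_max)

print(1000 ** 4 > np.iinfo(np.int32).max)  # 10^12 does not fit in int32
```

A 64-bit integer type moves the threshold much higher but does not remove it, which is one reason a floating-point x is the more robust choice for this kind of curve.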

Solution

From now on, when plotting a curve it is best to use np.linspace, which always produces floating-point values, or np.arange(start, end, 1.0) with a floating-point step. Either way, x is a floating-point array and this integer overflow cannot occur.
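With a floating-point x, the zero can also be located numerically instead of being read off the plot. This is a minimal sketch: the choice of 501 points (which gives a spacing of exactly 1.0) and the sign-change test are additions for illustration, not part of the original assignment.

```python
import numpy as np

u, T = 21.0, 293.15
alpha, beta, gamma = 50.0, 2 * (10 ** (-7)), 800.0

x = np.linspace(1000, 1500, 501)  # floating point, spacing of exactly 1.0
y = alpha * (T - x) + beta * (T**4 - x**4) + gamma * u**2

# Indices i where y[i] and y[i + 1] have opposite signs.
crossings = np.where(np.diff(np.sign(y)) != 0)[0]
brackets = [(float(x[i]), float(x[i + 1])) for i in crossings]
print(brackets)  # [(1118.0, 1119.0)]
```

The single bracket agrees with the interval [1118, 1119] reported by MATLAB.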
It took half a day to finally solve the problem , Give yourself