
[Academia] Why can't so many AI papers by domestic scholars be reproduced?

2022-07-29 08:03:00 【51CTO】

Andrew Ng (Wu Enda) once said that the key to reading a paper is understanding the author's algorithm.
Yet many papers cannot be reproduced at all. Why is that?

One. Data access

The data the author used is private, and most people cannot get it. In that case, even if the author releases the source code, readers without the data have no way to reproduce the algorithm.

This is very common in domestic academia: no one else has the data. It is like a math-olympiad teacher who sets a problem, solves it himself, and then writes a paper about the solution process. Papers of this kind are often unconvincing, and the story they tell is weak.

Two. Hardware

Many deep learning algorithms get their results through brute force. Some of the algorithms from Google and Facebook, for example, depend on very powerful hardware for training.

Ordinary researchers do not have hardware on that scale, and probably cannot reach even 1% of that computing power, so they have no way to reproduce the algorithm.

Three. Data splits and training methods

Some papers release their code and use public data, but never say how the data was split. When the dataset is small, different splits lead to different results.
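To make the point about splits concrete, here is a minimal sketch using only the Python standard library. The toy labels, the 20% test fraction, the helper names `split_indices` and `majority_baseline_accuracy`, and the majority-class "model" are all assumptions made up for illustration; they do not come from any particular paper. The idea is simply that fixing and reporting the seed makes a split repeatable, and that scoring several seeds shows how much a small dataset's result depends on the split.

```python
import random
import statistics


def split_indices(n_samples, test_fraction, seed):
    """Return (train_idx, test_idx) for a reproducible shuffled split."""
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = max(1, int(round(n_samples * test_fraction)))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def majority_baseline_accuracy(labels, train_idx, test_idx):
    """Accuracy of always predicting the most common training label."""
    train_labels = [labels[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)


# Hypothetical toy dataset: 40 samples, 24 labelled 1 and 16 labelled 0.
labels = [1] * 24 + [0] * 16

scores = []
for seed in range(5):
    train_idx, test_idx = split_indices(len(labels), 0.2, seed)
    scores.append(majority_baseline_accuracy(labels, train_idx, test_idx))

print("per-seed accuracy:", scores)
print("mean:", statistics.mean(scores), "stdev:", statistics.stdev(scores))
```

A paper that states the seed (or publishes the exact index lists) and reports the spread across several splits gives readers something they can actually check.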

Four. The reason everyone knows

Everyone knows this reason, so it needs no spelling out. It shows up in the papers of many domestic authors, although it is rare when public data is used.

Many papers published by domestic scholars follow the same routine:

1. Define a problem that is very new but meaningless;

2. Do "GitHub-oriented programming", that is, assemble the code from existing GitHub repositories;

3. Keep adding attention, modules, normalization, and loss terms to the network until training stops collapsing;

4. Make up a story and write it like a novel. The logic looks clear enough, but readers are given no chance to reproduce the work.

What does an ideal paper look like?

1. The results can be reproduced. The logic of every experiment in the paper is clear, the experiments together form a complete logical chain, public datasets are used, and the reproduced results are essentially the same as those in the paper.

Only the leading figures in the field manage this, such as Chen Tianqi and He Kaiming.

2. The paper uses public data, releases its code, and describes its details clearly, so the reported results can be reproduced. The authors of many such papers cannot explain why the network they designed works well, which is probably down to the poor interpretability of deep learning itself. But because the code is public, works well on public datasets, and reproduces the reported results, it is still a good paper.
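As one illustration of what "describing the details clearly" could mean in practice, the sketch below records the things a reader would need to rerun an experiment: a checksum of the data, the held-out indices, the hyperparameters, and the seed. The manifest layout, the function names `dataset_fingerprint` and `build_manifest`, and the toy rows are hypothetical choices for this example, not an established standard.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """SHA-256 over the rows in order, so any change to the data is detectable."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def build_manifest(rows, test_idx, hyperparams, seed):
    """Collect everything needed to repeat the run in one JSON-friendly dict."""
    return {
        "data_sha256": dataset_fingerprint(rows),
        "n_samples": len(rows),
        "test_indices": sorted(test_idx),
        "hyperparams": hyperparams,
        "seed": seed,
    }


# Hypothetical toy data and settings, for illustration only.
rows = [
    {"x": [0.1, 0.2], "y": 1},
    {"x": [0.3, 0.4], "y": 0},
    {"x": [0.5, 0.6], "y": 1},
]
manifest = build_manifest(
    rows, test_idx=[2], hyperparams={"lr": 0.01, "epochs": 10}, seed=0
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

Publishing a file like this next to the code lets a reader confirm they have the same data and the same split before comparing numbers.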