Little P Weekly Vol.11

2022-07-01 22:06:00 【SenseParrots】

Little P brings you this week's reads worth your time ~

Comments, suggestions, and gripes are all welcome; leave Little P a message ~

Click a link to open the corresponding web page.

Academic frontier

Yann LeCun proposes a bold new vision for AI

The renowned scientist Yann LeCun recently put forward a bold new vision for the next generation of AI.

In this paper, he proposes an architecture and training paradigm for building autonomous agents, addressing questions such as "How can machines learn as efficiently as humans and animals?", "How can machines learn to reason and plan?", and "How can machines learn representations of percepts and action plans at multiple levels of abstraction?"

The architecture combines several concepts, such as a configurable predictive world model, behavior driven by intrinsic motivation, and hierarchical joint embedding architectures trained with self-supervised learning.

You can read LeCun's original paper on OpenReview.

Yandex releases YaLM 100B, the largest open-source GPT-like neural network

Yandex's YaLM family of language models powers its Alice voice assistant, Yandex Search, and other products. The company recently released its largest model, YaLM 100B, for free on GitHub. The project is licensed under Apache 2.0, which permits both research and commercial use.

In the article Yandex published, the team shares lessons learned from training YaLM 100B, including how to speed up model training and how to deal with divergence.

OpenAI's pre-training mitigations for DALL·E 2

To reduce the risks associated with image generation models, OpenAI has put measures in place to keep generated images from violating its content policy.

In this article, OpenAI focuses on one class of these measures: pre-training mitigations, which directly modify the data DALL·E 2 learns from. Because DALL·E 2 is trained on hundreds of millions of captioned images from the internet, OpenAI removes some images and reweights others to change what the model learns.

The article covers three pre-training mitigations:

Filtering violent and sexual images out of the DALL·E 2 training dataset. This keeps the model from generating explicit images from the input text, and from returning images with such content for prompts that have nothing to do with violence or sex.
Mitigating the bias that filtering amplifies. OpenAI found that filtering the training data can amplify biases. For example, without this mitigation, models trained on the filtered data sometimes generated more images depicting men and fewer depicting women than models trained on the original dataset.
Reducing image regurgitation. OpenAI found that models like DALL·E can sometimes reproduce training images rather than create new ones. In practice, this regurgitation was caused by images replicated many times in the dataset, and OpenAI mitigates it by removing images that are visually similar to other images in the dataset.
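
The idea behind the third measure, dropping images that are visually similar to ones already kept, can be illustrated with a toy sketch. This is not OpenAI's pipeline: the function name drop_near_duplicates, the cosine-similarity criterion, the 0.95 threshold, and the two-dimensional feature vectors are all assumptions made for illustration, and a real system would work on learned image features at a far larger scale.

```python
import numpy as np


def drop_near_duplicates(features, threshold=0.95):
    """Return indices of rows to keep, greedily skipping near-duplicates.

    `features` is a hypothetical (n_images, dim) array of image feature
    vectors; two images count as near-duplicates when the cosine similarity
    of their vectors reaches `threshold` (an assumed value).
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # avoid division by zero
    kept = []
    for i in range(len(unit)):
        # Keep image i only if it is not too similar to any image kept so far.
        if all(float(unit[i] @ unit[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy data: rows 0 and 1 point in almost the same direction, row 2 does not.
features = np.array([[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]])
kept = drop_near_duplicates(features)
print(kept)
```

The greedy loop compares each image against everything kept so far, which is quadratic in the worst case; it is only meant to show the filtering rule, not an efficient way to apply it.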

You can read the original article on the OpenAI blog.

Basic technology

100 common NumPy exercises

Drawing on interview questions, mailing lists, and documentation, this site has selected 100 frequently asked questions, with answers, for you to practice on.

The exercises come in three difficulty levels, from one to three stars. How well do you know NumPy?

For example:

One star: What is the result of each of the following expressions?

np.array(0) / np.array(0)
np.array(0) // np.array(0)
np.array([np.nan]).astype(int).astype(float)

One star: Given two sequences, find their intersection. (Hint: use np.intersect1d)
Two stars: Given two arrays with shapes (1, 3) and (3, 1), how do you compute their sum using an iterator? (Hint: use np.nditer)
Three stars: Compute the rank of a matrix. (Hint: use np.linalg.svd or np.linalg.matrix_rank)
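
For readers who want to try the hints, here is a minimal sketch of one possible approach to the last three exercises. The input arrays and the 1e-10 tolerance are made up for illustration, and the answers on the site may be written differently.

```python
import numpy as np

# One star: intersection of two sequences.
a = np.array([1, 2, 3, 4, 5])
b = np.array([4, 5, 6, 7])
common = np.intersect1d(a, b)  # sorted unique values present in both

# Two stars: sum arrays of shape (1, 3) and (3, 1) with an iterator.
x = np.arange(3).reshape(1, 3)
y = np.arange(3).reshape(3, 1)
it = np.nditer([x, y, None])  # None asks nditer to allocate the output array
for xi, yi, out in it:
    out[...] = xi + yi  # the operands are broadcast to shape (3, 3)
summed = it.operands[2]

# Three stars: rank of a matrix whose columns are proportional.
m = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
rank = np.linalg.matrix_rank(m)
# Same idea by hand: count singular values above a small, assumed tolerance.
singular_values = np.linalg.svd(m, compute_uv=False)
rank_from_svd = int(np.sum(singular_values > 1e-10))

print(common, summed.shape, rank, rank_from_svd)
```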

Design philosophy

The art of code comments: does good code really need no comments?

Lei Jun once famously said: "I have never written poetry, but some people say my code is as elegant as poetry." Surely countless engineers pursue the same thing. And in pursuing that elegance, whether to write comments, and how to write them, is a question that cannot be avoided.

Drawing on code from his own work, the author shares his views on comments: make the code itself easy to understand through precise variable naming, layered decomposition of the code, and similar techniques, and reserve comments for complex business logic, magic numbers, and external API definitions.
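
As a small illustration of that advice, the sketch below lets names carry the meaning and saves comments for a magic number and a business rule. Every name and value in it is hypothetical; none of it comes from the article being summarized.

```python
# Hypothetical business rule: an account locks after this many failed logins.
# The comment explains where the number comes from, not what the code does.
MAX_LOGIN_ATTEMPTS = 5

# Magic number given a name and a unit: 30 minutes, expressed in seconds.
SESSION_TIMEOUT_SECONDS = 30 * 60


def is_account_locked(failed_attempts: int) -> bool:
    # The function and variable names say what happens, so no comment is needed.
    return failed_attempts >= MAX_LOGIN_ATTEMPTS


def is_session_expired(idle_seconds: int) -> bool:
    return idle_seconds > SESSION_TIMEOUT_SECONDS


print(is_account_locked(5), is_session_expired(60))
```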

Tool recommendation

Learn a language through hands-on projects

This site collects hands-on tutorials (for example, writing your own database or compiler), organized by programming language, to help readers quickly find simple projects they can build themselves.

Generate tkinter UI code online by drag and drop

When writing small tools in Python, we often use tkinter for the graphical interface. This tool lets you drag and drop widgets on a web page, WYSIWYG style, and generates the Python code automatically.

The tool is open source on GitHub. You can try it online on the demo page.

Thank you for reading. Feel free to leave a message in the comments ~

P.S. If you enjoyed this article, please give it a like so more people can see us :D