Yandex robots.txt
2022-07-28 05:23:00 【oHuangBing】
robots.txt is a text file that contains site indexing parameters for search engine robots.
Yandex supports the Robots Exclusion Protocol with advanced features.
When crawling a site, the Yandex robot loads the robots.txt file. If the latest request to that file shows that a page or section of the site is prohibited, the robot will not index it.
Yandex robots.txt file requirements
Yandex robots can process robots.txt correctly only if the following requirements are met:
The file size does not exceed 500 KB.
It is a TXT file named "robots", that is, robots.txt.
The file is located in the root directory of the site.
The file is available to robots: the server hosting the site responds with the HTTP status code 200 OK. You can verify this by checking the server response.
If the file does not meet these requirements, the site is considered open for indexing, meaning the Yandex search engine can access its content without restriction.
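The size and status requirements above can be sketched as a small check. This is only an illustration, not Yandex's implementation: the function names are hypothetical, and reading "500 KB" as 500 * 1024 bytes is an assumption. The file name and location requirements are not covered because they concern the URL being requested rather than the response.

```python
MAX_ROBOTS_SIZE = 500 * 1024  # assumption: "500 KB" interpreted as 500 * 1024 bytes


def robots_file_is_usable(status_code: int, body: bytes) -> bool:
    """Return True when the response has status 200 and stays within the size limit."""
    return status_code == 200 and len(body) <= MAX_ROBOTS_SIZE


def effective_robots_text(status_code: int, body: bytes) -> str:
    """Return the text to parse, or "" (no restrictions) when the file is unusable."""
    if not robots_file_is_usable(status_code, body):
        # Per the rule above, the site is then treated as open for indexing.
        return ""
    return body.decode("utf-8", errors="replace")
```

Returning an empty string for an unusable file mirrors the rule that such a site is treated as open for indexing: there are simply no directives to apply.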
Yandex supports redirecting from robots.txt to a file located on another site. In that case, the directives in the target file are applied. This kind of redirect can be useful when moving a site.
How Yandex reads robots.txt
In the robots.txt file, the robot checks for records that start with User-agent: and looks for either the string Yandex (case-insensitive) or *. If a User-agent: Yandex record is found, the User-agent: * record is ignored. If neither User-agent: Yandex nor User-agent: * is found, the robot is considered to have unlimited access.
You can also write separate directives for individual Yandex robots.
For example:
User-agent: YandexBot # Applies only to the main indexing robot
Disallow: /*id=

User-agent: Yandex # Applies to all Yandex robots
Disallow: /*sid= # except the main indexing robot

User-agent: * # Not used by Yandex robots
Disallow: /cgi-bin
According to the standard, you should insert a blank line before every User-agent directive. The # character starts a comment: everything after it, up to the first line break, is ignored.
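The block selection and comment handling described above can be sketched in Python. This is a simplified illustration under stated assumptions, not Yandex's parser: the function names are hypothetical, only Allow and Disallow lines are collected, and the lookup order (the robot's own name, then Yandex, then *) is inferred from the example above.

```python
def parse_groups(text: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercased User-agent value to its (directive, value) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current: list[str] = []  # agents named by the block currently being read
    reading_agents = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop everything after '#'
        if ":" not in line:
            continue  # blank or malformed line
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                current = []  # a new block starts here
                reading_agents = True
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            reading_agents = False
            for agent in current:
                groups[agent].append((field, value))
    return groups


def select_rules(groups: dict[str, list[tuple[str, str]]], robot_name: str) -> list[tuple[str, str]]:
    """Pick the block for a robot: its own name first, then 'yandex', then '*'."""
    for key in (robot_name.lower(), "yandex", "*"):
        if key in groups:
            return groups[key]
    return []  # no matching block: access is unrestricted
```

With the example file above, `select_rules(groups, "YandexBot")` would return only the `/*id=` rule, while any other Yandex robot name would fall through to the `User-agent: Yandex` block.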
robots.txt Disallow and Allow directives
Disallow directive. Use this directive to prohibit indexing of site sections or individual pages, for example:
Pages containing confidential data.
Pages with site search results.
Site traffic statistics.
Duplicate pages.
Logs of various kinds.
Database service pages.
Here are examples of the Disallow directive:
User-agent: Yandex
Disallow: / # Prohibits crawling of the entire site

User-agent: Yandex
Disallow: /catalogue # Prohibits crawling of pages whose URL starts with /catalogue

User-agent: Yandex
Disallow: /page? # Prohibits crawling of pages whose URL starts with /page? (URLs with parameters)
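To make the matching in these examples concrete, here is a minimal Python sketch of how a rule value could be compared with a URL path. It assumes that a rule matches from the start of the path and that * stands for any run of characters, as in the /*id= example earlier. The name rule_matches is hypothetical, and this is not presented as Yandex's actual matcher.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if `path` starts with `pattern`, where '*' matches any characters."""
    # Escape every literal piece so that characters such as '?' are not treated as regex syntax.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    # re.match anchors at the start of the string, which gives prefix semantics.
    return re.match(regex, path) is not None
```

For instance, `rule_matches("/catalogue", "/catalogue/shoes")` is true because the path starts with the rule value, while `rule_matches("/page?", "/page")` is false because the path has no question mark.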
robots.txt Allow directive
This directive allows indexing of site sections or individual pages. Here is an example:
User-agent: Yandex
Allow: /cgi-bin
Disallow: /
# Prohibits indexing of any page except those whose URL starts with '/cgi-bin'

User-agent: Yandex
Allow: /file.xml
# Allows indexing of the file.xml file
Combining directives in robots.txt
Within the matching User-agent block, the Allow and Disallow directives are sorted by URL prefix length (from shortest to longest) and applied in that order. If several directives match a particular page, the robot uses the last one in the sorted list. As a result, the order of directives in the robots.txt file does not affect how the robot applies them.
# Source robots.txt:
User-agent: Yandex
Allow: /
Allow: /catalog/auto
Disallow: /catalog

# Sorted robots.txt:
User-agent: Yandex
Allow: /
Disallow: /catalog
Allow: /catalog/auto
# Prohibits indexing of pages whose URL starts with '/catalog',
# but allows indexing of pages whose URL starts with '/catalog/auto'
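The sort-then-last-match rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Yandex's implementation: the function name is hypothetical, rules are treated as literal prefixes (no wildcards), and rules with prefixes of equal length simply keep their file order because the text above does not cover ties.

```python
def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Sort rules by prefix length and let the last matching rule decide."""
    ordered = sorted(rules, key=lambda rule: len(rule[1]))  # shortest prefix first
    verdict = True  # no matching rule: access is not restricted
    for directive, prefix in ordered:
        if path.startswith(prefix):
            verdict = directive.lower() == "allow"  # later (longer) match overrides
    return verdict


# The two orderings from the example above give the same answers.
unsorted_rules = [("Allow", "/"), ("Allow", "/catalog/auto"), ("Disallow", "/catalog")]
sorted_rules = [("Allow", "/"), ("Disallow", "/catalog"), ("Allow", "/catalog/auto")]
for rules in (unsorted_rules, sorted_rules):
    print(is_allowed(rules, "/catalog/auto/bmw"), is_allowed(rules, "/catalog/books"))
```

Because the rules are sorted before they are applied, both rule lists produce identical verdicts, which is the point the paragraph above makes about file order.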
Summary
That covers the main rules Yandex crawlers follow when reading robots.txt. With the directives described above, you can allow or prohibit Yandex crawlers from crawling specific pages or sections of your site.