
Basic terms: web robots and robots.txt

What is a Web robot?

At its most basic, a web robot is simply a program that automatically and recursively traverses a Web site to retrieve document content and information. Search engine spiders are the most common type of Web robot: they visit Web sites and follow the links they find there to add more information to the search engine's database.
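To make this concrete, here is a minimal sketch of such a robot in Python, using only the standard library. The start URL and the `example-bot` user agent are placeholders, and a production crawler would also need politeness delays and more careful error handling. It honors the site's robots.txt via `urllib.robotparser`, which is where the file in this article's title comes in.

```python
# A minimal sketch of a web robot: starting from one page, it fetches
# the HTML, extracts the links, and recursively visits every page on
# the same site, honoring the site's robots.txt.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, user_agent="example-bot"):  # placeholder agent name
    """Recursively traverse pages that share the start URL's host."""
    host = urlparse(start_url).netloc
    robots = RobotFileParser()
    robots.set_url(urljoin(start_url, "/robots.txt"))
    robots.read()

    visited = set()

    def visit(url):
        if url in visited or not robots.can_fetch(user_agent, url):
            return
        visited.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            return  # unreachable page: skip it
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:
            absolute = urljoin(url, link)
            if urlparse(absolute).netloc == host:  # stay on the same site
                visit(absolute)

    visit(start_url)
    return visited


pages = crawl("https://example.com/")  # placeholder start URL
print(f"Visited {len(pages)} pages")
```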

Web robots often go by different names. You may hear them called:

  • spiders
  • bots
  • crawlers

Web robots are most commonly used to index a site for a search engine, but they can serve other purposes as well. Some of the more common uses are:

  • Link validation – Robots can follow all the links on a site or a page, testing each one to make sure it returns a valid page code. The advantage of doing this programmatically is obvious: the robot can visit every link on a page in a minute or two and report the results far faster than a human checking by hand (a runnable sketch follows this list).
  • HTML validation – Similar to link validation, robots can be sent to various pages on your site to validate the HTML markup.
  • Change monitoring – There are services on the Web that will tell you when a Web page has changed. They work by sending a robot to the page periodically to check whether the content differs from the last visit; when it does, the robot files a report (see the second sketch after this list).
  • Web site mirroring – Similar to the change-monitoring robots, these robots watch a site and, when something changes, transfer the changed content to the mirror site location.
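The link-validation robot in the first bullet can be sketched in a few lines. The URL list below is a hypothetical stand-in for links harvested from a page; note that some servers refuse HEAD requests, in which case falling back to a plain GET is a common refinement.

```python
# A sketch of a link-validation robot: it requests each link and
# reports the HTTP status code, flagging anything that is not valid.
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

links = [  # placeholder links; a real robot would extract these from a page
    "https://example.com/",
    "https://example.com/no-such-page",
]

for url in links:
    try:
        # HEAD asks for headers only, so broken links are found cheaply.
        with urlopen(Request(url, method="HEAD"), timeout=10) as resp:
            print(f"{resp.status}  OK      {url}")
    except HTTPError as err:   # server answered with an error code
        print(f"{err.code}  BROKEN  {url}")
    except URLError as err:    # DNS failure, refused connection, etc.
        print(f"---  ERROR   {url} ({err.reason})")
```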
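The change-monitoring service in the third bullet reduces to fetch, hash, compare. In this sketch the URL and the one-hour interval are placeholder values, and a real service would persist the hash between runs and notify a subscriber instead of printing.

```python
# A sketch of a change-monitoring robot: it periodically fetches a page,
# hashes the content, and "files a report" (here, just prints a line)
# whenever the hash differs from the previous visit.
import hashlib
import time
from urllib.request import urlopen

URL = "https://example.com/"   # placeholder page to watch
INTERVAL = 3600                # placeholder: check once an hour

last_hash = None
while True:
    content = urlopen(URL, timeout=10).read()
    current = hashlib.sha256(content).hexdigest()
    if last_hash is not None and current != last_hash:
        print(f"{URL} changed at {time.ctime()}")  # the "report"
    last_hash = current
    time.sleep(INTERVAL)
```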
