JM Internet Group - SEO - Search Engine Optimization - Explained SEO Glossary of Terms

Robots.txt File

The robots.txt file implements the Robots Exclusion Protocol, which website owners use to give instructions about their site to web robots. The protocol is a convention that asks cooperating web spiders and other web robots not to access all or part of a website that is otherwise publicly viewable. Search engines often use robots to categorize and archive websites, and webmasters use them to proofread source code. The standard is separate from, but can be used in conjunction with, Sitemaps, a robot inclusion standard for websites.

Your robots.txt file can also identify the location of your XML sitemap. This is an easy way to tell search engine robots where your sitemap file is, and thereby increase the likelihood that Google, Yahoo, and/or Bing index your site fully.
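As a rough illustration of the sitemap reference described above, the sketch below parses a small robots.txt with Python's standard urllib.robotparser module and reads back the sitemap location. The file contents and the sitemap URL are made up for this example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the sitemap URL is an assumption
# chosen for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
# parse() accepts the file as a list of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

The empty Disallow line places no restrictions on any robot; the Sitemap line simply tells robots where the sitemap file is.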

The robots.txt standard was developed in 1994, when large-scale web indexing became popular; indexers such as Lycos and AltaVista used it. (Wikipedia)

A site owner who wishes to give instructions to web robots must place a text file called robots.txt in the root of the website hierarchy (e.g. www.example.com/robots.txt). This text file should contain the instructions in a specific format. Robots that are programmed to follow the instructions try to fetch this file and read the instructions before fetching any other file from the website. If the file doesn't exist, web robots assume that the site owner wishes to provide no specific instructions.
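Because the file always sits at the root of the site, a robot can work out where to look from any page address. A minimal sketch, assuming a hypothetical helper named robots_url_for and a made-up page URL:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url.

    robots_url_for is a hypothetical helper written for this example.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


location = robots_url_for("http://www.example.com/products/widgets.html?id=7")
print(location)
```

A cooperating robot would fetch that address first, and only then request other pages from the same site.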

A robots.txt file on a website functions as a request that specified robots ignore specified files or directories when crawling. The owner might want this, for example, to keep content out of search engine results for privacy, because the content of the selected directories might be misleading or irrelevant to the categorization of the site as a whole, or so that an application operates only on certain data.
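To show how rules can target particular robots and directories, here is a small sketch using Python's standard urllib.robotparser module. The robot names (ExampleBot, OtherBot) and the directory names are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for a robot called ExampleBot,
# and a default group ("*") for every other robot.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

site = "http://www.example.com"
# ExampleBot matches its own group, so only /private/ is off limits to it.
example_private = parser.can_fetch("ExampleBot", site + "/private/report.html")
example_tmp = parser.can_fetch("ExampleBot", site + "/tmp/cache.html")
# Any other robot falls back to the default group, which blocks /tmp/ only.
other_private = parser.can_fetch("OtherBot", site + "/private/report.html")
other_tmp = parser.can_fetch("OtherBot", site + "/tmp/cache.html")

print(example_private, example_tmp, other_private, other_tmp)
```

These rules are only requests: a robot that does not cooperate with the protocol can still fetch the listed directories, so robots.txt should not be relied on to protect private content.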

Notably, for websites with multiple subdomains, each subdomain must have its own robots.txt file. If example.com has a robots.txt file but a.example.com does not, the rules for example.com do not apply to a.example.com. (Wikipedia)
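The per-subdomain behavior can be sketched by keeping one set of rules per host name. In the example below, the rules_by_host table, the allowed helper, and the ExampleBot name are all assumptions made for illustration; only example.com is given a robots.txt.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical data: example.com has a robots.txt, a.example.com has none.
rules_by_host = {
    "example.com": make_parser("User-agent: *\nDisallow: /private/\n"),
}


def allowed(user_agent: str, url: str) -> bool:
    """Check a URL against the rules of its own host only."""
    host = urlsplit(url).hostname
    parser = rules_by_host.get(host)
    if parser is None:
        # No robots.txt for this host: no specific instructions apply.
        return True
    return parser.can_fetch(user_agent, url)


main_site = allowed("ExampleBot", "http://example.com/private/page.html")
subdomain = allowed("ExampleBot", "http://a.example.com/private/page.html")
print(main_site, subdomain)
```

The same path is blocked on example.com but allowed on a.example.com, because the subdomain's rules are looked up separately and it has none.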

By Noelle Decambra

--------------------------------------------------------------------------------
© 2012, All Rights Reserved

