Robots.txt

The Importance of Robots.txt

Although the robots.txt file is very important if you want a good ranking on search engines, many sites fail to include it in their root directory. In most cases this is a mistake. Even if you have a robots.txt file, it may not be written correctly: a recent study of 15,000 websites found that 56% had errors in their robots.txt file. If you would like to validate a robots.txt file, try the robots.txt syntax checker for free. If you are not comfortable writing a robots.txt file yourself, try RoboGen, a handy program that generates robots.txt files automatically. I suggest you read this article anyway to gain a basic understanding of how the file works and to evaluate whether you need one.

What is Robots.txt?

Quite simply, it is a file that tells search engine spiders which pages and files you want indexed and which you don't. It is a simple text file that belongs in the root directory of your server.
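Because the file lives in the root directory, its address can be worked out from any page on the same site. Here is a minimal Python sketch of that idea; the function name robots_txt_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the address of the robots.txt file for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop the query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only for illustration.
print(robots_txt_url("http://www.example.com/support/index.html?page=2"))
```

A file placed anywhere else, such as /support/robots.txt, is not where spiders look for it.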

How do I create a Robots.txt file?

As mentioned above, the robots.txt file is a simple text file. Open a simple text editor to create it. The file is made up of records, and each record contains the instructions for a specific search engine. Each record consists of two fields: a User-agent line and one or more Disallow lines. Here's an example:

    User-agent: slugbot
    Disallow: /cgi-bin/

This robots.txt file would allow "slugbot", which is a fictitious spider, to retrieve every page from your site except for files in the "cgi-bin" directory. All files in the "cgi-bin" directory will be ignored by slugbot.
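If you want to check a record like this yourself, Python's standard library includes a robots.txt parser. The following is a minimal sketch, not a full validator: "slugbot" is the fictitious spider from the example, while "otherbot" and the file names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The record from the example above.
ROBOTS_TXT = """\
User-agent: slugbot
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# slugbot is kept out of the cgi-bin directory ...
blocked = parser.can_fetch("slugbot", "/cgi-bin/search.pl")
# ... but may retrieve every other page.
allowed = parser.can_fetch("slugbot", "/index.html")
# A spider with no matching record is not restricted at all.
other = parser.can_fetch("otherbot", "/cgi-bin/search.pl")

print(blocked, allowed, other)
```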

The Disallow command matches by prefix, much like a trailing wildcard. If you enter

    User-agent: slugbot
    Disallow: /support

both "/support.html" and "/support/index.html", as well as all other files in the "support" directory, would not be indexed by slugbot. Everything that begins with the path you disallow will be disallowed. This is very important to remember. If you leave the Disallow line blank, you're telling the search engine that all files may be indexed. In any case, you must enter a Disallow line for every User-agent record.
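The prefix behaviour is easy to try out with Python's standard robots.txt parser. This is a small sketch under the assumption that a spider matches paths the way that parser does; the helper name check_paths and the extra file names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def check_paths(robots_txt, agent, paths):
    """Return {path: True/False} telling whether agent may fetch each path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


PATHS = ["/support.html", "/support/index.html", "/supporters.html", "/products.html"]

# "Disallow: /support" blocks every path that starts with /support.
prefix_result = check_paths("User-agent: slugbot\nDisallow: /support\n", "slugbot", PATHS)

# A blank Disallow line blocks nothing.
blank_result = check_paths("User-agent: slugbot\nDisallow:\n", "slugbot", PATHS)

print(prefix_result)
print(blank_result)
```

Note that "/supporters.html" is caught as well, even though it is not in the "support" directory. If you only mean the directory, write "Disallow: /support/" with the trailing slash.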

If you want to give all search engine spiders the same rights, use the following robots.txt content:

    User-agent: *
    Disallow: /cgi-bin/

However, this is not advisable. Many malicious spiders harvest personal information for spam. Here is a good list of bots to disallow.
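One way to combine the two ideas is a separate record for each unwanted bot, followed by a general record for everyone else. The sketch below checks such a file with Python's standard parser. "spambot" is a hypothetical name, not a real spider, and keep in mind that robots.txt is only a request: a badly behaved spider can simply ignore it.

```python
from urllib.robotparser import RobotFileParser

# A blank line separates the two records.
ROBOTS_TXT = """\
User-agent: spambot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The hypothetical spambot is shut out of the whole site.
spambot_home = parser.can_fetch("spambot", "/index.html")
# Every other spider falls back to the "*" record.
slugbot_home = parser.can_fetch("slugbot", "/index.html")
slugbot_cgi = parser.can_fetch("slugbot", "/cgi-bin/search.pl")

print(spambot_home, slugbot_home, slugbot_cgi)
```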


Things you should avoid

If you don't format your robots.txt file properly, some or all files of your Web site might not get indexed by search engines. To avoid this, do the following:

  1. Don't use comments in the robots.txt file

    Although comments are allowed in a robots.txt file, they might confuse some search engine spiders.

    "Disallow: support # Don't index the support directory" might be misinterpreted as "Disallow: support#Don't index the support directory".


  2. Don't use white space at the beginning of a line. For example, don't write (note the extra leading spaces)

        User-agent: *
        Disallow: /support

    but

    User-agent: *
    Disallow: /support


  3. Don't change the order of the commands. If you want your robots.txt file to work, the User-agent line must come before its Disallow lines. Don't write

    Disallow: /support
    User-agent: *

    but

    User-agent: *
    Disallow: /support


  4. Don't use more than one directory in a Disallow line. Do not use the following

    User-agent: *
    Disallow: /support /cgi-bin/ /images/

    Search engine spiders cannot understand that format. The correct syntax for this is

    User-agent: *
    Disallow: /support
    Disallow: /cgi-bin/
    Disallow: /images/


  5. Be sure to use the right case. The file names on your server are case sensitive. If the name of your directory is "Support", don't write "support" in the robots.txt file.


  6. Don't list all files. If you want a search engine spider to ignore all files in a particular directory, you don't have to list each file. For example:

    User-agent: *
    Disallow: /support/orders.html
    Disallow: /support/technical.html
    Disallow: /support/helpdesk.html
    Disallow: /support/index.html

    You can replace this with

    User-agent: *
    Disallow: /support


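  7. There is no "Allow" command in the original standard

    Don't rely on an "Allow" command in your robots.txt file. Some crawlers later added support for it as an extension, but spiders that follow only the original standard will not understand it. Only mention files and directories that you don't want to be indexed. All other files will be indexed automatically if they are linked on your site.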
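Several of the points above can be checked mechanically. The following Python sketch is a rough helper, not a complete validator: the function name lint_robots_txt and the wording of the warnings are made up for illustration, and it only looks for comments, leading white space, a Disallow line before any User-agent line, several paths on one Disallow line, and the use of Allow.

```python
def lint_robots_txt(text: str) -> list[str]:
    """Return warnings for a few common robots.txt pitfalls (rough check only)."""
    warnings = []
    seen_user_agent = False
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            seen_user_agent = False  # a blank line ends the current record
            continue
        if "#" in line:
            warnings.append(f"line {number}: comment found")
        if line[0] in " \t":
            warnings.append(f"line {number}: leading white space")
        field, _, value = line.partition(":")
        field = field.strip().lower()
        value = value.split("#", 1)[0].strip()  # ignore any trailing comment
        if field == "user-agent":
            seen_user_agent = True
        elif field == "disallow":
            if not seen_user_agent:
                warnings.append(f"line {number}: Disallow before User-agent")
            if len(value.split()) > 1:
                warnings.append(f"line {number}: more than one path in Disallow")
        elif field == "allow":
            warnings.append(f"line {number}: Allow is not in the original standard")
    return warnings


# A deliberately broken file, used only for illustration.
BAD = "Disallow: /support\n User-agent: *\nDisallow: /support /cgi-bin/ # no\nAllow: /x\n"
for warning in lint_robots_txt(BAD):
    print(warning)
```

The helper cannot check point 5, because it does not know the real directory names on your server.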

Tips and tricks:

1. How to allow all search engine spiders to index all files

    Use the following content for your robots.txt file if you want to allow all search engine spiders to index all files of your Web site:

    User-agent: *
    Disallow:

2. How to stop all spiders from indexing any file

    If you don't want search engines to index any file of your Web site, use the following:

    User-agent: *
    Disallow: /

3. Where to find more complex examples

    If you want to see more complex examples of robots.txt files, view the robots.txt files of big Web sites.
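The first two tips can be compared side by side with Python's standard robots.txt parser. This is a minimal sketch; the helper name may_fetch, the spider name "anybot" and the page name are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"


def may_fetch(robots_txt: str, agent: str, path: str) -> bool:
    """Parse robots_txt and report whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# With a blank Disallow line, any spider may fetch any page.
open_site = may_fetch(ALLOW_ALL, "anybot", "/support/index.html")
# With "Disallow: /", no spider may fetch any page.
closed_site = may_fetch(DISALLOW_ALL, "anybot", "/support/index.html")

print(open_site, closed_site)
```

The same parser can also load a live file through its set_url() and read() methods, which is a convenient way to study the robots.txt files of larger sites.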

Copyright GABCO 2003