10 common mistakes using robots.txt on your website

Posted by | Posted in Search Engine Optimization, Website Security | Posted on 09-03-2009

Robots.txt is a plain text file located in the root of a website. It allows the administrator of the website to define which web content is allowed and which is disallowed for the bots which visit the site.

All major search engines, like Google, Yahoo and MSN, follow the Robots Exclusion Protocol. There are several elements that every website owner needs to understand so that their website can be crawled easily. The following are the top 10 common mistakes to avoid while creating a robots.txt file.

1. Adding robots.txt outside the root directory - This is one of the most common mistakes webmasters make: they upload the robots.txt file to the wrong place. It must reside in the root of the domain and must be named “robots.txt”. A robots.txt file uploaded to a subdirectory is not valid, since bots check for the robots.txt file only in the root of the domain name.

User-agent: *
Disallow:
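Because bots request robots.txt only from the root of the host, its location can be worked out from any page URL. The following is a minimal Python sketch of that idea; the helper name robots_url and the example.com address are placeholders for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the host of page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.example.com/blog/2009/post.html"))
```

Whatever subdirectory the page lives in, the result points at the root of the domain, which is the only place a file named robots.txt will be picked up.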

2. Wrong syntax in robots.txt – Another common mistake is using the wrong syntax when creating the robots.txt file. Always double-check the robots.txt file using a tool like Robots.txt Checker.
Here is an example of a wrong entry:

User-agent: *
Disallow: private.html

The entry above is wrong because the path has no leading slash. We advise you to always start a file or directory name with a leading slash character (example: /private.html).
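A check like this is easy to automate. The sketch below is a simple hand-written check, not a full robots.txt validator; the function name paths_missing_slash is made up for this example, and wildcard patterns that some search engines accept are not considered.

```python
def paths_missing_slash(robots_txt: str) -> list[str]:
    """Return Disallow values that do not start with a leading slash."""
    bad = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        field, sep, value = line.partition(":")
        value = value.strip()
        # An empty Disallow value is valid (it allows everything), so skip it.
        if sep and field.strip().lower() == "disallow" and value:
            if not value.startswith("/"):
                bad.append(value)
    return bad


print(paths_missing_slash("User-agent: *\nDisallow: private.html\n"))
```

Any value it returns is a path that should be rewritten with a leading slash.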

3. Adding comments at the end of a line instead of at the beginning – If you wish to include comments in your robots.txt file, put each comment on its own line and precede it with a # sign, like this:

# Here are my comments about this entry.
User-agent: *
Disallow:

4. An empty robots.txt file is almost like not having one – If you have created a robots.txt file in your root directory and there is nothing in it, it is much the same as not having one. Because nothing is disallowed and no User-agent is given, everything is allowed for every bot.
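This can be seen with the robots.txt parser in Python's standard library (urllib.robotparser). It is only one parser and real bots may behave differently in details. The bot name AnyBot and the example.com URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([])  # an empty robots.txt: no lines, so no rules

# With no rules at all, the parser has nothing to disallow.
allowed = rp.can_fetch("AnyBot", "http://www.example.com/private/page.html")
print(allowed)
```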

5. Blocking the pages which you need to get indexed - If you are blocking spider bots and pages using robots.txt, you should have a thorough understanding of the syntax to be used. Any mistake can cause you huge problems with the spider bots, including blocking pages that you want to get indexed.
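One way to guard against this is to test your rules against a list of pages that must stay crawlable before you upload the file. The sketch below uses Python's standard urllib.robotparser; the helper name blocked_pages, the rules and the URLs are all invented for the example.

```python
from urllib.robotparser import RobotFileParser


def blocked_pages(robots_lines, user_agent, must_index):
    """Return the URLs in must_index that the given rules block for user_agent."""
    rp = RobotFileParser()
    rp.parse(robots_lines)
    return [url for url in must_index if not rp.can_fetch(user_agent, url)]


rules = ["User-agent: *", "Disallow: /products/"]
important = [
    "http://www.example.com/",
    "http://www.example.com/products/widget.html",
]
print(blocked_pages(rules, "Googlebot", important))
```

If the list that comes back is not empty, the rules block something you wanted indexed. This only reflects how Python's parser reads the file, so it is a sanity check rather than a guarantee about any particular search engine.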

6. URL paths are case sensitive – URL paths are often case sensitive, so be consistent with your site’s capitalization. WARNING! Many robots and web servers are case-sensitive, so a path written with different capitalization (for example /Private/) will not match any root-level folders named private or PRIVATE.
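The effect of capitalization can be checked with Python's standard urllib.robotparser, which compares paths case-sensitively. The folder name /Private/, the bot name AnyBot and the example.com URLs below are assumptions made for the illustration.

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /Private/"])

base = "http://www.example.com"
# Only the exact capitalization used in the rule is matched.
results = {
    folder: rp.can_fetch("AnyBot", f"{base}/{folder}/page.html")
    for folder in ("Private", "private", "PRIVATE")
}
print(results)
```

Only the folder spelled exactly as in the rule should come back as not fetchable; the other two spellings are left open.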

7. Misspelled robot/user agent names – Spider bots will ignore rules listed under a misspelled User-agent name. Check your raw server logs to find the User-agent names which you need to block, or check UserAgentString.com for a list of user agent names.
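A small test shows why the spelling matters. This sketch uses Python's standard urllib.robotparser as a stand-in for a real bot; the helper name is_blocked, the misspelling "googelbot" and the URL are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(rules, agent, url):
    """Return True if the rules stop the given agent from fetching url."""
    rp = RobotFileParser()
    rp.parse(rules)
    return not rp.can_fetch(agent, url)


url = "http://www.example.com/page.html"
misspelled = ["User-agent: googelbot", "Disallow: /"]
correct = ["User-agent: googlebot", "Disallow: /"]

print(is_blocked(misspelled, "Googlebot", url))
print(is_blocked(correct, "Googlebot", url))
```

With the misspelled name, the bot finds no rule group addressed to it and carries on crawling; only the correctly spelled name takes effect.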

8. Don’t add all the files in one single line – A common mistake is adding all the files and directories under one Disallow line.
For example:

User-agent: *
Disallow: /private/ /images/ /javascript/

This is wrong syntax and robots will not understand this format. The correct syntax, with one Disallow line per path, is given below.

User-agent: *
Disallow: /private/
Disallow: /images/
Disallow: /javascript/
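The difference between the two versions can be tested with Python's standard urllib.robotparser. Treat this as a sketch of how one parser reads the rules; other bots may handle the malformed line differently. The helper name blocked_urls, the bot name AnyBot and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

URLS = [
    "http://www.example.com/private/a.html",
    "http://www.example.com/images/logo.png",
    "http://www.example.com/javascript/app.js",
]


def blocked_urls(rules):
    """Return the entries of URLS that the given rules block for any bot."""
    rp = RobotFileParser()
    rp.parse(rules)
    return [u for u in URLS if not rp.can_fetch("AnyBot", u)]


one_line = ["User-agent: *", "Disallow: /private/ /images/ /javascript/"]
separate = [
    "User-agent: *",
    "Disallow: /private/",
    "Disallow: /images/",
    "Disallow: /javascript/",
]

print(len(blocked_urls(one_line)), len(blocked_urls(separate)))
```

Python's parser treats the whole one-line value as a single path containing spaces, which matches none of the three directories, so nothing is blocked. With separate lines, all three are blocked.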

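9. Relying on an Allow command in robots.txt - The original Robots Exclusion Protocol defines only one rule command, Disallow:, and has no command called Allow:. Major search engines such as Google understand Allow: as an extension, but other bots may ignore it. So if you want to allow bots to visit a page, the safest approach is simply not to list it under Disallow.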

10. Missing the colon – Another mistake is leaving out the colon in a Disallow or User-agent entry. Here is an example of an entry with a missing colon, followed by the corrected entry.

#This is a wrong entry
User-agent: googlebot
Disallow /

#The correct entry
User-agent: googlebot
Disallow: /
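To see what the missing colon does in practice, both entries can be fed to Python's standard urllib.robotparser. This shows how one parser reacts and is not a statement about every bot; the helper name googlebot_blocked and the example.com URL are invented for the sketch.

```python
from urllib.robotparser import RobotFileParser


def googlebot_blocked(rules):
    """Return True if the rules stop googlebot from fetching the home page."""
    rp = RobotFileParser()
    rp.parse(rules)
    return not rp.can_fetch("googlebot", "http://www.example.com/")


wrong = ["User-agent: googlebot", "Disallow /"]   # colon missing
right = ["User-agent: googlebot", "Disallow: /"]  # colon present

print(googlebot_blocked(wrong), googlebot_blocked(right))
```

The line without a colon is skipped as unreadable, so the site stays fully open to the bot even though the file looks as if it blocks everything.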

Please leave a comment if you find any other common mistakes which need to be avoided while writing a robots.txt file. Below are a few useful robots.txt resources and tools.

http://www.mcanerin.com/en/search-engine/robots-txt.asp
http://webtools.live2support.com/se_robots.php
http://googlewebmastercentral.blogspot.com/2008/03/speaking-language-of-robots.html

16 Web Based Handy Web Designer Tools

Posted by | Posted in General, Web Graphic Design | Posted on 04-10-2008

While designing and developing a website, a web designer needs several tools to maximize the effectiveness of the website. A wide range of different tools is presented here, and each of them is useful for a web designer.

  1. Web Page Analyzer Tool
    This tool can be used to test the loading time of your web page. It gives a detailed report on the number and size of the objects on the page. The tool gives a rating for each object: green for good, yellow for caution and red for warning.
  2. W3C CSS Validation Services
    The W3C CSS Validation Service is a web-based tool that helps web designers check that their Cascading Style Sheets (CSS) are valid. CSS can be validated in three ways: by typing the URL, by uploading the file, or by entering the CSS code directly in the input box.
  3. Browser compatibility testing tool
    By using Browsershots you can test your web pages in different browsers. Browser compatibility testing is a critical element of web development practice, and we all know the importance of checking web pages in multiple browsers.
  4. Website Wireframe Tool
    Wireframes are a basic visual guide for any web page layout design. They help maintain design consistency throughout the site and are a very important part of the initial stage of the design process. Gliffy is a free web-based diagram editor. Its wireframe software makes it easy to create website wireframes and to share web mockups with anyone.
  5. Web color picker
    This color picker will help you fine-tune the hex color codes for your web page. Color picker tools are great for exploring colors and color schemes for your web design project.
  6. Screen Resolutions Checker
    Quickly and easily test any website in various screen resolutions. As web visitors use different screen resolutions, web designers need to check their web page design in each of them and make sure it looks good in all of them.
  7. Dummy Text Generator
    A dummy text generator comes in handy for web designers while they wait for the real content. They can use dummy text to fill up that space.
  8. CSS Menu Generator
    This tool will generate both the CSS and the HTML code required to produce a text-based yet appealing set of navigation buttons. It makes it easy to create custom CSS menus without having to know all the complicated HTML and CSS.
  9. Broken Link checker
    Use this tool to check for broken links on a website. Broken links are links that lead nowhere; clicking on one will show an error page.
  10. Browser size checker
    A nifty online tool for setting your browser size while doing web design.
  11. Robot Control Code Generation Tool
    This tool generates robot control code for your site. The main reason you might need a robots.txt file is if you want to prevent search engines from indexing your site.
  12. HTTP header checker
    Check your server to make sure the proper HTTP status codes (200, 301, 302, 304, 307, 404, 410) are being returned in the server headers.
  13. URL rewriting tool
    This tool helps you convert dynamic URLs into static-looking HTML URLs.
  14. htaccess tool
    These tools will create .htaccess and .htpasswd files for you, so you don't have to build them manually.
  15. Rounded Corner and Gradient Generator
    This generates a basic box with rounded corners. It will create four image files and the necessary HTML and CSS code for you to put rounded corners around your content.
  16. Strip Generator
    Strip Generator lets you make your own comic strips. Users can even create their own Strip Blog, which combines the art of creating comic strips with blogging.

