Wednesday 17 July 2013

Importance of Robots.txt

The robots.txt file is very important for SEO if you want to rank well in the major search engines, yet many websites don't provide one. A robots.txt file helps keep out unwanted crawlers such as email harvesters and image strippers, and it defines which paths are off limits for spiders to visit. This is useful if you want to hide personal information or private files from search engines.

What is Robots.txt

Robots.txt is a special text file that always lives in the server's root directory. It contains restrictions for spiders, telling them which parts of the site they have permission to read. In effect, robots.txt defines rules for search engine spiders (robots): what to follow and what not to. It should be noted that web robots are not required to respect robots.txt files, but most well-written spiders follow the rules you define.
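To see how a well-behaved spider consults these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The rules and example.com URLs below are made up for illustration, not fetched from a real site.

```python
# Sketch: how a polite crawler checks robots.txt before fetching a URL.
from urllib.robotparser import RobotFileParser

# An illustrative rule set (normally downloaded from the site's root).
rules = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
]

parser = RobotFileParser()
parser.parse(rules)

# Check each URL against the rules before downloading it.
print(parser.can_fetch("*", "http://example.com/index.html"))    # True
print(parser.can_fetch("*", "http://example.com/cgi-bin/form"))  # False
```

A real spider would first download http://example.com/robots.txt and feed its lines to parse() in the same way.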

How to Create Robots.txt


The format of the robots.txt file is simple. It consists of records, and each record consists of two fields: a User-agent line and one or more Disallow lines, each in the form: 
<Field> ":" <value>
The robots.txt file should be saved with Unix line endings. Most good text editors have a Unix mode, or your FTP client should do the conversion for you. Do not use an HTML editor that lacks a plain-text mode to create a robots.txt file.
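If you generate the file from a script, you can force Unix line endings regardless of the operating system. A small sketch in Python, writing an illustrative rule set:

```python
# Sketch: write robots.txt with Unix ("\n") line endings.
# newline="\n" stops Python translating "\n" to "\r\n" on Windows.
records = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "",  # trailing newline at end of file
]

with open("robots.txt", "w", newline="\n") as f:
    f.write("\n".join(records))
```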

User-agent


The User-agent line specifies the robot. For example: 
User-agent: googlebot

You may also use the wildcard character "*" to specify all robots:
User-agent: *
You can find user agent names in your own logs by checking for requests to robots.txt. Most major search engines have short names for their spiders.
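As a sketch of that log check, here is a Python snippet that pulls robot names out of access-log lines by looking for requests to /robots.txt. The log lines are made-up samples in combined log format; real logs vary.

```python
# Sketch: find spider names by their requests for /robots.txt.
import re

log_lines = [
    '66.249.66.1 - - [17/Jul/2013:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "Googlebot/2.1"',
    '157.55.39.1 - - [17/Jul/2013:10:05:00 +0000] "GET /robots.txt HTTP/1.1" 200 68 "-" "bingbot/2.0"',
    '10.0.0.5 - - [17/Jul/2013:10:06:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

agents = set()
for line in log_lines:
    if "/robots.txt" in line:
        # The user-agent string is the last quoted field in combined log format.
        match = re.search(r'"([^"]*)"$', line)
        if match:
            agents.add(match.group(1))

print(sorted(agents))  # ['Googlebot/2.1', 'bingbot/2.0']
```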

Disallow

The second part of a record consists of Disallow directive lines. These lines specify files and/or directories. For example, the following line instructs spiders not to download /contactinfo.htm:
Disallow: /contactinfo.htm
You may also specify directories:
Disallow: /cgi-bin/
This blocks spiders from your cgi-bin directory.

The Disallow directive works by prefix matching. The standard dictates that /bob would disallow both /bob.html and /bob/index.html (the file bob and the files in the bob directory will not be crawled).

If you leave the Disallow line blank, ALL files may be retrieved. At least one Disallow line must be present for each User-agent directive for the record to be valid. A completely empty robots.txt file is treated as if it were not present.
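The prefix-matching behaviour described above can be checked with Python's standard urllib.robotparser; the example.com URLs are placeholders.

```python
# Sketch: "Disallow: /bob" blocks /bob.html and /bob/index.html by prefix.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /bob",
])

for path in ("/bob.html", "/bob/index.html", "/alice.html"):
    # The first two are blocked by the /bob prefix; /alice.html is not.
    print(path, parser.can_fetch("*", "http://example.com" + path))
```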

White Space & Comments


Any line in robots.txt that begins with # is considered a comment. The standard allows comments at the end of directive lines, but this is bad style: 
Disallow: bob #comment

Some spiders will not interpret the above line correctly and will instead attempt to disallow "bob#comment". The moral is to place comments on lines by themselves.
White space at the beginning of a line is allowed, but not recommended:
    Disallow: bob

Examples


The following allows all robots to visit all files because the wildcard "*" specifies all robots.
User-agent: * 
Disallow:

This one keeps all robots out.
User-agent: * 
Disallow: /

The next one bars all robots from the cgi-bin and images directories:
User-agent: * 
Disallow: /cgi-bin/
Disallow: /images/

This one bans the spider Roverdog from all files on the server: 
User-agent: Roverdog 
Disallow: /

This one keeps googlebot from getting at the personal.htm file:
User-agent: googlebot
Disallow: /personal.htm
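That last example can be verified with urllib.robotparser: rules under a specific User-agent apply only to that robot, and other spiders are unaffected. The path carries a leading slash, which parsers expect; "otherbot" is a made-up name for any other spider.

```python
# Sketch: a per-agent rule blocks only the named robot.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: googlebot",
    "Disallow: /personal.htm",
])

# googlebot is blocked; any other robot may still fetch the page.
print(parser.can_fetch("googlebot", "http://example.com/personal.htm"))  # False
print(parser.can_fetch("otherbot", "http://example.com/personal.htm"))   # True
```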


Thursday 23 May 2013

Penguin 2.0 - Update on 22 May 2013


Google rolled out the next generation of the Penguin webspam algorithm, Penguin 2.0, on May 22, 2013. About 2.3% of English-US queries are affected to a degree that a regular user might notice. The change has also finished rolling out for other languages worldwide; the scope of Penguin varies by language, e.g. languages with more webspam will see more impact.
Penguin 2.0 - Updates

This prospect creates fear for many small businesses that depend on search engine optimization (SEO) for their livelihoods. There is also a sense of confusion, as the line often shifts and the message from Google is contradictory.

Matt Cutts:

This is the fourth Penguin-related launch Google has done, but because this is an updated algorithm (not just a data refresh), we’ve been referring to this change as Penguin 2.0 internally. For more information on what SEOs should expect in the coming months, see the video that we recently released.

Added: If there are spam sites that you’d like to report after Penguin, we made a special spam report form at http://bit.ly/penguinspamreport . Tell us about spam sites you see and we’ll check it out.

Thursday 14 March 2013

Importance Of Sitemap.xml


If you have a website, you should use an XML Sitemap to help improve your search engine visibility. What is an XML Sitemap? It is a simple and effective way to give robots a list of all the pages you want them to crawl and index.

Importance Of Sitemap.xml

An XML sitemap saves the search engine spider time in tracking down all the pages of a website. It does not directly affect site ranking, but it allows the crawler to revisit all the pages frequently, which helps the site rank well. Previously, the XML sitemap was important only for Google, but now Yahoo! and MSN also give importance to it.

Use Of XML Sitemap
Using sitemaps has many benefits beyond easier navigation and better visibility to search engines. Sitemaps give you the opportunity to inform search engine spiders immediately about any changes on your website. Of course, you cannot expect the spiders to rush right away to index your changed pages, but the changes will certainly be indexed faster than they would be without a sitemap.
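A minimal sitemap.xml can be generated with Python's standard xml.etree.ElementTree, following the sitemaps.org schema. The example.com URLs and dates below are placeholders.

```python
# Sketch: generate a minimal sitemap.xml with <loc> and <lastmod> entries.
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

urlset = ET.Element("urlset", xmlns=NS)
for page, lastmod in [
    ("http://www.example.com/", "2013-07-17"),
    ("http://www.example.com/about.html", "2013-05-22"),
]:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page
    ET.SubElement(url, "lastmod").text = lastmod

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8",
                             xml_declaration=True)
```

The resulting file lists one <url> entry per page and can be uploaded to the server's root next to robots.txt.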

How do I submit a Sitemap to Google?
XML sitemaps can easily be submitted to Google through Webmaster Tools, under the Site configuration panel. Once the XML file has been uploaded you are done; fingers crossed, you will see some benefit.
From an SEO point of view we know that getting links from external sites is key, but for on-page optimisation, lists of internal links to pages are also a helpful tool. Any ranking benefits seen when using sitemaps are simply a side effect.
Links to sitemap-generating websites:

XML Sitemaps: http://www.xml-sitemaps.com/
SitemapDoc: http://www.sitemapdoc.com/

Wednesday 9 January 2013

Seo Friends


Seo Friends is all about SEO (Search Engine Optimization) factors and techniques which we use to increase the traffic and visibility of your website in the major search engines.

Introduction to Seo
SEO (Search Engine Optimization) is a technique which helps your website rank higher than other websites in the major search engines.
SEO (Search Engine Optimization) is an internet marketing technique which promotes your business online.