All bloggers want their websites crawled frequently so that their posts, pages, stories, and images get indexed quickly. Search engine crawling is an important factor in blogging: once a web page is indexed, it can start ranking in search engines for different keywords.
But sometimes we need to stop search engine bots from indexing some content on our website, or an entire subdomain or subfolder. The reasons can be security or SEO, for example when the site exists only for trial purposes. When we use a subdomain for testing, it makes sense to block search engines from indexing it.
You can easily generate a customized robots.txt file for your website with our free robots.txt generator.
How to Use Robots.txt to Disallow Search Engine Bots
Place a robots.txt file in the root directory. For example, www.example.com/robots.txt.
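Because crawlers look for the file at the root of each host, a subdomain used for testing needs its own robots.txt. As a minimal sketch, the Python snippet below works out where a crawler would look for the file given any page URL. The function name robots_txt_url and the test.example.com address are hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://www.example.com/blog/post.html?page=2"))
# https://www.example.com/robots.txt
print(robots_txt_url("https://test.example.com/draft/"))
# https://test.example.com/robots.txt
```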
Now you can start adding commands to the file. The two main ones you should know are:
- User-agent refers to the type of bot that will be restricted, such as Googlebot or Bingbot.
- Disallow is the command that keeps those bots out of a specific folder, a specific page, or the whole domain.
If you want to prevent Google’s bot from crawling a specific folder of your site, you can put this command in the robots.txt file:
User-agent: Googlebot
Disallow: /subfolder/
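If you would like to check a rule like this before uploading it, one option is Python's standard urllib.robotparser module. The sketch below is only an illustration: the helper is_allowed and the example.com URLs are assumptions, and Python's parser may not behave exactly like every search engine's own crawler.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Googlebot
Disallow: /subfolder/
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Googlebot is kept out of the folder but can still reach other pages.
print(is_allowed(RULES, "Googlebot", "https://www.example.com/subfolder/post.html"))  # False
print(is_allowed(RULES, "Googlebot", "https://www.example.com/about.html"))  # True
# The rule names only Googlebot, so other bots are unaffected.
print(is_allowed(RULES, "Bingbot", "https://www.example.com/subfolder/post.html"))  # True
```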
If you want to prevent Google’s bot from crawling a specific web page of your site, you can put this command in the robots.txt file:
User-agent: Googlebot
Disallow: /subfolder/page.html
Do you want the robots.txt file to disallow all search engine bots? You can do it simply by putting an asterisk (*) after User-agent. And if you want to prevent them from accessing the entire site, just put a slash (/) after Disallow. Here is how the code looks:
User-agent: *
Disallow: /
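To see how far this rule reaches, the rough sketch below asks Python's urllib.robotparser which bots may still fetch a few pages. The bot list and the page URLs are assumptions chosen for illustration, and the helper allowed_pairs is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def allowed_pairs(rules: str, agents: list[str], urls: list[str]) -> list[tuple[str, str]]:
    """Return every (agent, url) pair that the rules still allow."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [(a, u) for a in agents for u in urls if parser.can_fetch(a, u)]


agents = ["Googlebot", "Bingbot", "DuckDuckBot"]
urls = ["https://www.example.com/", "https://www.example.com/blog/post.html"]

# An empty list means no listed bot may fetch any listed page.
print(allowed_pairs(BLOCK_ALL, agents, urls))  # []
```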
How to Allow Search Engine Bots to Crawl Your Website through Robots.txt
If you want to allow all search engine bots to crawl a specific folder of your site, you can put this command in the robots.txt file:
User-agent: *
Allow: /subfolder/
If you want to allow all search engine bots to crawl a specific file of your site, you can put this command in the robots.txt file:
User-agent: *
Allow: /subfolder/page.html
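Bots may crawl any path that is not disallowed, so Allow is mostly useful as an exception inside a folder that is otherwise blocked. The sketch below assumes such a setup: the Disallow line is added here for illustration and is not part of the examples above. The Allow line is placed first because Python's urllib.robotparser applies the first matching rule, while some search engines pick the most specific rule instead; with this order both approaches should agree.

```python
from urllib.robotparser import RobotFileParser

# Assumed setup: block the folder but keep one page inside it crawlable.
RULES = """\
User-agent: *
Allow: /subfolder/page.html
Disallow: /subfolder/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

base = "https://www.example.com"
# The explicitly allowed page stays crawlable.
print(parser.can_fetch("Googlebot", base + "/subfolder/page.html"))  # True
# Everything else in the folder is blocked.
print(parser.can_fetch("Googlebot", base + "/subfolder/other.html"))  # False
# Paths outside the folder are crawlable by default.
print(parser.can_fetch("Googlebot", base + "/index.html"))  # True
```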
Caution When Modifying Robots.txt
It is very important to understand the code before modifying the robots.txt file, because a wrong rule in this file can remove your entire website from search engines. So code it carefully.
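One way to reduce that risk is to test a draft against the pages you most need indexed before uploading it. The following is a minimal sketch using Python's urllib.robotparser; the function blocked_urls, the URL list, and both drafts are hypothetical, and a real check should also use the testing tools that the search engines themselves provide.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(rules: str, user_agent: str, important_urls: list[str]) -> list[str]:
    """Return the important URLs that user_agent would not be allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [url for url in important_urls if not parser.can_fetch(user_agent, url)]


# Hypothetical pages that must stay crawlable.
IMPORTANT = [
    "https://www.example.com/",
    "https://www.example.com/blog/first-post.html",
]

INTENDED = "User-agent: *\nDisallow: /subfolder/\n"
MISTAKE = "User-agent: *\nDisallow: /\n"  # one missing folder name blocks the whole site

print(blocked_urls(INTENDED, "Googlebot", IMPORTANT))  # []
print(len(blocked_urls(MISTAKE, "Googlebot", IMPORTANT)))  # 2
```

An empty list suggests the draft leaves the important pages reachable; any URL in the result deserves a second look before the file goes live.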