How I solved the “Sitemap URL blocked by robots.txt” error in Google Search Console

I was in Google's webmaster tools adding my sitemap so that my website would get indexed by Google (for example: https://www.google.com/webmasters/tools/robots-testing-tool?hl=en&siteUrl=http://35.200.180.90/) when an error popped up saying:

“Url blocked by robots.txt.”

I tried editing the rules and submitting them in the webmaster tool, but after one refresh it was back to the same old code:

# Added by Google

# Modify or delete lines below to allow robots to crawl your site.

# Block all robots

User-agent: *

Disallow: /
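If you want to see for yourself why those two lines stop everything, here is a minimal sketch using Python's standard-library urllib.robotparser. The domain http://www.yoursite.com and the helper name is_allowed are placeholders of mine, and Python's parser is not Google's crawler, so treat this as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# The rules that kept coming back in my robots.txt.
BLOCKING_RULES = """\
# Block all robots
User-agent: *
Disallow: /
"""

def is_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Placeholder URLs: swap in your own domain.
blocked_home = is_allowed(BLOCKING_RULES, "http://www.yoursite.com/")
blocked_sitemap = is_allowed(BLOCKING_RULES, "http://www.yoursite.com/sitemap.xml")
print("home page allowed:", blocked_home)
print("sitemap allowed:", blocked_sitemap)
```

Because "Disallow: /" matches every path on the site, both the home page and the sitemap are refused, which is exactly what Search Console was complaining about.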

The first thing that came to mind after a few Google searches was to install the Yoast SEO plugin.

Log in to your WordPress website. Once you’re logged in, you will be on your ‘Dashboard’. In the menu on the left-hand side, click on ‘SEO’.

Then click on ‘Tools’, and then on ‘File Editor’.

After I clicked on File Editor, it showed:

“If your robots.txt were writable, you could edit it from here.”

This meant I didn’t have permission to edit the file. 🙁 There was no button to create a new one either.

Second attempt: Virtual Robots.txt

I installed a plugin called “Virtual Robots.txt”. In its settings window I saw my new robots.txt code... yay!

I went back to Search Console and checked the sitemap again, but Google was still fetching the same old “Disallow” code.

Then I noticed a small note in the Virtual Robots.txt plugin settings saying: “If your robots.txt file doesn’t match what is shown below, you may have a physical file that is being displayed instead.” The rules the plugin displayed were:

User-agent: *

Disallow: /wp-admin/

Allow: /wp-admin/admin-ajax.php

Disallow: /wp-includes/

Allow: /wp-includes/js/

Allow: /wp-includes/images/

Disallow: /trackback/

Disallow: /wp-login.php

Disallow: /wp-register.php
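To make sense of how the Allow and Disallow lines above interact, here is a small sketch of the "most specific rule wins" matching that Google documents for its crawler: the longest matching path prefix decides, and Allow wins a tie. It is my own simplification, not Google's implementation: it ignores wildcards and User-agent grouping, and the function names and sample paths are made up for illustration.

```python
PLUGIN_RULES = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-includes/
Allow: /wp-includes/js/
Allow: /wp-includes/images/
Disallow: /trackback/
Disallow: /wp-login.php
Disallow: /wp-register.php
"""

def parse_rules(robots_txt):
    """Collect (is_allow, path_prefix) pairs; User-agent lines are skipped."""
    rules = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if sep and field in ("allow", "disallow") and value:
            rules.append((field == "allow", value))
    return rules

def path_allowed(rules, path):
    """Longest matching prefix wins; Allow wins ties; no match means allowed."""
    best_len, best_allow = -1, True
    for allow, prefix in rules:
        if path.startswith(prefix):
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_len, best_allow = len(prefix), allow
    return best_allow

rules = parse_rules(PLUGIN_RULES)
# Hypothetical paths, chosen only to exercise the rules above.
for path in ("/wp-admin/options.php", "/wp-admin/admin-ajax.php",
             "/wp-includes/js/jquery.js", "/hello-world/"):
    print(path, "->", "allowed" if path_allowed(rules, path) else "blocked")
```

Under these rules the admin area stays blocked while admin-ajax.php and ordinary posts remain crawlable, which is what I wanted Google to see instead of the blanket "Disallow: /".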

That was exactly my situation. I was hosting on Google Cloud, and by default the Google Cloud hosting server creates a physical robots.txt file that DISALLOWS search engines. That physical file is served instead of the virtual one a plugin generates, which is why none of my changes stuck. If you have a WordPress website hosted on Google Cloud, you need to edit the file manually to ALLOW indexing; until you do, search engines have no permission to index your site.

These instructions will guide you through the process of editing your robots.txt file on Google Cloud.

SSH for Google Cloud Platform

Log in to https://cloud.google.com/, navigate to https://console.cloud.google.com/ and open the Resources tab.

Write down the INSTANCE NAME and ZONE of the VM instance that runs your WordPress blog; you may need to enter them later.

Activate “CLOUD SHELL” from the top right corner on your Cloud Console dashboard.

After Cloud Shell opens, type in: gcloud compute ssh [INSTANCE_NAME]

Use the instance name you noted in the previous step, without the brackets.

If you are using the SSH command for the first time, your SSH keys need to be set up. Just type in the command above and gcloud will walk you through it.

If you are asked for a username or passphrase, enter something simple like your website name. A second prompt asks you to define a password; this is optional, so you can leave it blank and press ENTER.

At the Cloud Shell prompt, type this and press ENTER: sudo nano /var/www/html/robots.txt

The robots.txt file is now open in the nano editor. It does not behave like a normal graphical text editor: you cannot click to place the cursor, so move to each character with the arrow keys and delete or replace it there.

Change “Disallow” to “Allow” on the LAST LINE while retaining the same formatting.

This is my new robots.txt file, which ALLOWS indexing:

#

# Added by Stocksonfire

# Allow all robots

User-agent: *

Allow: /
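If you would rather not hand-edit in nano, the same one-line change can be sketched in Python. This is only an illustration of the edit described above, not something I ran on my server: the function name allow_all is made up, and actually writing the result back to /var/www/html/robots.txt would still need root permissions.

```python
def allow_all(robots_txt):
    """Turn the last blanket 'Disallow: /' line into 'Allow: /'.

    Every other line, including rules such as 'Disallow: /wp-admin/',
    is left untouched.
    """
    lines = robots_txt.splitlines()
    for i in range(len(lines) - 1, -1, -1):  # scan from the last line upward
        field, _, value = lines[i].partition(":")
        if field.strip().lower() == "disallow" and value.strip() == "/":
            lines[i] = "Allow: /"
            break
    return "\n".join(lines) + "\n"

old_rules = "# Block all robots\nUser-agent: *\nDisallow: /\n"
new_rules = allow_all(old_rules)
print(new_rules)
```

Keeping the transformation as a pure function over text means you can check the output first and only then decide how to write it to the server.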

After you have modified the file, press CTRL+O (the letter O, not zero), then press ENTER to confirm. This saves the changes you made to the robots.txt file. Press CTRL+X to exit nano.

And with that, my problem was solved. Check out www.legenesis.com

You can check your website’s robots.txt file once again at the following location: http://www.yoursite.com/robots.txt
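For a quick scripted version of that final check, here is a minimal sketch that downloads the live file and lists any line that still blocks the whole site. The function names are hypothetical and http://www.yoursite.com is a placeholder, so the network call is left commented out.

```python
from urllib.request import urlopen

def fetch_robots(site_url, timeout=10):
    """Download the robots.txt served at the site root (needs network access)."""
    with urlopen(site_url.rstrip("/") + "/robots.txt", timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def blanket_disallows(robots_txt):
    """Return every line that disallows the entire site ('Disallow: /')."""
    hits = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip() == "/":
            hits.append(raw.strip())
    return hits

# Example (replace the placeholder domain with your own before running):
# print(blanket_disallows(fetch_robots("http://www.yoursite.com")))
```

An empty list means no blanket block is left in the file that is actually being served, which is the one Google reads.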
