If you search Google for [offers], Google returns the shuttered Google Offers page as the first search listing. But if you look carefully, you will see that Google has blocked the page from spiders in its robots.txt file, and the page itself says “This site is no longer being supported.”
Here is a screen shot of the search result for [offers] currently in Google:
As you can see, it ranks number one, but the snippet reads, “A description for this result is not available because of this site’s robots.txt – learn more.” When you click through, you get to a page that technically returns a 404 status code, which should tell Google to remove the page from its search results.
Here is a copy of the Google Offers page:
Google seems to be ranking the page for the term [offers] based on the links still pointing to that URL. This is a good example of how old links and a site’s reputation can keep content showing up even after it has changed, been blocked by robots.txt and also 404ed. Google’s crawlers will continue to check the URL to see if the disallow and the 404 status have changed, and if the URL should be included sometime in the future based on those changes.
If you want to completely remove a page from Google’s index, 404ing the page should be enough. I wonder if, because the page is disallowed in the robots.txt, Google cannot crawl it to see the 404 status code, which would leave the page to rank in the search results based solely on links. Google Webmaster Tools has a tool to remove the page completely and you can learn about that over here.
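To make that suspected interaction concrete, here is a minimal sketch in Python using the standard library's robots.txt parser. It is an illustration only, not how Google's crawler works internally: the robots.txt rules, the example.com URLs and the `crawl_decision` helper are all assumptions made up for this example. The point is simply that a crawler that obeys a disallow rule never requests the page, so it never gets the chance to see a 404.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the real file is not reproduced in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /offers
"""


def crawl_decision(url, user_agent="Googlebot"):
    """Say what a robots.txt-obeying crawler would do with this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())

    if not parser.can_fetch(user_agent, url):
        # The page is never requested, so its status code stays unknown.
        return "skipped: disallowed by robots.txt, status code never seen"

    # Only an allowed URL would be requested and its status code read.
    return "fetched: status code (for example 404) is visible to the crawler"


# Hypothetical URLs used only for illustration.
print(crawl_decision("https://www.example.com/offers"))
print(crawl_decision("https://www.example.com/about"))
```

If this is what is happening, lifting the disallow rule would let the crawler request the URL and read the 404 for itself.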
The post What’s Tops For “Offers” On Google? Google Offers – Which No Longer Exists appeared first on Search Engine Land.