April 24, 2006
Should I Submit Again?
By Scott Goodyear
Have you launched a new web site for yourself or a client, submitted it to the search engines, waited a few months, and been left wondering whether the site ever made it into their indexes? Maybe you have found the URL you submitted but don't know whether the engines picked up only the URL or the content of your pages as well. You may also wonder from time to time whether you should resubmit. In this article, we will explore some of the methods you can use to answer these questions.
Extended The Invitation To Visit.
When working on the content of a site, gaining inbound links, searching for link partners, etc., you may have submitted a few pages to the major search engines and received a vague notice that your site has been added to a list of sites to potentially crawl and index. Here are a few examples of messages you may have seen:


A few months will likely pass, and at that point the average site's submitted pages usually will not be listed in the top 1000 rankings, let alone the top 100. Should you resubmit? The answer may lie in answering this: was the site's home page, or at least a few pages within the site, even indexed?
Try searching with a few URLs as 'keywords'. For example, search for www.marketposition.com/blog/archives/2004/12/index.html on Netscape.com. This can be an important step: you may not find the URL under a specific keyword or phrase, but you will at least want to know whether the engine is aware of the site. It can be a bit time consuming, since you might need to check for URLs with and without the "www." in them. You may wish to consider using some of the advanced search operators, like:
site:www.site.com
inurl:www.site.com
domain:www.site.com
The actual search operators can vary based on the engine you are using, but most sites will offer an advanced search option or domain search option if you do not wish to bother with search operators. There are also engines like Yahoo! that have an exploration tool to simplify this process.
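The with-and-without-"www." checks described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, and the three operators are simply the ones listed above, which not every engine supports.

```python
# Operators taken from the list above; support varies from engine to engine.
OPERATORS = ("site:", "inurl:", "domain:")


def url_variants(address):
    """Return the address with and without a leading 'www.'."""
    address = address.strip()
    # Drop a scheme if one was pasted in, since the operators take bare hosts.
    for scheme in ("http://", "https://"):
        if address.startswith(scheme):
            address = address[len(scheme):]
    bare = address[4:] if address.startswith("www.") else address
    return ["www." + bare, bare]


def operator_queries(address):
    """Build one query per operator for each variant of the address."""
    return [op + variant for variant in url_variants(address) for op in OPERATORS]


if __name__ == "__main__":
    for query in operator_queries("www.site.com"):
        print(query)
```

Each printed line is a query you could paste into an engine's search box; drop the operators that a given engine does not recognize.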
Did An Engine Just Knock At The Door?
If you have a web analytics package or know how to review your site's log files, you can see whether a search engine robot has been visiting your site. Many search engine robots have obvious names like Googlebot, MSNbot, etc. or unique names like Slurp, rather than the simple IP address/browser name combinations that most of your human visitors will have. If you are taking part in a search engine's paid advertising program, your site may be indexed not only by the normal search spider but also analyzed by an advertising program spider. Advertising and submission services will not typically earn your site an extra rankings boost, but they can mean that your site is indexed faster than it would be if you waited for the spider to crawl it naturally and, incidentally, that your site information is updated more often than a non-advertiser's.
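If you have no analytics package, a short script can do a rough version of this log review. The sketch below assumes your server writes the user agent as the last quoted field on each line (as the common "combined" log format does), and it only looks for the three robot names mentioned above. The sample lines are made up for illustration.

```python
import re
from collections import Counter

# Robot names mentioned in the article; extend the tuple for other engines.
BOT_TOKENS = ("Googlebot", "msnbot", "Slurp")

# Assumption: the user agent is the last double-quoted field on the line.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')


def count_robot_visits(log_lines):
    """Count log lines whose user agent contains a known robot name."""
    visits = Counter()
    for line in log_lines:
        match = USER_AGENT.search(line)
        if not match:
            continue
        agent = match.group(1).lower()
        for token in BOT_TOKENS:
            if token.lower() in agent:
                visits[token] += 1
                break
    return visits


if __name__ == "__main__":
    # Made-up sample lines, not taken from a real log.
    sample = [
        '10.0.0.1 - - [07/Apr/2006:10:00:00 -0500] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
        '10.0.0.2 - - [07/Apr/2006:10:05:00 -0500] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0"',
    ]
    print(count_robot_visits(sample))
```

A user agent string can be forged, so treat the counts as a hint that a robot has stopped by rather than as proof.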
I Can't Believe They Still Have THAT In The Index!
Some time has passed and you've likely had a few pages indexed. From time to time, engines will automatically return to your site to check for updates. When others link to your site, the engines can also follow those links back and re-index your site. This is why larger, more popular sites like CNN.com, BBC.co.uk, etc. are often quite up to date in the engines. Yet when looking at listings from your own site, you may find information that is out of date or different from the content that is currently on the page.
If your indexed content is out of date, first check your analytics package/log files as mentioned previously. Has the engine been by in the past few months? If so, it may simply take a bit longer for the engine to update its database and spread the updates throughout its various data centers. If you do not have this type of option available to you, some engines will allow you to check their 'cache' directly from their search. Here is a search in MSN for the URL:
http://www.marketposition.com/blog/archives/2005/09/controlling_sea.html

Search, then click on the 'Cached page' option.

On the next page, at the top there is a note about when MSN had last examined this page. Since they have visited recently, I would not resubmit this even if the contents of the page had changed on say 4/7/06. The engine's robot should come back to visit the page in the future whether I resubmit or not. If the date is quite old, such as 3-6+ months out of date, you may wish to resubmit if the page has updated information. If the content on the page is the same as the cached copy, there is no reason to resubmit and resubmission at that point will not help to improve your rankings.
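The rule of thumb above can be written down as a small Python check. The 90-day threshold is an assumption that stands in for the low end of "3-6+ months", and the function name is made up for this sketch; adjust both to taste.

```python
from datetime import date

# Assumption: treat a cached copy as stale after roughly three months.
STALE_AFTER_DAYS = 90


def should_resubmit(cached_on, today, content_changed):
    """Suggest resubmitting only when the page changed AND the cache is old."""
    if not content_changed:
        # Same content as the cached copy: resubmitting will not help rankings.
        return False
    return (today - cached_on).days >= STALE_AFTER_DAYS


if __name__ == "__main__":
    today = date(2006, 4, 24)
    # Recently cached, so the robot should come back on its own.
    print(should_resubmit(date(2006, 4, 10), today, content_changed=True))
    # Cached more than six months ago and the page has since changed.
    print(should_resubmit(date(2005, 10, 1), today, content_changed=True))
```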
How 'engine friendly' is your site? Sites that rely heavily on Flash, JavaScript, or other technologies are harder to interpret and index than those based on plain HTML/text. If a search engine has a problem reading your page, it can choose to display a listing from a directory like The Open Directory Project (DMOZ.org) instead. You can try to get your DMOZ listing updated; however, it might be better to fix your page so that it is more search friendly. The text-based Lynx web browser is ideal for displaying a very basic text version of your page, similar in many ways to what an engine might see. If you do not have a DMOZ listing, you may simply end up with an entry for your URL but no description. This is sometimes referred to as a 'partial listing'.
Many engines have multiple data centers. While the engines often update their data centers on a continuous basis, from time to time you could notice that a page seems to be partially indexed, such as a URL with no text description. If you perform a unique text string search, you may find that the page's content has indeed been indexed, but perhaps the information has not yet been linked to your normal listing due to a canonical issue (i.e. your site was indexed twice, once under www.site.com and once under site.com where one or the other shows up with the text listing). If you do find that one or the other is listed with text and the other is not, you may need to work on creating a 301 redirect rule for your site.
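A 301 redirect rule is normally set up in your web server's configuration, and the exact syntax depends on the server. The Python sketch below only illustrates the logic such a rule follows, assuming you have chosen www.site.com as the one canonical host; the function name and return shape are made up for the example.

```python
def canonical_redirect(host, path, canonical_host="www.site.com"):
    """Return (301, location) for the non-canonical twin of the host, else None."""
    host = host.lower()
    if host == canonical_host:
        return None  # Already on the preferred host: serve the page normally.
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    if host in (bare, "www." + bare):
        # Permanent redirect, so engines fold both listings into one.
        return 301, "http://" + canonical_host + path
    return None  # Some unrelated host: this rule does not apply.


if __name__ == "__main__":
    print(canonical_redirect("site.com", "/blog/index.html"))
    print(canonical_redirect("www.site.com", "/blog/index.html"))
```

Whichever host you pick, redirect consistently in one direction so that the engines see a single version of each page.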
So how do you perform a URL verification search? Again, I will use this page: http://www.marketposition.com/blog/archives/2005/09/controlling_sea.html
I can look for unique text that other sites would not typically have. I could probably use the title of the article, a combination of the author name and a keyword from the page, or other text information that is fairly unique like:
'Controlling Search Engine Robots With robots.txt and Other' Google Yahoo MSN
'Scott Goodyear Robots.txt' Google Yahoo MSN
'conserve bandwidth (data transfer) since some engines will completely skip files and folders indicated' Google Yahoo MSN
(Some words like and, or, the, .com, and others are known as noise words or stop words which most engines will ignore because they are so common.)
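To see how much of a candidate phrase is left once the noise words are ignored, you could filter it with a short Python helper. The stop-word set here contains only the examples named above; real engines keep their own, much longer lists, so treat this as a rough sketch.

```python
# Only the examples named in the article; each engine has its own longer list.
STOP_WORDS = {"and", "or", "the", ".com"}


def significant_words(phrase):
    """Return the words of a phrase that are not in the sample stop-word set."""
    return [word for word in phrase.split() if word.lower() not in STOP_WORDS]


if __name__ == "__main__":
    title = "Controlling Search Engine Robots With robots.txt and Other"
    print(significant_words(title))
```

If only one or two words survive the filter, the phrase is probably not unique enough to verify a URL with.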
As an aside, if you keep a list of several URLs along with the unique text for each, you can check these URLs from time to time (once you have first verified that each URL is pulled up with its unique text). If a URL no longer shows up with its unique text, you will know that it is a good time to resubmit it to the engines.
If you are using WebPosition 3 or the newest WebPosition 4 software, a URL verification feature is built into the Reporter and Submitter modules. Simply create your Reporter mission as normal, follow the URL verification steps from just above, and if the URL is found, enter this information into the Reporter mission. Save and close this mission. Create a Submitter mission and check the URL verification option. Save and close this mission. Start a new Scheduler event and set it to run, say, once a month with these two new missions. WebPosition will then run once a month to make sure that your pages are still indexed and, if they are not, resubmit them for you.
