Search engine optimization is a constantly evolving industry,
with important changes made almost daily.
In the beginning, search engine optimizers focused on one thing
only: rankings. Tracking of actual sales or conversion rates was almost unheard
of.
Slowly, search engine marketers began to realize that all of the
#1 rankings in the world won’t help if they don’t convert to traffic,
and all the traffic in the world won’t help if it doesn’t convert to
sales.
So, search engine experts began analyzing their log files and
tracking sales. With today’s technology, Web site owners can now tell
which of their marketing campaigns are truly effective and which need to be
replaced by a different campaign. In other words, through log analysis, you can
analyze your human traffic and their visits to your Web site.
But what about your “spider” traffic?
One part of log analysis that has remained surprisingly lacking,
both in terms of content and accuracy, is spider analysis.
Though traffic analysis programs may look at spider activity,
the information often isn’t detailed enough, or isn’t presented in a format
that does you much good. Also, spider and robot activity is often cited
as being a main culprit for inaccurate log analysis measurements.
Therefore, the need for detailed spider analysis has begun
entering the minds of search engine marketers.
So, what is “spider analysis”?
You know that when you submit a Web page to an engine for
indexing, the engine sends a spider to your site to index the contents of the
page. “Spider analysis” is simply analyzing the search engine spider
visits to your site.
Through effective spider analysis, you can learn the following
about your site in a concise, easy-to-read format:
- Has your site been spidered?
- If so, by which engines?
- When did the spiders visit?
- Which directories and pages did they visit?
- Are certain pages getting respidered more often, signaling
their importance to the search engines?
- Are certain pages not getting spidered at all?
- Are the spiders indexing inappropriate content?
- Are the spiders getting everything they want and need, or are
they receiving error messages?
- Was your site spidered within the specified time agreed upon
in the pay inclusion programs you’re participating in?
- Is your site getting respidered on a regular basis, as agreed
upon in your participating pay inclusion programs?
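To make those questions concrete, here is a minimal Python sketch of spider analysis. It assumes your server writes its logs in the common "combined" format and that spiders can be recognized by a short, illustrative list of User-Agent substrings. The function name, the signature list, and the sample log lines are all invented for this example and don’t come from any particular product.

```python
import re
from collections import defaultdict

# Substrings that identify a few well-known spiders in the User-Agent
# field. Illustrative only; a real list would be much longer.
SPIDER_SIGNATURES = {
    "googlebot": "Google",
    "bingbot": "Bing",
    "slurp": "Yahoo",
}

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_spider_visits(lines):
    """Group spider requests by engine: pages visited (with times) and errors."""
    summary = defaultdict(lambda: {"pages": defaultdict(list), "errors": []})
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match["agent"].lower()
        for signature, engine in SPIDER_SIGNATURES.items():
            if signature in agent:
                entry = summary[engine]
                entry["pages"][match["path"]].append(match["time"])
                status = int(match["status"])
                if status >= 400:  # the spider received an error response
                    entry["errors"].append((match["path"], status))
                break
    return summary


# Made-up sample lines (documentation IP addresses, invented timestamps).
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2023:13:56:02 +0000] "GET /old-page.html HTTP/1.1" 404 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.20 - - [11/Oct/2023:08:12:45 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '192.0.2.30 - - [11/Oct/2023:09:00:00 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

summary = summarize_spider_visits(SAMPLE_LOG)
for engine, data in summary.items():
    print(engine, "visited", sorted(data["pages"]), "errors:", data["errors"])
```

The per-page lists of timestamps answer "when" and "how often," and the error list shows which requests the spiders made that your server couldn’t satisfy. Human visitors, like the last sample line, are simply ignored.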
Another important issue in spider analysis is robots.txt files.
Though the object of search engine marketing is to help spiders
find all the pages on your Web site, there will be times when you want to keep
spiders out of certain pages. You can do this with a robots.txt file.
What is a robots.txt file?
A robots.txt file is a text file that is placed on your server
that instructs the search engine spiders not to crawl or index certain sections
or pages of your site.
But why would you want to keep the spiders out?
Let’s say that you’re creating a new page for your
site, and you’ve placed the page online while you tweak and edit it. The
page isn’t ready for visitors, so you wouldn’t want it indexed yet.
Or, let’s say you’ve placed some employee guidelines on your site.
The guidelines are of interest only to your employees, and there’s no
reason for the public to view them.
Using a robots.txt file, you can keep the spiders out of those
pages.
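As a rough illustration, the two cases just described might look like the robots.txt below. The paths are hypothetical, and the Python wrapped around the file is only there so you can check the rules with the standard library’s urllib.robotparser module before uploading anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical paths for the two examples in the text: a page that is
# still being edited and an employees-only directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/new-page.html
Disallow: /employees/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Ask the parsed rules whether `agent` (a made-up name) may fetch `path`."""
    return parser.can_fetch(agent, path)


for path in ("/index.html", "/drafts/new-page.html", "/employees/handbook.html"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Keep in mind that a robots.txt file only asks well-behaved spiders to stay away. It doesn’t password-protect the pages themselves.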
Therefore, when working with spiders or robots, you want to
be able to:
- Create a robots.txt file quickly and easily;
- Use a robots.txt file to present optimized pages to specific
engines. For example, using a robots.txt file, you can focus English language
robots onto the relevant pages and direct robots from international search
engines to the localized content areas of your site;
- Send e-mail harvesting programs away from your site to keep
your e-mail spam down;
- View highlighted pages requested in error by the spiders;
- Direct search engine spiders to relevant areas of your Web
site.
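Here is a minimal sketch of how a file with engine-specific rules could be generated rather than typed by hand. The robot names (EnglishBot, FrenchBot, EmailHarvester) and the directory layout are hypothetical, and the harvester rule only helps with programs that actually honor robots.txt.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(rules):
    """Render {user_agent: [disallowed_path, ...]} as robots.txt text."""
    blocks = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        if paths:
            lines += [f"Disallow: {path}" for path in paths]
        else:
            # An empty Disallow value means nothing is off limits.
            lines.append("Disallow:")
        blocks.append("\n".join(lines))
    # Blank lines separate the record for each robot.
    return "\n\n".join(blocks) + "\n"


# Hypothetical robots and site layout, for illustration only.
RULES = {
    "EnglishBot": ["/fr/"],    # keep the English-language robot on English pages
    "FrenchBot": ["/en/"],     # send the French robot to the localized area
    "EmailHarvester": ["/"],   # shut this robot out of the whole site
    "*": ["/drafts/"],         # every other robot: stay out of unfinished pages
}

robots_txt = build_robots_txt(RULES)
print(robots_txt)

# Check the generated file with the standard library parser.
checker = RobotFileParser()
checker.parse(robots_txt.splitlines())
print(checker.can_fetch("FrenchBot", "/fr/index.html"))
```

Generating the file from a simple table of rules keeps the syntax consistent, and running it back through a parser gives you a quick way to confirm that each robot ends up where you intended.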
How do you create a robots.txt file?
Creating a robots.txt file manually is tricky at best. One
little mistake can make the entire file invalid, leaving your
Web site open to spiders operating of their own free will. Plus, if you’ve
created engine-specific pages with similar content, the spiders may discover
those nearly duplicate pages, and you could find yourself in trouble for
duplicate content.
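If you do edit the file by hand, a small checker can catch the most common slips before the spiders do. The sketch below is an assumption about which checks are useful (a missing colon, a misspelled field name, a rule with no User-agent line above it, a path without a leading slash); it isn’t a complete validator, and the function name is made up.

```python
# Field names this sketch accepts. Sitemap and Crawl-delay are common
# extensions that not every spider supports.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint_robots_txt(text):
    """Return a list of (line_number, message) for lines that look wrong."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, sep, value = line.partition(":")
        field = field.strip().lower()
        value = value.strip()
        if not sep:
            problems.append((number, "missing ':' between field and value"))
        elif field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field == "user-agent":
            seen_user_agent = True
        elif field in {"allow", "disallow"}:
            if not seen_user_agent:
                problems.append((number, "rule appears before any User-agent line"))
            elif value and not value.startswith(("/", "*")):
                problems.append((number, "path should start with '/'"))
    return problems


# A deliberately broken file: four of these five lines contain a mistake.
BROKEN = """\
User agent: *
Disallow /private/
Disalow: /tmp/
User-agent: *
Disallow: private/
"""

for number, message in lint_robots_txt(BROKEN):
    print(f"line {number}: {message}")
```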
While there have been software programs on the market to create
robots.txt files, that’s been their sole function. Spider analysis
hasn’t been a part of their features, until now.
Robot-Manager Professional Edition is a software program that
concentrates totally on spider visits to your site.
Recently, I had the good fortune of reviewing a new software
program that performs two major functions: spider analysis and robots.txt
file creation.
In a nutshell, I’m amazed, and I have to predict that the
software, Robot-Manager Professional Edition, will rapidly become a “must
have” tool for search engine marketers.
To begin with, the program is easy to use. I don’t know
about you, but I don’t have time to learn a complicated software program.
Though the program is intuitive and doesn’t require the reading of a
complicated user’s guide, you’ll still find detailed help topics that
can answer any questions.
The program begins with Step 1, where you choose which spiders
you want to work with. In Step 2, you download your file directory tree, which
is where you begin to tell the program which files you want to keep spiders out
of. The robots.txt file is automatically created in Step 3, and you can
instantly upload the file to your directory. The program allows you to edit the
file manually as well.
In Step 4, the program gets down to the real “meat” of
the analysis: analyzing your log files for spider visits. When I downloaded and
analyzed the log files from the domain I worked with, the program automatically
added 30 new spiders to the spider list.
On the Spider Visits page, I found it particularly helpful to
view the visits by Web page. In this manner, I could quickly see which spiders
have visited which pages of the site and when. Just think of the value of this
information when working with clients. You can prove to them that spiders have
visited their sites, even if the pages aren’t yet indexed. You can also
view spider visits by date or by spider.
Also, international spiders are included in the program, which
is ideal for those sites that are aiming for a corner of the international
market.
Where can you go to see this program for yourself?
Management Tools and look for Robot-Manager 3.0 Professional Edition.
Though the site also offers a Standard Edition of the software, I strongly
advise you to look at the Professional Edition instead. Why? Because the
Standard Edition doesn’t include the spider analysis portion of the
software, which is a “must have,” in my opinion.
You can download a trial version of the software at no cost,
then test drive it before purchasing. With the trial version, only 5 spiders
can be selected and 20 spider visits picked up from the log file.
In Conclusion . . .
As the search engine marketing industry continues to move
forward, three major categories of tools or services need to be considered:
- Web page checking, submission, and ranking tools;
- Web traffic analysis tools for analyzing human visitors to
your site; and,
- Spider analysis tools for analyzing spider activity.
If your current search engine marketing plan doesn’t cover
each of those crucial areas, you need to look into expanding your tool arsenal.
The information you can gain by analyzing your human and spider traffic will
prove invaluable to you as you work toward strengthening your online business.
Robin Nobles teaches 2-, 3-, and 5-day hands-on search engine marketing workshops in locations across the globe (SearchEngineWorkshops.com) as well as online SEO training courses (OnlineWebTraining.com). They have recently launched localized SEO training centers through SearchEngineAcademy.com, and they have expanded their workshops to Europe with Search Engine Workshops UK. They have also opened the first networking community for SEOs, the Workshop Resource Center (WRC).
This work is licensed
under a Creative Commons License.