- Blue Hat SEO-Advanced SEO Tactics - http://www.BlueHatSEO.com -

Blue Hat Technique #10-Teaching The Crawlers To Run

Posted By Eli On April 3, 2006 @ 1:54 am In Blue Hat Techniques | 28 Comments

One thing that can only be learned by running quite a few websites at once is how differently the bots treat different sites. One of the biggest differences is how often they pull your pages and how often they update your site in the index. One day while browsing through my stats, I noticed that certain sites get updated in the indexes daily while others get updated monthly. Some sites that only have about 1,000 links get hit by Googlebot 700 times/day, while others that have over 20,000 links only get hit about 30 times/day. This inspired me to begin an experiment.

The Experiment
Being one of the few who paid attention in junior high science class, I did this test the right way and put on a white lab coat (just kidding, but wouldn’t that be cool? Where do you buy those things?). My constants were simple. Each site was a brand new domain with similar keywords, similar competition, and similar searches/day. Each site had extremely similar content and the same template. I also pointed exactly 10 links from the same sites to each site. My variables were also simple. Each site was automatically updated with new pages and new content at random times; the only difference was how many times in one day each site would be updated.

Site 1-Updated 1 time/day

Site 2-Updated 3 times/day

Site 3-Updated 5 times/day

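A minimal sketch of how an update schedule like this could be generated, assuming each site simply picks its update moments uniformly at random within the day. The function names, the `SITES` mapping, and the seed are hypothetical and only illustrate the setup; this is not the script used in the experiment.

```python
import random
from datetime import date, datetime, timedelta

# Updates per day for each test site, taken from the list above.
SITES = {"Site 1": 1, "Site 2": 3, "Site 3": 5}

def random_update_times(day, updates_per_day, rng):
    """Pick distinct random moments within `day`, returned in chronological order."""
    seconds = sorted(rng.sample(range(24 * 60 * 60), updates_per_day))
    midnight = datetime(day.year, day.month, day.day)
    return [midnight + timedelta(seconds=s) for s in seconds]

def build_schedule(day, seed=None):
    """Return {site name: [update datetimes]} for one day (seed is optional)."""
    rng = random.Random(seed)
    return {name: random_update_times(day, count, rng) for name, count in SITES.items()}

if __name__ == "__main__":
    for site, times in build_schedule(date(2005, 6, 1)).items():
        print(site, [t.strftime("%H:%M") for t in times])
```

A cronjob could build a fresh schedule once a day and trigger the site's update script at each of the chosen times.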
Hypothesis
The crawlers behave differently depending on how often the site is updated. The indexes will update more or less frequently depending on how often the site is updated.

Time Frames
I let the sites sit for one month. I closely monitored each site and its progress each day.

Spider Hits After First Month
           Site 1    Site 2    Site 3
MSN        214       478       1170
Google     184       523       957
Inktomi    226       391       514
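
Spider hit counts like these can be pulled out of a server access log. Here is a rough sketch, assuming a log where each request line includes the user-agent string. The substrings in `BOT_SIGNATURES` are assumptions about how each crawler identifies itself, so check them against your own logs.

```python
from collections import Counter

# Assumed user-agent substrings (lowercase); verify against your own logs.
BOT_SIGNATURES = {
    "Google": "googlebot",
    "MSN": "msnbot",
    "Inktomi": "slurp",
}

def count_spider_hits(log_lines):
    """Count requests per search engine by matching user-agent substrings."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for engine, signature in BOT_SIGNATURES.items():
            if signature in lowered:
                hits[engine] += 1
                break  # count each request once
    return hits
```

Passing an open log file works, since a file object iterates line by line, for example `count_spider_hits(open("access.log"))` where `access.log` is a placeholder path.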

Time Frames
Then I monitored the sites for 6 months.

Cache Update Averages After 6 Months
Site 1- MSN: 1.52 times/month Google: 1.4 times/month
Site 2- MSN: 18.24 times/month Google: 4.1 times/month
Site 3- MSN: 21.70 times/month Google: 13.4 times/month
*Yahoo excluded because it’s tougher to tell cache times and date stamps vs. cached pages/title changes.

I also tracked the percentage of each site’s actual pages that were indexed across Google, MSN, and Yahoo:
Site 1-57%
Site 2-81%
Site 3-83%

Conclusion
It is understood that spiders will hit your site for four primary reasons. First, validating a link from another site. Second, checking for changes to your site. Third, reindexing your site. Fourth, pulling robots.txt. With the first and fourth factors neutralized, we can assume the differences in the update and spider stats come from the second and third reasons.

Practical Use
What I take from this experiment is that if you keep your updates consistent and at random times, it will force the bots to revisit your site more often. They will all start visiting your site at consistent intervals depending on your number of links. Once they start to pick up the rhythm of how often your content changes, they will adapt and start visiting more. Once they build that rhythm into their timing, they will update your site in the indexes accordingly.

Therefore a theory can be built. Crawlers are designed to accommodate your site and the practices of the webmaster. Thus, you can train the crawlers on how your site operates, and this will result in differences in performance in the indexes.

Flaws In The Experiment
Looking back at the final results, I wish I had overdone it with a fourth site that updated 100 or 1,000 times a day, to see if it performed better or worse than Site 3. The second flaw falls into the category of seasonal changes. I did this experiment between June 2005 and January 2006. The engines could have been acting differently during those times. I know for a fact that MSN was, because it was so new.


28 Comments To "Blue Hat Technique #10-Teaching The Crawlers To Run"

#1 Comment By George On April 5, 2006 @ 9:48 pm

Very interesting information! Question: were these hand-built sites or auto-generated content?

Do robots treat blogs differently from “traditional” more static sites? Or are the sites treated the same, only crawled more frequently because they’re updated regularly?

Great site, BTW.

#2 Comment By Eli On April 6, 2006 @ 10:25 am

Great question, George. The content was auto-generated but put into static pages. They weren’t blog sites, however. I do think robots treat blogs differently than traditional static sites, but only because blogs are updated at more random intervals than larger sites. Blogging and pinging has its effects as well.

#3 Comment By George On April 20, 2006 @ 1:52 pm

Thanks for the response.

I’ve been reading about the blog/ping cycle lately (just getting started with technical aspects of SEO — no hat yet) and I’m simply not clear on it. Could you do a post about blog/ping?

There are only two benefits that I see:

1. IF it works, you can get new pages indexed fast by blogging a link and then pinging.

2. You can POSSIBLY give your sites worthwhile links by blogging links then pinging.

I’ve read that this is “dead” (definition: anything I’ve heard about — Capri pants, the Decembrists, blogging/pinging) as a technique. What’s your take?

#4 Comment By deeb basheer On April 28, 2006 @ 8:25 pm

can you tell me how to build a self updated website ???

thank you for the great info

deeb basheer

#5 Comment By Eli On April 30, 2006 @ 5:11 am

Sure Deeb,
You will need some experience in coding either cgi or php. Basically you just write all your content and put it into a database. Then write a script to pull one of the sections of content and feed it into the main page. The other way of doing it is to create the pages and then cycle links to them on the main page on a schedule. Either way, you will need to create a cronjob (a scheduled server event).

#6 Comment By josh On February 4, 2007 @ 7:20 pm

Got your cool ass lab coat for you. Just hit me with the size. The wife works for Clinique and they go with the “laboratory” look.

They sell to their employees at $200+/coat but for you, my friend, $0.

Worth every penny for all the sweet advice from an evil genius. Only been here about an hour and you have already taught me a trick or 2. Are any of the methods discussed on this site your favorite?

#7 Comment By Eli On February 4, 2007 @ 8:27 pm

hehe, I have no idea what labcoat size i am :) I wear a mens large shirt if that helps :) Labcoats are badass, I’d totally wear one all the time. I’d be one of those creepy scientists. So if anyone has an evil looking labcoat to hook me up with you can mail it to my office on BlueHatSEO.com whois info.

thanks for the compliments by the way. Feel free to visit anytime.

#8 Comment By neil strauss On January 21, 2008 @ 8:55 am

These days, blogs that release a new post get that post indexed in literally less than an hour!

#9 Comment By Prosperity Writer On March 24, 2008 @ 2:08 am

From your experiment, is it safe to say that putting a blog (mydomain.com/blog, for example) in my non-blog website would improve indexing?

#10 Comment By Forumistan On April 7, 2008 @ 3:59 pm

Great stats man, keep it on…

#11 Comment By beverly farrar On June 22, 2008 @ 11:02 am

According to my website reporting of crawler hits below, it has slowed considerably. What do you think is the cause and how can I remedy this? Thanks so much!

Crawler Hits
June 2008 104
May 2008 151
April 2008 0
March 2008 149
February 2008 136
January 2008 128
December 2007 185
November 2007 160
October 2007 153
September 2007 212
August 2007 277
July 2007 580
June 2007 685
May 2007 11
April 2007 791
March 2007 1201
February 2007 948
January 2007 911
December 2006 746
November 2006 460
October 2006 472
September 2006 796
August 2006 1118
July 2006 673
June 2006 820

#12 Comment By Supermarket Accidents On September 24, 2008 @ 3:20 pm

Cool experiment. I have noticed it myself too but not one for running experiments. Too lazy to start so end up waiting for others and then read about their results :)

#13 Comment By forex faculty On March 7, 2009 @ 9:32 am

Thanks again Eli. have read 4 articles so far and still craving for more

#14 Comment By Jesper Wallin On August 13, 2009 @ 6:02 pm

A really interesting and very useful article. How do these figures add up today, seeing that the experiment was posted more than 3 years ago?

Also, like someone mentioned, how do search engines treat blogs vs “normal” pages? Sure, pinging and such has its effects, but are they positive or negative? As for trackback and pingback protocols, are these links treated as “real” links in the eyes of a search engine?

Keep up the good work Eli!

#15 Comment By Made Easy Forex On September 7, 2009 @ 11:48 am

From your experiment, is it safe to say that putting a blog (mydomain.com/blog, for example) in my non-blog website would improve indexing?

#16 Comment By Sameday payday On October 10, 2009 @ 1:47 am

Through lots of comments on your site, i have known that the site is extremely good for offering latest information.

#17 Comment By Luis On September 20, 2010 @ 1:19 pm

This is a good experiment to try. Lately Google hasn’t updated my blog for a while.

#18 Comment By India Tour Packeges On October 10, 2010 @ 6:51 am

hi,

Eli, Very Nice Post Wow!

#19 Comment By abercrombie milano On May 16, 2011 @ 11:42 pm

I think I am just having some problems with subscribing to the RSS feed here.

#20 Comment By abercrombie deutschland On May 17, 2011 @ 3:19 am

Thanks, I like your blog very much. I come back most days to find new posts like this.

#21 Comment By Computer Tips and Tech Talk On July 11, 2011 @ 1:48 am

Yes, I agree too. Anyway, thanks for sharing!

#22 Comment By kadın On July 29, 2011 @ 4:52 am

I do agree with all of the ideas you have presented in your post. They’re really convincing and will definitely work. Still, the posts are too short for newbies. Could you please extend them a bit from next time? Thanks for the post.

#23 Comment By rumah dijual On October 21, 2011 @ 4:29 am

Great post, thanks Blue Hat. Is the content unique or just scraped from other sites?

#24 Comment By Louboutin On December 21, 2011 @ 8:49 pm

asdfsa

#25 Comment By security guard resume On August 21, 2012 @ 11:38 pm

Does anyone have any example of this in action?

#26 Comment By chong tham On September 8, 2012 @ 5:46 am

Yes, I agree too. Anyway, thanks for sharing!

#27 Comment By thong cong On September 8, 2012 @ 8:18 am

can you tell me how to build a self updated website ???

#28 Comment By Jasmine @ Callme.lk On September 28, 2012 @ 4:56 am

I submitted my site with both programs (demos) and got about 10 successful submissions with Promosoft and about 70 with Robosoft. (Btw, the demo from Robosoft is great, same as the full version with a 30 day limit.)
Obviously there are many other factors, but perhaps the SEs see these links as low quality (or spam)?


Article printed from Blue Hat SEO-Advanced SEO Tactics: http://www.BlueHatSEO.com

URL to article: http://www.BlueHatSEO.com/blue-hat-technique-10-teaching-the-crawlers-to-run/
