April 2006


General Articles - 07 Apr 2006 03:46 pm

One thing I've always been curious about is the origin of doorway pages. Who originally invented them? I'm not naive enough to believe I was the first one to think of it, but I do have an interesting background story for whoever is interested.

It was my freshman year of high school, around spring/summer of 1996. I had some weird entertainment/wav file website (kegger.com - and no, I don't still own that domain). Back then Webcrawler and Altavista ruled the search engines. I was reading a lot of the limited material out there on how search engines worked when I thought, "Wait a minute, why should I ruin my pages by putting this crap into them? Why not just create separate pages that are perfect for the keywords I want and have them forward to my main site?" So I manually made about 20 of them, and they worked so damn well that I started making more and more and more. Before this I had never heard of it being done and never seen it (although it probably did exist; it was just so private I never got to see it). Either way, they worked extremely well and were producing a ton of traffic for me on every keyword I would use. It wasn't long before I ranked for one-word terms like Music, MP3 (the hottest new craze at the time), download, and software, to name a few. By today's standards, of course, my site would be worth millions if I ranked for any one of those terms. They were so great I started calling them my "hitwhores" (haha... hey, I was only 14 at the time, gimme a break).

After a couple weeks I received an email from Altavista asking me what the deal was with these pages that were ranking well for extremely popular words like music and download, yet didn't do anything but forward to my main site. I told them what the pages were for. They responded and told me I needed to take them off my site or my site would be banned. I told them, in words less professional, to get lost, and that if they banned me I would just copy the files over to a bunch of other servers like Geocities. They emailed me back and said they would sue. I told them to go ahead, and by the way, I'm only 14 and all my stuff is located on servers in other countries. They didn't respond.

Then about two days later I noticed their URL submit page had a link that said "Attention webmasters: don't do this or you will be banned" (or something along those lines). I clicked on the link and it described perfectly, in detail, what I had done; they even changed my wonderful term for it into something more politically correct, like doorway pages. In fact, those idiots basically posted instructions to the world. I emailed them and told them they were idiots for doing that and that now everybody was going to do it. They never responded. Needless to say, it wasn't too long after that that it became a big popular thing and ruined the SERPs for a long time afterward.

Anyway, that's my story. I rarely tell it since no one would probably believe it, but either way, that's my story and I'm sticking to it. I know the saying that no idea is an original one, so I'm sure someone else thought of it and did it before I did, but it's nice to think that I at least helped make it popular.

If anyone else has any cool stories about the origins of blackhat techniques or things that became Internet standards, I would love to hear them. Please post them in the comments or in the forum.

Random Thoughts - 06 Apr 2006 10:36 am

Wow, when did BlueHatSEO.com get a PageRank 4? I could have sworn it wasn't there yesterday. Haha... oh well, whatcha gonna do? It's not like any sites link to me anyway. I wasn't even aware I was in the search engines.

If anyone is in the market for a PR4 link, I'll sell you one for a measly 300k dollars. Or you can just leave a comment :)

Random Thoughts - 03 Apr 2006 10:41 pm

With only about 1,200 visitors/day to this site I'm no mover and shaker, but since my post about an Alexa Cheat my Alexa ranking has been rising. HAHA. C'mon people, start sending your friends so I can be in the English Top 100. Kidding, kidding, don't crash my site :)

Can you tell which day I put up that rel=prefetch link?

FYI! In case anyone has been keeping track, I have reached my 50th post! Not bad for a site that's only about two months old. Although I am still really wishing I could get some more authors in here to give their insights. It would be nice to hear some other people's thoughts.

Random Thoughts - 03 Apr 2006 02:37 am

I found an interesting post on The Pyramids of Google. It links to the Alexa traffic log for GoogleSyndication.com, which is the domain Google AdSense serves its links through. This is actually pretty shocking. It shows the top sites that send GoogleSyndication.com traffic; in other words, which sites send the most traffic through AdSense. They are mostly Chinese sites! I don't understand how that is even possible. Crazy stuff though.

Blue Hat Techniques - 03 Apr 2006 01:54 am

One thing that can only be learned by running quite a few websites at once is how differently the bots treat each site. One of the biggest differences is how often they pull your pages and how often they update your site in the index. One day while browsing through my different stats, I noticed that certain sites get updated in the indexes daily while some get updated monthly. Some sites with only about 1,000 links get hit by Googlebot 700 times/day, while others with over 20,000 links only get hit about 30 times/day. This inspired me to begin an experiment.
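If you've never pulled these numbers yourself, they're easy to get out of raw access logs. Here's a minimal sketch of the kind of per-crawler counting I'm talking about, assuming a standard Apache combined-format log; the log path and user-agent substrings are placeholders, not my actual setup:

```python
# Rough sketch: count daily crawler hits from an Apache combined-format log.
import re
from collections import defaultdict
from datetime import datetime

# Substrings that identified each engine's crawler in the User-Agent header
# around 2005-2006 (Yahoo's Slurp descends from the old Inktomi crawler).
CRAWLERS = {"Google": "Googlebot", "MSN": "msnbot", "Inktomi": "Slurp"}

DATE = re.compile(r"\[(\d{2}/\w{3}/\d{4}):")  # e.g. [03/Apr/2006:01:54:00 -0800]
hits = defaultdict(int)  # (date string, engine) -> hit count

with open("access.log") as log:  # placeholder path to your access log
    for line in log:
        match = DATE.search(line)
        fields = line.split('"')  # combined format quotes request/referer/UA
        if not match or len(fields) < 6:
            continue  # not a combined-format entry
        date, agent = match.group(1), fields[5]  # fields[5] is the User-Agent
        for engine, marker in CRAWLERS.items():
            if marker in agent:
                hits[(date, engine)] += 1

# Print hits in chronological order, one line per day per crawler.
for (date, engine), count in sorted(
    hits.items(), key=lambda kv: (datetime.strptime(kv[0][0], "%d/%b/%Y"), kv[0][1])
):
    print(date, engine, count)
```

Matching on user-agent substrings like this is close enough for eyeballing trends, though keep in mind user agents can be spoofed.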

The Experiment
Being one of the few who paid attention in junior high science class, I did this test the right way and put on a white lab coat (just kidding, but wouldn't that be cool? Where do you buy those things?). My constants were simple: each site was a brand new domain with similar keywords, similar competition, and similar searches/day. Each site had extremely similar content and used the same template. I also pointed exactly 10 links from the same sites to each site. My variables were also simple: each site was automatically updated with new pages and new content at random times; the only difference was how many times per day each one got updated (a stripped-down sketch of this kind of updater follows the list below).

Site 1-Updated 1 time/day

Site 2-Updated 3 times/day

Site 3-Updated 5 times/day
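
For those wondering what "automatically updated at random times" looks like in practice, here's a stripped-down sketch; the output path, the page template, and the scheduling loop are placeholder assumptions, since my real scripts did actual content generation:

```python
# Stripped-down sketch of a random-time site updater like the one described
# above. SITE_ROOT and the page template are placeholders; real content
# generation is out of scope here.
import random
import time
from datetime import datetime
from pathlib import Path

UPDATES_PER_DAY = 3         # 1 for Site 1, 3 for Site 2, 5 for Site 3
SITE_ROOT = Path("htdocs")  # placeholder document root for one test site

def publish_page() -> None:
    """Drop one new (placeholder) content page into the site."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    (SITE_ROOT / f"page-{stamp}.html").write_text(
        f"<html><body>New content added {stamp}</body></html>"
    )

SITE_ROOT.mkdir(exist_ok=True)
while True:
    # Pick N random offsets (in seconds) inside the next 24 hours, then
    # sleep until each one comes up and publish a page.
    day_start = time.time()
    for offset in sorted(random.uniform(0, 86400) for _ in range(UPDATES_PER_DAY)):
        time.sleep(max(0.0, day_start + offset - time.time()))
        publish_page()
    # Wait out the remainder of the day before scheduling the next batch.
    time.sleep(max(0.0, day_start + 86400 - time.time()))
```

Re-randomizing the offsets each day keeps the spacing between updates irregular while the daily count stays constant, which matches the constants and variables described above.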

Hypothesis
The crawlers behave differently depending on how often a site is updated, and the indexes will refresh it more or less frequently accordingly.
Time Frames
I let the sites sit for one month and closely monitored each site's progress each day.

Spider Hits After First Month
Site 1: MSN 214, Google 184, Inktomi 226
Site 2: MSN 478, Google 523, Inktomi 391
Site 3: MSN 1170, Google 957, Inktomi 514

Time Frames
Then I monitored the sites for 6 months.

Cache Update Averages After 6 Months
Site 1- MSN: 1.52 times/month Google: 1.4 times/month
Site 2- MSN: 18.24 times/month Google: 4.1 times/month
Site 3- MSN: 21.70 times/month Google: 13.4 times/month
*Yahoo is excluded because it's tougher to tell when it recaches; its date stamps don't reliably match actual cached-page/title changes.

I also tracked the percentage of actual pages that got indexed across Google, MSN, and Yahoo:
Site 1-57%
Site 2-81%
Site 3-83%
Conclusion
It is understood that spiders will hit your site for four primary reasons: first, validating a link from another site; second, checking for changes to your site; third, reindexing your site; fourth, pulling robots.txt. With the first and fourth factors neutralized, we can assume the update and spider stats are due to the second and third reasons.

Practical Use
I understand from this experiment that if you keep your updates consistent and at random times, it will force the bots to revisit your site more often. They will all start visiting your site at consistent intervals depending on your number of links. Once they build a rhythm of how often your content changes, they adapt and start visiting more often, and they update your site in the indexes accordingly.

Therefore a theory can be built: crawlers are designed to accommodate your site and the practices of the webmaster. Thus, you can train the crawlers to your site's operating pattern, and this will show up as differences in performance in the indexes.

Flaws In The Experiment
Looking at the final results, I wish I had overdone it with a fourth site that updated 100 or 1,000 times a day, to see whether it performed better or worse than Site 3. The second flaw falls into the category of seasonal changes. I did this experiment between June 2005 and January 2006, and the engines could have been acting differently during that period. I know for a fact that MSN was, because it was so new.
