Check Mates - 22 Mar 2008 10:54 pm

Here ya go. This is the del.icio.us captcha busted in Python.
376623524.png


#!/usr/bin/python
import Image,time,random,glob,re,os,sys

##$$$$
train = raw_input("train? (y/n)")
if(train == "y") : train= True
else: train = False
##
fileName = ''.join(sys.argv[1:])
def getNeighbourhood(i,width,height,pixels):
	results = []
	try:
		if(pixels[i+1] != 0): results.append(i+1)
		if(pixels[i-1] != 0): results.append(i-1)
		if(pixels[i-width] != 0): results.append(i-width)
		if(pixels[i+width] != 0): results.append(i+width)
		if(pixels[i-width+1] != 0): results.append(i-width+1)
		if(pixels[i+width+1] != 0): results.append(i+width+1)
		if(pixels[i-width-1] != 0): results.append(i-width-1)
		if(pixels[i+width-1] != 0): results.append(i+width-1)
	except:pass
	return results
now = time.time()
captcha = Image.open(fileName)
(width,height) = captcha.size
pixels = list(captcha.getdata())
i=0
for pixel in pixels:
	if (pixel == 2): pixels[i] = 0
	i+=1
toclean = []
for i in xrange(len(pixels)):
	neighbourhood = getNeighbourhood(i,width,height,pixels)
	if (len(neighbourhood) < 4) : 	pixels[i] = 0

captcha.putdata(pixels)
started=False
lowestY,highestY,count = 0,10000,0
captchas = []
slant = 15
for x in xrange(width):
	hasBlack = False
	for y in xrange(height):
		thisPixel = captcha.getpixel((x,y))
		if(thisPixel != 0):
			if(started == False):
				started=True
				firstX = x
				firstY = y
			else:
				lastX = x
			if(y > lowestY): lowestY = y
			if(y< highestY): highestY = y
			hasBlack = True
	if((hasBlack == False) and (started==True)):
		if((lowestY - highestY) > 4):
			croppingBox = (firstX,highestY,lastX,lowestY)
			newCaptcha = captcha.crop(croppingBox)
			if(train):
				text = raw_input("char:\n")
				try: os.mkdir("/home/dbyte/deliciousImages/" + text)
				except:pass
				text__ = "/home/dbyte/deliciousImages/" + text + "/" + str(random.randint(1,100000)) + "-.png"
				newCaptcha.resize((20,30)).save(text__)
				text_ = "/home/dbyte/deliciousImages/" + text + "/" + str(random.randint(1,100000)) + "-.png"
				newCaptcha.resize((20,30)).rotate(slant).save(text_)
				text_ = "/home/dbyte/deliciousImages/" + text + "/" + str(random.randint(1,100000)) + "-.png"
				newCaptcha.resize((20,30)).rotate(360 - slant).save(text_)
				captchas.append(Image.open(text__))
			else:
				#text = str(count)
				#text = "tmp-delicious-" + text + ".png"
				#newCaptcha.save(text)
				captchas.append(newCaptcha.resize((20,30)))

			started=False
			lowestY,highestY = 0,10000
			count +=1
if(train == False):

	imageFolders = os.listdir("/home/dbyte/deliciousImages/")
	images =[]
	for imageFolder in imageFolders:
		imageFiles = glob.glob("/home/dbyte/deliciousImages/" + imageFolder + "/*.png")
		for imageFile in imageFiles:
			pixels = list(Image.open(imageFile).getdata())
			for i in xrange(len(pixels)):
				if pixels[i] != 0: pixels[i] = 1
			images.append((pixels,imageFolder))

	crackedString = ""
	for captcha in captchas:
		bestSum,bestChar = 0,""
		captchaPixels = list(captcha.getdata())
		for i in xrange(len(captchaPixels)):
			if captchaPixels[i] != 0: captchaPixels[i] = 1
		for imageAll in images:
			thisSum = 0
			pixels = imageAll[0]
			for i in xrange(len(captchaPixels)):
				try:
					if(captchaPixels[i] == pixels[i]): thisSum+=1
				except: pass
			if(thisSum > bestSum):
				bestSum = thisSum
				bestChar = imageAll[1]
		crackedString += bestChar
	print crackedString
	#print "time taken: " + str(time.time() - now)
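
The matching step at the end of the script boils down to counting how many pixels of a cropped character agree with each stored training sample and keeping the label with the highest count. Here is a minimal sketch of just that idea, written in plain Python 3 with no imaging library; the function names and the tiny 4-pixel "images" are made up for illustration and are not part of the original script.

```python
def binarize(pixels):
    """Map every non-zero pixel to 1 so that only the shape matters."""
    return [1 if p != 0 else 0 for p in pixels]


def agreement(a, b):
    """Count the positions where two binary images hold the same value."""
    return sum(1 for x, y in zip(a, b) if x == y)


def classify(glyph, samples):
    """Return the label of the training sample that agrees with glyph most.

    samples is a list of (pixels, label) pairs, the same shape as the
    `images` list in the script above. Returns "" when there are no samples.
    """
    glyph = binarize(glyph)
    best_label, best_score = "", -1
    for pixels, label in samples:
        score = agreement(glyph, binarize(pixels))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Hypothetical 2x2 "training images", flattened row by row.
    samples = [([0, 255, 0, 255], "a"), ([255, 255, 0, 0], "b")]
    print(classify([0, 9, 0, 9], samples))
```

Because every sample is resized to the same 20x30 box before comparison, a plain position-by-position count like this is enough; it would not cope with characters that shift or overlap.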
Random Thoughts and Check Mates - 17 Mar 2008 11:03 pm

Guess what I’m in the mood to talk about? You guessed it. Captchas! In fact I feel like dedicating a whole week, maybe more depending on whether any downtime occurs :), to talking about nothing but captcha breaking. We’ll break every captcha in the book and, by the end of this post, even the captchas that haven’t been created yet. Furthermore, for this week only I am accepting any and all captcha related guest posts. So if you’ve got a captcha solved or want to discuss techniques for breaking them, feel free to write up a guest post and email it to ELI at BLUEHATSEO.COM in html form. You can stay anonymous and not only will I put it up but I’m also willing to put up any ad you’d like. Pick any text or banner ad you’d like to put up with your post and I’ll include it. With as many readers as this place has I’m sure it’ll get clicked. Also be sure to include your paypal address. If I really like your guest post I may even send you $100 as a thank you. Also, all you bloggers are welcome to repost any of the captcha related posts on this blog. I now declare any captcha related posts on this blog public domain and republishable under full rights. For some odd reason I feel like blowing the captcha breaking industry the fuck up. Like my favorite saying goes, if you’re going to wreck a room you might as well WRECK it. Let’s begin by visiting one of my first captcha related posts: the Army Of Captcha Typers.

The Army of Captcha Typers is a great technique because it doesn’t require loads of programming and is 100% adaptable to any captcha. I suggest you go back and reread it, but in interest of keeping this short here’s a quick summary.

You use a service, I used a proxy site as an example, to get the users to type in the captchas for you. It records what the user typed in as the solution to the captcha and you use that to solve it. The more pageviews the service provides per user the more effective it is to breaking captchas. Why pay Indians or tediously code it yourself?

Normally I like to leave most of the code and creative portion out of the written technique, in the interest of not ruining the technique and to help the methods be more effective through use of spins and unique code. I don’t write this blog to ruin techniques, and those people who claim I do are just insecure and like to claim they already know everything. As common sense as most of the stuff I post is, I haven’t met a person yet who hasn’t in some way learned something from this blog. That truth brags a lot louder than most SEO blogs I’ve seen. But! If we’re going to wreck something let’s wreck it. In that spirit I see no reason why every newbie on the planet shouldn’t be able to easily throw up their own web proxy site that solves captchas for them, so here’s the script to do it.

Captcha Solving Web Proxy

This is a modified version of CGIPROXY that I mentioned in the post. Basically you install it following the included instructions (README file). Then you set up your web proxy site. Target a niche such as kids behind a school proxy or something similar. There is an extra file included called captcha.cgi. Upload it to the cgi-bin in the same folder as the nph-proxy.cgi and give it 755 chmod permissions. Make a folder one directory below your cgi-bin called captchas. Give it read/write permissions (777 should work if all else fails). Then anytime you’ve got a captcha to solve, upload it to that directory with a unique filename. This can be done automatically with whatever script you’re using to spam a captcha protected site. On the very next pageview the webproxy will require the person to type in the captcha and disguise it as a human check to prevent abuse. Any captcha works. Once it gets their response it’ll delete the captcha from the folder and write out the solution along with the filename to a new file called solved.txt. Format: characters|image.jpg\n . Remember to build some kind of reminder or code into the filename so you know which image is which when you go to use the solutions. Get enough users to your webproxy (which is very easy) and you can solve any captcha in moments.
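
To make the solved.txt format concrete, here is a minimal sketch of reading it back into a lookup table keyed by filename. It assumes only what is stated above (one `characters|image.jpg` record per line); the function name and the sample filenames are hypothetical, and splitting on the last `|` is my own choice so that a `|` typed as part of a solution does not break the record.

```python
def parse_solved(text):
    """Turn the contents of solved.txt into a {filename: solution} dict."""
    solutions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the LAST '|' so the filename is always the final field.
        solution, sep, filename = line.rpartition("|")
        if not sep:
            continue  # no separator at all: not a valid record
        solutions[filename] = solution
    return solutions


if __name__ == "__main__":
    sample = "xk4p9|job42.jpg\nabcde|job43.jpg\n"
    print(parse_solved(sample))
```

Keying on the filename is what makes the "reminder or code" in the filename useful: the script that uploaded the image can look up its own answer directly.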

Enjoy!

User Contributed - 11 Mar 2008 04:25 am

This is a fantastic guest post by Harry over at DarkSEO Programming. His blog has some AWESOME code examples and tutorials along with an even deeper explanation of this post so definitely check it out and subscribe so he’ll continue blogging.

This post is a practical explanation of how to crack phpBB2 easily. You need to know some basic programming but 90% of the code is written for you in free software.

Programs you Need

C++/Visual C++ express edition - On Linux everything should compile simply. On windows everything should compile simply, but it doesn’t always (normally?). Anyway the best tool I found to compile on windows is Visual C++ express edition. Download

GOCR - this program takes care of the character recognition. Also splits the characters up for us ;) . It’s pretty easy to do that manually but hey. Download

ImageMagick - this comes with Linux. ImageMagick lets us edit images very easily from C++, php etc. Install this with the development headers and libraries. Download from here

A (modified) phpbb2 install - phpBB2 will lock you out after a number of registration attempts so we need to change a line in it for testing purposes. After you have it all working you should have a good success rate and it will be unlikely to lock you out. Find this section of code: (it’s in includes/usercp_register.php)

if ($row = $db->sql_fetchrow($result))
{
if ($row['attempts'] > 3)
{
message_die(GENERAL_MESSAGE, $lang['Too_many_registers']);
}
}
$db->sql_freeresult($result);

Make it this:

if ($row = $db->sql_fetchrow($result))
{
//if ($row['attempts'] > 3)
//{
// message_die(GENERAL_MESSAGE, $lang['Too_many_registers']);
//}
}
$db->sql_freeresult($result);

Possibly a version of php and maybe apache web server on your desktop PC. I used php to automate the downloading of the captcha because it’s very good at interpreting strings and downloading static web pages.

Getting C++ Working First

The problem on windows is there is a vast number of C++ compilers, and they all need setting up differently. However I wrote the programs in C++ because it seemed the easiest language to quickly edit images with ImageMagick. I wanted to use ImageMagick because it allows us to apply a lot of effects to the image if we need to remove different types of backgrounds from the captcha.

Once you’ve installed Visual C++ 2008 express (not C#, I honestly don’t know if C# will work) you need to create a Win32 Application. In the project properties set the include path to something like (depending on your imagemagick installation) C:\Program Files\ImageMagick-6.3.7-Q16\include and the library path to C:\Program Files\ImageMagick-6.3.7-Q16\lib. Then add these to your additional library dependencies CORE_RL_magick_.lib CORE_RL_Magick++_.lib CORE_RL_wand_.lib. You can now begin typing the programs below.

If that all sounds complicated don’t worry about it. This post covers the theory of cracking phpBB2 as well. I just try to include as much code as possible so that you can see it in action. As long as you understand the theory you can code this in php, perl, C or any other language. I’ve compiled a working program at the bottom of this post so you don’t need to get it all working straight away to play with things.

Getting started

Ok this is a phpBB2 captcha:

It won’t immediately be interpreted by GOCR because GOCR can’t work out where the letters start and end. Here’s the weakness though. The background is lighter than the text so we can exclude it by getting rid of the lighter colors. With ImageMagick we can do this in a few lines of C++. Type the program below and compile/run it and it will remove the background. I’ll explain it below.


#include <Magick++.h>

using namespace Magick;

int main( int /*argc*/, char ** argv)
{

// Initialize ImageMagick install location for Windows
InitializeMagick(*argv);

// load in the unedited image
Image phpBB("test.png");

// remove noise
phpBB.threshold(34000);

// save image
phpBB.write("convert.pnm");

return 0;
}

All this does is load the image and then call the threshold function on it. Threshold filters out any pixels below a certain darkness. The program above saves a .pnm file because on Windows GOCR will only read .pnm files. On Linux you have to save the image as a .png, so there we put this line instead:


// save image
phpBB.write("convert.png");


The background removed.
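
If you want to see what the threshold step does without ImageMagick, here is a minimal sketch in plain Python 3 on a flat list of 8-bit grayscale values. It is an illustration of the idea, not ImageMagick's implementation: the function name is made up, and the conversion of 34000 to an 8-bit cutoff assumes the original value is on a 16-bit (0 to 65535) scale.

```python
def threshold(pixels, cutoff):
    """Pixels brighter than cutoff become white (255), the rest black (0)."""
    return [255 if p > cutoff else 0 for p in pixels]


# Assumption: 34000 is on a 0..65535 scale, so scale it down to 0..255.
cutoff_8bit = 34000 * 255 // 65535

if __name__ == "__main__":
    # Light background pixels go white, dark text pixels go black.
    row = [240, 200, 90, 60, 210, 30]
    print(cutoff_8bit)
    print(threshold(row, cutoff_8bit))
```

The whole weakness is visible here: as long as every background pixel is lighter than every text pixel, a single cutoff separates them.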

Ok that’s one part sorted. Problem 2. We now have another image that GOCR won’t be able to tell where letters start and end. It’s too grainy. What we notice though is that each unjoined dot in a letter that is surrounded by dots 3 pixels away should probably be connected together. So I add a piece of code onto the above program that looks 3 pixels to the right and 3 pixels below. If it finds any black dots it fills in the gaps. We now have chunky letters. GOCR can now identify where each letter starts and ends :D . We’re pretty much nearly done.


#include <Magick++.h>

using namespace Magick;

void fill_holes(PixelPacket * pixels, int cur_pixel, int size_x, int size_y)
{
int max_pixel, found;

///////////// pixels to right /////////////////////
found = 0;
max_pixel = cur_pixel+3; // the furthest we want to search
// set a limit so that we can't go over the end of the picture and crash
if(max_pixel>=size_x*size_y)
max_pixel = size_x*size_y-1;

// first of all are we a black pixel, no point if we are not
if(*(pixels+cur_pixel)==Color("black"))
{
// start searching from the right backwards
for(int index=max_pixel; index>cur_pixel; index--)
{
// should we be coloring?
if(found)
*(pixels+index)=Color("black");

if(*(pixels+index)==Color("black"))
found=1;
}
}

///////////// pixels to bottom /////////////////////
found = 0;
max_pixel = cur_pixel+(size_x*3);
if(max_pixel>=size_x*size_y)
max_pixel = size_x*size_y-1;

if(*(pixels+cur_pixel)==Color("black"))
{
for(int index=max_pixel; index>cur_pixel; index-=size_x)
{
// should we be coloring?
if(found)
*(pixels+index)=Color("black");

if(*(pixels+index)==Color("black"))
found=1;
}
}

}

int main( int /*argc*/, char ** argv)
{

// Initialize ImageMagick install location for Windows
InitializeMagick(*argv);

// load in the unedited image
Image phpBB("test.png");

// remove noise
phpBB.threshold(34000);

/////////////////////////////////////////////////////////////////////////////////////////////////////
// Beef up "holey" parts
/////////////////////////////////////////////////////////////////////////////////////////////////////
phpBB.modifyImage(); // Ensure that there is only one reference to
// underlying image; if this is not done, then the
// image pixels *may* remain unmodified. [???]
Pixels my_pixel_cache(phpBB); // allocate an image pixel cache associated with my_image
PixelPacket* pixels; // 'pixels' is a pointer to a PixelPacket array

// define the view area that will be accessed via the image pixel cache
// literally below we are selecting the entire picture
int start_x = 0;
int start_y = 0;
int size_x = phpBB.columns();
int size_y = phpBB.rows();

// return a pointer to the pixels of the defined pixel cache
pixels = my_pixel_cache.get(start_x, start_y, size_x, size_y);

// go through each pixel and if it is black and has black neighbors fill in the gaps
// this calls the function fill_holes from above
for(int index=0; index<size_x*size_y; index++)
fill_holes(pixels, index, size_x, size_y);

// now that the operations on my_pixel_cache have been finalized
// ensure that the pixel cache is transferred back to my_image
my_pixel_cache.sync();

// save image
phpBB.write("convert.pnm");

return 0;
}

I admit this looks complicated at first glance. You definitely don’t have to do this in C++ if you can find an easier way to perform the same task. All it does is remove the background and join close dots together.

I’ve given the C++ source code because that’s what was easier for me, however the syntax can be quite confusing if you’re new to C++. Especially the code that accesses blocks of memory to edit the pixels. This is more a study of how to crack the captcha, but in case you want to code it in another language here’s the general idea of the algorithm that fills in the holes in the letters:

1. Go through each pixel in the picture. Remember where we are in a variable called cur_pixel
2. Start three pixels to the right of cur_pixel. If it’s black color the pixels between this position and cur_pixel black.
3. Work backwards one by one until we reach cur_pixel again. If any pixels we land on are black then color the space in between them and cur_pixel black.
4. Go back to step 1 until we’ve been through every pixel in the picture

NOTE: Just make sure you don’t let any variables go over the edge of the image otherwise you might crash your program.

I used the same algorithm but modified it slightly so that it also looked 3 pixels below, however the steps were exactly the same.
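
For anyone porting this to another language, here is a minimal sketch of the steps above in plain Python 3, working on a flat row-by-row list where 1 is a black pixel and 0 is white. It is not a line-for-line port of the C++: the function name and the `reach` parameter are my own, it writes into a copy instead of editing the image in place (so one fill never triggers another), and it stops the rightward search at the end of the current row, which is one way of following the note about not running over the edge.

```python
def fill_gaps(pixels, width, height, reach=3):
    """Join black pixels that sit at most `reach` pixels apart.

    pixels is a flat, row-major list (1 = black, 0 = white).
    Returns a new list; the input is left untouched.
    """
    out = list(pixels)
    for y in range(height):
        for x in range(width):
            if pixels[y * width + x] != 1:
                continue  # only black pixels start a search
            # Search to the right, farthest candidate first, same row only.
            for dx in range(min(reach, width - 1 - x), 0, -1):
                if pixels[y * width + x + dx] == 1:
                    for k in range(1, dx):
                        out[y * width + x + k] = 1
                    break
            # Search downwards, farthest candidate first, inside the image.
            for dy in range(min(reach, height - 1 - y), 0, -1):
                if pixels[(y + dy) * width + x] == 1:
                    for k in range(1, dy):
                        out[(y + k) * width + x] = 1
                    break
    return out


if __name__ == "__main__":
    row = [1, 0, 0, 1, 0, 0, 0, 0, 1]
    print(fill_gaps(row, width=9, height=1))
```

In the example row the first two black pixels are three apart, so the gap between them is filled; the last black pixel is too far away and stays isolated.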

Training GOCR

The font we’re left with is not recognized natively by GOCR so we have to train it. It’s not recognized partly because it’s a bit jagged.

Assuming our cleaned up picture is called convert.pnm and our training data is going to be stored in a directory called data/ we’d type this.

gocr -p ./data/ -m 256 -m 130 convert.pnm

Just make sure the directory data/ exists (and is empty). I should point out that you need to open up a command prompt to do this from. It doesn’t have nice windows. Which is good because it makes it easier to integrate into php at a later date.

Any letters it doesn’t recognize it will ask you what they are. Just make sure you type the right answer. -m 256 means use a user defined database for character recognition. -m 130 means learn new letters.

You can find my data/ directory in the zip at the end of this post. It just saves you the time of going through checking each letter and makes it all work instantly.

Speeding it up

Downloading, converting, and training for each phpbb2 captcha takes a little while. It can be sped up with a simple bit of php code but I don’t want to make this post much longer. You’ll find my script at the end in my code package. The php code runs from the command prompt though by typing “php filename.php”. It’s sort of conceptual in the sense that it works, but it’s not perfect.

Done

Ok once GOCR starts getting 90% of the letters right we can reduce the required accuracy so that it guesses the letters it doesn’t know.

Below I’ve reduced the accuracy requirement to 25% using -a 25. Otherwise GOCR prints the default underscore character even for slightly different looking characters that have already been entered. -m 2 means don’t use the default letter database. I probably could have used this earlier but didn’t. Ah well, it doesn’t do a whole lot.

gocr -p ./data/ -m 256 -m 2 -a 25 convert.pnm

We can get the output of gocr in php using:

echo exec("/full/path/gocr -p ./data/ -m 256 -m 2 -a 25 convert.pnm");
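
If you would rather drive GOCR from Python than php, here is a minimal sketch of the same two command lines built as argument lists. It only uses the flags already shown in this post (-p, -m 256, -m 130, -m 2 and -a); the function names, the default paths and the `train` switch are my own assumptions, and `run_gocr` is meant for the recognition command only because training asks questions on the terminal.

```python
import subprocess


def build_gocr_command(image, data_dir="./data/", gocr="gocr",
                       certainty=25, train=False):
    """Build the gocr argument list used in this post.

    train=True  -> -m 256 -m 130 (user database, learn new letters)
    train=False -> -m 256 -m 2 -a <certainty> (user database only, guess)
    """
    modes = ["256", "130"] if train else ["256", "2"]
    cmd = [gocr, "-p", data_dir]
    for mode in modes:
        cmd += ["-m", mode]
    if not train:
        cmd += ["-a", str(certainty)]
    cmd.append(image)
    return cmd


def run_gocr(image, data_dir="./data/", gocr="gocr", certainty=25):
    """Run the recognition command and return whatever gocr printed."""
    cmd = build_gocr_command(image, data_dir, gocr, certainty)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    print(" ".join(build_gocr_command("convert.pnm", train=True)))
    print(" ".join(build_gocr_command("convert.pnm")))
```

Passing the arguments as a list avoids shell quoting problems when the image path contains spaces.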

Alternatives

In some instances you may not have access to GOCR or you don’t want to use it. Although it should be usable if you have access to a dedicated server. In this case I would separate the letters out manually and resize them all to the same size. I would then put them through a php neural network which can be downloaded from here FANN download

It would take a bit of work but it should hopefully be as good as using GOCR. I don’t know how well each one reacts to letters which are rotated though. Neural networks simply memorize patterns. I haven’t checked the inner workings of GOCR. It looks complicated.

My code

All the code can be found here to crack phpBB2 captcha.

Zip Download

In conclusion to this tutorial it’s a nightmare trying to port over all my code from linux to windows unless it’s written in Java :D . If only Java was small and quick as well.

It’s worth stating that phpbb2 was easy to crack because the letters didn’t touch or overlap. If they had touched or overlapped it would probably have been very hard to crack.

I plan to look at that line and square captcha that comes with phpBB3 over on my site and document how secure it is.

Thanks for the awesome guest post Harry.

Random Thoughts - 19 Feb 2008 12:38 am

I haven’t forgot about you guys :)
Been deeply involved in my recent projects. Got a lot of really kick ass shit going on, it’s pretty cool and exciting for me. I still continue my work on SEO Empire part 2 every Saturday morning as well as a few other posts that are in the works. I’ve been very scatterbrained lately. I miss having more time to write here, but I’m definitely looking forward to getting back into it. If you guys want to chat it up or discuss some ideas we’ve got a Blue Hat chatroom going on in IRC and I’m always in there.
irc.freenode.net /join #bluehatseo (zelanzy.freenode.net is a great mirror if you can’t get on)

I’m also headed to ASW in Vegas this year. First conference ever, I know! I’ll get to meet whoever of you who are there, which is exciting. :)

Either way I think I’ll set a deadline for SEO Empire Part 2 just because I work better under pressure. We’ll say March 9th. From there, once we have some basics out of the way the posts will flow like wine (ha!). I finally checked my feedburner stats for the first time since I set it up, and this place has actually grown in reader size instead of dropping since I kinda started taking a break on my regular postings. Why and how?!? Well at least that tells me there’s still people interested in this shit. So I’m more than excited to get back into it. Most importantly though, to those of you who have sent in guest posts (especially the guy who sent in the long very detailed one on captcha breaking) DUDES I AM SO FUCKING SORRY!

Random Thoughts - 26 Dec 2007 12:11 am

Roady also wishes everyone a happy holidays.

Random Thoughts - 10 Dec 2007 01:47 am

This question just in from Till

Hi Eli,

is an old domain just worth if it has quite a lot backlinks or is an old domain also worth if it’s just in the index of search engines, but has almost no backlinks (0-20).

Regards

Till

Great question. It captured my attention because there’s always a lot of talk in the SQUIRT forum about expired domains. Several members of the community are talking about how they’re building their SEO Empires with snagged expired domains. I kind of cringe when I hear that, not because expired domains are bad, but because I personally have no idea about the history of the domain. Frankly it could sway either way. The practice of using expired domains could be good or bad. The problem I have with it is the unpredictability, which I’ll get to in a moment. For now I assume the people know what they’re doing when they buy the domain and are making wise decisions. Much like buying a used car, always do your research and find out the background of what you’re buying. The inherent problem is, the odds are stacked against you. If it was a good domain with value someone would have kept it. Yet, mistakes are made and there are some definite gems out there and if you aren’t on the field you can’t score. So while I think buying up expired domains for SEO reasons is a good thing if you know what you’re doing, I am hypocritical in that I don’t do it myself. The main reason is a question I have myself.

This question just in from Eli

Hi handsome!
About 8 months ago I had several domains expire on me and never managed to pull them out. They were good domains with links, never banned or penalized and were part of several different projects. I reregistered them quickly and managed to get them back. I had no real purpose for them so I added them to a common platform site network I was working on with several other new domains. All the sites had the same structure and went through the same promotion, but for some reason the expired domains took nearly 3 weeks longer to get indexed than the brand new domains. 8 months later they still seem to perform about the same as the other sites, but I’m curious with all their previous backlinks and such why did those exact domains take longer than the others to get reindexed. Any ideas of why that was?

I still don’t know. I don’t have the attention span long enough to buy some control domains and wait a year to expire them out and hope I manage to get them back in order to do any tests and figure it out. Anyone else experienced this by chance?

Either way I see buying expired domains for SEO reasons as having the following benefits.
1. Established inbound links
2. Aged inbound links

Other than that you’re still starting from scratch. So my philosophy is, unless the domain is a gem, either because it has a good name or because it has phenomenal unique backlinks (ie lots of links or saturation like you mentioned), then it’s easier and more predictable to just work with new domains. Not to mention it saves a bit of headaches and time, and even sometimes money. Which brings me back to the predictability thing. I sometimes get questions from people about a particular basement or foundation site that was an expired domain: it suddenly dropped in ranking, or it got banned, or it lost a bunch of pages in the index. Anything out of the ordinary.

BTW I’d like to take this moment to remind everyone that in case you never noticed, every year right before Christmas sites tend to drop in saturation levels in Google. It’s probably due to the upcoming updates that usually happen in January, I don’t know. Either way it seems to happen every year near the beginning of December.

So in cases like this you can look at stuff and maybe find a problem, or you can just write it off as the search engines being weird, but when you’re dealing with a new site on a previously registered domain you get that extra variable. Is the problem caused by a problem with the site, search engines being weird, or the history of the domain causing problems? It makes the job of diagnosing problems and learning from mistakes that much harder. For me personally, I’m still going to be doing this in 5 years so there’s no point in forcing unneeded shortcuts on myself. All my domains will eventually become old, all my domains will eventually get link age. I just let time do its thing and in the mean time work on new exciting projects. :) <- it’s a good life

Which nearly answers the question about old domains. Old domains aren't something I think people should stress about. Every single site I build, while I'm building it, I'm wishing the domain was old. Hell when I'm buying the domains I wish they were old. Yet in a year none of it matters and nothing has changed. I'll still be wishing the domains I am buying now were older like the domains I bought last year and the year before that. It's like playing Sim City, it doesn't matter if you have it on fast mode or slow the strategy is still the same. Because the beautiful thing about age factors are, they are done for you :)

PS. Please read my Follow up post to SEO Empire if you haven’t already. It talks a lot about shortcuts and how to speed up the process of rankings, which I think is where time is best spent. The more experience you have with that the less you have to worry about domain age.

General Articles - 09 Dec 2007 06:15 pm

Now that I put the dreaded C-word in the title mine won’t be the only office in the nation calling it Class-Cunt ips. Watch, you’ll catch yourself doing it and frankly you deserve it. :) To make the transition into a technopotty mouth easier, here’s a handy mnemonic: A Big Cunt Drowns Easier (E is in case we ever make that switch the government keeps rambling on about).

I probably get more questions about my distribution of IPs than any other type. Frankly I can answer it in one word, evenly. But once again hitting up our Open Questions post here’s a question that I think best illustrates the topic.

This one is from Quinton Figueroa

1. For each domain do you split your subdomains up in multiple C Class IPs or do they all stay on 1? Does it depend?

2. For each domain do you link from your subdomains to other subdomains or do you keep each one as its own stand alone “site”?

3. Do you set up in the 100’s of subdomains or in the 1,000’s of subdomains (or maybe more) per domain?

Appreciate the help man, you kick ass!

Google doesn’t penalize a site because of the other sites on the same IP or class. I say this with confidence because even though Matt Cutts publicly said it in one of his video dialogs I still researched it myself to make damn sure (you can thank me later ionhosting). I also haven’t seen any evidence that the other search engines are any different. So I give the same answer whether I’m talking about one site having a different IP than another or a subdomain having a different IP than the main domain. It’s all under the same point of reference, but to address the question directly, what’s the one primary reason why a subdomain has a different IP than a main domain? That’s right, it’s on a different server.

Side Track
BTW when people say a statement like, “I haven’t seen any evidence” it usually means they haven’t LOOKED at any evidence. For future reference, give statements like that about as much authority as a one legged security officer. Do your own research.

Back On Track
If there is no penalty for sites being on the same IP and there is no explicit reward for being on separate IPs, then all that’s left is two small benefits: 1. If your sites are black hat it makes it harder to track them all down. 2. The links appear to be more natural between two sites if they are on separate IPs (whether or not this is an actual benefit remains to be seen). So the whole IP diversification business boils down to costs vs financial reward. In the past I’ve been very cautious about my own IP dispersement, which was only in part because during that period I was able to acquire IPs very cost efficiently, but I have since lessened my efforts. The rewards vs the costs just aren’t there enough to invest any worry into the matter. So my answer is simply “evenly.” Use what you’ve got. If you get a server and it gives you 10 free IPs, use them all and just distribute your sites amongst them. You won’t regret it and at the same time you wouldn’t see any explicit benefits from dumping a bunch of extra money every month into more IPs. The money is obviously better spent on things that make more revenue such as domains and servers. Even if you had unlimited IPs how would you end up distributing them? Evenly…

To be perfectly clear, even though I take IP distribution with a grain of salt it doesn’t mean I take nameserver distribution lightly and the same applies to domain registration info. In fact I’d say the one exception to the IP carefree rule is if you happen to write a blog teaching people how to bend over Google like a Japanese whore. :) I mention it, because I know some of you do. In which case be very careful about what sites you allow others to see. Throwing a few decoys out also doesn’t hurt because “do no evil” policies don’t apply to profit risks. Paranoia? For a year and a half yes, after Oct 21st of this year. No. You may not get it, but someone somewhere just shit their pants. So feel free to giggle anyways.

As for questions 2 and 3, if you had asked me a year ago I would have had a completely different response. Yet the basic principle still remains. I talked about this topic in great depth in my SEO Empire Part 1 post. Reread the section where I talk about the One Way Street Theory. The decision on how many subdomains, as well as whether they should be orphan subdomains or innerlinked, is a decision I make by asking whether or not those subdomains would be of benefit to the main domain. If they are of benefit to it then I establish a relationship between the two (ie a link either one way or exchanged). If they aren’t then I keep the subdomains orphan. BTW the term Orphan Subdomain or Orphan Subpage was coined by an obnoxious troll here. I kinda liked it so I kept it. It means the subdomain has no relationship with the main domain or any other pages or subdomains of the site. Watch out for innerlinking between subdomains though. Think in terms of sites that do it effectively and sites that don’t. If you’re innerlinking in a way that mimics About.com or similar then great. If you’re innerlinking the way Blog Solution or something similar would, for the sake of link building to each subdomain, I’d advise against it for footprint reasons, and for god’s sake if you’re hosting a blackhat generated site on a white hat domain don’t even consider it!

Do’s and Don’ts of Subdomains.
Do create subdomains for the purpose of exploiting an established domain’s authority. - I’ve talked a lot about software related sites. I think they’re a great and easy way to build domain authority. Anything related can be thrown into a subdomain. I’ve got a couple of general sites that have great domain authority, and anything I throw up on them does well in the SERPs almost instantly. I make sure not to overdo it and it works out very well for me.

Don’t create subdomains to save on domain costs. - It’s less than ten dollars a year for fuck’s sake. Don’t risk trashing a $20/day site, and the authority it took you a year or two to establish, just to save $10/year.

Neat Tricks and Hacks 08 Dec 2007 11:53 pm

I’ll be dedicating a few posts to answering questions from the Open Questions post.

There were several questions like this one, but I think it represents them best.

From Matthew

Hi Eli,

I’ve read through your blog and its amazing info. Thanks. I have a question. I’ve read through and I find the link building stuff (black hole SEO) a bit too complex, could you suggest any other effective link building techniques? I’ve heard great stuff about TNX.net.

Thanks!

I haven’t heard of TNX, but here’s a really simple one most haven’t thought of. Much like ugly girls tend to have better personalities, pretty and clean sites tend to be harder to build links for, or at least take more effort initially. So a little technique I’ve been using a lot lately is to build sites that gather links really easily. A quick and easy way to do this is to build a site that distributes something people want to put on their websites or social site profiles (i.e. MySpace, Facebook, YouTube and such). I’ll give you the most basic of examples. In a really old post on this blog I put up this picture:

Since then everyone and their dog has been hotlinking to it, especially since it’ll oftentimes show up in Google Images for the term Middle Finger. Not that I care, but it illustrates the stupid shit people spread, and as minuscule as it is, that effect can be used for links. So let’s say all I had in my arsenal was this stupid picture of a flaming middle finger. I post it up on some site and get it into Google Images and such, just like I did. Then under it I put a textbox that says something like “Put this on your ____”, copy, paste, etc. The code has a link to a random domain of mine. The domain doesn’t even have to be active, or it can be a money site. Who really cares? After a while the flaming finger image gathers enough links to that domain that I just 301 redirect the domain over (it doesn’t even have to be an entire domain, as I’ll mention below). All the links go to my new pretty site.
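For anyone who wants to see the copy and paste box spelled out, here is a rough sketch in Python 3. The function names and the example URLs are made up for illustration, and the markup is only one way such a box could be built, so treat it as a starting point.

```python
from html import escape


def build_embed_code(image_url, link_url, alt_text):
    """Return the HTML snippet a visitor would paste into a profile or site.

    The image is wrapped in a link, so every paste also creates a link
    to link_url (the domain that gets redirected later).
    """
    img = '<img src="%s" alt="%s">' % (
        escape(image_url, quote=True),
        escape(alt_text, quote=True),
    )
    return '<a href="%s">%s</a>' % (escape(link_url, quote=True), img)


def build_copy_box(embed_code):
    """Wrap the snippet in a read-only textarea to show under the image."""
    # Escaping makes the browser display the snippet as text
    # instead of rendering it.
    return '<textarea readonly rows="3" cols="60">%s</textarea>' % escape(embed_code)


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    code = build_embed_code(
        "http://example.com/finger.gif",
        "http://example.org/",
        "Flaming middle finger",
    )
    print(build_copy_box(code))
```

The link target is just a parameter, so the same box works whether the domain is a placeholder or a money site.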

This is a really weak example, I realize, but it can be done with just about any media (video, image, Flash, etc.) you’d like. It can be done with any type of site you’d like, as long as it’s the type that tends to gather links faster than your other type of site. If I’ve got big pretty sites coming out at a future date, I can build several of these sites, and by the time the main site goes live I can have plenty of link volume to rank properly. The only reason this extra work is necessary is that viral link building tends to be exponential. In other words, you’ve got to have links to get rankings and you’ve got to have rankings to get links. More links means more links. So it creates a nice little shove on a boat too big to leave the dock on its own. Best of all, it’s really easy and rarely costs anything at all to do.

If you’re wondering why this all kind of sounds familiar or like you should already know it, it’s because it’s a creative spin on two of the more popular techniques on this blog, Link Laundering Sites and Cycle Sites. On a side note, I’ve found that this also works on subdirectories. So you can create a site that distributes media or allows people to upload their own, with each item on a separate subdirectory or page. The links can even go to the same page the item is on. Once each subdirectory or subpage gets enough links to it, cycle it out and let the links go to a site that needs ’em. You can also put the media up on another subdirectory to be used again in gathering more links. :) The best advice I can give you on this is to look for types of sites that gain links quickly right out of the gate. They may not make a lot of money, they may be high bandwidth, but it doesn’t matter if they’re all temporary. You can still steal the idea to do some easy link building for the harder sites.
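The cycling step can be sketched roughly like this in Python 3. The link counts, the threshold, and the paths are all hypothetical, and the output assumes an Apache server using the mod_alias `Redirect` directive. Any other way of issuing a 301 would do just as well.

```python
def pick_pages_to_cycle(link_counts, threshold):
    """Return the paths that have gathered at least `threshold` links."""
    return sorted(path for path, count in link_counts.items() if count >= threshold)


def build_redirect_rules(paths, target_url):
    """Build one Apache mod_alias rule per path, all pointing at target_url.

    Note that Redirect matches by path prefix, so anything requested
    under the path gets redirected too, with the remainder of the path
    appended to the target.
    """
    return ["Redirect 301 %s %s" % (path, target_url) for path in paths]


if __name__ == "__main__":
    # Hypothetical link counts per media subdirectory.
    counts = {"/media/finger/": 120, "/media/cat/": 8, "/media/dog/": 50}
    ready = pick_pages_to_cycle(counts, threshold=50)
    for rule in build_redirect_rules(ready, "http://example.com/"):
        print(rule)
```

Where the link counts come from is left open. Any backlink report you already trust will do.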

More answers coming soon

Random Thoughts 07 Dec 2007 04:22 am

Well, I’ve been working extremely hard this Christmas season. I’ve got several big personal launches coming out as well as loads of other fun business. I’ve been writing nearly every day on SEO Empire Part 2. I’ve also got several large followup posts in draft. It occurred to me though, as if I could miss it with the flood of emails, that I haven’t updated in a month. So I thought it would be fun to throw out a few short and to the point posts before I dig down into my presents and finish up this bitch of a season. :)

So let’s take on a few open questions. A public dialog between you, me, and the other readers. I know there’s lots of great SEO and marketing questions out there and frankly I can talk about it all day. Fire off any questions you’ve got in the comments and for the next week I’ll be answering as many as I have time for.

Have fun and happy holidays!

User Contributed 17 Oct 2007 09:19 am

Here’s a nice little guest post contributed by SEOcracy. I love guest posts that involve some form of creative money making idea. :)

——————————
I hope you have all been paying attention over the past week, because today I am going to build on last week’s database revelations and tell you all how to use database content to make serious money (and build serious traffic) through Google Custom Search.

Now, I have been making decent bank with Google Custom Search for a while now, and having recently amp’d up my efforts in a big way, I feel confident enough to make the claim that you should all be able to make at LEAST Fifty Dollars a day using the techniques I am about to outline.

Google Custom Search came on the scene back in late 2006 and it really didn’t make the splash that I expected it would in the SEO scene. Google Custom Search, in good Web2.0-Mash-Up style, gave me a brand new way to inter-link and cross promote my diverse network of niche sites, and I thought that was pretty cool. In fact, most people thought that was pretty cool, and that was about all we really heard about the launch of GCS.

But let’s stop and examine three things that make GCS especially cool for us SEOs and Affiliate Marketers, shall we?

1) GCS engines can be highly targeted, returning extremely relevant results. This means that we can create a GCS that will satisfy the search needs of our site’s users based on their specific interests; and a satisfied user is a loyal user.

2) GCS engines allow us to return results only from the websites we choose. This means that we can set up a GCS to promote only those sites within our network. So if we have a mini-network on home development, our GCS can be set to only return those relevant results from sub-sites within our network. For our mini-network on home development our GCS might return results from sub-sites that provide mortgage offers, information on concrete polishing, or how to select granite countertops. This allows us to cross promote the sites within our network instead of having visitors turn to our competitor for more in-depth information.

3) GCS is made to be monetized. You can display your Adsense ads in your search results and thus can profit from the increased impressions when people use your GCS.

From an SEO point of view, GCS is solid gold because it lets you bypass the usual google.com search completely and in so doing, you bypass your competition! Of course, as I’m sure many of you who have played with Google Custom Search have already realized: your search engine is only as good as the number of users you can funnel into it. Meaning, if you have a GCS that only does 10 searches a day then you are really not going to see any tangible benefit to having it setup on your website.

The hardest part about making a profitable custom search is getting traffic to it. Often people add GCS to their website as an afterthought. Maybe they just feel like it is a cheap and easy way to provide search functionality and increased accessibility to their users. That is all well and good, but we aren’t just trying to make our site more accessible, we are trying to make some extra money, and this implementation of a Google search is rarely profitable because only a small percentage of your website’s visitors will ever use it.

As of this post, I am running a HUGE number of different GCS engines. I have a GCS engine for every single niche I am involved in. Every time I start a new niche, one of the first things I do after going online is create a GCS for it. I constantly maintain and update my GCS settings and configuration, adding and removing websites from each niche network. This is a lot of work, but depending on the subject matter of each niche, the profit returned from the GCS alone can rival the profit I make on the niche’s individual websites themselves.

But still, we face the same problem. My many GCS engines are useless and won’t make a cent unless I am providing them with a sufficient volume of visitors performing searches. I can’t depend on people to use my search engines just by visiting the website and then typing a query into a box. There are too many distractions on a website, too many places to click and things to see. Because of that, you will find that only a very small percentage of your website’s visitors ever actually use your GCS.

Rather, what we need is a way to take every one of our users interested in a given niche and funnel their attention directly towards using nothing else other than our niche’s GCS to find the answers they need. This is where things start to get interesting.

The secret here is thinking outside of the box. Sure lots of websites offer inline search functionality, but how about offering a search-based service outside of a proper website? How about getting people to search for answers in your niche without ever even visiting your website, and without ever even knowing your website exists in the first place?

What I am referring to here are desktop widget platforms like Google Desktop or (my favorite) Windows Vista Sidebar. The same goes for web-top widget platforms like Netvibes, Facebook and more. For the sake of this post, I am going to focus solely on Windows Vista Sidebar.

Out of curiosity, I installed Vista on a partition just to take a peek at it, and one of the first things I started messing around with was the sidebar widgets. They have made it incredibly simple for people to create and publish their own widgets to the Windows Live Gallery website for other people to download and use. Also, Windows Vista Sidebar widgets can be very lucrative since not many people are creating them yet! Now is a great time for you all to get a slice of this action before the whole world eventually ports to the Vista OS.

A Vista sidebar widget is made up of simple HTML and an XML file that acts in many ways like a PAD file does, inasmuch as it identifies your widget and its provenance. If you know how to copy-and-paste and know some basic HTML, then you can easily create a desktop widget.
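To give a feel for how little is involved, here is a minimal Python 3 sketch that writes the XML file and bundles it with the HTML. It assumes the manifest is named gadget.xml, that the element names below match the Sidebar manifest format (they are written from memory, so check them against Microsoft’s gadget documentation), and that a .gadget package is simply a zip archive. The widget name, author, and file names are placeholders.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_manifest(name, author, description, html_file="gadget.html"):
    """Return the manifest XML that identifies the widget and its author.

    Element and attribute names are an assumption about the Sidebar
    manifest format, not taken from the post.
    """
    gadget = ET.Element("gadget")
    ET.SubElement(gadget, "name").text = name
    ET.SubElement(gadget, "version").text = "1.0.0.0"
    ET.SubElement(gadget, "author", name=author)
    ET.SubElement(gadget, "description").text = description
    hosts = ET.SubElement(gadget, "hosts")
    host = ET.SubElement(hosts, "host", name="sidebar")
    # Points the sidebar at the HTML page that is the widget itself.
    ET.SubElement(host, "base", type="HTML", apiVersion="1.0.0", src=html_file)
    ET.SubElement(host, "permissions").text = "Full"
    ET.SubElement(host, "platform", minPlatformVersion="1.0")
    return ET.tostring(gadget, encoding="unicode")


def package_gadget(manifest_xml, html, html_file="gadget.html"):
    """Zip the manifest and the HTML page together and return the bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("gadget.xml", manifest_xml)
        archive.writestr(html_file, html)
    return buffer.getvalue()


if __name__ == "__main__":
    manifest = build_manifest("Dream Search", "Example Author", "Search dream meanings")
    data = package_gadget(manifest, "<html><body>search box goes here</body></html>")
    with open("DreamSearch.gadget", "wb") as handle:
        handle.write(data)
```

The HTML body is where the search box code would be pasted.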

So here is what I do:

Finding the proper niche is key to doing large search volume. Be SPECIFIC. For example, one of my highest-traffic search engines was one that enabled people to interpret their dreams.

Realizing that people love to know what their dreams mean, I did some brief research and compiled a database of dreams and their meanings. Then I created a website around that database and did all the usual SEO stuff on it, got it indexed and laid in some Adsense ads just for good measure.
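As a rough idea of what building a site around a database can look like, here is a small Python 3 sketch that turns each row of a table into its own static page. The table layout, the sample rows, and the page markup are all invented for illustration. A real database would have far more rows and a proper site design around them.

```python
import re
import sqlite3
from html import escape


def slugify(text):
    """Turn 'Losing Teeth' into 'losing-teeth' for use as a file name."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_sample_db():
    """Create an in-memory database with a hypothetical dreams table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dreams (symbol TEXT PRIMARY KEY, meaning TEXT)")
    conn.executemany(
        "INSERT INTO dreams VALUES (?, ?)",
        [
            ("Falling", "Placeholder meaning text."),
            ("Losing Teeth", "Placeholder meaning text."),
        ],
    )
    return conn


def render_pages(conn):
    """Return a dict mapping file name to page HTML, one page per row."""
    pages = {}
    query = "SELECT symbol, meaning FROM dreams ORDER BY symbol"
    for symbol, meaning in conn.execute(query):
        title = escape("Dreams about %s" % symbol)
        pages[slugify(symbol) + ".html"] = (
            "<html><head><title>%s</title></head>"
            "<body><h1>%s</h1><p>%s</p></body></html>"
            % (title, title, escape(meaning))
        )
    return pages


if __name__ == "__main__":
    for file_name, html in render_pages(build_sample_db()).items():
        with open(file_name, "w", encoding="utf-8") as handle:
            handle.write(html)
```

One page per entry gives the search engine something specific to index for every term people might look up.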

Next, I went to my Google account and set up a custom search engine. The setup interface allows you to target your search engine to a pre-defined list of websites or to have it search the entire web. For my purpose, I made the custom search engine return results from ONLY the website that I had made around the dreams database I compiled. That way, not only do I stand to profit from the ads displayed in the SERPs, but since all the SERP results are for pages on my site, I am funneling traffic into my site, which in turn shows more of my ads.

After going through all the setup steps, Google spat out the code for me to copy-and-paste into the HTML for my Sidebar Search Widget. I am not going to go into detail on how to create your widgets, as it is extremely easy to do and you will be able to figure it out with just a little research on your part.

So I designed my little search widget with a nice clean interface and a snappy title and then I published it to the Windows Live Gallery website. Within one week, over 1000 people had downloaded my widget to their desktop sidebar. Getting that kind of desktop real estate on people’s computer screens is something that most internet marketers would KILL for. My search box is on their screen every day and it continually sends their requests to my website.

The important part is repeating this process on many different subjects. The more subjects you cover, the more search volume you will pull. The beautiful thing here is the sky is really the limit. If you want to pull in that $50/day I’ve been talking about, then you better be prepared to create a GCS for every niche you are in. And you better be prepared to present each GCS in a different widget for each platform, i.e. create one widget for Netvibes, one for Vista, one for Google Desktop, etc. Once you get the hang of it, you will find that you can create a simple template for each platform and then just plug the different GCS code into each one. After a while, you’ll be creating new search widgets at an amazing pace.
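The template idea can be sketched in a few lines of Python 3. The two platform templates and the search code strings below are placeholders, not the real markup any platform requires. The real GCS code would be whatever Google hands you for each engine.

```python
from string import Template

# One hypothetical template per widget platform. $title and $search_code
# are filled in per niche.
PLATFORM_TEMPLATES = {
    "vista": Template(
        '<html><body class="sidebar"><h1>$title</h1>$search_code</body></html>'
    ),
    "netvibes": Template('<div class="widget"><h2>$title</h2>$search_code</div>'),
}


def build_widgets(niches):
    """Return one widget per (niche, platform) pair.

    `niches` maps a niche name to the search box code for that niche.
    """
    widgets = {}
    for niche, search_code in niches.items():
        for platform, template in PLATFORM_TEMPLATES.items():
            widgets[(niche, platform)] = template.substitute(
                title=niche + " Search", search_code=search_code
            )
    return widgets


if __name__ == "__main__":
    niches = {
        "Dream": "<!-- GCS code for the dreams engine -->",
        "Lyrics": "<!-- GCS code for the lyrics engine -->",
    }
    for (niche, platform), html in build_widgets(niches).items():
        print(niche, platform, html)
```

Adding a platform means adding one template, and adding a niche means adding one entry, which is what makes the pace pick up.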

Now, remember how I said that this post was going to build on the previous posts about Google Hacking? I wasn’t kidding.

Take a sec to peek through the Free Downloads section of the website and think about how you can build websites around the databases provided there. How about creating a database of Bible verses, and then creating a Bible Study search widget that funnels all searches into that website? How about creating a website of food and drink recipes and making it searchable via a desktop widget?

The databases provided give you an excellent leg-up in creating websites with large amounts of information that are perfectly appropriate for Google Custom Search widgets.

Here are some other ideas to get you started:

* Video Game Cheats Search Widget
* Myspace Layouts Search Widget
* Baby Names Search Widget
* Restaurants Search Widget
* Celebrity Gossip Search Widget
* Product Recalls Search Widget
* Lyrics Search Widget

In parting, here is some more food for thought for all you Affiliate Marketers:

Instead of just relying on AdSense for profit, how about a desktop search widget that returns Amazon Books with your Affiliate Link? This is where the real money is.

Before you comment below, I want to make a few things clear: I know that “Make XXX dollars a day” posts tend to be incendiary and/or contentious topics for people. The success of this technique, like every money making technique, depends on how much creativity you employ and how much volume you push. Further, the number “50” is completely arbitrary; it makes for a nice round number and a good title for a post. Truth is, if you are smart and play it right, there’s no reason you can’t pull in a lot more than $50.00/day (a lot more). Conversely, if you are half-assed about it, like with anything, you probably won’t do nearly as well. So if you are about to bitch and whine about not being able to pull $50/day (which is pretty easy to do), then take a step back and ask yourself if you really are doing everything right and pushing enough volume.

–Rob

———————————
Thanks for the great post Rob. He has some great articles and free stuff at his blog SEOcracy.com so be sure to check it out.
