100’s Of Links/Hour Automated - Introduction To Black Hole SEO
I really am holding a glass of Guinness right now so in all the authority it holds…Cheers! I’m kind of excited about this post because frankly it’s been a long time coming. For the last 7-9 months or so I’ve been hinting and hinting that there is more to Black Hat than people are willing to talk about. As “swell” as IP delivery and blog spam are, there’s an awesome subculture of Black Hats that takes the rabbit hole quite a bit deeper than you can probably imagine. This is called Black Hole SEO. By no means am I an expert on it, but over the last few years I’ve been getting in quite a bit of practice and starting to really kick some ass with it. In short, Black Hole SEO is the deeper, darker version of black hat. It’s the kind of stuff that makes those pioneering Black Hat Bloggers who divulge secrets like parasite hosting and link injection techniques look like pussies. Without getting into straight-up hacking, it’s the stuff black hatters dream about pulling off, and I am strangely comfortable with kicking in some doors on the subject. However, let’s start small and simple for now. Then, if it takes well, we’ll work our way up to some shit that’ll just make you laugh, it’s so off the wall. Admit it, at one point you didn’t even think Advanced SEO existed.
In my White & Black Hat Parable post I subtly introduced this technique as well as the whole Black Hole SEO concept. It doesn’t really have a name, but basically it follows all the rules of Black Hole SEO. It targets sites on a mass scale, particularly scraper sites. It tricks them into giving you legitimate and targeted links, and it grabs its content on an authoritative scale (will be explained in a later related post). So let’s begin our Black Hole SEO lesson by learning how to grab hundreds of links an hour in a completely automated and consenting method.
Objective
We will attempt to get black hat or scraper sites to mass grab our generated content and link to us. It’ll target just about every RSS scraper site out there, including Blog Solution and RSSGM installs as well as many private scrapers and Splogs.
Methodology
1) First we’ll look at niche and target sources. Everyone knows the top technique for an RSS scraper is the classic Blog N’ Ping method. It’s basically where you create a scraped blog post from a search made on a popular Blog Aggregator like Google Blog Search or Yahoo Blog Search. The scraper then pings popular blog update services to get the post indexed by the engines. For a solid list of these check out PingOMatic.com. Something to chew on: how many of you actually go to Weblogs.com to look for new interesting blog posts? Haha yeah, that’s what I thought. 90% of the posts there are pinged from spam RSS scraper blogs. On top of that there are hundreds going in an hour. Kinda funny, but a great place to find targets for our link injections nonetheless.
2) We’ll take Weblogs.com as an example. We know that at least 90% of those updates will be from RSS scrapers that will eventually update and grab more RSS content based upon their specified keywords. We know that the posts they make already contain the keywords they are looking for, otherwise they wouldn’t have scraped them in the first place. We also have a good idea of where they are getting their RSS content. So all we’ve got to do is find what they want, where they are getting it from, change it up to benefit us, and give it back.
3) Write a simple script to scrape all the post titles within the <td class="blogname"> cells located between the <!-- START - WEBLOGS PING ROLLER --> comments within the HTML. Once you’ve got a list of all the titles, store it in a database and keep doing it indefinitely. Check for duplicates and continuously remove them.
4) Once you got all the titles steadily coming in write a small script on your site that outputs the titles into a rolling XML feed. I know I’m going to get questions about what a “rolling XML feed” is so I’ll just go ahead and answer them. It’s nothing more than an xml feed that basically updates in real time. You just keep adding posts to it as they come in and removing the previous ones. If the delay is too heavy you can always either make the feed larger (up to about 100 posts is usually fine) or you can create multiple XML feeds to accommodate the inevitably tremendous volume. I personally like the multiple feed idea.
5) Give each post within the feed the same title as you scraped from Weblogs. Then change the URL output field to your website address. Not the original! Haha that would do no good obviously. Then create a nice little sales post for your site. Don’t forget to include some html links inside your post content just in case their software forgets to remove it.
6) Ping a bunch of popular RSS blog search sites. The top 3 you should go for are:
Google Blog Search
Yahoo News Search
Daypop RSS Search
This will republish your changed up content so the RSS scrapers and all the sites you scraped the titles from will grab and republish your content once again. However, this time with your link. This won’t have any effect on legitimate sites or services so there really are no worries. Fair warning: be sure to make the link you want to inject into all these Splogs and scraped sites a variable you can change and update quickly, because this will gain you links VERY quickly. Let’s just say I wasn’t exaggerating in the title.
A good idea would be to put the link in the database, and every time the XML publishing script loops through have it query it from the database. That way you can change it on the fly as it continuously runs.
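The “rolling XML feed” from step 4 and the look-it-up-every-time link described just above can be sketched roughly like this. It is only a minimal illustration in Python, not the script I actually run: the class name `RollingFeed`, the `get_link` callback (standing in for the database query) and the channel text are all made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import deque


class RollingFeed:
    """Keep only the newest `max_items` titles and render them as RSS 2.0."""

    def __init__(self, get_link, max_items=100):
        # get_link is a hypothetical callback (e.g. a database lookup);
        # it is called on every render so the link can change on the fly.
        self.get_link = get_link
        self.items = deque(maxlen=max_items)  # oldest entries fall off
        self.seen = set()                     # titles already queued once

    def add(self, title):
        """Queue a title; return False for blanks and duplicates."""
        title = title.strip()
        if not title or title in self.seen:
            return False
        self.seen.add(title)
        self.items.append(title)
        return True

    def render(self, body):
        """Build the feed, newest entry first, using the current link."""
        link = self.get_link()
        rss = ET.Element("rss", version="2.0")
        channel = ET.SubElement(rss, "channel")
        ET.SubElement(channel, "title").text = "Rolling feed"
        ET.SubElement(channel, "link").text = link
        ET.SubElement(channel, "description").text = "Most recent entries"
        for title in reversed(self.items):
            item = ET.SubElement(channel, "item")
            ET.SubElement(item, "title").text = title
            ET.SubElement(item, "link").text = link
            ET.SubElement(item, "description").text = body
        return ET.tostring(rss, encoding="unicode")
```

Because `deque(maxlen=...)` silently drops the oldest entry, making the feed “larger” is just a matter of raising `max_items`, and running several feeds is just several `RollingFeed` objects.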
As you’ve probably started to realize this technique doesn’t just stop at gaining links quickly, it’s also a VERY powerful affiliate marketing tool. I started playing around with this technique before last June and it still works amazingly. The switch to direct affiliate marketing is easy. Instead of putting in your URL, grab related affiliate offers and once you got a big enough list start matching for related keywords before you republish the XML feed. If a match is made, put in the affiliate link instead of your link and instead of the bullshit post content put in a quick prewritten sales post for that particular offer. The Black Hat sites will work hard to drive the traffic to the post and rank for the terms and you’ll be the one to benefit.
Each individual site may not give you much, but when you scale it to several thousand sites a day it starts really adding up quickly. By quickly I mean watch out. By no means is that a joke. It is quick. There are more RSS scraped pages and sites that go up every day than any of us could possibly monetize, no matter how fast you think your servers are.
This made my frickin’ head explode.
I can see I am going to be writing some new code in the wee hours of the morning tomorrow…
can you give us the code?
This sounds great
hi eli,
how do you do your nr3 script?
Is it enough to change the positions of the words, or should i kill some letter out of the title
like “this title is best” kill the “t”
and results in “his ile is bes” ?
nah, keep the exact same titles. You’re not stealing anyone’s content. You are more than welcome to have the same post title. The only ppl this technique affects are RSS scraper sites so you can’t go too incredibly wrong with it.
Uff, for a noob this is too complicated. Can someone write a script which does all those steps automatically, along with a visual interface guiding you through it?
I didn’t understand half of what you posted *lol*
I’d like someone to write a script for me that does this, then install it on my website for me.
Then I’d like that person to come over and wipe my ass for me: two wipes in a clockwise direction.
Gimme a break guys! Start learning how to do this simple stuff yourself and stop depending on the charity of others!
“Then I’d like that person to come over and wipe my ass for me: two wipes in a clockwise direction.”
LOL, Lol, lol….
btw eli: is this a way how i should promote my subdomains you wrote me via email?
Nice post.
This is a really stupid question but what language does the script need to be in?
you can do this with php,perl,ruby,python and more. If you are a forever noob like me, I would vote for PHP, because php.net is an incredible resource for gluing this sort of thing together. I think the way PHP handles things like dealing with mysql is easy to understand, and i think it will give you more possibilities for extension of these ideas by incorporating them into some of the wide variety of php website scripts.
php5 would be excellent choice!
Guys,
No one is going to hand over this script to you. This is valuable stuff. If you don’t know how to code it, pay someone to do it for you. But you can’t expect everything to be handed to you on a silver platter.
I actually have been doing this. The results are just as you described it. I didn’t know it had anything black hat in it, tho
Haha yeah, I wouldn’t call it very blackhat. After all, you are basically targeting spammers, so who cares?
Eli…when you did this with affiliate links, what were the profits like?
Georgi do you do any seo consultancy work or coding.
Anyone want to sell me a working script?
Hey Eli,
Couple of clarifications. What are you feeding the scrapers for content? I know you are giving them the weblogs titles, but where is the actual content they are scraping off us coming from ? Are you saying one sales copy for all posts, just with different titles?
Also, how many of these scrapers actually link back to the original article.. I don’t see that too often.
Does someone know the regexpression to scrape from weblogs.com? I don’t get it, the multiple lines are killing my tries…
Just str_replace the newlines out before doing the regexp. That’s what I always do.
http://www.ilovejackdaniels.com/regular_expressions_cheat_sheet.png
Look at pattern modifiers. /i and /m are your friend. I like to use /ims at the end of each one.
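For anyone stuck on the multi-line problem, here is a rough Python equivalent of the modifier advice above. It is only a sketch: the `<td class="blogname">` cell wrapping a link is an assumption about the page markup, and `extract_names` is a made-up helper name. In Python the flag that lets `.` run across newlines is `re.DOTALL` (the `s` in `/ims`), so there is no need to strip the newlines first.

```python
import re

# Assumed markup: <td class="blogname"><a href="...">Name</a></td>,
# possibly broken across several lines.
ROW = re.compile(
    r'<td\s+class="blogname"[^>]*>\s*<a[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,  # case-insensitive, "." also matches newlines
)


def extract_names(html):
    """Return every matched name with internal whitespace collapsed."""
    return [" ".join(match.split()) for match in ROW.findall(html)]
```

The `(.*?)` is deliberately non-greedy so each match stops at the first closing `</a>` instead of swallowing the rest of the page.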
Very nice link man.
He’s back!
If i understand correctly, the idea is to create a RSS feed in xml for a non-existing blog, but that will contain instead keyword-targeted text with links to your affiliates links?
Then ping everything to get your feed scraped and used on blackhat sites, thus having your content published by blackhat people?
Writing a dynamic RSS feed is a piece of cake, if you need to get started you can look at the tutorial:
http://www.icemelon.com/tutorials.php?id=3&/PHP/Generate%20RSS%20Feeds/
Then we go over a technique called link baiting. Hey guys! The script for this is at my blog!!
Very nice tutorial Eli.. Maybe you should just hint at stuff from now on because I am sure this is going to get quite a rush in the next few days. Actually though, I don’t know how many people will go through with the program.
I think that’s the case with a lot of what Eli posts… very cool ideas that LOTS of people get all hot and bothered about… I suspect that very few readers actually get to the stage of coding and launching many of these ideas though for one simple fact… some effort is required.
That’s great news though, because the few of us that are actually trying and expanding on the ideas Eli is giving us will have less competition
Thanks again Eli for another awesome post!
Holy Cow Eli - my head just about exploded when I read this post. I love the idea - had wondered how this worked when you mentioned it in your last post. I’m definitely going to have to get someone to design something like this for me (I’m a lover not a coder). Anyone on here who intends to make a copy of this, holler at me with a quote for a price, I’ll make it worth your while
One meeeeeelion dollars!
SOLD!
Nah - not really - not got pockets THAT deep
I’ll let you guys know when I have my scripts written and working.
Now write me a script!
Tyler, where is the script on your site?
I couldn’t find it.
Thanks.
It was never there.. It was a joke.
By the way everyone.. From my calculation there are about 30 blog posts/second.. This is going to destroy my server lol..
I’m going to have to agree with most everyone on here.
If you didn’t entirely understand the post then don’t attempt it. There will be more posts in the future with lots of fun stuff you can try out. Just let this one go until you’re ready for it.
People with the regex question: it depends on what language you’re using, but you will have to do a multiline match as well as accept multiple matches (usually /m and /g) and put those matches into either an array or a scalar.
Guiness on Cinco De Mayo… “BRILLIANT!”… muchas gracias for another beauty, E
Regex isn’t the only way. If you can’t get your head around regex then you can use PHP (VERY n00b friendly) with some str_replace and substr calls, along with some while loops to grab any content you want. Regex is a faster solution, but it’s not the only solution… and not always better for performance.
Don’t use regex to parse html/xml, it’s way too hard and breaks all the time.
Use python with BeautifulSoup, it’s amazingly easy and it works. Now with that said, wow, this is such a cute fun project that it can be done in a simple shell script using the usual suspects, grep/sed/curl/lynx/etc
To get your targets, why not try
grep '.xml"' shortChanges.xml | \
grep -i "\(xbox\|game\|wii\|psp\)" | \
sed 's; url=".*" ; url="http://my1337.com/rss.xml" ;' | \
sort | \
uniq
As far as the weblogs example goes you can use their changes log. It’s about 2mb for every 5 minutes.
http://rpc.weblogs.com/shortChanges.xml
I shouldn’t have to say this, but Right click, Save Target As (Save Link As in FF).
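If you would rather not grep XML, the filter-and-dedupe part of that pipeline can be sketched with Python’s standard XML parser. Treat this as an illustration only: it assumes the changes file is a list of `<weblog name="..." url="..."/>` elements, and `pick_targets` is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def pick_targets(changes_xml, keywords):
    """Return unique (name, url) pairs whose name or URL mentions a keyword."""
    root = ET.fromstring(changes_xml)
    wanted = [k.lower() for k in keywords]
    seen = set()
    targets = []
    # Assumption: each update is a <weblog> element with name/url attributes.
    for weblog in root.iter("weblog"):
        name = weblog.get("name", "")
        url = weblog.get("url", "")
        haystack = (name + " " + url).lower()
        if url and url not in seen and any(k in haystack for k in wanted):
            seen.add(url)  # the same blog often pings more than once
            targets.append((name, url))
    return targets
```

Unlike `sort | uniq`, this keeps entries in the order they appeared in the file, which is handy if you want the most recent pings first.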
awesome eli, thanks for this
Thanks for sharing this method, I’m excited to try it out. One question, for # 6) “Ping a bunch of popular RSS blog search sites.” How often do you ping, once an hour?
ping once per group of posts. So if you got an xml feed with 100 posts make one ping to each of the services. Then repopulate the xml feed with another 100 posts and reping again.
Do you use any specific software to do this? (the rss/ping stuff) or is it custom code?
Come on Eli, you always tend to post this complex techy stuff which I mess up. I get excited at the idea of getting more traffic and more money and finally end up getting nothing as I don’t do it right. Can we have a non techy post for all us non techy people who want to get traffic?
all your base are belong to us
Sorry buddy I don’t know what to tell you.
I post the simple ones every once in a while but they quickly get eaten up just because everyone is capable of doing them so quickly. Then everyone complains about not getting the info in time and the idea becoming quickly saturated. It’s a bummer stigma, I realize. What would you like to hear about? Any specific topics you would particularly like to read about? You know I’m always open.
Before every post I usually take a moment to think about what I would like to read about. That usually becomes my post topic. Then I get to read further about it in the comments and learn even more. I know that sounds unfair but it’s sorta my balancing factor.
On that, this idea is actually very simple, it’s just that I use a lot of jargon and don’t bother explaining any of it. It’s not that I’m unsympathetic to the people who aren’t familiar with the jargon, it’s just that this is after all an “advanced SEO” venue. So I can either explain every single thing on every single post (my posts are already quite lengthy in case you didn’t notice) or I can just cut to the meat of it.
Thanks Eli, that was fun! Much easier than it looked. Wait and see what happens next, going to have to figure out how to automate the pings. Right now have it set up to randomly (my personal favorite) pull titles, and tie it to a domain. May have to cache it, if it becomes a burden on the server.
that was the most confusing thing i’ve ever read.
Any particular step that’s causing this confusion? Perhaps I can help explain it further.
Hi everybody. Hi Eli.
I’ve just grabbed and analyzed a bunch of titles of recently updated blogs. I got it from
http://blogsearch.google.com/changes.xml?last=60
Below is what I got:
BLOG_TITLES_START
weston
Database for Research Grants and Contracts
justin
http://denshi.hitchart.com/u.r/denshi/RQ2
My Wheels
weston
Real estate exam maryland
Jason Bartholme's SEO Blog
х_+ф║
That is only the first level of scraping. You will need to further scrape those RSS feeds for the actual post titles and contents. We scrape the RSS aggregator only for the list of RSS feeds. Then we scrape those feeds for the actual post titles and contents that we want. Re-package them (retain post titles, random post contents, replace the link portion) as your own feed. Lastly, ping your feed back to the RSS aggregator.
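The second-level step, pulling the post titles out of one of those feeds, might look something like this in Python. It is a minimal sketch that assumes a plain RSS 2.0 feed with un-namespaced `<item>` elements (Atom feeds use namespaced `<entry>` elements and would need different handling); `post_titles` is a made-up helper name.

```python
import xml.etree.ElementTree as ET


def post_titles(rss_xml):
    """Return the non-empty <title> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    titles = []
    for item in root.iter("item"):
        # findtext returns None when the item has no <title> child
        title = (item.findtext("title") or "").strip()
        if title:
            titles.append(title)
    return titles
```

Only `<item>` titles are collected, so the channel-level `<title>` (the blog name, which is all the first-level list gives you) is skipped.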
Hi Eli,
What format are your feeds in? RSS, RSS2, Atom?
It’s a great idea, but the methodology went over my head. Can you provide us the script and more detailed examples??
Please please please
Eli:
I found a much easier way of doing this which I won’t post here. E-mail heading your way in a few minutes
Jason
Does the weblog url we ping with have to be where the “post” is stored? Or can we just make the post url be http://www.sitewewanttopromote.com/?
Hi Eli,
I would like to read about promoting your site through social networking sites. How exactly to go about it. You have covered this in various posts as tidbits but I would like a single big easy to understand one. In a non-technical way of course
I have begun working on a script for this, I will sell copies when I have finished.
John,
I am emailing you from your post at Blue Hat SEO.
You mention that when you are finished building your code that you would be willing to sell a copy.
I’m interested in being put on a List — when you email me if you use “Blue Hat SEO” somewhere in the title — I will be able to find it and respond quickly.
Warmest Regards,
Jeff — 21world@bellsouth.net
Hi John
Im interested in the script too
Good if you would put me on your list and email your paypal details
Kind regards
Mark
Eli:
I just sent you that e-mail
Jason
Great post! I’ve yet to come across another blogger who’s giving away tricks and tips like these
Anyway, I was wondering: at step 3 you’re talking about grabbing and storing all the post Titles from weblogs.com. Maybe I’m missing something here, but doesn’t weblogs.com only list the general -blog- titles?
I was wondering the same thing. I don’t see how just re-feeding the blog titles does any good.
Those are not the titles you wanted. You need to further scrape the RSS feeds to get the actual post titles and contents.
Thanks for such a valuable resource Eli. This blog just gets better and better.
One question: Maybe I’m missing something here, but could a “poor man’s” way to do this be just to download http://rpc.weblogs.com/shortChanges.xml, find and replace everything inside the url="" attribute and ping all the above mentioned blog resources?
Perhaps I’m missing where the actual blog post comes from.
thx.
There will be a follow up post to this whole thing by the end of the day.
Well, Weblogs.com is now worthless. Everyone and their brother will be using those titles. Time to figure out how to get the info from the other ping services.
My big thing is I’m trying to figure out how to filter out crazy titles that are not in English.
If you want to get really fancy and even targeted, write a way to categorize the post titles and then target links / title that are relevant.
By the way. I’ll just come right out and say it. In that post I made a blatant hint to the next Black Hole SEO technique when I mentioned authoritative content. So if you want a head start on everyone you might want to start thinking about how someone like me would go about getting unique content that has been proven to be authoritative in the search engines.
I think I know what you mean, and I borrowed some code from a “website generator” to help with this process. Let’s just say the posts read a little “differently” now.
Of course, I used less authoritative content…
Alright, here is the moment that you have all been waiting for
I have released my automated version of this to the public on my blog. Here is the link:
#EDIT BY ELI: LINK REMOVED UNTIL I SEE A COPY.
Jason
Hi Eli,
is it a good idea to combine this method with your network idea?
like one script for collection all titles and for each network site a ping script with their own rss feed
Not especially. That wouldn’t do a whole ton of good. However, it would be an awesome idea to combine this method with my Link Laundering Sites technique. Also, a little birdy told me this works great as an ultimate solution for “Power Indexing.”
Wow, Wonder if i can max out my two clustered dual cpu iis boxes….. Gotta love dedicated hardware.
Great Post - keep it coming! Your posts really help me to think outside of the box, and to focus on ways to up my game.
Black hole SEO
Following up on yet another silly phrase made up by Mr Bluehat, I’ll tell you how to do black hole SEO in another way. You know what else people scrape a lot? Search engine results.
So how would you go about making people scrape SERPS that inclu…
That reminds me, there’s a search engine API that I was scraping for extra content on an experimental site and the funny thing was after 1.5 months my scraped pages started showing up at the top of the results in that engine and my own pages were showing up in the results I was scraping.
Hey Eli,
Nice post! Some questions thinking about Google and contents being related.
As always if I am wrong or not getting it right, let me know please.
1 - Wouldn´t you filter those titles in order to make them related to the content you will use in your XML feed? (Relevancy)
Would be great to parse having in mind the niche you want to target (this would imply we will be using multiple resources to scrape titles and get some matches related to our content). This way, you get links from scraper-made websites targeting specific niches.
2 - “The switch to direct affiliate marketing is easy. Instead of putting in your URL, grab related affiliate offers and once you got a big enough list start matching for related keywords before you republish the XML feed.” (Quality)
We talked a bit about this on your last post where you introduced the idea. It seems we will get tons of incoming links fast but these are splogs as you said (poor quality). Seems ideal for heavy link spamming processes and short term affiliate revenue.
Not that much for websites with long term aspirations.
In your last post, you mentioned that websites, blogs, etc could scrape contents from white hats defending their position against BH (including links to their websites) or unwillingly insert links to banned domains, thus decreasing their value as a source of incoming links for the BH webmaster (as a matter of fact this was the only part of your post I had some doubts about, because since white hatters are getting links from there also…aren’t they harming themselves at the same time? Why not just insert those banned domains and let BH get horrible links that harm their rankings solely?)
As you can see, my doubt is always revolving around the benefits of these fast link building schemes when it comes to SEO projects. Link velocity will be great but what about results in the long run. What will prevail? What do you think based on your experience?
Keep it up!:)
Nick
Heya Nick,
Great comment/questions as always. I’m glad you post them as comments so everyone can read them.
For your first question, I build links for volume. I build links for relevancy. Just by personal policy I never mix the two. Simply because, obviously, whenever you try to do both, one, the other, or both will naturally suffer. So I always try to do each to its maximum capacity in a separate manner. It hasn’t failed me yet.
However in this case, if you were to do the affiliate offer rather than going for the straight link building, then definitely. A reader already commented on this on the follow up post. He solved the targeted traffic problem by gathering tons of affiliate offers, making a solid list of keywords for each one, then attempting to match each possible title with an affiliate offer. I would stretch that one step further. I would put a priority on the affiliate offers. So if a possible match could be made, then I would insert the affiliate link instead. If no match could be made, then I would use the title to get an inbound link. Kind of like a Link Laundering technique on steroids.
I think I just accidentally answered your second question. Indeed we did talk about it. With the same intent as before, I would use these links for traffic, or use them for SEO purposes. Mixing them is fine, but do it tactfully and, like you said, targeted. These are not the highest quality links in the world but many will have solid link authority because the owner may drive some massive amounts of deep links to his pages through link bombing practices. More often than not, these link bombing techniques will involve gathering relevant links. So even though his page may not be relevant to your site, it can still pass good authority, thus boosting your rankings.
So don’t focus on being worried about building links too quickly. It seems logical that search engines would think about this and consider it a bad thing, but it’s simply impossible to apply without drastic consequences. Take for instance the presidential race going on right now. Each candidate has a brand new website. Instantly, overnight, they all got absolutely massive amounts of links, most completely irrelevant and from random blogs and sites that have nothing to do with their subject. Can you see a single site that doesn’t rank for its terms? Even the celebrity candidates toppled everyone who’s been in long standing already. Gaining links too quickly is never as much of a problem as gaining links too slowly. Although if you are the type that spends 7 hours a day hitting refresh on the results page for your terms, then sanity may be a factor.
I think it’s a damn shame that people here are yet to truly understand and realize the power of my old Synergy Links post. If you completely ignore the entire technique itself and strictly comprehend that it is entirely possible to change the relevancy value of a group of inbound links into a high quality and relevant link, the SEO world is your oyster.
Here’s how you can create hyperlinked keywords from the short change weblogs file on the command line.
wget http://rpc.weblogs.com/shortChanges.xml
cat shortChanges.xml | grep "weblog name" | perl -ne '/"(.*?)"/;print "$1\n";'
and the result …
Dear God Part Two
My blog 710
trip, etc.
Demetrius
Dave
Thanks dave. That’s perfect.
What exactly is that cat going to do for you ?
/me spent waaaay too many hours on #sed,#awk and #grep
;)
mmh how would you filter out titles that look like
跳呀跳的,我è¦�è·³é€²ä½ å¿ƒè£¡ or something?
i think titles like that would not be of any use..
use utf8_decode and then filter titles that contain 3 or more ? characters.
This eliminates 90% of the titles like that.
thanks, works nicely!
I use this to filter out those characters:
$val = iconv("UTF-8","UTF-8//IGNORE",$val);
it does the trick.
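For anyone not on PHP, both filtering tricks above can be approximated in Python. This is only a rough sketch: `looks_foreign` mimics the “utf8_decode, then count the ? characters” idea by counting characters that do not fit in Latin-1, and `drop_invalid_utf8` mimics the iconv `//IGNORE` call. Both function names and the threshold of 3 are illustrative choices, not anything standard.

```python
def looks_foreign(title, limit=3):
    """True if `limit` or more characters cannot be represented in Latin-1."""
    replaced = title.encode("latin-1", errors="replace")
    # Unencodable characters become b"?", so subtract the question marks
    # that were already in the title before counting.
    return replaced.count(b"?") - title.count("?") >= limit


def drop_invalid_utf8(raw):
    """Decode bytes as UTF-8, silently discarding invalid byte sequences."""
    return raw.decode("utf-8", errors="ignore")
```

Accented Western titles pass the check because their characters exist in Latin-1; titles written mostly in other scripts do not.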