<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.0.7" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: Complete Guide To Scraping Pt. 2 - Crawling</title>
	<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/</link>
	<description>Advanced SEO Tactics and Techniques</description>
	<pubDate>Mon, 15 Mar 2010 15:03:02 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.0.7</generator>

	<item>
		<title>by: Cubefield</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-415408</link>
		<pubDate>Sat, 23 Jan 2010 04:31:36 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-415408</guid>
					<description>I'm not big on CGI; could you possibly translate it into a PHP-compatible format? Thanks!</description>
		<content:encoded><![CDATA[<p>I&#8217;m not big on CGI; could you possibly translate it into a PHP-compatible format? Thanks!
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: free online pakistani chat room</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-384782</link>
		<pubDate>Mon, 14 Sep 2009 11:00:59 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-384782</guid>
					<description>Is there any further article coming on the same topic?</description>
		<content:encoded><![CDATA[<p>Is there any further article coming on the same topic?
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Tower</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-362199</link>
		<pubDate>Wed, 10 Jun 2009 11:56:08 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-362199</guid>
					<description>I have just read part 1 which was brilliant but part 2 was super, is there a part 3 to this?

I do hope so.</description>
		<content:encoded><![CDATA[<p>I have just read part 1 which was brilliant but part 2 was super, is there a part 3 to this?</p>
<p>I do hope so.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Rocket Spanish Reviews</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-335585</link>
		<pubDate>Wed, 18 Feb 2009 18:53:29 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-335585</guid>
					<description>Nice idea but I see too many ways for you to get in legal trouble with this method so I'm not sure I will try it.</description>
		<content:encoded><![CDATA[<p>Nice idea but I see too many ways for you to get in legal trouble with this method so I&#8217;m not sure I will try it.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: id_oNe</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-299633</link>
		<pubDate>Sun, 26 Oct 2008 23:39:54 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-299633</guid>
					<description>The HTML::LinkExtor link is invalid; the new link is

http://search.cpan.org/~gaas/HTML-Parser-3.56/lib/HTML/LinkExtor.pm 

Thanks, keep up the good work!</description>
		<content:encoded><![CDATA[<p>The HTML::LinkExtor link is invalid; the new link is</p>
<p><a href="http://search.cpan.org/~gaas/HTML-Parser-3.56/lib/HTML/LinkExtor.pm" rel="nofollow">http://search.cpan.org/~gaas/HTML-Parser-3.56/lib/HTML/LinkExtor.pm</a> </p>
<p>Thanks, keep up the good work!
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: download wii games</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-232108</link>
		<pubDate>Fri, 30 May 2008 15:15:53 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-232108</guid>
					<description>so joe cracker looks to have become the topic of the thread....was there even any answer to the use of content re-writers however??? something i would also like to use to 'flip' content pieces.</description>
		<content:encoded><![CDATA[<p>so joe cracker looks to have become the topic of the thread&#8230;.was there even any answer to the use of content re-writers however??? something i would also like to use to &#8216;flip&#8217; content pieces.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: BORAT</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-209467</link>
		<pubDate>Mon, 18 Feb 2008 21:05:35 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-209467</guid>
					<description>OOOOOOOOOOOOOOOOJ Joe Cracker
Do you remember me?
You were my driving instructor.
You said that woman must give me permission to have sexy time with me.
hahahahahaha what a nonsense:)</description>
		<content:encoded><![CDATA[<p>OOOOOOOOOOOOOOOOJ Joe Cracker<br />
Do you remember me?<br />
You were my driving instructor.<br />
You said that woman must give me permission to have sexy time with me.<br />
hahahahahaha what a nonsense:)
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: neil strauss</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-200950</link>
		<pubDate>Mon, 28 Jan 2008 14:18:37 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-200950</guid>
					<description>So Joe Cracker, would you please let us all know the technique and system you use in myspace to make so much money?  Please do so as we would all appreciate that.</description>
		<content:encoded><![CDATA[<p>So Joe Cracker, would you please let us all know the technique and system you use in myspace to make so much money?  Please do so as we would all appreciate that.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: I would</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-9542</link>
		<pubDate>Thu, 04 Jan 2007 20:01:15 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-9542</guid>
					<description>Joe Cracker is a noob and a half.</description>
		<content:encoded><![CDATA[<p>Joe Cracker is a noob and a half.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Eli</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-8753</link>
		<pubDate>Sun, 24 Dec 2006 04:11:39 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-8753</guid>
					<description>Welcome to &lt;b&gt;advanced SEO&lt;/b&gt; we take no prisoners :)</description>
		<content:encoded><![CDATA[<p>Welcome to <b>advanced SEO</b> we take no prisoners <img src='http://www.BlueHatSEO.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Matt Larson</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7376</link>
		<pubDate>Mon, 04 Dec 2006 08:37:07 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7376</guid>
					<description>Can't believe I'm backing up Joe Cracker, but black-hat doesn't mean illegal... it means "unethical". Now what is "unethical?" I suppose that depends on your own values and that of the industry to which you belong.

Conversely, laws are not issues of ethics. You break them and you pay one way or the other.

The government can make anyone's life miserable, so follow copyrights &#38; attribute sources. If it's wikipedia you're scraping or some other GNU or CC work, it's easy. For article sites, put a little more work into it and reference the source (author). Then scrape away!</description>
		<content:encoded><![CDATA[<p>Can&#8217;t believe I&#8217;m backing up Joe Cracker, but black-hat doesn&#8217;t mean illegal&#8230; it means &#8220;unethical&#8221;. Now what is &#8220;unethical?&#8221; I suppose that depends on your own values and that of the industry to which you belong.</p>
<p>Conversely, laws are not issues of ethics. You break them and you pay one way or the other.</p>
<p>The government can make anyone&#8217;s life miserable, so follow copyrights &amp; attribute sources. If it&#8217;s wikipedia you&#8217;re scraping or some other GNU or CC work, it&#8217;s easy. For article sites, put a little more work into it and reference the source (author). Then scrape away!
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: bad-ass-bob</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7180</link>
		<pubDate>Tue, 28 Nov 2006 00:10:37 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7180</guid>
					<description>Hey Joe Cracker-
Welcome to the dark side of the web. Black hat stuff it is. Does it involve unethical tactics? you bet your ass it does. If this bothers you then go play somewhere else cause we know what we are doing and we don't need you to stumble in and start blabbing to us about -hey do you realize this is copyright infringement? fuck yes we realize it but the money is too good to pass up. So as I said, find someone else to bother and get the fuck outa here. If you stick around we will turn you to the dark side. You been warned!!</description>
		<content:encoded><![CDATA[<p>Hey Joe Cracker-<br />
Welcome to the dark side of the web. Black hat stuff it is. Does it involve unethical tactics? you bet your ass it does. If this bothers you then go play somewhere else cause we know what we are doing and we don&#8217;t need you to stumble in and start blabbing to us about -hey do you realize this is copyright infringement? fuck yes we realize it but the money is too good to pass up. So as I said, find someone else to bother and get the fuck outa here. If you stick around we will turn you to the dark side. You been warned!!
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Joe Cracker</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7177</link>
		<pubDate>Mon, 27 Nov 2006 23:35:43 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7177</guid>
					<description>You ignored the issue regarding copyrights, Eli. I don't think publishers want you profiteering off their original work without permission. Scraping for content is stealing.

Also, Google has duplicate-content filters, so your scraped content probably won't rank too well.</description>
		<content:encoded><![CDATA[<p>You ignored the issue regarding copyrights, Eli. I don&#8217;t think publishers want you profiteering off their original work without permission. Scraping for content is stealing.</p>
<p>Also, Google has duplicate-content filters, so your scraped content probably won&#8217;t rank too well.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Seostomp</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7163</link>
		<pubDate>Mon, 27 Nov 2006 13:54:57 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7163</guid>
					<description>I use myspace for indexing purposes... used to monetize it but was asked by my affiliate managers to stop...  it works well for indexing though.</description>
		<content:encoded><![CDATA[<p>I use myspace for indexing purposes&#8230; used to monetize it but was asked by my affiliate managers to stop&#8230;  it works well for indexing though.
</p>
]]></content:encoded>
				</item>
	<item>
		<title>by: Aur</title>
		<link>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7151</link>
		<pubDate>Mon, 27 Nov 2006 10:48:24 +0000</pubDate>
		<guid>http://www.BlueHatSEO.com/complete-guide-to-scraping-pt-2-crawling/#comment-7151</guid>
					<description>Hey today I discovered that in my company there is a big printer that also does scanner-to-email... I can put a big pile of papers in the machine and it will scan everything in less than one minute. With some nice OCR, it would mean a lot of fresh content !

About scraping, indeed the idea is to generate thousands of pages that you can re-use into a website. Every page should have some advertisers in it (affiliates and/or adsense). The idea is to get A LOT of content (like 10000 pages), build a website, get it known by the Search engines (you can use Eli's QUIT tool :)), and wait until search engines discover that you just stole the content, and then they'll ban your site. It usually takes something between 1-3 months, and meanwhile you'll have earned money from your advertisers.
Then you repeat the whole procedure.

I'm quite new to the game of scraping, but I built one site that makes me between $5 and $10 a day with 10000 pages, so if you manage to automate things well enough, you may be able to generate enough sites to multiply your income!

And by the way, I'm working on a tool that may let you have a scraped site not being banned so quickly (maybe not at all!) I'm currently testing and refining it, more news about it later ;)!

About Joe Cracker, he's the kind of person who would like to be admired for what he does, so he will just boast and will not understand why people don't get impressed. On the other hand, someone like Eli just gives you real keys to progress, and he deserves to get some admiration! ;)
The myspace technique Joe's talking about is just the following: Use a myspace bot to add hundreds of friends. Then when you have friends, you can post a "bulletin", which is an announcement that will be seen by all your friends. This bulletin will make them go to some affiliate (like a dating service). Again, the numbers game is what matters: out of thousands of friends, most will ignore your bulletin, but some will not, and you will get money from affiliate commissions.
"clicking on one button" is what Joe meant : let your bot add friends, and then when you have enough friends, post a bulletin going to somewhere that will make you some money.
Then repeat as much as you can.</description>
		<content:encoded><![CDATA[<p>Hey today I discovered that in my company there is a big printer that also does scanner-to-email&#8230; I can put a big pile of papers in the machine and it will scan everything in less than one minute. With some nice OCR, it would mean a lot of fresh content !</p>
<p>About scraping, indeed the idea is to generate thousands of pages that you can re-use into a website. Every page should have some advertisers in it (affiliates and/or adsense). The idea is to get A LOT of content (like 10000 pages), build a website, get it known by the Search engines (you can use Eli&#8217;s QUIT tool <img src='http://www.BlueHatSEO.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> ), and wait until search engines discover that you just stole the content, and then they&#8217;ll ban your site. It usually takes something between 1-3 months, and meanwhile you&#8217;ll have earned money from your advertisers.<br />
Then you repeat the whole procedure.</p>
<p>I&#8217;m quite new to the game of scraping, but I built one site that makes me between $5 and $10 a day with 10000 pages, so if you manage to automate things well enough, you may be able to generate enough sites to multiply your income!</p>
<p>And by the way, I&#8217;m working on a tool that may let you have a scraped site not being banned so quickly (maybe not at all!) I&#8217;m currently testing and refining it, more news about it later <img src='http://www.BlueHatSEO.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> !</p>
<p>About Joe Cracker, he&#8217;s the kind of person who would like to be admired for what he does, so he will just boast and will not understand why people don&#8217;t get impressed. On the other hand, someone like Eli just gives you real keys to progress, and he deserves to get some admiration! <img src='http://www.BlueHatSEO.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /><br />
The myspace technique Joe&#8217;s talking about is just the following: Use a myspace bot to add hundreds of friends. Then when you have friends, you can post a &#8220;bulletin&#8221;, which is an announcement that will be seen by all your friends. This bulletin will make them go to some affiliate (like a dating service). Again, the numbers game is what matters: out of thousands of friends, most will ignore your bulletin, but some will not, and you will get money from affiliate commissions.<br />
&#8220;clicking on one button&#8221; is what Joe meant : let your bot add friends, and then when you have enough friends, post a bulletin going to somewhere that will make you some money.<br />
Then repeat as much as you can.
</p>
]]></content:encoded>
				</item>
</channel>
</rss>
