Where’s the harm in any of that? After all – imitation is the sincerest form of flattery, right?
Actually, we have another name for this sort of behavior. It’s not called flattery – it’s called plagiarism. And it is harmful – at the very least, it causes harm to the creator of the original work.
A quick definition of plagiarism – the “use, or close imitation, of the language and thoughts of another author and the representation of them as one’s own original work.” (1995 Random House Compact Unabridged Dictionary)
Note to scrapers: when you quote directly from a publication, you are required to use attribution.
For the most part, people believe that plagiarism requires copying a writer’s words, word for word. In fact, presenting an author’s idea as your own, no matter the words used to describe the idea, also constitutes plagiarism.
In the online world, plagiarism, more commonly called content scraping, is rampant – essentially out of control. Over the course of the last six months my personal site, the one you’re reading, has been scraped relentlessly by content aggregators, individual site owners, and other less savory characters.
My experience with scrapers is not unusual, and it has been duplicated by virtually every site owner I know.
Currently, there is a small group of bloggers, myself included, who are being scraped by someone who goes by various names, including Adrian LaSalle, Chak Fung Pang, Sean Myers, and, I’m convinced, a multitude of others.
Some of the web sites he has set up in the last few days, or weeks, include Tech and App Info, DigiShark, Scam or Not (I had to laugh at that one), and others, I’m sure, that have yet to be discovered. Adrian et al., when approached with evidence of his scraping, first apologizes profusely and then continues to scrape.
The objective behind this type of attack is the hacker’s need to increase his site’s reputation on Google, and other search engines, by fraudulently increasing the site’s hits (in my view, these people are no better than common hackers). This can lead to an increase in revenue generated by that site.
These types of exploits don’t impact me financially, since I don’t run ads on my sites. Nevertheless, it’s always disappointing to know that cyber-criminals are potentially benefiting economically from the results of my efforts. Yes Adrian, or whoever you really are, you are a thief.
Google does a relatively good job of weeding out duplicate, or close to duplicate, content. In fact, Google does a great job of culling the cheaters, by shipping their copied content to the far depths of search results.
In the short run then, criminals who scrape content may benefit marginally; but in the long run, scraping is simply a loser’s game, perpetrated by losers.
In my experience, there is no absolute method available to stop scraping, or any other form of plagiarism, on the Internet. Unless one has very deep pockets and is prepared to institute legal action to recover damages (which must be proven – no easy task), the only alternative, in my view, is to name/shame these jerks who engage in this activity. Note to Adrian: I’m already closing in on who you really are.
There are a number of free services that offer some form of protection; the one that I currently use is My Free Copyright.
From the My Free Copyright site:
How does the MyFreeCopyright.com process work?
MyFreeCopyright.com provides a third-party, non-repudiation, registered dating of your original digital creation. By using this service, you publicly associate your digital copyright and defined rights to you.
So, how does MyFreeCopyright.com date register my copyright?
Every digital file has a unique makeup of bits and bytes which is its fingerprint. MyFreeCopyright.com captures your original creation’s fingerprint, stores the fingerprint in a database and sends a copy of the fingerprint to you in an email. The email contains the verified date; the fingerprint verifies the digital creation, and your email address verifies it belongs to you.
The following graphic illustrates how this works.
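For readers who like to see the idea in code, here is a minimal sketch of the fingerprint concept described in the quoted passage. It is not MyFreeCopyright.com's actual implementation: the site does not say which algorithm it uses, so SHA-256 is my assumption, and the function names, the record layout, and the example email address are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(content: bytes) -> str:
    """Return a hex digest that serves as the content's fingerprint.

    SHA-256 is an assumption; the quoted service does not name its algorithm.
    """
    return hashlib.sha256(content).hexdigest()


def register(content: bytes, email: str) -> dict:
    """Build a hypothetical registration record: fingerprint, owner, date."""
    return {
        "fingerprint": fingerprint(content),
        "email": email,
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }


def matches(content: bytes, record: dict) -> bool:
    """Check whether the content is byte-for-byte what was registered."""
    return fingerprint(content) == record["fingerprint"]


if __name__ == "__main__":
    post = "My original blog post.".encode("utf-8")
    record = register(post, "author@example.com")  # hypothetical address
    print(matches(post, record))                   # identical bytes
    print(matches(post + b" (edited)", record))    # any change breaks the match
```

A fingerprint like this only proves that a particular file existed, unchanged, on the recorded date. A scraper who rewords a post will produce a different fingerprint, so this kind of record is evidence of when you published, not a tool for detecting copies.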
If you have been, or are being, scraped, I’d be very glad to hear from you by way of comment.