<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>var/log &#187; picture</title>
	<atom:link href="http://www.varslashlog.com/tag/picture/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.varslashlog.com</link>
	<description>Yet another weblog</description>
	<lastBuildDate>Sat, 12 Sep 2009 13:34:27 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How to stop spam without captcha or javascript</title>
		<link>http://www.varslashlog.com/2009/06/03/how-to-stop-spam-without-pictures-or-javascript/</link>
		<comments>http://www.varslashlog.com/2009/06/03/how-to-stop-spam-without-pictures-or-javascript/#comments</comments>
		<pubDate>Wed, 03 Jun 2009 00:15:46 +0000</pubDate>
		<dc:creator>AHSauge</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Antispam]]></category>
		<category><![CDATA[captcha]]></category>
		<category><![CDATA[Header]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[picture]]></category>
		<category><![CDATA[spam]]></category>
		<category><![CDATA[spambot]]></category>

		<guid isPermaLink="false">http://www.varslashlog.com/?p=169</guid>
		<description><![CDATA[This time I&#8217;m going to make a really bold statement. I believe it&#8217;s currently possible to fight spam on the web without pictures, javascript or anything else that might compromise usability. Sounds impossible and too good to be true, right? Well, it might not be as far-fetched as it seems. Please read [...]]]></description>
			<content:encoded><![CDATA[<p>This time I&#8217;m going to make a really bold statement. I believe it&#8217;s currently possible to fight spam on the web without pictures, javascript or anything else that might compromise usability. Sounds impossible and too good to be true, right? Well, it might not be as far-fetched as it seems. Please read on and I&#8217;ll explain why.<span id="more-169"></span></p>
<p>We&#8217;ve all seen it: <a title="Article on wikipedia about captcha" href="http://en.wikipedia.org/wiki/Captcha">captcha</a>, the pictures that protect a lot of web forms against those nasty spam-bots while at the same time blissfully destroying the usability of the web (<a href="http://www.johnmwillis.com/other/top-10-worst-captchas/">10 &#8220;good&#8221; examples</a>). While a lot of attempts have been made to make it usable for more people (pictures aren&#8217;t exactly user-friendly for blind people &#8230;), it still boils down to some requirement either for the browser (e.g. javascript) or for the user (typing into an input field). While requiring the browser to support something isn&#8217;t that bad, requiring the user to do something definitely isn&#8217;t usability at its best. Wouldn&#8217;t it be nice to just see instantly whether the current request is made by a bot or a browser? Well, you might be able to &#8230;</p>
<p>For the last two weeks I&#8217;ve been running a personal pet project that logs spam attempts in some scripts running at <a href="http://www.ascdevel.com">Ascended Development</a>. After more than 300 attempts from more than 100 unique spam-bots (or IPs anyway), the results shock me &#8230; a lot. The spam-bots that have visited the site are incredibly stupid. In the simplest terms, they don&#8217;t even seem to be able to handle cookies (and thereby sessions in PHP), let alone the pictures or javascript. The most ridiculous thing is that they actually do send data in the antispam input field. The problem is that it&#8217;s 2 to 4 times longer than the number of characters in the picture, and yes, they didn&#8217;t supply the session cookie, meaning the attempt wouldn&#8217;t succeed even if the wild guess was right.</p>
<p>So what can we do without requiring user input or javascript? Certainly not add a hidden input field to the form and hope that the bot adds something to it. All bots (yeah, actually every single one of them) either provided the default value or didn&#8217;t provide the field at all. Out of the 300 attempts, only 50 didn&#8217;t provide the field, and a massive 250 attempts had the default value. None tried any other value, and it seems that those that didn&#8217;t provide the field tried again with the default value. Simply put, a hidden field doesn&#8217;t work, at least not against the bots visiting our site at the logged locations.</p>
<p>So here&#8217;s the trick: simply read the HTTP headers. Too good to be true? Well, the log is equally clear on this matter. None of the bots provided the Accept-Language and Accept-Encoding headers, both of which any decent browser sends these days (Opera, Firefox, Konqueror, Chrome, Safari and even IE). Even lynx, a text browser, sends these headers, and I tested it with a two-year-old release. It does make sense if you think about it. Browsers add Accept-Language so pages can be correctly localized, and Accept-Encoding so that compression can be used. Both are beneficial to the user, and therefore present even though both are optional. The spambots, on the other hand, seem to be built on libraries like libcurl, and by default these libraries don&#8217;t seem to add Accept-Language or Accept-Encoding. Add the fact that few sites, if any, sort spam-bots from browsers this way, and it&#8217;s not really such a surprise after all.</p>
<p>This isn&#8217;t without its flaws though. I&#8217;ve only encountered simple, general spam-bots and not the ones attacking widespread software like phpBB or vBulletin. Also, the site isn&#8217;t subject to targeted attacks. That being said, I wouldn&#8217;t be surprised if those bots fail to provide these HTTP headers too, and for the time being I&#8217;m quite confident that this method is as effective as, or better than, the current widespread method of using pictures. It should stay that way until this type of check is more widely used. So until then, I believe this is a good way to avoid spam in your web applications:</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">&lt;?php</span>
<span style="color: #b1b100;">if</span> <span style="color: #009900;">&#40;</span><span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$_SERVER</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'HTTP_ACCEPT_ENCODING'</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">===</span> <span style="color: #009900; font-weight: bold;">false</span> <span style="color: #339933;">||</span>
    <span style="color: #990000;">isset</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$_SERVER</span><span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'HTTP_ACCEPT_LANGUAGE'</span><span style="color: #009900;">&#93;</span><span style="color: #009900;">&#41;</span> <span style="color: #339933;">===</span> <span style="color: #009900; font-weight: bold;">false</span><span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">'Spambot'</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #b1b100;">else</span>
<span style="color: #009900;">&#123;</span>
    <span style="color: #b1b100;">echo</span> <span style="color: #0000ff;">'Browser'</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
<span style="color: #000000; font-weight: bold;">?&gt;</span></pre></div></div>
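
<p>For reuse, the same check could be wrapped in a small function. This is only a minimal sketch: the function name <code>looks_like_spambot</code> is made up for illustration, it assumes PHP 7 or newer, and unlike the check above it also treats an empty header as missing, which is an assumption rather than something the logs showed.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
// Hypothetical helper; the name is an assumption, not part of any library.
// Returns true when Accept-Encoding or Accept-Language is missing or blank.
function looks_like_spambot(array $server): bool
{
    foreach (['HTTP_ACCEPT_ENCODING', 'HTTP_ACCEPT_LANGUAGE'] as $key) {
        // Assumption: a blank header is treated the same as a missing one.
        if (!isset($server[$key]) || trim($server[$key]) === '') {
            return true;
        }
    }
    return false;
}</pre></div></div>

<p>A form handler could then call <code>looks_like_spambot($_SERVER)</code> before processing a post and reject the request when it returns true. Passing the array in as a parameter, instead of reading <code>$_SERVER</code> inside the function, makes it easy to try the check against logged requests.</p>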

<p>If only this worked against email spam too &#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.varslashlog.com/2009/06/03/how-to-stop-spam-without-pictures-or-javascript/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>
