Spam, spam, spam, spam. Spammers have taken over the Internet and made it a horrible place to maintain a web site. Spammers have gone beyond simply spamming your email account -- now they are spamming blogs and guestbooks, spamming trackbacks, and spamming signup forms. Even a child's home page with a guestbook for friends is not safe from links for cialis, porn, or web hosting. Obviously these spammers are getting some return from their criminal activity, because they keep doing it. Unfortunately, you can't reach through the computer screen and grab them by the throat to strangle the life out of them. All you can do is put in place some safeguards and try to minimize the attack.

Asking a question

One of the simpler ways to cut down on spam is to ask the user a question that must be answered correctly before the form is accepted. There are various ways to do this: maintain a list of common questions and answers, use a calculation (1+3 = ?), or show a server-generated random number that the user has to type in. These methods can be used fairly effectively. I'll show some code for PHP and ColdFusion using the random number. I'm not going to get into the insert/update code, as there are numerous articles dealing with that (such as these for PHP and these for CF), but I'll show the random number generation and the conditional logic to prevent the insert. I'll start off with a simple form:

<form id="form1" name="form1" method="post" action="">
<table width="200">
<tr>
<td><div align="right">Name</div></td>
<td><input type="text" name="yourname" id="yourname" /></td>
</tr>
<tr>
<td><div align="right">Email</div></td>
<td><input type="text" name="youremail" id="youremail" /></td>
</tr>
<tr>
<td><div align="right">Comment</div></td>
<td><textarea name="yourcomment" id="yourcomment" cols="45" rows="5"></textarea></td>
</tr>
<tr>
<td><div align="right">Type the following number:</div></td>
<td><input type="text" name="myNumber" id="myNumber" /></td>
</tr>
<tr>
<td>&nbsp;</td>
<td><input type="submit" name="button" id="button" value="Submit" /></td>
</tr>
</table>
</form>

Now, I'll add to the top of the page some logic to determine a random number.

ColdFusion

Using CF, I'll just use the RandRange function, which returns an integer between two given integers that is, for all practical purposes, random as far as the user is concerned:

<cfif not IsDefined("session.myNumber")>
  <cfset Session.myNumber = RandRange(100000,999999)>
</cfif>

Now, in the form, put the value in the display:

Type the following number: <cfoutput>#session.myNumber#</cfoutput>

And finally, add some conditional logic on your form insertion code to prevent the insertion if the number doesn't match, and display a message if it does not match so the user can try again:

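<cfif IsDefined("form.myNumber")>
  <!--- The session check guards against an expired session --->
  <cfif IsDefined("session.myNumber") and form.myNumber EQ session.myNumber>
    <!--- do the insert --->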
  <cfelse>
    <p>Your number did not match. Please try again.</p>
  </cfif>
  <!--- Set a new random number --->
  <cfset session.myNumber = RandRange(100000,999999)>
</cfif>

PHP

Using PHP, I'll use the rand() function, which returns a pseudo-random integer within a given range:

<?php session_start();
if(!isset($_SESSION["myNumber"])) {
  $_SESSION["myNumber"] = rand(100000,999999);
}?>

Now, in the form, put the value in the display:

Type the following number: <?php echo($_SESSION["myNumber"]); ?>

And finally, add some conditional logic on your form insertion code to prevent the insertion if the number doesn't match:

<?php
if(isset($_POST["myNumber"])) {
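  // Compare as strings: before PHP 8, a loose == would accept input such as "123456abc"
  if((string)$_POST["myNumber"] === (string)$_SESSION["myNumber"]) {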
    // do the insert
  }else{
    echo('<p>Your number did not match. Please try again.</p>');
  }
  $_SESSION["myNumber"] = rand(100000,999999); // Set a new random number to prevent another insertion
}
?>
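
The check above can also be wrapped in a pair of small functions, which makes it easier to reuse on several forms. The following is only a sketch under a few assumptions: the function names new_challenge() and check_challenge() are hypothetical and not part of the code above, random_int() requires PHP 7 or later (rand() would work as well), and the session and post data are passed in as arrays so the logic can be tried outside a web request.

```php
<?php
// Hypothetical helpers, shown as a sketch rather than a finished implementation.

// Store a fresh six-digit number in the session array and return it.
function new_challenge(array &$session): int
{
    // Assumes PHP 7+; on older versions rand(100000, 999999) could be used instead.
    $session["myNumber"] = random_int(100000, 999999);
    return $session["myNumber"];
}

// Return true only if the posted value matches the stored number exactly.
// A new number is always stored afterwards, so each number can be used once.
function check_challenge(array &$session, array $post): bool
{
    $expected = isset($session["myNumber"]) ? (string) $session["myNumber"] : null;
    $given = (isset($post["myNumber"]) && is_string($post["myNumber"]))
        ? trim($post["myNumber"])
        : null;

    // Strict string comparison, so "123456abc" never matches 123456.
    $ok = $expected !== null && $given !== null && $given === $expected;

    new_challenge($session);
    return $ok;
}
```

On a real page this would be called as check_challenge($_SESSION, $_POST) after session_start(), with the insert performed only when it returns true.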

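This method is not perfect, because the number appears as plain text in the page and a bot written for your particular form could read it, but for practical use it works well. The screenshot below shows the form in action: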


Figure 1: Using the random number spam form

If the user does not enter the number correctly, he will be shown the form again with a message telling him to try again, along with a new number to type.
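
The calculation variant mentioned at the start of this section (1+3 = ?) follows the same pattern. Here is a minimal sketch, assuming PHP 7 or later for random_int(); the function names and the "myAnswer" session key are made up for illustration and do not appear in the code above.

```php
<?php
// Hypothetical helpers for a simple addition question.

// Build a question such as "What is 1 + 3?" and remember the answer.
function new_sum_question(array &$session): string
{
    $a = random_int(1, 9);
    $b = random_int(1, 9);
    $session["myAnswer"] = $a + $b; // "myAnswer" is an assumed key name
    return "What is $a + $b?";
}

// Compare the visitor's reply with the stored answer, then discard the answer
// so the same question cannot be answered twice.
function check_sum_answer(array &$session, $reply): bool
{
    $ok = isset($session["myAnswer"])
        && is_string($reply)
        && ctype_digit(trim($reply))              // digits only, rejects "4abc" and ""
        && (int) trim($reply) === $session["myAnswer"];

    unset($session["myAnswer"]);
    return $ok;
}
```

The question text would be printed next to the input field in place of the random number, and the reply checked before doing the insert.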

A step up: Captcha

The most obvious anti-spam mechanism these days is the use of a Captcha -- an image that contains random characters or a random word that the user is forced to type in before proceeding. This is slightly more effective than the method outlined previously, but a bit harder to implement. There are numerous articles on the web and at Community MX that deal with implementing a Captcha system (including Thomas Pletcher's articles on Community MX called Captcha the Bastards), so I won't go into it here. This is probably the most reliable way to eliminate comment and other form spam on your web site.

This still does not guarantee that a spammer will not get through, because spammers have taken a novel approach -- actually visiting sites and typing into the forms by hand. Who would have thought the lazy, idiot minds behind spam would actually go a step beyond an automated attack and try the old-fashioned manual approach? They have. Apparently they are smarter than we have ever given them credit for.

Recording Spammers

If the spam gets through, it's time to keep track of the spammer and reduce the chance that he will return. This is also not 100% effective, because spammers are crafty and treacherous, but it will help control spam. Part 2 of this series will show a way to record the IP address and prevent the spammer from accessing your site at all.