09. Aug. 2010

How to implement Captchas properly

What is a captcha?

Captchas protect forms on websites from spammers and bots (see Wikipedia for details). The main idea: display some kind of code that a human can easily read and submit, but a computer cannot.

How NOT to implement captchas

This part is even more important, because there is more than one correct way, but there are far more wrong ways to go about it.

Don’t use sessions unless you really have to.
Pages that use something like $_SESSION['Captcha']['field'] and overwrite it on every form really freak me out! It makes working with two or more tabs impossible, because the tabs overwrite each other’s captcha values all the time, resulting in a ****** mess.
You could use an array-like structure with one entry per form, but your captcha session array can get pretty big in a short amount of time.
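
If you really do have to go the session route, the array idea could look roughly like the sketch below. It is only an illustration: the function name rememberCaptcha(), the form ids and the limit of 20 entries are made up for this example, and the function works on a plain array, so you would pass in and write back your own session key.

```php
<?php
/**
 * Store one captcha value per form id and cap the number of entries.
 * All names here are hypothetical; $store stands in for a session array.
 */
function rememberCaptcha(array $store, string $formId, string $value, int $limit = 20): array
{
    unset($store[$formId]);   // re-adding moves the entry to the end
    $store[$formId] = $value;
    // Keep only the newest $limit entries so the array cannot grow forever.
    return array_slice($store, -$limit, null, true);
}

// Two tabs get two different form ids, so they no longer overwrite each other.
$store = [];
$store = rememberCaptcha($store, 'tab-a', '7');
$store = rememberCaptcha($store, 'tab-b', '12');
print_r($store);
```

The cap keeps the size bounded, but a user with more open forms than the limit would lose the oldest ones, which is one more reason to avoid sessions here.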

More helpful are captchas that use the current form and a hash value based on the fields plus the current timestamp. It should not be possible to “guess” or “calculate” the hash value, so there is no way to use future hashes.
You could reuse older hashes (like ones from last week), though. But most bots are programmed to just post right away. They would have to save valid “scenarios” for later use.
Maybe we can come up with something foolproof later on. For now we want to effectively prevent spam bots from submitting their crap without annoying normal users.
The second part is always the hardest. I, for example, really hate those image captchas that you can barely read. I often have to retry or reload them twice in order to succeed. Annoying does not even come close.
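
To make the hash idea a bit more concrete, here is a minimal sketch. It assumes a keyed hash (HMAC-SHA256 via hash_hmac()) with a server-side secret, which is my choice for the example and not necessarily what the plugin does. The names captchaHash(), captchaHashMatches() and CAPTCHA_SECRET are hypothetical.

```php
<?php
// Hypothetical secret; in a real app this would come from your configuration.
const CAPTCHA_SECRET = 'change-me-to-a-long-random-string';

/**
 * Build a hash for a form from its field names and a timestamp.
 * Without the secret, the hash cannot be calculated in advance.
 */
function captchaHash(array $fieldNames, int $timestamp, string $secret = CAPTCHA_SECRET): string
{
    sort($fieldNames); // field order must not change the result
    $payload = implode('|', $fieldNames) . '|' . $timestamp;
    return hash_hmac('sha256', $payload, $secret);
}

/**
 * Recompute the hash from the submitted timestamp and compare in constant time.
 */
function captchaHashMatches(array $fieldNames, int $timestamp, string $submittedHash, string $secret = CAPTCHA_SECRET): bool
{
    return hash_equals(captchaHash($fieldNames, $timestamp, $secret), $submittedHash);
}

$time = time();
$hash = captchaHash(['email', 'comment'], $time);
var_dump(captchaHashMatches(['comment', 'email'], $time, $hash));      // same form, same time
var_dump(captchaHashMatches(['comment', 'email'], $time + 60, $hash)); // forged timestamp
```

The timestamp would travel in a hidden field next to the hash. A bot can read both, but it cannot produce a matching hash for a different timestamp without knowing the secret.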

Why do we need captchas

First of all, they make sure that no bot (automated program) is posting spam or the like.
But sometimes you just want to add captchas to prevent users from performing some action too often (like friendship requests on a community site).

A good example of what happens if you don’t use captchas is bakery.cakephp.org.
Some articles have something like 36 comments, all 36 of which are spam. This is a disaster.
And in this case you even have to be logged in to submit a comment.
The argument that forms which require a login don’t need captchas no longer holds.

Passive Captchas

I already mentioned using a hash value based on the fields plus the current timestamp.
This can be used to generate passive captchas. They are similar to the CakePHP core Security component, which adds some hidden fields to make sure that the form fields have not been tampered with.

Both are invisible to users; they don’t even notice the passive captcha. But bots will soon discover that they are facing a wall.
The difference is that passive captchas should only be valid within a specific time frame. Too fast (less than 2 seconds) is usually a sign of a bot at work, since humans cannot type that fast.
Too late (> 12 hours?) means you need to revalidate anyway, so we would treat the form as invalid as well. Well-designed forms will keep the posted content, so nothing gets lost.
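
The time frame check itself is tiny. A minimal sketch, using the 2 seconds and 12 hours from above as assumed limits (the function and constant names are made up for this example):

```php
<?php
// Assumed limits, taken from the numbers mentioned in the text.
const CAPTCHA_MIN_SECONDS = 2;
const CAPTCHA_MAX_SECONDS = 12 * 3600;

/**
 * Classify a submission by how long ago the form was rendered.
 * Returns 'ok', 'too_fast' or 'expired'.
 */
function captchaTimeStatus(int $renderedAt, int $submittedAt): string
{
    $age = $submittedAt - $renderedAt;
    if ($age < CAPTCHA_MIN_SECONDS) {
        return 'too_fast'; // also covers timestamps that lie in the future
    }
    if ($age > CAPTCHA_MAX_SECONDS) {
        return 'expired';
    }
    return 'ok';
}

$renderedAt = time() - 30; // pretend the form was rendered 30 seconds ago
echo captchaTimeStatus($renderedAt, time()), PHP_EOL;
```

Returning a status instead of a plain boolean makes it easy to show a friendlier message for 'expired' (just resubmit) than for 'too_fast'.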

Another way to improve security is to include other user-specific values in the hash, like the browser user agent (it does not change between requests, but is less secure because the client can set it freely), the IP address (much harder to fake and therefore pretty secure), …
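
As a rough sketch of that idea: the client values simply become part of what gets hashed. The function name captchaClientHash() and the secret are hypothetical, and the keyed hash via hash_hmac() is an assumption of this example.

```php
<?php
/**
 * Hash the form timestamp together with values tied to the client.
 * $secret is a hypothetical server-side secret from your configuration.
 */
function captchaClientHash(int $timestamp, string $userAgent, string $ip, string $secret): string
{
    // serialize() keeps the three parts unambiguous, whatever they contain.
    $payload = serialize([$timestamp, $userAgent, $ip]);
    return hash_hmac('sha256', $payload, $secret);
}

// Both values may be missing (e.g. on the command line), hence the fallbacks.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$ip = $_SERVER['REMOTE_ADDR'] ?? '';
echo captchaClientHash(time(), $userAgent, $ip, 'hypothetical-secret'), PHP_EOL;
```

A hash harvested on one machine then fails validation when it is posted from another IP address or with another user agent.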

Active Captchas

Those are the most commonly used ones. Users have to read an image, calculate numbers, or interpret a sentence. The first one is not accessible to people with visual impairments.

Usually they are built as an extension on top of passive captchas. We first validate the passive one. If the form is OK, we then validate the user input. If that validation passes as well, we consider the captcha valid.

I decided to use math captchas. They keep you mentally fit and do what they are supposed to. The only important thing is to keep them easy enough. I have seen pages using division, numbers above 20 in multiplication, or even above 100 in addition, which is total overkill.
But other types could be used as well, simply by changing configuration settings.
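
What "easy enough" could mean in code: only addition and subtraction, single-digit operands, never a negative result. This is a minimal sketch with made-up names (mathCaptcha(), mathCaptchaCheck()), not the plugin's actual implementation.

```php
<?php
/**
 * Create an easy math question: small operands, no division,
 * and no negative results.
 * Returns ['question' => string, 'answer' => int].
 */
function mathCaptcha(int $max = 9): array
{
    $a = random_int(0, $max);
    $b = random_int(0, $max);
    if (random_int(0, 1) === 1) {
        return ['question' => "$a + $b =", 'answer' => $a + $b];
    }
    // Put the larger number first so the result is never negative.
    [$hi, $lo] = $a >= $b ? [$a, $b] : [$b, $a];
    return ['question' => "$hi - $lo =", 'answer' => $hi - $lo];
}

/**
 * Accept only plain digits (surrounding whitespace is ignored).
 */
function mathCaptchaCheck(array $captcha, string $userInput): bool
{
    $input = trim($userInput);
    return preg_match('/^\d+$/', $input) === 1 && (int) $input === $captcha['answer'];
}

$captcha = mathCaptcha();
echo $captcha['question'], PHP_EOL;
```

The expected answer must of course not be sent to the browser in plain text. It would have to go into the hash of the passive part or stay on the server.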

Captcha Behavior

Ok, to sum it up, we want captchas that
- don’t annoy users
- protect as well as necessary
- can be used with tabs
- can be easily implemented and configured

The idea is that we only want to add a single form field and attach a single behavior to our model.
That’s all there is to it.
That’s why a behavior in combination with a helper does the trick perfectly.

The code is in my Tools plugin on GitHub:
- captcha behavior
- captcha helper
The functionality that both classes share is in the captcha lib.

Note: The links are for 1.3 – if you want the 2.0 stuff, you need to switch to this branch.

Helper usage (in the view):

echo $this->Captcha->input(); // or input('Modelname') if model is different from the form model

Behavior usage (in the controller):

$this->User->Behaviors->attach('Captcha');
if ($this->User->save($this->data)) { ... }

Pretty straightforward, isn’t it?

Current weaknesses (apart from its strengths):
- possible “hash extraction” with unlimited reuse of those valid hashes (session or DB to prevent this?)
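
One way to close that gap would be to accept every hash only once. The class below is a hypothetical in-memory sketch of the idea and not part of the plugin. A real version would persist the used hashes in a session, cache or database table.

```php
<?php
/**
 * Remembers hashes that have already been used until they expire.
 * This in-memory version only illustrates the idea.
 */
final class UsedHashStore
{
    /** @var array<string,int> hash => expiry timestamp */
    private array $used = [];

    /** Returns true the first time a hash is seen, false on reuse. */
    public function consume(string $hash, int $now, int $ttl = 43200): bool
    {
        // Drop entries whose captcha would have expired anyway.
        $this->used = array_filter($this->used, fn (int $expires): bool => $expires > $now);
        if (isset($this->used[$hash])) {
            return false;
        }
        $this->used[$hash] = $now + $ttl;
        return true;
    }
}

$store = new UsedHashStore();
var_dump($store->consume('abc123', time())); // first use
var_dump($store->consume('abc123', time())); // replay of the same hash
```

Because the captcha is only valid for a limited time, used hashes only need to be remembered for that long, so the store stays small.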

Final notes

Right now it is mainly used for math captchas (active captchas) and plain passive captchas.
Feel free to update the missing parts like providing more captcha types (image, sentence, …) or processing types (session, cookie, …).

UPDATE 2011-10-13
The i18n translations are now committed as well: /locale

 
10 Comments

Posted by Mark in CakePHP

 

  1. Michael Clark

    August 30, 2010 at 17:28

    Sounds like you're thinking in the direction of a SAPTCHA: http://dmytry.pandromeda.com/texts/captcha_and_saptcha.html

     
  2. Mark

    August 30, 2010 at 17:52

    yeah thanks
    didnt know they are called saptchas :)

    quite similar to my ideas anyway

     
  3. heohni

    October 12, 2011 at 12:41

    What is the CaptchaLib at
    App::import('Lib', 'Tools.CaptchaLib'); ?

     
  4. Mark

    October 12, 2011 at 12:47

    ops – forgot the link to it.
    is now available!

     
  5. heohni

    October 12, 2011 at 12:56

    OK great!
    I've got it running, but how can I validate it?
    I need to setup a rule in my model to return a error message on failure, how can I do this?

     
  6. Mark

    October 12, 2011 at 12:59

    its already done for you by the behavior :)

     
  7. m16u

    January 18, 2012 at 22:59

    Hi, I'm working on Cake 2, but I get these errors:

    Warning (4096): Argument 1 passed to Helper::__construct() must be an instance of View, none given, called in D:\xampp\htdocs\blogcake2\app\View\Helper\CaptchaHelper.php on line 28 and defined [CORE\Cake\View\Helper.php, line 144]
    Notice (8): Undefined variable: View [CORE\Cake\View\Helper.php, line 145]
    Notice (8): Undefined variable: View [CORE\Cake\View\Helper.php, line 146]
    Notice (8): Trying to get property of non-object [CORE\Cake\View\Helper.php, line 146]

    any help
    ?????

     
  8. Mark

    January 19, 2012 at 02:14

    Did you use the correct files? 2.0 branch for a 2.0 project.
    Then everything should work fine.
    PS: how did you include your helper?

     
  9. m16u

    January 19, 2012 at 23:54

    Hi, thanks for the reply. The captcha sometimes works, but I get these errors:

    Notice (8): Use of undefined constant BR – assumed 'BR' [APP\Plugin\Tools\View\Helper\CaptchaHelper.php, line 124]

    CaptchaBRcaptchaExplained
    nine calcMinus zero =
    '
    I think it should be "nine -zero " ….????

     
  10. Mark

    January 20, 2012 at 00:31

    Oh, thx for the heads up. the translations are still only in the 1.3 branch.
    I will merge them into the 2.0 branch immediately.

    the BR is a part missing (see the bootstrap goodies talked about in the article).
    But I will also provide a fallback.

    All right – done:
    https://github.com/dereuromark/tools/tree/2.0/Locale