What is a captcha?
Captchas protect forms on websites from spammers and bots (see Wikipedia for details). The main idea: display some kind of code that a human can easily read and submit but a computer cannot.
How NOT to implement captchas
This part is even more important, because there is not just one correct way to do it, and there are far more wrong ways.
Don't use sessions unless you really have to.
Pages that use something like $_SESSION['Captcha']['field'] and overwrite it on every form really freak me out! It makes working with two or more tabs impossible, because the tabs overwrite each other's captcha values all the time, resulting in a ****** mess.
You could use an array-like structure, but your captcha session array can get pretty big in a short amount of time.
More helpful are captchas which use the current form and a hash value based on the fields plus the current timestamp. It should not be possible to "guess" or "calculate" the hash value, so there is no way to produce "future" hashes.
You could reuse older hashes (like from last week), though. But most bots are programmed to just post right away. They would have to save valid "scenarios" for later usage.
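To make the hash idea concrete, here is a minimal sketch. It is not the code from the plugin: the function names, the form name and the secret are made up for illustration, and it assumes PHP 7 or newer. The point is only that the hash depends on a server-side secret, so a client cannot calculate one for a timestamp it has not been given.

```php
<?php
// Hypothetical helpers for illustration only, not the plugin's API.
// $secret is assumed to be a server-side value (e.g. from app config).

function captchaHash(string $formName, int $timestamp, string $secret): string
{
    // HMAC ties form name and timestamp to the secret, so a client
    // cannot compute a valid hash for a timestamp of its own choosing.
    return hash_hmac('sha256', $formName . '|' . $timestamp, $secret);
}

function captchaHashMatches(string $formName, int $timestamp, string $hash, string $secret): bool
{
    // hash_equals() compares in constant time to avoid timing leaks.
    return hash_equals(captchaHash($formName, $timestamp, $secret), $hash);
}
```

The form would carry the timestamp and the hash as hidden fields. On submit, the server recomputes the hash from the posted timestamp and rejects the form if the two differ.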
Maybe we can come up with something foolproof later on. For now we want to effectively prevent spam bots from submitting their crap without annoying normal users.
The second part is always the hardest. I, for example, really hate those image captchas which you can barely read. I often have to retry or reload them twice in order to succeed. Annoying doesn't even come close.
Why do we need captchas
First of all, they make sure that there is no bot (automatic program) posting "spam" or whatever.
But sometimes you just want to add captchas to prevent users from doing some action too often (like friendship requests in a community site etc).
A good example of what happens if you don't use captchas is bakery.cakephp.org.
Some articles have something like 36 comments, of which all 36 are spam. This is a disaster.
And in this case you even have to be logged in to submit a comment.
The argument that forms which require a login don't need captchas no longer holds.
Passive Captchas
I already talked about using some hash values based on the fields + current timestamp.
This can be used to generate passive captchas. They are similar to the Cake core component "Security", which adds some hidden fields to make sure that the fields have not been tampered with.
Both are invisible to the user, who doesn't even notice the passive captcha. But bots will soon discover that they are facing a wall.
The difference is that passive captchas should only be valid within a specific timeframe. Too fast (less than 2 seconds) is usually a sign of a bot at work, since humans cannot type that fast.
Too late (> 12 hours?) means you need to revalidate anyway, so we would render the form invalid as well. Well-designed forms will keep the posted content, so nothing gets lost.
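The timeframe check itself is tiny. A sketch, using the 2 seconds and 12 hours from above as limits; the constant and function names are hypothetical and not taken from the plugin:

```php
<?php
// Hypothetical limits taken from the text: 2 seconds minimum, 12 hours maximum.
const CAPTCHA_MIN_AGE = 2;
const CAPTCHA_MAX_AGE = 12 * 60 * 60;

function captchaAgeIsPlausible(int $renderedAt, int $now): bool
{
    // Age of the form in seconds between rendering and submitting.
    $age = $now - $renderedAt;

    // Too young points to a bot, too old means the form has expired.
    // A negative age (timestamp in the future) fails the first check.
    return $age >= CAPTCHA_MIN_AGE && $age <= CAPTCHA_MAX_AGE;
}
```

In practice `$renderedAt` would be the timestamp that was posted along with the hash, and `$now` would be `time()`.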
Another way to improve security is to include other user-specific values in the hash, like the browser agent (does not change between requests, but is less secure because the client can set it freely), the IP address (much harder to fake and therefore pretty secure), ...
Active Captchas
Those are the most commonly used ones. Users either have to read an image, calculate numbers or interpret a sentence. The first one is not suitable for users with visual impairments.
Usually they are built as an extension on top of passive captchas. We first validate the passive one. If the form is OK, we then validate the user input. If that validation passes too, we consider the captcha valid.
I decided to use math captchas. They keep you mentally fit and do what they are supposed to. The only important issue is to make them easy enough. I saw pages using / [division], or numbers above 20 in multiplication, or even above 100 for summation, which is total overkill.
But other ones could be used as well - simply by changing configuration settings.
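A rough sketch of what an easy math captcha could look like. This is an illustration under my own assumptions (single-digit operands, only + and -, made-up function names, PHP 7.1 or newer) and not the code from the plugin:

```php
<?php
// Hypothetical generator: small operands only, addition and subtraction.

function buildMathCaptcha(int $a, int $b, string $operator): array
{
    if ($operator === '-' && $b > $a) {
        // Swap operands so the result never goes negative.
        [$a, $b] = [$b, $a];
    }
    $result = $operator === '+' ? $a + $b : $a - $b;

    return ['question' => "$a $operator $b =", 'result' => $result];
}

function randomMathCaptcha(): array
{
    $operators = ['+', '-'];

    return buildMathCaptcha(random_int(0, 9), random_int(0, 9), $operators[random_int(0, 1)]);
}

function mathCaptchaAnswerIsCorrect(array $captcha, string $userInput): bool
{
    $userInput = trim($userInput);

    // Accept plain digits only, so "7abc" is not silently cast to 7.
    return ctype_digit($userInput) && (int)$userInput === $captcha['result'];
}
```

The expected result must of course never be sent to the browser in plain text. One option (again an assumption) is to make it part of the hashed values of the passive captcha.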
Captcha Behavior
OK, to sum it up, we want captchas that:
- don't annoy users
- protect as well as necessary
- can be used with tabs
- can be easily implemented and configured
The idea is that we want to add a single form field and attach a single behavior to our model.
That's all there is to it.
That's why a behavior in combination with a helper does the trick perfectly.
The code is in my GitHub Tools plugin:
captcha behavior
captcha helper
and functionality that both classes use is in captcha lib.
Note: The links are for 1.3 - if you want the 2.0 stuff, you need to switch to this branch.
Helper usage (in the view):
{code type=php}
echo $this->Captcha->input(); // or input('Modelname') if model is different from the form model
{/code}
Behavior usage (in the controller):
{code type=php}
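// Attach the captcha behavior to the model at runtime
$this->User->Behaviors->attach('Captcha');
// save() validates first, so a failed captcha prevents the save
if ($this->User->save($this->data)) { ... }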
{/code}
Pretty straightforward, isn't it?
Current weaknesses (apart from its strengths):
- possible "hash extraction" with unlimited reuse of those valid hashes (session or db to prevent?)
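One way to close that gap would be to accept every hash only once. The following is only a sketch of the idea and is not implemented in the plugin: the class name is made up, a plain array stands in for the session or database table, and it assumes PHP 7.4 or newer.

```php
<?php
// Hypothetical single-use registry. The array stands in for a session
// or database table; a real implementation would persist the entries.
final class UsedCaptchaHashes
{
    /** @var array<string, int> hash => expiry timestamp */
    private array $used = [];

    public function consume(string $hash, int $now, int $ttl = 43200): bool
    {
        // Drop expired entries so the store does not grow without bound.
        $this->used = array_filter($this->used, fn (int $expires): bool => $expires > $now);

        if (isset($this->used[$hash])) {
            return false; // this hash has already been submitted once
        }

        // Remember the hash for as long as it could still be valid.
        $this->used[$hash] = $now + $ttl;

        return true;
    }
}
```

Since hashes expire after the maximum form age anyway, only the hashes from that window need to be remembered. That keeps the store much smaller than a full per-form session array.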
Final notes
Right now it is mainly used for math captchas (active captchas) and for purely passive captchas.
Feel free to update the missing parts like providing more captcha types (image, sentence, ...) or processing types (session, cookie, ...).
UPDATE 2011-10-13
The i18n translations are now committed as well: /locale
Michael Clark
August 30, 2010 at 17:28
Sounds like you're thinking in the direction of a SAPTCHA: http://dmytry.pandromeda.com/texts/captcha_and_saptcha.html
Mark
August 30, 2010 at 17:52
Yeah, thanks.
I didn't know they are called SAPTCHAs.
Quite similar to my ideas anyway.
heohni
October 12, 2011 at 12:41
What is the CaptchaLib at
App::import('Lib', 'Tools.CaptchaLib'); ?
Mark
October 12, 2011 at 12:47
Oops, forgot the link to it.
It is now available!
heohni
October 12, 2011 at 12:56
OK great!
I've got it running, but how can I validate it?
I need to set up a rule in my model to return an error message on failure. How can I do this?
Mark
October 12, 2011 at 12:59
It's already done for you by the behavior.
m16u
January 18, 2012 at 22:59
Hi, I'm working on Cake 2, but I get these errors:
Warning (4096): Argument 1 passed to Helper::__construct() must be an instance of View, none given, called in D:\xampp\htdocs\blogcake2\app\View\Helper\CaptchaHelper.php on line 28 and defined [CORE\Cake\View\Helper.php, line 144]
Notice (8): Undefined variable: View [CORE\Cake\View\Helper.php, line 145]
Notice (8): Undefined variable: View [CORE\Cake\View\Helper.php, line 146]
Notice (8): Trying to get property of non-object [CORE\Cake\View\Helper.php, line 146]
Any help?
Mark
January 19, 2012 at 02:14
Did you use the correct files? 2.0 branch for a 2.0 project.
Then everything should work fine.
PS: how did you include your helper?
m16u
January 19, 2012 at 23:54
Hi, thanks for the reply. The captcha sometimes works, but I get these errors:
Notice (8): Use of undefined constant BR – assumed 'BR' [APP\Plugin\Tools\View\Helper\CaptchaHelper.php, line 124]
CaptchaBRcaptchaExplained
nine calcMinus zero =
'
I think it should be "nine - zero"?
Mark
January 20, 2012 at 00:31
Oh, thanks for the heads-up. The translations are still only in the 1.3 branch.
I will merge them into the 2.0 branch immediately.
The BR constant is a missing piece (see the bootstrap goodies talked about in the article).
But I will also provide a fallback.
All right, done:
https://github.com/dereuromark/tools/tree/2.0/Locale