
KAFKA: A Simple CAPTCHA Implementation

After investigating several CAPTCHA solutions, I found none of them to my liking. Some were web services, susceptible to downtime; others were too costly, or required image manipulation or PHP on the server (I'm a staunch JavaScript fan - I use it on the client and the server. Don't even get me started).

So I did what I always do - I created my own version of CAPTCHA called KAFKA that requires:

No images
No reliance on external servers (except your own ISP)
No PHP

DISCLAIMER: I honestly do not know if this actually thwarts screen readers. While the displayed characters may look like letters drawn by a font, they are a representation of characters drawn by a font. They are human readable, but I do not know whether or not they are machine readable. Caveat emptor.

Some clarification
The meaning of the acronym CAPTCHA is "Completely Automated Public Turing test to tell Computers and Humans Apart". Does KAFKA fall into this category? Perhaps.

Traditional CAPTCHA implementations rely on images of words that the user must decipher and enter, often within a short period of time. KAFKA uses a grid of zeroes and ones to display letters and numbers, not images. One could call KAFKA a "meta-CAPTCHA generator" because letters and numbers are shown without using actual letters and/or numbers.

KAFKA generates pseudo-random strings of capital letters and/or numbers. There is no dictionary or hard-coded list of words from which to choose. I reasoned that this would suffice for this project, since CAPTCHA words and phrases are often gibberish.
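A minimal sketch of how such a string could be produced. The names random_kafka_word and KAFKA_ALPHABET are hypothetical and not taken from kafka.js; the sketch assumes capital letters and digits are drawn with equal probability using Math.random(), which is not a cryptographically secure source.

```javascript
// Hypothetical helper for illustration; kafka.js may do this differently.
var KAFKA_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

function random_kafka_word(num_letters) {
    var word = '', i, index;
    for (i = 0; i < num_letters; i++) {
        // Math.random() returns a value in [0, 1), so index is always in range
        index = Math.floor(Math.random() * KAFKA_ALPHABET.length);
        word += KAFKA_ALPHABET.charAt(index);
    }
    return word;
}

var sample_word = random_kafka_word(6); // six random capitals and/or digits
```

Because no dictionary is involved, every call is independent of the previous one and the result is usually gibberish, which is the intent.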

This article exposes you to my way of thinking, and displays my coding abilities. It does not preach at you, nor does it ask much in the way of understanding. There are aspects of the code that might be of interest: the concept of a class; the act of binding methods to objects; closures; the heady music of logical thought...sorry, I do get carried away.

I am no genius, and the depth of my ignorance is boundless. But I do know a thing or two about programming, or at least in thinking like a programmer. I have been at it for over thirty years (yes, I am a relic - anyone remember Thoroughbred BASIC?). That is a rhetorical question, but if you know the answer, by all means drop me a line.

Since you made your way through all of that, let's get to the Good Stuff.

1. Components

KAFKA incorporates all of the requisite CAPTCHA functions:

Creation of six to sixteen random characters
Display of each character as a 5 x 7 grid of zeroes and ones (a "bimp", short for "bitmap")
An input for entry
An "OK" button to verify entry
A "New" button to generate a new KAFKA
A "Cancel" button to abort entry

I separated the JavaScript into four external files, the first three of which are germane to KAFKA, but are not the focus of this article:
helper.js	Generic methods
json.js	JSON methods
server.js	Server methods brought up to the client
kafka.js	KAFKA-specific methods


All of the JavaScript is held in place by "kafka.htm".

2. Walkthrough

"kafka.htm" is a short HTML page that demonstrates the features of KAFKA. The important part of the page is invoking KAFKA. Let's see how this is done.
<script type="text/javascript">

function draw_kafka() {
    var how = {
        'text_id'    :'txt_length', 
        'wrap_id'    :'div_wrap' 
    };
    KAFKA.draw(how);
}

function page_load() {
    // Bind the button "onclick" method
    document.getElementById('btn_draw').onclick = draw_kafka.bind(this);
    
    // Default KAFKA length
    document.getElementById('txt_length').value = '6';
}

window.onload = page_load;

</script>


Function "page_load" is invoked when the page is rendered. This sets up the "onclick" event for the draw KAFKA button and defaults the number of characters to six.

Clicking the button builds an object literal that holds the arguments to KAFKA. KAFKA needs to know how many characters to display and where to insert its elements, so the object carries two IDs: that of the input holding the length ("text_id") and that of the wrapper element ("wrap_id").

Next, KAFKA is drawn and the user is in charge.
KAFKA in action

3. User Interaction

There are three things the user can do:

Enter the displayed CAPTCHA and click the "OK" button
Click the "New" button
Click the "Cancel" button

When the "OK" button is clicked, KAFKA checks the user's entry against the displayed characters. Three things can happen:

If they match, a message is sent to the callback function for further processing
If they do not match, KAFKA alerts the user and a new series of characters is generated
If the length is incorrect, KAFKA alerts the user and a new series of characters is generated
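The three outcomes above can be sketched as a single comparison function. This is an illustration only: check_entry and its return values are hypothetical names, and treating the user's entry as case-insensitive is an assumption (KAFKA itself generates capitals and digits).

```javascript
// Hypothetical sketch; the real check lives inside kafka.js (or on the server).
function check_entry(entry, expected) {
    if (entry.length !== expected.length) {
        return 'wrong_length'; // alert the user, then generate a new KAFKA
    }
    if (entry.toUpperCase() !== expected) {
        return 'mismatch';     // alert the user, then generate a new KAFKA
    }
    return 'match';            // notify the callback function
}
```

The caller would alert the user and redraw on 'wrong_length' or 'mismatch', and hand control to the callback on 'match'.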

Clicking the "New" button generates a new series of characters.

Clicking the "Cancel" button sends a message to the callback function for further processing.

That is how KAFKA works. No secrets, a little magic, and no images.

4. The Code Behind KAFKA

KAFKA is a class with private properties and methods, and a single exposed method: draw. The shape of KAFKA is:
var HESSE = (function() {
    var private_property_1 = 'You cannot see this from outside HESSE';
    var private_property_2 = 'HESSE shares this with you';
    
    function private_method_1(parm) {
        alert(private_property_1 + '\t\n' + parm); // Show the world one of my secrets and the passed parameter
        
        return private_property_2; // Make private property visible (NOT public)
    }
    // Exposed (public) methods
    return { // Bracket must be on same line to avoid JavaScript misunderstanding our intent
        'public_method_1' : function(parm) { return private_method_1(parm); }
    };
})();


The only method HESSE exposes is "public_method_1". It alerts a secret and returns another. This is the only interaction allowed. You cannot access any of its internal (private) properties or methods. For example:
var result = HESSE.public_method_1('arlo'); // VALID

var secret_1 = HESSE.private_property_1; // INVALID

HESSE.private_method_1('fred'); // INVALID


Let's examine KAFKA's "draw" method.
    function draw_kafka(arg) {
        // Expects
        // arg
        //  .text_id    ID of input for number of chars
        //  .wrap_id    ID of KAFKA container

        // Validate and save the number of chars, set the wrapper ID and set the callback function
        // (non-numeric input parses to NaN, so fall back to 6 before clamping)
        var wlen = Math.min(Math.max(parseInt(document.getElementById(arg.text_id).value, 10) || 6, 6), 16), // min = 6, max = 16
            chow = {
                'num_letters':wlen, 
                'wrapper_id':arg.wrap_id,
                'callback':kafka_watchdog
            },
        //
        exknob; // Final var (prevents missing "," and ";")
        
        document.getElementById(arg.text_id).value = wlen; // OPTIONAL: Visual feedback only
        
        // The way we were...
        invoked_with = arg; // Private to KAFKA; keeps the original arguments for later
        
        get_kafka(chow);
    }


We build an object with the necessary arguments; we write the clamped number of bimps back to the input; we save the original arguments for later; finally, we invoke the private method "get_kafka" with that object.

We have examined how KAFKA works, and how it is invoked. Next we will take a look at the magic of the bimp.

5. Generating Bimps

The human-readability of KAFKA is due to the bimps. They appear to be some sort of oddball font, but they are not. They are bitmaps in which every cell is either a zero ("0") or a one ("1"). Together, these form the characters KAFKA shows to the world.

The letter "A" is represented by this string of bits (binary digits):
    "01100100101001011110100101001010010"


Subsequent letters - and the numbers zero through nine - look similar.

"How does that become the letter 'A'?", I hear you ask. I will tell you, then show you.

Each string of 35 bits is displayed as a 5 x 7 grid. So the letter "A" looks like this:
    "01100"
    "10010"
    "10010"
    "11110"
    "10010"
    "10010"
    "10010"


The letter is more visible, but we can make it clearer by substituting two spaces for the zeroes and two "at" signs for the ones:
    "  @@@@    "
    "@@    @@  "
    "@@    @@  "
    "@@@@@@@@  "
    "@@    @@  "
    "@@    @@  "
    "@@    @@  "


When we display this in a 4-point type, it looks like a 10-point font character.
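The two steps just described (cutting the 35 bits into seven rows of five, then substituting characters) can be sketched as follows. The bit string for "A" is the one given in the article; the function names bimp_to_rows and render_row are hypothetical.

```javascript
var BIMP_A = '01100100101001011110100101001010010'; // from the article

// Cut a bit string into rows of `width` characters each
function bimp_to_rows(bits, width) {
    var rows = [], i;
    for (i = 0; i < bits.length; i += width) {
        rows.push(bits.slice(i, i + width));
    }
    return rows;
}

// Two spaces for every zero, two "at" signs for every one
function render_row(row) {
    return row.replace(/0/g, '  ').replace(/1/g, '@@');
}

var lines = bimp_to_rows(BIMP_A, 5).map(render_row);
console.log(lines.join('\n'));
```

Doubling each cell horizontally compensates for characters being taller than they are wide, so the glyph keeps roughly square proportions in a monospaced face.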

KAFKA generates a string of characters using Math.random(). It then selects the appropriate bimp for each letter and arranges all of the letters into the grid. This is returned to KAFKA from the server, or, in this case, the client.
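Arranging several letters into one grid could look like the sketch below. It is an assumption about the layout, not the code in kafka.js: the bimp for "A" comes from the article, the bimp for "I" is invented purely for this example, and word_to_grid is a hypothetical name. Each glyph is assumed to carry its own blank right-hand column, so no separator is added between letters.

```javascript
var BIMPS = {
    'A': '01100100101001011110100101001010010', // from the article
    // Invented for this sketch only; not the glyph KAFKA ships with
    'I': ['01110', '00100', '00100', '00100', '00100', '00100', '01110'].join('')
};

function word_to_grid(word) {
    var grid = [], row, col, bits, line;
    for (row = 0; row < 7; row++) {
        line = '';
        for (col = 0; col < word.length; col++) {
            bits = BIMPS[word.charAt(col)];
            if (!bits) {
                throw new Error('No bimp for "' + word.charAt(col) + '"');
            }
            // Append this letter's five bits for the current row
            line += bits.slice(row * 5, row * 5 + 5);
        }
        grid.push(line);
    }
    return grid; // seven strings, each 5 * word.length bits wide
}

var grid_ai = word_to_grid('AI');
```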

NOTE: This aspect of KAFKA is server-based because the random character string - the actual characters, not the bimps - should be kept hidden from the client. All the client sees is the string of bimps; her entry is sent to the server for comparison against the KAFKA-generated word.

6. The DOM and Internal KAFKA Methods

The function "draw_kafka" invokes "get_kafka", which does all of the heavy lifting. It invokes other internal methods to:
Request bimps (normally from the server via Ajax)
Create and tear down the DOM elements that support KAFKA
Bind and unbind methods to objects
Handle verification of the user's input (OK, New and Cancel buttons)
Communicate with a callback routine when KAFKA is finished

The HTML elements are created using DOM methods. If this is not to your liking, you could instead code the HTML into a web page, read it on the server and Ajax it up to the client along with the KAFKA word.

Or you could use "window.open", and open the same web page. Or use "showModalDialog" for IE and FF support. Or use the web page as the src of an <iframe>.

I have tried all of these, and simply stuffing pre-CSS'd DOM objects into a wrapper <div> suits my needs. YMMV.

The code for KAFKA is straightforward and available for download. We will not delve into it further.

7. Binding Methods to Objects

Credit for this approach to handling events is due to Daniel Brockman; I simply use his idea. It is complex, but worth the time taken to grok its potential. However, it is simple to use:
    document.getElementById('OBJECT_ID').EVENT_NAME = FUNCTION_NAME.bind(this, 'PARAM_1', PARAM_2, ... PARAM_n);
    //
    // In practice, for the KAFKA "OK" button, it is bound like this:
    //
    document.getElementById('btn_ok').onclick = kafka_validate.bind(this, 'txt_kafka_letters', false);


I no longer have to worry about event handling. It just works.
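To see what the bound handler receives, here is a small sketch that assumes the native Function.prototype.bind (ES5); the helper credited above may differ in detail. The plain object fake_button stands in for a DOM element, and kafka_validate_demo and its parameter names are hypothetical.

```javascript
// Hypothetical stand-in for a handler such as kafka_validate
function kafka_validate_demo(input_id, flag, evt) {
    return input_id + '|' + flag + '|' + (evt ? evt.type : 'none');
}

var fake_button = {}; // stands in for document.getElementById('btn_ok')

// The bound arguments come first; whatever the caller passes later is appended
fake_button.onclick = kafka_validate_demo.bind(null, 'txt_kafka_letters', false);

// A browser would call the handler with an event object; we simulate that here
var bind_result = fake_button.onclick({ type: 'click' });
console.log(bind_result);
```

With native bind, the pre-set parameters travel with the function, so the handler needs no global state to know which input it is validating.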

8. Feedback

I don't often release complete code into the wild. Along with the implied arrogance of it, I tend to be careful of how I present myself to the world (through my code). But then I suppose the real reason is fear. The fear of not being good enough, etc. But, I tell myself, this is little different than creative writing, something else I like to do. What I'm trying to say is:

This software is provided "as-is". No effort will be made to diagnose and/or fix problems arising from the use of this code. Experts-Exchange is hereby granted permission to make this code available to others, but I retain the rights of sole ownership.

I welcome your comments and criticisms, both good and bad.

This turned out to be an interesting project, and it may actually have some value (as opposed to some of my other ideas ;-)

If anyone can verify the validity of KAFKA, i.e., that it is not machine readable, I would appreciate hearing the hows and whys of that tale.

9. Installation

Create a new virtual directory on your server - kafka, for example - and unzip the archive into it. Then enter the URL "http://YOUR_SERVER_NAME/kafka/kafka.htm" into your browser and see what you think.

10. Addendum

It appears that KAFKA provides little or no security against attackers. It is trivial to scrape the screen and OCR the KAFKA text.
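A rough sketch of why decoding is easy once the grid of zeroes and ones has been scraped: each 5 x 7 block maps back to exactly one character, so a reverse lookup is enough. The table below is a stand-in containing the article's "A" and one invented "I" glyph; decode_grid is a hypothetical name, and a real attacker would first have to collect the full glyph table.

```javascript
var KNOWN_BIMPS = {
    'A': '01100100101001011110100101001010010', // from the article
    // Invented for this sketch only
    'I': ['01110', '00100', '00100', '00100', '00100', '00100', '01110'].join('')
};

function decode_grid(grid) {
    var lookup = {}, text = '', ch, col, row, bits;
    // Build the reverse table: bit string -> character
    for (ch in KNOWN_BIMPS) {
        if (KNOWN_BIMPS.hasOwnProperty(ch)) {
            lookup[KNOWN_BIMPS[ch]] = ch;
        }
    }
    // Walk the grid five columns at a time and reassemble each glyph
    for (col = 0; col < grid[0].length; col += 5) {
        bits = '';
        for (row = 0; row < grid.length; row++) {
            bits += grid[row].slice(col, col + 5);
        }
        text += lookup[bits] || '?'; // '?' marks a glyph missing from the table
    }
    return text;
}

var scraped = ['0110001110', '1001000100', '1001000100', '1111000100',
               '1001000100', '1001000100', '1001001110'];
var decoded = decode_grid(scraped);
```

Because the same character always yields the same bits in the same positions, no image processing is needed at all.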

This is good news and bad. It is good because there can be no mistaking KAFKA for true CAPTCHA. But it is bad because it seems that I failed in making a CAPTCHA-like widget.

I wondered about the ability of OCR, and bots in general, to read altered text. I rotated the KAFKA text 90 degrees, making the characters more difficult to understand. It took some effort to read them at first, but after viewing a few different groups it was no worse than viewing non-rotated letters. At least for me; I do not know how OCR would interpret this:
KAFKA in action - sideways!
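Rotating a bimp grid is a small transformation. The sketch below assumes a clockwise rotation (the article does not say which direction was used) and uses the hypothetical name rotate_clockwise.

```javascript
var ROWS_A = ['01100', '10010', '10010', '11110', '10010', '10010', '10010']; // "A" from the article

// Rotate a grid of equal-length strings 90 degrees clockwise:
// each original column, read from the bottom row up, becomes a new row
function rotate_clockwise(rows) {
    var rotated = [], col, row, line;
    for (col = 0; col < rows[0].length; col++) {
        line = '';
        for (row = rows.length - 1; row >= 0; row--) {
            line += rows[row].charAt(col);
        }
        rotated.push(line);
    }
    return rotated;
}

var sideways_a = rotate_clockwise(ROWS_A);
console.log(sideways_a.join('\n'));
```

A scraper can undo the rotation just as cheaply, so it changes the look without changing how readable the grid is to a machine.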
But I think this is an ex-horse, to paraphrase John Cleese. I will bow to those who argued with me - politely, but firmly - that KAFKA is not secure CAPTCHA. For my simple needs, however, it will suffice.

kafka.zip
Author:Badotz
6 Comments
 
LVL 58

Expert Comment

by:harfang
Badotz,

I liked your article for several reasons (and accordingly voted 'yes'). The question whether your implementation is secure, in effect trying to elaborate on your disclaimer, has created some debate during the editorial process, and quite a lot of interest among experts.

For the benefit of future readers: CAPTCHA and similar techniques attempt to determine if you are a human or a machine. This isn't as easy as it sounds. One of the things that a human brain does much better is image (and sound) processing, in particular pattern recognition. We can all read the image below (stolen from the gmail account creation page); even a good character recognition program is quite likely to fail, mainly because the characters are distorted and touch each other.

In other words, there is no simple way to translate the image back to text. It requires human intervention.

But why would a machine want a gmail account? Let's say I'm a spammer. I can write a robot that will repeatedly create a new account, send a few thousand legitimate-looking emails, and move to the next account. The accounts will not be blocked in time (but gmail.com would soon be on every black list as a frequent source of spam).

KAFKA, on the other hand, is "machine readable". If gmail was using it, a small army of spammers and hackers would try to reverse the displayed information back into text. Given that the same letter or digit in the code results in '@' characters in the same relative positions, a simple regular expression search can give the answer.

Although KAFKA has the look and feel of CAPTCHA, it offers only cursory protection against serious robot attacks, at least in the simple form shown here (without distortion and with clearly separated characters).

This being said, the implementation is really interesting to study, and the demonstrated techniques can be used for other similar purposes.

Markus -- (°v°)
Captcha.jpg
 
LVL 56

Expert Comment

by:Ryan Chong
Hi Badotz,

Looks cool... and thanks for your sharing. It tested successfully in FF3.5.x

However, when I tested in IE6.0, the captcha was not displayed correctly. Any idea? Thanks.
snap-captcha.jpg
 
LVL 49

Expert Comment

by:DanRollins
The resulting output can pretty easily be decoded back into text with a short program, so it fails to do what Captcha is designed to do -- foil a bot.  For instance, in an afternoon of coding, I could write a bot that could repeatedly spam your inbox 1000 times per minute.

That said... I found this to be an interesting piece of code, and I agree that it would foil anyone who did not have the will or the expertise to write a decoding function.  Your use of JSON and passing temporary structures as parameters is outside of my experience... I've learned something here.  Thanks!

 I voted Yes, above :-)

 
LVL 75

Expert Comment

by:Michel Plungjan
A few comments

* Very interesting article for the coding practices. I had a cursory read of Daniel Brockman's text, will read it in detail after a really good night's sleep ;)
* Some/many companies are still on IE6 for corporate upgrade policy reasons

I saw a new captcha the other day - the letters came flying into a box per letter and disappeared, one had to type them as they appeared. Looked like fun and also looked hard to ocr.

Some sites do need captchas or other bot defending stuff. The site I work on is pumping out hundreds of GB a day and needs to be able to block some types of requests from being automated.
 
LVL 9

Expert Comment

by:Bob Stone
> borrow away. Out of curiosity, though, I'd like to know where you plan to use it? <

I am going to put it in a shopping cart admin page as a confirm big change thing, to make certain they really want that change.
 
LVL 111

Expert Comment

by:Ray Paseur
A newer article about CAPTCHA is available here, and importantly, please see the Addendum:
https://www.experts-exchange.com/articles/9849/Making-CAPTCHA-Friendlier-with-Simple-Number-Tests-or-PHP-Image-Manipulation.html

Given the advances in computer vision and machine learning, the old ways of CAPTCHA are rapidly becoming obsolete.  Anyone with some basic CS skills and an NVIDIA GeForce card can do image recognition.  Facebook can identify and tag your friends in your pictures.  So "reading" a CAPTCHA image is well within the scope of modern computability.  More and different methods are needed.