Tuesday, January 31, 2012

CAPTCHA - A Revolution


CAPTCHA = Completely Automated Public Turing test to tell Computers and Humans Apart

What is a CAPTCHA?
A system built by Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford of CMU to make sure that the user active at the other end is a human and not a bot. It was initially used to keep bots from entering Yahoo chat rooms and redirecting users to other sites.

CAPTCHA - Reverse Turing Test:
Yups, CAPTCHA is a reverse Turing test because it reverses the roles of computer and human. In the classic Turing test a human judge tries to tell whether the other party is a machine. Here the judge is the computer: the test is completely automated, so the computer challenges you to perform some action to prove that you are a human.

Initially [and even now] a CAPTCHA was a distorted image with some characters in it, which makes the characters hard for bots to recognize without troubling humans much.

The next generation of CAPTCHA carried an audio link beside the distorted image to help visually challenged people.

Although CAPTCHAs are generated automatically, many can be broken using techniques like OCR (Optical Character Recognition) or by working out the logic behind the generation.

And now people have started their own implementations, including

Mathematical Captcha => What is 1 + 1?
Image/Visual Captcha => Who is Alice in the photo tagged with friends? [FB uses it to verify the legitimate user of an account]
and so on

But the real masterpiece is reCAPTCHA [powered by Google]








What is great in that?
It is great because it knows the value of human time. A test that unites human power :)
How?
If you have looked at a reCAPTCHA, you will have noticed two words separated by a space.
Consider the image shown, for example [said allecstst]
Where do these words come from?
These words come from the process of digitizing old text with OCR.
Means?
To produce a digital version [ex: PDF] of a book written long before digitized books or word processing tools came into existence, people scan the book and take a photocopy [image] of each page. An image processing technique called OCR then tries to recognize the characters and digitize the old text.
What does that have to do with reCAPTCHA?
OCR is an automated tool for recognizing characters in an image. It is not guaranteed to recognize every character correctly. For example, T can be read as I depending on the font or the clarity of the image.

So, what the reCAPTCHA people do is:
Pick two words: one that OCR recognized successfully, said, and one that it could not, allecstst.
Challenge the user with both words as a CAPTCHA test.
If the user types the word that OCR already knows [said] correctly, that confirms the user is a human. The other word works as a kind of poll: the same unrecognized word is shown to a group of people [say 10].
If 7 out of 10 [i.e., the majority] read it as allecstst and the rest read it as alleestst, the unrecognized word is taken to be allecstst, since the majority voted for it. Hence a word of a book is digitized :)
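
The steps above can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not how reCAPTCHA is actually implemented: the function names isHuman and resolveUnknownWord, the 0.6 agreement threshold and the sample votes are all assumptions made up for this example.

```javascript
// Hypothetical check: the solver must get the word OCR already knows right.
function isHuman(typedControlWord, knownWord) {
  return typedControlWord.trim().toLowerCase() === knownWord.toLowerCase();
}

// Hypothetical poll: accept a reading of the unknown word only if a large
// enough share of solvers typed the same thing, otherwise return null.
function resolveUnknownWord(answers, minShare) {
  const counts = new Map();
  for (const answer of answers) {
    counts.set(answer, (counts.get(answer) || 0) + 1);
  }
  let best = null;
  let bestCount = 0;
  for (const [word, count] of counts) {
    if (count > bestCount) {
      best = word;
      bestCount = count;
    }
  }
  if (best !== null && bestCount / answers.length >= minShare) {
    return best;
  }
  return null;
}

// 7 of 10 solvers read the word as allecstst, 3 as alleestst.
const votes = [
  'allecstst', 'allecstst', 'allecstst', 'allecstst', 'allecstst',
  'allecstst', 'allecstst', 'alleestst', 'alleestst', 'alleestst'
];

console.log(isHuman('said', 'said'));         // true
console.log(resolveUnknownWord(votes, 0.6));  // allecstst
```

Returning null when nobody reaches the threshold leaves room to show the word to more people before deciding.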

So, without knowing it, you help digitize a book whenever you fill in a reCAPTCHA :) Be happy whenever you answer one, and proud to be united :)
A little more of a book gets digitized whenever a user solves a reCAPTCHA while signing in to Facebook, Gmail, LinkedIn, etc.

From the site
About 200 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that's not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into "reading" books.

visit this site to learn more and feel great :)

Monday, January 9, 2012

Nodejs Modules and Export Explained


Enjoyed a week playing around with Node.js :) Let's share

Simple Node Server [http://localhost:6666]:

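var http = require('http');

// Note: some browsers refuse to open port 6666 because they treat it as unsafe.
// If the page does not load, check with curl or pick another port.
var server = http.createServer(function (req, res) {
    // Do whatever you want
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Running');
}).listen(6666, function () {
    console.log('Node Runs on port 6666');
});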

How to install a node_module?
npm is a powerful package manager for Node which you can use to install Node modules
Ex:
npm install redis

The above command installs the redis module in the ./node_modules/redis directory

How to use a module?
Use the require() method

Ex:
require('redis')

How will node resolve the modules?
Consider

/home/xxx/sample
 |___ index.js
 |___ node_modules
       |____ redis
       |      |____ lib/
       |             ...
       |____ my_module
              |____ node_modules
              |      |____ first.js
              |____ test.js



//test.js
var redis = require('redis');

So node will look for the redis module, in this order, in

1. /home/xxx/sample/node_modules/my_module/node_modules/
2. /home/xxx/sample/node_modules/
3. /home/xxx/node_modules/
4. /home/node_modules/
5. /node_modules/

It also checks some global folders [for example, those listed in the NODE_PATH environment variable]

You can exploit this behaviour nicely if you plan your folder layout :)
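
To make the lookup order concrete, here is a small sketch that rebuilds the list for the directory holding test.js. The helper lookupPaths is a made-up name for this post and only handles POSIX-style absolute paths. It leaves out the global folders and is not Node's real resolver.

```javascript
// Hypothetical helper: walk from the requiring file's directory up to the
// root, collecting a node_modules folder at every level.
function lookupPaths(fromDir) {
  const parts = fromDir.split('/').filter(Boolean);
  const paths = [];
  for (let i = parts.length; i >= 0; i--) {
    // A folder that is itself called node_modules does not get
    // another node_modules appended to it.
    if (i > 0 && parts[i - 1] === 'node_modules') continue;
    paths.push('/' + parts.slice(0, i).concat('node_modules').join('/'));
  }
  return paths;
}

// test.js lives in /home/xxx/sample/node_modules/my_module
console.log(lookupPaths('/home/xxx/sample/node_modules/my_module'));
```

Inside a real module you can print module.paths to see the list Node itself computed for that file.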

Will node load the module every time I request it via require?
No. Node will not load the module again unless the request resolves to a different location from the one already loaded.
Ya, Node caches the module

Cache means?? What will get cached?
uh uHHH... Let's go deeper into modules before answering this question

How to write your own module?
Simple... Let us write a module that converts lowercase to uppercase

//simple.js
var stringtoUpper = function (text) {
    return text.toUpperCase();
};

exports.toUpper = stringtoUpper;

Done :)

//test.js
var utils = require('./simple');
console.log(utils.toUpper('this is text'));

//Execution
node test.js
>THIS IS TEXT


What is that exports?
It is whatever you wish to expose from your module to the module that requires it.
The same exports object is handed to every caller that requires the module.
exports.toUpper is the same as module.exports.toUpper [initially exports === module.exports], so exports is a nice shorthand. module is the reference to the current module :)
You can also give an export a name different from its actual name. As you can see in the above example, the function's actual name is stringtoUpper, but it is exported as toUpper.

Consider the same simple.js and replace exports.toUpper = stringtoUpper with one of

1.module.exports.toUpper = stringtoUpper;
2.module.exports = stringtoUpper;
3.module.exports = new stringtoUpper('this is a text');

All three are different

module.exports is an object
1 makes it {toUpper: [Function]}
2 makes it [Function] // We can use this as a constructor function
3 makes it an object created with stringtoUpper as the constructor

After var simple = require('./simple'); use the exports as follows
1 : simple.toUpper('This is a text')
2 : simple('this is a text') or require('./simple')('this is a text') or new simple('this is a text');
3 : You can access all public members of the object. Here there are none
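
If you want to see the three shapes side by side, the sketch below writes each variant of simple.js into a throwaway temp folder and requires it. The folder prefix and the file names one.js, two.js and three.js are assumptions made up for this demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Throwaway folder for the demo files (prefix is arbitrary).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));

const body = 'var stringtoUpper = function (text) { return text.toUpperCase(); };\n';
const variants = {
  one: 'module.exports.toUpper = stringtoUpper;',
  two: 'module.exports = stringtoUpper;',
  three: "module.exports = new stringtoUpper('this is a text');"
};

// Write each variant to its own file and load it.
const loaded = {};
for (const name of Object.keys(variants)) {
  const file = path.join(dir, name + '.js');
  fs.writeFileSync(file, body + variants[name] + '\n');
  loaded[name] = require(file);
}

console.log(loaded.one.toUpper('this is a text')); // THIS IS A TEXT
console.log(loaded.two('this is a text'));         // THIS IS A TEXT
console.log(typeof loaded.three);                  // object
console.log(Object.keys(loaded.three).length);     // 0, no public members
```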

Methods 2 & 3 replace the entire object, which means whatever was set earlier via exports is overridden
Ex:
module.exports.a = 10;
module.exports.b = 20;
module.exports = 20; // will override a & b
Hope you understand the reason. module.exports itself is an object, and its members can be set as module.exports.a, module.exports.b. But when module.exports = 20 happens, the whole object is replaced with the new value

Now let's come back to caching

Node caches the exports of a module when it is loaded for the first time

For 1 & 2 the caching hardly matters: they export plain functions, which can be called any number of times with different parameters

But 3 is different. The object returned is cached. So however many times you require the module after the first time, no new instantiation takes place [kind of a singleton], because you are exporting only one object of the class.

So, if you are looking for a stateful implementation shared across modules you can go for 3, else go for 1 or 2
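
Here is a small sketch of the singleton effect with style 3. The module counter.js and its Counter class are made up for this demo and written to a temp folder so the example runs on its own.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Throwaway folder and module file (names are arbitrary).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'counter.js');
fs.writeFileSync(file, [
  'function Counter() { this.count = 0; }',
  'Counter.prototype.hit = function () { return ++this.count; };',
  'module.exports = new Counter(); // style 3: export one instance',
  ''
].join('\n'));

// Two requires, as if they came from two different files of your app.
const first = require(file);
const second = require(file);

first.hit();
second.hit();

console.log(first === second); // true, the cached object is returned
console.log(second.count);     // 2, state is shared
```

If counter.js exported the Counter function itself (style 2), each caller could do new Counter() and keep its own count.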

Happy Node :) Hope this helps :)