"A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot" is the definition offered on the CAPTCHA website (CMU, 2010). The name stands for Completely Automated Public Turing test to tell Computers and Humans Apart, and the service offers websites free protection from the automated spam that plagues certain blogs and websites. The technique was first developed in 1997 by Mark D. Lillibridge, Martin Abadi, Krishna Bharat, and Andrei Z. Broder at AltaVista as a means to counteract automated optical character recognition. The term CAPTCHA was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper, and John Langford of Carnegie Mellon University (CMU, 2010). It has since spread throughout the internet to websites such as Amazon and has become a standard tool in website security.
Since CAPTCHA became widespread, its creators have been working on a new project called reCAPTCHA. They realized that people type "more than 100 million CAPTCHAs every day, in each case spending a few seconds typing the distorted characters. In aggregate, this amounts to hundreds of thousands of human hours per day" (Von Ahn, 2008), and that this effort produced nothing of value. Their team came up with an innovative way to put this human processing power toward a task that computers struggle to perform. Recognizing printed text and converting it into computer-encoded text is a task that most standard optical character recognition software struggles to perfect; certain fonts, watermarks, and stains are all it takes to stop a computer from recognizing a word. Researcher Patrick van der Smagt reports that average "recognition rates are 80% for the uppercase letters and 90% for the digits" (Van der Smagt, 1990), which is troublesome for computer-based digitization of books. reCAPTCHA uses the same process as CAPTCHA, showing a distorted image of text and asking the user to type the word, which both validates the word and confirms that the user is human.
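The aggregate figure in the quotation above can be checked with simple arithmetic. The sketch below assumes 10 seconds per CAPTCHA, a value chosen here purely for illustration since the source says only "a few seconds"; the function name is also hypothetical.

```python
def human_hours_per_day(captchas_per_day: int, seconds_each: float) -> float:
    """Convert a daily CAPTCHA count into total human hours spent per day."""
    seconds_per_hour = 3600
    return captchas_per_day * seconds_each / seconds_per_hour


# 100 million CAPTCHAs per day is the figure quoted from Von Ahn (2008).
# 10 seconds each is an assumption made for this example only.
hours = human_hours_per_day(100_000_000, 10)
print(f"{hours:,.0f} human hours per day")
```

Under that assumption the total comes to roughly 278,000 hours a day, which is consistent with the "hundreds of thousands" in the quotation.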
To aid the digitization of books, reCAPTCHA shows the user two words: the first is an automatically generated control word used to verify the user, and the second is a word from a book currently being digitized. By typing the second word, the user converts the image of the text into a format that the computer can read. Once this has been repeated 10 times for the same word, the answers are compared with each other and with the computer-based interpretations to arrive at an accurate reading of the printed text (Von Ahn, 2014). The reCAPTCHA project went on to achieve a 99.1% accuracy rate, which is within the acceptable "over 99%" industry standard guarantee for "key and verify" transcription techniques (Von Ahn, 2008). The company was later bought by Google, and reCAPTCHA is now a free-to-use alternative to CAPTCHA.
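The two-word process described above can be sketched in a few lines of Python. This is only an illustration of the idea and not the actual reCAPTCHA implementation: the function names, the equal weighting of human and computer answers, and the 60% agreement threshold are all assumptions made for this example. Only the figure of 10 repetitions comes from the text.

```python
from collections import Counter


def normalise(word: str) -> str:
    """Ignore case and surrounding spaces when comparing answers."""
    return word.strip().lower()


def passes_control(answer: str, control_word: str) -> bool:
    """The user is treated as human if the known control word is typed correctly."""
    return normalise(answer) == normalise(control_word)


def resolve_unknown_word(human_answers, ocr_guesses, min_answers=10, min_share=0.6):
    """Return the agreed reading of the book word, or None if it is not settled.

    min_answers=10 follows the essay; min_share and the one-vote-per-answer
    weighting are hypothetical choices for this sketch.
    """
    if len(human_answers) < min_answers:
        return None  # not enough people have typed this word yet
    votes = Counter(normalise(a) for a in human_answers)
    votes.update(normalise(g) for g in ocr_guesses)
    word, count = votes.most_common(1)[0]
    share = count / sum(votes.values())
    return word if share >= min_share else None


# Example: ten users type the book word, and two OCR programs also guess it.
answers = ["morning"] * 8 + ["moming"] * 2
guesses = ["morning", "rnorning"]
result = resolve_unknown_word(answers, guesses)
print(result)
```

In this example "morning" receives 9 of the 12 votes, so it is accepted as the reading of the word. A word whose answers disagree too much would return None and could be shown to more users.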
word count: 451
Carnegie Mellon University, 2010. CAPTCHA: Telling Humans and Computers Apart Automatically [Online]. Available at: http://www.captcha.net/ [Accessed 24th November 2014]
Von Ahn, L. 2008. reCAPTCHA: Human-Based Character Recognition via Web Security Measures. Pittsburgh. Sciencemag.org. Available at: http://www.cs.cmu.edu/~biglou/reCAPTCHA_Science.pdf [Accessed 24th November 2014]
Von Ahn, L. 2014. Massive-scale online collaboration. TEDx Talks [video online]. Available at: https://www.youtube.com/watch?v=-Ht4qiDRZE8 [Accessed 24th November 2014]
Van der Smagt, P. 1990. A comparative study of neural network algorithms applied to optical character recognition. Amsterdam. University of Amsterdam.