Beyond the Turing Test

By Charles Simon, Nationally recognized entrepreneur and software developer.

Image by Juan Alberto Sánchez Margallo, CC BY 2.


With artificial intelligence (AI) seemingly touching every aspect of our lives, most experts agree that it’s only a matter of time before today’s AI evolves into Artificial General Intelligence (AGI), a point at which computers meet or even exceed human intelligence.

The question that remains, though, is how will we know when that happens?

In 1950, Alan Turing introduced his famous test as a method for determining whether or not a machine was actually thinking.

While his test has gone through some evolution since his original paper, a common explanation goes like this: a person, the interrogator (C), can communicate via a computer terminal (these days, we might say by instant messaging, emailing, or texting).

At the other end of the computer link is either a human (B) or a computer (A).

After 20 minutes of keyboard communication, the interrogator states whether a person or a computer was at the other end.

If the interrogator believes he was conversing with a human, but it's actually a computer, the conclusion is that the computer must be thinking like a human.

The experiment is then repeated with multiple interrogators; the computer "passes" the test if more than half of them judge it to be human.

A more recent adaptation to the Turing Test reduces the conversation to five minutes and considers the test passed if the computer fools the subject better than 30% of the time.
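The two pass criteria can be expressed as a simple tally. The sketch below is purely illustrative; the function names and the example verdict list are invented, and the 33% figure is the one reported for Cleverbot later in this article.

```python
def passes_classic(verdicts):
    """Classic reading: over repeated 20-minute sessions, the computer
    passes if more than half of interrogators judge it human.
    `verdicts` is a list of booleans: True = judged human."""
    return sum(verdicts) / len(verdicts) > 0.5

def passes_relaxed(verdicts):
    """Later adaptation: five-minute sessions, pass if the computer
    fools interrogators more than 30% of the time."""
    return sum(verdicts) / len(verdicts) > 0.30

# Hypothetical tally matching the 33%-fooled result reported below:
verdicts = [True] * 33 + [False] * 67
print(passes_classic(verdicts))   # False
print(passes_relaxed(verdicts))   # True
```

Note how much the choice of threshold matters: the same transcript record fails Turing's majority criterion but clears the relaxed 30% bar.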

In 2014, a program called Cleverbot (http://www.cleverbot.com/) was claimed to have passed the Turing Test by fooling 33% of interrogators.

While Cleverbot has some sophisticated responses, my interaction with it quickly exposed its limitations.

Rather than quibble with Cleverbot’s claims, though, I would rather quibble with Turing’s test.

While it represented a great leap at the time of its publication, I have concerns about the accuracy of the test. Suppose we had true AGI systems, and the positions were reversed.

Suppose it’s an AGI deciding whether you are a computer or a human.

How good a job would you do?

At the recent AGI-20 conference, one attendee commented that a test for true intelligence would be the ability to design a test for true intelligence.

Since we don't have such a test, are none of us truly intelligent?

To get around these issues, I propose adjusting the Turing Test.

Instead of individual interrogators making up more-or-less random questions, we could create sets of standard types of questions designed to probe various facets of intelligence.

Instead of comparing the computer’s responses to an individual human responder, compare the computer to a spectrum of human respondents of different ages, sexes, backgrounds, and abilities.

Now, recast the interrogators as judges who individually score the test results indicating whether or not each answer is a “reasonable” response to the question.

The questions and answers should be shuffled randomly so that judges cannot track individual respondents and develop scoring biases.

For example, if a respondent gives one low-scoring answer, it should not color the perceived quality of other responses from that respondent.

Sample questions targeting specific component areas of intelligence could include the following: "What's wrong with this picture?" requires not only object recognition within the image but also a real-world understanding of the use of, and relationships among, the objects.

From: Koch, Christof and Giulio Tononi. "A Test for Consciousness: How will we know when we've built a sentient computer? By making it solve a simple puzzle." (2011).

While these questions could be posed equally to a thinking machine and a human, we would presume that the two would give significantly different answers, making it easy to distinguish the computer from the person.

Instead, the response to each question is graded by several judges as meaningful or not meaningful.

We can then conclude that the computer is thinking if it gives a similar proportion of meaningful answers to the human respondents.

The key issue is that the questions need to be open-ended, in order to let respondents demonstrate that they really understand them.

The types of questions can be varied to create a limitless collection, which prevents the computer from being primed with specific answers. Answering them would require actual thought.

Likewise, any single judge may not be great at determining reasonableness in an individual answer, but with multiple judges rating multiple respondents, we should get a good assessment.
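The scoring arithmetic behind this proposal can be sketched in a few lines. Everything here is illustrative: the function names are invented, and the stand-in "judges" are toy functions, whereas in the real test the judgments would come from people rating shuffled, anonymized answers.

```python
from statistics import mean

def score(answers, judges):
    """Fraction of (answer, judge) pairs rated meaningful.
    `judges` is a list of functions returning True/False; in the
    real test these are human judgments, blinded by shuffling
    answers from all respondents together."""
    return mean(judge(a) for a in answers for judge in judges)

def within_human_range(computer_answers, human_answer_sets, judges):
    """The computer 'passes' if its share of meaningful answers
    falls inside the spread of the human respondents' shares."""
    human = [score(a, judges) for a in human_answer_sets]
    c = score(computer_answers, judges)
    return min(human) <= c <= max(human)

# Toy example with stand-in judges (real judges are humans).
judges = [lambda a: len(a.split()) > 2, lambda a: "because" in a]
humans = [["I missed the bus because it rained", "Cats chase mice"],
          ["The sky is blue", "Ice melts because it warms"]]
computer = ["The cup fell because the table tilted", "Birds fly south"]
print(within_human_range(computer, humans, judges))  # True
```

Averaging over many (judge, answer) pairs is what makes the scheme robust: one harsh judge or one weak answer moves the score only slightly, which is exactly the point made above.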

How about allowing the AGI to be one of the judges?

Bottom line: It's time to replace the Turing Test with something better.

We have already reached a level of AI development where we can see that continued efforts targeted solely at fooling humans on a Turing Test are not the correct direction for AGI creation.

Bio: Charles Simon, BSEE, MSCS, is a nationally recognized entrepreneur and software developer with many years of computer industry experience, including pioneering work in AI.


Simon's technical experience includes the creation of two unique Artificial Intelligence systems, along with software for successful neurological test equipment. Combining AI development with biomedical nerve signal testing gives him a singular insight.

He is also the author of Will Computers Revolt?: Preparing for the Future of Artificial Intelligence, and the developer of Brain Simulator II, an AGI research software platform that combines a neural network model with the ability to write code for any neuron cluster to easily mix neural and symbolic AI code.

