Preprint version: not for scholarly citation.
For serious scholarly attribution, please refer to the published version in

Minds and Machines, Vol. 11 (2001), No. 1, pp. 41-51.  Reprinted in James H. Moor (ed.), The Turing Test: The Elusive Standard of Artificial Intelligence (Kluwer, 2003), pp. 185-196.

Look Who's Moving the Goal Posts Now

Larry Hauser

Alma College

Abstract

The abject failure of Turing's first prediction (of computer success in playing the Imitation Game) confirms the aptness of the Imitation Game test as a test of human-level intelligence.  It especially belies fears that the test is too easy.  At the same time, this failure disconfirms expectations that human-level artificial intelligence will be forthcoming any time soon.  On the other hand, the success of Turing's second prediction (that acknowledgment of computer thought processes would become commonplace) in practice amply confirms the thought that computers already think in some manner and are possessed of some level of intelligence.  This lends ever-growing support to the hypothesis that computers will eventually think at a human level, despite the abject failure of Turing's first prediction.

Turing's Failed Prediction

In 1950 Alan Turing made two predictions.  First and most famously, he predicted

that in about fifty years' time it will be possible to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent. chance of making the right identification after five minutes of questioning. (Turing 1950, p. 442)

This prediction has failed abjectly. Current contestants play the Imitation Game so ill that an average interrogator has a 100 percent chance of making the right identification.  By Turing's measure, in other words, current contestants play with no measurable success.  What this says about the adequacy of the Turing test as a test of high-grade or human-level artificial intelligence, I argue, is that it's about right.  What it means for AI itself is more equivocal and best assessed in the light of the success of Turing's second prediction.

The Test

It's Not Too Easy

Some fear the Turing test is too easy -- that Turing test passing would not suffice to warrant attribution of thought -- because

people are easily fooled and are especially easily fooled into reading structure in chaos, reading meaning into nonsense (Shieber 1994, p. 72);

and

it has been known since ELIZA that a test based on fooling people is confoundingly simple to pass. (Shieber 1994, p. 72)

ELIZAphobia, I call it.1  To those prone to ELIZAphobia, I believe, the moral of the abject failure of computers to fulfill Turing's prediction is this: your fears are unfounded.  Eager though we are to read structure into chaos and meaning into nonsense, the experience of the Loebner Prize competition -- not to mention the last fifty years -- attests that the unrestricted Turing test is confoundingly hard.

I agree that if a program very much like ELIZA could pass, that would be very good  reason to doubt the sufficiency of the Turing test.2  But the evidence of the Loebner prize competition suggests that nothing like ELIZA has the remotest chance of passing the unrestricted test.  This supports the judgment that "if the Turing test was passed, then one would certainly have very adequate grounds for inductively inferring that the computer could think on the level of a normal, living, adult human being"  (Moor 1976, p. 251).  The abject failure of Turing's prediction together with our intuitive low estimates of the intellectual capacities of the current generation of computer contestants argues strongly for the empirical sufficiency of Turing's test.
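
To make vivid what "something like ELIZA" amounts to, here is a minimal sketch, in Prolog, of the ELIZA-style trick: match a keyword in the input and emit a canned, vaguely reflective reply.  This is my own illustrative toy, not Weizenbaum's program; the predicate respond/2 and the sample replies are invented purely for the example.

    :- use_module(library(lists)).   % for member/2

    % Keyword-triggered canned responses, ELIZA-style (illustrative only).
    respond(Words, 'Tell me more about your mother.') :-
        member(mother, Words), !.
    respond(Words, 'Why do you say you are unhappy?') :-
        member(unhappy, Words), !.
    respond(_Words, 'Please go on.').   % default when no keyword matches

    % Example query at the Prolog prompt:
    % ?- respond([i, am, unhappy, about, my, job], Reply).
    % Reply = 'Why do you say you are unhappy?'.

A program of this sort can momentarily read as attentive, which is the source of ELIZAphobia; but, as the Loebner Prize transcripts attest, a few minutes of unrestricted questioning exposes it at once.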

It's Not Too Hard

As a test of intelligence or thought per se the test is obviously too hard.  Neither my cat nor my computer can play the Imitation Game successfully.  Nevertheless, I don't doubt that my cat exhibits intelligence to some extent, and thinks in his own peculiar manner.  Likewise, my computer exhibits intelligence to some extent, and thinks in its own peculiar manner.  That's what I think.

Even as a test of human-level intelligence, the Turing test, followed to the letter, seems too hard.  Presumably something "could think on the level of a normal, living, adult human being" (Moor 1976, p. 251) without thinking in the manner of a normal adult human being, i.e., without sharing our cognitive style with all of its peculiarities.  Thought or intelligence up to our level needn't share our style.  Turing himself observes,

The game may perhaps be criticized on the ground that the odds are weighted too heavily against the machine. If the man were to try and pretend to be the machine he would clearly make a very poor showing. He would be given away at once by slowness and inaccuracy in arithmetic. (Turing 1950, p. 435)

Turing asks, "May not machines carry out something which ought to be described as thinking but which is very different from what a man does?" and acknowledges "this objection is a very strong one" (Turing 1950, p. 435).  Still, so long as the computer contestants' failings obviously bespeak cluelessness, not just inhuman style, we need not be troubled by this objection.

It's About Right

"Machines take me by surprise with great frequency,"  Turing writes, "because, although I do a calculation, I do it in a hurried, slipshod fashion" (Turing 1950, p. 450).   To the attentive reader Turing's presentation of the Imitation Game test may seem likewise slipshod.3  First he says that the machine contestant is supposed "to take the part of A," the man, pretending to be a woman. Yet, a while later Turing asks,

Is it true that . . . [a computer] can be made to play satisfactorily the part of A in the imitation game, the part of B [the confederate] being taken by a man? (Turing 1950, p. 442: my emphasis)

Yet a little further on, he observes,

 The game (with the player B omitted) is frequently used in practice under the name of viva voce to discover whether some one really understands something or has "learnt it parrot fashion." (Turing 1950, p. 446),

here allowing the confederate, it seems, to be inessential.  Turing also speaks of "playing . . . the imitation game . . . against a blind man" (Turing 1950, p. 455).  Finally, the question of what importance to attach to the averageness of the interrogator, the five-minute time limit, and the 70% rate of success mentioned in the prediction leaves room for further doubts about the details of the test intended.

Rather than lecture Turing on the subject of his slipshod ways in this connection, however, I commend them: when "speaking of such subjects and with such premises" it is best to "be content . . . to indicate the truth roughly and in outline" (Aristotle, Nic. Eth., Bk. I, Ch. 3); and in this spirit also should Turing's proposal be received, as a rough approximation.  So taken, the Turing test is apt.  It tests the contestant's ability "to use words, or put together other signs, as we do in order to declare our thoughts to others" or "produce different arrangements of words so as to give an appropriately meaningful answer to whatever is said in its presence" (Descartes 1637, p. 140).  Such verbal responsiveness is what we normally do use as a basis for assessing mental competence, e.g., in determining to whom legal and moral entitlements apply, and by whom moral and legal responsibilities are owed.  This lends powerful support to the thought that Turing-like tests are just right as tests for human-level intelligence.

More generally, no test is sacrosanct; testing needs to be flexible.  If the time comes when computer contestants seem to be failing mainly due to the inhuman style of their thought, rather than its low level, it would hardly be contrary to the empirical spirit of Turing's proposal to alter the test to try to control for this.4  Similarly, it is hardly contrary to (the spirit of) Turing's proposal if we have to tweak the test to accommodate the needs of particular subjects.  This is why it's no objection to the Turing test to point out, e.g.,  that the test as proposed would disqualify nontypists and the illiterate. Where subjects' disabilities are irrelevant to their intelligence (as these are), accommodating such disabilities accords, generally, with the spirit of empirical testing and with the "fair play" spirit of Turing's proposal, in particular.

This, incidentally, is why Hugh Loebner's revision of the Turing test to require the contestant to "respond intelligently to audio-visual input" (Loebner 1994, p. 82: original italics) is ill-advised.  It proposes to discriminate against computers on the basis of their visual and auditory disabilities, contrary to Turing's proposal, and contrary, as far as I can see, to any sound spirit or principle whatever.  If the test did need strengthening -- as the abject failure of Turing's prediction argues it doesn't -- it shouldn't be strengthened in this way.  This audio-visual requirement should be dropped.  Alternatively, if the time comes when Loebner would like to ask his questions "about images and patterns" (Loebner 1994, p. 82), the rules should be amended to require inclusion of a blind person among the confederates.

AI

Modest AI Covered

If AI is understood to be the thesis that computers can think or be genuinely possessed of some intelligence, then the failure of computers to pass the Turing test is inconsequential, since the test is obviously too hard to serve as a disqualifying test for thought or intelligence per se.  Just as my cat's inability to pass the Turing test has no tendency to undermine his claim to some manner of thought and some degree of intelligence, the abject failure of computers to pass the Turing test provides no good reason to deny them some manner of thought and degree of intelligence; even if Turing test passing computers are very far "over the horizon"; and even, perhaps, if the horizon "seems to be receding at an accelerating rate," as Hubert Dreyfus (1979, p. 92) complains.

Since ensuing years have only sharpened Dreyfus' "receding horizon" complaint, if something like Turing test passing capacity were required before attribution of any sort of thought or intelligence were warranted, Dreyfusian arguments such as the following would have considerable bite:

1.  Turing test passing ability is a necessary condition for (warranted attribution of) thought or intelligence.
2.  Probably, no computer will ever pass the Turing test.
 :. C.  Probably, no computer will ever (be warrantedly affirmed to) think.

Turing test passing ability not being required for warranted attribution of thought or intelligence per se, however, the Dreyfusian argument does not go through against AI modestly understood; understood as asserting "merely" that computers really do think, and are intelligent in their own peculiar ways.

Oh the Humanity: Immodest AI Exposed

On the other hand, AI is more famously understood to be advancing the more ambitious claim that a "computer could think on the level of a normal, living, adult human being" (Moor 1976, p. 251).  Such immodest AI (as I call it) is vulnerable to the Dreyfusian argument.  It is vulnerable because the Turing test is a plausible disqualifier for such human-level intelligence.  In speaking of thought "on the level of a normal, living, adult human being" we are talking of something like moral agency or personhood; such that we should have to ask ourselves in all moral seriousness the sorts of questions Robert Epstein proposes we shall have to ask of a Loebner Prize winner.  Questions like,

"Who should get the prize money?  . . . Should we give it the right to vote?  Should it pay taxes?" (Epstein 1992 as cited by Shieber 1994, p. 70)

We should have to ask such questions in all moral seriousness because the ability "to use words or put together other signs" so as to "give an appropriately meaningful answer to whatever is said" (Descartes 1637, p. 140) is what we ordinarily do take to qualify normal adults and disqualify others for moral and legal responsibilities and entitlements.  The Turing test's credibility as a disqualifying test for human-level intelligence -- in light of the abject failure of Turing's first prediction -- leaves AI immodestly understood exposed to the Dreyfusian argument's bite.  At the very least the "probably none ever will" claim of the Dreyfusian argument is confirmed by the abject failure of Turing's prediction.  It remains, of course, to quarrel as to what degree and subject to what further considerations.

I believe that hope for immodest AI remains viable, given further considerations.  These derive from the success -- less equivocal than it may seem -- of Turing's neglected second prediction.

Turing's Second Prediction

[A]t the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted. (Turing 1950, p. 442)

Naive AI

I submit that the success of this second prediction is less equivocal than it may seem.  Consider the following exchange:

MARGARET WARNER: All right. Let me bring Mr. Friedel back in here. Mr. Friedel, did Gary Kasparov think the computer was thinking?

FREDERIC FRIEDEL: Not thinking but that it was showing intelligent behavior. When Gary Kasparov plays against the computer, he has the feeling that it is forming plans; it understands strategy; it's trying to trick him; it's blocking his ideas, and then to tell him, now, this has nothing to do with intelligence, it's just number crunching, seems very semantic to him. [Friedel is Kasparov's technical advisor.] (MacNeil & Lehrer 1997)

It certainly does seem very semantic.  But it seems equally "semantic" to deny Deep Blue's exercise of its intelligence the name of thought.  Naive judgments serving practically and predictively (as in trying to psych out Deep Blue's plans and strategy) trump theoretical misgivings (reluctance to call such planning and strategizing "thinking") in the absence of empirical and scientific theoretic support for such misgivings.  In lieu of such support, modest AI prevails directly in virtue of arguments like this:

1.  Forming plans and understanding strategy are intelligent activities.
2.  Deep Blue forms plans and understands strategy.
:. C. Deep Blue is intelligent (to some extent): it thinks in some manner.

Every intelligent-seeming computer act underwrites an argument of this naive type.  And the more such behavior computers display, and the more interconnectedly they display it, the more hopeful things look for immodest AI, despite the abject failure of Turing's first prediction.  This is the main empirical reason why immodest AI remains a viable hope.5

Since the behaviors in question are intelligent-seeming, the opponent of these naive arguments needs to maintain that such behavior does not evince true thought or genuine intelligence because some essential characteristic is lacking.  Thus the naive argument calls for a theoretical rejoinder.  To be sustained, such a rejoinder must

1. scientifically support its essence claim on behalf of the characteristic in question, and

2. empirically support its claim that computers lack this characteristic.

 I believe the prime candidates for the office of disqualifying essential characteristic -- unity, intentionality, and phenomenal consciousness -- all fail on these counts.

Unity First

The unity objection holds that it's necessary, for a thing to really be thinking, for it to have enough interconnected intelligent-seeming characteristics or abilities.  The would-be disqualifying thought is that Deep Blue and other candidate modest AIs lack enough interconnected mental abilities for their intelligent-seeming performances to be truly considered thought.  To be sustained, the unity objection requires some account of how many and which other mental abilities a thing must have in order to think; and why.  And the rub -- as with the Turing test as a test of thought per se -- is how to disqualify my computer without disqualifying my cat.  In light of computers' many actual intelligent-seeming capabilities and their (theoretically almost unlimited) potential for acquiring more; and in light of the somewhat limited capacities of my cat; I do not believe the unity objection to modest AI is sustainable.6

Nevertheless, for expository purposes, I am going to imagine the Unity objection to be sustained.  Exit my cat.  Reenter the Turing test; and rejoin the issue of immodest AI that has occasioned so much interest.  I contend that intentionality and phenomenal consciousness objections cannot be sustained even against immodest AI claims.  They are thereby shown to be unsustainable against modest AI in spades.7

The Imitation Game Revisited

Recall the original man-woman version of the game.  In this version, no matter how female-seeming the man's manner of conversation, revelation of what's hidden overrides the conversational evidence; because that's the essential thing: not the style and content of their conversation, but the content of their jeans [sic.].  In the same manner, the two objections now to be considered -- the phenomenal consciousness and intentionality objections -- urge that no matter how intelligent or thoughtful-seeming the computer's conversation, revelation of what's hidden in it overrides the conversational evidence; because that's the essential thing: phenomenal consciousness or intentionality.

Phenomenal Consciousness

Phenomenal consciousness is "inward" experience, or subjectivity, or possession of private conscious experiences or qualia. It's that certain je ne sais qua but you don't know what it is; or, at least, you can't say.  It's the stuff that souls are made of; a spiritual concept, I think; and whereof one cannot scientifically speak, thereof one must pass over, scientifically, in silence.  Since disqualification of a Turing test passer requires scientific-theoretic and empirical support to be sustained, I submit the phenomenal consciousness objection fails.  Let me assemble a few reminders why.

With regard to the scientific theoretic standing of consciousness . . . for a very long time consciousness was regarded as the foundational concept of psychology.  Introspectionism was the last gasp attempt to base psychology as an empirical science on such a phenomenological foundation.  Introspectionism was a flop.  The trouble was -- allowing verbal reports of consciousness to go proxy for direct introspective observation, as one must in the case of others' conscious experiences -- everyone disagreed in their verbal reports.  At present, I believe, no credible scientific psychological theory features consciousness as a fundamental concept.8

The chief trouble with consciousness as a scientific concept -- in a word -- is the subjectivity of it.  Turing notes the impossibility of observing and consequent difficulty of confirming its presence in others in suggesting "those who support the argument from consciousness" might "be willing to accept our test" rather than "be forced into the solipsist position" (Turing 1950, p. 447).  Though some entertain hopes of discovering the neurophysiological basis or computational basis of phenomenal consciousness -- so as to be able to scientifically infer the presence of the phenomenal consciousness we can't observe in others from such neurophysiological or computational causes as we can -- such would-be solutions to the other minds problem, even if successful, would only solve half the problem; and not the half that most concerns us.  They would provide "sufficient but not necessary conditions for the correct ascription of [phenomenal consciousness] to other beings" (Searle 1992, p. 76); but it's the detection of the absence of qualia, or discovery of what's causally necessary for phenomenal consciousness, that concerns us for purposes of disqualifying Turing test passing computers.  And there seems no scientific or anywise empirical hope of establishing the absence of phenomenal consciousness or qualia in anything . . . unless by telepathic scan.  Even then . . . perhaps the telepath is sensitive only to humanoid qualia.  Perhaps a Turing test passing computer would have qualia quite unlike ours, and the telepath's failure to detect any phenomenal consciousness in the computer might be due to our human telepath's insensitivity to such alien qualia, and not to the machine's lacking qualia altogether.

I conclude there is no scientific reason to think phenomenal consciousness is the true essence of thought; and if there were -- when faced with a truly Turing test passing computer -- we would have no empirical grounds to deny such a computer to be possessed of such phenomenal consciousness as its conversation would suggest.

Intentionality: Why Robot?

I will not dispute that intentionality -- meaning or aboutness -- is essential to thought; but scientific grounds for thinking a Turing test passing computer would lack the requisite intentionality are slim to nonexistent.

Views according to which intentionality -- meaning or "aboutness" -- is supposed to boil down to consciousness, in addition to facing the aforesaid troubles with consciousness, undertake the burden to say how it boils down; which none can say.   We can dismiss such views immediately.

A more credible approach maintains that computers' "symbol" processing is not sufficiently grounded in causal-perceptual and causal-robotic interaction with things for its symbol processing to really be about these things.  On this view, the missing ingredient -- what we have that computers lack -- is the causal connection between signs and the things they signify.  Put crudely, the difference between my meaningful belief that cats are intelligent and a computer's meaningless "representation" of this same information -- say by storing a Prolog clause that says intelligent(X):-cat(X) -- is that my representation came to be, or could be, elicited in me by the actual presence (the actual sights and sounds) of cats.  It is these perceptually mediated connections between my use of the English word "cat" and actual cats that make that word signify those animals for me; and it is for want of such connections that computer representations lack such signification for the computer (cf. Hauser 1993).
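
For concreteness, here is a minimal sketch of the kind of stored "representation" just described, run as a small Prolog program.  The fact cat(felix) and the sample query are my own illustrative additions; the point is only that the program happily derives intelligent(felix) without any causal contact with cats whatsoever.

    % The clause from the text, plus an invented fact for illustration.
    cat(felix).                    % a hypothetical stored "fact"
    intelligent(X) :- cat(X).      % "anything that is a cat is intelligent"

    % Example query at the Prolog prompt:
    % ?- intelligent(felix).
    % true.

Nothing in the program links the symbol cat/1 to the sights and sounds of actual cats; on the causal story, that is precisely what would be wanted for the clause to be about cats for the computer.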

I will not dispute the general causal story about reference underlying this objection.  I do, however, dispute the use of this story to justify discrimination against would-be Turing test passing computers on the basis of their sensory or motoric disabilities.  On any plausible telling of the causal story no extensive ability to apply words to the world directly on the basis of sensory acquaintance or physical manipulation is crucial.  As Turing notes, "We need not be too concerned about the legs, eyes, etc.  The example of Miss Helen Keller shows that education can take place provided communication in both directions between teacher and pupil can take place" (Turing 1950, p. 456).9   My computer communicates with me -- and I with it -- through its touch pad and LCD screen.  That's why the figures Excel manipulates when it calculates my students' grades are about my students' grades.  The conversation of  a Turing test passing computer would likewise be about what it was discussing, for similar reasons.

Conclusion

In the light of the success of Turing's second prediction in practice, I take modest AI to be a present reality: generally speaking, computers really are possessed of the mental states and capacities our naive assessments say they are.  Deep Blue really considers chess positions and evaluates possible continuations; my humble laptop really searches for -- and sometimes finds -- files in storage and words in documents.  If it asks like it thinks, and answers like it thinks, and extemporizes like it thinks, it ain't necessarily thinking.  Still, prima facie it thinks; and you're warranted in saying and believing as much on the basis of such evidence.  In the case of ducks we know what sort of observable facts would scientifically override the quacking, and waddling, and ducklike appearance, if it turns out to be a mechanical duck.  We know no such things in the case of Deep Blue's forming plans and understanding strategy; and we know no such things which -- if we were faced with a Turing test passing computer -- would warrant withholding attribution of human-level thought.  Of course, by the time we are faced with that eventuality -- if ever -- we may know more about thought.  Then again, what more we know may not be disqualifying.  Though the test itself -- like any empirical test -- is negotiable, there is presently no empirical or theoretical reason to renegotiate.

Immodest AIs -- computers with human-level thought -- are still very far over the horizon, I suspect.  If that horizon "seems to be receding at an accelerating rate" (Dreyfus 1979, p. 92), that is because it initially appeared, to Turing among others, to be closer than it really was.  Ironically, what I have been calling immodest AI for the immodesty of its aspirations suffers, more than anything else, perhaps, from too modest an assessment of our own human mental capacities.  Nature has taken some four billion years to evolve human intelligence from inanimate matter.  Through us, she has taken some fifty years to arrive at the current level of artificial intelligence.  Wait, I say, till next millennium!

Acknowledgements

I am indebted to James Moor for his encouragement and inspiration.  Thanks are also due to Ethan Disbrow, Tanisha Fuller, Lark Haunert, Anne Henningfeld, Holly Townsend, and Jeanette Watripont for their helpful comments and criticisms.

Notes

  1. Shieber was addressing these remarks to the restricted version of the test.  Ironically, despite his general reservations about tests based on fooling people being confoundingly easy, Shieber dismisses the unrestricted Turing test as impossibly hard and the Loebner Prize competition as incredibly premature.
  2. Pace Block 1981.
  3. See Saygin et al. (forthcoming) for further discussion.
  4. This might be helped somewhat by requiring cultural diversity among querents and confederates.  Inclusion of "differently abled" humans among the confederates and querents might also help. 
  5. A hope buttressed, of course, by the theoretical universality of these machines.
  6. Descartes' (1637) advocacy of Turing test passing (roughly) as a necessary condition for any intelligence whatsoever underwrites his infamous denial of any mental capacity to any infrahuman animal at all.  Herein lies a cautionary tale.
  7. If (evidence of) phenomenal consciousness or robustly-robotically-grounded intentionality are not justifiably held to be prerequisite for (warranted attribution of) human-level intelligence, they certainly cannot be warrantedly asserted to be prerequisite for (attribution of) lesser varieties -- say sparrow-level or starfish-level intelligence.
  8. Though many a crackpot theory does.
  9. See Putnam's (1975) discussion of the elm/beech and aluminum/molybdenum examples and the division of linguistic labor.  See also Landau and Gleitman's (1985) findings concerning blind children's acquisition of seemingly vision-dependent concepts.

Bibliography

·   Aristotle.  Nicomachean Ethics, trans. W. D. Ross (1941) in R. McKeon, ed., The Basic Works of Aristotle, New York: Random House, pp. 935-1126.

·   Block, N. (1981), Psychologism and Behaviorism, Philosophical Review XC, pp. 5-43.

·   Descartes, R. (1637), Discourse on Method trans. R. Stoothoff (1985) in J. Cottingham, D. Murdoch, and R. Stoothoff, eds., The Philosophical Writings of Descartes, Vol.1, Cambridge: Cambridge University Press, pp. 111-151.

·   Dreyfus, H. (1979), What Computers Can't Do, New York: Harper Colophon.

·   Epstein, R. (1992), The Quest for the Thinking Computer, AI Magazine 13, pp. 81-91.

·   Hauser, L. (1993), Why Isn't My Pocket Calculator a Thinking Thing?, Minds and Machines 3, pp. 2-10.  Online: http://members.aol.com/lshauser/wimpcatt.html.

·   Keller, H. (1912), The Hand of the World, in Out of the Dark: Essays, Letters and Addresses on Physical and Social Vision, Garden City, NY: Doubleday, Page & Co., pp. 3-17.

·   Keller, H. (1913), A New Chime for the Christmas Bells, in Out of the Dark: Essays, Letters and Addresses on Physical and Social Vision, Garden City, NY: Doubleday, Page & Co., pp. 274-282.

·   Landau, B. and Gleitman, L. (1985), Language and Experience: Evidence from the Blind Child, Cambridge, MA: Harvard University Press.

·   Loebner, H. (1994), In Response, Communications of the ACM 37, pp. 79-82.  Online: http://www.loebner.net/Prizef/In-response.html.

·   MacNeil, R. and Lehrer, J. (1997), Big Blue Wins, NewsHour May 12.  Accessible online: http://www.pbs.org/newshour/home.html.

·   Moor, J. (1976), An Analysis of the Turing Test, Philosophical Studies 30, pp. 249-257.

·   Putnam, H. (1975), The Meaning of `Meaning', in Mind, Language and Reality: Philosophical Papers, Vol. 2, Cambridge: Cambridge University Press, pp. 215-271.

·   Saygin, A. P., Cicekli, I., & Akman, V. (forthcoming), Turing Test: 50 Years Later.  Online: http://cogsci.ucsd.edu/~asaygin/papers/tt50abs.html.

·   Searle, J. R. (1992),  The Rediscovery of the Mind, Cambridge, MA: MIT Press.

·   Shieber, S. M. (1994), Lessons from a Restricted Turing Test, Communications of the ACM 37, pp. 70-78.  Online: http://www.eecs.harvard.edu/shieber/papers/loebner-rev-html/loebner-rev-html.html.

·   Turing, A. M. (1950), Computing Machinery and Intelligence, Mind LIX, pp. 433-460.