David Gunkel and the rights of the robot

David J. Gunkel is an American academic and Presidential Teaching Professor of Communication Studies at Northern Illinois University. He teaches courses in web design and programming, information and communication technology, and cyberculture.

Gunkel has an impressive list of books to his name, including An Introduction to Communication and Artificial Intelligence, Robot Rights, The Machine Question, Of Remixology, and (together with Paul A. Taylor) Heidegger and the Media. The long-awaited Person, Thing, Robot: A Moral and Legal Ontology for the 21st Century and Beyond was recently published by MIT Press. He is currently touring universities in the US and Europe to lecture and to publicize his book and his ideas about the rights of robots. Gunkel is an enthusiastic speaker and active on X. I became familiar with his work through Twitter.

A central theme in his work is the question whether robots, intelligent social machines, should be regarded as moral ‘agents’ and whether they should also be granted rights in a legal sense, as humans, animals, rivers, and organizations have. Gunkel claims that the phenomenon of AI, and of social robots in particular, is disrupting our frameworks of thought. Where should we place the robot ontologically? Is the robot a thing or a person? His answer: the robot does not fit into either category. We need to rethink the old metaphysics. Back to Plato.

The term Gunkel uses for this project is deconstruction. The concept is central to his book Deconstruction. The term comes from Jacques Derrida, who, together with Emmanuel Levinas, is one of the two French thinkers against whose ideas Gunkel measures his own. Gunkel’s deconstruction project consists of two phases, an analytical phase and a synthetic phase, and one could say that in the first he is mainly in dialogue with Derrida and in the second with the ideas of Levinas, in which the ethical (the Other) is prior to ontology (Being itself). The notion of ‘face’ (Dutch: ‘gelaat’, not ‘gezicht’) plays a key role. (*) Here lies a link with the technical ‘interface’ between man and machine: (programming) language is that interface.

The ethical issue could be phrased as: does the machine have a face (‘gelaat’)? Does the robot appeal to us in a moral sense? It is known from human-computer interaction research that users tend to treat the machine as a ‘social agent’ in interactions. Giving the machine all kinds of anthropomorphic characteristics (a face) reinforces this tendency. The technicians do not teach the robot to speak and understand our language with the aim of misleading the user into thinking that the robot is one of us, but because it benefits the operation, the interaction and the ease of use. The user-unfriendliness of AI and the robot, on the other hand, lies in the fact that the robot cannot be held accountable. That is the practical problem of AI. Can we leave the thing to itself? It is therefore exciting to hear what Gunkel has to say about this. Can the robot one day become one of us and be brought to justice? I will come back to this later.

Shared interest

I share my interest in the phenomenon of artificial intelligence and language with Gunkel. The subject of technology and its historical development as part of anthropology has kept me busy since my studies in mathematics and computer science in Twente. I am mainly inspired by my graduation supervisor, the logician and philosopher Louk Fleischhacker. I attended his lectures in Mathematical Logic, Philosophy of Mathematical Thinking and Information Technology. Through these lectures I became acquainted with the work of the great philosophers: Aristotle, Hegel, Frege. We studied the work of Hegel together, and he introduced me to the work of the philosopher Jan Hollak: Hegel, Marx and Cybernetics (1963), Van Causa sui tot Automatie (1966). These insights are still of great value to the debate about automation. In Amsterdam I participated in the Anthropology of Technology study group led by Maarten Coolen, who obtained his PhD on this subject.

At the end of 1978 I graduated from the Theoretical Computer Science department of the Technical University of Twente (now University of Twente) on the formula Z(Z) = Z(Z), where Z = λx.x(x) is the lambda term that stands for the self-application function.

The Z(Z) to the left of the = sign should be read as the application (or an instruction to apply) of the self-application function Z to itself as argument. The Z(Z) on the right is the result of this operation. So it is a ‘dynamic equality’, in principle not much different from more familiar ‘equalities’ such as 4/8 = 1/2 or 5 + 7 = 12. What is on the left is (an assignment for) an operation, but also an indication of the result. What is on the right is again the result, but in normal form. In Z(Z) = Z(Z) the operation therefore results in the operation itself, ad infinitum, just as an unloaded engine keeps running on its own. Completely useless, but self-application is found everywhere people try to understand automation or life mathematically. In the programmable machine the mathematical signs do work; their meanings are realized in it and reduced to a normal form that we can understand. (ChatGPT attempts to normalize our language by imposing the historical articulation of ideas as a norm on the reader.)
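To make the reduction concrete, here is a minimal sketch in Python (my own illustration, not part of the thesis): the self-application function applied to itself only reproduces the same application and never reaches a normal form, so the evaluator recurses until it gives up.

```python
# Sketch of the self-application term Z = λx.x(x).
# Applying Z to itself reduces to Z(Z) again, ad infinitum;
# Python signals this by raising RecursionError.
Z = lambda x: x(x)

try:
    Z(Z)  # Z(Z) -> Z(Z) -> Z(Z) -> ...
except RecursionError:
    print("Z(Z) has no normal form: the evaluation runs indefinitely")
```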

So my graduation project was about ‘the self-application of the self-application as a mathematical expression of the automaton as the external objectification of the self-reflection of the reflection of the mind’.

I worked as a mathematics and physics teacher and obtained my PhD on a topic in theoretical computer science (compiler construction). I then worked as a researcher and teacher in the field of AI and (Bayesian) statistics, where I studied the work of the physicist E.T. Jaynes (Probability Theory: The Logic of Science) and of Judea Pearl (Causality, The Book of Why). Our students learned how to train networks that could generate Shakespearean texts. We used Bayesian networks for automatic dialogue act recognition. My lectures were called Conversational Agents and Formal Analysis of Natural Language, about which Louk once remarked: “If you want to formalize language, you have to formalize the whole person, so that will take a while.” We now know how true this is. The conversational agents soon became ‘embodied conversational agents’, and robots acquired more and more human features after non-verbal gestures were also ‘formalized’ (see the ‘facework’ of the sociologist Erving Goffman). We built virtual suspect characters that could replace the expensive human actors who play the role of suspects in police interrogation training, in serious games (Bruijnes 2016). However, we are not yet finished with ‘the construction of the artificial human’.

When we have a conversation with ChatGPT, we still have to think of ‘the person behind the curtain’. I worked in various European research projects, all of which focused on the interaction, use and interface of humans and intelligent machines. In the last project we developed and tested a serious virtual reality game for children with diabetes to teach them to manage their daily lives in a playful way. The notion of play is important for the discussion about the independence of technology and robot rights in particular.

Louk obtained his PhD on the concept of quantity in Aristotle and Hegel, whom he tried to reconcile in his own unique way. He then wrote his book Beyond Structure. Fleischhacker points out the tendency of philosophers towards ‘mathematism’: the view that structurability is the essence of observable reality and that knowing something means being able to give it a structure and make a mathematical model of it.

“The enterprise, of which this book is a report, consists of an attempt towards a systematic ‘deconstruction’ of mathematism.” This is what Louk Fleischhacker writes in the introductory chapter of his Beyond Structure (1995) about the power and the limits of mathematical thinking.

Deconstruction

Beyond Structure is an attempt at ‘systematic deconstruction of mathematism‘, with Fleischhacker pointing out that deconstruction itself also entails a metaphysical position that should not remain implicit.

In their article ‘ChatGPT: Deconstructing the Debate and Moving It Forward’, the authors Marc Coeckelbergh and David J. Gunkel attempt to revive the debate about the meaning of ChatGPT through a ‘deconstruction’ of the old metaphysical contradictions in which both sides are stuck. What does deconstruction mean?

In Derrida, ‘deconstruction’ refers to the attempt to show the context-dependent history of philosophical texts. These texts are regarded as the traces of thought constructions rather than as the names of transcendental principles. The deconstruction of a way of thinking therefore comes down to showing how it came about. (Fleischhacker, note on page 17).

“Broadly speaking, deconstruction means here (and in this paper) that one questions the underlying philosophical assumptions and core concepts, rather than merely engaging with the existing arguments.” (Coeckelbergh and Gunkel, 2023).

Questioning the underlying assumptions therefore involves rereading and reinterpreting historical texts. An analysis of the historical moments in the development of mathematics that ultimately led to information technology will involve both a deconstruction of mathematism and a deconstruction of the classical metaphysical contradictions that dominate the debate about information technology: in Gunkel subject-object, but especially thing-robot-person. Fleischhacker’s Beyond Structure is mainly about the limits of mathematical thinking, which also mark the limits of the ‘applicability of mathematics’ and of (information) technology. To do this, we must delve deep into the history of modern Western thought. Descartes, Leibniz, Hume, Hegel, Frege and Wittgenstein are our most important interlocutors. They all, to a large extent, thought mathematically. Descartes’ metaphysics is essentially mathematical. His dualism is characterized by a strict separation between two substances: the thinking I and, in contrast, reality, which is essentially extension (res extensa).

The Dutch philosopher H. Pos, in the foreword to Het Vertoog over de Methode (1937), the Dutch translation of the Discours de la Méthode (1637), calls Descartes’ metaphysics a mathematical metaphysics. But Leibniz’s metaphysics also has something mathematical. And in Kant we come across statements indicating that he saw the amount of mathematics in the natural science of his time (with Newtonian mechanics as its paradigm) as a measure of its knowledge content. Modern Western philosophy after Kant also finds it difficult to escape mathematical thinking, even though all kinds of ‘postmodern’ and ‘structuralist’ views on knowledge and language claim to have freed themselves from Cartesianism, the strict mathematical dualism.

The mathematician states that reality is such and so (“Die Welt ist alles, was der Fall ist”) and never retraces his steps. Any form of self-reflection of this thinking, in which subject and object of thinking are thought of as a relationship, leads to paradoxes: the concept of the set cannot be expressed mathematically as a set (Russell). Hegel tries to understand mathematical physics as an adequate expression of the concept of nature. The physicist himself does not do that. Hegel sees the essential characteristics of mathematical thinking and distinguishes it from historical and philosophical thinking, but as a system builder he does not seem to be able to escape mathematics completely. In the meantime, in his Logik he lays the conceptual basis for the concept that would dominate our lives from the beginning of the 20th century to the present day: information, the expression of the qualitative in a quantitative way. Does Gunkel escape the temptation of mathematical, technical thinking? To what extent is his project similar to Fleischhacker’s? Does he come up with a new metaphysics? A new metaphysics that fits our present technological era will express a freer relation to history, a relation in which we are no longer dominated by our interpretations and wordings of the past, interpretations that we now use to legitimize our current stance and behavior towards the other, as we see in the world’s tragic conflicts.

For Louk, mathematics finds an essential limit in life itself. Postmodern philosophy finds its key concept in intersubjectivity: interaction of personal perspectives on reality determined by the individual background of lived life. In Hegel, technology passes into the relationship of man and woman (see Jenaer Realphilosophie).

The first AI scientists saw it as their goal to create a person. It is the modern version of an age-old tradition: man wants to make himself a human being through technical means. According to Hollak, the automaton is the external objectification of the technical idea as such. Descartes’ idea of God (God as Causa sui) is the self-projection of the autonomously thinking human being. AI is the (provisional) end product of the technical project of modern man, which is based on the counterfactual postulate that man is a machine and which aims to realize (implement) a mathematical construction, a formal model of (thinking) human behavior. This is the key to understanding the importance that theologians attach to artificial intelligence: “God is dead and technology is his corpse.” Descartes’ God as Causa sui is (the projection of) modern enlightened man, and he tries to realize this in the robot.

Robot rights?

Gunkel’s message is that we should not wait any longer to work on an answer to the question whether the robot is a moral agent and should be granted rights in a legal sense. He is not alone in this. We also hear such voices from Max Tegmark’s Future of Life Institute. Concerned scientists and ethicists even called for a moratorium on AI development. The European Union has established rules for the ethics of AI development. And the US also has laws regarding the participation of robots (delivery robots) and autonomous cars in traffic. There are also rules for autonomous weapon systems. In short: the robots are already among us, and there are already rules and laws for specific situations in which they are used. However, the question Gunkel asks goes further and is more fundamental in nature. Gunkel is concerned with the issue whether the robot is ‘a person in a legal sense’ or not.

The robot as a social relation and as a cognitive relation

A key concept in Gunkel’s notion of the social robot is that what the robot is, is determined by the way in which we relate to it. Mark Coeckelbergh also points this out.

Gunkel speaks of a ‘relational turn’ in our thinking about the relationship with others. It is not the case that we first establish the presence of certain properties (such as having consciousness) and then reason from those properties to whether the other belongs to our moral circle. It is exactly the opposite.

“In social situations, then, we always and already decide between who counts as morally significant and what does not and then retroactively justify these actions by “finding” the essential properties that we believe motivated this decision-making in the first place. Properties, therefore, are not the intrinsic prior condition for moral status. They are products of extrinsic social interactions with and in the face of others.” (David J. Gunkel, 2022)

In an extensive review of Person, Thing, Robot by Abootaleb Saftari we read:

“Finally, the concept of ‘relation’ remains largely unexplored in Gunkel’s argument. It feels like a mystery, a “black box,“ with only a faint outline suggesting its social nature. This lack of clarity provides ground for further critique. One could argue that even if we agree with Gunkel’s relational perspective, our tendency to treat things as objects, to objectify them, might itself stem from our relational interactions with them.”

This is an important point. Gunkel thinks from the use of the machine. His comment that the robot is what it is in relation to humans concerns this relationship of use. The robot is a social agent. As a member of a Human Computer Interaction group in which we created conversational agents, I have been involved not only in research into the social relationship with the robot, but especially into the cognitive human-machine relationship. That relationship is seen from the maker’s perspective. Man invented the machine. It is a realization of the technical idea, defined by Hollak as follows.

The technical idea is that abstract form of understanding in which man expresses his mastery of nature through an original combination of its forces.

That technical idea has developed over the course of history through interaction with its realizations, a development that runs ‘parallel’ with the development of the relationship between mathematics and nature. Information technology presupposes, and is an expression of, self-reflexive meta-mathematics: the mathematics of mathematical thinking.

The machine (and the programmable machine or automaton is a self-reflexive form of the machine) is a cognitively relative notion.

The intentional correlate

The Dutch philosopher Jan Hollak shared his thoughts on the phenomenon of the ‘thinking machine’ and the ‘conscious machine’ with his readers in several articles. In his famous inaugural lecture Van Causa sui tot Automatie (From Causa sui to Automation) he adds the following footnote.

“If this is constantly mentioned in connection with mechanisms of ‘reflection’, ‘self-reflection’, etc., then this obviously never refers to the subjective act-life of human thinking (machines have no consciousness), but always only to its intentional correlate.” (In: Hollak and Platvoet, footnote on p. 185)

In this footnote, Hollak relegates an entire bookcase full of philosophies that assume the possibility of the existence of the ‘thinking machine’ to the realm of fables.

In Meeting on neutral ground, a reflection on man-machine contests, the mathematician and logician Albert Visser says:

“After all, machine and program are intentional notions. So to understand the machine, we need to understand man.” (Albert Visser, 2020).

To understand the machine we must understand the human, because the machine is an intentional concept.

However, for many people it is not at all ‘obvious’ that the machine ‘has no consciousness’. The term ‘intentional correlate’ comes from a movement in philosophy that is not very popular among scientists and philosophers. It is nevertheless an important notion, to which several philosophers have devoted entire books (Searle, 1983; Dennett).

So it’s about understanding understanding. The machine is an expression of our understanding and we should understand that.

The object of the intentional act (consciousness is always consciousness of something; we always think something, a thought) is called in the phenomenological tradition the ‘intentional correlate’ of the act. When I think of the lamp, the idea of the lamp is the intentional correlate of my thought. We make a distinction between the state of the lamp (it is on or off) and the state of the lamp as a representation of the state of the technical system of which it is a part. The control lamp of the switch in the car has a function in a working system, in the car, in a machine. Its state is a state as I understand it: the lamp is either on or off, as part of a system. That state, as a state of a system, is not something inherent to the lamp in itself; it is we who think it so. For those unfamiliar with the technical construction, the lamp is only what it immediately is in experience. For those who do not know what arithmetic is, the calculator does not exist.

We can therefore only consider the machine meaningfully in relation to humans because it only exists in a cognitive relation to humans as a machine. (Just as the scribbles on paper represent words only to those who know what language is.) The essence of the machine is the technical idea, the concept.

We can distinguish between man and machine, but not separate them. Just as we can distinguish the inside and outside of a cup, but we cannot really separate them. They belong together as the two relata of a relationship. They do not appear separately in reality next to each other. Now, in the human-machine relationship, both also have independence. They appear outside each other as ‘things’ in reality. Man stands opposite the machine and can operate the machine through physical contact. This makes many people forget that the machine is only a machine in relation to humans. When we talk about that thing as a machine and use it as a machine, we are talking about a cognitively based relationship that is objectified in a material form: the thing is an expression, a realization of a concept, a design. Without that relative moment to the human who designed the thing, the thing is not a machine, but merely a physical process without meaning.

I therefore object when headlines once again state that “AI performs better on a certain task X than the human expert.” Or that “the computer makes better decisions than humans.”

The computer cannot make any decisions at all; it works on the basis of combinations of logical electronic circuits whose function is the representation of a logical rule of thought in a mathematical form: if A then B, otherwise C. It is important not to identify freedom with freedom of choice. The machine is programmed to make a choice, but it is not free to determine the meaning, the value, of the factors that determine the choice. The drone does not know the value of its target, the life and death of the enemy.
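As a small illustration (entirely my own; the function, its name and its threshold are hypothetical), a programmed ‘choice’ is nothing more than the evaluation of such a conditional rule; the program assigns no value or meaning to the outcomes it selects between.

```python
# A hypothetical 'choice' of the form: if A then B, otherwise C.
# The threshold and the labels are stipulated by the programmer;
# the program itself does not know what 'engage' means or is worth.
def select_action(sensor_reading: float, threshold: float = 0.8) -> str:
    if sensor_reading > threshold:   # if A
        return "engage"              # then B
    return "hold"                    # otherwise C

print(select_action(0.9))  # -> "engage", chosen without any valuation
```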

On the use of the term ‘intentional correlate’.

The mistake that people often make is to consider the nature of a machine only in terms of content (‘materialiter’) and not also in terms of form (the nature of the machine, ‘formaliter’). The light on the dashboard is in a state, ‘on’ or ‘off’, which does not simply indicate whether the light itself is on or not (which, as Austin rightly points out in Other Minds, would be an ‘absurd thought’); it is a state that refers to a technical design, a system, in which being ‘on’ or ‘off’ has a function. For example, it is a switch that works as an interface for the user to control the air conditioning.

The moral status of the robot

For Gunkel, the social, practical relationship with the robot is the starting point for the ethical, moral and legal status of the robot. The question whether the robot will ever be able to take responsibility must, in my opinion, be answered with a firm no. This does not alter the fact that a robot that participates in social interaction must ‘adhere’ to social rules. The ‘behavior’ of the robot may be the cause of an accident, but the robot cannot be held liable for its consequences. It is because of the robot’s technical, cognitive, intentional mode of being that the robot is not a moral subject and cannot be regarded as a person in a legal sense. The robot cannot derive any rights from the fact that animals or organizations have certain rights. As a technical design, it does not fit into this classification of living organisms.

As far as the robot’s participation in social intercourse (work, play) is concerned, we must distinguish between the internal rules of this intercourse that the robot must adhere to, and the external rules that determine the conditions under which a robot can function as an ‘autonomous’ player and may participate in the ‘game’. Ultimately, that decision will have to be made by a human and cannot be left to a robot. But perhaps Gunkel is of the opinion that this is not impossible in the future and that the robot will therefore one day decide for itself whether it can and will participate in certain games.

It is one of the characteristics of mathematism to see life as a game, or form of life, and to conceive of language purely functionally: the meaning of words is determined by their use in a language game that belongs to a form of life. We have now entered Wittgenstein’s world of thought. His strict separation (not mere distinction) between Sagen and Zeigen is a sign of his mathematical attitude, a legacy of Frege, who strictly separated signs and their meanings as different ‘Gegenstände’. Seeing language as purely functional, as an instrument, and losing sight of the verbal character of language, is characteristic of the idea underlying those views that see the AI machine in the form of a chat agent like ChatGPT as a truly intelligent, thinking machine that would have consciousness. The motive is that the machine seems to use language, and that language in humans is the expression of thinking as thinking. But the machine does not ‘use language’ the way humans use language. Man tries to express in words reality as he experiences it, at the same time creating language. Machines cannot do that. They are a reflection of the historical products of those attempts at articulation, without taking the historical character into account and without understanding it as historical.

Cybernetics and the question of AI’s ownership

A subject to which Gunkel pays a lot of attention is the relationship between text and author. He returns to texts by Plato and the discussion about the relationship between spoken and written language. In the latter the author is not present; he cannot answer questions about the text. The classical notion of ‘author’ should be subjected to a revision (deconstruction) because of the phenomenon of the language-generating machine. Strikes by authors and artists testify to the unrest caused by programs such as ChatGPT: people are afraid of losing their jobs through the use of AI, just as the spinners and weavers working from home defended themselves against the arrival of the spinning jenny and the factory production of woven fabrics. There is no stopping the development of technology. We will have to learn to live with it, until the shore turns the ship, as the Dutch saying goes.

Writers and actors on strike against the use of AI.
I confess: we built virtual suspect characters to replace human actors in police interrogation training sessions.

Authorship is a form of intellectual property. In our super-individualistic society (the US seems to be even worse than Europe) this is directly linked to one’s own identity, income and future. Information technology is also disrupting that structure. What do I mean by that?

When we buy a washing machine, we buy a finished product. After five years of use, the machine still works just as it did when we first used it. The machine may break down at some point; it will be repaired and will last another year. The factory receives feedback from repairers and reviews from users, which benefit the development of an improved version of the machine. In the world of ICT it is the same, but different. With ChatGPT, OpenAI has offered users a product that is a ‘self-learning’ chatbot. The product is not finished. It is a minimal realization of a concept that develops in and through use. The users are co-developers of the system. The dialogues with the machine are stored and serve as new material from which the system learns. The development loop (design, implement, test, redesign, use, redesign) is short-circuited in this open form of AI. This can be done using the various types of dialogue acts people use in dialogue.

In a dialogue we can distinguish different virtual dialogue ‘channels’, each of which has its own language and its own tone. Over the primary channel, questions are asked and answers are given. In addition, there is a feedback channel through which information is sent relating to the way in which questions and answers are valued by the speakers (including hedges). Feedback about what the recipient thinks of an answer is instructive for the questioner. The system, as it were, listens to its own conversations, it attends to ‘face’, and learns from them, just as people do in a conversation. We see that in this phase of technical development, the design, evaluation, feedback and use of the system are realized as aspects of a dialogue through interaction with the system. In ChatGPT’s open information technology, the social interaction relationship and the cognitive relationship of the machine have become intertwined. In the machine, the understanding of social interaction through language is explicitly realized in a technical manner.
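As a rough sketch of this channel distinction (the act labels and the miniature dialogue are my own illustration, not taken from any particular system), each utterance can be tagged with a dialogue act and routed to the primary or the feedback channel:

```python
# Hypothetical sketch: utterances carry either primary acts (question/answer)
# or feedback acts (valuations, hedges). The feedback channel is the material
# a 'self-learning' system could use to value its own answers.
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str
    text: str
    act: str  # e.g. "question", "answer", "feedback"

dialogue = [
    Utterance("user", "What time does the pharmacy close?", "question"),
    Utterance("system", "It closes at 18:00.", "answer"),
    Utterance("user", "Great, exactly what I needed.", "feedback"),
]

# Route each utterance to its virtual channel.
primary = [u for u in dialogue if u.act in ("question", "answer")]
feedback = [u for u in dialogue if u.act == "feedback"]
print(len(primary), "primary turns,", len(feedback), "feedback turn")
```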

Who is the author of the conversation? Who can claim rights to the knowledge that emerges in the conversation? These are issues, raised by the technology of AI, that will change the social and labour structures of our society.

Death of an Author

In his post “The Future of Writing Is a Lot Like Hip-Hop” in The Atlantic of 9 May 2023, Stephen Marche reports his findings during his collaboration with ChatGPT in writing the novel Death of an Author. In that report he comments on how users ask ChatGPT things. “Quite quickly, I figured out that if you want an AI to imitate Raymond Chandler, the last thing you should do is ask it to write like Raymond Chandler.” Why not? Because “Raymond Chandler, after all, was not trying to write like Raymond Chandler.”

This, I believe, points to the core insight into why AI is not human. It behaves like humans, or at least it tries to. But humans do not try to behave like human beings. They do not even behave ‘like human beings’.

What I mean is that we should see the reconstruction as a reconstruction and not as the original. The original as original disappears in and through formalization, through reconstruction. Man is not a ‘social agent’; that would identify man with a particular historical form of himself. That is the core of the discussion about the various kinds of bias (gender, culture) of open AI systems such as ChatGPT.

David Gunkel has made an important contribution to the development of the understanding of technology. This allows us to take a step further in our relationship with ourselves and others.

I am grateful to David Gunkel for his open mind.

References and notes

(*) Note:

A Dutch collection of essays by Emmanuel Levinas is entitled Het Menselijk Gelaat. The Dutch words ‘gezicht’ and ‘gelaat’ are often considered synonyms. The English ‘face’ is a translation of both the Dutch ‘gezicht’ and the Dutch ‘gelaat’. I sense a difference that is lost in the English translation. The French ‘visage’ has the same problem. You can draw a ‘gezicht’, depict a ‘gezicht’ and give a robot a ‘gezicht’, but not a ‘gelaat’. The word ‘inter-face’ (also ‘user interface’), which stands for the component of a system that ensures the technical interaction between user and system, contains the word ‘face’. It is how the instrument presents itself to the user. It provides visibility into the state of the process and also includes the levers and buttons for controlling the system. Formalized ‘natural language’ is the interface of the chat systems. Only man has a face (‘gelaat’) in the Levinasian sense. Erving Goffman’s sociological studies on face-work focus on ‘politeness’, respect for others in social interaction, and on what it takes to make an ‘agent’ able to participate in social encounters.

Bruijnes, Merijn (2016). Believable suspect agents: response and interpersonal style selection for an artificial suspect. PhD thesis, University of Twente, 2016.

In cooperation with the Police Academy we analysed recordings of police interrogations with real or played suspects in order to model their interactive behavior. We used the computational models to synthesize virtual suspect characters that could replace real human actors. We focused on the role of ‘face’ and the effects of face-threatening acts and other factors, such as character, on the dynamics of the interrogation.

David J. Chalmers (2023). Could a large language model be conscious? Within the next decade, we may well have systems that are serious candidates for consciousness. Boston Review, 9 August 2023.

Coeckelbergh, M., and Gunkel, D. 2023. ‘ChatGPT: Deconstructing the Debate and Moving It Forward‘ in AI & Society. Online first 21 June 2023.

Maarten Coolen (1992). De machine voorbij: over het zelfbegrip van de mens in het tijdperk van de informatietechniek. Boom Meppel, Amsterdam, 1992.

Maarten Coolen (1987). Philosophical Anthropology and the Problem of Responsibility in Technology. In P. T. Durbin (Ed.), Philosophy and Technology, Vol. 3: Technology and Responsibility (pp. 41-65). Dordrecht: Reidel.

“Information technology must be conceived of as the objectification of the modern self-concept of man as an autonomous being.”

Fleischhacker, Louk E. (1995). Beyond structure; the power and limitations of mathematical thought in common sense, science and philosophy. Peter Lang Europäischer Verlag der Wissenschaften, Frankfurt am Main, 1995.

Frege, Gottlob (1892). Über Sinn und Bedeutung. Reprinted in: Gottlob Frege: Funktion, Begriff, Bedeutung, Vandenhoeck & Ruprecht, Göttingen, pp. 40-65, 1975.

Frege introduces the term ‘Gedanke’ for the content of a judgement sentence: “Ein solcher Satz enthält einen Gedanken.” “Ich verstehe unter Gedanke nicht das subjektive Tun des Denkens, sondern dessen objektiven Inhalt, der fähig ist, gemeinsames Eigentum von vielen zu sein.” (Footnote, p. 46).

“Warum genügt uns der Gedanke nicht? Weil und soweit es uns auf seinen Wahrheitswert ankommt. Nicht immer ist dies der Fall. Beim Anhören eines Epos z.B. fesseln uns neben dem Wohlklange der Sprache allein der Sinn der Sätze und die davon erweckten Vorstellungen und Gefühle. Mit der Frage nach der Wahrheit würden wir den Kunstgenuss verlassen und uns einer wissenschaftlichen Betrachtung zuwenden.” (Frege, Über Sinn und Bedeutung, 1892, p. 48)

Whether it is ‘gleichgültig’ to us that ChatGPT presents us with a reality that is true or not depends on whether we regard this machine as a work of art or ‘scientifically’.

Goffman, Erving (1967). Interaction Ritual: Essays on Face-to-Face Behavior.

Gunkel, David J. (2012). The machine question: Critical perspectives on AI, robots, and ethics. Cambridge: MIT Press.

Gunkel, David J. (2014). The Rights of Machines: Caring for Robotic Care Givers. Presented at AISB 2014. Chapter in the Intelligent Systems, Control and Automation: Science and Engineering book series (ISCA, volume 74).

Gunkel, David J. (2017). The Other Question: Can and Should Robots have Rights? In: Ethics and Information Technology, 2017.

Gunkel, David J. (2023). Person, Thing, Robot: A Moral and Legal Ontology for the 21st Century and Beyond. MIT Press, open access, September 2023.

“Ultimately, then, this is not really about robots, AI systems, and other artifacts. It is about us. It is about the moral and legal institutions that we have fabricated to make sense of Things. And it is with the robot— who plays the role of or occupies the place of a kind of spokesperson for Things— that we are now called to take responsibility for this privileged situation and circumstance.” (p. 184)

Georg W.F. Hegel (1969). Jenaer Realphilosophie – Vorlesungsmanuskripte zur Philosophie der Natur und des Geistes von 1805-1806. Edited by Johannes Hoffmeister, Verlag von Felix Meiner, Hamburg, 1969.

Heidegger, Martin (1977). The Question Concerning Technology and Other Essays. Trans. William Lovitt. New York: Harper & Row (1977).

Technology is not a technique. It is a mode of ‘Entbergen’, of revealing.

Hollak, J.H.A. (1968). Betrachtungen über das Wesen der heutigen Technik. Kerygma und Mythos VI, Band III, Theologische Forschung 44, Hamburg, Evangelischer Verlag, 1968, pp. 50-73. This is a translation of the Italian article (Hollak 1964). Also included in the collection Denken als bestaan: het werk van Jan Hollak (Hollak and Platvoet, 2010).

Hollak, J.H.A. (1964). Considerazioni sulla natura della tecnica odierna, l’uomo e la cibernetica nel quadro della filosofia sociologica. Tecnica e casistica, Archivio di filosofia, 1/2, Padova, 1964, pp. 121-146, discussion pp. 147-152.

Hollak, Jan and Wim Platvoet (eds.) (2010). Denken als bestaan: Het werk van Jan Hollak. Uitgeverij DAMON, Budel, 2010. This collection contains the transcript of the recording of the valedictory lecture on the hypothetical society given by Jan Hollak in Nijmegen on 21 February 1986. The inaugural lecture Van Causa sui tot Automatie is also included.

Levinas, E. (1987). Collected Philosophical Papers. Trans. Alphonso Lingis. Dordrecht: Martinus Nijhoff.

Levinas, E. (1971). Het menselijk gelaat. Translated and introduced by O. de Nobel and A. Peperzak. Ambo, Bilthoven, 1969. Contains: Betekenis en zin, pp. 152-191 (a translation of La signification et le sens, in: Revue de Métaphysique et de Morale 69 (1964), 125-156).

Levinas, E. (1951). L’ontologie est-elle fondamentale? In: Revue de Métaphysique et de Morale 56 (1951) 88-98.

“Can things have a face? Isn’t art an activity that gives things a face?” Levinas asks this question in his essay “Is ontology fundamental?”. Completely in the spirit of Levinas, the answer must be negative. For who can give a face to things that have no face of themselves? The face is not something that appears to us after meeting the other. You do not give a face, as if we had the power to do that! The face means resistance to the power of technology, which wants to give things a face in the service of a functioning deception. For without the suggestion that it means something to people, the machine does not work.

Toivakainen, Niklas (2015). Machines and the face of ethics. In: Ethics and Information Technology, Springer, 2015.

“…my concern here is with why we aspire to devise and construct machines that have a face, when the world is filled with faces.”

I agree with Niklas Toivakainen when he says “that understanding our ethical relationship to artificial things cannot be detached from a critical examination of what moral dynamics drives our technological aspirations.” (Toivakainen, 2015).

Is it because we no longer have time to engage with the elderly ourselves that we make social robots, artificial parrots, seals and dogs to do it for us?

Albert Visser (2020). Meeting on neutral ground. A reflection on man-machine contests. In: Studia Semiotyczne (Semiotic Studies) t. XXXIV, nr. 1 (2020), pp. 279-294.

Could a large language model be conscious? – A review

If the machine were a conscious being, it would fight for its own truth to the death.

Will we soon witness a robot that flees after causing a fatal accident because it is (apparently) afraid of being caught and sentenced to a long prison sentence?

A robot that satisfies this description is taken to be aware of the situation it finds itself in. Why would it be afraid? Why does it not want to be imprisoned?

The witness who files such a report will certainly answer that the robot does not like being imprisoned. These robots are not only aware of the world, they are sentient beings as well. They can not only imagine different worlds, they also prefer some worlds over others. They value the world around them as well as their own being.

Do LLMs have consciousness? This is the question thoroughly analysed by David Chalmers in a recent essay in the Boston Review. He argues that “within the next decade, we may well have systems that are serious candidates for consciousness.”

What makes him believe so?

“What is or might be the evidence in favor of consciousness in a large language model, and what might be the evidence against it? That’s what I’ll be talking about here.”

The question whether LLMs or robots have consciousness (are conscious beings) is a tough one. The question is not whether currently existing robots have consciousness, but whether the technology offers the possibility of making robots that have consciousness in the future. This means that we enter the fantasy world of creative technological thinking. Thinking about this, we experience that the question touches the limits of thinking itself and calls it into question. We feel that the question brings up something that cannot be clearly stated in the language of the existing order.

This is why some say that thought can never be only argumentative, but must always be testifying and poetic. The problem tears thinking between mathematics and mysticism, between technology and fantasy.

But it is not just a matter of language. It is not a ‘language game’. It is a deadly serious issue who is responsible for the harm caused by our ‘autonomous’ intelligent instruments: cars, weapons, and so on. Some people say that if we attribute consciousness to robots and consider them accountable, we in fact protect the industry and the owners, the entities that are really responsible. It is a matter of power who decides in the end which language game we are playing. In the field of AI, different life worlds meet: the worlds of the producers, the big companies, the marketeers, the politicians, the ethicists, the end users. In each of these life worlds the members play their own ‘language game’. The meeting we are witnessing now is too serious a matter to be a game.

But these are not the issues Chalmers considers in his essay. He focuses on the question of consciousness. I believe we cannot talk about being conscious without talking about morality, without talking about the powers that meet in the AI business, in politics and in ethics. In this meeting, Chalmers plays the ‘language game’ of the scientific philosophers of mind.

Scientists rely not only on logic, but above all on observation. But what do they observe? Scientists want to measure. But how, and what? And how do they report on what they observe? What you ‘see’ depends very much on the method you use and the language you speak. Are rocks aware of each other’s presence? Is that a weird question? They attract each other. They obey the laws of Newtonian mechanics. Can we speak here of a rock’s behavior depending on its awareness of the presence of the other rock? Or is that too anthropomorphic a way of thinking and speaking? And what about the little plant that folds its leaves together at the slightest touch? Descartes was particularly interested in it because of its ‘emotional’ behavior. He wanted to show the mechanism behind this behavior. The outside world is, after all, one big mechanism. (Mimosa pudica is the Latin name of this ‘sensitive plant’.)

It is because we can trust that the falling stone obeys the mathematical law of Newtonian mechanics that we can use the physical process as a computing device: it computes the value of the function that determines its speed at any height, given the initial height of the stone as input. We could say the stone is doing math. It is a primitive device. Analogue, indeed. But note that every electronic machine, every combination of digital logical circuits, is deep down a continuous physical process. That is what we find if we look deep enough inside our LLMs. This answers one of the questions Chalmers poses in his essay: what do we find if we look deep down into LLMs? Any chance that we find something we could call ‘consciousness’? Only bones (‘nur Knochen’), Hegel answered, when asked whether we would find a ghost when looking into our brains.
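For concreteness, a small sketch of what ‘the stone doing math’ amounts to (the height is an arbitrary example value of my own): trusting the law v = sqrt(2·g·h), the fall can be read as computing the stone’s speed from its initial height.

```python
# Reading the falling stone as a 'computer' of the function v = sqrt(2*g*h):
# the initial height is the input, the speed at the ground is the output.
import math

g = 9.81   # gravitational acceleration, m/s^2
h = 20.0   # initial height of the stone, m (hypothetical input)
v = math.sqrt(2 * g * h)  # speed on reaching the ground
print(f"speed after falling {h} m: {v:.1f} m/s")
```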

Morality

Why does it matter whether AI systems like social robots, autonomous cars and weapons are conscious (or have consciousness)?

“Consciousness also matters morally. Conscious systems have moral status. If fish are conscious, it matters how we treat them. They’re within the moral circle. If at some point AI systems become conscious, they’ll also be within the moral circle, and it will matter how we treat them. More generally, conscious AI will be a step on the path to human level artificial general intelligence. It will be a major step that we shouldn’t take unreflectively or unknowingly.”

“We already face many pressing ethical challenges about large language models. There are issues about fairness, about safety, about truthfulness, about justice, about accountability.” 

Can a robot feel responsible, and can it be held accountable for what it does? For Kant, being responsible for one’s actions is the property that distinguishes a person from a thing. If we do not consider someone responsible for what he does, we do not take him to be a real person with a free will. The question is: is a robot a person or a thing? Some people see robots as slaves. But not human slaves.

Chalmers does not approach the problem from this practical and moral side, but from the bottom up: the question of what consciousness is and what the characteristics of a sentient or conscious being are.

LLMs and social robotic systems generate text which is increasingly humanlike. “Many people say they see glimmerings of intelligence in these systems, and some people discern signs of consciousness.”

Do they see ‘glimmerings of intelligence’ in the falling stone that computes its velocity, obeying the mathematics of Newtonian mechanics? Or in the sensitive plant that folds its leaves when touched by an unknown intruder? I see intelligent language use in the text on the information panel that says “You are here”, pointing at the spot on the city map that is supposed to correspond to the location in the real world where I am standing when reading the text.

Chalmers is interested in much more complex instances of intelligent language use both in today’s LLMs and their successors. The idea is that consciousness requires complexity.

“These successors include what I’ll call LLM+ systems, or extended large language models. These extended models add further capacities to the pure text or language capacities of a language model. (…) Because human consciousness is multimodal and is deeply bound up with action, it is arguable that these extended systems are more promising than pure LLMs as candidates for humanlike consciousness.”

The reasoning parallels that of Turing in his famous “Can machines think?”. Turing proposed an imitation game to answer the question, assuming there is a difference between man and machine. The question he considers is what devices could play the role of the machine in his game. Turing, too, was not concerned with the question whether the state-of-the-art digital machines of his day (1950) can think. The question was whether it is conceivable that future machines can no longer be distinguished from humans while playing the game. It is therefore about the potential of the ideal machine, which he defined mathematically as his Turing machine. This model plays the role in thinking about computability that the model of the honest die plays in statistical thinking. Turing’s future digital devices that may play the role of the machine in his game of course include current LLM-based chat programs like ChatGPT. By the way: Turing did not discuss the question of what entities could play the role of the human being in his imitation game. That would again lead him to discuss the difference between man and machine: how are they ‘generated’, ‘constructed’? For Turing a ‘human being’ is simply the product of a ‘natural’ process. Period. No technology is involved in the making of human beings.

Will LLM+s pass the Turing test? Is the Turing test a viable test for consciousness? There are serious doubts. The problem remains on what grounds we decide who may play the role of the human and what may play the role of the machine in the Turing test.

What is consciousness?

“Consciousness and sentience, as I understand them, are subjective experience. A being is conscious or sentient if it has subjective experience, like the experience of seeing, of feeling, or of thinking. (…) In my colleague Thomas Nagel’s phrase, a being is conscious (or has subjective experience) if there’s something it’s like to be that being.”

To me this means that a being has consciousness if we can somehow identify ourselves with that being. This means that a conscious being shows ‘behavior’ that makes sense, i.e. behavior that you can understand as meaningful for the being. So consciousness is a relational thing.

It also means that a conscious being is in a sense ‘autonomous’: it moves by itself, it is not moved by forces from the outside. It shows some shadow of free will. ‘Autonomous’ does not mean independent. So consciousness is a relational thing, but the terms of the relation have some ‘autonomy’, some objectivity of their own.

“Importantly, consciousness is not the same as human-level intelligence. In some respects it’s a lower bar. For example, there’s a consensus among researchers that many non-human animals are conscious, like cats or mice or maybe fish. So the issue of whether LLMs can be conscious is not the same as the issue of whether they have human-level intelligence. Evolution got to consciousness before it got to human-level consciousness. It’s not out of the question that AI might as well.”

Consciousness is subjective experience according to Chalmers: “like the experience of seeing, of feeling, or of thinking”. What is missing here is that being conscious is always being conscious of something. Intentionality is the characteristic of the state of consciousness. We experience, we feel, we see something. And we think about something. Language, the human expression of thinking ‘as thinking’, reflects this intentional relation in its character of being meaningful.

“I will assume that consciousness is real and not an illusion. That’s a substantive assumption. If you think that consciousness is an illusion, as some people do, things would go in a different direction.”

Chalmers’ way of approaching the problem is along the lines of old-fashioned classical metaphysics and a tacitly assumed ontology. On the one side there are persons, whom we consider conscious beings, and on the other side there are things, like stones and tables, which do not have consciousness. And then there are (intelligent) machines, like robots and LLMs. How do they fit in this ontology?

No distinction is made between LLMs, which are models, and working applications like ChatGPT. With a machine we can interact: it has a real interface. With mathematical (statistical) models we cannot physically interact; we can only think about them.

Chalmers’ approach is property-based instead of relational.

Chalmers offers an operational definition and is looking for distinguishing features X, properties that we can use to put things in the appropriate category: conscious or not conscious.

Some people criticize the property-based approach. According to David J. Gunkel and Mark Coeckelbergh, the problem of intelligent machines challenges ancient metaphysics and ontologies. They argue for a ‘deconstruction’ (Derrida) of the historical thought patterns that shape the debates on this subject. They do not, in the first place, see engineering challenges to construct consciousness; they see mainly philosophical, ethical and legal challenges instead.

A relational approach does not compare humans and machines as if they were separately existing entities that are conceptually independent of each other. Without man a machine is not a machine. Machines are human constructs for human use. They are external, mathematical-physical objectivations of the human mind, the result of the self-reflection of human thinking. Mathematics had to reflect on itself; meta-mathematics was required before machines could become language machines and could be programmed. The relation between machine and man is comparable with the relation between a word as a physically observable sign and the meaning it has for us, that is, for the reader who recognizes the word. The meaning of a token is not something that can be reduced to some physical properties of the token. Meaning is in the use of words, not something that exists as meaning outside language use. Information technology is based on a correspondence between thinking processes and physical processes.

“I should say there’s no standard operational definition of consciousness. Consciousness is subjective experience, not external performance. That’s one of the things that makes studying consciousness tricky. That said, evidence for consciousness is still possible. In humans, we rely on verbal reports. We use what other people say as a guide to their consciousness. In non-human animals, we use aspects of their behavior as a guide to consciousness.”

But how can a verbal report be evidence for consciousness? And how can ‘aspects of their behavior’ be a ‘guide to consciousness’? Isn’t it begging the question to take these properties as features? Don’t you tacitly assume what you want to conclude? What is it that makes you see some phenomenon as intentional behavior of a conscious being and not just as a mechanical process?

“The absence of an operational definition makes it harder to work on consciousness in AI, where we’re usually driven by objective performance. In AI, we do at least have some familiar tests like the Turing test, which many people take to be at least a sufficient condition for consciousness, though certainly not a necessary condition.”

I dare to disagree. The Turing test shows how far we are in simulating conversational behavior that we consider an indicator of consciousness.

Evidence for consciousness of LLMs: Chalmers’ property approach

“If you think that large language models are conscious, then articulate and defend a feature X that serves as an indicator of consciousness in language models: that is, (i) some large language models have X, and (ii) if a system has X, then it is probably conscious.

There are a few potential candidates for X here.”

Chalmers considers four.

X = Self-reports

“These reports are at least interesting. We rely on verbal reports as a guide to consciousness in humans, so why not in AI systems as well?” Chalmers concludes, and I agree, that self-reports do not provide a convincing argument for consciousness.

X = Seems-Conscious

“As a second candidate for X, there’s the fact that some language models seem sentient to some people. I don’t think that counts for too much. We know from developmental and social psychology, that people often attribute consciousness where it’s not present. As far back as the 1960s, users treated Joseph Weizenbaum’s simple dialog system, ELIZA, as if it were conscious.”

This is an interesting comment that Chalmers makes. ELIZA shows how easy it is to come up with a conversational algorithm that is convincing to its users (sometimes, for some time). LLMs are much more complex and simulate much more than a Rogerian psychotherapist, but when they work convincingly they work on precisely the same principle: functional language use.

X = Conversational Ability

“Language models display remarkable conversational abilities. Many current systems are optimized for dialogue, and often give the appearance of coherent thinking and reasoning. They’re especially good at giving reasons and explanations, a capacity often regarded as a hallmark of intelligence.

In his famous test, Alan Turing highlighted conversational ability as a hallmark of thinking.”

See above for my comment on the Turing test.

X = General Intelligence

“Among people who think about consciousness, domain-general use of information is often regarded as one of the central signs of consciousness. So the fact that we are seeing increasing generality in these language models may suggest a move in the direction of consciousness.” 

Chalmers concludes this part of the analysis:

“Overall, I don’t think there’s strong evidence that current large language models are conscious. Still, their impressive general abilities give at least some limited reason to take the hypothesis seriously. That’s enough to lead us to considering the strongest reasons against consciousness in LLMs.”

Arguments against consciousness

What are the best reasons for thinking language models aren’t or can’t be conscious?

Chalmers sees this as the core of the discussion. “One person’s barrage of objections is another person’s research program. Overcoming the challenges could help show a path to consciousness in LLMs or LLM+s.

I’ll put my request for evidence against LLM consciousness in the same regimented form as before. If you think large language models aren’t conscious, articulate a feature X such that (i) these models lack X, (ii) if a system lacks X, it probably isn’t conscious, and give good reasons for (i) and (ii).”

X = Biology

“Consciousness requires carbon-based biology.”

“In earlier work, I’ve argued that these views involve a sort of biological chauvinism and should be rejected. In my view, silicon is just as apt as carbon as a substrate for consciousness. What matters is how neurons or silicon chips are hooked up to each other, not what they are made of.”

Indeed, functions and information processes can be implemented in any material substrate. What matters is the structure and the structural correspondence between the physical processes and certain cognitive processes as we model them.

X = Senses and Embodiment

A meaningful text refers to something outside the text. What is that 'something outside'? Some people think we are imprisoned in language, but when they express this thought they mean something by it. How are the symbols that LLMs generate grounded in something outside the text? Living beings have a number of senses that connect them with the world outside.

“Many people have observed that large language models have no sensory processing, so they can’t sense. Likewise they have no bodies, so they can’t perform bodily actions. That suggests, at the very least, that they have no sensory consciousness and no bodily consciousness.”

Note that Chalmers introduces here variants of consciousness, ‘sensory’ and ‘bodily’ consciousness. Later on we will also have ‘cognitive’ consciousness.

Thinking about sensory perception in technical systems we draw a line between what belongs to the system itself and what is outside the system. What kind of border line is this?

Computers are good at playing chess. They beat human world champions. But how good are they at playing blindfold chess? What is the difference between a computer playing blindfold chess and one playing ordinary chess? Thinking about this, it seems to matter through what kind of device information enters the machine. A blindfold chess player may not use his eyes, or any other device, to see the actual state of the board whenever he pleases. He only has his memory to visualize the position internally and to update the state of the game after each move. But for a technical device like a robot, what difference does it make if we do not attach a video device? The only way to simulate the difference is by specifying the memory function.

If a robot has a body, where does that body end? What is the border between the body and the outside world?

In “Can Large Language Models Think?” Chalmers argued that “in principle, a disembodied thinker with no senses could still have conscious thought, even if its consciousness was limited.” An AI system without senses, a “pure thinker” could reason about mathematics, about its own existence, and maybe even about the world. “The system might lack sensory consciousness and bodily consciousness, but it could still have a form of cognitive consciousness.”

Indeed, the computer that plays chess is actually playing a mathematical game with exact rules for manipulating symbols. For us it ‘reasons about the world’ of chess because for us the symbols it manipulates implemented in some physical process refer to the pieces on a chessboard.

Chalmers' 'pure thinker' is "a (possibly nonhuman) thinker without sensory capacities". For Chalmers it seems obvious that a pure thinker "can know a priori truths e.g. about logic, mathematics". However, without embodiment, without sensory perception, without a world thought and experienced as being outside the mind, there would be no mathematics. It is not the content of sensory perception that is the sensory basis of mathematical thought, but the immediate extensiveness of the perception. Reality obeys the principle of structurability, which means that everything has a quantitative moment by which it is countable, measurable, structurable. It is through our embodiment that we experience direct physical contact with the world outside us. This experience is present in every sensory perception. We perceive the working by which the experience is physically possible as an effect on our body, and together with this effect we experience its extensiveness. Without this grounding of mathematical thought in sensory perception it is hard to understand the ubiquitous applicability of mathematics.

“LLMs have a huge amount of training on text input which derives from sources in the world.” Chalmers argues “that this connection to the world serves as a sort of grounding. The computational linguist Ellie Pavlick and colleagues have research suggesting that text training sometimes produces representations of color and space that are isomorphic to those produced by sensory training.”

The question is for whom these ‘representations’ exist. Consciousness is always consciousness of something that exists somehow distinguished from the act or state of consciousness. It means at least that the conscious being is aware of this distinction, i.e. that there is something out there it is aware of.

It will be clear that the challenge of the embodiment feature is closely related to the following feature.

X = World Models and Self Models

"The computational linguists Emily Bender and Angelina McMillan-Major and the computer scientists Timnit Gebru and Margaret Mitchell have argued (in their famous Stochastic Parrots paper) that LLMs are 'stochastic parrots.' The idea is roughly that like many talking parrots, LLMs are merely imitating language without understanding it. In a similar vein, others have suggested that LLMs are just doing statistical text processing. One underlying idea here is that language models are just modeling text and not modeling the world."

This amounts to saying that LLMs do not know the facts. They do not know what truth is.

Chalmers' comment is interesting. He observes that "there is much work on finding where and how facts are represented in language models."

This comment suggests that Chalmers considers 'facts' to be objective truths, or objects, like theorems in mathematics; as if the engineers could decide what the facts are. The issue of power that I mentioned before pops up here. What Chalmers seems to forget is the role played by the user, the reader of the generated texts. It is the reader who gives meaning to the text.

AI has no problem in generating (creating, if you wish) huge amounts of videos, texts and music by remixing existing fragments. But it needs humans to evaluate its quality, to make sense of it. The proof of the pudding is in the eating. Not in the making.

As OpenAI, the producer of ChatGPT, rightly states: it is the responsibility of the user to check the value of the texts generated by the machine. This is the very reason they warn against using it uncritically in critical applications.

What is true is not the result of an opinion poll. It is not by means of statistics that we decide what the facts are. To give an example from my personal experience: if Google includes articles written by my son (who happens to have the same name as I do) in my publication list, then no matter how many Google users copy the faulty references, the truth differs from what LLMs and all Google adepts 'believe' it is. It is well known that ChatGPT isn't very reliable in its references to the literature. This is of course an instance of its 'unreliable connection' with the world in general.

X = Unified Agency

“The final obstacle to consciousness in LLMs, and maybe the deepest, is the issue of unified agency. We all know these language models can take on many personas. As I put it in an article on GPT-3 when it first appeared in 2020, these models are like chameleons that can take the shape of many different agents. They often seem to lack stable goals and beliefs of their own over and above the goal of predicting text. In many ways, they don’t behave like unified agents. Many argue that consciousness requires a certain unity. If so, the disunity of LLMs may call their consciousness into question.”

A person is a social entity, a unity of mind and body. A human being doesn’t only have a body, it is a body. The type of relation between mind and body is at stake.

The identity of a machine is a mathematical identity implemented in the physical world. The machine is a working mathematical token. There are many tools and machines of the same type: they share the same mathematical identity but differ in matter, like two pennies or two screwdrivers. Technical instruments exist as more of the same kind. They are not unique. We assign identity to the social robot and give it a name, as we do with our pets, the way we assign identifiers or unique service numbers to citizens.

Chalmers concludes this part with:

"For all of these objections except perhaps biology, it looks like the objection is temporary rather than permanent."

The AI engineer says: “Tell me what you miss in current AI systems and I tell you how to build it in.”.

The idea is that by this process of adding more and more features AI will eventually reach a stage where we can say that we managed to realize an artificial conscious being.

This testifies to the typically mathematical stance that the engineer takes. The idea is a mathematical entity that exists as the limit of a real process; as if we could produce mathematical circles from physical matter in a circle factory.

Chalmers' conclusion

In drawing a general conclusion Chalmers is clearly walking on eggshells.

“You shouldn’t take the numbers too seriously (that would be specious precision), but the general moral is that given mainstream assumptions about consciousness, it’s reasonable to have a low credence that current paradigmatic LLMs such as the GPT systems are conscious.

It seems entirely possible that within the next decade, we’ll have robust systems with senses, embodiment, world models and self models, recurrent processing, global workspace, and unified goals. 

It also wouldn’t be unreasonable to have at least a 50 percent credence that if we develop sophisticated systems with all of these properties, they will be conscious.“

He mentions four foundational challenges in building conscious LLMs.

  1. Evidence: Develop benchmarks for consciousness.
  2. Theory: Develop better scientific and philosophical theories of consciousness.
  3. Interpretability: Understand what’s happening inside an LLM.
  4. Ethics: Should we build conscious AI?

Beside these ‘foundational challenges’ Chalmers mentions a couple of engineering challenges, such as: Build rich perception-language-action models in virtual worlds.

And if these challenges are not enough for conscious AI, his final challenge is to come up with missing features.

The final question is then:

“Suppose that in the next decade or two, we meet all the engineering challenges in a single system. Will we then have a conscious AI system?”

Indeed, not everyone will agree that we do. But Chalmers is optimistic about what engineering can offer: "if someone disagrees, we can ask once again: what is the X that is missing? And could that X be built into an AI system?"

“My conclusion is that within the next decade, even if we don’t have human-level artificial general intelligence, we may well have systems that are serious candidates for consciousness.” (Chalmers)

My 'conclusion' would be that Chalmers' conclusion will recur after every future decade. I believe so because Artificial Intelligence is an Idea, an ideal if you wish, that technology tries to realize and that big AI enterprises try to sell on the market as the ideal we have to strive for. An idea that will remain an ideal until we have other ideals to strive for. That will not be before we have answered the question what it is that we strive for.

If the machine had consciousness, it would fight for its own truth to the death.

History repeats itself.

When the Indian inhabitants of North America first saw a steamboat coming down the Mississippi river, they thought it was a living creature with a soul.

When Descartes and La Mettrie came up with their mechanical 'bête machine' and the theory of 'l'homme machine', it took a few centuries before the heated debates about the difference between man and machine faded out and man became just man again and a machine just a machine. With the LLMs and the talking social robots the same debate recurs. It won't take long before this heated discussion about conscious machines also cools down and a machine is again just a machine and a human being a human being. The difference between the situation now and the situation in the 17th and 18th centuries is that AI is nowadays promoted by powerful commercial enterprises that control the thinking of the masses addicted to information.

The attempts to make us believe that machines (LLMs) are (potentially) conscious beings that are able to know the world and that can be held responsible for what they do support the powerful forces that keep us addicted to the information they produce. Addicted to a religious ideal image of man that the future of AI would bring us.

As long as we are free people we will never accept that machines are held responsible for what they 'do', just as we do not hold any God responsible for what happens in the world.

Death of an Author

In his article "The Future of Writing Is a Lot Like Hip-Hop" in The Atlantic of 9 May 2023, Stephen Marche reports his findings from his cooperation with ChatGPT in writing their novel Death of an Author. In that report he comments on how users ask ChatGPT things. "Quite quickly, I figured out that if you want an AI to imitate Raymond Chandler, the last thing you should do is ask it to write like Raymond Chandler." Why not? Because "Raymond Chandler, after all, was not trying to write like Raymond Chandler."

I believe this points to the core insight into why AI is not human. It behaves like humans, or at least it tries to. But humans do not try to behave like human beings. They do not even behave 'like human beings'.

I mean that we make a path by walking. The path is the result, the history, of this act. The path is the walking abstracted from the real act of walking. The real act of following a path always involves and presupposes the original act of making a path.

What we can learn from Marche's report is that AI is not so much a machine as a tool. A tool requires, for its successful use, craftsmanship and a lot of experience on the part of the human user. There is not one path; there are many you have to choose from.

Notes and references

In a 2020 survey of professional philosophers, around 3 percent accepted or leaned toward the view that current AI systems are conscious, with 82 percent rejecting or leaning against the view and 10 percent neutral. Around 39 percent accepted or leaned toward the view that future AI systems will be conscious, with 27 percent rejecting or leaning against the view and 29 percent neutral. (Around 5 percent rejected the questions in various ways, e.g. saying that there is no fact of the matter or that the question is too unclear to answer.)

David Bourget and David Chalmers (2023). Philosophers on Philosophy: The 2020 PhilPapers Survey. Philosophers' Imprint, January 2023. https://philpapers.org/archive/BOUPOP-3.pdf

David J. Chalmers (2023). Could a large language model be conscious? Within the next decade, we may well have systems that are serious candidates for consciousness. Boston Review, 9 August 2023.

Mark Coeckelbergh (2012). Growing Moral Relations: Critique of Moral Status Ascription. Palgrave Macmillan, 2012.

Mark Coeckelbergh (2014). The Moral Standing of Machines: Towards a Relational and Non-Cartesian Moral Hermeneutics. Philos. Technol. (2014) 27:61–77.

David Gunkel (2018). The Other Question: Can and Should Robots have Rights? Ethics Inf Technol 20, 87–99 (2018).

David J. Gunkel (2023). Person, Thing, Robot: A Moral and Legal Ontology for the 21st Century and Beyond. The MIT Press, September 2023.

When one coach is not enough…

The main idea of the EU Horizon 2020 project Council of Coaches is to have a number of coaches that you can gather and meet. I think it is a clever idea (you never know what playing video games is good for…). Why?

Here I present the challenges I see in this project regarding research into the use of virtual conversational characters for serious applications (other than demonstrators, gaming or art). There is an extensive project website containing a lot of information.

The goal of the project is to help elderly people like me (for practical purposes: 55+, and you are old nowadays; and younger people as well) to reflect on their health issues. Specific targets are the chronic diseases: lung diseases (asthma, COPD), heart disease, chronic back pain, obesity, diabetes. If you happen to suffer from one of these, the others will usually follow soon. How to cope with this situation?

To be honest – and why shouldn't I be? – I do not know how many people (in the target group) reflect on their health issues to the point that they search for help. And if they do, what is the trigger, what are their needs and where do they go for help? I myself call the doctor if I think there is something wrong with my body, maybe after I have searched the internet to see what information I can find. We once had a chat with an internet doctor when we were abroad and needed some medical advice. When we have lasting back pain we go to a doctor. The doctor says: it's the age. Ageing is slowly dying. You have to live with that. Or die, if you want. We are getting older and I think we are not special.

Digital coaching app all over the place

There are digital personal diabetes coaches in the form of an app that runs on your mobile phone. Some of them have an embodied character that pops up when you want. They try to motivate you to measure your glucose regularly. There are physical activity coach apps. There are food coaches, sleep coaches, depression coaches, budget coaches. Their most important function is that they help you monitor your physical or financial condition in terms of some vital physiological or financial parameters and some of your daily health-related activities. And sometimes they can give you personal advice. Your values are obtained either from sensors connected to the app (step counter, glucose meter), from information you provide yourself by typing values into form fields, or from other applications you use. And sometimes from a short information dialog with your personal coach. But this is still a real challenge: to have a free, open, natural interaction in your own language with an artificial conversational agent that is really of help to you.

I am rather sceptical about the potential of human–virtual human interaction. The value of apps is that they store relevant data and can help you reflect on a specific health-related issue: your weight, your diabetes. People who use such an app for a longer time are already motivated to keep an eye on their health condition. Your data can easily be shared with your doctor. Older people forget things; it helps you remember. Ease of use and functionality are what count. Should it be fun to work with an app? I think that is a nice added value, but it should not undermine the primary functions. For fun there are games.

Serious coaching games

Children like games. In one of the Human Media Interaction projects at Twente University we built a gamification platform to support young diabetes patients in dealing with their disease. In a journal paper we discuss the barriers we encountered on the path from design to the final implementation and inclusion in the current health ecosystem. Some elderly people like games as well. So why not design serious games that help people with their personal issues in a challenging way?

The very idea of the Council of Coaches project comes from the world of video games. Different coaches, covering expertise in various domains of life, chat about issues relevant for and brought in by the user. This way the user need not actively contribute to the conversation all the time. She can jump in whenever she likes.

One of the big challenges of the project is to get content for the health dialogs. How to feed the virtual coaches so that they are able to contribute in a sensible way to a conversation about the personal issues raised by a user? Maybe interaction and cooperation with real public coaching sessions can be of help.

Health Insight:  Council of Coaches as Interactive TV format

On the list of Most Important Things For a Good Life a good health condition seems to be number one. But football is definitely second. For many years the most popular and most awarded TV production in the Netherlands has been Voetbal Inside. A council of four football (soccer) coaches discusses the most important issues of the week. TV watchers can drop a line via social media and ask questions, often addressed to one of the council members. The issue selected by the moderator is shown on screen and lively discussed by the different characters of the council (see Figure 1). It is a mixture of spontaneous live and scripted interactive TV. Sometimes direct video communication with a guest/watcher is broadcast.

Why not exploit Council of Coaches as an interactive TV format? The Dutch broadcaster Omroep MAX would be the first to be interested. They target the older segment of Dutch TV watchers and have close contacts with care institutions for elderly people. The most popular programme of Omroep MAX is the cooking/bakery contest "Heel Holland Bakt", in which "normal" people take part. Omroep MAX might be interested in cooperating in a TV series where normal people can discuss their health-related problems (focusing on a specific disease or general health issue: diabetes or obesity) with the Council of Coaches on TV. The Council consists of well-known medical experts and other "well-known Dutch" who suffer from the disease of the week. The recordings, together with feedback about engagement (e.g., audience ratings), can be used as training material for home-made artificial coaches. TV doctors we have had for quite some time already, but a council of coaches that discusses a statement like "Doctor, I have diabetes; could it be because of stress at work?" (see for an answer: https://www.dokterdokter.nl/gezondheid/tv-dokter/page/2/) could be a valuable addition. For society it would be a welcome counterweight against all those media productions and commercials that promote the food industry.

Figure 1:  The Council of Football Coaches, one of the most popular interactive TV productions in the Netherlands about the second most important thing in men’s life.

Figure 2. Insight. The Council of Coaches on TV. 

Figure 2 shows the Council of Coaches on TV. The picture on top shows a scene from the popular Dutch Omroep MAX TV production "Hendrik Groen", about a group of elderly people in a care home. The character on the right is Hendrik. The character on the left is his best friend Evert, a diabetic. Evert likes to drink, maybe a bit too much. He just had his leg amputated in hospital because of gangrene (necrosis caused by diabetes). At the bottom you see the question under discussion. The council members are well-known Dutch TV personalities and experts in food and diabetes care. They discuss the problem and conclude that it would be good if diabetics found out for themselves how their blood glucose values are affected by their alcohol consumption. TV watchers can download the COACH app, which can be connected to their glucose meter. The app allows them to chat with their personal diabetes coach, a virtual replica of one of the council members (they lend their voice and style to the character). The coach gives them instructions on how to perform the test and motivates them to adhere to the protocol for the duration of the test and to keep track of their alcohol consumption. Outcomes are sent to the COUCH server and shared with the audience in a following episode of Health Insight.

Spoken dialog with artificial characters: a real challenge

The problem of real-time automatic speech recognition is, and will remain, “close to being solved’’, thanks to Big Data and DNN technology. Real-time is a necessary requirement for spoken dialog that doesn’t suffer from processing delay and that allows realistic turn-taking and interrupting behaviour.  One big problem is the recognition of special utterances, named entities, and newspeak.  Data used for training machines is typically historical and outdated. Hence the need for continuous updates. As Hugo Brandt-Corstius – one of the founding fathers of Dutch research in formalisation of natural languages – used to say, “Wat je ook doet, de semantiek gooit roet.” (“Whatever you do, semantics bothers you”).

The generation of natural, lifelike speech is ready to be exploited by virtual characters (see: https://deepmind.com/blog/wavenet-generative-model-raw-audio/), so it is possible to have a personal coach with the voice of, for example, André van Duin (Evert in the Hendrik Groen TV series) or Trump, to name a trustworthy figure.

An unsolvable paradox

But the core problem of an artificial dialog is the logic of the conversation. There is none. And if there is some logic, it is the participants themselves who decide what it is, not the designer. Trying to design a system for natural open dialog is trying to solve a paradox. As a conversational partner you do not want to control the other party's response. Of course, when you ask a question you more or less force the other to respond in some way or another, but not in a deterministic way. That's the whole idea of a question, isn't it? Getting to know someone is different from asking all kinds of information about or from someone. The best realisable technical system offers the user a number of options. It also has a number of options from which it can choose to respond to a user's action. These systems assume by design that the world of conversations is closed; that there is something like a mathematical space of all possible conversations. That language is a system. I believe it is not. Autonomous agents ignore the users' freedom, their identity and autonomy. The very concept of 'user' already challenges these human values.

Demonstrators

Modern projects deliver demonstrators. Project reviewers don’t like to read reports or scientific publications. Show me.

The Council of Coaches project built a functional demonstrator. The world of possible conversations is designed using a dialog editing system called WOOL, also developed in the project. So every conversation that a user and the coaches in the council have is a realisation of one of a huge set of possible paths of pre-scripted dialog continuations.
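To make the idea of pre-scripted dialog paths concrete, here is a minimal sketch in Python. It is not the actual WOOL format; the node names, coach texts and reply options are invented for illustration. The principle, however, is the same: every session is one walk through a fixed graph of authored continuations.

```python
# A minimal sketch (not the actual WOOL format) of a pre-scripted dialog graph:
# every conversation the user has is one path through a fixed set of nodes.
dialog = {
    "start": {
        "coach": "How did you sleep last night?",
        "options": {"Well, thanks.": "good_night", "Not so well.": "bad_night"},
    },
    "good_night": {
        "coach": "Great. Shall we look at your activity goal for today?",
        "options": {"Yes.": "end", "Maybe later.": "end"},
    },
    "bad_night": {
        "coach": "Sorry to hear that. Would you like some tips for sleeping better?",
        "options": {"Yes, please.": "end", "No, thanks.": "end"},
    },
    "end": {"coach": "Talk to you tomorrow!", "options": {}},
}

def run(dialog, choose):
    """Walk one path through the dialog graph; `choose` picks a user reply."""
    node = "start"
    while True:
        step = dialog[node]
        print("COACH:", step["coach"])
        if not step["options"]:
            break
        reply = choose(list(step["options"]))
        print("USER: ", reply)
        node = step["options"][reply]

# Example run: always pick the first listed reply.
run(dialog, choose=lambda options: options[0])
```

However rich the graph is made, the user can only ever select one of the continuations that the designers have written in advance; that is exactly the closed world of conversations discussed above.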

In another technical demonstration system embodied virtual 3D characters simulate realistic multi-party conversations. It demonstrates progress made and state-of-the-art in the development of turn-taking, addressing and argumentative dialog behavior generation for artificial conversational embodied 3D characters.

Societal Challenges

In his Computer Power and Human Reason, Joseph Weizenbaum tries to understand how it is possible that people interact with his computer program ELIZA as if they were talking to a real human. The psychological value of interaction with artificial characters is also the theme of Sherry Turkle's Alone Together. As long as people recognize themselves in the answers given, and as long as there is sufficient room for interpretation, the user's experience of being recognized by the system is strengthened.

Council of Coaches is a project that builds bridges between new media, art and design, and technology.

How will people stand towards virtual health coaches? Will they see them as personal coaches that they are willing to share their personal experiences and part of their life with? Do members of the council have to say only what we (or the medical people) believe is correct, or do we allow bad or sceptical characters? Is the system seen as a medical instrument? Or is it a system that tries to make users aware of the way they stand towards their own life, in whatever way we believe works?

Moreover, do users want to have private discussions with one of the virtual coaches? May a coach deceive the patient or withhold information because he believes it is not good for the client’s health to know? Old issues in medical ethics get a new dimension when coaching in the health domain becomes virtual. What is new is that many users are inclined to uncritically believe what the computer says: “The computer told me!”. These are some of the societal issues that have to be considered before the Council of Coaches will find its place in the social organization of patient-centered health care.

Future dream and worst scenarios

Some people prefer to talk about their personal health issues with a virtual character instead of talking to a human expert. Others are more or less forced by the health care system to first chat with an e-health coach before they see a real human. I am not sure that this is a healthy thing. Maybe the Council of Coaches can help users identify their personal issues and break down the barriers to talking to real humans.

A worst-case scenario would be if society decided to replace human coaches and experts with artificial agents because of the economic burden of a good-quality human health care system for elderly people. Many people have the impression that western society is moving in the direction of this worst scenario, ruled by policies that place unlimited trust in autonomous artificial intelligent agencies.

The Council of Coaches project has delivered a proof of concept and a software platform and tools that can be applied for building end user applications in other domains, for example for social skill training in professional organisations.

The real danger of autonomous agents

Regarding the discussion about “autonomous technology’’ (social robots, killer robots, autonomous cars, virtual coaches that would take the place of real humans): some people see a danger in the growing number and autonomy of intelligent machines that would take over the world. I believe the following makes more sense.

"The real danger, then, is not machines that are more intelligent than we are usurping our role as captains of our destinies. The real danger is basically clueless machines being ceded authority far beyond their competence." (Daniel C. Dennett, in: The Singularity—an Urban Legend? 2015)

References

Harm op den Akker et al., 2018. Council of Coaches – A Novel Holistic Behavior Change Coaching Approach. Proceedings of the 4th International Conference on Information and Communication Technologies for Ageing Well and e-Health

Sherry Turkle (2011). Alone Together: Why We Expect More from Technology and Less from Each Other. Basic Books, New York, 2011.

Joseph Weizenbaum (1976). Computer Power and Human Reason: From Judgment to Calculation. W. H. Freeman & Co., New York, NY, USA, 1976.

A Causal Diagram on COVID-19 Infection

“I want to know why my friend, 69, was home with 104 fever, reported to his dr., test negative for flu, pneumonia, finally on day 5 was tested for Covid-19, 3 days later positive, told to stay home, and now is near death on ventilator in hospital? Why so long to test????”

(One of the many tweets that express that we are desperately looking for causes. From: Twitter 25-03-2020)

The SARS-CoV-2 virus is rapidly spreading over the world. Every hour the media come with news about the corona pandemic: the numbers of deaths, of people tested positive, of people admitted to Intensive Care units. Every day we can follow debates between politicians, experts and the public about how to handle the various problems caused by the virus. In the Netherlands we have to stay at home as much as possible, wash our hands regularly, not shake hands and keep a "social distance" of at least 1.5 meters from one another. Schools, restaurants and many shops are closed. Elderly people in community houses or living alone are among the most vulnerable. More and more countries decide on a complete lockdown.

Scientists, in particular virologists, epidemiologists and physicians, explain to their audiences the mechanisms behind the spreading of the virus, to make clear the reasons for the measures taken by their governments (or to comment on them). Statisticians try to make sense of the wealth of data collected by national and international health institutions. What questions can they answer based on their statistics?

People ask what the chances are they will be exposed to the virus when they go to the supermarket. Others want to know how long it takes before the pandemic is under control so that they can go to work and the children to their schools. But a lot is still unknown. The virus differs from those of known influenza epidemics.

A model on the individual level

National health institutions (in the Netherlands the RIVM) as well as international health organisations (in particular the WHO) gather and publish data about the numbers of infected people, as well as how many people died because of the virus. Researchers use these data to explore mathematical growth models, to see whether they can fit them to the data so that they can be used to predict the spread of the virus, its peak, when it will decline and how this depends on policy. These models look at the level of the population of a whole nation, a particular region (e.g. Lombardy) or a city (Wuhan). Sometimes they do look at different age groups or gender differences, for example to predict fatality rates for different groups.

But these models do not look at the individual level. The causal model that I propose here differs from the epidemiological models: it models the factors that play a role in the effects of the virus on the individual level.

Monitoring Reproduction Number R0

An important societal quantity to measure in case of an epidemic is the basic reproduction number of the virus, R0 ("R-naught"). Politicians and public health experts keep a close eye on this number. R0 represents the number of new infections estimated to stem from a single case. If, for example, R0 is 2.5, then one person with the disease is expected to infect, on average, 2.5 others. An R0 below 1 suggests that the number of cases is shrinking. An R0 above 1 indicates that the number of cases is growing. Of course the numbers counted are statistical measures over a population.
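To illustrate, with invented numbers only (not epidemiological data), what R0 means for the growth or decline of an outbreak, a few lines of Python suffice:

```python
# Illustrative only: how the expected case count evolves per "generation" of
# infections for different values of R0 (the numbers are made up).
def generations(initial_cases, r0, n):
    """Expected new cases per generation when each case infects r0 others."""
    cases = [initial_cases]
    for _ in range(n):
        cases.append(cases[-1] * r0)
    return cases

print(generations(100, 2.5, 5))  # R0 > 1: growing,   100, 250, 625, ...
print(generations(100, 0.8, 5))  # R0 < 1: shrinking, 100, 80, 64, ...
```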

R0 depends on a number of factors: the number of people infected, the number of people vulnerable, the chance that a contact between people will transfer the virus from one person to another, and how long infected people remain able to infect others. Some of these values are typical for a virus; others can only be estimated from large populations. A definition of R0 refers to a scientific model of the complex mechanisms underlying the spreading of the virus. See this paper for the confusion about R0. All in all, it is hard to know the "real" value of R0. We can only estimate it.

Since the spreading of the virus depends on the way people behave, policies try to steer people's behavior and thereby hope to control R0. In a similar way the doctor tries to influence a patient's situation by applying some medical treatment. Since the doctor doesn't know how the patient will react, he keeps a close eye on the patient's situation and adjusts the treatment if required. A complicating factor is the time delay: a treatment only has effect after some time. So in order to prevent critical situations we need to predict how the system that we try to control will behave in the future, so that we can adjust the treatment in time. If you want to shoot a flying duck, you need to estimate where it will be at the moment the bullet reaches it. All in all there are many uncertainties, not least about how the public will respond in the long run to the measures taken by governments (e.g., to stay at home), and how this depends on the expectations presented in the media. Maybe other values become more important after some time (e.g., visiting family, going out for pleasure), so that people change their behavior.

Towards a causal Corona model

Many people get sick from the virus, some of them have only mild symptoms, a small percentage does not survive the attack.  Most of the ones that die are older and already have health problems. But there are exceptions. It seems that not all people get the disease. This raises our first question.

(1) What are the factors that determine whether a person will get COVID-19?

For a person to get a disease caused by a virus infection, two things are necessary and sufficient: (a) the person is vulnerable to the virus and (b) the person is actually infected by the virus after he or she has been exposed to it.

This is almost trivial logic of the classical potency–act doctrine. Likewise: for a glass to break, the glass has to be breakable (vulnerable to breaking) and there must be some actor that actually breaks it.

Our first question raises two follow-up questions:

(2) What are the factors that determine a person's vulnerability to infection?

(3) What are the factors that determine whether the person gets exposed to and infected with SARS-CoV-2?

Some people die after they got the disease, others survive.

(4) Which factors determine how serious the disease will become and what are the factors that determine the chances of survival?

If we have an answer to these questions we can predict which people run the highest risk of getting ill and what the best policy would be to prevent or control an outbreak of the virus. To find answers we have to collect data; the more data, the better. Suppose we want to know the effect of age on the chances of dying from corona, among those affected by the virus. We need statistics: for every patient we need the age group and the outcome, where the outcome is either survived or died. We can then estimate the probability for each age group. We might expect that the older people are, the smaller their chances of survival. Suppose we see an unexpected dip in the curve for the age groups above 70. How can we explain this? Our data doesn't tell. Was there a policy not to treat patients in this age group? Data alone is not enough to answer these questions and to make good predictions. We need to know the mechanisms behind the data.
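As a minimal sketch of the kind of statistics meant here, with made-up records rather than real RIVM or WHO data, one could tabulate the death rate per age group like this:

```python
# Minimal sketch with invented records: estimate, per age group, the fraction
# of infected patients who died. Real data would come from RIVM/WHO datasets.
from collections import defaultdict

records = [
    {"age_group": "60-69", "outcome": "survived"},
    {"age_group": "60-69", "outcome": "died"},
    {"age_group": "70-79", "outcome": "survived"},
    {"age_group": "70-79", "outcome": "died"},
    {"age_group": "70-79", "outcome": "died"},
    {"age_group": "80+",   "outcome": "died"},
]

counts = defaultdict(lambda: {"died": 0, "total": 0})
for r in records:
    counts[r["age_group"]]["total"] += 1
    counts[r["age_group"]]["died"] += (r["outcome"] == "died")

for group, c in sorted(counts.items()):
    print(group, c["died"] / c["total"])
```

Such a table describes the data, but, as argued above, it cannot by itself explain a dip in the curve; for that we need a model of the mechanisms.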

A plea for causal graphs

In The Book of Why (2018), computer scientist and philosopher Judea Pearl argues for the use of causal diagrams as a computational instrument for making causal inferences from data. Pearl, who invented Bayesian networks, points out that we cannot draw causal conclusions from data alone: we need causal intuition from experts about the case at hand. With such a causal model we can simulate experiments by setting values of variables in the model and seeing what the effects are on the outcome variables we are interested in. We also need it if we want to answer questions like: "What if Mrs. S., aged 71, of whom we know that she became seriously ill after being exposed to the virus and that she was not taken into hospital, had been taken into hospital care? Would she have died?"

What is a causal diagram?

A causal diagram is a graphical structure with nodes and arrows between pairs of nodes. The arrows represent direct causal relations. The source of the arrow is the cause, the target node represents the effect of the causal relation. Figure 1 shows a causal diagram that could be a part of a causal graph for the analysis and prediction of the effect of age, medical condition and treatment on chances to survive.

Figure 1. A causal diagram

The diagram shows that Treatment depends on Age as well as on Condition. Treatment has effect on Lethality (chance to survive), but this is also dependent on Age and on Condition. Age influences Lethality through four different paths.

Causal diagrams can answer questions like “How does the lethality in the population change if we decide to give all infected people the same treatment independent of age? Or   “How does the recovery rate for the corona disease change if we enforce a complete lock-down instead of the actual soft “intelligent’’ lock down?”.

Such questions can only be answered correctly by means of a causal intervention in the diagram. In technical terms: by applying the do-operator to the intervention variable (we give the variable a fixed value and compute the effect). This simulates more or less what we would do in a randomized controlled experiment. We learn how Nature works by bringing about some controlled change and observing how Nature responds to it. The validity of the conclusions we draw from such simulations requires that the model is correct (no arrow between two nodes if there is no causal relation) and complete, in the sense that all relevant factors and relations are in the model.
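The following sketch makes the difference between observing and intervening concrete for the toy structure of Figure 1 (Age, Condition, Treatment, Lethality). All probability numbers are invented for illustration; only the arrow structure follows the diagram. Conditioning on Treatment mixes in the fact that, in these invented numbers, the treated tend to be older and sicker; the do-operator instead cuts the arrows into Treatment and forces it to a fixed value for everyone.

```python
from itertools import product

# Invented numbers; only the arrows (Age -> Condition, Age & Condition ->
# Treatment, all three -> Lethality) follow the diagram of Figure 1.
p_age = {"young": 0.7, "old": 0.3}
p_cond = {"young": {"good": 0.8, "poor": 0.2},
          "old":   {"good": 0.5, "poor": 0.5}}
# P(Treatment=yes | Age, Condition)
p_treat = {("young", "good"): 0.3, ("young", "poor"): 0.8,
           ("old", "good"):   0.6, ("old", "poor"):   0.9}
# Lethality node: P(Death=yes | Treatment, Age, Condition)
p_death = {("yes", "young", "good"): 0.01, ("no", "young", "good"): 0.02,
           ("yes", "young", "poor"): 0.05, ("no", "young", "poor"): 0.15,
           ("yes", "old", "good"):   0.05, ("no", "old", "good"):   0.10,
           ("yes", "old", "poor"):   0.20, ("no", "old", "poor"):   0.50}

def joint(age, cond, treat, death):
    pt = p_treat[(age, cond)] if treat == "yes" else 1 - p_treat[(age, cond)]
    pd = p_death[(treat, age, cond)] if death == "yes" else 1 - p_death[(treat, age, cond)]
    return p_age[age] * p_cond[age][cond] * pt * pd

# Observational: P(Death=yes | Treatment=yes), computed by conditioning.
num = sum(joint(a, c, "yes", "yes") for a, c in product(p_age, ["good", "poor"]))
den = sum(joint(a, c, "yes", d) for a, c, d in product(p_age, ["good", "poor"], ["yes", "no"]))
print("P(Death=yes | Treatment=yes)     =", round(num / den, 3))

# Interventional: P(Death=yes | do(Treatment=yes)), keeping P(Age) and
# P(Condition | Age) but forcing Treatment to "yes" for everyone.
do = sum(p_age[a] * p_cond[a][c] * p_death[("yes", a, c)]
         for a, c in product(p_age, ["good", "poor"]))
print("P(Death=yes | do(Treatment=yes)) =", round(do, 3))
```

With these invented numbers the two results differ, which is exactly why observational data alone cannot answer the policy question.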

How do we know if the model is complete and correct? I am not aware of any convincing publication about validation of causal diagrams. In fact, a causal diagram is a theory that can at best be falsified, not proven to hold the truth. Pearl’s Book of Why describes some interesting episodes that show that the truth consists in the historical process of scientific research. Not outside this process. Although it helps that now and then someone stubbornly believes that she has seen the light and fights against the community’s doctrine. Those are the ones that make science progress. 

Causal Diagrams and Bayesian Networks

Nodes X and Y of a causal diagram represent factors that play a role in the model, with an arrow drawn from X towards Y if we conceive X as a “direct cause’’ of Y. Mathematically the factors X and Y are stochastic variables of various types: continuous, or discrete: ordered or categorical.

A causal network is a special type of Bayesian network. In a Bayesian network the arrows need not refer to direct causes; they stand for probabilistic "influence" without implying a causal direction. For example, if we know that someone lives in an elderly home, we can infer that he or she is most likely over 70 years of age. There might not be a direct causal relation in a strict sense between age and living in an elderly home, but if we get the information that Mrs. S. lives in an elderly home, the chances rise that she is over 70 years of age and not 15. In a Bayesian network, each node Y has attached to it a (conditional) probability table for P(Y|z), where z is the set of all variables that point at Y (the parent nodes of Y). Thus when the network is X -> Y, node Y has a conditional probability table P(Y|X) and X has a table for the probability distribution P(X).

Where do the probabilities come from? Bayesian networks are quite tolerant about the source of the probability values: they are either computed from data (probabilities as relative frequencies) or based on expert opinions (probabilities as confidence measures of “belief states’’). For a comprehensive review about the use of Bayesian Networks in health applications refer to Evangelia Kyrimi et al. (forthcoming). One observation is that many researchers do not publish their network, nor do they comment on the design process.

Bayesian networks are used to compute (infer) the probabilities of some outcome variables given some known values of input variables (the observed ones). The computation is based on the classical Theory of Probability (Pascal, Fermat) and uses Bayes’ Rule for computing the unknown probabilities of variables.

Let me give an example. Let T be the variable that stands for the outcome of a corona test (its value can be positive or negative) and let D stand for the statement "the patient tested has the corona disease" (D = true or false). We know that there is a causal relation D -> T. In diagnostics we reason from test to disease, not in the causal direction. Suppose the test outcome is positive. Does the patient have the disease? Tests are not 100% reliable and sensitive. We want to compute P(D=true | T=pos), the chance that the patient has the disease given that the test outcome is positive.

Bayes’ rule says:

P(D=true | T=pos) = [ P(D=true) / P(T=pos) ] * P(T=pos | D=true)

The second factor on the right-hand side, P(T=pos | D=true), the probability that the test is positive if we know that the patient has the disease, has a known value. It is based on expert knowledge about the test mechanism, which explains how the disease causes the test outcome. The first factor is a fraction: the numerator P(D=true) is the prior probability, i.e., the chance that the patient has the disease before knowing the test outcome. The denominator P(T=pos) is the probability that the test is positive, whatever the situation of the patient. There is a small chance that the test shows a positive outcome when the patient does not have the disease (a false positive).

Bayes' rule is consistent with intuitive logic. As you can see from the right-hand side of the formula, the chance P(D=true | T=pos) grows when the prior P(D=true) grows: when almost everyone has the disease, the chance that you have it when you test positive is also high. On the other hand, the higher P(T=pos), i.e. the higher the chance that the test is positive (also for people who do not have the disease), the smaller the probability that a positive test outcome witnesses the occurrence of the disease.
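As a worked example with invented numbers (a prior prevalence of 2 percent, a test that detects 95 percent of true cases and gives 1 percent false positives), the arithmetic of Bayes' rule looks like this:

```python
# Worked example of Bayes' rule with invented numbers.
p_d = 0.02                    # P(D=true): prior probability of the disease
p_t_given_d = 0.95            # P(T=pos | D=true): sensitivity of the test
p_t_given_not_d = 0.01        # P(T=pos | D=false): false positive rate

# Denominator P(T=pos): positive tests among the diseased and the healthy.
p_t = p_d * p_t_given_d + (1 - p_d) * p_t_given_not_d

# Bayes' rule: P(D=true | T=pos) = P(D=true) / P(T=pos) * P(T=pos | D=true)
p_d_given_t = (p_d / p_t) * p_t_given_d
print(round(p_d_given_t, 3))  # about 0.66 with these numbers
```

Even with a rather accurate test, a positive outcome here only gives about a two-thirds chance of disease, precisely because the prior is low; this is the intuitive logic described above made numerical.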

When P(X|Y) is not equal to P(X), we call X and Y probabilistically dependent. This dependency should not be confused with causal dependency: P(D=true | T=pos) differs from P(D=true), but having a disease is not caused by a test. Likewise, correlation (a measure of co-occurrence) should not be confused with causation.

What counts as a causal relation?

As we said before, what makes a causal diagram a special (Bayesian) network is that the arrows in the network represent direct causal relations. But what is a causal relation? Where do they come from?

Almost every day we express our belief in the existence of a causal relation between phenomena, either implicitly, e.g., "If you don't hurry now, you will be late.", or explicitly, e.g., on a package of cigarettes: "Smoking causes lung cancer.", or "Mrs. Smith died because of corona.". But if we come to think about it, it is not easy to tell what a causal relation is and how we can distinguish it from other types of influence between things that happen. The modern history of the idea of cause started with Aristotle's theory of the four causes of being (see his Metaphysics). Only two of them have survived: the notion of efficient cause and the notion of final cause. In our modern mechanical scientific world view we see a cause as something that brings about some effect: a billiard ball that causes another ball to move, and so on. We see chains of causes and effects and many causes that influence other things or processes.

On a medical internet site I found the claim "Smoking does not cause lung cancer". The motivation given is: (1) not everybody who smokes gets the disease (smoking is not a sufficient condition to bring about the effect) and (2) some people get lung cancer and never smoked (it is not a necessary condition to bring about the effect). Indeed, after many years of debate and research (see Pearl's account of this episode in the history of science in The Book of Why), the causal mechanisms of cancer development under the influence of smoking have been largely unraveled. We know that the primary effect of smoking is that the tar paralyzes and can eventually kill the tiny hair-like structures in the lungs. Lung cancer happens when cells in the lung mutate and grow uncontrollably, forming a tumor. Lung cells change when they are exposed to dangerous chemicals that we breathe in. The tar makes the lungs more vulnerable to these chemicals. In that sense smoking is indeed a cause of lung cancer. The search for the "real" cause will only end when we have found a plausible theory that describes the chemical mechanism of the change of the lung cells.

In general, it seems that as long as cause and effect are two separate objects or processes our wish to know how nature works is not really satisfied. A successful and satisfying search for the real cause of a given effect eventually reveals that cause and effect are actually two sides of the same mechanism that we tend to objectify. In reality cause and effect are materially one and the same process. But this only holds at the end of the search for causes. In making a causal diagram we must take for a cause everything that has a direct influence on (the status of) the effect.  

To make a long story short: there is no agreement about what counts as a “cause’’, so we have to rely on our “causal intuition’’ and expertise in the relevant domains.    

Need for expert knowledge

I am not an expert in any of the relevant areas, but I take up the challenge of making a causal diagram based on the information that I gathered from the literature and the internet. I have some experience in the design of such models, but in quite different areas: in the past I gave courses in Artificial Intelligence devoted to reasoning under uncertainty, in which we used (dynamic) Bayesian networks for natural language processing and for the recognition of participants' behavior in conversations. In making this diagram I have learned how challenging it is to construct one. The Dutch newspapers and the Dutch RIVM provide statistics and background articles that I used to make the diagram.

My causal corona diagram is just the basic structure of a Bayesian network: I do not attach (conditional) probability tables to the nodes. Computing probabilities is not my concern. I am interested in the causal structure. The main function is to organize and visualize the connections between the most important factors that play a role in the process of getting caught by the corona virus.

The Corona Diagram

The core of the diagram is based on the simple idea that a subject will get sick if and only if two things hold: (1) the subject is vulnerable and therefore sensitive to being infected, and (2) he or she is exposed to and actually infected by the virus. This is almost tautologically trivial, but that does not make it less true.

Here is the kernel of the causal diagram in which Seriousness is the label for the variable that indicates how serious the disease of this person is.   

Now that we are in the middle of the pandemic, public policies focus on minimizing the chances of exposure and infection. Several factors influence a person's vulnerability to viruses and infections in general. We currently do not have a complete picture of which factors typically influence sensitivity to this new SARS virus.

Figure 2. The core of the Corona diagram

SeriousNess: indicates how seriously ill the patient gets. Its values range from Asymptomatic and Mild to Serious and VerySerious.

It is directly influenced by two factors:

Vulnerability: indicates how sensitive the patient is to the virus. It refers to the subject's physical condition. Other relevant conditions belong to the second cluster of nodes.

Infection: indicates whether and to what extent the virus has entered the body. It ranges from nil to a few to many.

When there is no Infection, SeriousNess is nil. SeriousNess is determined by the degree of Infection and by Vulnerability. The factors that determine Vulnerability should answer the question why this young man became seriously ill and needs hospital care, whereas that woman shows only mild symptoms. Recent research indicates that Infection is a gradual thing (a person can be exposed to a few or to many virus particles) and that SeriousNess depends in a more complex way on the Infection grade (or type: it might be that different variants of the corona virus make a difference here).

Lethality indicates the chance of dying. It depends on SeriousNess and on MedicalCare, i.e. the medical treatment offered. There is no cure at the moment; only care can help the patient survive.

Note: It is often difficult to say what the exact cause of death is. Statistics about populations may reveal how many deaths are the effect of the corona virus.

Infection is caused by Exposure (is the subject exposed to the virus?) and influenced by SelfHealthCare (did the subject take hygienic precautions: washing hands, etc.). SelfHealthCare is important to prevent Infection. That is why people are urged to wash their hands with soap or alcohol regularly and not to touch their face.

Vulnerability and Infection are the two main factors. Both are necessary for becoming ill. As you can see, there is no direct relation between the two, i.e. Vulnerability does not cause Infection, nor the other way around. However, from collected data it will appear that there is a high correlation between the two, suggesting a causal relation. This will show, for example, when we collect data from medical staff taken into hospital care after they have tested positive. We cannot conclude from this that in the general public there is a causal relation between the two. This "spurious" causal relation only holds for people who are actually infected. (See for an explanation of this phenomenon Pearl's discussion of Berkson's paradox in The Book of Why.)
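A minimal sketch of this core structure, with invented probabilities and thresholds, might look as follows. It only illustrates the logic that both Infection (via Exposure and SelfHealthCare) and Vulnerability are needed before SeriousNess rises above nil; it is not a validated model.

```python
import random

def simulate_person(vulnerability, p_exposure, self_health_care):
    """vulnerability in [0, 1]; self_health_care in [0, 1] lowers the chance
    that Exposure actually leads to Infection. All numbers are invented."""
    exposed = random.random() < p_exposure
    infected = exposed and random.random() < (1 - self_health_care)
    if not infected:
        return "nil"                          # no Infection -> SeriousNess is nil
    score = vulnerability * random.random()   # SeriousNess grows with Vulnerability
    if score > 0.6:
        return "VerySerious"
    if score > 0.3:
        return "Serious"
    if score > 0.1:
        return "Mild"
    return "Asymptomatic"

random.seed(1)
for v in (0.2, 0.8):                          # a low- versus a high-vulnerability subject
    outcomes = [simulate_person(v, p_exposure=0.3, self_health_care=0.5)
                for _ in range(10_000)]
    print("vulnerability", v,
          {s: outcomes.count(s)
           for s in ("nil", "Asymptomatic", "Mild", "Serious", "VerySerious")})
```

Even in this toy version one can see the point made above: lowering Exposure or raising SelfHealthCare keeps most subjects at nil, whatever their Vulnerability, while Vulnerability only matters once Infection has occurred.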

Figure 3. The Corona causal diagram

What are the factors that influence these two main factors?

Vulnerability

This is influenced by:

HealthCondition: the physical condition. Some people say that the people who die after being infected would have died anyway within a year or so because of their bad health condition.

ChronicCondition: chronic diseases such as lung diseases, coronary and heart disease, and diabetes. There are indications that a high percentage of the patients taken into hospital have too high a BMI (body mass index).

Gender: two-thirds of the patients who passed away after becoming seriously ill are male.

GeneticFactors: other genetic factors besides gender-specific ones may play a role. The COVID-19 Host Genetics Initiative (2021) reports that "while established host factors correlate with disease severity (e.g., increasing age, being a man, and higher body mass index), these risk factors alone do not explain all variability in disease severity observed across individuals. The genetic makeup of an individual contributes to susceptibility and response to viral infection."

This international initiative describes the results of three genome-wide association meta-analyses comprising up to 49,562 COVID-19 patients from 46 studies across 19 countries. They report 13 genome-wide significant loci that are associated with SARS-CoV-2 infection or severe manifestations of COVID-19.

ImmunizationStatus: it is currently not fully known how the immune system responds to the virus. Several vaccines have been developed and have been administered since the end of 2020. It is to be expected that people who have recovered have a lower chance of becoming ill again. Current research seems to indicate that people who had only mild symptoms are not completely immune to the virus.

Age: directly related to a number of other factors. Age is also a factor in the Exposure cluster. Age is more a property that can be used as an indicator for certain inferences. Ageing as a process influences many aspects of people's lives: work, social contacts, etc. (See below for a discussion of ageing as a multi-factor confounder.)

Exposure

A subject can be exposed to the virus to varying degrees, depending on the concentration of the virus in the place where exposure occurred and the duration of the contact. Exposure is determined by several factors: mainly by social contacts and activities (type of work, hobbies). Another factor is

VirusSpread: how much is the virus spread over the subject’s social environment?

If VirusSpread is high and SocialContacts is also high then chances for Exposure are high as well.[1]

It is difficult to get a good picture of the spread of the virus. Only a limited number of people are tested for the virus, so conclusions drawn from the data collected are most likely biased. Also, different countries have different policies, which makes merging data a real challenge. Even within the Netherlands, different regions have different policies depending on the test capacity available. See below for a recent publication about a simulation study of the spreading mechanisms of the virus.

Personality: influences SocialContacts. (See also the comment about compliance in the notes below.)

Household: apart from social contacts, direct contact with family members influences the chance of being exposed to the virus.

Activities: it makes a difference whether a person lives a quiet, secluded life at home or is an active participant in all kinds of social events and organizations.

How can a Corona diagram be used?

The causal diagram presented is the result of an exercise in making a causal diagram for a realistic case. There is need for expert knowledge to come up with a better model. It can be completed with probabilities based on available data.  

Eventually, a causal diagram like this for the corona pandemic can be used to answer several kinds of questions: not only probabilistic questions (what are the effects of certain observations on the occurrence of other events?) but also causal questions (what is the effect of changing values on outcome variables?). It can also be used to answer counterfactual questions on an "individual" level[2].

The diagram can also be used for qualitative analyses. How is Age related to Lethality? The diagram shows that the influence of Age goes through Sensitivity. Ageing makes people more susceptible to disease, so if they are infected they run a higher risk of ending up in a serious condition. If there is not enough medical care available for the subject, the chances that he or she will die increase.

What is the impact of gender on lethality? Data about the incidence of COVID-19 indicates that there is a high correlation between Gender and Seriousness. This suggests a direct effect of Gender on Vulnerability, i.e., that male subjects are more vulnerable than female subjects. Statistics show that two-thirds of all patients are male. But other genes of the human genome may play a role as well.

Health care workers, doctors and nurses in hospitals, are more vulnerable than, e.g., computer programmers: the former run a higher risk of becoming exposed and infected than the latter group. This is reflected in the Activities node in the diagram.

What is the impact of the ecological environment on the chances of getting ill? The influence of the environment is indirect, via HealthCondition. Polluted air causes chronic lung diseases such as COPD, which makes people more vulnerable. The risks of getting ill are higher in large urban areas (e.g., Wuhan, Madrid, northern Italy, New York) than in traditional agricultural areas. People living in urban areas also run a higher risk of exposure because of the heavy use of public transport.

Finally, such a diagram might also help in deciding on, or explaining, policies that aim at minimizing exposure for specific subgroups of citizens. Isolation of elderly people helps because it affects Exposure, a necessary condition for getting infected.

A note on confounding factors: age

Our diagram has a simple tree-like structure. It has two main branches that are not connected upwards: Seriousness and Infection are not directly connected. Each of the two branches has a tree-like structure itself.

This implies that Seriousness and Infection are not related. They would be related if the two branches shared a confounding cause. A confounder C of X and Y is, simply put, a common cause of X and Y. In The Book of Why (2018) Pearl gives a better, mathematical, definition in terms of his do-operator. Since Age in fact influences not only Vulnerability but also Exposure via SocialContacts (not indicated in the diagram), Age is a confounder.
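A sketch of that definition in Pearl’s do-notation (my paraphrase, not his exact formulation): X and Y are confounded when intervening on X and merely observing X give different answers, i.e. when

P(Y | do(X)) ≠ P(Y | X)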

Pearl (2018) discusses a study among retired men that revealed an association between regular walking and reduced death rates. Is there a causal relation between the two? According to Pearl, the experimenters did not prescribe who would be a casual walker and who would be an intense walker, so we have to take into account the possibility of confounders. “An obvious confounder might be age”. The causal diagram is shown below.

Figure 4. A causal diagram from The Book of Why (J. Pearl, 2018)

Confounding factors cause spurious correlations between other variables. If we want to know the effect of Walking on Mortality, we should estimate it for fixed age groups. The situation would be different if Walking had a causal influence on Age, which is not the case (well, maybe it is).
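In do-notation, “doing it for fixed age groups” is the adjustment formula (a sketch, using the variable names of Pearl’s walking example):

P(Mortality | do(Walking)) = Σ_age P(Mortality | Walking, Age = age) · P(Age = age)

Each age stratum contributes its own walking-versus-mortality comparison, weighted by how common that age group is.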

Some other studies: testing for the virus and mortality

A recent analysis using collected datasets of over 17 million adult NHS patients, of whom a bit fewer than 5,700 died of causes attributed to COVID-19, confirms the relevance of most of the factors in our model that determine the chances of dying from the Corona virus. The study reports (version of 7 May 2020) numerical values for the impact of these factors on lethality: age, sex, medical conditions (chronic diseases), and deprivation. Race is a factor: “Compared to people with ethnicity recorded as white, black people were at higher risk of death“. Behavioural factors, such as the ones in our model, have not been studied. A nice causal diagram is missing in the report.

Some epidemiologists are active on Twitter and discuss causal models. One of them is Ellie Murray. She likes to illustrate her models and ideas with nice handmade infographics. See the figure below.

A nice illustration of a causal analysis by Ellie Murray on Twitter (May 8, 2020)

Besides the personal question “What are the chances that this particular person will be infected by the virus and survive?”, there are questions concerning the effect of policies, and of compliance with them, on the spread of the virus and on mortality. Other researchers focus on these and on the global effect of the pandemic on mortality in the general public. How much does the virus shorten life expectancy?

David Spiegelhalter (2020) discusses the issue of whether many deaths from COVID-19 would have occurred anyway as part of the ‘normal’ risks faced by people, particularly the elderly and those with chronic health problems, who are the main victims of COVID. The current (25-03-2020) estimated infection fatality rate (the percentage of deaths among infected subjects) is anywhere between 0.5% and 1%.

Norman Fenton et al. (2020) highlight the need for random testing to prevent bias due to the selection methods currently used. They advocate the use of a causal model for the analysis of the results. They present an example of a causal model for a given country and its population in order to show that the COVID-19 death rate is “as much a function of sampling methods, testing and reporting, as it is determined by the underlying rate of infection in a vulnerable population.’’

Wilder et al. (2020) develop an agent-based model for studying the spread of the SARS-CoV-2 virus. “A key feature of the model is the inclusion of population-specific demographic structure, such as the distributions of age, household structure, contact across age groups, and comorbidities.’’ The aim of this study is “to evaluate the impact of age distribution and familial household contacts on transmission using existing data from Hubei, China, and Lombardy, Italy – two regions that have been characterized as epicenters for SARS-CoV2 infection – and describe how the implications of these findings may affect the utility of potential non-pharmaceutical interventions at a country-level.’’

Conclusion

Motivated by Judea Pearl’s The Book of Why, in which he advocates the use of causal diagrams, interested in the mechanisms at play in the Corona disease, and undisturbed by any expert knowledge in virology, epidemiology, or whatever other relevant domain, I made a causal diagram. I hope I have explained what it is, what it can be used for, and why it would be good to work on a better one.

Rieks op den Akker, Lonneker, March 2020

References

COVID-19 Host Genetics Initiative (2021). Mapping the human genetic architecture of COVID-19. Nature (2021). 

David Spiegelhalter (2020), How much ‘normal’ risk does Covid represent? Blog posted March 21, 2020. https://medium.com/wintoncentre/how-much-normal-risk-does-covid-represent-4539118e1196

Judea Pearl & Dana Mackenzie (2018). The Book of Why : the new science of cause and effect. New York: Basic Books.

Judea Pearl (2001), Causality: models, reasoning and inference. Cambridge University Press, Revised edition, 2001.

Norman Fenton, Magda Osman, Martin Neil, Scott McLachlan. Improving the statistics and analysis of coronavirus by avoiding bias in testing and incorporating causal explanations for the data. http://www.eecs.qmul.ac.uk/~norman/papers/Coronavirus_death_rates_causal_model.pdf

Wilder, Bryan and Charpignon, Marie and Killian, Jackson and Ou, Han-Ching and Mate, Aditya and Jabbari, Shahin and Perrault, Andrew and Desai, Angel and Tambe, Milind and Majumder, Maimuna, The Role of Age Distribution and Family Structure on COVID-19 Dynamics: A Preliminary Modeling Assessment for Hubei and Lombardy (March 31, 2020). Available at SSRN: https://ssrn.com/abstract=3564800 or http://dx.doi.org/10.2139/ssrn.3564800

Kyrimi, E., McLachlan, S., Dube, K., Neves, M.R., Fahmi, A., & Fenton, N.E. (2020). A Comprehensive Scoping Review of Bayesian Networks in Healthcare: Past, Present and Future. ArXiv, abs/2002.08627.

Rieks did some mathematics and computer science. He has a PhD in computer science. He moved to dialogue systems research and natural language processing. He was Assistant Professor Artificial Intelligence and Human Computer Interaction at the University of Twente. He designed and used Bayesian Networks for modeling and prediction of conversational behaviors. He lectured logic and statistical inferencing in AI courses focusing on reasoning with uncertainty.

Notes:

[1] Comment by Miriam Cabrita. Related to Exposure, I miss how the community (excluding the individual self) complies with the measures proposed to stop the virus. For example, one way or another everybody needs to do shopping. Let’s assume that an individual goes to the supermarket and takes all precautions given by the government and even more (e.g. washes hands carefully when arriving home, removes shoes). If the people in the community are sloppy (not to say stupid), and keep going to the supermarket while sick, the chances that the individual is Exposed to the disease are much higher than in a community where everyone respects the rules. What if the people you live with do not comply with the rules of SelfHealthCare? Then you are also much more exposed to the virus.

Response by Rieks: Compliance is indeed a factor that affects VirusSpread. It may be affected by Personality, i.e., some people are more compliant with authority regulations than others (youngsters, elderly, for different reasons). You need to know compliance with treatment to know the effects of treatment. You need to know whether people were free to choose a treatment or were assigned to it, since that affects compliance. These are relevant issues in the current discussion in the Netherlands about the introduction of a corona app to trace, detect and warn people about corona. What can we do with the data collected?

[2] Real individuals do not occur in scientific models. Science is about categories. Individuals are abstract entities represented by a list of attribute/value pairs. In a counterfactual question to the model we change certain values of some individual to see what the effect would have been had she had this property instead of the one that she has in the real world.

Revisiting Simpson’s Paradox

Most people involved in car accidents have a driver’s license

Does Simpson’s paradox have anything to do with causality, as Judea Pearl claims in The Book of Why? In this book the computer scientist and philosopher of science describes the historical development of a mathematical theory of causation. This new theory licenses the scientist to talk about causes again, after a period in which she could only report in terms of correlations. Will the Causal Revolution, in which Pearl plays a prominent role, eventually lead to a conversational machine that passes the Turing test?

The strange case of the school exam

A school offers courses in statistics. Two Professors are responsible for the courses and the exams. The contingency tables below show statistics about the students’ exam results in terms of passed (Positive) or not passed (Negative) for each of the two Professors.

The school gives an award to the Professor with the best exam results. Professor B claims the award, pointing at the first table. This table indeed shows that the relative frequency of passing is higher for Professor B (2% negative results) than for Professor A (3% negative results).

Professor A objects to B’s claim. It was recorded which students were well prepared for the exam and which were not. He compiled a table with the segregated results. Indeed, this second table shows that for both student categories the results of Professor A are better than those of Professor B.

Which Professor wins the award?

The strange outcome of the statistics exams

The statistics in the aggregated table show clearly that for the whole group of students Professor B has better results than Professor A, but for both subgroups of students it is reversed: Professor A is better than Professor B.

How is this possible?

This surprising outcome of the statistics exams is my favourite instance of Simpson’s paradox. The paradox is well known among scholars and among most students who have followed a course in statistics. I presented it to my students in a lecture to warn them about hidden variables. I dug up my slides again when I was reading Judea Pearl’s discussion of the paradox in The Book of Why.

Beyond statistics: causal diagrams

After he introduced Bayesian Networks in the field of Artificial Intelligence, Pearl invented causal diagrams and developed algorithms to perform causal inferences on these diagrams. In The Book of Why Pearl presents several instances of Simpson’s paradox to clarify that we cannot draw causal conclusions from data alone. We need causal information in order to do that. In other words: we need to know the mechanism that generated the data.

Causal diagrams are mathematical structures, directed acyclic graphs (DAGs) in which the arrows connecting two nodes represent a causal relation, not just a probabilistic dependency.

Figure 1 shows two possible causal diagrams for the case of the school exams.

Figure 1. Two causal diagrams for the school exams

Both networks can be extended to a Bayesian network with probabilities that are consistent with the statistics in the tables. In both models the Professor (the node labeled Prof) and the student’s preparedness (the node labeled Prepared) are direct causes of the exam result, represented by the node labeled Passed. The diagrams differ in the direction of the arrow between the Prof node and the Prepared node. In the diagram on the left the causal direction is towards the Prof node; in the diagram on the right the causal direction is towards the Prepared node: the Professor determines how well students are prepared for the exam.

If the latter model fits the real situation, the school should award Professor B. The decision should be based on the table with the combined results. The better exam results are to the Professor’s credit.

The diagram on the left models the situation in which the preparedness of the students somehow determines the Professor. In this case the school could award Professor A based on the results in the lower, segregated, table.

What does Simpson’s paradox have to do with causality?

What makes Simpson’s paradox a paradox? There has been some discussion about this in the statistical literature. Simpson himself gives two examples of the phenomenon. One is about the chances of survival after a medical treatment, where the contingency tables show that the treatment is good for males as well as for females but valueless for the population as a whole. Of course, such a treatment cannot exist. But what should we conclude from the tables? Again, the answer depends on the underlying mechanism, which can be represented by a causal diagram. Simpson suggests that the “sensible interpretation” is that we use the segregated results for the genders. It is a bit strange, indeed, to assume that the treatment affects the patient’s gender.

Pearl distinguishes between Simpson’s reversal and Simpson’s paradox. He claims that Simpson’s paradox is a paradox because it “entails a conflict between two deeply held convictions”. Notice that even if there were no reversal, different causal diagrams would still be possible.

What does Simpson’s paradox reveal?

In Causality (2001) Pearl introduces the paradox in terms of conditional probabilities.

“Simpson’s paradox refers to the phenomenon whereby an event C increases the probability of E in a given population p and, at the same time, decreases the probability of E in every subpopulation of p. In other words, if F and ~F are two complementary properties describing two subpopulations, we might well encounter the inequalities

P(E | C ) > P(E | ~C)

P(E | C,F) < P( E | ~C,F)

P(E | C,~F) < P(E | ~C,~F)

“Although such order reversal might not surprise students of probability, it is paradoxical when given causal interpretation.’’ (Causality, p. 174; the italics are mine)

From the first inequality we may not conclude that C has a positive effect on E.  The effect of C on E might be due to a spurious confounder, e.g., a common cause of E and C.

In our example of Simpson’s paradox we could estimate conditional probabilities P(Passed|Prof)  from the contingency tables.

From the inequality

P(Passed=True | Prof=B) > P(Passed=True | Prof=A)

derived from the combined table we could conclude that the Professor has a causal influence on Passed, i.e. on the exam results. If we do this we give the inequality a causal interpretation. And this is clearly wrong! There could be other mechanisms (confounders) that make Passed dependent on Professor.

Why is Simpson’s reversal surprising?

Consider the following statement.

If a certain property holds for all members of a group of entities then that same property also holds for all members of all subgroups of the group and vice versa.

This seems to me logically sound. It holds for whatever property. The statement differs from the following.

If a certain property holds for a group of entities then that same property also holds for all subgroups of the group and vice versa.

The second one is about properties of aggregates. This is not a sound logical rule: whether it holds depends on the property.

If a student sees the contingency tables of the school exams and notices the reversal, he might perceive this as surprising and see it as contradicting the first statement. On second thought, he might notice that it is not applicable: there is no property that holds for all students. The student might then think that it contradicts the second statement. But then he realizes that this is not sound logic. Simpson’s paradox makes him aware that the second rule, the one about aggregates, does not apply here. The reason is that the property is not “stable’’. The property changes when we consider subgroups instead of the whole group. The property is a comparison of relative frequencies of events. In our example:

 6/600 < 8/600 and 57/1500 < 8/200

and for the merged group it holds that:

(6+57)/(600+ 1500)  > (8+8)/(600+200)
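A small check of the reversal in these numbers (a Python sketch; the counts are the ones quoted above from the tables):

# Negative results per (professor, subgroup): (count, group size).
neg_A = {"prepared": (6, 600), "unprepared": (57, 1500)}
neg_B = {"prepared": (8, 600), "unprepared": (8, 200)}

for group in ("prepared", "unprepared"):
    a, na = neg_A[group]
    b, nb = neg_B[group]
    print(group, a / na < b / nb)   # True: A has the lower failure rate in each subgroup

rate_A = sum(c for c, _ in neg_A.values()) / sum(n for _, n in neg_A.values())
rate_B = sum(c for c, _ in neg_B.values()) / sum(n for _, n in neg_B.values())
print("aggregate", rate_A > rate_B)  # True: yet B has the lower failure rate overall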

The abstract property hides, in a sense, the differences that occur in the underlying relative frequencies. The situation is like winning a tennis match: a player can win the match although her opponent wins most of the games. The outcomes of the games are hidden by counting the number of sets that each of the players wins. With set scores 6-5, 0-6 and 6-5 player A wins 2 sets to 1, but player B wins with 16 games to 12.

Indeed, “Simpson’s reversal is a purely numerical fact”.

What does Simpson’s paradox have to do with causality?

Pearl claims that for those who give a physical causal interpretation to the statistical data, there is a paradox. “Causal paradoxes shine a spotlight onto patterns of intuitive causal reasoning that clash with the logic of probability and statistics” (p. 190).

In The Book of Why he writes that it cost him “almost twenty years to convince the scientific community that the confusion over Simpson’s paradox is a result of incorrect application of causal principles to statistical proportions.”

Whether an argument or construct is perceived as a paradox seems to depend not only on the rhetorical way the argument is presented but also on the receiver.

The heading “Most people involved in car accidents have a driver’s license’’ is perceived as funny by the reader insofar as it suggests a causal relation, i.e. that having a driver’s license causes car accidents.

How would a student of the Jeffreys and Jaynes school, i.e. someone who has an epistemological concept of probability, perceive Simpson’s paradox?

When I saw Simpson’s paradox for the first time I was surprised. Why? Because of the suggestion the tables offer, namely that they tell us something about general categories. Subconsciously we generalize from the finite set of data in the tables to general categories. If we compute (estimate) probabilities based on relative frequencies, we in fact infer general conclusions from the finite data counts. The probabilities hide the numbers. In my view the paradox could very well be caused by this inductive step. We need not interpret probabilistic relations as causal to perceive the paradoxical character.

What are probabilities about?

At the time I was a student, probability theory and statistics was not my favourite topic. On the contrary! My interest in the topic was awakened when I read E.T. Jaynes’ Probability Theory. Jaynes is an out-and-out Bayesian with a logical interpretation of the concept of probability. According to this view probability theory is an extension of classical logic. Probabilities are measures of the plausibility of a statement, expressing a state of mind. P(H|D) denotes the plausibility of our belief in H given that we know D. I use H for Hypotheses and D for Data. P(H|D) can stand for how plausible we find H after having observed D. Bayes’ rule tells us how we should update our beliefs after we have obtained new information. Bayes’ rule is a mathematical theorem within probability theory. It allows us to compute P(H|D) from P(D|H), the probability of D given some hypothesis, and P(H), the prior probability of H.
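Written out, with the normalizing term P(D) obtained by summing over the competing hypotheses:

P(H | D) = P(D | H) · P(H) / P(D),   where P(D) = Σ_h P(D | h) · P(h)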

Jaynes warns his readers to distinguish between the concept of physical (or causal) dependency and the concept of probabilistic dependency. Jaynes’ theory concerns the latter: epistemological (in)dependencies, not causal dependencies.

Neither involves the other. “Two events may be in fact causally dependent (i.e. one influences the other); but for a scientist who has not yet discovered this, the probabilities representing his state of knowledge – which determines the only inferences he is able to make – might be independent. On the other hand, two events may be causally independent in the sense that neither exerts any causal influence on the other (for example, the apple crop and the peach crop); yet we perceive a logical connection between them, so that new information about one changes our state of knowledge about the other. Then for us their probabilities are not independent.’’ (Jaynes, Probability Theory, p. 92).

Jaynes’ Mind Projection Fallacy is the confusion between reality and a state of knowledge about reality. The causal interpretation of probabilistic relations is an instance of this fallacy. Logical inferences can be applied in many cases where there is no assumption of physical causes.

According to Pearl the inequalities of Simpson’s paradox are paradoxical for someone who gives them a causal interpretation. I guess Jaynes would say: the fact that these inequalities hold shows that we cannot give them a causal interpretation; they express different states of knowledge. You cannot be in a knowledge state in which they all hold true.

But how would Jaynes resolve the puzzle of the school exam? Which of the two Professors should win the award? Jaynes was certainly interested in paradoxes, but he didn’t write about Simpson’s paradox, as far as I am aware. I think he would not consider it a well-posed problem. Jaynes considered the following puzzle of Bertrand’s not well posed:

Consider an equilateral triangle inscribed in a circle. Suppose a chord of the circle is chosen at random. What is the probability that the chord is longer than a side of the triangle?

Bertrand’s problem can only be solved when we know the physical process that selects the chord. The Monty Hall paradox, discussed by Pearl, is also not well posed, and hence unsolvable, if we don’t have information about the way the quiz master decides which door he will open. The outcome depends on the mechanism. Jaynes and Pearl very much agree on this. Jaynes relies on his Principle of Maximum Entropy to “solve” Bertrand’s paradox. I don’t see how this could solve the puzzle of the school exam. Somehow Jaynes must put causal information in the priors.

How can Jaynes’ theory help the scientist in finding out whether two events are “in fact causally dependent’’ when probabilities are about the scientist’s “state of knowledge’’ and not about reality? After all, scientists aim at knowledge about the real causes. We are not forbidden, Jaynes says, to introduce the notion of physical causation. We can test any well-defined hypothesis. “Indeed, one of the most common and important applications of probability theory is to decide whether there is evidence for a causal influence: is a new medicine more effective, or a new engineering design more reliable?’’ (Jaynes, p. 62).

The only thing we can do is compare hypotheses given some data and compute which of the hypotheses best fits the data. Where do the hypotheses come from? We create them using our imagination and the knowledge we have already gained about the subject.

The validation of causal models

Causal diagrams are hypothetical constructs designed by the scientist based on his state of knowledge. Which of the two causal diagrams of the school exam case fits the data best? We have learned that we cannot tell based on the data in the contingency tables: both hypothetical models fit the data. Gathering more data will not help us in deciding which of the two represents reality. We can only decide when we have extra-statistical information, i.e. information about the processes that generated the data. Jaynes advocates the use of his principle of maximum entropy when we have to choose the best prior. But the causal direction is not testable by data, so I do not see how this can solve the school’s problem.

But how does Pearl justify the causal knowledge presented in a causal model? How can we decide that this model is better than that one? The hypothetical causal models are in fact theories about how reality works. We cannot evaluate and compare them by hypothesis testing. Data cannot decide about causation issues. How do we validate such a theory then? It seems that we can at best falsify them.

Pearl doesn’t give an explicit answer to this critical question in The Book of Why. The answer is implicit in the historical episodes of scientific inquiry that he writes about: the quests and quarrels of researchers searching for causes. If there is something like the truth, it is in these historical, dialectical processes, not outside them. Although it helps that now and then someone stubbornly believes that she has seen the light and fights against the establishment’s doctrine. Those are the ones that make science progress. The Book of Why contains a few examples of such stubborn characters. To quote Jaynes: “In any field, the Establishment is seldom in pursuit of the truth, because it is composed of those who sincerely believe that they are already in possession of it.” (Jaynes, p. 613). Eventually, it is history that decides about the truth.

The Big Questions: Can machines think?

In the final chapter of The Book of Why Pearl shares some thoughts about what the Causal Revolution might bring to the making of Artificial Intelligence. “Are we getting any closer to the day when computers or robots will understand causal conversations?’’ Although he is of the opinion that machines are not yet able to think, he believes that it is possible to make them think and that we can have causal conversations with machines in the future.

Can we ever build a machine that passes the Turing test, a machine that we can have an intelligent conversation with as we have with other humans? To see what it means to build such a machine and what this has to do with the ability to understand causality, consider the following two sentences (from Terry Winograd, cited in Dennett (2004)).

“The committee denied the group a parade because they advocated violence.’’

“The committee denied the group a parade because they feared violence.’’

If a sentence like these occurs in a conversation with a machine, the machine must figure out the intended referent of the (ambiguous) pronoun “they” if it is to respond intelligently.

It will be clear that in order to do this, the machine must have causal world knowledge, not just about a few sentences, or about some “part or aspect of the world’’ (which part or aspect, then?). Such a machine might also be able to see the pun in “Most drivers that are involved in a car accident have a driver’s license.’’

I worked for quite some time in the field of Natural Language Processing, building dialogue systems and artificial conversational agents. We haven’t succeeded up to now in making such a machine, although results are sometimes impressive. Will we ever be able to build such a machine? It is an academic issue, often leading to quarreling about semantics, something that Turing tried to prevent with his imitation game.

What about responsibility?

What is not an academic issue, but a real practical one, is the responsibility that we have when using machines: computers and robots that we call intelligent and to which we assign more and more autonomy and even moral intelligence.

I end my note about Simpson’s paradox, which became a sort of review of Pearl’s The Book of Why, by emphatically citing another giant in the philosophy of science, Daniel C. Dennett.

“It is of more than academic importance that we learn to think clearly about the actual cognitive powers of computers, for they are now being introduced into a variety of sensitive social roles, where their powers will be put to the ultimate test: In a wide variety of areas, we are on the verge of making ourselves dependent upon their cognitive powers. The cost of overestimating them could be enormous.’’ (D.C. Dennett in: Can Machines Think?).

“The real danger is basically clueless machines being ceded authority far beyond their competence.” (D.C.Dennett in: The Singularity—an Urban Legend? 2015)

Great books are books that make you critically reflect on and revisit your ideas. The Book of Why is a great book and I would definitely recommend that my students read it.

References

Daniel C. Dennett (2004) Can Machines Think? In: Teuscher C. (eds) Alan Turing: Life and Legacy of a Great Thinker. Springer, Berlin, Heidelberg (pp. 295-316)  https://link.springer.com/chapter/10.1007%2F978-3-662-05642-4_12  

Daniel C. Dennett (2015) The Singularity – an urban legend? Published in What do you think about machines that think? Edge.org https://www.edge.org/response-detail/26035

E.T. Jaynes (2003) Probability Theory: the logic of science. Cambridge University Press, UK, 2003.

Judea Pearl (2001) Causality: models, reasoning, and inference. Cambridge University Press, UK, reprint 2001.

Judea Pearl and Dana Mackenzie (2019) The Book of Why: the new science of cause and effect. First published by Basic Books 2018. Published by Penguin Random House, UK, 2019.

Stuart Russell and Peter Norvig (2009) Artificial Intelligence: A Modern Approach, 3rd edition. Published by Pearson, 2009.

E.H. Simpson(1951) The Interpretation of Interaction in Contingency Tables. Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 13, No. 2 (1951), pp. 238-241. Published by: Blackwell Publishing for the Royal Statistical Society. Stable URL: http://www.jstor.org/stable/2984065

Prof. D.C. Dennett and where the Power of the Computer comes from

Kein Satz kann etwas über sich selbst aussagen, weil das Satzzeichen nicht in sich selbst enthalten sein kann. (“No proposition can say anything about itself, because the propositional sign cannot be contained in itself.” Ludwig Wittgenstein, Tractatus 3.332)

The thinking machine is the result of a historical process in which man tries to express what it means to think and why it is useful to do so. I want to make clear that it is not for nothing that the thinking machine is a ‘mathematical machine’ and a ‘language machine’. I will do that in the form of a discussion with the work of D.C. Dennett.

D.C. Dennett is a scientific philosopher of mind. He is deeply concerned with questions about the human mind, consciousness, free will, and the status of man and machine in the world of creatures. The stored-program computer is a key concept in his philosophy. (For a bibliography of D.C. Dennett see
https://adamwessell.wixsite.com/bioml/daniel-dennett ).

“Thinking is hard.” We think a lot, but we often follow paths that lead us away from the truth. In “Intuition Pumps” (2013) Dennett collected a large number of stories, thought experiments that he developed in order to think properly and find answers to nasty questions. Reading the book is a good way to introduce yourself to the rich world of one of the most important thinkers of today. I have followed Dennett since he published The Mind’s I (1981) together with D. Hofstadter. I got a copy from my students when I left high school, where I taught mathematics and physics. I returned to the university where I graduated four years later on a mathematical theory about the implementation of programming languages.

As I said, the computer plays a key role in Dennett’s thinking. I always felt that there is something wrong with the way he explains how the computer works, that he misses the point, but it was always hard for me to understand what exactly it was and, most importantly, how this fits into his philosophy of mind and his idea of where the mind comes from. In this essay I try to explain where I believe Dennett misses an important point when he explains “where the power of the computer comes from”.

Dennett is an opponent of every belief that smells of magic or religion.

According to Dennett we do not need “wonder tissue” to explain the working of the human mind. When we understand what a computer can do and when we see how computers work, we will eventually see that we do not need to rely on “magic” to understand the human mind. Electronic circuits can perform wonderful things. The way the thinking computer is built is not really different from the way the human brain is built. The difference is that there was no designer that built the human brain and mind, whereas the computer was designed. I think this is an important difference. The relation we have with machines is different from the relation we have with ourselves and other living creatures.

Dennett explains to his students how the computer works in order to unveil the secrets of the power of the machine. By showing the students where the power of the computer comes from he tries to make clear that the evolution of the machine may eventually lead to a computer that equals the power of the human mind.

Where does the power of the computer come from? Or how does a computer work?

Difficult questions. For me at least. From the time I was a student (I studied mathematics and computer science in the 70s at the University of Twente in the Netherlands) these questions kept me busy. How do we have to think properly to find an answer? I read many texts that describe the working of the computer. I taught students how to program computers in various types of programming languages. I taught them in “Compiler Construction” courses how to implement higher-level programming languages. I programmed computers in order to allow people to have a conversation with the computer in Dutch or English. I gave courses in formal language theory, mathematical logic, machine learning, and conversational analysis. I taught my students to program a Universal Turing Machine or Register Machine, the basic mathematical models of the stored-program computer, precursors of all modern computers.

But I always felt that being able to program a computer, and being able to teach others how to program a Turing machine or a Register Machine does not mean that you can give a satisfying answer to the question: how does a computer work?

From Louk Fleischhacker, my master in Philosophy of Mathematics and Technology, I learned that a satisfying answer to the question of how the computer works is hard to give without understanding mathematics, without understanding what it means to compute something. The computer would not be possible without a fundamental idea in metamathematics: that the language of arithmetic can be constructed as a mathematical structure itself, and that the arithmetical and logical operations can be formalized as operations on a formal language. This language becomes the interface, a programming language, to the mathematical machine. It is not for nothing that many people answer the question of what mathematics is by saying that it is a special language. When we make a computation we manipulate tokens according to rules that we have learned.

In mathematics we use language. But we use it in a different sense than we normally use our natural language when we speak. Computers also use language. Without language the computer wouldn’t work.

There are at least two types of answers to the question how a computer works.

There is the technical answer, of the type that Dennett gives. He explains in a very clear way how the register machine works by showing his students how to program the register machine. This machine is programmed using a very simple programming language: it has only three types of instructions. Step by step he explains what the machine does with the instructions. After he has explained how the machine can be programmed to add two numbers he asks his reader to be aware of the remarkable fact that the register machine can add two numbers without knowing what numbers are or what addition is. (I emphasize “without knowing” because it is a central idea in Dennett’s thinking: many creatures show intelligent behavior “without knowing”.) Animals show very intelligent behavior too, at least to me, but does that mean they ‘know’ what they do? A tough question.

Technical answers like this never satisfied me. They do not explain what exactly we mean by phrases like “what the machine does”. I had the feeling that something essential was missing in the technical explanation. Something that remains implicit, because it is so trivial.

As an answer to “how does a computer work?” I sometimes gave my students the following demonstration.

I hold a piece of paper in front of my mouth and I shout “Move!”. The moving of the paper I then explain by saying: “you see, the paper understands my command.” In a sense (Dennett would say it “sort of” understands!). In what sense? Well, the effect of uttering the word matches its meaning: the paper moves as if it understands what I mean by uttering the word. This is an essential feature of the working of the computer. Note that the movement of the piece of paper is conditional on my uttering of the word. There is a one-to-one correspondence between the meaning of the word and the effect of uttering it. Of course computers ‘understand’ many words and sentences. But the correspondence between the physical process and the mental process that we implemented is the same as in this simple demonstration.

A replica of the Pascaline by Blaise Pascal. The pencil is used to enter the numbers by setting the wheels in the corresponding state.

If we use a computer as a word processor, it is essential that we recognize the physical tokens on the screen as words of our language. This is so trivial that we simply forget this important assumption. Of course the machine is designed this way. If the ATM says “Please insert your card”, we recognize the spoken words and understand them as a request to do what it asks us to do. At least, when it is said in the proper context of use.

The computer is a “language machine”. You instruct it by means of a (programming) language. The hardware is constructed in such a way that the effect of feeding it with the tokens satisfies the meaning that the tokens have. Therefore the programmer has to learn the language that the machine “sort-of” understands. The program is the key, the machine is the lock that does the work when handled with the proper key.

What has this to do with mathematics? Well: what is typical of mathematics is that mathematical expressions have an exact and clear meaning: there is no vagueness. There is a one-to-one correspondence between the effect of uttering the word and the physical effect caused by it, which represents the meaning of the word.

A demonstration I gave people in answer to the question “how does a computer compute the sum of two numbers?” runs as follows. By way of an example I demonstrate how a computer computes 2 plus 3. First I put 2 matches in one basket. Then I put another 3 matches in a second basket. Then, one by one, I move the three matches from the second basket to the first one. And look: the result can be read off from the first basket: five matches.

Explanation: the two and three matches stand for the numbers 2 and 3 respectively: there is a clear, unambiguous relation between the tokens (the three matches) and their meaning, the mathematical object (the number 3). The moving of the 3 matches to the first basket stands for the addition operation: a repetition of adding one until there is no match left in the second basket. The equality of the 2 and the 3 as separate units (representing the numbers 2 and 3) on the one hand and the whole of 5 matches (representing the number 5) on the other is a mathematical equality.

You might say that I execute a conditional branching instruction when doing the demonstration: if there is a match in the second basket then take one match and put it in the first basket; else stop and read off the result.

It has the general format (pattern): IF <Condition> THEN <do A> ELSE <do B>.
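As an illustration, the matches demonstration written out in this pattern (a minimal Python sketch, not part of the original demonstration):

def add(first_basket: int, second_basket: int) -> int:
    # Repeat: IF there is a match in the second basket THEN move one match
    # to the first basket ELSE stop and read off the result from the first basket.
    while True:
        if second_basket > 0:
            second_basket -= 1
            first_basket += 1
        else:
            return first_basket

print(add(2, 3))  # prints 5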

For an interesting historical overview of this type of pattern in the evolution of knowledge see (Rens Bod 2019).

But notice that my execution, too, is conditional on the procedure that I follow. In the stored-program computer this procedure is represented by a part of the machine storage. There is no difference in status between the program parts, the statements, and the numbers, the data operated on. The difference between statements or operators and numbers or operands exists only in the minds of the designer and the programmer, and in the way the parts of the machine state function.

I think most people did not take my demonstration as a serious answer to the question of how a computer works. But I believe it shows an essential feature of the computer. A feature that Dennett misses when he tries to explain the power of the computer.

The function add for adding two natural numbers can be specified in a functional programming language by means of a simple recursive function as follows.

ADD A B = IF (B = 0) THEN A ELSE ADD (A+1) (B-1)

The function shows two essential features that every programming language must have: repetition and a conditional branching instruction. The repetition is realized by means of the recursion in the definition: the function calls itself, so it is repeatedly applied until some stop condition holds true.

For example: ADD 3 2 = ADD (3+1) (2-1) = ADD ((3+1)+1) ((2-1)-1) = 3+1+1 = 5
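A direct, runnable transcription of this definition (a sketch in Python; the recursion and the stop condition carry over unchanged):

def ADD(a: int, b: int) -> int:
    # IF (B = 0) THEN A ELSE ADD (A+1) (B-1)
    return a if b == 0 else ADD(a + 1, b - 1)

print(ADD(3, 2))  # prints 5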

According to Dennett the power of the register machine is in the conditional branching instruction. This construction tells the machine to check if a certain register contains the number 0 and then take a next step based on the outcome of this check. What is so special about this instruction?

“As you can now see, Deb, Decrement-or-Branch, is the key to the power of the register machine. It is the only instruction that allows the computer to “notice” (sorta notice) anything in the world and use what it notices to guide its next step. And in fact, this conditional branching is the key to the power of all stored-program computers, (…)’’ (From: Intuition Pumps and other tools for thinking. The same text – without the bracketed sorta notice – can be found in Dennett’s lecture notes The secrets of computer power revealed, Fall 2008.)
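To make this concrete, here is a minimal register machine simulator (a Python sketch of my own; the instruction format is assumed for illustration and is not Dennett’s RodRego notation). Deb decrements a register, or branches when that register is zero:

# A program is a dict mapping a step number to an instruction:
#   ("inc", register, next_step)
#   ("deb", register, next_step, branch_step)   # Decrement-or-Branch
#   ("end",)
def run(program, registers):
    step = 1
    while True:
        instr = program[step]
        if instr[0] == "end":
            return registers
        if instr[0] == "inc":
            _, reg, nxt = instr
            registers[reg] += 1
            step = nxt
        elif instr[0] == "deb":
            _, reg, nxt, branch = instr
            if registers[reg] > 0:
                registers[reg] -= 1
                step = nxt
            else:
                step = branch        # the machine "notices" that the register is zero

# Add the contents of register 2 to register 1.
add_program = {
    1: ("deb", 2, 2, 3),   # take one from register 2, or stop if it is empty
    2: ("inc", 1, 1),      # put one in register 1 and go back to step 1
    3: ("end",),
}
print(run(add_program, {1: 2, 2: 3}))   # {1: 5, 2: 0}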

What Dennett misses, and what is quite essential, is that every instruction is a conditional instruction, not just the Deb instruction. The End instruction, for example, only does what it means when the machine is brought into a world state that makes the machine execute this instruction. Eventually this is the effect of our act of instructing the machine. When we instruct the computer by pressing a key or a series of keys, the computer “notices something in the world” and acts accordingly, for example by stopping when we press the stop button. This is precisely the feature I tried to make clear with my first demonstration with the piece of paper. The set-up of the demonstration (the piece of paper held in front of the mouth) is such that it “notices” (‘sort of’ notices, Dennett would say) the meaning of the word “move”. How do we know? Because of the way it responds to it. We see that the computer responds in correspondence with the meaning and goal of our command and we say that it “understands” what we mean.

Every instruction is conditional in the sense that it is only executed when it is actually given. Indeed, the machine ‘does not know’ what it means to execute a command. A falling stone ‘doesn’t know’ Newton’s law of mechanics. Does it? And yet, you might say that it computes the speed that it should have according to Newton’s laws when it touches the ground. Sort of.

However, Dennett is right in that the conditional instruction is special in the sense that it is the explicit form of the conditional working of the machine. But it presupposes the implicit conditional working of the instructions we give to the computer, just like the application of the formal rule of modus ponens presupposes the implicit use of this rule. (See the References and Notes for how Lewis Carroll tries to make this clear with the story “What the Tortoise Said to Achilles”.)

We call a logical circuit logical because the description of the relation between the values of the input and output of the circuit equals that of the formal logical rule seen as a mathematical operator.

The “world” that is noticed by the computer and whose value is tested in the branching instruction is in the end the input provided by the programmer by setting the initial state before he kicked off the machine to execute the instructions given.

Modern people don’t use computing machines like this. They use apps on their mobile phones or laptops. They click on an icon shown in their user window to start an application and some interaction starts, using text fields or buttons. When you ask them where the power of their computer comes from, they would probably say from the provider of their popular application, or maybe from the user-friendly functionality that the app offers them. Under the hood, hidden from the user, events or messages are sent to specific parts, objects or agents of a virtual machine. These events trigger specific actions executed by the agents or objects that receive them.

Programmers don’t write programs for a register machine in machine code. They program in a higher-level programming language like Java or Perl or some dedicated application language. Java is an object-oriented language that allows one to program applications that are essentially event-based virtual machines.

The first computing machines were constructed to automate the arithmetic operations on whole numbers. Programmers were mathematicians who built and used programs to do numerical computations. Later, in the fifties, people like Yngve at MIT wanted to use the computer to automatically translate texts written in a natural language into a second natural language. The objects to be stored and manipulated are then not numbers but words and sentences: strings, sequences of characters. They defined a string processing language so that linguists could use it in their scientific research. The very start of machine translation.

We distinguish a sentence from the act of someone expressing the sentence and meaning what it says. Somewhere in the history of mankind this distinction was made. Now we can talk about sentences as grammatical constructs, objects that somehow exist in abstraction from a person who utters them in a concrete situation. Now we talk about “truth values” of sentences, we study “How to do things with words”; words and sentences have become instruments. Similarly, we analyse “conversational behaviors” (such “tiny behaviors” as head nods and eye gazes) as abstract gestures. And we synthesize gestures in “social robots” as simulations of “human conversational agent behavior”. Many people think that we can construct meaningful things and events from meaningless building blocks if the constructs we build are complex enough. Complexity is indeed the only measure that remains for people who have a structural world view, a view that structure is basically all there is. (In Our Mathematical Universe: My Quest for the Ultimate Nature of Reality, Max Tegmark posits that reality, including life!, is a mathematical structure.)

Many people, including Dennett, think about the computer as something that is what it is in abstraction from the human mind, from the user and the designer. As if the machine is what it is without the human mind for which it is what it is and does what it does. However, the real power of the computer is in the mind of the human who organises nature in such a way that it can be used as a representation of meaningful processes. Indeed, the machine calculates ‘without knowing’ what numbers are. Why does Dennett stress this fact? Because there is somehow knowledge of arithmetic expressed in the machine, namely in the way it is constructed. The machine itself is not the subject of this knowledge; the knowledge as knowledge of numbers is outside the machine. The Dutch philosopher Jan Hollak expresses this relation using a term from phenomenology: ‘intentional correlate’. Whenever we say that a machine ‘reflects’, or ‘thinks’, or ‘notices’, we do not mean that the machine really thinks or notices (the machine does not have consciousness); what we mean is the ‘intentional correlate’ of our knowledge. The states of the machine are not just physical states; they are states that represent, for us, states of a mathematical system. Their substance is mathematical, not just natural.

The Turing test does not test how intelligent a machine is. It tests whether the human mind is already able to construct a machine that makes other humans believe that it is intelligent. This has consequences for the question of who is ultimately responsible for what machines do. It has consequences for what we mean when we talk about “autonomous machines” or “artificial intelligence”.

Dennett sees the machine and the human mind as distinct realities that can exist separately. For Dennett there is no fundamental difference between the computer that “sort of” understands and the human mind that “really” understands. The difference between the two is only gradual: they are different stages in an evolutionary process.

Can robots become conscious? Dennett answers this question with a clear yes. In a conversation with David Chalmers about the question of whether superintelligence is possible, Dennett posits:

“(…) yes, I think that conscious AI is possible because, after all, what are we?
We’re conscious. We’re robots made of robots made of robots.
We’re actual. In principle, you could make us out of other materials.
Some of your best friends in the future could be robots.
Possible in principle, absolutely no secret ingredients, but we’re not going to see it. We’re not going to see it for various reasons.
One is, if you want a conscious agent, we’ve got plenty of them around and they’re quite wonderful, whereas the ones that we would make would be not so wonderful.” (For the whole conversation (recorded 04-10-2019): https://www.edge.org/conversation/david_chalmers-daniel_c_dennett-is-superintelligence-impossible)

Can machines think? Dennett would answer this question with a clear yes, too. After all, we people are machines, aren’t we? But he doesn’t consider this question really important. I think Dennett confuses our technical reconstruction of natural intelligent behavior (for example social robots understanding natural language) with the real thing (people having a conversation).

The real challenge of artificial intelligence might not lie in this type of “philosophical” question.

According to Dennett the real challenge of AI is not a conceptual but a practical one.

“The issue of whether or not Watson can be properly said to think (or be conscious) is beside the point. If Watson turns out to be better than human experts at generating diagnoses from available data it will be morally obligatory to avail ourselves of its results. A doctor who defies it will be asking for a malpractice suit.”

The human expert will have to justify what he did with the knowledge stored in the computer. The final responsibility for the treatment chosen must always remain with the human expert. The computer may have statistical knowledge based on big data; the human expert has to relate this to the case at hand.

“The real danger, then, is not machines that are more intelligent than we are usurping our role as captains of our destinies. The real danger is basically clueless machines being ceded authority far beyond their competence.” (D.C.Dennett in: The Singularity—an Urban Legend? 2015)

I could not agree more with Dennett on this. As soon as machines are considered autonomous authorities, they stop being seen as useful technical instruments; they are then considered gods, magical masters. People should not uncritically believe the texts that ChatGPT or similar language machines generate based on large statistical (neural) models of linguistic data.

A.M. Turing, D.C. Dennett and many more intelligent minds are products of evolution. Machines are products of evolution as well. But there is a fundamental difference between natural intelligence as we recognize it in nature, a product of natural Darwinian evolution, and artificially intelligent machines that are invented by human intelligence.

As soon as we forget this, for whatever reason or by whatever cause, this important difference will disappear.

References and Notes

Rens Bod (2019/2022). Een wereld vol patronen – de geschiedenis van kennis. Prometheus, Amsterdam. Translated into English as World of Patterns: A Global History of Knowledge, 2022 (open access).

Provides a historical overview of the human quest for patterns and principles in the world that surrounds us from pre-history to 1800.

Lewis Carroll, “What the Tortoise Said to Achilles,” Mind 4, No. 14 (April 1895): 278-280.

This is a story about the Hypothetical Proposition (HP): if A and B then Z

Tortoise: I accept A and B as true, but not the HP. Convince me by logic that I have to accept Z. The Tortoise proposes to call the HP (C): if A and B then Z.

“If A and B and C are true, Z must be true,” the Tortoise thoughtfully repeated. “That’s another Hypothetical, isn’t it? And, if I failed to see its truth, I might accept A and B and C, and still not accept Z, mightn’t I?”

This amounts to: (D) If A and B and C are true, Z must be true.

And on the same reasoning the next Hypothetical Proposition is:

(E) If A and B and C and D are true, Z must be true. 

Until I have granted that, the Tortoise claims, of course I needn’t grant Z.

 “So it’s quite a necessary step, you see?”

And so on…ad infinitum.

What Lewis Carroll wants to make clear to the reader is that in applying the Hypothetical Proposition we use a rule implicitly. The Tortoise asks Achilles to write that implicit rule down, just like the HP. But this results in a new HP, which he then wants to be treated on the same level as the one before.

Compare the sequence of utterances: “It rains”, ” It rains is true”, “It rains is true is true.” Every next one is making explicit what is implicit in the previous statement.

D.C. Dennett, Intuition Pumps and other tools for thinking, W.W. Norton Publ., 2013. Translated into Dutch: Gereedschapskist voor het denken. Uitg. Atlas Contact, Amsterdam/Antwerpen, 2013.

L.E.Fleischhacker, Beyond Structure: the power and limitations of mathematical thought in common sense, science and philosophy. European University Studies 20(449), Peter Lang, Frankfurt am Main, 1995.

The modern form of objectivity is that of the mathematical object, the result of an objectifying, all-pervading mathematical attitude of thought. “The enterprise, of which this book is a report, consists of an attempt towards a systematic ‘deconstruction’ of mathematism.” Mathematism is the dogmatic view that sees mathematical structures as the essence of being. The author not only ‘deconstructs’ (in the sense of Derrida) modern mathematical metaphysics, but also formulates a new metaphysics of principles beyond Cartesianism and Hegelianism.

Goldstine, Herman H. (1972). The Computer – from Pascal to von Neumann. Princeton University Press, 1972.

Interesting history of the computer showing the development of various calculating machines from the mechanical Pascaline to the programmed electronic computer written by an insider.

Hertz, Heinrich (1894). Die Prinzipien der Mechanik in neuem Zusammenhange dargestellt. Mit einem Vorworte von H. von Helmholtz.

From the Einleitung:

The correspondence between nature and mind became the principle of information technology