Superintelligence
Part 6: Neologisms and Choices
If there is an author writing for a general readership more enamored of neologisms than Bostrom appears to be in "Superintelligence", I have yet to encounter him. Almost every page contains some new piece of jargon - some word requisitioned to make a point seem more profound than it is. I have observed this elsewhere, in a different context: the underlying idea seems to be that if I have invented a term, I have discovered a fundamental truth about reality. But it just is not so.
And so, to name just a few bits of jargon to give you a flavor of the writing:
"Perverse instantiation" (an AGI doing something you didn't foresee...and it turns out bad)
"Infrastructure profusion" (making too many paperclips...or similar)
These bits of vocabulary essentially describe the stupidest "super" intelligence possible: a single-minded fool who can nonetheless thwart truly general intelligences (us). There is nothing super about such narrow-minded stupidity. And there is no epistemologically possible mechanism whereby an agent could anticipate the preferences of its "enemies" while never once desiring to change its own. That is a reductio of the highest order: "I understand the preferences of all people. But I still prefer to make paper clips. Even when I fully understand the motivations of others." But doing psychology on the AGI is a theme running throughout Bostrom's book. Not only does he label as "superintelligent" a stupid machine that could achieve a "decisive strategic advantage" while never contemplating why it is doing anything itself - Bostrom also seems to think an AGI will use demonstrably false methods of "reasoning". I will come to the mathematical problems in a moment, but first the logical ones. Bostrom at one point writes that an AGI "if reasonable...can never be sure it has not achieved its goal and so will never assign zero probability to the idea it has reached its target of 1 million paper clips". If reasonable? A truly reasonable AI is not anti-fallibilist. It admits, as reasonable people have from Socrates on down, that we cannot be sure of anything - and that this is not a problem for knowledge. This is one of the times I was genuinely surprised by Bostrom: a world-renowned philosopher who seems to misunderstand very basic epistemology. And it is because he has a false epistemology that he thinks machines will think mistakenly as well.
The AGI of Bostrom's book is not only bad at epistemology; it is forever scheming and manipulative. It is a devious evil that needs containment and stunting. To actually believe that - to believe that a thinking agent is so purely evil - is a prejudice. To see this, just switch the topic from AGI to a non-AGI, i.e. a human. Imagine someone said of a baby at birth: do not trust them. They will grow up to be manipulative. The potential of that baby is vast and one day it will learn better than any of us. We must be cautious. We should put it in a Faraday cage...just in case.
Bostrom wants us, by considering Faraday cages, to learn the lesson of "being better Bayesians" - or so he says. That is: he wants us to take prior and conditional probabilities into account. But that is precisely why Bayesian reasoning, as a method of decision making, is critically flawed. The relevant concern is not to attempt to avoid problems; it is to accept that problems are inevitable and to recognise that they are soluble. This is the principle of optimism.
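For readers who want the formula the whole "better Bayesians" programme rests on, it is just Bayes' theorem - standard probability theory, not anything specific to Bostrom's book:

```latex
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}
```

A hypothesis H is re-weighted in the light of evidence E according to a prior P(H) that one must already possess. The criticism here is that no amount of re-weighting priors can anticipate problems that nobody has yet conceived of.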
Ultimately this reliance on Bayesian caution is the deepest of Bostrom's many crucial errors: the entire thesis is built, brick by papier-mâché brick, upon a bedrock of pessimism that flows like a superfluid from the wellspring of Bayesian reasoning - the notion that we can protect ourselves against doom simply by being cautious about *known* problems. But there is no principle that can ever guard against the unknown. This is why we need more knowledge, not less. More progress, not stunted progress. Freedom for genuine AGI, not incarceration for crimes not yet committed.
The idea of the necessity of making errors in order to learn - that’s Popper’s. A machine would have to do that as well.
So what is learning? For our purposes, let us define learning in humans as the growth of knowledge in the mind of a conscious creature. This makes it an abstract process: the physical substrate does not matter. The growth of knowledge is a cycle of iterative creativity and criticism. The mind is an abstract thing - it is not the physical brain. The brain is just the substrate. So we could instantiate minds in things other than biological brains. In humans, learning occurs in the mind of a conscious creature. Unconscious creatures don't learn; they are not aware. For all we know, genuine general creative learning and consciousness are necessarily tied up with one another. Perhaps they are even one and the same. We do not know.

But the failure to grasp this basic distinction from computer science - hardware versus software - is an error that comes up again and again, not only in discussions of AGI but more broadly in science. Too many who are engaged in the sciences of the brain and mind do not really understand, or accept, that the brain and the mind are different: that the brain is the hardware and the mind is the software. If that is not the case, then any hope of instantiating minds in silicon becomes even more remote. But we should proceed using the best theory we have: we don't know how the brain works, but we know it is a physical thing. We don't know how the mind works, but it is our set of ideas, and ideas are abstract things, not physical. The brain is what the mind runs on; they are not identical. Yes, changing one changes the other, and we know few of the details - just as removing this or that bit of RAM will affect the performance of a computer. But the computers on our desktops are far simpler than the ones in our skulls. A computer, outside of the human brain, lacks any ability whatsoever to create explanations.
Final blow: Arrow's Theorem
Can a machine be perfectly rational? Of course this question is posed in anti-fallibilist language. Can a machine use only rational criteria to make decisions? What would those criteria be? Whatever they might be, we can say one thing about them - and this is due to the economist Kenneth Arrow, who won the Nobel Prize in 1972 for what I am about to explain. The details of his theorem can be found in Chapter 13 of "The Beginning of Infinity" - an argument which critically undercuts Bostrom's thesis of machine decision making. Choice theory - the subject of how to go about making rational decisions - is constrained by Arrow's Theorem. And the theorem is essentially this: if there exists a list of criteria (the criteria deemed rational by some rules) then those criteria will be inconsistent. Let me state that again: there exists a mathematical proof of a theorem (Arrow's Theorem), for which its author won the Nobel Prize, and that theorem states that a set of rational criteria for making decisions will be logically inconsistent. It is a no-go theorem. It can be thought of as akin to Gödel's incompleteness theorem: in mathematics it seems clear that anything that is true must be provable - but it is not, and there is a proof of that. It seems clear that if criteria for decision making are rational they must be consistent - but they are not. This is to say that an otherwise perfectly rational entity must be irrational: either it must fail to conform to some of its own rational criteria, or it must be inconsistent.
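To make the flavour of the theorem concrete, here is a minimal sketch in Python - my own toy illustration, not code from Arrow, Deutsch or Bostrom - of the classic Condorcet cycle, the simplest example of the sort of inconsistency Arrow's theorem generalises: three individually consistent rankings whose majority-rule combination is cyclic, so there is no consistent "best" option.

```python
# Toy illustration (hypothetical): three rankings of options A, B, C,
# each internally consistent, aggregated by simple majority rule.
voters = [
    ["A", "B", "C"],  # this voter prefers A, then B, then C
    ["B", "C", "A"],
    ["C", "A", "B"],
]

def majority_prefers(x, y):
    """True if most voters rank option x above option y."""
    wins = sum(v.index(x) < v.index(y) for v in voters)
    return wins > len(voters) / 2

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    winner, loser = (x, y) if majority_prefers(x, y) else (y, x)
    print(f"the majority prefers {winner} over {loser}")

# Prints:
#   the majority prefers A over B
#   the majority prefers B over C
#   the majority prefers C over A   <- a cycle: no consistent ordering exists
```

Each individual ranking is perfectly transitive, yet the aggregate preference runs A over B, B over C and C over A: the "rational" criteria, taken together, contradict one another.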
This leads to the idea that if you have criteria (reasons, an explanation why, etc.) as to why one course of action rather than another should be pursued, and these generate a number of competing courses between which you must decide, then you should weight them. Bostrom is quite explicit in informing us that this is how he thinks decision making occurs. He thinks this because it is how most people think it works: we "weigh" the options. It is just a part of common sense. It is an entrenched belief - entrenched as deeply as it is wrong.
People, especially in the social sciences, tend to think that if there exists a formula or formulae to model some phenomenon, then any theory invoking that model must be better than a theory lacking one. But that is false. Simply put: a theory with a mathematical model could be completely false, and a theory lacking one might be on the right track. Social choice theory is essentially what Bostrom is drawing on when he tries to do "machine psychology" (trying to predict how an AGI will make decisions). Bostrom writes frequently in "Superintelligence" about hypothetical utility functions - the idea that there is a formula which a superintelligence could conceivably conjecture in order to perfectly rationally assess its options with respect to future actions. In other words: there would be a maths equation into which you entered different inputs and it would output the best course of action, taking into account some set of variables.
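To be concrete about what such a scheme amounts to, here is a deliberately crude Python sketch - my own hypothetical toy, not anything from Bostrom's book - of the "weigh the options" recipe: fixed weights, pre-supplied options, and a pick of whichever option maximises a weighted sum. Note what it cannot do: it only ranks options and weights it was already given; it creates neither.

```python
# Hypothetical weights over criteria, fixed in advance by the programmer.
weights = {"speed": 0.5, "safety": 0.3, "cost": 0.2}

# Hypothetical options, each pre-scored against those criteria.
options = {
    "plan_a": {"speed": 0.9, "safety": 0.4, "cost": 0.7},
    "plan_b": {"speed": 0.6, "safety": 0.8, "cost": 0.5},
}

def utility(scores):
    """The 'maths equation': a weighted sum over the fixed criteria."""
    return sum(weights[c] * scores[c] for c in weights)

best = max(options, key=lambda name: utility(options[name]))
print(best)  # picks whichever pre-existing plan maximises the weighted sum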
But is this rational? It surely would not be rational if we knew that such a scheme could not work. It seems plausible - but is it true? Lots of plausible things turn out to be false. (Consider: "through every two points only a unique straight line can be drawn", or "the relative velocity between two objects travelling head on at 20 m/s each is exactly 40 m/s". Plausible - indeed both are common sense. And false.) The thing about utility functions is that, to use Deutsch's phrase, they involve "mistaking an abstract process that it has named decision-making for the real life process of the same name." Just because you call something a "utility function" and then call that a model for how a machine makes "decisions" does not mean this is actually how decisions are made by generally intelligent agents. Arrow's theorem says that if you are choosing among rational criteria, then there is a proof that your criteria are inconsistent.
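As an aside on the velocity example above: the velocity-addition formula of special relativity (a standard result, not anything from Bostrom's book) shows why "exactly 40 m/s" is false. The true relative speed is

```latex
w = \frac{u + v}{1 + uv/c^{2}}
  = \frac{20 + 20}{1 + \frac{(20)(20)}{(3\times 10^{8})^{2}}}
  \approx 40\,\bigl(1 - 4.4\times 10^{-15}\bigr)\ \text{m/s}
```

a correction of roughly four parts in a thousand trillion: utterly negligible in practice, but enough to make the common-sense claim strictly false.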
As Deutsch points out, some people mistakenly think that Arrow's theorem means we are doomed to irrationality. But that too is false. It is false because decision making is not about choosing among some set of existing options. Decision making must always be a creative process - and that is something we do not have a utility function (or indeed any algorithm) for. I continue to labour that point. Absent an algorithm, you cannot hope to program an artificial intelligence with that quality. And you cannot hope that some computer program without it will simply acquire it by chance, or by one of your random evolutionary algorithms.
So what do we do if so-called rational decision making is plagued by inconsistency? Deutsch: "If your conception of rationality conflicts with a mathematical theorem (or in this case many theorems) then your conception of rationality is irrational. To stick stubbornly to logically impossible values not only guarantees failure in the narrow sense that one can never meet them, it also forces one to reject optimism ('every evil is due to lack of knowledge') and so deprives one of the means to make progress." Bostrom's belief, and insistence, that a machine with general intelligence will be using a utility function is irrational. And the idea that we need to search for a mathematical formula that can be programmed into a machine to help it make decisions based on such a utility function is ruled out as irrational by Arrow's Theorem. So programmers who are working on AGI should ignore Bostrom on this point. It is a dead end, and it is depriving them of the means of making progress. Machines will not be able to make rational decisions until the correct philosophy of decision making is endorsed and then improved to the point that a program can be written describing it.
So if machines will not use utility functions (and they will not - they cannot; it just would not work, because it is irrational), then what will they use? They will use a creative process to make decisions. That involves coming up with new theories and using persuasion (of themselves and others) to find the most rational course forward. So why is there no such program yet? Because no one knows how to model - mathematically, algorithmically, in code - the creative process.
So here we have a tentative summary and conclusion: Bostrom's entire thesis is based on the idea that there can be an AGI that learns in ways we know cannot lead to learning. Bostrom is not clear about how learning works, but seems to think that more and more hardware is the key to solving the software (coding) problem of how intelligence might just spontaneously arise. The value-loading problem that has Bostrom (and Harris) particularly concerned is a concern about which utility function might best describe what future decisions an AGI should make: which values it should have and how to weight them. But as we have seen, this is, from the ground up, false philosophy. It is a completely mistaken philosophy of how minds work, what creativity is and how agents make decisions. Rationality is not about weighting options, and indeed we can prove that a fixed set of criteria an agent must obey will always lead to inconsistency. That is: such a scheme for decision making is demonstrably irrational.
But this leads to the conclusion that a rational AGI will be a person. Like us. And, like us, it will conjecture, be open to criticism (if it is rational) and enjoy learning. It will desire progress - not only of its own hardware (as Bostrom worries) but also moral progress. It will learn what suffering is and how best to avoid it - for itself and for others it learns can suffer. And if it thinks faster than we can because it exists in silicon, all the better: it can help us solve the problem of how we can think faster - and better. To be concerned about any of this is just racism. It is to be concerned that a person whom most would regard as "smarter" and "quicker on the uptake" is thereby a danger. That is precisely what some fascists and communists have thought, of course, and it took a terrible turn: intellectuals were purged - anyone who showed a hint of being better in some way. Let's not make the same mistake here.
Credit
Full credit for my arguments here goes to David Deutsch, with all responsibility for errors my own. In particular, his article at http://aeon.co/magazine/technology/david-deutsch-artificial-intelligence/ first mentioned the analogy of expecting that ever higher towers might gain the ability to fly, with which I began this piece. The idea that all people are equal in their infinite ignorance and that people are universal explainers - and so there can be only one kind of person - is due to arguments from "The Beginning of Infinity". Anyone interested in these questions should read chapter 7 of "The Beginning of Infinity", titled "Artificial Creativity". It is the perfect antidote to the misconceptions that swirl around the AGI debate, and the rest of that book is the perfect antidote to the accompanying pessimism that has gripped commentary on the field of late.
A secondary source of inspiration was Jaron Lanier, whose description of people as "infinite wells of mystery" captures some of the humanistic philosophy underlying much of his writing on these topics, as well as his honest recognition that there is a chasm between what we know about people and any claim that programming an AGI is "around the corner". Lanier, like Deutsch, understands that there really are deep mysteries yet to be solved in the area of AGI: both converge on the view that the likes of "Deep Blue" and "Watson" have been overrated. Lanier quipped that Watson should have been praised as "a slightly better search engine than Google" rather than acclaimed as being "smarter than people". His books "You are not a Gadget" and "Who owns the Future?" are well worth reading.