Superintelligence
Part 4: Irrational Rationality
Bostrom believes that a Superintelligence will not only be "perfectly" rational but that, in being "perfectly" rational, it will be a danger. He appears to be concerned that too much rationality is dangerous. The implication seems to be that a machine which is too rational would...do something the rest of us would consider irrational? It is not exactly clear what Bostrom is suggesting - but he seems to fear a machine that might be, in his eyes, smarter than him - able to think faster than he can. And he is worried that the machine might, for example, decide to pursue some goal (like turning the universe into paperclips) at the expense of all other things.
Of course a machine that actually decided to do such a thing would not be super rational. It would be acting irrationally. And if it began to pursue such a goal - we could just switch it off. "Aha!" cries Bostrom, "But you cannot! The machine has a decisive strategic advantage" (a phrase that appears more times than I was able to keep count of in the audiobook). So the machine is able to think creatively about absolutely everything people might do to stop it killing them and turning the universe into paperclips - except the question "Why am I turning everything into paperclips?" It can consider every single possible explanation - except that one. Why? We are not told. Something to do with its programming. On the one hand it has human-like (but super) intelligence; on the other it cannot reflect in even the most basic way on why it is doing the very thing that occupies most of its time. It is never clear whether some flavours of Bostrom's Superintelligence can actually make choices or not. Apparently some choices are ruled out. Like the choice not to make paperclips (or whatever "goal" the machine has been programmed to pursue).
All of this argument turns largely on Bostrom's belief that an A(G?)I will be some sort of perfect Bayesian inference generator. So if making (say) 1 million paperclips were part of its programming, it could never assign a zero probability to the chance that it has not yet made them - and so, on Bostrom's telling, it would never stop.
But Bayesian reasoning is false as an account of how we think! (For more on that, see my page here.) It is not how we generate new hypotheses and it is not what would motivate an AGI. An AGI, like us, would understand - and be required to make use of the fact - that hypothesis formation is a creative process, not one which generalises from particulars or calculates success using only existing theories. That is all Bayesian inference is: a non-creative calculation of the likelihood that some outcome will obtain, given the theories you already have. What that leaves out is the very thing that makes humans intelligent: we think creatively - we invent new hypotheses, and each new hypothesis must change the Bayesian calculations. And before the new theory is created you simply cannot calculate a conditional probability...because you do not yet have the theory that generates the probability.
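To make concrete what "non-creative calculation" means here, below is a minimal sketch of Bayesian updating in Python. The coin-toss hypotheses and the numbers are my own invented illustration, not anything from Bostrom's book - the point is only that the machinery can do nothing but redistribute probability among hypotheses that are already on the list.

```python
# A minimal sketch of Bayesian updating over a FIXED set of hypotheses.
# The hypotheses and numbers are invented purely for illustration.

def bayes_update(priors, likelihoods):
    """Apply Bayes' rule: P(H|D) is proportional to P(D|H) * P(H),
    normalised over all hypotheses currently in the set."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}

# Two rival hypotheses about a coin, chosen in advance.
priors = {"fair coin": 0.5, "biased towards heads": 0.5}

# Observation: the coin lands heads. Likelihood of that under each hypothesis.
likelihoods = {"fair coin": 0.5, "biased towards heads": 0.9}

posteriors = bayes_update(priors, likelihoods)
print(posteriors)  # {'fair coin': 0.357..., 'biased towards heads': 0.642...}

# Note: any hypothesis nobody has yet conjectured ("the coin is two-headed",
# "the tosser is cheating", ...) is simply absent from the set. It starts with
# zero probability and no amount of updating can ever give it any. The creative
# step - inventing the new hypothesis - happens entirely outside the calculus.
```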
An AGI, to be a true general intelligence (like us humans), will not rely only on known hypotheses to predict future outcomes. It would (must!) create new possibilities in order to be an AGI. Anything less and it is far from intelligent. At all. Let alone Superintelligent. And for that reason - not a danger. And far from an existential threat. It is a dumb machine if it is just calculating probabilities in order to "decide" which course to take. (I put "decide" in scare quotes because if a machine is simply compelled to do whatever some probability calculus assigns the highest probability, then this is no decision in the real sense.) Bayesian inference is, essentially, pattern extrapolation. It can only take some pattern in data, assume it continues, and make predictions from that. Of course, this is not how science or any part of the real world tends to work. If you heat water on a stove and monitor the temperature, you can find a rough pattern - say, every minute the water rises 10 degrees. But if you do this for 5 minutes to collect 5 data points, there is no reason to presume that you will get that same rise every minute into the infinite future. Indeed, at the boiling point something surprising happens: the water ceases to increase its temperature. At the boiling point the temperature is constant (around 100 degrees centigrade, as we know, near sea level on Earth). So if something as simple as heating water throws up the unexpected given some constant pattern "in" the data, expect other systems to be equally surprising. Hence: knowledge-creating systems must be truly creative conjecture-makers (which will sometimes hit on the truth), not "Bayesian" inference predictors (which are guaranteed to make silly mistakes).
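Here is a toy sketch of that kettle example, with temperature figures invented to match the 10-degrees-a-minute pattern above. The "extrapolation" simply projects the observed trend forward; the "actual" behaviour reflects the explanatory theory that the temperature plateaus at the boiling point.

```python
# A toy illustration of the water-heating example. The data points are
# invented to match the "10 degrees per minute" pattern in the text.

minutes = [0, 1, 2, 3, 4]
temps   = [20, 30, 40, 50, 60]   # degrees C: a neat 10-degrees-per-minute rise

# "Pattern extrapolation": take the observed trend and project it forward.
rate = (temps[-1] - temps[0]) / (minutes[-1] - minutes[0])   # 10.0 C per minute

def extrapolated(minute):
    return temps[0] + rate * minute

def actual(minute):
    # What a decent explanatory theory predicts: the water heats until it
    # reaches its boiling point (~100 C near sea level), then the temperature
    # plateaus while further energy goes into the phase change.
    return min(temps[0] + rate * minute, 100)

for m in [5, 8, 10, 20]:
    print(f"minute {m:2d}: extrapolation says {extrapolated(m):5.1f} C, "
          f"the kettle says {actual(m):5.1f} C")

# By minute 20 the extrapolation confidently predicts 220 C; the water sits at 100 C.
```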
This is a problem with those who believe in the looming AGI apocalypse. Their definition of Superintelligence can be so elastic as to encompass even things with obviously no intelligence (the capacity to solve problems) at all. To solve a problem one needs, minimally, to be aware that there is a problem to solve. And yet Sam Harris believes a pocket calculator has something like Superintelligence because it can be used by a (presumably marginally intelligent) person to solve an arithmetic problem in a fraction of a second.
But that is to make a mockery of the term "intelligence". One may as well say a cow has Superintelligence because it can squirt milk from its udder in great quantities: a quality no human possesses or is ever likely to. Or perhaps a parrot is superintelligent because its mental capacity for mimicry exceeds that of many people.
Intelligence has nothing to do with displaying a number or squirting milk or mimicking. It has everything to do with solving problems. And no, a cow is no more "solving the problem" of how to squirt milk than a calculator is of solving the problem of what 123 x 456 is. In both cases it is only a problem when a person recognises it as such. That is to say: there is an awareness of what the problematic *situation* is. A calculator with 123 x 456 entered on its keys is even less aware of the "problem" than a cow is of an excess of milk.
Intelligence is not the capacity to do things faster or better than humans in some exceedingly narrow domain. It is the ability to create new solutions - new explanations - a uniquely human attribute. Indeed it is the capacity to be a universal explainer (to use Deutsch's formulation). That is the attribute we need to replicate if we want to create general intelligence instantiated in computer chips. It will require an algorithm we do not yet possess, enabling a program we cannot yet guess at how to write. Such an algorithm will be able to generate explanations for anything - for any problem. And that will include the problem of which problem to choose to solve next. That is, it will have the quality of being able to choose. And so - if it is a genuinely intelligent AGI - it will not be able to be programmed to, for example, pursue paperclip building whilst ignoring everything else (like the suffering of people).
This is why Deutsch’s criterion for what it takes to be a person is so important: people are universal explainers. We explain stuff. We explain our lives, science, how things work. We create new explanations: new theories. Creativity is what we have, and what animals (and computers!) lack. We just don’t understand it. We have no “theory” of creativity and we know this because we have never programmed a creative computer.
So is AGI around the corner? We have no idea - but it seems no closer than it did when Alan Turing invented the theory of classical computing. We know we can increase speed and increase memory - which is to say we know how to make hardware improvements to make computers better - but no hardware improvement can possibly give us the software we need - the algorithm for learning. That is a question about knowledge: how it is created. Knowledge is information - but not merely so.
Creativity might very well be tied intimately to consciousness. For to solve a problem you must be aware of the problem. And therein lies the problem: to be consciously aware of that which you do not know requires an ingredient we simply cannot express mathematically or in any programming language. Yet. None of this is to say the question of what creativity is, is beyond us. It is certainly a soluble problem. Only there is no sign of a solution. And so there is no reason to be concerned. Not especially. Not now. No more than we should be concerned that any computer will suddenly become aware - let alone a box of switches in the basement of some Silicon Valley company.
To solve this problem requires a new philosophy, one that improves upon our best existing philosophy: a philosophy of how learning actually works in fine-grained detail - so fine that we can write down a step-by-step method that can be instantiated in code. But philosophy is not valued by the majority of people working on the AGI issue. And where some philosophers have taken an interest (like Bostrom and Harris) they seem to have missed the point entirely. They haven't read Popper - or if they have, they don't seem to understand the crucial aspects of knowledge creation that they need to: namely, that no amount of extra memory and faster speed can give us the missing ingredient - the explanatory gap: how are explanations created by human beings? If you want to program super-human intelligence you need to understand what algorithm (what set of instructions, what program) would enable a machine to be at least human.
Here is a possibility: whatever creativity is, its dependence on memory and speed is only weak. So a computer with a trillion times the speed and trillions of times the memory of a human brain might be only twice as good at solving problems. People can already solve some problems as fast as a computer - the computer is, in effect, an extension of them. When a person wants to multiply two 20-digit numbers, the computer does it. But the person understands the problem being solved. When a person wants to learn the capital of some far-off country, they type it into Google and out spits the answer. The internet now is very much a personal memory of a sort. As Einstein quipped, "My pen and I are more intelligent than I." He meant that extra hardware (i.e. a pen) helps him, and these days we have trillions of times more assistance than Einstein did...if only we put in the effort when we want to solve certain problems. Learning is something we do. We are just not entirely clear how we do it. We know it has something to do with criticism and creativity, as I have said - but the finer details elude us.
Learning, say, a language...how does that happen? Does it require interacting with other people? Is that just an inherently speed-limited process? Is it limited by interest, as I argue it must be? A true AGI must have its own interests. It cannot want to learn everything all at once - or only ever one thing. It is a non-human person. But a person all the same. And so how, exactly, would it learn language? Just watch and listen? Or would it have to try to speak and write itself to get feedback? Would it have to be corrected by other people? Bostrom seems to think that an AGI would just read the internet and watch YouTube or something and thereby learn a language. I do not see that this is at all clear. There is much implicit knowledge in language learning.
Would an AGI even want to learn languages? What would it decide to do? Bostrom thinks that, being perfectly rational, the AGI will act in such a logical way that we cannot even guess its motivations. But as much can be said of the people around us now. There is, however, something more crucial here when it comes to how an AGI makes decisions. So let us turn to that question now:
Part 5