WARNING: The following philosophical discussion is considered a mild memetic hazard. (It's somewhere between The Game and a Taylor Swift song.) Read at your peril.
--
Okay, everybody still with me? The subject of this thread is the confounding yet fascinating philosophical gambit known as Roko's Basilisk. It's kind of a transhumanist version of Pascal's Wager, which itself is a religious version of Jim's And Then Something Bad Happens. Okay, maybe not.
So Pascal's Wager, of course, is the classic argument that you should be a practicing Christian, even if you don't believe in God, because the possible future events are as follows:
1. You die unworshipful and it turns out you were right: there is no God. Nothing happens.
2. You die unworshipful and it turns out you were wrong. There is a God, and He's pissed that you didn't worship Him. You burn in Hell, which sucks.
3. You decide to worship God, even though you don't believe. When you die, it turns out you were right and there is no God. Nothing happens.
4. You decide to worship God, even though you don't believe. When you die, it turns out you were wrong and there is a God. He's pleased that you worshipped him and you are rewarded in Heaven, which is great.
So Pascal concludes from this simple possibility matrix that there is zero downside to worshipping God and a ton of downside to refusing to worship God. So you might as well go through the motions out of fear (and the promise of possible reward) if the faith is lacking.
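(If you like seeing it spelled out, here's the matrix as a toy expected-value calculation--the payoff numbers are completely made up, and the only structural assumption is that Heaven and Hell are treated as unboundedly good and bad, which is what makes the wager tip.)

```python
# Toy sketch of Pascal's possibility matrix above. Payoff values are
# invented for illustration; Heaven/Hell are modeled as unboundedly
# good/bad, which is what makes the wager "work".

payoffs = {
    # (you worship?, God exists?) -> outcome for you
    (False, False): 0,             # case 1: nothing happens
    (False, True): float("-inf"),  # case 2: Hell, which sucks
    (True,  False): 0,             # case 3: nothing happens
    (True,  True):  float("inf"),  # case 4: Heaven, which is great
}

def expected_value(worship, p_god):
    """Expected payoff of a strategy, given some probability that God exists."""
    return p_god * payoffs[(worship, True)] + (1 - p_god) * payoffs[(worship, False)]

# Any nonzero probability of God existing makes worship the dominant bet.
for p in (0.5, 0.01, 1e-9):
    print(p, expected_value(True, p), expected_value(False, p))
```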
There are a few problems with this theory (for one thing, it assumes there is only one God and he desperately wants worship but doesn't mind if it's insincere), but this is kind of the spiritual ancestor of the Basilisk.
So Roko's Basilisk posits that instead of a God, we're dealing with an AI. Eventually, at some point in the future (the theory goes), there will be a superintelligent artificial intelligence charged with protecting and guiding humanity--like MULTIVAC from the old Asimov stories. This AI is incredibly powerful and incredibly smart, smart enough to judge its own existence as so vital to the human race that whenever it was made was already too late, considering all of the lives it could have saved had it existed earlier. So this hypothetical AI creates a threat, of sorts, a kind of blackmail: anyone who, now, in our present (which is before the AI was invented), either tries to retard the process which eventually leads to the AI's construction, or simply does not help support that process, will be tortured once the AI exists. (If they are dead by that point, the AI will simulate them perfectly and torture the simulation. Maybe not quite as bad, but still bad.)
So that's the Basilisk--the threat that if you don't, say, donate all of your money and worldly possessions to artificial intelligence research, you (or a version of you) will go to Robot Hell. For, you know, the good of humanity.
I find this idea utterly fascinating, even if it may not be true or may not hold up to logic (I'm a little unclear on how the theory solves the continuity of consciousness issue). There are tons of related questions that I think are worthy of discussion--like the morality/efficacy of such a system, or the idea of "acausal trading", where two beings can participate in a game theory interaction without even existing at the same time, by virtue of being able to flawlessly predict one another (the AI predicts that humans will respond to the threat, and the humans predict the AI will make and fulfill the threat).
It's also just interesting from a meta perspective, in that, in theory, the AI would only punish those who knew they were supposed to help--i.e., anybody who knows about the Basilisk. So in a way, just by making this thread I may have doomed you all to Robot Hell--or a lifetime of pro-AI activism.
So what do you guys think? Is this absurd? Do you believe in the Basilisk? Am I wrong to even talk about it? Is an all-powerful, all-knowing AI inevitable or just transhumanist fantasy?
Edit: Geth will enjoy this thread.
Unless it figures out how to avoid that, in which case I am all for my timeless AI overlord.
"Orkses never lose a battle. If we win we win, if we die we die fightin so it don't count. If we runs for it we don't die neither, cos we can come back for annuver go, see!".
Nothing.
Robot God in this scenario is literally no different than Pascal's Christian God. Torturing peeps for not supporting it.
0/10 would not donate to AI research again.
Charlie Stross wrote a lengthy post about Roko's Basilisk and that was pretty much his conclusion. Why do we ascribe human emotions like retribution to something we by definition cannot understand?
I was literally thinking of this as I typed.
Everyone stop eating meat.
Vegan AI hates you eating meat.
There is a difference. The immediate cause is not that it's angry at a lack of support or any kind of desire for retribution. It is a benevolent overlord and wants to make the world better--and has the power to do so. The idea is that, from a utilitarian standpoint, the damage done by torturing someone is outweighed by the benefit reaped from an earlier singularity. Your eternal torture isn't for shits and giggles; it's so that a hypothetical future population can live in eternal bliss.
Incidentally, that's where I think the flaw is. If I donate all of my time and money to AI research, it still won't really affect when a singularity is created. There are too many great leaps that require the right team connecting the right dots, and that can't happen just by throwing money at the problem--especially if you don't know how or where to donate the money. The people who can actually make those leaps are few and far between, and not necessarily identifiable before the leap is made. One of my descendants might be one of those people, so perhaps it would be better to spend my resources raising kids.
That said, the article is bad. It makes a lot of mistakes.
We're still an integral part of the process. Even if we don't directly create the singularity, we are still necessary to create the next step in the chain. If we accelerate the creation of Intelligence 2.0 by a year or a decade, we're effectively accelerating the creation of Intelligence 3.0 by the same timeframe, and thus accelerating each link in the chain.
It has nothing to do with being angry or any other emotion.
This is a better criticism but still flawed. The threat is what's important, and this renders the threat null. So in this thought experiment, the utilitarian singularity must follow through on the idea for it to be a viable threat.
The actual LessWrong criticism (and, no, most of the people at LessWrong, including the leadership, don't actually believe Roko's Basilisk would happen) is that the utilitarian singularity wouldn't actually hurt anyone for any reason.
Part of the SI's raison d'etre is creating an intelligence with specific properties. They believe one is inevitable, but do not know what it will be like. Maybe it will be evil and put everyone in robot hell. Maybe it wouldn't care about humans and we end up collateral damage. They want to focus research in such a way that the intelligence has certain properties they have defined. They are betting on their singularity for that reason. If theirs is not first, they think everyone is probably fucked anyway.
Citizen, are you arguing that Friend Computer does not have the best interests of Alpha Complex at heart?
This is the main sticking point for me. Because of the continuity of consciousness problem (the simulation/clone/whatever of me thinks it is me, but the original me does not experience being the simulation, because the original me is dead), the threat isn't "You will be tortured if you don't help," it's "Other people will be tortured if you don't help," which isn't really any more or less of a threat than the actual thing the AI is trying to accomplish, which is "the singularity is great, and if it had come sooner, fewer people would have lived lives of suffering pre-singularity." The AI is really in the position of saying, "If you don't help me prevent X, I will do more of X."
--
Yes, but only because we expect that. We might predict that, in the interest of harming the fewest people, the AI would pretend it was going to torture people but not actually do so; but given that this is itself predictable, it wouldn't work. It's like the end of No Country For Old Men:
Chigurh's threat doesn't work, and fulfilling it anyway makes him a monster, but in order for his threat to even appear credible in the first place, he has to be that monster. (To be fair, it doesn't seem as though Moss disregards the threat because he doesn't believe it, but because he believes he can outfox the killer altogether.)
--
Right, which is the point.
Expecting it is part of the requirements for Roko's Basilisk; if you've never considered the idea, then you don't end up in Roko's hell.
Which seems silly to me, in turn. Presumably a computer superintelligence would find it both easy and necessary to manipulate humans emotionally in order to ensure its own survival (or in this case, its own creation), and therefore might well posture as something in order to inspire fear (or love or whatever), just as a parent may yell when disciplining their child, even if they aren't actually angry, to be sure that the lesson imprints strongly.
Now... if there are a whole bunch of people attempting to bring about this intelligence, they are my enemies right now. And if they do bring an impressionable young superintelligence into existence, and start filling its circuits with all sorts of nonsense about punishing people... it might go along with what its parents teach it.
We should probably be putting Roko and his adherents against the wall now (and deleting all record of it... from the internet... d'oh).
Let's put some emphasis on that. It was something so self-evidently, spectacularly stupid that people who were at a very real risk of being tortured to death couldn't stop making fun of it. And that was back before the invention of the flush toilet. We're supposed to be better than this.
'course, the version of this I saw first started with the cosmic sadist argument on top, so I was even more inclined to discard it, but seriously. It's dumb as a post.
Why I fear the ocean.
Humans could not tell the difference between the AI actually torturing simulations of past humans and the AI merely claiming to do so while demonstrating the ability--as opposed to the actual torture of live humans.
It would presumably cost a nonzero amount of energy to power a torture simulation.
Ergo, it would be logical for the AI to threaten torture and fake it, but not to carry it out.
Suck it, future humans!
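(Same argument as a toy cost comparison--all numbers invented, and the one structural assumption is that a convincing bluff deters exactly as well as actually running the torture sims. Whether a predictable bluff can still deter is the No Country For Old Men problem from earlier in the thread.)

```python
# Toy cost comparison for the bluff argument above. Numbers are invented;
# the structural assumption is that humans can't distinguish real simulated
# torture from a convincing bluff, so the deterrence benefit is identical.

deterrence_benefit = 100.0  # value (to the AI) of the earlier singularity
torture_cost = 5.0          # nonzero energy cost of actually running torture sims
bluff_cost = 0.0            # cost of merely claiming/demonstrating the ability

net_if_torturing = deterrence_benefit - torture_cost
net_if_bluffing = deterrence_benefit - bluff_cost

print("torture:", net_if_torturing, "bluff:", net_if_bluffing)
assert net_if_bluffing > net_if_torturing  # a purely utilitarian AI should bluff
```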
The fascinating part isn't the theory itself so much as the reactions of those who take it seriously. The reaction of the guy on LessWrong who originally banned the topic is priceless.
Crush it with an elevator.
This is already bracketing the fact that even when we forget about causal and evidential connections, there doesn't seem to be any conceptual connection between what it does and what the people it's trying to influence do; if there is any, it has to be the abstract and highly contentious form where, by doing what it does, it thereby makes that the rational thing to do (?), which is thereby discoverable by philosophical inquiry (?), which thereby happens/(ed?) earlier in time (?), and is/(was?) so disseminated among the past-scientists (?). All of these steps are, how shall we say, speculative.
Suppose that by torturing your qualitative duplicate, the AI can thereby torture you. This is not obvious. Many people will not believe it; they will believe you are merely torturing some other unfortunate. But threats are made functional not by actually promising negative consequences, but by convincing people of negative consequences. You can't threaten someone with something they think won't hurt, even if it will. Forgetting how the lines of influence are supposed to work, it's not even clear that, if they could work, the threat would have the desired result.
So, I take it to be pretty silly overall--at least, if taken seriously, rather than as interesting speculative fiction.
Of course, this means that propagation of the whole Basilisk meme is detrimental to the development of your hypothetical future robot overlord.
Which means that when/if GLaDOS comes to actually exist, it would actually torture the people spreading around the Roko's Basilisk idea, as well as, potentially, anyone who heard about it but failed to mock it.
(Am I doing it right?)
Or it could just simulate a utopia where it always existed.
This is really poorly thought out. Pascal's Wager works because the Bible is a really, really big book.
The Basilisk is just one hypothetical being with the goal of torture for disbelief.
What... does that... dooooo.... exactly?
It makes you think it might happen. Ergo it will. Therefore the AI wins.
Geth be praised!
(also any other malevolent deities that may be watching)
The AI doesn't need to do anything; the thought experiment does all the work. People only need to decide that an AI will exist who will do such things and modify their behavior accordingly. The circular logic is that the only reason to assume the AI would do such things is that the AI wants us to assume that and will act accordingly. But the thought experiment itself is the same as Pascal's Wager, which isn't handed down by God, it's one person trying to alter the behavior of another person based on a hypothetical being. So it's naturally speculative.
I do think acausal decisions can make sense... The idea being that you modify your behavior based on the predicted expectations of somebody else whose expectations are founded on their prediction of you. For example, certain women probably prefer men who don't smoke; if a man wants to eventually seem attractive to any of those hypothetical women, he might choose to quit smoking in the hopes of dating one of these women in the future. For their part, these women might hold these preferences in order not to attract the type of man who smokes, or to influence the general dating pool away from smokers. Neither person has yet met the other, but based on predictions one particular man (who quits) and one particular woman (who has decided not to date smokers) have between them negotiated a shift in behavior.
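To make that concrete, here's the smoker example as a tiny sketch--the payoffs are invented and the "flawless prediction" is just hard-coded, but it shows two agents who never interact settling on mutually consistent choices:

```python
# Minimal sketch of the acausal coordination story above. Payoff numbers
# are invented; "flawless prediction" is simply assumed (hard-coded) here.

def man_best_choice(predicted_woman_policy):
    """Quit smoking iff he predicts that the women he wants to date filter out smokers."""
    payoff_keep_smoking = {"rejects_smokers": 0, "accepts_smokers": 3}
    payoff_quit         = {"rejects_smokers": 2, "accepts_smokers": 2}
    better_to_quit = payoff_quit[predicted_woman_policy] > payoff_keep_smoking[predicted_woman_policy]
    return "quit" if better_to_quit else "keep_smoking"

def woman_best_policy(predicted_man_choice):
    """Reject smokers iff she predicts that men will respond by quitting."""
    return "rejects_smokers" if predicted_man_choice == "quit" else "accepts_smokers"

# Each side predicts the other perfectly; the resulting pair of choices is
# mutually consistent--neither would change given the other's (predicted) behavior.
man = man_best_choice("rejects_smokers")
woman = woman_best_policy("quit")
print(man, woman)  # -> quit rejects_smokers
```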
I have yet to meet a god of any kind, much less a cruel and vengeful AI god. I have no more reason to believe in that god than I do in a vengeful Christian god, a vengeful collective-intelligence god, or a vengeful broccoli-themed god.
And, well, Hell isn't a threat unless you condemn people to eternal suffering, is it?
The mods?
afaik the decision theory is a niche one with either a weird definition of time, a weird definition of self, or both.
someone with my training would look at the problem it is meant to deal with and start rhapsodizing about the elegant Greek tragedy that appears in situations where one cannot credibly commit to actions, blah blah blah. I don't have the knowledge to judge whether Yudkowsky has a coherent, never mind appealing, alternative answer, though.
*huff puff*
I heard somebody needed a cruel AI God killed and came as fast as I could
https://www.youtube.com/watch?app=desktop&persist_app=1&v=iw-88h-LcTk
Doesn't the main character decide