Artificial General Intelligence That Plays Video Games: How Did DeepMind Do It?
First time accepted submitter Hallie Siegel writes: Last December, a paper titled 'Playing Atari with Deep Reinforcement Learning' was uploaded to arXiv by employees of a small AI company called DeepMind. Two months later Google bought DeepMind for 500 million euros, and this paper is almost the only thing we know about the company. A research team from the Computational Neuroscience Group at the University of Tartu's Institute of Computer Science is trying to replicate DeepMind's work and describe its inner workings.
DeepMind? (Score:5, Funny)
I've seen the next-generation after DeepMind, and it requires seven and a half million years of calculation to play a video game.
Re:DeepMind? (Score:4, Interesting)
They have Multi-GPU accelerated map reduced neural nets these days. And their comparative performance is amazingly fast, and cheap. You can even buy physical servers built for that exact function.
Re: (Score:3)
I'm guessing 42 minutes. Or is it 42 years?
Re: (Score:2)
they are a part of (ahem!) team Googie, not team IBM. Sorry article.
Both the summary and article say "google".
Opensource remake (Score:1)
https://github.com/kristjankor... [github.com]
In Python of all languages. Clearly not concerned about the AI's performance.
Re: (Score:3)
Well, you know what they say: make a proof of concept first, then make it good later (only a few people ever bother to do this).
Re: (Score:3)
Re: (Score:2)
Me, too. No, wait... that was the Minestrone / Cannabis problem. My bad. I did solve it, though. Ate every damn bit of that minestrone. I think I found some old popcorn under the sofa seats, too. Don't exactly remember.
Re: (Score:2)
Take the goat and cabbage across the river in the boat.
Leave the wolf behind.
Why the heck were you traveling with a wolf to begin with?
(ob xkcd I think).
Re: (Score:2)
Twelve years on all I remember are the basic concepts at a high level.
I formally studied AI and neural nets 25yrs ago, I recently came across this series of video lectures on YT [youtube.com]. I started watching to refresh my memory and ended up learning quite a bit of new stuff that was unknown when I did my degree. It took me about a month or so to watch the whole series, definitely worth the effort if you already have the basics, but forget it if statistical maths or matrices scare you.
Peal/Python - A toy AI doesn't need to be fast, its purpose is to play with ideas, scripts are muc
Re: (Score:2)
I remember Peal... from Bell Labs, right? Yeah, I thought it rang a bell.
Re: (Score:2)
"Clearly not concerned about the AI's performance?"
It uses Python, indeed. And for the computationally intensive tasks, it uses numpy and theano. Theano is a general symbolic computation framework that will automatically accelerate your vector computations on a nearby GPU, etc.
I don't know how it compares with DeepMind's (likely Lua/Torch-based) implementation. But assuming that scientific Python programs actually do their expensive computations in the Python VM is really rather silly.
Re: (Score:2)
Somebody else already told you about Theano. To add to that, a lot of neural net stuff gets done in Python because Theano will happily take your equation, compile it for a multi-GPU or CPU setup, optimize it, and run it fast.
A neural net is a couple of equations that need to run fast and a lot of data manipulation and visualization. Theano, Cython, a C module, pyOpenCL/pyCUDA, or something equivalent takes care of the little bit that needs to be fast.
How to do it. (Score:5, Interesting)
That's neat. The demo takes in the video from a video game of the Pong/Donkey Kong era, can operate the controls, and in addition has the score info. It then learns to play the game. How would you do that?
It's been done before, but not this generally. [google.com] "Pengi", circa 1990, played Pengo using only visual input from the screen. It had hand-written heuristics, but only needed vision input from the game. So we have a starting point.
The first problem is feature extraction from vision. What do you want to take from the image of the game that you can feed into an optimizer? Motion and change, mostly. Something like an MPEG encoder, which breaks an image into moving blocks and tracks their motion, would be needed. I doubt they're doing that with a neural net.
Now you have a large number of time-varying scalar values, which is what's needed to feed a neural net. The first thing to learn is how the controls affect the state of the game. Then, how the state of the game affects the score.
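For what it's worth, the frame-differencing idea can be sketched in a few lines (the function and frame format here are hypothetical illustrations, not DeepMind's actual pipeline):

```python
# Sketch: turn two consecutive game frames into a flat vector of
# time-varying scalars suitable as neural net input.
# Frames are plain lists of grayscale pixel rows (hypothetical format).

def frame_delta(prev, curr):
    """Per-pixel change between frames: crude 'motion' features."""
    return [c - p for prow, crow in zip(prev, curr)
                  for p, c in zip(prow, crow)]

prev = [[0, 0], [0, 255]]   # 2x2 frame: bright object bottom-right
curr = [[0, 0], [255, 0]]   # object has moved to bottom-left

features = frame_delta(prev, curr)
# Nonzero entries mark exactly where pixels changed between frames.
```

Anything more sophisticated (block matching, object tracking) builds on the same principle: the learner only needs signals that vary over time.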
I wonder how fast this thing learns, and how many tries it needs.
Re: (Score:3, Informative)
all about 'vision' (Score:2)
from the technical side, the real hurdle is vision...the ability to compute the best move is relatively easy
here's why: video games are 'AI'...we program games to 'play' us all the time...which is reacting to continually changing parameters to choose the best option for input to control the game to 'win'
from a complexity standpoint, think of the AI from a new P2P shooter's level of complexity vs a ghost in PacMan
all of 'ai' is abstractions based on arbitrary choices...in this instance they define "artificia
Re: (Score:1)
video games are 'AI'...we program games to 'play' us all the time...which is reacting to continually changing parameters to choose the best option for input to control the game to 'win'
Yes, because programs written specifically to search well-defined problem spaces and to explicitly encode tactics and gameplay as seen by game designers come even close to a program that needs to figure out the game and problem space from scratch. Even if it were fed a list of objects and their exact positions, so that image recognition was not necessary, you aren't even close to a trivial problem, nor will throwing anything resembling video game AI at it work in general. At least for image recognition there is
machine learning is optimization (Score:2, Insightful)
"machine learning" is the same as every other machine behavior: it is the product of coding instructions from humans
i don't have a problem with the language, but it's not the same as "human learning" at all
when people say a machine "learned" what they really mean is that its optimization algorithm did its programmed task effectively
"learning" = optimization over time based on parameters set by humans
Re: (Score:1)
simple hard nonsense (Score:1)
this is nonsense
you mention "determine rules" which is part of the "machine learning" linguistic contextualization
that's what my post was about...every behavior that is called "machine learning", including when this particular machine determined the rules of this game, is not in any way like the "learning" that humans
engineering not linguistics (Score:2)
again, misquoting me and using linguistics to make your case not engineering
the behavior you describe, a computer doing a task without 'a priori knowledge' is "machine learning"
my point, which you ignore, is that "play any game
Re: (Score:2)
i don't have a problem with the language, but it's not the same as "human learning" at all
I think you're probably right... but we can't prove it since we don't yet know how human learning works. Once we do know how human learning works, we will be able to program machines to learn the same way.
Re: (Score:2)
i don't have a problem with the language, but it's not the same as "human learning" at all
I think you're probably right... but we can't prove it since we don't yet know how human learning works. Once we do know how human learning works, we will be able to program machines to learn the same way.
Oh, one more point: I should mention that it seems pretty likely that once we understand how human learning works, we will actually not program machines to learn the same way; we'll program them to learn a better way.
human learning for you (Score:2)
there is no "singularity"...humans are unique and have free will and civil rights...we are always dynamic and each human learns differently...
also, we understand a lot about how humans learn...there are whole fields of inquiry in academia and professions devoted to it...you may have even met one of these people when you attended school...they study people like Vygotsky now...and integrate neuroscience into their learning models
the same neuroscience we programmers use to model computer architecture
humans are
'teh singularity' (Score:2)
this whole ontology, it's not science or engineering...it's language tricks to make us humans feel like we've accomplished something when really it's just coding...
'ai' is code...code written by humans
also, there is no specific line where we can say "we've learned everything about how humans learn"...you can't have a black/white dichotomy with an abstract idea like "learning"
"learning" is different to every h
Re: (Score:2)
Yes and no.
I agree that it's likely that there's no specific line, at least not a sharp one, but there is a qualitative difference between machine learning as we know it now, and human learning. Human learning, at least the best human learning, is about the creation of knowledge, not the acquisition of facts, nor even the identification of key facts from a larger mass (which is what machine learning as done today is about).
A key difference is explanation. A human playing Breakout not only comes up with
free will is not a religious idea (Score:2)
absolutely not...you've been reading too much Richard Dawkins...put his books down forever he's a troll on academia
**secular humanism** also holds to essentially this same view...
"every human is unique in the universe and has free will...no machine will ever have these characteristics"
not the last part about machines, but the free will aspect of human existence is **NOT TIED TO RELIGION**
here is the UN declaration of human rights: http://en.wikipedia.org/wiki/U... [wikipedia.org]
it i
Re: (Score:2)
So... if free will isn't an emergent behavior arising from complex interactions which are nevertheless inherently limited to the laws of physics, then what is it? And, more importantly, what is it about free will that makes it impossible for a machine to acquire it? What, fundamentally, is special about the computations in human brains that constitute "free will" that makes it impossible to replicate them in different hardware? Or to replicate them in the _same_ hardware? Eventually we will be able to bui
Re: (Score:2)
ha!
that's quite a few questions
so, maybe i can answer all by falsifying my argument...
the human brain does work *somehow*...IMHO we have really only just scratched the surface...really...and i hope we can agree that the whole singularity notion that because of some unscientific conjecture about processor speed that 'ai' is predictable is nonsense...
that said, i have to admit that theoretically the human mind works and is a system and therefore can (and this is very far-flung...pure conjecture) be constructe
Re: (Score:2)
i hope we can agree that the whole singularity notion that because of some unscientific conjecture about processor speed that 'ai' is predictable is nonsense...
I agree that processor speed has little if anything to do with it. It's clearly about software. If it were about speed only, then we should, right now, be able to build an artificial intelligence that runs very slowly. Perhaps it would think at a millionth of the speed of a human brain, but the processes of creative thinking would still be recognizable as such. Then we could know that we just need a computer a million times faster to match a human brain, and that further performance improvements would surpa
Re: (Score:2)
imagine a traditional computer running a fully-detailed simulation of a human brain. This simulation is an exact replica of a real human brain, and simulates every neuron, every chemical reaction, etc. It even simulates the quantum uncertainty effects at the finest level of detail.
Why would that simulation not evince "free will" (whatever that is)?
right...that is a good question
"no" is the answer, if you use legal definitions of 'free will' (or concepts similar to in practice)
"yes" is the answer if it really, truly is what you say...and we have a public debate about it and have a true democratic/legal decision...which even then would have limitations...it would be in a room on a university campus...what if we let it control a drone? whole different story...
look, we're just going to have to agree to disagree about how actually feasible what you descri
Re: (Score:2)
i meant to type:
"no" is the answer, if you use current legal definitions of 'free will' (or concepts similar to in practice)
Re: (Score:2)
"no" is the answer, if you use legal definitions of 'free will' (or concepts similar to in practice)
Cite?
look, we're just going to have to agree to disagree about how actually feasible what you describe really is...it's just so far out there...it really is, from an engineering and psychology perspective, about as likely as humans being able to travel across the whole universe and through time
Nonsense. There is a fundamental difference between something that is barred by the laws of physics and something that is perfectly possible, but just beyond our current ability. Oh, it's possible that we'll discover new physics that make supralight and time travel possible (it's even possible that the same discovery will enable both), but it's more likely, I think, that both are simply disallowed by the laws of nature.
Construction of brains, however, is incontrovertibly not barred by any physical la
"artificial intelligence" has become a religion (Score:2)
it's not "perfectly conceivable"...it's complete conjecture
like i said a few comments back, you've been watching too much sci-fi and have no concept of how this stuff is actually made
that's why i said, earlier, that i'd have to *literally* take you by the hand and have you talk to the Watson (or other ai) team, look at the codebase...because it seems that's the only way you can understand how complex this work is
here's your problem in a nutshell:
I suspect that we'll understand and be able to construct artificial intelligence before we can replicate a human brain, but I don't think either is more than 100 years away.
before what?
***we already understand "artificial intelligence"
Re: (Score:2)
like i said a few comments back, you've been watching too much sci-fi and have no concept of how this stuff is actually made
I've been consistently ignoring such snide remarks and I'm going to continue doing so... but my willingness to be so patient with your snark is wearing thin. Cut it out or I'll simply stop responding.
As for whether or not I know "how this stuff is actually made", you might consider that I'm a professional software engineer with 25 years' experience, currently working for Google. I know quite a lot about how "this stuff is actually made", including familiarity with current machine learning techniques, sinc
humans machines (Score:2)
the notion that our brains are deterministic machines
you're already a "true believer" aren't you?
the idea that the *thing that created machines* (human brain) is nothing more than a machine is ridiculous
machines are tools for humans...that's all...
our brains can be compared to machines (anything can be compared to anything else), but that doesn't mean that our brains function like machines
it's a false ontology...and it's based on your **personal beliefs** not rationality or logic
Re: (Score:2)
also, wanted to say that this is "fun" chatting with you, and I appreciate the experience you bring to the conversation...
i just disagree and have strong feelings on the subject
Re: (Score:2)
Your post is entirely reasonable except for:
"but it's not the same as 'human learning' at all."
You need to support that position.
Re: (Score:2)
the idea is that it's blatantly obvious to anyone with technical experience
however, i'd like to continue if you are, so tell me, what is acceptable "support" for that position?
remember, you quoted one phrase, but it was part of a larger conversation that started about machine vision...which I was told was an example of machines learning a new skill...which I disagreed with
it will probably be helpful for one of us to define "human learning" if we're going to really do this...i'll let you make the call
Re: (Score:2)
I'm not sure what acceptable support is, which is one of the reasons I'd like to hear whether you have any actual foundation for your argument.
There are a few learning strategies that animals, including humans, have been observed to use. One is mimicry, which has been demonstrated in primates (just recently for the first time by macaques in the wild), where one animal watches another perform a task then imitates it. Another is reinforcement learning, where an animal becomes more likely to demonstrate a gi
"obvious" and consent (Score:2)
hey thanks for the comments
You seem to be implying that humans somehow learn differently than programs because the program is "programmed" and we're not. Do you have anything to support that assertion, besides "it's blatantly obvious to anyone with technical experience?" There's fairly good evidence that we've been "programmed" very effectively, and quite beyond what most of us would like to believe, by evolution.
now...what kind of evidence could I present that would satisfy your need?
if i had access, i could take Watson or another well known AI and show you the logic schematics then show you the codebase, then demonstrate how changing the codebase changes how Watson (or w/e) behaves with highly predictable results. I could have the engineers who made Watson walk you through their entire development process, and at each point of decision, explain to you how the decision affects how it work
human learning (Score:2)
also, you may want to brush up on your education theory, because it's made leaps and bounds in the last 15 years, incorporating the exact same neuroscience that AI learning tries to use
i got an MA in Education from CU-Boulder in 2007...don't teach now, but i was genuinely impressed with how teaching has advanced as a profession
i also taught snowboarding for 6 seasons...applying the "Facilitated Learning Model", which was developed by...wait for it...education theorists at CU-Boulder
Vygotsky and Csikszentmmihal
Re: (Score:1)
vision is the only thing (Score:2)
hey...i came to the exact opposite conclusion as you...my comment is above you can check it out and tell me what you think
Re: (Score:1)
How to do it. (Score:5, Informative)
Advances in Deep Learning have made it far easier to extract features from vision -- in fact, feeding pixels straight to the neural net is pretty close to being all you need to do.
Take a look at these slides and read about convolutional neural networks: http://www.slideshare.net/0xda... [slideshare.net]
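To make the "pixels straight to the net" point concrete, here is the core sliding-window operation a convolutional layer performs (a bare-bones sketch with a hand-picked kernel; real convnets learn many kernels and add nonlinearities and pooling):

```python
# Sketch of the core of a convolutional layer: slide a small kernel
# over the image and compute a dot product at each position.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 1x2 vertical-edge detector applied to an image whose left half is
# dark and right half is bright: output is nonzero only at the edge.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 1, 1]]
edges = conv2d(img, [[-1, 1]])
```

The point of deep learning is that the kernels themselves are learned from data, so no hand-designed feature extraction stage is needed.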
interesting piece (Score:1)
Re: (Score:2)
I don't recall Reaper bots learning opponent play styles. They did learn to path around new maps (the guy who wrote the AI had a day job writing routing routines for network routers, and he applied concepts from that to game pathing), and they had a simple finite state machine that allowed them to track enemies, search for enemies, engage in combat using techniques like circle-strafing, disengage from combat if conditions like health/armour/powerups/weapons were unfavourable, etc. As far as I remember they t
Re: (Score:1)
Per the article, the AI has absolutely no knowledge of the game, what a "player" is, etc. All it has as input are the 64x64 pixels from the game, and a "score delta" (represented as -1,0,1 -- score goes up, it gets a 1 signal).
Everything else is "learned" by the engine (on its own) after repeatedly playing the game for a couple of hours. They tried this algorithm out on 7 different Atari games -- nothing learned from Game A was carried over to Game B.
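That -1/0/+1 score-delta signal amounts to clipping the raw score change to its sign, roughly like this (function name hypothetical):

```python
# Sketch of the reward signal described above: the raw score change is
# collapsed to -1, 0, or +1, so reward scale is comparable across games.

def clip_reward(score_delta):
    if score_delta > 0:
        return 1
    if score_delta < 0:
        return -1
    return 0

rewards = [clip_reward(d) for d in (250, 0, -100)]
```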
Re: (Score:2)
It's not scary. It's pretty basic.
Genetic algorithms offer a classic example: a GA evolved a chip design that could distinguish between two frequencies of an electrical input. It did so in a more efficient and smaller package than anything designed to do so. And it was so complex that (it was said, but people say a lot) nobody really understood how it worked... but it did.
The problem is that it's also not intelligence of any kind. If anything, the opposite. Pigeons, for example, if you put them in a
How to do which part? (Score:2)
What I would like to know how to do is to get $500M for so little track record, intellectual property, or even publications. I don't get it.
Re: (Score:2)
Learn about deep networks. Google is throwing money at people who can build them.
No AI Can Simulate A Video Game Tester (Score:4, Interesting)
Re: (Score:2)
Re: (Score:1)
time is money. profit = revenue - cost. unless your game company is a hobby or charity...
if it doesn't break gameplay, there's no point in fixing rare bugs like that. usually a project manager will sit with the designer and assign priority levels to defects. something that only breaks the game if the player goes outside normal bounds is going to be low priority.
also, back in the good-ol-days, program space & processor speed was limited.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I'm not talking about code testing.
Neither was the AC to whom you replied. Gameplay testing can and should be automated.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Scientific testing? (Score:1)
I find myself wondering about the following question:
How did they differentiate "learning to play the game" from "learning how to track the game's RNG"?
Most video games have ridiculously simplistic PRNGs embedded in them. An AI might get "sidetracked" and learn how to play the underlying RNG output of the game, rather than the game itself. That would yield really good results for most arcade games of this type, I imagine (weak RNG, limited input and timing options, etc.) I don't know if they ch
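To illustrate the worry: with a weak linear congruential generator, observing a single output is enough to predict every future draw (the constants below are glibc's rand() parameters, used only as an example; a game's generator would differ):

```python
# Sketch of why a simplistic PRNG is exploitable: an LCG's entire future
# sequence is determined by any one observed state value.

M = 2**31          # modulus       (glibc rand() parameters,
A = 1103515245     # multiplier     for illustration only)
C = 12345          # increment

def lcg(seed):
    while True:
        seed = (A * seed + C) % M
        yield seed

game_rng = lcg(42)            # the game's hidden generator
observed = next(game_rng)     # agent sees one output via game behavior

attacker = lcg(observed)      # seed a copy with the observed value...
future = [next(game_rng) for _ in range(3)]
predicted = [next(attacker) for _ in range(3)]
# ...and every subsequent draw is predicted exactly: future == predicted
```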
Re: (Score:2)
Re: (Score:1)
"Tracking the RNG" would help you win the game, but it doesn't tell you anything about how to play the game.
That would be my point.
This AI learns to play the game, it then wins the game using experience it gains in the same way a human does - feedback from the game score.
That is one possible interpretation, but it is not supported by the statements made so far. That is not to say that it is not the case, only that nothing I have seen so far supports it; something along the lines of "We tested this against games with multiple RNGs with no perceptible change in AI performance" would support that interpretation. There are other interpretations. People are *assuming* that "wins" = "plays the game" - and the company that did it isn't relievi
Incremental Improvement Only (Score:1)
Differentiating between the two AIs is easy. One should mostly work on all levels and the other needs to be trained on every level.
I also regret getting a masters in AI. Before, everything AI-related was awesome; now it's all trivial. This is what reinforcement learning does. Given some goal, it runs hundreds to millions of simulations, slowly reusing info from what worked better than before. If you set up the problem correctly, it will always eventually reach the goal. Some of the specifics are non-triv
It should be obvious. (Score:2)
Why is no one getting the important part? (Score:1)
Q Learning (Score:4, Informative)
The methodology DeepMind used for training the game player is based on a classical reinforcement learning algorithm called Q-learning (http://en.wikipedia.org/wiki/Q-learning), developed in the late 1980s. This approach of maximizing expected future rewards when the agent selects an action in its current state has some parallels with studies of how the basal ganglia region of our brain conducts reward learning.
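For reference, the classical tabular Q-learning update looks like this (toy states and actions, not DeepMind's actual setup):

```python
# Tabular Q-learning update:
#   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
from collections import defaultdict

Q = defaultdict(float)       # Q[(state, action)] -> expected return
alpha, gamma = 0.5, 0.9      # learning rate, discount factor
ACTIONS = ["left", "right"]  # toy action set

def q_update(s, a, r, s_next):
    best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# One experienced transition: in state "s0", moving "right" scored a point.
q_update("s0", "right", r=1, s_next="s1")
# Q[("s0", "right")] moves from 0 toward the observed reward: 0.5*(1+0-0)
```

The "lookup table" mentioned below is exactly this Q dict; DeepMind's contribution is replacing it with a deep network.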
What has been done is to approximate the action-value function Q (which originally used a lookup table) by a more general function, in order to approach problems with a much larger (or infinite) number of states. The approach here was to use a function that can fit large amounts of data, in this case a multi-layered neural network (with convnet layers that first preprocess the raw image input to identify features), to attempt to learn the game.
This was actually done a while ago by Tesauro (now at IBM Research), who used a similar temporal-difference approach to create an agent that plays backgammon at an advanced level.
The reason this is new is that in recent years we can employ cheap GPUs to learn far more quickly than on conventional CPUs, and can construct much larger and deeper networks to learn more complicated systems. Many new 'tricks' have also been developed to optimize learning in recent years (sigmoid activations replaced by the simpler rectified linear unit, dropout, etc.), so we are going to see better and more amazing uses of this relatively old technology.
Learnfun/Playfun (Score:2)
How does this compare to Learnfun and Playfun, programs publicized last year as learning and playing NES games?
http://www.cs.cmu.edu/~tom7/ma... [cmu.edu]