
Artificial General Intelligence That Plays Video Games: How Did DeepMind Do It?

First time accepted submitter Hallie Siegel writes: Last December, a paper titled 'Playing Atari with Deep Reinforcement Learning' was uploaded to arXiv by employees of a small AI company called DeepMind. Two months later, Google bought DeepMind for 500 million euros, and this paper is almost the only thing we know about the company. A research team from the Computational Neuroscience Group at the University of Tartu's Institute of Computer Science is trying to replicate DeepMind's work and describe its inner workings.
  • DeepMind? (Score:5, Funny)

    by ArcadeMan ( 2766669 ) on Thursday September 25, 2014 @03:25PM (#47997287)

    I've seen the next-generation after DeepMind, and it requires seven and a half million years of calculation to play a video game.

  • https://github.com/kristjankor... [github.com]
    In Python of all languages. Clearly not concerned about the AI's performance.

    • Well, you know what they say: make a proof of concept first, then make it good later (only a few people ever bother to do this).

    • by jdavidb ( 449077 )
      I took a graduate neural networks class in 2002 and did my implementation in Perl using PDL. The professor desperately pushed MATLAB on everybody but left us free to choose our own implementation language, and I chose Perl. I felt I understood neural networks pretty well at the end of the project. Twelve years on, all I remember are the basic concepts at a high level.
      • Twelve years on, all I remember are the basic concepts at a high level.

        I formally studied AI and neural nets 25 years ago. I recently came across this series of video lectures on YouTube [youtube.com]. I started watching to refresh my memory and ended up learning quite a bit of new stuff that was unknown when I did my degree. It took me about a month or so to watch the whole series; definitely worth the effort if you already have the basics, but forget it if statistical maths or matrices scare you.

        Peal/Python - A toy AI doesn't need to be fast; its purpose is to play with ideas, scripts are muc

        • by fyngyrz ( 762201 )

          Peal/Python - A toy AI doesn't need to be fast...

          I remember Peal... from Bell Labs, right? Yeah, I thought it rang a bell.

    • by paskie ( 539112 )

      "Clearly not concerned about the AI's performance?"

      It uses Python, indeed. And for the computationally intensive tasks, it uses numpy and Theano. Theano is a general symbolic computation framework that will automatically accelerate your vector computations on an available GPU, etc.

      I don't know how it compares with DeepMind's (likely Lua/Torch-based) implementation. But assuming that scientific Python programs actually do their expensive computations in the Python VM is really rather silly.
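
      For a sense of what that looks like in practice, here's a minimal sketch of the pattern (shapes and names are made up for illustration): declare the math symbolically, let Theano compile it, and keep only glue code in the Python VM.

          import numpy as np
          import theano
          import theano.tensor as T

          # Symbolic declarations: Theano builds a computation graph from these.
          x = T.matrix('x')  # a batch of input vectors
          W = theano.shared(np.random.randn(4096, 512).astype(theano.config.floatX), name='W')
          b = theano.shared(np.zeros(512, dtype=theano.config.floatX), name='b')

          # One sigmoid layer, written as plain math on the symbols.
          h = T.nnet.sigmoid(T.dot(x, W) + b)

          # theano.function compiles the graph to optimized native code; with
          # device=gpu in the Theano config, the same call targets the GPU.
          forward = theano.function(inputs=[x], outputs=h)

          batch = np.random.randn(128, 4096).astype(theano.config.floatX)
          activations = forward(batch)  # the heavy math runs outside the Python VM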

    • by ceoyoyo ( 59147 )

      Somebody else already told you about Theano. To add to that, a lot of neural net stuff gets done in Python because Theano will happily take your equation, compile it for a multi-GPU or CPU setup, optimize it, and run it fast.

      A neural net is a couple of equations that need to run fast, plus a lot of data manipulation and visualization. Theano, Cython, a C module, pyOpenCL/pyCUDA, or something equivalent takes care of the little bit that needs to be fast.

  • How to do it. (Score:5, Interesting)

    by Animats ( 122034 ) on Thursday September 25, 2014 @03:38PM (#47997429) Homepage

    That's neat. The demo takes in the video from a game of the Pong/Donkey Kong era, can operate the controls, and also gets the score info. It then learns to play the game. How would you do that?

    It's been done before, but not this generally. [google.com] "Pengi", circa 1990, played Pengo using only visual input from the screen. It had hand-written heuristics, but only needed vision input from the game. So we have a starting point.

    The first problem is feature extraction from vision. What do you want to take from the image of the game that you can feed into an optimizer? Motion and change, mostly. Something like an MPEG encoder, which breaks an image into moving blocks and tracks their motion, would be needed. I doubt they're doing that with a neural net.

    Now you have a large number of time-varying scalar values, which is what's needed to feed a neural net. The first thing to learn is how the controls affect the state of the game. Then, how the state of the game affects the score.

    I wonder how fast this thing learns, and how many tries it needs.

    • Re: (Score:3, Informative)

      According to their paper, DeepMind's Q-learning is indeed passing simplified, vectorized Atari screen pixels straight into a neural net. There's no MPEG or other pre-encoding of the screen, just conversion to grayscale and downscaling to 64x64 pixels.
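
      In numpy terms, that whole preprocessing step is only a few lines. A rough sketch (the exact target resolution and any cropping vary between descriptions, and scipy's imresize is just one convenient way to downsample):

          import numpy as np
          from scipy.misc import imresize  # one of several ways to downsample

          def preprocess(frame):
              # frame: raw RGB Atari screen, e.g. a 210x160x3 uint8 array
              gray = frame.mean(axis=2).astype(np.uint8)  # naive grayscale: average RGB
              small = imresize(gray, (64, 64))            # downscale to the net's input size
              return small.astype(np.float32) / 255.0     # scale to [0, 1] for the net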
      • from the technical side, the real hurdle is vision...the ability to compute the best move is relatively easy

        here's why: video games are 'AI'...we program games to 'play' us all the time...which is reacting to continually changing parameters to choose the best option for input to control the game to 'win'

        from a complexity standpoint, think of the AI in a new P2P shooter vs a ghost in Pac-Man

        all of 'ai' is abstractions based on arbitrary choices...in this instance they define "artificia

        • by Anonymous Coward

          video games are 'AI'...we program games to 'play' us all the time...which is reacting to continually changing parameters to choose the best option for input to control the game to 'win'

          Yes, because programs written to specifically search well-defined problem spaces and to explicitly encode tactics and game play as seen by game designers come even close to a program that needs to figure out the game and problem space from scratch. Even if fed a list of objects and their exact positions, so that image recognition was not necessary, you aren't even close to a trivial problem, nor will throwing anything resembling video game AI at it work in general. At least for image recognition there is

          • "machine learning" is the same as every other machine behavior: it is the product of coding instructions from humans

            i don't have a problem with the language, but it's not the same as "human learning" at all

            when people say a machine "learned" what they really mean is that its optimization algorithm did its programmed task effectively

            "learning" = optimization over time based on parameters set by humans

            • by Anonymous Coward
              That post seems to have no relevance to the post you replied to, which didn't use "machine learning" or even the word "learning" at all. You originally made the claim that the difficult part was the image recognition, and implied that the rest was trivial. Your reply here doesn't address the criticism at all that very general algorithms, which have to both determine the rules and then optimize for them, are a lot less explored than image recognition.
              • Your reply here doesn't address the criticism at all that very general algorithms, which have to both determine the rules and then optimize for them, are a lot less explored than image recognition.

                this is nonsense

                you mention "determine rules" which is part of the "machine learning" linguistic contextualization

                that's what my post was about...how every behavior that is called "machine learning" including when this particular machine determined the rules of this game, it was not in any way like "learning" that humans

            • i don't have a problem with the language, but it's not the same as "human learning" at all

              I think you're probably right... but we can't prove it since we don't yet know how human learning works. Once we do know how human learning works, we will be able to program machines to learn the same way.

              • i don't have a problem with the language, but it's not the same as "human learning" at all

                I think you're probably right... but we can't prove it since we don't yet know how human learning works. Once we do know how human learning works, we will be able to program machines to learn the same way.

                Oh, one more point: I should mention that it seems pretty likely that once we understand how human learning works, we will actually not program machines to learn the same way; we'll program them to learn a better way.

                • there is no "singularity"...humans are unique and have free will and civil rights...we are always dynamic and each human learns differently...

                  also, we understand a lot about how humans learn...there are whole fields of inquiry in academia and professions devoted to it...you may have even met one of these people when you attended school...they study people like Vygotsky now...and integrate neuroscience into their learning models

                  the same neuroscience we programmers use to model computer architecture

                  humans are

              • Once we do know how human learning works, we will be able to program machines to learn the same way.

                this whole ontology, it's not science or engineering...it's language tricks to make us humans feel like we've accomplished something when really it's just coding...

                'ai' is code...code written by humans

                also, there is no specific line where we can say "we've learned everything about how humans learn"...you can't have a black/white dichotomy with an abstract idea like "learning"

                "learning" is different to every h

                • Yes and no.

                  I agree that it's likely that there's no specific line, at least not a sharp one, but there is a qualitative difference between machine learning as we know it now, and human learning. Human learning, at least the best human learning, is about the creation of knowledge, not the acquisition of facts, nor even the identification of key facts from a larger mass (which is what machine learning as done today is about).

                  A key difference is explanation. A human playing Breakout not only comes up with

                  • What you're saying is fundamentally religious

                    absolutely not...you've been reading too much Richard Dawkins...put his books down forever, he's a troll on academia

                    **secular humanism** also holds essentially this same position...

                    "every human is unique in the universe and has free will...no machine will ever have these characteristics"

                    not the last part about machines, but the free will aspect of human existence is **NOT TIED TO RELIGION**

                    here is the UN declaration of human rights: http://en.wikipedia.org/wiki/U... [wikipedia.org]

                    it i

                    • So... if free will isn't an emergent behavior arising from complex interactions which are nevertheless inherently limited by the laws of physics, then what is it? And, more importantly, what is it about free will that makes it impossible for a machine to acquire it? What, fundamentally, is special about the computations in human brains that constitute "free will" that makes it impossible to replicate them in different hardware? Or to replicate them in the _same_ hardware? Eventually we will be able to bui

                    • ha!

                      that's quite a few questions

                      so, maybe i can answer all by falsifying my argument...

                      the human brain does work *somehow*...IMHO we have really only just scratched the surface...really...and i hope we can agree that the whole singularity notion that 'ai' is predictable because of some unscientific conjecture about processor speed is nonsense...

                      that said, i have to admit that theoretically the human mind works and is a system and therefore can (and this is very far-flung...pure conjecture) be constructe

                    • i hope we can agree that the whole singularity notion that 'ai' is predictable because of some unscientific conjecture about processor speed is nonsense...

                      I agree that processor speed has little if anything to do with it. It's clearly about software. If it were about speed only, then we should, right now, be able to build an artificial intelligence that runs very slowly. Perhaps it would think at a millionth of the speed of a human brain, but the processes of creative thinking would still be recognizable as such. Then we could know that we just need a computer a million times faster to match a human brain, and that further performance improvements would surpa

                    • imagine a traditional computer running a fully-detailed simulation of a human brain. This simulation is an exact replica of a real human brain, and simulates every neuron, every chemical reaction, etc. It even simulates the quantum uncertainty effects at the finest level of detail.

                      Why would that simulation not evince "free will" (whatever that is)?

                      right...that is a good question

                      "no" is the answer, if you use legal definitions of 'free will' (or concepts similar to in practice)

                      "yes" is the answer if it really, truly is what you say...and we have a public debate about it and have a true democratic/legal decision...which even then would have limitations...it would be in a room on a university campus...what if we let it control a drone? whole different story...

                      look, we're just going to have to agree to disagree about how actually feasible what you descri

                    • i meant to type:

                      "no" is the answer, if you use current legal definitions of 'free will' (or concepts similar to in practice)

                    • "no" is the answer, if you use legal definitions of 'free will' (or concepts similar to in practice)

                      Cite?

                      look, we're just going to have to agree to disagree about how actually feasible what you describe really is...it's just so far out there...it really is, from an engineering and psychology perspective, about as likely as humans being able to travel across the whole universe and through time

                      Nonsense. There is a fundamental difference between something that is barred by the laws of physics and something that is perfectly possible, but just beyond our current ability. Oh, it's possible that we'll discover new physics that make supralight and time travel possible (it's even possible that the same discovery will enable both), but it's more likely, I think, that both are simply disallowed by the laws of nature.

                      Construction of brains, however, is incontrovertibly not barred by any physical la

                    • it's not "perfectly conceivable"...it's complete conjecture

                      like i said a few comments back, you've been watching too much sci-fi and have no concept of how this stuff is actually made

                      that's why i said, earlier, that i'd have to *literally* take you by the hand and have you talk to the Watson (or other ai) team, look at the codebase...because it seems that's the only way you can understand how complex this work is

                      here's your problem in a nutshell:

                      I suspect that we'll understand and be able to construct artificial intelligence before we can replicate a human brain, but I don't think either is more than 100 years away.

                      before what?

                      ***we already understand "artificial intelligence"

                    • like i said a few comments back, you've been watching too much sci-fi and have no concept of how this stuff is actually made

                      I've been consistently ignoring such snide remarks and I'm going to continue doing so... but my willingness to be so patient with your snark is wearing thin. Cut it out or I'll simply stop responding.

                      As for whether or not I know "how this stuff is actually made", you might consider that I'm a professional software engineer with 25 years' experience, currently working for Google. I know quite a lot about how "this stuff is actually made", including familiarity with current machine learning techniques, sinc

                    • the notion that our brains are deterministic machines

                      you're already a "true believer", aren't you?

                      the idea that the *thing that created machines* (human brain) is nothing more than a machine is ridiculous

                      machines are tools for humans...that's all...

                      our brains can be compared to machines (anything can be compared to anything else), but that doesn't mean that our brains function like machines

                      it's a false ontology...and it's based on your **personal beliefs** not rationality or logic

                    • also, wanted to say that this is "fun" chatting with you, and I appreciate the experience you bring to the conversation...

                      i just disagree and have strong feelings on the subject

            • by ceoyoyo ( 59147 )

              Your post is entirely reasonable except for:

              "but it's not the same as 'human learning' at all."

              You need to support that position.

              • the idea is it's blatantly obvious to anyone with technical experience

                however, i'd like to continue if you are, so tell me, what is acceptable "support" for that position?

                remember, you quoted one phrase, but it was part of a larger conversation that started about machine vision...which I was told was an example of machines learning a new skill...which I disagreed with

                it will probably be helpful for one of us to define "human learning" if we're going to really do this...i'll let you make the call

                • by ceoyoyo ( 59147 )

                  I'm not sure what acceptable support is, which is one of the reasons I'd like to hear whether you have any actual foundation for your argument.

                  There are a few learning strategies that animals, including humans, have been observed to use. One is mimicry, which has been demonstrated in primates (just recently for the first time by macaques in the wild), where one animal watches another perform a task then imitates it. Another is reinforcement learning, where an animal becomes more likely to demonstrate a gi

                  • hey thanks for the comments

                    You seem to be implying that humans somehow learn differently than programs because the program is "programmed" and we're not. Do you have anything to support that assertion, besides "it's blatantly obvious to anyone with technical experience?" There's fairly good evidence that we've been "programmed" very effectively, and quite beyond what most of us would like to believe, by evolution.

                    now...what kind of evidence could I present that would satisfy your need?

                    if i had access, i could take Watson or another well known AI and show you the logic schematics, then show you the codebase, then demonstrate how changing the codebase changes how Watson (or w/e) behaves with highly predictable results. I could have the engineers who made Watson walk you through their entire development process, and at each point of decision, explain to you how the decision affects how it work

                  • also, you may want to brush up on your education theory, because it's made leaps and bounds in the last 15 years, incorporating the exact same neuroscience that AI learning tries to use

                    i got an MA in Education from CU-Boulder in 2007...don't teach now, but i was genuinely impressed with how teaching has advanced as a profession

                    i also taught snowboarding for 6 seasons...applying the "Facilitated Learning Model", which was developed by...wait for it...education theorists at CU-Boulder

                    Vygotsky and Csikszentmmihal

        • Check out this mini robot learning to physically fly light aircraft: http://airsoc.com/articles/vie... [airsoc.com] It uses vision and robot arms to operate scaled-down controls on a flight simulator.
    • How to do it. (Score:5, Informative)

      by Jmstuckman ( 561420 ) on Thursday September 25, 2014 @04:38PM (#47997893) Journal

      Advances in Deep Learning have made it far easier to extract features from vision -- in fact, feeding pixels straight to the neural net is pretty close to being all you need to do.

      Take a look at these slides and read about convolutional neural networks: http://www.slideshare.net/0xda... [slideshare.net]
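
      For anyone who hasn't met convolutional nets: the core operation is just sliding a small filter across the image, and the filters themselves are learned. A toy single-channel sketch (illustrative only; real implementations use many filters, stacked layers, and optimized GPU kernels):

          import numpy as np

          def conv2d_valid(image, kernel):
              # 'Valid' 2-D convolution of one image with one filter (strictly
              # cross-correlation, which is what convnets actually compute).
              ih, iw = image.shape
              kh, kw = kernel.shape
              out = np.zeros((ih - kh + 1, iw - kw + 1))
              for y in range(out.shape[0]):
                  for x in range(out.shape[1]):
                      out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
              return out

          # A 3x3 vertical-edge filter applied to a downsampled screen, then ReLU:
          screen = np.random.rand(64, 64)
          edges = np.maximum(conv2d_valid(screen, np.array([[1, 0, -1],
                                                            [2, 0, -2],
                                                            [1, 0, -1]])), 0)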

  • interesting piece...
  • The AI itself seems alright, although Atari-era graphics and gameplay are extremely simplified compared to the imagery and real-world dynamics that robotics struggles with routinely. So, for example, the AI doesn't seem necessarily very advanced compared to a self-driving car.

    What I would like to know is how to get $500M with so little track record, intellectual property, or even publications. I don't get it.

  • by __aaclcg7560 ( 824291 ) on Thursday September 25, 2014 @04:31PM (#47997853)
    When I worked as a video game tester for Accolade/Infogrames/Atari (same company, different owners, multiple identity crises), I drove the programmers nuts on a racing title. Most video game players will play a race from beginning to end. Not an experienced video game tester. I would stop the vehicle just before the finish line, turn around or drive in reverse, and crash the game by crossing the starting line. The programmers would complain that no one plays a racing game that way, try to wiggle out of fixing their code, and fix the bug only when it prevented them from going to code release. This is why testing automation is never used in the video game industry.
  • I find myself wondering about the following question:

    How did they differentiate "learning to play the game" from "learning how to track the game's RNG"?

    Most video games have ridiculously simplistic PRNGs embedded in them. An AI might get "sidetracked" and learn to play the underlying RNG output of the game, rather than the game itself. That would yield really good results for most arcade games of this type, I imagine (weak RNG, limited input and timing options, etc.) I don't know if they ch
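
    For a sense of how weak those generators are: a textbook linear congruential generator, the kind that era of games typically shipped, is a one-liner, and its entire output sequence is fixed by the seed and the number of calls made so far (the constants below are glibc's, purely for illustration):

        class LCG(object):
            # Textbook linear congruential generator: state = (a*state + c) % m.
            def __init__(self, seed, a=1103515245, c=12345, m=2**31):
                self.state, self.a, self.c, self.m = seed, a, c, m

            def next(self):
                self.state = (self.a * self.state + self.c) % self.m
                return self.state

        rng = LCG(seed=42)
        # e.g. enemy movement choices: identical on every run with the same seed
        print([rng.next() % 4 for _ in range(8)])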

    • "Tracking the RNG" would help you win the game, but it doesn't tell you anything about how to play the game. This AI learns to play the game, it then wins the game using experience it gains in the same way a human does - feedback from the game score. There' nothing really "new" in any of this, if you want a really impressive demo of what this kind of AI can do then pop over to youtube and watch the videos of IBM's "Watson" beating the snot out of the best human players in the TV game show "Jeopardy ". When
      • by sdeath ( 199845 )

        "Tracking the RNG" would help you win the game, but it doesn't tell you anything about how to play the game.

        That would be my point.

        This AI learns to play the game, it then wins the game using experience it gains in the same way a human does - feedback from the game score.

        That is one possible interpretation, but it is not supported by the statements so far. That is not to say it is not the case, only that it is not supported by what I have seen; something along the lines of "We tested this against games with multiple RNGs with no perceptible change in AI performance" would support that interpretation. There are other interpretations. People are *assuming* that "wins" = "plays the game" - and the company that did it isn't relievi

        • by Anonymous Coward

          Differentiating between the two AIs is easy. One should mostly work on all levels and the other needs to be trained on every level.

          I also regret getting a masters in AI. Before, everything AI-related was awesome; now it's all trivial. This is what reinforcement learning does. Given some goal, it runs hundreds to millions of simulations, slowly reusing info from what worked better than before. If you set up the problem correctly, it will always eventually reach the goal. Some of the specifics are non-triv
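
          The "reusing info from what worked" part is usually an epsilon-greedy policy: mostly exploit the best known action, occasionally explore a random one. A minimal sketch (the Q table and action set are assumed to exist):

              import random

              def epsilon_greedy(Q, state, actions, epsilon=0.1):
                  # Explore with probability epsilon; otherwise exploit the
                  # action with the highest estimated value in this state.
                  if random.random() < epsilon:
                      return random.choice(actions)
                  return max(actions, key=lambda a: Q.get((state, a), 0.0))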

  • They used a fish. Didn't you see it on Twitch or wherever?
  • OMFG, these guys talking about pixels (soon they will write OpenCV stubs in comments). It's not about the video or the picture or anything like that; it's about the fact that, without any knowledge of what the game is, the AI can learn the structure and the rules of the game.
  • Q Learning (Score:4, Informative)

    by Giant Robot ( 56744 ) on Thursday September 25, 2014 @06:05PM (#47998491) Homepage

    The methodology DeepMind used for training the game player is based on a classical reinforcement learning algorithm called Q-learning (http://en.wikipedia.org/wiki/Q-learning), developed in the late 1980s. This approach, in which the agent selects the action in the current state that maximizes expected future rewards, has some parallels with studies of how the basal ganglia region of our brain conducts reward learning.
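
    The classic tabular form is a one-line update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max over a' of Q(s',a') - Q(s,a)). As a sketch in code (alpha is the learning rate, gamma the discount on future rewards):

        from collections import defaultdict

        Q = defaultdict(float)  # the lookup-table form of the Q function

        def q_update(s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
            # Nudge Q(s, a) toward the observed reward plus the discounted
            # value of the best action available in the next state.
            best_next = max(Q[(s_next, a2)] for a2 in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])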

    What has been done is to approximate the Q function (which originally used a lookup table) with a more general function, in order to attack problems with a much larger (or infinite) number of states. The approach here was to use a function that can fit large amounts of data, in this case a multi-layered neural network (with convnet layers that preprocess the raw image input to identify features), to attempt to learn the game.
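
    With a function approximator, the same update becomes a gradient step on the TD error. In the simplest linear case, Q(s,a) = w . phi(s,a) for some feature function phi (a sketch; DeepMind's version puts a deep convnet in place of the linear model):

        import numpy as np

        def q_approx_update(w, phi, s, a, r, s_next, actions, alpha=0.01, gamma=0.99):
            # Q(s, a) is approximated as a dot product of weights w with
            # features phi(s, a); the TD error acts as a regression residual.
            q_sa = w.dot(phi(s, a))
            target = r + gamma * max(w.dot(phi(s_next, a2)) for a2 in actions)
            w += alpha * (target - q_sa) * phi(s, a)  # gradient step on squared TD error
            return w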

    This was actually done a while ago by Tesauro (now at IBM Research), who used the same approach to create an agent that learned to play backgammon at an advanced level.

    The reason this is new is that in recent years we can employ cheap GPUs to learn much faster than with conventional CPUs, and can construct much larger and deeper networks to learn from more complicated systems. Also, many new 'tricks' have been developed to optimize learning in recent years (sigmoid activations replaced by the simpler rectified linear function, dropout, etc.), so we are going to see better and more amazing uses for this relatively old technology.
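
    Both of those tricks are tiny in code terms. Roughly (numpy sketch, using the usual 'inverted' dropout scaling at training time):

        import numpy as np

        def relu(x):
            # Rectified linear unit: the simpler replacement for sigmoid.
            return np.maximum(x, 0.0)

        def dropout(x, p=0.5, rng=np.random):
            # Randomly zero a fraction p of activations during training and
            # rescale the survivors so expected values match test time.
            mask = rng.binomial(1, 1.0 - p, size=x.shape)
            return x * mask / (1.0 - p)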

  • How does this compare to Learnfun and Playfun, programs publicized last year as learning and playing NES games?

    http://www.cs.cmu.edu/~tom7/ma... [cmu.edu]

"Marriage is low down, but you spend the rest of your life paying for it." -- Baskins

Working...