Matt Jones, Google Research
MATT JONES
With several decades as expert and opinion leader in the digital landscape, Matt Jones has influenced many of us both as Nokia’s Director of User-Experience and as the Co-founder of BERG London, an innovative design consultancy and one of the stars of the British technology scene. He writes and speaks all over the world on how technology evolves and affects all of us, and as Design Director at Google Research he is now well positioned to give us all a glimpse of what’s next.
Hello, is this working? Okay, good. First test passed. Second test is can I do a better talk than the mayor of Aarhus? I'm not used to kind of the mayor of a city giving a talk which steals most of my ideas, I think. Thirdly, a million trillion years ago before I got into technology, I was an architecture student and this was one of the buildings that I had to draw, so it's nice to be here. I think I need to move to Aarhus before they take away my EU passport. Anyway, I'm Matt. I work in Google Research. It's a group of... mainly computer scientists and engineers distributed around the globe. There's a big center in Zurich. I'm in London. There's New York, California, obviously, and a big group that I work with is in Seattle. And I'm going to talk a little bit about the work that we do, which is broadly... we tend not to talk about artificial intelligence. I'll go into maybe my personal reasons for that later. But we talk about machine intelligence. A broad set of techniques. Mainly, you've probably heard about techniques around machine learning, and I believe that one of the speakers later on today is going to speak in more depth about that. I'm also going to talk a little bit about machine perception. And a little bit about an area that we work in called machine synthesis. Which, broadly speaking, is kind of trying to understand how we can build systems that can display behaviors we might describe as creative or kind of synthetic. But first of all, who here... You probably all know more about machine learning than I do, right? Who kind of thinks they know what machine learning is? Okay, good. I'm going to explain it with a cat GIF. You know, because it's the internet. So, broadly speaking, a lot of the techniques of machine learning that you hear about employ a device, a system, a design approach called a deep neural network. And I illustrate it with this cat GIF. You have an input layer, which takes in the input from the world.
In this case, a picture of the world which has a creature lying in a laundry basket. And we ask it if it's a cat or a dog. Through understanding that photo, this input layer, and then taking these... Do you know the game Pachinko? The kind of Japanese... The balls go down. It's basically that. Over thousands and thousands of layers. It sort of discerns whether, oh, I think this is a cat, this is a dog, this is a cat, this is a dog. Through these kind of neural networks, as we go further and further down, we get to an output layer, which makes a discernment about the world. The way in which these sorting layers of neural networks are built is that they're trained. They're trained with a large amount of data, labeled photographs of whether this is more like a cat, this is more like a dog. And so we call this kind of supervised learning. And this kind of supervised machine learning is a... It's at the heart of an awful lot of the sort of applied machine intelligence that's in particularly Google products at the moment. There's another concept that I want to sort of use in the talk, which I will rely on my colleagues, Fernanda Viégas and Martin Wattenberg, to explain, because they're far cleverer than me, I think. There's a core concept in machine learning called high-dimensional space. Here's one way to wrap your head around this concept. You can think about people as being high dimensional. For example, take famous scientists. You can think about when they were born, where they were born, their fields of study. Each of these is like a dimension of that person. These dimensions become difficult to untangle when you think about different people because someone might be similar in some ways but very different in others. But this is the kind of thing you can use machine learning for. With machine learning the computer isn't told the meaning of these dimensions. It just sees them as numbers and it sees each set of numbers as a data point.
But by looking across all of these dimensions at once it's able to place related points closer together in high-dimensional space. So with that in mind, as I say, most of these kind of machine learning techniques are applied in Google products right now. For instance, Smart Reply in Google Inbox and Google Mail uses a neural network to kind of offer you three predictions about the sorts of email that you might send back to somebody, a quick email. It offers it as a button that you can push to kind of complete this task. Now most of us look at this in horror and go, well, I want to handcraft my emails to my friends and colleagues to show them. But then you realize that most of your replies actually probably fall into these three buckets, and you're actually quite grateful for machine learning. So the next thing I want to talk about is machine perception, which is the approach of using machine learning to help systems understand the human world around them, really. And this is probably most recognizable in Google Photos, where we're using many, many of the algorithms and techniques used in Google Image Search to then work across your stored photos in the cloud. So you're able to group together photographs of people who are close to you, places, objects, and things. And this is really using the power of Google's Image Search across your own photographs using machine perception techniques, computer vision techniques, image recognition techniques. But I also want to quickly mention this, which is Google Translate acting through a viewfinder, through a lens. And what this is doing is kind of painting your language over the top of the language that it sees in the world. And why this is particularly interesting, at least to myself as a designer, is that this is machine learning working inside your device. It's not accessing the network. It's not accessing the cloud, as we call it. It's actually working inside your device. It's working locally on your device.
It needs to work locally on your device for a number of reasons. One is a very good user experience reason: if you're in a foreign city, often you won't have access to the network. You might not be able to afford roaming, or you might not be able to connect. The most important reason is that for this kind of image recognition and kind of painting the pixels based on what the neural network recognizes in the world, it has to happen in real time for it to make any sense. And for it to happen in real time, it has to happen on your device. Now, you know, the wonderful thing in the last couple of years is that, oh, my laser beam is failing. But these portable devices now have vast amounts of processing power in them and can run neural networks on device. That technique is something that my group in research in machine intelligence is actively pursuing, and it's going to lead to some really powerful new capabilities for owners of smartphones in the near future. And as I said, the mayor of Aarhus stole all of my best topics. But one of the things that most of Google's machine learning, applied machine learning, is built on is a system called TensorFlow. And TensorFlow is something that we've also open sourced, and there's a large community around the TensorFlow project to use those sorts of techniques that we use in Google. So, I encourage you to go and look at that. One of the things that we did inside Google to kind of both kind of publicize TensorFlow to an extent and show what you could do with it and also kind of show with play and creativity what sorts of things machine intelligence can do is this site, aiexperiments.withgoogle.com. And I'm just going to show one by a local person, a collaboration with a local artist here: Quick, Draw! Hi, I'm Jonas. Hi, I'm Henry. Quick, Draw! is a game a few of us at Google made. You draw and the computer uses machine learning to guess what you're drawing. I see square or suitcase or canoe. Oh, I know. It's a shoe.
It's an experiment that uses some of the same technologies that help Google Translate recognize your handwriting. To understand handwriting or drawings, you don't just look at what the person drew. You look at how they actually drew it. Which strokes did they make first? Which direction did they draw it in? You train the computer on millions of characters from hundreds of languages. And over time, it learns whether you wrote look or whether you wrote book. Training is a big part of how the computer can guess your drawings correctly. As people, it's easy for us to look at these three drawings and know they're all cats. But to a computer, they're very different. One is just a head, one has a full body, and one is just facial features. They're just all cats. To get the computer to understand, you have to show it a lot of cat doodles. And then it starts to see patterns, like that almost all doodles of cats have pointy ears, a small nose, and whiskers. Of course, it doesn't always work. That's because it's only seen a few thousand doodles. But the more you play with it, the more it will learn, and the better it will get at guessing. Oh, I know. It's cat. We put it on the web for anyone to play with. We hope it inspires other people to think about fun ways to use machine learning. You can play it at g.co/aiexperiments. I think the amount of photons in this beautiful hall are messing with my controller. I'll just put these on. Oh, what just happened while I was messing around? Can somebody click that? I'm sorry. Stop messing around. It gets serious. So the next thing I want to talk about, and this is something that we're really just kind of like scratching the surface of right now, is this area of machine synthesis. Who saw something like this about a year ago? Yeah. And was it because you had eaten some cheese that was a little bit old? No? Okay. This is something that we published in Google Research, actually in 2015.
So it's about a year and a half old now, maybe a little longer. And we called it Deep Dream. It's a technique of sort of reversing the machine perception approaches. And I'll let my boss, Blaise Agüera y Arcas, explain it. He did a fantastic TED Talk, which I'll point you to if you want to know more about this. But just this excerpt should explain what we're talking about here. And about a year ago, Alex Mordvintsev on our team, he decided to experiment with what happens if we try solving for X given a known W and a known Y. In other words, you know that it's a bird, and you already have your neural network that you've trained on birds, but what is the picture of a bird? So it turns out that by using exactly the same error minimization procedure, one can do that with the network trained to recognize birds, and the result turns out to be a picture of birds. So this is a picture of birds generated entirely by a neural network that was trained to recognize birds, just by solving for X rather than solving for Y, and doing that iteratively. So this is a very interesting launching off point for a lot of research. We know that there's a link between our sensing and playing in the world, our learning, and our ability to then turn play into creativity and synthesis and create new things. What we're sort of scratching at here is if we turn the work in machine perception the other way around and ask a system trained to recognize things in the world what it thinks it should paint, if you like. If you ask it what a bird looks like, you get back this kind of really interesting sort of Picasso-like interpretation. But by looking at this, we can actually start to understand more about the relationship between perception and synthesis, and we hope kind of use this as a launching off point to do more work in sort of understanding how we can use machine learning systems as kind of exoskeletons or kind of assistants for human creativity.
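The "solving for X given a known W and a known Y" idea from the excerpt can be sketched in a few lines. This is only a toy illustration, not the Deep Dream code: it assumes a one-layer "network" (a logistic unit with made-up fixed weights `w` and bias `b`), and every name in it (`predict`, `solve_for_x`, the numbers) is hypothetical. The weights stay frozen, and the same gradient-based error minimization normally used for training is applied to the input instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Y for a given X: how strongly the frozen model says 'bird'."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def solve_for_x(w, b, y_target, x0, steps=500, lr=0.1):
    """Hold W fixed and nudge X until the model outputs y_target.

    Uses the cross-entropy error, whose gradient with respect to the
    input is (y - y_target) * w for a single logistic unit.
    """
    x = list(x0)
    for _ in range(steps):
        y = predict(w, b, x)
        x = [xi - lr * (y - y_target) * wi for xi, wi in zip(x, w)]
    return x


# Hypothetical "trained" weights; in a real system these come from training.
w = [2.0, -1.0, 0.5]
b = -0.5

x_start = [0.0, 0.0, 0.0]          # a blank "image"
x_dream = solve_for_x(w, b, 0.95, x_start)

score_before = predict(w, b, x_start)
score_after = predict(w, b, x_dream)
print(round(score_before, 3), round(score_after, 3))
```

In an image model X would be the pixels and the update would be backpropagated through many layers, but the loop has the same shape: the weights never change, only the input does.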
And part of how we're doing that is we have a program called Artists and Machine Intelligence that we're running. And so this is pairing up Google researchers, computer scientists and engineers with artists, visual artists, literary writers, architects, designers, people who don't come from the field of machine learning, or even computer science or technology even, and getting them to kind of explore what's possible with a computer scientist or engineer. And so this is a traditional Japanese dancer, butoh dancer, who worked with an engineer in our group to kind of use the deep dream techniques and machine perception techniques as part of her performance. So her dance that she's staging is being interpreted by the machine, and then projected back onto her and reacting to her movements. So it becomes a partnership between her and the machine in some ways. So a little departure. This is where we sort of get into kind of like how might this kind of affect how we design going forward to focus on kind of partnerships between humans and machine intelligences. Kevin Kelly, who's written a lot about this kind of stuff, most recently in a book called The Inevitable, looks back to look forward. So he looks to the beginning of the 20th century and the process of electrification. He starts to apply that to the kind of proliferation of machine intelligence in the world. And he calls that cognification. He believes that we are at a turning point similar to the electrification of the world, but he calls it the cognification of the world. I find that very, very interesting. And if you are sort of like looking back to look forward, you might look back and sort of see, you know, the dawn of electricity, very concentrated in large cities, very concentrated on the elites, if you like, all concentrated in one generating place and then spiraling out. Again, the mayor took my best ideas.
You know, the restaurant is serving the food rather than people cooking for each other. But I think we're moving. So if you sort of think about kind of data centers as being the sort of power plants of cognification at the moment, that's where we're at. And one thing that we're doing to kind of move forward on that and break that down is that we're offering, again, the machine learning power that we've developed in the data centers as something that you can buy as a small business or an entrepreneur or even a large business through the Google Cloud platform. But the thing that I think is interesting is kind of moving to this world where there's both electricity at the center, if you like, and the generation and usage of electricity at the edge. The invention of the fractional horsepower motor revolutionized how our cities were designed and how our products were designed because motive power could be made very, very small and placed in smaller and smaller things powered by smaller and smaller amounts of electricity. And if we look back to the example that we already have in our pockets of Google Translate through the lens, then we start to see these neural networks compressed down and put on our devices, running in real time, running very fast, running on very small amounts of power. Neural networks are actually very, very compact. They're very small. They're very well suited to these types of devices. So we start to move from the giant power plants, if you like, running these machine learning techniques, to the devices in our pockets that we own running these sorts of applications. I might skip over this even though it's my favorite GIF in the world. There you go. I like showing this because I think we often think of machine learning and artificial intelligence, and this word artificial intelligence makes us think of brains and very sort of complex things. 
You know, higher levels of consciousness and cognition and creativity and all of those sorts of things, which of course we're looking at. But we have neurons all over our body. We think all the way through ourselves. Our knees are thinking. We have these very, very small processes running the whole time, which are sort of cognitive processes, but they're subconscious. They keep us breathing. They make our knees move when somebody hits one with a hammer. And I think I would like people who are designing and engineering and working in these kinds of things, to not only think about machine learning being about this stuff, but also being much more about this stuff. These very small processes that we can use to enhance what we do every day. So I made my excuse to show my knee GIF, so I'm happy. One other thing I want to talk about that our group has been working on is a technique which makes this kind of edge learning or edge cognition much more feasible. And it's something that we just started. It was just published two weeks ago, I think. So this is fresh off the press. It's a technique called federated learning. If you want to know far more about it, you can go and read the Google research blog about it. But basically what this does is it allows the neural networks that are small and compact and efficient, as I've said, to run on device over here. Sorry, my dot is not working. So you have your own machine learning running on your smartphone. It's running at the edge. It's yours. Your kind of tiny brain is learning from you and your world. When you plug in your phone in a completely privacy-preserving manner, we can then upload the learning from your model to combine with others' learning through the day to create an update of the kind of view of the world, if you like, and sort of recycle back down to personal devices overnight. So you have the best of both worlds. 
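The loop described above (learn on the device, send back only what was learned, combine, send the combined model back down) can be sketched as a simple weighted average of locally trained models. This is a minimal sketch under loud assumptions: the "model" is a two-parameter line rather than a neural network, the device data is invented, the function names are hypothetical, and the privacy-preserving machinery mentioned in the talk is not modelled at all. The only property it demonstrates is that raw examples never leave their device.

```python
def local_update(weights, examples, lr=0.05, epochs=5):
    """Train a private copy of the shared model on one device's own examples."""
    w, b = weights
    for _ in range(epochs):
        for x, y in examples:
            error = (w * x + b) - y
            w -= lr * error * x
            b -= lr * error
    return w, b


def federated_round(global_weights, devices):
    """Each device trains locally; the server averages the returned weights.

    Only (w, b) pairs are combined here, never the examples themselves.
    Devices with more examples get a proportionally larger say.
    """
    total = sum(len(examples) for examples in devices)
    avg_w, avg_b = 0.0, 0.0
    for examples in devices:
        w, b = local_update(global_weights, examples)
        share = len(examples) / total
        avg_w += share * w
        avg_b += share * b
    return avg_w, avg_b


# Invented on-device data; every point happens to lie on y = 2x + 1.
devices = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]

model = (0.0, 0.0)
for _ in range(200):               # e.g. one round per overnight charge
    model = federated_round(model, devices)

print(round(model[0], 2), round(model[1], 2))
```

No single device sees enough of the line to be sure of it, but the averaged model settles on a slope near 2 and an intercept near 1, which is the "wisdom of the crowd" part of the description.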
You have the ability to have a very personalized machine intelligence working for you at the edge, but you also benefit from the wisdom of the crowd. And it's done through this completely privacy-preserving mechanism. We think this is incredibly exciting and could enable some really, really wonderful uses of machine learning in real time on device in the near future. Like so. So now, as I said, some more personal views about the near future, I guess. As a designer, particularly working in technology, we kind of labor under the perceptions that are created by mainly Hollywood. So for the last 10 years, if you like, designers had to go into client meetings and get told to make things that look like Minority Report. And kind of, you know, our hands flailing around, it turns out the world does not look like Minority Report. It looks like this. But for the next 10 years, I think we're going to go into kind of meetings or discussions, and people are going to expect the world of machine intelligence to be, you know, voiced by Scarlett Johansson and kind of be this kind of very human approach to applying machine intelligence. And I think that is sort of inevitable, but it's also kind of a missed opportunity. There are other ways to kind of think about how this stuff can work. And the way that I like to think about it, maybe because I'm lazy, it takes a lot of work to make Scarlett Johansson kind of talk to you. I like to think about reaching for something nearer, first of all. And we have a long history of working with kind of things, dogs, animals, horses, companions that extend us, that have different senses, have different abilities, have different ways of thinking, but they are part of a team with us. There's a great book that I read some years ago by Donna Haraway, a philosopher, which is called The Companion Species Manifesto.
And it's about her relationship with the dogs in her life, working dogs, and animals in her life, working animals, and how they are part of a greater system that enhances her and she also enhances their lives, she thinks. Philip Pullman's books have this kind of conception of the dæmon. Here we see the little girl, the heroine of the books, Lyra, with her dæmon, Pantalaimon, who is outside of her and has different senses and can do different things, but it is actually her. It's an extension of her. And I think kind of, like, looking to almost like different fictions and different thinking about where we might head would help us kind of broaden our horizons and come up with better solutions. This is a favorite book of mine that I read late last year called H is for Hawk by Helen Macdonald. It's about training a hawk. And she writes very beautifully about her relationship with this animal that she can't imagine how it's thinking or how it's kind of seeing the world, but it's pleasurable for her to try. And so she kind of has this kind of very, very close partnership and relationship with this thing which is outside of her that she's sort of imagining through her eyes. And finally, I want to talk a little bit of... I'm just showing you weird GIFs that I like from the Internet. I'm sorry. But I want to talk a little bit about another kind of mythology in a way from kind of familiars and dæmons. I want to talk about centaurs. And this comes from Garry Kasparov, the chess grandmaster's revelation, after he was defeated by IBM's giant Deep Blue chess-playing machine. He actually came back and sort of created a new form of chess, which gets called... When I first read about it, and this sort of captured my imagination, it was called centaur chess. And it's about not having kind of humans versus machines, but teams of humans and machines competing against other teams of humans and machines.
And again, as the mayor said, before I got to speak, which I think is quite unfair, the humans and the machines bringing out the best in each other. So I'm kind of quite taken by this pattern of, well, centaur chess, what other things could we sort of create centaur teams around, using the best of what machine intelligences can do, perceive the world in different ways, crunch large amounts of data, and the best of what humans can do, be intuitive, be creative. This is some work by Robin Sloan, an American writer and technologist, who created a word processor which learnt all of the open source available science fiction that it could find on the web. And it became his writing partner. And it spouts nonsense. And it spouts nonsense in a sort of auto-complete way as he types. But he uses it just as a sort of exercise to get around writer's block. He just starts typing. And he won't ever use, you know, a huge percentage of what the machine suggests. But it'll be enough for his vastly superior intelligence to kind of create. And I find this kind of centaur pattern very, very, very interesting. So the last film I'll show is something along these lines called AI Duet. Hi, I'm Yotam. This is an experiment called AI Duet. It uses machine learning to let you play a duet with the computer. Making music using code isn't a new thing at all. But machine learning gives us a different way to go about it. If I was trying to make AI Duet with more traditional programming, I'd have to write out lots of rules. Like, if someone plays a C, maybe respond by going up to a G. Or if someone plays three ascending notes, then maybe go back down. I'd basically be creating this map to tell the computer how to make these decisions. But there are just too many note and timing combinations to map it all by hand. This experiment approaches the problem differently, using machine learning. Specifically, neural networks. We played the computer tons of examples of melodies.
Over time, it learns these fuzzy relationships between notes and timings, and builds its own map based on the examples it's given. So in this experiment, you play a few notes. They go to the neural net, which basically decides, based on those notes and all the examples it's been given, some possible responses. I had some friends try it out. It was fun to see how it responds to different things. It picks up on stuff like key and rhythm that you're implying, even though I never explicitly programmed in the concepts of key and rhythm. It was cool to see people use it in ways I didn't expect. Instead of taking turns, a few people played it at the same time as the neural net's response, kind of getting in a creative feedback loop with the computer. It's also fun to just mash the keyboard. The neural net tries to return something coherent from any input that you give it. I made all of the code open source, and the neural net that I'm using is from Google's open source Magenta project, so anyone can grab it and train their own net. I wanted to put this experiment out there just as an example of the many kinds of things you can make with machine learning and music, and I'm really excited to see what other people do. You can play with it at g.co/aiexperiments. So I think I'll probably leave it there. I think the last thing that I wanted to say, perhaps because I flew into Billund last night and it was on my mind, is we have to create these systems in a way that play and constructive creativity by the most amount of people is possible. And again, the mayor stole my stuff. Damn it. But being able to show... I think as technologists, we often talk about seamless solutions, and seamless networks. Seamless is the worst thing in the world. The best thing in the world are seams. Being able to see how things are made, being able to see... I see my kids look at Lego bricks and they can figure out how things are put together.
It's the most genius thing in the world is that you can see how... The most genius thing about what happened here was that it exposed the seams. It exposed how things are made. We need to do that. We need to stop talking about magic and seamlessness and all of those things that technologists and designers like to talk about. We need to talk about beautiful seams. And we need to talk about it in a way that's legible and understandable by the most amount of people so that the most amount of people can build this stuff going forward. Thanks very much for your time.