And as Neil says, I also consider myself the product person at Exponential View. How many of you receive that, by the way? Yay, okay, the rest of you, there's the sign-up URL. And I have a little bit of background in the machine learning and AI space from a product and founder perspective. So my last company, PeerIndex, did large-scale machine learning across the Twitter graph, and was acquired a couple of years ago. I've been an investor in a company called Evi. Does anybody here have the Amazon Echo? Of course Rob does, yeah. So Evi is inside the Echo and does some of the inferencing there. And I was an investor in Powerset, which is in Cortana, which nearly none of you will use, but it's from Microsoft. And some current companies that I help: one is called Weave, which is doing what they call an AI operating system layer; re:infer, which does some very interesting deep learning around natural language; and Seldon, which is machine learning as a service. So with this audience, I don't really need to show this slide, which is what is AI. But essentially, just to clarify: where AI lives in science fiction is up there; where AI lives in investor world is sort of down there. How many of you have invested in companies you'd consider AI? So, like 60% of the group. So it is a tough audience, okay, good. So let's talk about where companies are playing right now. Up in the red ball is artificial superintelligence. This is the Nick Bostrom stuff, where the AI that's making paper clips decides to take all the world's resources to make more paper clips and we all die. Not many people admit to actually playing there now, although I think a number of people would like to. The bulk of activity for startups is in artificial narrow intelligence, where we close the domain down. And naturally so: we're a small company with a small team, we have to focus on a niche. And that's where a lot of the interesting work is.
The big majors, Google and Amazon and so on, are starting to play in this kind of human-ish space of generalized interfaces using AI. And the way they're probably doing that is by faking it: knitting together lots of narrow artificial intelligences and applications, rather than actually having come up with a real solution to how you build a generalized artificial intelligence that looks like a human. And again, a lot of the newspaper headlines tend to be dominated by the yellow and the red balls, whereas most of the opportunity is down in the green one. And it also encapsulates, I don't know if you can read this, but this encapsulates the umbrella term that is AI. So AI meant something deeply unfashionable 10 years ago, and now it's really sexy. So lots of things that didn't consider themselves AI five years ago now do: machine learning and natural language processing and symbolic reasoning and control theory and so on. So it is a distracting term in some sense. And I guess you've all seen the CB Insights data as well, which shows that money is really flowing into AI deals right now, although we haven't really seen any stellar exits come out of AI. It's worth saying something about the incumbents as well, which is that they are all getting really excited about it. So Google is not only putting AI into every product and emphasizing Google Now, but it's open-sourced TensorFlow, and it's made these cloud APIs available. Microsoft has done stuff that isn't just to do with that chatbot that went crazy; they've actually done some quite good stuff as well. Apple is also heavily overlooked in terms of what it's done. People ignore the fact that Siri is probably the most used AI-based interface in the world, but it starts back with, I guess, Polar Rose, which was one of the first AI acquisitions Apple made. I think the founder is here somewhere. Yep. But of course, Apple's been pretty active.
They've done about 10 or 12 deals in the space in the last few years. And it's quite interesting when you start to see people like Salesforce come in. So about a month ago, they acquired MetaMind, and last year or the year before they acquired a CRM system founded by DJ Patil, which was called Relative IQ or something. RelateIQ, thank you. Everybody knows that one, yeah. So the way I try to explain this is to say that there are seven accelerants to the AI boom, three of which are fundamental foundations: essentially Moore's Law and increasing computational power, the explosion of data, and the availability of great collaboration tools. Those three things are allowing people to take maths approaches that were developed in the 1960s, '70s, '80s, even further back, apply them at large scale to the data problems we now have, and then share those results. And one result of that is increasing returns in the performance of the systems we have. The middle three are what I would call enablers: the ways in which the value chain has changed that allow all this stuff to be more helpful. So that is the idea that software is eating the world, so that your lovely car is now just a vehicle with poorly designed software; the availability of APIs and microservices; and of course, the way that we now open source things. And those come together into the lock-in loop, which I guess is the thing that you, as investors, are all looking for: how do I build a sustainable and defensible business? But that's the framework I have used. I'm not going to dig too deeply into too much of this, because we all know Moore's Law, but I just like this slide and the next one, which is that we're now creating more transistors every tenth of a second than there are stars in the galaxy. So that's from the bucket of spurious comparisons.
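That spurious comparison is easy to sanity-check. The figures below are my own order-of-magnitude assumptions (not numbers from the slide): roughly 100 billion stars in the Milky Way, and annual transistor production somewhere around 10^21.

```python
# Back-of-the-envelope check of the "transistors vs. stars" comparison.
# Both figures are rough order-of-magnitude assumptions, not slide data.
stars_in_galaxy = 1e11          # ~100 billion stars in the Milky Way
transistors_per_year = 1e21     # rough estimate of annual transistor production

seconds_per_year = 365 * 24 * 3600
transistors_per_tenth_second = transistors_per_year / seconds_per_year / 10

print(f"{transistors_per_tenth_second:.2e} transistors per tenth of a second")
print(transistors_per_tenth_second > stars_in_galaxy)  # True under these assumptions
```

Under these assumptions the claim holds with a couple of orders of magnitude to spare, which is the point of the slide.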
But back in 1955, when we produced our first transistor, it cost, I think, $10, right? We've now got to the stage where transistors cost less than one ten-billionth of a dollar, and the volume that we now produce is hundreds of trillions of trillions every year. And it's astonishing. So that takes me to this next slide, which is just a reminder: I was probably an early teen when the Cray-2 was around. So this was what I was lusting after, aged 12. The Cray-2 cost $20 million, and its processing power was less than half of my Apple Watch. And I know Moore's Law, and I get it, and I've studied it, but this still amazes me, and that's why we're seeing what we're seeing. The other key thing that has really driven change has been the arrival of the GPU. What we show on the right is the error rate on ImageNet, which is an image classification task, in the red line, and in green the number of competing teams in this competition who used GPUs. And what you see is the error rate dropping to about 7%. Back in 2010, these algorithms would misclassify about a quarter of all images. Now it's about 1.14, so it's gone beyond that now. And it has this correlation with the deployment of GPUs, which are just much better at parallelizing these tasks. And so Nvidia has been a major driver of all of that. Okay, so one other piece that is sometimes overlooked is the importance of APIs and microservices. Was anybody here a software developer before they became an investor? Right, so you recognize that bit over there. That's the software you wrote in the 90s. It was monolithic, it was spaghetti, and it was really horrible. And in one of our previous AI winters, one of the many reasons AI failed was that it couldn't make head or tail of that mess. You know, this is the we-have-to-refactor-this-code problem.
And the problem is that you didn't have any clear, clean interfaces, so there was nothing that was easy to optimize. In today's world, that is microservices, as drawn by me: you've got these clean interfaces that are contractually bound to take certain inputs and give certain outputs. So it becomes a much narrower domain problem to solve. And so a lot of the problems that we face don't have to be solved by ourselves. And so lots of these niche applications that we see, like Amy from x.ai, and Snips, and Weave, are reliant on this ecosystem of APIs. And that is a fundamental difference from where we were 20 years ago. Viv, which is not trying to build a niche application but rather this new user interface, a voice interface, is also heavily reliant on these APIs. And Amazon's Alexa is the same thing. The thing about Alexa is that, as a developer, you can just build a skill for Alexa, which is an API connector. And that makes Alexa smarter and smarter. Briefly on the majors: many of them, though not Apple yet, have moved into open-sourcing their technology, releasing their internal machine learning frameworks. But they're also developing services that they're making available. So Google's Cloud Vision is a great example, which is effectively the same object classification system that Google uses internally, made available to any of your startups. Bit of a problem if your startup happens to be in the image classification game, because you're up against Google's standard. The last point about this, which is sometimes called building a data network effect: I think of this as the AI lock-in loop, which is essentially that if AI improves your product, your product will drive more usage, which is the blue box, which generates more data. And that data will then improve your AI, fundamentally around the edge cases, right?
Because we can always get the AI to respond to the first early cases we train it with. But it's the weird examples that only happen when you have the 200th customer or the 200,000th customer that make your product differentiate as you scale. But equally, you should then see an increase in profits, which theoretically you can invest in improving the AI. Obviously, you have startups, so that last loop won't happen until after you've sold them. The other question is then what happens to industries, right? So I've got a set of industries, not a comprehensive list, where you can't now compete unless you are using AI. You can't compete in first-person shooter video games unless you have deep AI skills. You can't compete in search unless you have deep AI skills. And once AI gets into an industry or into a subsegment, it tends to improve the products so efficiently that they become hard to compete with. And so that loop is what is causing some part of this run. So I summarized this in this really punchy slide, which is that stuff is just getting better. I mean, it is getting really, really better. And I'm going to take you through a few of these quickly. So this is object detection, and machines are now better than humans. This is the error rate. Do I have a laser pointer thingy? Does it? Oh, it doesn't look... yeah, it does. Yeah. Okay, so this is the error rate in 2010 of the best teams doing this ImageNet task, and it's 28%. We saw that in 2014 it was about 7%, with humans being about 6%. And last year we surpassed human capability. So we're starting to see in image classification and NLP that computers are generally doing better than humans at figuring out what's going on. This is the video game data that comes out of Google's DeepMind. These were all their original action-oriented games, and this is the percentage performance that you got from their DQN, the Deep Q-Network. And effectively, at video pinball, it was sort of 25 times better than a human.
It's sort of teaching itself. And when I wrote these slides, and I presented this slide about a month ago, it said that Google's DeepMind is better at action-oriented games than at exploration-oriented games. And then yesterday they beat Montezuma's Revenge, which is an exploration-oriented game with these longer-term goals. And that was in the course of a month. We know the chess story, but just to say: Magnus Carlsen, who's sort of a neighbor to here, is up at a 2,800 Elo rating. Komodo is up at 3,500 now. So chess fell 10 years ago, but the algorithms don't stop improving. Go fell a couple of months ago, and the Go computer was 2,000 processing units in total. Even in some of these things that we think are human: in the UK we put our kids through a lot of tests at school. Verbal IQ, this was last year. ConceptNet had a verbal IQ score of 100 for a four-year-old. So we had trained a bunch of deep learning networks to answer verbal IQ questions at the level of a four-year-old two years ago. So now probably a seven-year-old, eight-year-old, who knows? But it will be improving very, very fast. And then even in these more complex areas (my photo hasn't appeared), this is better-than-human suturing, sewing things up in surgery. So this is a pig's intestine, and over there is STAR with its error rate in green, and these are previous systems and humans. So it's far outperforming a trained surgeon in the act of suturing a pig intestine. Now, it can't do the task end to end: the surgeon still has to chat to the patient, and the nurse has to make the patient comfortable on the gurney, but the actual act of surgery is getting far better using the machine. And it raises all sorts of questions, like: how do you now train surgeons if they don't get to learn in the operating theatre? And then we're starting to see things like betting fall as well. So this was about a month ago.
Unanimous AI predicted the top four finishers of the 2016 Derby, and the guy who ran the test made $540 on his bet, because he only bet a buck. But he still did pretty well. And I'll quickly whiz through these, because you can look up the papers. Visual Q&A: being able to pose a question, like what vegetable is on the plate, and get back broccoli. What sport is this? Baseball. And this is approaching human standard, as is image description. So, what is this image? It's a group of people shopping in an outdoor market; there are many vegetables at the fruit stand. And back to the drivers that I talked about earlier in terms of collaboration and modularity: what was quite interesting about this one was that they used a convolutional neural net, which is good for image processing, and then they threw its output into a recurrent neural net, which is good at generating sequences like text. So you're just starting to chain approaches, and now you can see that's actually a pretty good description. The bit that annoyed me was that it said it was a fruit stand when it's talking about the vegetables, and you're sitting there going, come on. That's an amateur mistake, amateur hour. Does anybody here have the Tesla Model S with Autopilot? Someone's going to admit to it, yeah. Well, we're in Sweden with investors; someone's bound to have one. And it drives pretty well, so over a billion kilometers driven. Accidents in a Tesla in Autopilot mode have half the injury rate of a typical car accident, so you're going to start to see a decline in car accidents. And I talked about how fast things are improving. So this is what's happening in the game-playing world over the course of 12 months. You remember the Atari stuff from DeepMind that I showed earlier on. About 12 months ago, using the DQN, the Deep Q-Network, about half of the games were being played at sub-human level, and about 25% were at superhuman level.
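For context, the learning rule underneath DQN is plain Q-learning, which DeepMind scales up with a deep network instead of a table. A minimal tabular sketch on a toy corridor world (my own toy setup for illustration, nothing to do with DeepMind's Atari code) looks like this:

```python
import random

# Minimal tabular Q-learning on a toy corridor: states 0..4,
# actions 0 (left) / 1 (right), reward 1 for reaching state 4.
# A toy stand-in for the idea that DQN scales up with a neural network.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]

def step(state, action):
    nxt = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

random.seed(0)
for _ in range(500):                       # training episodes
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection: mostly exploit, sometimes explore
        if random.random() < EPSILON:
            action = random.randrange(2)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        nxt, reward, done = step(state, action)
        # Q-learning update: bootstrap from the best next-state value
        target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
        Q[state][action] += ALPHA * (target - Q[state][action])
        state = nxt

# After training, "right" should dominate in every non-goal state
print([("right" if q[1] > q[0] else "left") for q in Q[:GOAL]])
```

The dueling and prioritized variants mentioned next change how the network represents and replays these same updates, not the underlying rule.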
And then someone started to play around with this thing called the Dueling Deep Q-Network, and you'll see that human-level performance increased still further. And now there's prioritized dueling, I love these names, where you're up to almost a third of all games being played at superhuman level, and the proportion at sub-human level declining. This is just over 12 months, and this is researchers at universities, as opposed to companies focused on the profit motive, in terms of the improvement we're getting. So you can start to, again, take that process of acceleration. The foundational blocks of improved NLP, improved computer vision, APIs, microservices, and better data engineering and data quality are driving better integrated applications, whether it's customer service or some of these other apps that we will have read about. So then the question is, where does that take us in five to seven years? If you look at this wide variety of companies and this wide variety of trends, where do we think things get to? The list is really long, and I've extracted six or eight here, but the thing they have in common is this idea that Peter Drucker talked about, which is that effectiveness should be a human pursuit and efficiency should be delegated to machines. And so you're already starting to see it: in five to seven years we will see conversational agents actually doing tasks end to end; really accurate pest control, because there shouldn't be a reason for a farmer to have pests when he could have drones flying over his crops at all times figuring out what's going on; or even very early heart attack detection. So this is something that Sutter Health did with Nvidia, where they were able to identify precursors to heart attacks weeks or months before they happened. It's going to be very exciting. I feel a bit like Ray Kurzweil giving you this optimistic view of the world. It's not all going to be optimistic.
Lots is going to go wrong. So just in the last few minutes, I think there are four interesting considerations that may be things you think about as well: algorithms versus data, and which matters more; what incumbents can do against startups and vice versa; whether there's room for tool sets in this space; and what about bots. So the conventional wisdom is that the algorithms are useless without data, and so all founders need access to data; otherwise, they're just acqui-hires. I guess people may feel that. And I think there's an alternative view, which is: how much data do you actually need? Can you establish a core technology that includes all the feature engineering and parameter tuning with smaller pools of data? And is this sufficiently hard for you to build a defensible business and to get started? Right. And I think there are some investors in London who look at this more deeply than me who are asking these sorts of questions. Another question is, does your approach work on a small amount of data? I mentioned this company re:infer, which does unsupervised representation learning, and their technology essentially allows you to learn from 20 or 30 samples of text rather than 500 or 1,000 training examples and get similar quality. So you get this ability to bootstrap faster. And the last question is, can you build an effective strategy to acquire data to make your product work? There was a great blog post by somebody whose Twitter handle is Mueller-Freitag, where he talked about 10 things you could do to acquire data if you had a machine learning problem. And at PeerIndex, we faced exactly the same thing, so I thought I'd reflect on that a little. PeerIndex was trying to index all of Twitter and make predictions about people's behavior, so 300 million people. And there are these 10 strategies, and reflecting on them, we actually did many of them.
We did a lot of manual work, which means actually having interns, thank God for MBA programs, sit there and type hundreds of things into a spreadsheet. We narrowed our domain: we identified a class of predictions that we said were useful, but that we weren't going to do because they weren't essential. We crowdsourced: we developed an application which we then put out on Amazon Mechanical Turk, and we had Turkers classify things. We put the user in the loop, so we actually had our end users tell us where we were right or where we were wrong, and we would get thousands of referrals that way. We built a side product that was also a Trojan horse. So one of my colleagues, Forensis, who went on to Balderton Capital after he was at PeerIndex, designed a game called Rate My Mates, where, effectively, you were given an incentive to tell us about your friends as we marched through your Twitter network. And we took that data and used it to train our algorithms. We used a lot of public data: we had a project called Project Deep, Deep, Deep, Deep, Deep Undercover, which involved crawling various sites, grabbing the data, and keying on Twitter handles to join the data together. And it's a real pain in the ass, frankly, but it's one way that, if you don't have access to large pools of data like Google or Schibsted, you can build some sort of defensibility. The next question for me is where startups can play against incumbents who have data volumes. And this is a question whose answer is just so obvious in the venture business: you find a niche, right? You really understand that niche really, really well and you super-serve it. And then you use that niche as a beachhead to expand into other areas. And it's just as true in the AI space, which is why I think it's relevant to think about narrow AI rather than generalized AI. And in the niches you get to iterate fast and figure out a fast path to value.
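To make that earlier "how much data do you actually need?" question concrete: even a crude nearest-centroid classifier over word counts can do something useful from a handful of labelled examples per class. This is purely my own illustration of learning from tiny samples, not how re:infer or anyone else mentioned here actually works:

```python
from collections import Counter
import math

# Toy nearest-centroid text classifier: a few labelled examples per class.
# An illustration of bootstrapping from small data, not any company's method.
def vectorize(text):
    return Counter(text.lower().split())

def centroid(texts):
    # Sum word counts across a class's examples to form its centroid
    total = Counter()
    for t in texts:
        total.update(vectorize(t))
    return total

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train(labelled):          # labelled: {label: [example texts]}
    return {label: centroid(texts) for label, texts in labelled.items()}

def predict(model, text):
    v = vectorize(text)
    return max(model, key=lambda label: cosine(model[label], v))

model = train({
    "complaint": ["my order arrived broken", "this is the wrong item", "refund me now please"],
    "praise": ["great service thank you", "arrived quickly and works well", "love this product"],
})
print(predict(model, "the item arrived damaged, I want a refund"))  # complaint
```

It falls over quickly on harder inputs, which is exactly where the question of defensibility with small data begins.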
So there are some of these AI projects. Cyc is my favorite, which has been running for 35 years with this big symbolic representation of all human knowledge. About two years ago it became Lucid AI and took some venture funding. That's a long path to your Series A. And there are ways of playing that game as a startup, which is to find a niche. So x.ai's Amy is probably a good example. Howdy is another that does some bot stuff. And Quid, which is a bit more mature, it's done a Series C now, is a B2B business that focuses on extracting intelligence out of blog posts. And then there's a question for me about tool sets. If there is an AI boom, do you want to just make some shovels and other things? And I think there is a case for tool sets, because people know what the business benefits of AI are now, and as a business person it's inescapable. But there's a real shortage of talent. And I'd be interested in people's perspectives on where they see the talent shortage. We see lots of people coming out with master's degrees and PhDs in graph methods or machine learning in general. We don't see many people who are AI product people, and we don't see many people with the data engineering skills that you often need to implement these things. There's a big headache because of the complexity of training data and bootstrapping. And there are all these questions of time to market. So all of these suggest that people can come out and solve those problems for you. And MetaMind was just acquired by Salesforce, a $30 million acquisition or thereabouts. So it's not going to pay back your fund, but it's still an acquisition. And you've got these guys like Seldon, who I help out from time to time, and CrowdFlower. So I think there will be a lot of tool sets coming out there, because a lot of this stuff is hard and there's just a skill shortage. One other area is how all this connects with the bot boom. So just a second audience question.
Anybody invested in a bot company that does stuff? Yes. So we've got some investments in bot companies. And the argument goes this way: we need a new distribution outlet, because nobody downloads things from app stores anymore; conversational UI is going to get better; the AI approaches result in better completion; and we're going to move towards voice-driven interfaces. And humans like chatting. We like chatting, and you can start to establish an empathetic relationship with your customer through the chat channel, which will result in more retention. And actually, the thing I find I like in the chat services I use is that my history of interactions is human-readable. So I use a chat service called Dispatch, which doesn't have AI in it yet, it's a real human, to order presents, predominantly for my mum, because they make suggestions to me. And I can go back over the chat history in a way that I can't really go back over my Amazon history. So what that suggests is that there should be something happening here where AI is a core part of how you scale this thing efficiently. So, a few areas that I found interesting. There's Xiaoice, which is a Microsoft chatbot, and I've pronounced it horribly. It's based in China, and I think it was up to 20 million unique users per month, where people chat and have these kind of emotional conversations with this bot, which has learnt what to say on Weibo, which seems risky, but it seems to work there. Your.MD is a diagnostic product for consumers to figure out what medical problems they have, using chat. Cardskill are some guys trying to do clothes shopping in the Facebook Messenger app. And lots of them are using these new interaction paradigms. So this is from Cola, which is a San Francisco firm, where you put these mini-apps into the chat interface, which again I think is now mainstreaming.
So I can tell you that lots of big companies are thinking about making their apps look like this as well. So it seems like there's something interesting going on. And then I was really flummoxed by this Google Trends data, the obligatory slide, which shows interest in AI seemingly declining. I can't remember what was happening in 2005, but we were really, really excited about it then. But there is a little boom here, so this time it's genuinely going to be great. So just to finish off, the Gartner slide here, you're all familiar with this: Technology Trigger, the hype cycle chart. Before we move to any questions, let's just figure out where we all sit on this chart, and we don't have to share it beyond this room. If you think we are at the Peak of Inflated Expectations for AI, would you put your hand up? No? Okay, so if we're in the Trough of Disillusionment, anyone? How about the Technology Trigger stage? Anyone at the Technology Trigger stage? Okay, a couple. Who's on the Slope of Enlightenment? Great. And anyone already marched into the Plateau of Productivity here? No. Okay, perfect. So this is why you're all investing heavily now. Good. Thank you very much.