Understanding Emotion with AI, with Rana el Kaliouby
Rana el Kaliouby: I have this vision of the world where, in the future, our technology interacts with us just the way we interact with one another: through conversation and perception and empathy. And that will make for better human-machine interactions, but also just more empathy in the world.
Jon Prial: That's Rana el Kaliouby. Rana is the Co-founder and CEO of Affectiva. Affectiva was spun out of the MIT Media Lab in 2009, and Rana has got an amazing story to tell. For years, way before the technology was that mature, Rana has been a pioneer in emotion AI. Her newly released book is her story and the story of technology that we probably don't think of enough. Her book is entitled Girl Decoded, but I find the subtitle even more interesting: A Scientist's Quest to Reclaim Our Humanity by Bringing Emotional Intelligence to Technology. She is a passionate advocate for humanizing technology, for ethics in AI and for diversity, and I know you're going to enjoy our discussion. I'm Jon Prial and welcome to the Georgian Impact Podcast. Rana, welcome. You have an incredible backstory that led you to where you are today. Please give us a thumbnail of your history.
Rana el Kaliouby: I was born in Egypt, but grew up in the Middle East, in Kuwait first. Both my parents are in technology; they actually met in a COBOL programming class in the 1970s. I think my mom was one of the very early computer programmers, actually. I studied computer science as an undergraduate at the American University in Cairo, and I built the very first emotionally intelligent machine at Cambridge University. Towards the end of my PhD, I met my role model and mentor, Professor Rosalind Picard. She wrote the book Affective Computing in the late '90s. So I joined her lab and we were very focused on the application of technology in mental health and autism. But being at MIT, we started to get a lot of commercial interest in the technology and spun out in 2009 with this grand vision and mission to humanize technology. Lots of use cases, I'm sure we're going to cover that in our conversation. Everything from autism to advertising, to automotive, to health. I have this vision of the world where, in the future, our technology interacts with us just the way we interact with one another: through conversation and perception and empathy. And that will make for better human-machine interactions, but also just more empathy in the world.
Jon Prial: Often a company founder has an epiphany, some grand idea to get started. And for me, I felt like your thinking about EQ evolved, and I'd like to ask you to share a couple of stories. So the first one: we've often heard that body language is one way of communicating. Arms are closed, it means something. But talk to me about how you perceived eyes and faces as critical pieces of communication, from your growing up in the Middle East when the women were in rooms, and take me through how you began to understand what that meant.
Rana el Kaliouby: So as it turns out, 90% of how people communicate is nonverbal, and it's kind of split equally between your facial expressions, your body gestures, and even your vocal intonation. How excited are you? How monotonous is your voice? Yeah, your vocal intonations are very powerful too, and so you take all of that together and that's 90% of how we communicate. Only 10% is in the choice of words we use. I grew up in kind of a conservative family. I went to a co-ed school, but I wasn't allowed to date. I spent a lot of my teenage years watching other high school students, my peers, essentially be in these relationships. And I remember kind of really cluing into, oh, there's something going on here, based on the eye contact and the different kinds of nonverbals. And I was just really fascinated by this parallel communication language that's unsaid, but really powerful. And that was kind of the impetus for just becoming very fascinated with nonverbal communication. And then of course, when we're in front of our machines, our machines are completely blind to all these signals, right?
Jon Prial: Another point that drove home for me: here you are in Cambridge and you're text chatting with your husband, and "How are you?" he says, and you say, "I'm okay," and you're writing that while there are tears running down your cheeks. You knew it, a camera would know it, but your husband didn't. So how did this thought of recognizing empathy really begin to drive and affect the work you did? Because it's just incredible.
Rana el Kaliouby: I would say that was really the aha moment for spurring kind of the seed of an idea. What if technology could understand our emotions just the way we understand each other? So when I moved to Cambridge, I was a new bride, and my husband at the time had to stay back in Egypt. So it was my first experience being away from family, and I had two realizations. First, that the majority of my time was being spent in front of the machine coding, right? And yet this machine, this laptop, had absolutely no clue how I was feeling. So it often would just give me recommendations or suggestions that were just, like, out of context, right? And it had no understanding of whether I was having a good day or I was stressed or lonely. And then the second realization was exactly, Jon, the moment that you shared, which was I was chatting with my husband, back then, and I was literally in tears. I was so homesick, I mean, I was ready to go back home, right? I was so homesick. And he was like, "How are you doing?" I was like, "Yeah, busy." And I decided not to share anything. And of course, because we were just chatting, he couldn't sense that anything was wrong. And it just hit me that often we hide behind our screens. We have the illusion of a connection, it's not a real connection, and you can generalize that to the social media platforms we use today. Even on Zoom, right? We're connected right now, but we can't see our audience and it's not a real shared experience. So that sent me on a path of building computers that have emotional intelligence.
Jon Prial: I love it. So let's dig a bit into the tech and the data. So I'm a techie at heart, I couldn't be happier that I get to still learn stuff, even now. And I was particularly fascinated by some of the early capture of faces, when you talk about sitting in front of a screen. So I'm actually a big fan of Adam Geitgey's machine learning, his fun podcasts and the things he's written. And I was fortunate to actually record a podcast with him, and I learned some of the basics of capturing a face at that point: pixel mapping and figuring out a mouth and a nose and eyes. But that was 2014. How did you even get started? What was the tech back then?
Rana el Kaliouby: Oh gosh. Yeah, it was challenging, right? So that was back in the time when webcams were huge. They were like this rounded thing and they were blurry, so you could hardly see exactly what was happening. Processing capabilities were nowhere near where they are today. There were no mobile phones, there were no smartphones, right, which is hard to imagine. So it was a very different world, but I could kind of project into the future, and my hypothesis was we're going to be surrounded more and more by these devices. And these devices will have better cameras and more powerful processing. So even though the technology wasn't quite there yet, I knew we would potentially get there. And of course now we're in a world where all our devices have cameras and it's pretty ubiquitous. And culturally, there's also an acceptance of cameras being more ubiquitous.
Jon Prial: So what makes all this work of course, is having a pool or a giant pile of data to work with. Talk to me about what I'm calling a breakthrough, I'm not sure that's the word you used in the book, but you connected up with another scholar who had been working with autistic children and he had a database, an amazing 412 emotional and mental states, so what did that mean to you as you were looking at your project?
Rana el Kaliouby: I was at the beginning of my PhD journey and I realized that if I'm going to use machine learning, which I did, all these algorithms are very data hungry, and so I had to kind of find a database to train the algorithms with. And very serendipitously, I got connected to Simon Baron-Cohen. By the way, he is Sacha Baron Cohen's cousin, but a lot more soft-spoken. So he runs the Autism Research Centre at Cambridge University and, as it turns out, at the time, he was compiling this huge database, both visual and vocal datasets, of a diverse set of people expressing these different emotions. And he had spent years building it, and millions and millions of pounds, British pounds, to put it all together. And I knew that as a lowly PhD student, I did not have the time or the resources to replicate that. And he was kind enough to share that dataset with me, and I used it to train the Dynamic Bayesian Networks.
Jon Prial: There we go, that meant a lot to a lot of people out there. As you talk about autism, it was fascinating, and you mentioned 90% of our communication is nonverbal and you've talked about a range from empathizers to systemizers. Talk a little bit about that.
Rana el Kaliouby: Yeah, this is also Simon Baron-Cohen's work, and I would encourage people who are curious about this to take a look at his work. He basically hypothesized that people fall on a spectrum, with systemizers on one end and empathizers on the other end, and we all fall somewhere on that spectrum. So on the one hand, you've got the systemizers, who are very left-brain focused, they tap into their left brain, they like doing a lot of repetitive tasks. And then on the other hand, you have the empathizers, and we're all a combination of both. It turns out individuals on the autism spectrum tend to be more systemizers and they really struggle with understanding nonverbal signals, both their own, actually, and other people's nonverbal and emotional kind of expressions. So again, that made me realize, like, wow, what if we could build technology that can help autistic kids? In the same way that I was able to draw from the autism discipline, I wanted to give back. And the very first use case I explored for emotion AI was prototyping glasses that look like Google Glass, with a little camera. We actually had to source spy cameras from the web. They run for about $70, but they're tiny, they're not like your webcams, so we integrated those in a pair of glasses. And then we had real-time feedback for the kids. So if I had autism, for example, I would wear the glasses and it would say, oh, Jon looks interested or Jon looks really bored, and I could learn to read these expressions in real time.
Jon Prial: It's tremendous, what an application. I want to actually go down that path of applications now, and I'm going to rename your book. We're going to go from Girl Decoded to right now Person Decoded, because there are so many different application areas. I was fascinated that you got early use of Super Bowl ad testing, and I believe 25% of the Fortune Global 500 is using your tech. Now take us through some of the application areas that you're focused on.
Rana el Kaliouby: So the autism work was all at MIT, and then we quickly got commercial interest and so we spun Affectiva out, knowing that there are so many use cases. The first kind of product-market fit we found was in the area of market research, where, exactly as you're saying, brands all over the world create content and they want to know the emotional engagement their audiences have with this content. So if you were Coca-Cola, before you spend millions of dollars pushing a particular ad, we were able to measure, moment by moment, how people respond to the ad. Everything with people's consent and opt-in, by the way, 'cause that's really important for us. But yeah, we were able to see when people laughed, when people were disengaged, when people were attentive, when people were skeptical. That's a big one, right? If you're trying to persuade people or you're sharing a message and people are like, "Hm," that's not good.
Jon Prial: Would it be optimal for your application for me to be strapped to a lie detector, or to have a beautiful new Apple Watch Version 100 on my wrist so you could know more? How far can you go and how much better would your app be?
Rana el Kaliouby: Okay. So first of all, we stay away from lie detection, surveillance and all of these applications where... So we started the company because we wanted to build a bridge between how people communicate and connect. That was kind of the autism origin story, right? So when we spun out, Ros, my co-founder, and I, around her kitchen table, just outside of Boston, we were like, "Okay, there are so many applications, where are we going to draw the line?" And we decided that there were a number of core values that were going to drive our business strategy. So privacy and consent were key core values. So we agreed that we're not going to take on any business where people do not explicitly consent. Lie detection is one example, surveillance and security is another. So we steer away from all of these industries, even though at many points in time, we were offered millions of dollars of funding. And we just felt it was not in line with who we are as a company. And I mean, we turn this funding down routinely.
Jon Prial: Wow. It's a challenge for a startup. They want to say yes to every customer, and they have to decide quickly what investments it takes to say yes to that customer, and whether it's replicable. Clearly you had a replicable solution, and you put your values first and said no to customers, which is really quite an impressive story. But I'd still be okay if the camera in the rearview mirror of my car was telling me that I was not paying enough attention. I'll take that one. Is that reasonable spying for me?
Rana el Kaliouby: So that's an application we are spending a lot of time on. So the automotive industry is really fascinating because there's a lot of disruption happening. Three years ago, we built a hypothesis that everybody was focused on sensing what's happening outside the car and the inside of the car was a black box. And so we were like, "Hm, we think in-cabin sensing is going to be a thing." And starting with driver monitoring, right, all of this kind of texting-while-driving behavior, that's very recognizable when you're driving, or worse, when you're falling asleep, right? There are clear eye patterns, blinking patterns, head movement patterns, very easily detectable using machine learning and computer vision, and so that's an area we're very focused on. But now we've expanded it beyond just basic driver monitoring to include things like occupant monitoring. Is there a child in the backseat that's left behind or forgotten? We can flag that. So there are really fascinating safety, but also comfort and experiential kinds of applications of in-cabin sensing in automotive using computer vision.
Jon Prial: That's awesome. I want to go back a little bit in time in your company, because I think about companies, we talked about saying yes to every customer, worrying about technical debt that you have to acquire or that you don't want to acquire. So early on, when the company had been spun off, you had four particular face patterns you could recognize. And you did something that a larger company could do: a larger company could spin off some skunkworks, get a small team doing some prototype work, and test it. You did more than just skunkworks, you kind of went behind the scenes and totally shifted from existing older ML tech to deep learning. Take us through what was going on in your head when you did that, because that's an astounding risk that you took.
Rana el Kaliouby: Yeah. So at the time, we were using support vector machines, if there are any machine learning geeks out there, which is a very feature-engineering approach to machine learning. And in 2013, 2014, deep learning was starting to kind of emerge in the academic literature. And I knew that that was the way of the future. It was going to allow us to very quickly bring these new features to market. But to do that, we had to invest in a deep learning infrastructure and we had to invest in a ton of data. So I was not the CEO at the time, I was the Chief Technology Officer. So I went to our CEO at the time and I said, "Hey, we need to invest in deep learning." He was like, "So explain what that means to me." I was like, "Well, it's going to take a little bit of a while, but when we get there, we're going to move faster." And he was like, "I don't think we should do that because it's a technical debt, right? And it's not going to show to our customers in the next, call it six to eight months." So I went back and I recruited two people on my team and I was like, "We're going to work on a secret project. Nobody needs to know about this. We're going to migrate our entire technology stack to deep learning." And they went off, got it done in a few months and got back, and obviously it was totally worth it. And then I was like, "Okay, it's a done deal now." And now-
Jon Prial: You went from four faces to how many? It's tremendous, right?
Rana el Kaliouby: Yeah. We now have over 40 different expressions. And now, we've gone beyond the basic facial expressions. We do things like cognitive overload and attention and stages of drowsiness and object detection. I mean, this underlying machine perception platform is enabling us to train all sorts of things. Voice and vocal analytics: we're able to quickly add on voice as a different modality.
Jon Prial: You talked about all the different data sources. There are often issues, obviously, with collection of data and bias in the data. I've heard most heart attack data is all white male. So as you're looking at drowsiness or attention deficit, all these fascinating things, how much does getting a broad set of data matter in terms of having unbiased data in this case?
Rana el Kaliouby: It totally matters. And that is something that is often, unfortunately, really overlooked in the AI space. In our world, I mean, I'm sure a lot of you have seen this in the news over the past 18 months, face recognition systems are being deployed but they're very biased. And so they may not even detect a face like mine, right, because they're not really trained on a lot of data of women or particularly women of color. So our approach to it is, first of all, ensure that the data is diverse. The quantity does matter, but the diversity of the data matters. And also, when we train the algorithm, we make sure that the sampling includes an equal sample of different subpopulations. So different age groups, different genders, different ethnicities, people with glasses, like you, Jon, or people with beards or hijabs. It's so important to be thoughtful about all of this. And the only way to get there is if your team is diverse in the first place, because we all have blind spots based on our prior experiences that we bring to the table. So the more diverse voices you have, the fewer of these blind spots you'll collectively end up with. So, very passionate about that one.
Jon Prial: Awesome. As I was winding near the end of our session, I was debating if I was going to go down a diversity discussion or kind of a work-life discussion, and you covered diversity. So actually I'd like to do just a tad on work-life. And I think what impressed me a lot was your unabashed acknowledgement of your own personal privilege. Talk to me about some of your work-life learnings, and even what you learned from your co-founder Ros was kind of fascinating, so talk to me about that. A little personal, if you don't mind.
Rana el Kaliouby: Yeah. There's a couple of things. First of all, because I know there are a lot of founders and a lot of startup communities tuned into this, I will be the first to acknowledge that I learned this the hard way. So when we started Affectiva, it was just, I mean, it still is all-consuming all the time, but I've really learned to make time for myself. So self-care is something I did not prioritize at all in the early days of the company and now I make a point of doing. Not just for me, but actually to set the tone for the rest of the team. So if you go onto my Google Calendar, I have, very publicly, like, Zumba class on Fridays. You cannot schedule on top of my Zumba class. And I think it's important. It sends a message that you've got to stay fit, you've got to stay mentally fit, and that's important to me as a CEO and as a leader. But it was not always the case. And I also remember when I first moved to the US, so I grew up in the Middle East, as I said, and when I first moved to the US and got to know Ros and her family, I remember the first time I had dinner at her house and we were all, like, kind of moving the dishes to the kitchen and her husband wanted to load the dishwasher. And I had never seen a guy ever help in the kitchen. So I was shocked. I was like, "No, no, no, Len, I've got this." And he was like, "No, that's my job." I was like, "Your job is to load the dishwasher?" So I think kind of being exposed to these kinds of family dynamics, where a woman's role is really supported by a partner, but also there's shared responsibility in terms of household chores and kids, that was a real revelation for me and hashtag goals, right? My next relationship, yeah.
Jon Prial: Excellent. Now you mentioned when we talked about the deep learning project that you were CTO at the time, not CEO, and you had made a decision not to be CEO and in the end became CEO. Share some of your learnings of not taking the job that perhaps you should have at the beginning.
Rana el Kaliouby: Yeah. When we first spun out, Ros and I decided, with the nudging of our first investor, to hire a seasoned business executive to run the company. And he was our CEO for four years, and then he decided to move on. And so the question became, who should be the next CEO? And a few board members said, Rana should be. Ros never left MIT, she's a tenured MIT professor, but I was at the company full-time as CTO. And I just remember thinking, well, I've never been CEO before, I don't want to fail. And so I declined the job. Now our head of sales at the time, who had also never been CEO, and he had only been with the company for like two years, he said, "Sure, I'll take the job," which was so interesting, and it's unfortunately also so classic.
Jon Prial: Such a guy, such a guy.
Rana el Kaliouby: Such a guy, yeah. So he was CEO for a couple of years and we built this amazing relationship built on trust, and he was great. But I remember in 2015, after my TED Talk, I came back to work and he was like, "We've lost track of our mission here and we've become..." We had, at the time, kind of morphed into an ad tech company, which was not at all what we had set out to do. And I also went on Google and I Googled, what are the jobs and responsibilities of the CEO? And I realized I was doing all of them. I was raising, we had raised almost $30 million at the time for the company, and I was involved very heavily in all of that. I was hiring our product and R&D team. I was very involved in product-market fit. I was the face of the company, and I was like, "Hm, I'm kind of doing the job." So I mustered a lot of courage and, again, a lot of support from my mentors, who helped me visualize what being CEO could look like. And I took that and I approached the CEO at the time, I didn't want to go behind his back, and we negotiated it, talked about it back and forth, and then took it to the board and it was a unanimous vote. So it's been almost four and a half years now since I stepped into the CEO role.
Jon Prial: Clearly a good call. One of the takeaways I had in the book was you want to make sure, as a CEO, you want to make your people successful and maintain work- life balance, which I think is fantastic. Let's not lose the importance of what you just took us through. Tell me what it was like being a female fundraiser.
Rana el Kaliouby: It is interesting. It's evolving, but I'd say we're not quite there. So when we raised our first check 10 years ago, Ros and I did the kind of famous Sand Hill Road Show. So we spent a few days meeting almost 30 VCs in the Bay Area. Zero were women. Like, zero. There was never a woman in any of the meetings we had, so that was quite unfortunate. And in the last couple of years, I raised $26 million, but I was adamant this time round that I bring in a more diverse investor base. And I was able to find firms that have female partners, and I really was very intentional about it, and also people of color, that was very important to me as well. But it's not easy, so I'm part of an organization called All Raise. We are on a mission to support both female founders and female investors and build the ecosystem. So I would encourage you to look it up and sign up and join the community. It's really amazing.
Jon Prial: That's great. I hope that many of you walk away from this podcast and check out All Raise. We'll include a link in the show notes. As we begin to close, let's talk about your business and look a bit into the future. I just can't stop thinking about all the sources of emotional data that could be captured and how important all this could be. We could be watching Netflix movies and there's a lot more data to be gathered than just saying people who liked X liked Y. People who liked X didn't really like X, they winced a little much, or they smiled a lot more. Will that be able to be done in a reasonably anonymized way? Do you think that's a future?
Rana el Kaliouby: I absolutely think that's a future, yes. And it has to be done with kind of respecting people's privacy. And also, like, I have to get some value in it as a consumer, right? Why am I sharing this data? And for the right kind of user experience, and again, with automotive, it's safety, yeah, I would... My daughter is starting to learn how to drive and yeah, I would sign up for a car that helps her be safer on the road, for sure. I think this is all about the cracking of the user interface, really. And I'll go back to the automotive industry, where we partner very closely with the car companies. Say we detect you're distracted or say we detect you're falling asleep at the wheel, how should the car respond? And different car manufacturers have different points of view. So some will say, "Oh, we're just going to, like, give you a gentle alert," all the way to cars that have semi-autonomous capabilities where they're basically saying, "Oh no, we're going to take over control of the car." So it's a spectrum. And I think it's important to build trust. This new social contract between humans and machines that's built on trust is going to be so key as we see more and more AIs deployed to the world.
Jon Prial: I love it. I think that's great. Well, from your fascinating personal story to your amazing founder's story and this tremendous product and tech, to everyone out there, I can't recommend Girl Decoded enough. It was just a pleasure talking to you. Thank you so much, stay safe, stay sane, and thanks for taking the time to be with us.
Rana el Kaliouby: Thank you so much, Jon. And for all of you entrepreneurs and founders out there, happy to always help. I'm very easy to find online, and if you read the book, please share what you think. Thank you again. Thank you for having me.
Imagine a future where our technology interacts with us the same way we do with one another through conversation, perception and empathy. Dr. Rana el Kaliouby is our guest on this episode of the Georgian Impact Podcast. She is the CEO and Co-founder of Affectiva, a pioneer in the field of Emotion AI and Human Perception AI and author of the book Girl Decoded.
You’ll Hear About:
● Her journey to humanize technology starting from The American University in Cairo, to Cambridge, and then MIT.
● How her life experiences helped to influence and inspire her vision for technology.
● Her transition from working with people with autism to commercial uses of her techniques.
● The challenge of turning down opportunities in order to stay true to established core values.
● The importance of diversity in not just your team, but also in the data.
● Work-life balance and making mental health a priority.