Emotion Reader

After many weeks spent quietly working with Padraig and Stephen on our prototype, and years of thinking about using computer vision and AI to hack together the perfect video analytics software, I finally got to update my LinkedIn profile today to: COO at Emotion Reader.

I am stoked.

Lots more to come, but if you’re reading this and you’re curious, please get in touch.

www.EmotionReader.io

Look Ma, I’m on TV

It is a very surreal thing to see yourself on the evening news. I knew going into this that news editors have a tendency to take 15 minutes of intelligent discourse and then use the 15 seconds where you sound like a complete moron, so I was kind of dreading seeing this. It turns out that it's not too alarmist, though they do still take the whole BIG BROTHER IS WATCHING YOU angle. I think what people are really afraid of is that they don't know how data collected about them is being used. When you walk into a store or swipe your credit card, there's very little transparency about what data is being collected or the ends to which that data will be used. I think if businesses were more up-front about the trade-offs and benefits, consumers would be more comfortable with new tech. I'd love to see businesses actively create more accessible messaging about the data they collect and how it benefits their customers.

https://www.nbcbayarea.com/news/local/emerging-technology-turns-cameras-into-intelligent-sensors.html

The Nth Dimension

[Image: how I see the world]

The limiting factor in furthering our understanding of how stuff works is often our own biology. We need technology to bridge the gaps our brains aren't designed to manage. Our brains are amazing at doing certain things, like filling in perception/knowledge gaps with memories and best guesses, but ask someone to listen to two songs simultaneously or make a decision based on more than twenty simultaneous variables and you start to see the brain break down.

For the last few hundred years, media technology has mostly been focused on supplying people with information like facts, figures, stories, parables, and other linear forms of information that are easy for a normal brain to digest. In the age of the internet, filtering all this information has become the main challenge, and technologies from TV Guide to Google to Pandora have done a pretty good job of it. The technological challenge that remains, however, is cognition and understanding. Where is the machine that helps me understand how weather affects daily life, from alcoholism rates to the types of clothing people prefer in different climates? Sure, there are studies that pick apart the macro data to focus on micro issues, but they're inevitably a kludge of imperfect data analyzed by programs designed to simplify complex issues into bullet points our brains can handle. I don't think that's good enough.

We now live in the era of big data. Everything is out there, but we don’t really know what to do with it. We are not literate in data, and for most people, thinking about a problem in terms of raw data rather than outcomes and conclusions is a difficult exercise. I love Malcolm Gladwell’s book Blink because it gave us a peek at the problem. To paraphrase, there are two parts of your brain–the part that thinks and is reading this word and sounding out the vowels and consonants in your head, and the part that reacts and subconsciously comes to conclusions. The part of your brain that thinks is wired to assume that the subconscious part of your brain is making good decisions when it reacts, but Gladwell questions this. Do you really have enough good data stored away in the recesses of your memory to make good “gut” decisions?

To determine the quality of a decision, you have to know whether there is enough data, and whether the data itself is correct. Data analysts refer to the problem of low-quality data as "garbage in, garbage out." Think about your subconscious as a database kicking out statistics. If you grew up in a big city and every gunshot you ever heard was criminal violence, your "gut" will produce a statistic that says 99.9% of the time you hear a gunshot, something illegal just happened. If, however, you grew up in a rural area where hunting is normal, you would have a very different data record in your brain. Your database would say that 95% of the time, a gunshot is probably just a hunter. That city person would assume the wrong thing if they went to the country and heard a gunshot. Their gut would mislead them, not because their internal data was wrong, but because it was applied out of context. That person's internal database wouldn't be big enough to produce a correct "gut reaction" all the time.
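To make that concrete, here is a toy sketch in Python. The probabilities are invented, but it shows how the same signal, run against two different internal "databases," produces opposite snap judgments:

```python
# Toy illustration (hypothetical numbers): the same "gunshot" signal
# leads to opposite conclusions depending on the prior data in your head.

def gut_reaction(p_crime_given_shot: float) -> str:
    """Return the snap judgment a 'gut database' would produce."""
    return "call the police" if p_crime_given_shot > 0.5 else "probably a hunter"

city_gut = 0.999   # grew up where every gunshot meant crime
rural_gut = 0.05   # grew up where gunshots usually meant hunting

# The city dweller visits the country: their stored statistic was
# accurate for the city, but applied out of context it misleads them.
print(gut_reaction(city_gut))   # -> call the police
print(gut_reaction(rural_gut))  # -> probably a hunter
```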

This is where most of the books out there on the subject end (Blink, Predictably Irrational, Subliminal). They simply want us to be aware that our gut reaction is capable of being wrong when we think it's right, and vice versa, and that we should actively think about what data is going into these subconscious reactions. What they don't do is go on to give us any sort of framework for fixing the problem. They simply tell us to hand over the decision-making to the experts when we realize we don't have the depth or breadth of knowledge to make accurate predictions. But, what if those experts are prohibitively expensive, or the "experts" are just marketers steering you into the arms of their clients? Why can't we empower ourselves to fix problems like this?

I think technology is capable of radically changing the ways in which we make decisions, and in doing so, increasing the accuracy of our decision-making. Just as we use weather forecasts to correct our occasionally inaccurate assumption that a bright, sunny morning means we won't get rained on during an afternoon bike ride, we can use technology to help us identify bias and inaccuracy in the internal models our brains use to predict all sorts of other things. Just thinking out loud here…what if we had a GPS-based threat indicator app on our phones that took into account who we are? A child's threat indicator might start buzzing when he nears a dangerous intersection where data shows that pedestrians have been hurt before. My threat indicator might start buzzing when I walk into a store that has a bad Better Business Bureau rating. My work calendar could warn me when I schedule a meeting at an office where the management has been investigated or convicted of white-collar crimes. In all these cases, technology helps push contextually relevant nudges to adjust for gaps in our intuition–that database of experience our faulty subconscious depends on.
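Purely as a thought experiment, here is a minimal sketch of what such a nudge engine might look like. Every name, field, and threshold here is invented for illustration; no real data source is implied:

```python
# Hypothetical "threat indicator": match a user profile and location
# against a feed of known hazards, and surface only the relevant ones.
from dataclasses import dataclass

@dataclass
class Hazard:
    lat: float
    lon: float
    kind: str         # e.g. "dangerous_intersection", "bad_bbb_rating"
    relevant_to: set  # profiles this hazard matters for, e.g. {"child"}

def nearby(h: Hazard, lat: float, lon: float, radius_deg: float = 0.001) -> bool:
    # Crude proximity check; a real app would use proper geo distance.
    return abs(h.lat - lat) < radius_deg and abs(h.lon - lon) < radius_deg

def nudges(profile: str, lat: float, lon: float, hazards: list[Hazard]) -> list[str]:
    """Return context-relevant warnings for this user at this location."""
    return [h.kind for h in hazards
            if profile in h.relevant_to and nearby(h, lat, lon)]

hazards = [Hazard(37.7749, -122.4194, "dangerous_intersection", {"child"})]
print(nudges("child", 37.7749, -122.4194, hazards))  # buzz!
print(nudges("adult", 37.7749, -122.4194, hazards))  # nothing relevant
```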

Now, think about how such a system would be constructed. What you need is public data that everyone can agree on to use as the basis for your model. This is difficult to do in a capitalist society, because everyone is trying to profit off their data, or worried that someone else will. Certain data-sets, however, have been made publicly available for the greater good of everyone, and many more for the greater good of Google and the money they make from advertising. Weather data is relatively open. The correct time is available to all–no one has to calibrate their watch using calculations based on the sun. GPS is a free, open signal, allowing our devices to know where they, and by proxy we, are. The location of the next bus, train, or plane is published openly, allowing us to commute much more efficiently. Google Maps is an amazing openly accessible data-set (and is often taken for granted), and if you wanted to, you could quickly and easily head over to a service like mapbox.com and plot the location of whatever you want directly on a map.

I did a simple experiment to see if I was really spending my time the way I think I'm spending it. I tracked all my time by place using the Place Me app (which uses GPS to look up addresses and figure out where you are), and was really surprised by what I found. The app isn't 100% accurate, so I had to combine the time for about 10 places within a block of where I live that it thought I had visited in order to get a single time-spent number for being home. Predictably, I spend the majority of my time at home, followed by the office, but the number three location on the list was my local coffee shop. I had no idea I was spending so much time there. My gut was clueless.
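That cleanup step is easy to sketch. Here is a rough version, assuming an invented log format of (lat, lon, hours) tuples; the coordinates and times are made up:

```python
# Merge the ~10 mis-resolved "places" near home into one time-spent number
# by summing hours for every logged place within a small radius of home.
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Distance between two lat/lon points in meters."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def merge_home_time(visits, home, radius_m=100):
    """Sum hours for every logged place within radius_m of home."""
    return sum(hours for lat, lon, hours in visits
               if haversine_m(lat, lon, *home) <= radius_m)

home = (37.7793, -122.4192)
visits = [(37.7794, -122.4191, 5.5),  # a "place" a few meters from home
          (37.7790, -122.4196, 3.0),  # another mis-resolved neighbor
          (37.7988, -122.4010, 2.0)]  # the coffee shop, ~2 km away
print(merge_home_time(visits, home))  # 8.5 hours counted as "home"
```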

We run into problems when open data runs into quality issues–the garbage in, garbage out effect. There's a lot of objectively false data floating around the internet, and because the internet wasn't designed to capture objective data, it generally doesn't. We've gotten used to big tech companies just giving us stuff that radically changes how we live our lives, but few people ever stop to wonder where all the data that goes into their products comes from, or what data trails we create in our own lives. Maybe it's time data literacy was taught in schools. Perhaps in a decade or less, we'll be ready to move from linear story-telling to real-time complex models.

A normal person living today has as much data at their fingertips as a high-paid McKinsey consultant did 20 years ago. The rapid democratization of data and computing power easily brings boat-loads of information right to your cell phone. New filtration systems allow us to zero in on the important bits of information and pull precise bits from petabytes. Where we still fall flat on our faces, however, is getting past our own brains, and our own biases, in order to make better decisions.


Google Glass & Invisible Data

[Image: Google Glass]


Google Glass, the wearable computer with an always-on video sensor and over-eye display, freaks a lot of people out. There are very real privacy issues that come into play when anyone has the ability to take photos or video surreptitiously, and (more importantly) when doing so becomes normal and acceptable in society. Judging by the number of people trying to preemptively ban the device, society is definitely not ready for this to become normal. However, notice that I called it a video sensor, not a video camera. My point is that while someone probably could use the technology for evil ends, I'm not convinced it's actually supposed to, or designed to, do the things everyone is so afraid it will do.

On the far end of the spectrum are life-loggers–people who embrace the idea of recording and sharing everything. We've been doing some experimentation at the IPG Media Lab with POV video–basically video cameras strapped to your head. I tried out a bunch of wearable video-recording eye-glasses and ear-pieces. These devices are legitimately video cameras, because they have exactly two states: 1. off, and 2. recording video that is saved to memory for later viewing. A lot of people use these wearable cameras (like the Looxcie) to broadcast their experiences in real-time. Unless they choose to turn the thing off, they are never not recording whatever is happening in front of them. Trying this device out, I came to feel that publicly life-logged video streams living forever on the net are an annoyance at best, and at worst, an inescapable panopticon of intentional and accidental junk shots. "No, thank you," I thought. After this experience, I was really skeptical about the chances of Google Glass ever succeeding.

But, something clicked for me recently as I was reading Robert Scoble's blog about his experience wearing the device, and I realized I'd been thinking about it all wrong. What if the camera end of the device is not intended to be constantly recording video or creating a photographic record, but is simply acting as a sensor to create a record that is 99% actionable data, and only 1% saved images? A single frame contains hundreds of thousands of pixels, and video capturing 15 of those frames per second stacks up gigs of data very quickly. These human-viewable images are data-overload for both humans and machines. Computers don't need nearly this much information to make a decision, and humans would need two lifetimes to create the video and then view it all. A camera-based sensor can scan all those pixels for meaning in real-time, but it will dump the image from memory and keep only the information it needs. In this way, it actually acts a lot like your human eyes and brain memory–keeping the important moments containing information it needs, and discarding all the long dull bits between those moments–the information it does not need.
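To make the sensor-versus-camera distinction concrete, here's a minimal sketch, assuming Python with OpenCV and a webcam. The face detector stands in for whatever meaning-extraction such a device might do; the point is that only the derived data survives, never the image:

```python
# Camera as sensor: scan each frame for meaning (here, a face count),
# keep the tiny data record, and discard the image itself.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)  # default webcam

face_counts = []  # the only thing we keep
for _ in range(100):  # sample 100 frames
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    face_counts.append(len(faces))
    # `frame` is overwritten on the next loop: no image is ever saved.

cap.release()
print("frames sampled:", len(face_counts),
      "avg faces:", sum(face_counts) / max(len(face_counts), 1))
```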

I think there's precedent for a device like this gaining societal acceptance. When the TSA started using full-body scanners, there was a brief uproar about the idea that TSA employees were checking out our naughty bits, yet once people saw the lumpy humanoid images that the scanner actually creates, the furor died down. Similarly, before Microsoft released the Kinect, nobody could fathom why you would want to subject yourself to a video camera in your own living room. Once people figured out that the device isn't really meant to take video or pictures of your living room, it's simply meant to figure out where objects are in the space directly in front of it, people quickly got over it. You could even look to Gmail for an example. When Google first released this "free" email service with an unheard-of amount of cloud storage, the "price" was letting Google scan the contents of your email and serve you ads based on what was written there, and critics scoffed at the idea that such an invasive product would ever catch on. Clearly, they were wrong. What all these products have in common is that they are able to handle the most sensitive data in our lives, yet do not judge us or betray us, the users.

I firmly believe that Google is crafting a data architecture for Glass that does not betray the user, because they do not want to be in the morality business. I think the data these devices collect could potentially be incredibly damaging, but if the data is stored and parsed in such a way that damaging facts can be hidden, and the user controls what data is used vs. not, the device will quickly become a trusted extension of one's self, much in the way smartphones already have. The default setting is to save all pictures and video recordings to a private area of Google+, where the user must choose to make them public. Google Glass doesn't really do much that the smartphone doesn't already do; it simply gives smartphone addicts a new, faster, more immersive way to interact with data about the world around them. The real innovation here is not Google Glass. The innovation is, was, and will continue to be the incredibly powerful way that apps can augment your experience of the physical world by using real-time, passively collected data. Glass is simply a better app interface.

One of the more interesting apps I've been playing with lately is called Place Me. If I turn it on, it politely runs in the background, passively listening to my phone's GPS data, matching those points in space against a database of places, and providing me with a list of places and businesses it thinks I've visited, along with a tally of the hours I spent at each one. It's usually pretty accurate, and provides an interesting way to quantify my time, over time. This sort of data could be used in very useful ways by marketers, and super-charged by a device like Google Glass. For example, "Hey Tim, we noticed you eat lunch at Plant on the Embarcadero a lot. Other people that work near you and like Plant have been going to this other new restaurant. Here's a 10% off coupon for lunch today. Try it out and let us know if our hunch was right." That sounds a lot like the kind of advice I would value from a friend, but suddenly I could get it from an automated system. As far as I can tell, that is what the future looks like.
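A toy version of that "people like you also went here" logic is surprisingly simple. The place names and visit logs below are invented for illustration:

```python
# Recommend places visited by people who share at least one place with you,
# ranked by how many such people have been there.
from collections import Counter

visits = {
    "tim":   {"Plant Cafe", "Blue Bottle"},
    "alice": {"Plant Cafe", "New Thai Spot"},
    "bob":   {"Plant Cafe", "New Thai Spot", "Blue Bottle"},
    "carol": {"Burger Chain"},
}

def recommend(user: str) -> list[str]:
    """Suggest places visited by people who share a place with `user`."""
    mine = visits[user]
    counts = Counter(place
                     for other, theirs in visits.items()
                     if other != user and mine & theirs
                     for place in theirs - mine)
    return [place for place, _ in counts.most_common()]

print(recommend("tim"))  # -> ['New Thai Spot'] (via alice and bob)
```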

Glass, I should point out, does not have GPS built in (yet), and much of its functionality lies in tethering to a smartphone (Android, of course) that does have GPS and a keyboard. The proponents of Glass love that it's operated by voice-command, but the idea of loudly prompting my Glass device at work or dictating my emails while walking down the street makes me cringe. I already stop and stand on the side of the sidewalk to type texts rather than use Siri. I will definitely kick the first person who randomly starts yelling "OK, Glass…" out of any meeting I'm running. So, while it does do a bunch of cool stuff on its own by soaking up ambient information and feeding the user data, for that user to interact with the world or do much with this information, I expect to see a lot of tethering and dual-device usage.

While I’m bullish on apps and helpful data collected from passive sensors like Glass, I’m still not totally sure that Glass will actually wind up being a successful consumer device (unless, as Scoble points out, they make it really cheap, like <$200). A friend of mine calls them “Segway Glasses,” because he thinks they’re so nerdy looking that only rich guys, rent-a-cops, and tourists would be willing to be seen in public using them. I think he might be right. However, I really do believe that the future will be full of apps that act like 99-cent sixth-senses, and the sensors that feed data to these apps will grow smaller and smarter, and in time, become normal.

Spam made me do it

According to eMarketer and InsightOne, a despicable crime is being perpetrated all across America, right under our very noses. Yes, I'm talking about human-on-computer violence, a subject the media has been far too lax in reporting up to this point. How can you sleep at night knowing 4% of computers and mobile devices are being mercilessly beaten? I, for one, feel it's time to put an end to the violence. Type, not smash.

[Image: eMarketer/InsightOne infographic]

Ad:Tech 2013

[Image: Ad:Tech]

This year for Ad:Tech I did something I've never done before–I worked a booth in the exhibit hall. Usually I hang out in the conference rooms with big-name marketers and exchange ideas with keynote speakers, and it's rare that I have time to make it to the expo floor. However, we're working with a very cool start-up called IMRSV, whose software tracks people as they walk in front of a simple webcam, and they invited me to co-curate a booth. Their technology guesses people's age and gender in real time, and is an incredibly easy way to get metrics about people in physical space without breaking the privacy barrier. Everything is anonymous and no images are saved, only data about the number and type of people walking in front of the webcam. We had cameras set up all over the building and were actively quantifying the traffic of people around the show. So, going into this, I thought it was going to be cool. It was not.

Let me now give you my highly biased, subjective, and purely qualitative view of the foot traffic we got at our booth. Over the course of a day spent explaining the technology and how the IPG Lab uses it, I had roughly 5 intelligent conversations. The vast majority of people I spoke with were sad, desperate weirdos. There was the Russian spammer, the guy who runs free give-away websites, the dozens of bad email-marketing, direct-marketing, SEO, bottom-feeder, worst-of-the-worst inventory ad network sales guys, and finally, the people who were very concerned that the software was unable to automatically detect when they were flipping their middle finger at the camera, despite the fact that it did correctly guess their age and gender within seconds.

It's this pile of winners I'd like to talk about. I realized as I looked out on the teeming masses of internet assholes that I am incredibly sheltered. I forget that working at a place where we actually try to make the internet better is relatively rare compared to the huge majority of internet businesses just trying to spam, game, or otherwise make a quick buck off of the frictionless mistakes of a million accidental clickers. And I realized that I viciously hate these people.

I guess the point to take out of all this is that as much as big media can suck at times, they’re safe. The world of media is a lot like the real world in the sense that anarchy sounds great when you’re flush and comfortable, but when the shit hits the fan and a mob is marching towards your front door, suddenly you can’t wait to find the police. I love the anarchy of the internet, the democratization of information, and the way technological disruption is keeping big media on its toes, but I have to say, days like this I’m really glad that the giants behind sites like Hulu, NYTimes, and Netflix create such cozy and safe little media havens where I can hide myself away from the masses and enjoy my media in comfort and safety.

P.S. Russian spam guy, please don’t hack my site. I know you can. You don’t have to prove anything.

The Era of Big Feelings

[This post was originally written for ipglab.com, then edited for this blog. On a personal side-note, I totally just coined the phrase “Big Feelings.” Go me.]


At the IPG Media Lab, we often engage in what we call "Quantitative Qualitative." In other words, we listen for qualitative information, like how you feel when a pop-up ad startles you, or the rush you get when a web service seemingly reads your mind and suggests the exact thing you were looking for, but we do it on a very large scale. We don't just want to know what a dozen people behind a two-way mirror think or feel; we want to know what hundreds or thousands of people think or feel, and to probe the nuance and variance in feeling that become available with large datasets.

"Big Data" is a term that gets thrown around a lot, but the information stored in these databases tends to be easily quantifiable: black-or-white, hard, objective sorts of things. Big datasets are rarely made up of odd, squishy, subjective things like emotions. In practice, Big Data means being able to collect large amounts of data, but also being able to do something productive with it–to analyze it for insight. Some qualitative researchers are already making the shift to quantifying qualitative info–where they used to be dependent on small focus groups, they now skip them in favor of mining social networks for product sentiment. Why ask twelve housewives what they think when you can mine a million tweets, then run a semantic sentiment analysis to get unvarnished qualitative feedback from thousands of actual product users? To an extent, this is progress, but it's also an invitation to listen to the fringe rather than the mainstream (do you really think the person venting about their vacuum cleaner on Twitter represents you?). So, while more information is generally better than less, I have to wonder if we're simply exchanging one set of biases for another.
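For a sense of the mechanics, here's a bare-bones sketch of lexicon-based sentiment scoring. The word lists and tweets are invented, and real pipelines use far richer models, but the basic move is the same: count signal words and net them out:

```python
# Crude lexicon-based sentiment: positive hits minus negative hits per tweet.
import string

POSITIVE = {"love", "great", "amazing", "works"}
NEGATIVE = {"hate", "broken", "terrible", "useless"}

def sentiment(text: str) -> int:
    """Positive minus negative word hits; >0 leans positive."""
    words = set(text.lower().translate(
        str.maketrans("", "", string.punctuation)).split())
    return len(words & POSITIVE) - len(words & NEGATIVE)

tweets = [
    "I love this vacuum, it works great",
    "this vacuum is useless and broken",
    "bought a vacuum today",
]
scores = [sentiment(t) for t in tweets]
print(scores)                     # [3, -2, 0]
print(sum(scores) / len(scores))  # crude aggregate sentiment for the product
```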

A technology that has intrigued us and led to some fascinating work here at the lab is webcam-based facial coding. Facial coding works by asking a panelist to turn on their webcam, in their own home, on their own computer, then using software to identify points on their face and infer emotion as we watch them react to whatever stimuli we put on screen. We use the term valence to describe the range of emotion they exhibit, which can be positive or negative. By getting hundreds of people to do this, we start seeing patterns and can predict reactions. To be fair, it can be tricky to work with this data. People rarely register much emotion when presented with a bland website, boring television show, or predictably pandering advertisement. However, when the content is good, suddenly the data comes to life. I have to wonder if years of lousy focus groups have convinced content creators that they need to dumb down their work to suit the lowest-common-denominator viewer. If we were to employ facial coding and base our optimization on manifestations of joy and laughter, or intensity of engagement, would the media landscape look very different? What if we programmed our algorithms to listen for laughter, and rewarded the audience with more of the same?

I'm not entirely sure where this path of exploration will lead, but I can say that our early tests have already turned up some very interesting findings. To share just one: while working on a project with the Undertone video ad network, we watched the facial expressions of people as they were exposed to either A) a video ad that launched automatically (auto-play), or B) a video ad that launched upon user click (user-initiated). It sounds obvious, in retrospect, that auto-play ads annoy people. But, we found it fascinating that we were able to quantify the aggregate delta in valence (positive vs. negative emotion) as real people were randomly exposed to the click-to-play ad vs. the auto-play ad.

In the example below from the study we did with Undertone, you can see that valence slightly decreased (indicating a negative change in emotion) when the auto-play ad launched, compared to the baseline of how people felt about the content on the page, while valence increased (indicating a positive change in emotion) when the ad was navigated to and clicked on. Something as simple as expecting the ad to play vs. being surprised by it made a big difference. You can see that in the overall valence numbers during the time the ad played, which show a 3% increase from neutral (0% valence) for the click-to-play cell.

[Chart: valence over time, auto-play vs. click-to-play]
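To make the arithmetic concrete, here's a tiny sketch of that kind of aggregation. The per-person valence values below are invented stand-ins, not the study's actual data:

```python
# Average frame-by-frame valence per cell, then compare each cell
# to the content baseline to get the delta the study describes.
def mean(xs):
    return sum(xs) / len(xs)

# Per-person average valence while the ad played (hypothetical values,
# on a -1..1 scale where 0 is neutral).
click_to_play = [0.05, 0.02, 0.04, 0.01]
auto_play     = [-0.03, -0.01, -0.02, -0.02]
baseline      = 0.0  # valence measured on the surrounding page content

print("click-to-play delta:", mean(click_to_play) - baseline)  # +0.03 (~3%)
print("auto-play delta:    ", mean(auto_play) - baseline)      # -0.02
```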

This is certainly a small, tentative step towards using this technology in a meaningful way, but we feel it is an important one. We at the IPG Lab plan to keep pushing the boundaries of media insight by mashing up new technologies and quantifying qualitative data.

If you're interested in learning more about the Undertone research, feel free to watch the webcast, get in touch with the good folks at Undertone, contact me, or come see us present this study live at the 2013 ARF Re:Think Conference in New York City.

Screen Equality (did not see this coming)

The narration on this video is a little intense, but it does a good job of showing off how I spent most of May and June, 2012. I’ll be presenting the findings for this study along with YuMe at the 2013 ARF Re:Think Conference in New York. Not to spoil your viewing experience, but the really interesting thing we discovered here is that…. [continued below]

[Embedded video]

…ad clutter has a lot more to do with ad effectiveness than does the device on which you view the ad. Surprising even me (pretty darn rare), the notion that people watch video on their mobile devices while on the go was largely disproven. Sure, if you have 30 seconds to kill while waiting for the train you might watch a short clip, but longer-format video and any serious viewing time on a phone or tablet is reportedly done while in bed or sitting on a couch. It's definitely what I do, and it makes sense intuitively that other people would act this way, but it seems like the industry hasn't yet wrapped its head around the idea that consumers don't really differentiate between screens if the content is the same. Given the choice between 4 minutes of ads in a show on Hulu vs. 8 minutes in a live broadcast, the winner is generally Hulu or a DVR. Hulu, and online video providers like it, may well be the best bet advertisers and publishers can make. After all, if no one pays attention to the ads, why should anyone pay to advertise?