Eugene Nho: Whiskey and Blackjack — What Machine Learning Teaches Humans about Learning


[MUSIC] So it’s a typical Thursday night. As a third year, gone are the days
of beer pong Thursdays, and I’m hunched over my computer in the
basement of Huang Engineering Building, the clock ticking toward a deadline at 11. For the past year,
as part of my joint degree program, I’ve been trying to teach machines
how to learn like humans. But I’m here today to tell you
about the time that machines taught me something about how to learn and
live more fully. So that day in the basement, I’m building this program that’s supposed
to learn how to play blackjack on its own, using a type of machine learning
called reinforcement learning, which is the same kind that
was used by Google’s AlphaGo. But unlike AlphaGo, my program is
acting like a kid that never learns. It keeps making the same mistakes,
like hitting on 19 or 20, and sometimes standing pat on 3. For those of you who are not blackjack
players, those are not good moves to make.>>[LAUGH]
>>I’m super frustrated. See, what my program is supposed to
do is actually quite intuitive. It only needs to do two things. First, just as any kid
would do when given a new toy, it’s supposed to try out a bunch
of things and collect experiences. Like, try hitting on 10, or 15, or
20, and experience what happens. And second, it’s supposed to stop and
extract lessons from those experiences. Learn things like aha, whenever I choose
hit when I’m on 20, bad things happen. Not going to do that again. So I spend an hour debugging, and I feel like my soul just
got sucked out of my body. And I realize that I’ve somehow built
a program that’s doing a great job of collecting experiences,
running around doing all these things and not stopping to learn anything
from those experiences. Too busy running around to
learn anything from experience. That actually didn’t
sound that unfamiliar.>>[LAUGH]
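
The two steps described above, and the bug of collecting experiences without learning from them, can be sketched roughly as follows. This is only an illustration, not the speaker's actual program: it assumes a simplified blackjack (ace always counts as 1, cards drawn with replacement, dealer's upcard ignored) and a simple Monte Carlo averaging rule, and every name in it (`play_episode`, `learn`, `train`, the `reflect` flag) is hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = ("hit", "stand")
# Assumed simplification: ace always counts as 1; four ten-valued ranks.
CARDS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def draw(rng):
    return rng.choice(CARDS)


def play_episode(q, rng, epsilon):
    """Step 1: try things out; record (total, action) pairs and the final reward."""
    total = draw(rng) + draw(rng)
    visited = []
    while True:
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(total, a)])  # exploit
        visited.append((total, action))
        if action == "stand":
            break
        total += draw(rng)
        if total > 21:
            return visited, -1.0  # player busts
    dealer = draw(rng) + draw(rng)
    while dealer < 17:  # dealer draws until reaching 17 or more
        dealer += draw(rng)
    if dealer > 21 or total > dealer:
        return visited, 1.0
    return visited, (0.0 if total == dealer else -1.0)


def learn(visited, reward, q, counts):
    """Step 2: stop and fold the outcome into a running average per (total, action)."""
    for key in visited:
        counts[key] += 1
        q[key] += (reward - q[key]) / counts[key]


def train(episodes, reflect=True, seed=0, epsilon=0.1):
    rng = random.Random(seed)
    q, counts = defaultdict(float), defaultdict(int)
    for _ in range(episodes):
        visited, reward = play_episode(q, rng, epsilon)
        if reflect:  # with reflect=False, experiences are collected and then discarded
            learn(visited, reward, q, counts)
    return q
```

Calling `train(20000)` fills the table with estimates that can be compared, for example `q[(20, "stand")]` against `q[(20, "hit")]`. Calling `train(20000, reflect=False)` plays just as many hands, but every estimate stays at zero, which is the "too busy running around to learn" failure in miniature.
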
>>As no stranger to what seems to be the GSB mantra of do more,
fit more stuff in schedule and be more productive, I saw a little
bit of myself in that program. I had to ask myself,
am I like that program, too? Are we like that program that
never learns from experience? So, unfortunately, I realized yes,
I was like that program.>>[LAUGH]
>>More than once, for example, I walked out of class thinking wow,
that speaker had such clarity of thought. I’d love to learn how to do that. And I never did because I had to run to
the next BVL or the next coffee chat. I never took the time to think through
which of these comments struck me as insightful and what it would look like
if I tried to imitate that line of thinking on a different topic. So you might be thinking, wait a second,
we’re not like computers. Important things will stick in our memory,
even if we don’t stop and reflect, right? That’s exactly what I was thinking, too. And to see if it was true, and
I really wanted it to be true, I decided to reexamine
a learning moment from my own experience to see if I truly
got out everything that I should have, knowing that I hadn’t had the time
to reflect on that experience. The experience that came to my mind was
one of my proudest moments from my army days in Korea: cleaning the second-floor bathroom. So I had just made corporal that month,
it’s a pretty big deal. I had this new, prestigious responsibility
as the leader of the cleaning squad.>>[LAUGH]
>>So ten or so privates from my platoon
cleaned after work every day. And how we cleaned these places
never changed over time. In a barracks dominated by this oppressive, hierarchical culture, you simply do
what you’re told by your superiors. But this time I really wanted to
bring some fun and creativity into this. So for the first couple days,
two privates walked around and jotted down what everyone
was doing at any moment. Then they learned to create this chart
that showed us the sequence of activities that we took and where the bottlenecks
were, true consultant style.>>[LAUGH]
>>And, with that, we debated how to make our process better. We’d ask things like, should we double
the personnel for wiping the toilets, so whoever is watering down
doesn’t have to wait? Or we’re using toothbrushes to clean
the urinals, not ours, but the spare ones. But the surface area is too small,
so it takes a long time to clean. How can we get normal brushes? We tried having two area captains
as well as having one centralized cleaning czar system, and
after a month of experiments, we cut down the time from
30 minutes to 14 minutes. And you might be wondering if
I’m being facetious about this being one of my proud moments,
but I’m kind of not. And if anybody asks,
what did you learn from that experience? I’d say something like this. I learned how to motivate a team,
give ownership, and change an organizational culture. A fine answer, specific enough to
fool others, and most importantly, myself, that I’ve learned
enough from that experience. But if anybody further pushes, and asks
me, but really, how do you give ownership, what actionable insights did
you learn from that experience, you can use here and now? I’d be stumped. I have this log of experience in my head,
but contrary to my hopes, my brain didn’t magically distill
insights from that experience. Sure, some insights had popped
up in my head before, and I assumed those were important
enough to stick in my memory. But they didn’t. So one weekend I spent 30 minutes
to reflect on that experience. And the lessons I walked away with this time
were of a completely different resolution. For example, I learned about the moment
that I become susceptible to inadvertently taking ownership away from my team and
how to counteract that temptation. Or the tactics I can use to address
norm violations in a stern way when I’m also trying to build
a positive collaborative culture. So there it was, the verdict. High-quality insights don’t
stick in your memory automatically. I had to do the work to get there. How much of the invaluable experience
you’re having here at the GSB are you extracting lessons from? Are you learning enough
from your experience? So why did I bring up this
complicated computer algorithm to make a simple point,
that we should stop and reflect to learn? Because it exposes an inconvenient truth. When dealing with fuzzy things that are
hard to quantify, like self-development or learning, we often choose to believe in
mental models that are convenient to us rather than the ones that
might be more accurate. So when it comes to our belief about how
we learn from experience, it makes our lives so much easier to simply believe
in this mental model of aging whiskey. Where, as with whiskey,
time takes care of everything, and it somehow magically distills insights from
your conscious reservoir of experience. But as much as I love whiskey,
I believe the fundamental relationship between experience and
learning is more accurately captured by the mental model of that
dumb blackjack program. If we don’t stop and
reflect, we will not learn. We have to put in the work to
get what’s important to us. Because no one else, not time,
not our subconscious, is going to do that job for us. So I hope this program, this dumb program, can be the same wake-up call for you as
it was for me, and motivate you to spend five minutes every day to make
sense of what happened that day. I hope we can all live and learn to be a little better
today than we were yesterday.>>[APPLAUSE] [MUSIC]
