Yoga Pose – Andy Ruestow & Bryan Donovan – JSConf US 2019


Hello? Hi! Welcome back! We made it to 5:00. Oh, my gosh! Woohoo!! Having a great day so far? Yeah? The weather is beautiful? Awesome. So this is going to be our last talk in this
room, so after this talk is done, I’m going to encourage everyone to head over to the
SitePen Track to watch our last talk of the day, and then there’s an awesome party, of
course, so our last talk is going to be by Andy Ruestow and Bryan Donovan, and this is
kinda cool, actually, they are going to show us how to use TensorFlow to rate yoga poses. So the fun fact for both of them is kind of
funny because it’s like the exact opposite. So Andy actually is going to be the one doing
the yoga and he doesn’t really do yoga. He’s climbed a lot more mountains than he’s
done yoga, but Bryan has done a lot more yoga than he’s climbed mountains, so they complement
each other very nicely. So let’s give it up for Bryan and for Andy. [applause] Hey, everybody, thanks for coming.>>Hello. My computer is locked here, so you can watch
me type my password. All right, so we’re going to get started here
with the yoga pose and if you’ve been in this room earlier today, you’ve probably seen a
lot of really great presentations. David just gave a really great one about Imposter
Syndrome which I think we’re both feeling a little bit right now.>>Absolutely.>>And he also mentioned some things about
confirmation bias, and here’s a quick little story about that. I’ve got a wife and two kids, and they were
lucky enough to come along with us on the trip here today. And any time I go on a work conference, my
wife calls it a work vacation. So I always explain, no, it’s — we’re actually,
we’re learning, it’s, you know, very dedicated, we’re spending a lot of time learning new technologies,
not vacationing, so when we’re driving up to the resort yesterday and we saw the
palm trees and the beautiful sun, I think the confirmation bias set in and this is in
fact a work vacation.>>She needs a vacation.>>Well. She’s at the beach today.>>So just a little bit about Bryan and myself. Bryan is a software lead at Lockheed Martin. He’s been developing software for pretty much
his whole life but over the last 20 years, really as a lead developer, chief architect
of a lot of cool systems. He is really the driving force, like the conductor
and really the engine to drive a lot of our programs forward and that’s where my train
analogy runs out of steam. Myself, I’m a DevOps tech lead which means
that I’m not good at development and not good at operations, but what I’m really good at
is enabling developers to be the best at what they do. Humans are at their best being
creative and solving problems, and a thing we’re not really good
at is doing repetitive tasks over and over again. I take a lot of joy in automating all of those
things, so that the creative people, especially like Bryan, can do what they’re best at.>>Yeah, DevOps makes our life better, absolutely.>>Andy: A little bit more on the personal
side. And to highlight how we’re opposites. I live in Upstate New York and that means
that I enjoy winter and I have two small humans as roommates and like I mentioned, they’re
at the beach today. They say that you should try everything at
least once in life just to see what it’s like. I’ve done the science on this next one, you
guys — don’t get hit by a car. It’s not so much fun. And I am primarily a carnivore.>>Bryan: For myself, I live in Los Angeles,
where we experience zero months of winter each year. I live near a beach, Venice Beach, I have
two roommates, wife and dog, and I’ve been hit by zero cars, a much better experience. Herbivore, so no meat, and what we can both
agree on is beer and coffee.>>Andy: I think that’s what makes us such
great friends.>>Bryan: That’s right.>>Andy: So why are you here? Primarily to see us make fools of ourselves. Bryan called me up a few months ago and said hey,
why don’t you come to Southern California and go to a conference and a couple weeks
later I said, hey, Bryan, why don’t we present at that conference. So a few of the technologies we used for Yoga
Pose. Node.js, because I don’t think they’d let us in the
door if we didn’t use it.>>React, TensorFlow, one of the fun toys that
we don’t get to play with a whole lot in our day jobs. Some pose estimation that happens in real
time in the browser and then some of the deployment things that I find pretty interesting.>>Bryan: So React, yes, the one with hooks,
really great new tech in 16.8. I’m sure you’ve seen a lot of those, and great, we’re
moving over to hooks, we’re going to use hooks for this whole thing. Look how great useState is, and then I’m off
using useEffect, and then I want to share some global state and now I’m using useContext,
and then I have this mutable thing that I want to keep track of and so now I’m into
useRef, and then you want to use setInterval, so you’re using a custom hook called useInterval. And it’s super interesting. Dan wrote a blog post on how it can be used
instead of setInterval.>>Yeah, it’s really interesting how the underlying
hook technology allows you to take the ones that React has published and use them. But also extend them in ways that you can
be creative about and find new uses for, like the one that Dan published for handling
intervals. TensorFlow, how many people have heard of
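Dan’s useInterval pattern is worth sketching. This is a rough reconstruction of the hook from his “Making setInterval Declarative with React Hooks” post; the two one-line stand-ins for useRef and useEffect exist only so the sketch is self-contained outside React, and in a real app you would import the real hooks from react:

```javascript
// Stand-ins so this sketch runs outside React. Real React tracks effect
// dependencies and calls cleanup functions; these deliberately do not.
const useRef = (initial) => ({ current: initial });
const useEffect = (setup) => setup();

function useInterval(callback, delay) {
  const savedCallback = useRef();

  // Point a ref at the latest callback so the running interval
  // never calls a stale closure.
  useEffect(() => {
    savedCallback.current = callback;
  });

  // (Re)create the interval when `delay` changes; a null delay pauses it.
  useEffect(() => {
    if (delay === null) return;
    const id = setInterval(() => savedCallback.current(), delay);
    return () => clearInterval(id);
  });
}

// With a null delay, no timer is ever created:
let ticks = 0;
useInterval(() => ticks++, null);
```

In the real hook, both effects take dependency arrays (`[callback]` and `[delay]`), which is what makes the interval restart only when the delay actually changes.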
TensorFlow? Several. How many have used it before? Fewer, great. So TensorFlow.js is cool. It allows you to train and deploy models in
a browser or in a node environment. I think it makes machine learning easy. We work with a lot of data scientists, and
they’re really big nerds and they make machine learning actually easy; the things that
they do are really impressive, and it makes our lives as front-end and application
developers easier: we can take what they’ve done and really just implement it, so it was fun
for us to jump into TensorFlow and actually start playing with some of the models, creating
our own and seeing how we can work with them. Another great thing about TensorFlow is the
docs. There’s a handful of tools that I think have
really amazing docs. Docker is one of them. TensorFlow has really great docs. Bootstrap is amazing at that stuff. So when a tool has really great docs, I think
it makes it easier to understand it or at least how to use it.>>Bryan: Specifically out of TensorFlow there’s
a model called PoseNet, and what it’s used for is determining pose information in
real time: you feed it a video or an image, and what that model will give back to you
are 17 different body parts. It will show you where somebody’s eyes are, where the nose
is, the shoulders, and you can then take that pose and basically run your algorithm against those
17 different body parts to do whatever you want. In our case, we’re going to be scoring some
yoga poses, but the pretty cool thing about PoseNet, where we get to see how far the browser
has come, is that it runs completely in the browser, so there isn’t any data being
sent out to some external server. That pose estimation, that neural network,
is all running exclusively in your browser. There’s a convolutional neural net
under the hood that determines, oh, a human is in this image, these
are the parts of that human, and then maps
out the X-Y coordinates that you can then use. Andy: So out of the box, PoseNet ships with
a handful of models. There’s a mobile model. It gives a lot less accuracy when coming up
with the key point positions and the confidence in each of those. There’s a beefier one, called ResNet50, which
I really wanted to work, but it really crushed the processor, to the point where we were
getting like 10 to 15 frames a second, which sucks, so we couldn’t go with that one. So we used MobileNet v1. Then like Bryan mentioned, out of the box
it can detect multiple people which is kind of cool and could have some implications for
different ways that you could use PoseNet. So let’s have some fun with the first toy
that we created. Face detection and tracking so here’s Will
with some really cool glasses on. Bryan, I don’t know if you want to step out
and show some code here.>>Bryan: Absolutely. So basically, here …
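Before Bryan’s walkthrough, here is a hedged sketch of what that code is working with. The load options follow the @tensorflow-models/posenet v2 README as we understand it; the pose object’s numbers below are made-up sample values, and only its field names (score, keypoints, part, position) come from PoseNet:

```javascript
// Loading PoseNet (assumes `import * as posenet from '@tensorflow-models/posenet'`;
// option names per the tfjs-models PoseNet README -- treat them as assumptions):
async function loadNet() {
  return posenet.load({
    architecture: "MobileNetV1", // "ResNet50" is beefier but much slower
    outputStride: 16,
    inputResolution: { width: 640, height: 480 },
    multiplier: 0.75,            // MobileNet-only accuracy/speed knob
  });
}

// What a single estimated pose looks like: an overall confidence plus 17
// keypoints, each with its own score and x/y position. Sample values only.
const pose = {
  score: 0.92,
  keypoints: [
    { part: "nose",     score: 0.99, position: { x: 301.4, y: 122.8 } },
    { part: "leftEye",  score: 0.98, position: { x: 312.0, y: 110.5 } },
    { part: "rightEye", score: 0.97, position: { x: 290.1, y: 111.2 } },
    // ...14 more parts: ears, shoulders, elbows, wrists, hips, knees, ankles
  ],
};

// Pulling one named part out of the keypoint list:
const getPart = (p, name) => p.keypoints.find((kp) => kp.part === name);
```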
>>Andy: Did you hit the go button? You have to — perfect. There we go.>>Bryan: Thank you. Basically you see we’re grabbing that pose; the
pose consists of those 17 points that we talked about before, and we’re basically
going to go into that pose, look at the key points and say where is the nose, where is
the eye, and just do some simple math here to determine how wide the glasses are, we’ll
do a ratio of the image that we’re going to be using. We’ll do a calculation so just doing the inverse
tangent of looking to calculate the angle of the face so we can apply the glasses in
the correct orientation and just doing a quick translation based on the nose so we can place
them in the correct spot, applying the rotation based on the angle and we’re going to draw
that image onto the canvas. So what does that look like? Sorry, Andy.>>Andy: Yeah, no worries. So you can see we fully embraced the Saved
by the Bell theme here. Do you want to talk about how you added some
key press magic here?>>Yeah, so basically what we do, there’s
Andy. Bryan: We wrapped the PoseNet model in
React and basically we’re hooking the keyboard, and we can do some simple logic to say we
want to render this canvas on top of the video we’re seeing. This harsh light is a little rough, but — hahahaha
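The glasses-placement math described above can be sketched roughly like this. The keypoint numbers are made-up samples, the 2x width ratio is a hypothetical tuning value, and the canvas calls in the trailing comment show the general translate/rotate/draw pattern rather than their exact code:

```javascript
// Sample eye and nose positions; in the app these come from PoseNet's
// leftEye / rightEye / nose keypoints.
const leftEye  = { x: 340, y: 118 };
const rightEye = { x: 280, y: 112 };
const nose     = { x: 308, y: 140 };

// Angle of the face: inverse tangent of the eye-to-eye slope.
const angle = Math.atan2(leftEye.y - rightEye.y, leftEye.x - rightEye.x);

// Scale the glasses image relative to the eye spacing
// (the 2x ratio here is a made-up tuning value).
const eyeDistance = Math.hypot(leftEye.x - rightEye.x, leftEye.y - rightEye.y);
const glassesWidth = eyeDistance * 2;

// Then, on a canvas 2D context `ctx` with a loaded `glassesImg`, roughly:
//   ctx.translate(nose.x, nose.y);   // move origin to the nose
//   ctx.rotate(angle);               // match the face's tilt
//   ctx.drawImage(glassesImg, -glassesWidth / 2, -glassesHeight,
//                 glassesWidth, glassesHeight);
```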
[applause] Andy: So I think we also learned something
pretty important about users by coming here and being on this stage. It’s super-easy as developers to build something
in a way that you expect your users to interact with it. So for Bryan and I, sitting in an office, sitting
at our desks in our homes …>>Bryan: Nice ambient lighting.
Andy: Yeah, and then you come up here and you have white background and harsh lighting
so I think it’s great to think that users are going to abuse your systems, so here’s
a great example. There we got Bob there. Here’s another one of the technologies that
we used: the Canvas API. This is pretty neat. You’re able to do basic drawing manipulations
just with JavaScript. It’s native to what, all browsers? And there’s a bunch of libraries that wrap
that, that make things easier. I guess we did it the hard way just by manipulating
Canvas directly but there’s always a break point where you realize, do I need this additional
library? Do I not need it? And I think it’s a good practice to try and
do things yourself first, insofar as you can still make progress. As soon as you find yourself running into
a brick wall and the progress is slowed, it’s a great opportunity to bring in some of those
other tools, but because we had this simple use case, we decided to use Canvas just directly. A few other things we did here related to
Canvas and also having access to user media is grabbing the camera that a user might have
available. So there’s a few different ways that that
can be tuned to inspect which devices a user has available to them whether it’s a front-facing
or rear-facing camera. Bryan: So once you have those 17 key points
to actually start being able to understand or score how those body parts
are in relation to something else, we need to turn them into vectors: basically
taking each one of the 17 key points, iterating through those, creating vectors to an
additional key point, and basically we’re creating a bunch of X-Y vectors in that space so that we
can start to say, OK, based on this vector, how does it compare to some other vector
that you’re expecting? And the way we did that is we created basically
like a gold standard of a yoga pose and we said, OK, if we were to look at that yoga
pose, what would it look like in X-Y vector space, and create that model. So there’s an algorithm running that
says let’s compare that vector to another vector, using something called cosine similarity. If you’re at 1, that means you’re perfect:
those vectors are completely on top of one another. If it’s 0, it’s basically
90 degrees away from that, and if it’s -1 it’s 180, so you’re doing exactly the
opposite of what you expect. Andy: Any time you’re talking to a data scientist,
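The cosine similarity Bryan just described fits in a few lines of JavaScript; here is a minimal sketch for 2-D vectors:

```javascript
// Cosine similarity between two 2-D vectors: 1 when they point the same
// way, 0 at 90 degrees, -1 at 180 degrees.
const dot = (a, b) => a.x * b.x + a.y * b.y;
const magnitude = (v) => Math.hypot(v.x, v.y);
const cosineSimilarity = (a, b) => dot(a, b) / (magnitude(a) * magnitude(b));

cosineSimilarity({ x: 1, y: 0 }, { x: 2, y: 0 });  // →  1 (same direction)
cosineSimilarity({ x: 1, y: 0 }, { x: 0, y: 3 });  // →  0 (perpendicular)
cosineSimilarity({ x: 1, y: 0 }, { x: -1, y: 0 }); // → -1 (opposite)
```

Note that the measure ignores vector length entirely, which is exactly why it works for comparing pose directions across people of different sizes.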
and you come up with a model, you can just tell them that it’s a neural net, just use
your brain for it, so …>>Bryan: So this is basically how we did
this. This image has nothing to do with this slide. I just wanted to get a slide of ZachG on there. Bryan: So where PoseNet gives you the pose,
what we’re doing is adding some additional information to that pose. We’re going to
create this list of vectors: we just give it the
index of one part and the index of another part, and then basically define the expected
vector for that pair. So in this case we’re expecting the vector from the
left eye to the right eye to just be one unit in positive X, so essentially we’re
checking that both of your eyeballs are level.>>Andy: Can I jump out here and show an example
of one of the models?>>Bryan: Yeah. So if you’re familiar with mountain pose,
maybe you can demo what a mountain pose.>>Andy: Bonus points if I don’t fall off
the stage.>>Just stand there, literally just stand there.>>I can climb one, I can’t pose one.>>So this is what mountain pose looks like. We’re taking every line here; basically in lines 21-30 we’re saying these are
the expected vectors for the pose, and it’s actually really simple when you define that: basically
you’re looking at how the pose is represented, and it’s basically translation invariant. You’re just looking for the direction of each
vector. Maybe if we can show — do you mind stepping
in front we can show those points? So what you basically get from Posenet —
>>Andy: I am not here. Where did I go? Lost the camera. Well, how about this: Production to the rescue?>>Bryan: Call in the DevOps guy.>>Andy: Might have lost the network here. Can’t use the internet for anything! Well, we’ve lost network Bryan: We’ll come back …
A>>
AUDIENCE: Try Internet Explorer. Andy: Oh, no! Oh, I just died a little. [laughter] AUDIENCE: Check that the camera is still authorized? Andy: Camera still authorized. AUDIENCE: Top right. Andy: Take all my permissions!
[laughter] Well, this is the most disappointing thing
to happen today. Well, let’s go through the slides and then
try and resurrect things. Man, what a fraud. These people came for yoga poses.>>Bryan: That’s right, confidence just plummets. Yeah, so basically for all the math people
out there, this is the cosine similarity scoring that we’re using again. It’s really just taking a look at the angle
between the two vectors, and we go across, so, like in mountain pose, it’s going
to look across those possible 10 vectors, look at each one —
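Putting the pieces together, the scoring pass can be sketched like this. The two-pair model and the keypoints below are simplified, hypothetical stand-ins (the real mountain-pose model has around ten vectors, as Bryan says), and mapping the mean similarity onto 0-10 is our guess at one plausible rescaling, not necessarily the formula they used:

```javascript
const dot = (a, b) => a.x * b.x + a.y * b.y;
const cosSim = (a, b) =>
  dot(a, b) / (Math.hypot(a.x, a.y) * Math.hypot(b.x, b.y));

// Each entry: two part names plus the expected direction between them,
// e.g. "left eye to right eye should point one unit in positive X".
const mountainPoseModel = [
  { from: "leftEye", to: "rightEye", expected: { x: 1, y: 0 } },
  { from: "leftShoulder", to: "rightShoulder", expected: { x: 1, y: 0 } },
];

// Hypothetical keypoints, already indexed by part name:
const parts = {
  leftEye: { x: 100, y: 100 },
  rightEye: { x: 140, y: 100 },
  leftShoulder: { x: 80, y: 160 },
  rightShoulder: { x: 160, y: 160 },
};

function scorePose(model, keypoints) {
  const sims = model.map(({ from, to, expected }) => {
    // Direction actually struck by the body, compared to the expectation.
    const v = {
      x: keypoints[to].x - keypoints[from].x,
      y: keypoints[to].y - keypoints[from].y,
    };
    return cosSim(v, expected);
  });
  const mean = sims.reduce((sum, s) => sum + s, 0) / sims.length;
  return ((mean + 1) / 2) * 10; // map the [-1, 1] mean onto a 0-10 score
}

scorePose(mountainPoseModel, parts); // a perfectly level pose scores 10
```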
>>Andy: The camera is back up?>>Bryan: Oh, it is.>>Andy: And now it’s off. Oh! That is cruel. Yeah, all right. [applause] Bryan: Oh! Andy: Come on! I do not exist! AUDIENCE: Restart the camera. Andy: Different ports, everything. What have we got? [applause] So there’s our key points. Bryan: 17 magical key points. Who knows how they show up. Andy: Well, since the thing is working, why
don’t we go through the vectors. Bryan: If in your poses you want to look across,
say compare the shoulder to the foot, you would have that vector available to you. All right. So I say at this point we need to make Andy
do some yoga poses. AUDIENCE: Yeah! Bryan: So we’re going to be scoring every
pose from 0 to 10, 10 being the best, and we’ll send him through five levels of yoga poses. So first one, Andy, we’ll be starting off
with mountain pose. Good job, 9 on mountain pose, that’s good. Next one is — hands up. Andy: I can do better. Nope, I can’t do better. Bryan: Yoga pose says you can do better. We’re going on to warrior II now. Looking good, Andy, don’t fall off the stage. Three. Good, get that back foot up a little bit. Oh, looking good, looking good. And then chair pose. Oh, nice chair pose, maybe bring those arms
up a little bit more. Lean forward a little bit more? Andy: Nope! Bryan: All right, 38!
[applause] So 38 is pretty good. Wonder if anybody could beat 38? Andy: Anyone feel like they can beat 38? Any volunteers?>>I can do it. So what do I do first? Mountain? Mountain pose. Yes, please, watch the edge. And next up will be warrior I. Oh, that’s
nice warrior. I do yoga. All right, next up warrior II. Looking good. Andy still might be in the lead somehow. Doesn’t make any sense. It’s these pants! Bryan: All right, warrior 3.>>I don’t want to fall!
[laughter] Bryan: Looking good, and now chair pose. Utkatasana
Oh, looking good. Pulled two points out of it.>>Andy: I think she was handicapped there.>>Bryan: Thank you. Thanks, Katie.>>Andy: Thank you, Katie.>>Bryan: We coded it to make Andy win, by
the way. Andy: That’s how to boost my self-confidence,
in case you don’t think I’m so good. Oh, sure, so a few things here about design. You can tell, from the slides to the actual application,
we tried to embrace the Saved by the Bell theme. Design gives users clear direction and purpose
about what they’re supposed to be doing. It can convey a mood; you can tell this was
not supposed to be an overly serious application that we built, and it’s — it adds value. When you look at like an application that
doesn’t have a coherent design, it just doesn’t feel as fun to interact with. So because I haven’t learned my lesson —
Bryan: Real quick, while Andy brings this up: one of our graphic designers, Eian, did all the
work, so a big shout-out to Eian. Andy: Yeah, he did a great job. Bryan: This is what you get when you ask a developer to
build you something. Andy: And of course the camera won’t work,
but as you can tell, there’s absolutely no design here. We had a canvas displaying the image with
the camera, and it was not fun to interact with. It didn’t let anyone know what to do, and without
that design, you can be quite lost. I’ll cut my losses on that. So some potential next steps we might want
to take if we chose to be even more serious than we are right now. Bryan: We missed “fix the camera” on this list.>>Andy: Yes, support cameras? So we could improve the model. Obviously the PoseNet model we used is
the mobile version, and a lot of the parameters are tweaked just to be able to run on a MacBook. Considering the wide array of devices that
users are likely to use, you want to be able to accommodate them as much as possible. We could up our modeling game on the actual
yoga poses by doing some real machine learning in TensorFlow on known good yoga poses and bad
ones, as well. Tests are extremely important. Tim gave an amazing talk this
morning on test-driven development.>>
Bryan: But we have no tests. Andy: None at all. The more costly to a user any change would
be, the more valuable having tests against those changes are. And then some optimization. We again did absolutely none. And, I don’t know, adopt the Peloton model
and get rich off the New Year’s rush. And with that, have a super-nice night, San
Diego. [applause]>>Thank you. Thank you, aren’t computers wonderful? They always work perfectly when we expect
them to. All right. So thank you everyone. This is our last talk of the day, in this
room, so if we could head on over next door, starting at 5:45 is going to be the last talk
of the day. She’s going to talk about who’s not in the
room, how to build better products by engaging hard to reach users and it sounds like it’s
going to be a great talk and I will see you over there. Thank you …