This question originally appeared on Quora: How does how-old.net work?
Answer by Eason Wang, Senior Program Manager on Bing, on Quora
I worked directly on this project. The backend code was built by my team in Bing in collaboration with Microsoft Research. To be honest, it was a big surprise to me that this tiny web app went viral. I did some follow-up analysis of why it went viral and wrote a blog post about it on Medium.
Back to the main topic: I want to answer this question in two parts. In the first part, I will talk about how to quickly implement the exact same capabilities in any app. In the second part, I will go a bit deeper and describe the technology itself.
In Bing Image Search we have built best-in-industry image understanding capabilities over the past few years. They were first used in Bing and are quickly expanding to other Microsoft products. Now they are open to all developers: Microsoft Project Oxford Home. To implement the same capability in an app, you can simply call the web API and get all the necessary information back in JSON format. You can give it a try by uploading an image here: Page on www.projectoxford.ai. It returns the data within seconds, including the face coordinates, gender, and age. The Face API is just one of the many features that we have opened up in Project Oxford; the API offers many other core capabilities to enable innovative scenarios. I am very excited to see this internal Microsoft API open to all developers, and I believe it will have a profound impact on the developer world, because previously impossible scenarios are now just one simple web API call away. #HowOldRobot was just one tiny demo to show off these capabilities. It was put together by a single developer from the Azure ML team in just one day.
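As a rough illustration of what "one simple web API call" might look like from Python, here is a minimal sketch. Everything specific in it is an assumption rather than something stated in this answer: the endpoint is a placeholder, and the query parameter names and the subscription-key header are illustrative guesses at how such a service is typically called. Check the Project Oxford documentation for the actual values.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholder: substitute the real face detection endpoint.
ENDPOINT = "https://example.invalid/face/detections"


def build_detect_request(image_url, subscription_key):
    """Build (but do not send) a POST request asking for face attributes.

    The query parameter names and the key header are assumptions made for
    illustration, not values taken from the original answer.
    """
    query = urllib.parse.urlencode({
        "analyzesFaceLandmarks": "true",
        "analyzesAge": "true",
        "analyzesGender": "true",
        "analyzesHeadPose": "true",
    })
    # The image is passed by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        f"{ENDPOINT}?{query}",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": subscription_key,
        },
        method="POST",
    )


def detect_faces(image_url, subscription_key):
    """Send the request and decode the JSON list of detected faces."""
    request = build_detect_request(image_url, subscription_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request is kept separate from sending it so the request can be inspected or tested without network access; `detect_faces` only works once a real endpoint and key are supplied.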
A sample JSON response:
[
{
"faceId": "5af35e84-ec20-4897-9795-8b3d4512a1f9",
"faceRectangle": {
"width": 60,
"height": 60,
"left": 276,
"top": 43
},
"faceLandmarks": {
"pupilLeft": {
"x": "295.1",
"y": "56.8"
},
"pupilRight": {
"x": "317.9",
"y": "59.6"
},
"noseTip": {
"x": "311.6",
"y": "74.7"
},
"mouthLeft": {
"x": "291.0",
"y": "86.3"
},
"mouthRight": {
"x": "311.6",
"y": "88.6"
},
"eyebrowLeftOuter": {
"x": "281.6",
"y": "50.1"
},
"eyebrowLeftInner": {
"x": "304.2",
"y": "51.6"
},
"eyeLeftOuter": {
"x": "289.1",
"y": "57.1"
},
"eyeLeftTop": {
"x": "294.0",
"y": "54.5"
},
"eyeLeftBottom": {
"x": "293.0",
"y": "61.0"
},
"eyeLeftInner": {
"x": "297.8",
"y": "58.7"
},
"eyebrowRightInner": {
"x": "316.0",
"y": "54.2"
},
"eyebrowRightOuter": {
"x": "324.7",
"y": "54.2"
},
"eyeRightInner": {
"x": "312.9",
"y": "60.9"
},
"eyeRightTop": {
"x": "317.8",
"y": "57.7"
},
"eyeRightBottom": {
"x": "317.9",
"y": "63.7"
},
"eyeRightOuter": {
"x": "322.8",
"y": "60.8"
},
"noseRootLeft": {
"x": "304.0",
"y": "60.2"
},
"noseRootRight": {
"x": "312.2",
"y": "61.2"
},
"noseLeftAlarTop": {
"x": "302.6",
"y": "70.2"
},
"noseRightAlarTop": {
"x": "313.0",
"y": "70.0"
},
"noseLeftAlarOutTip": {
"x": "298.8",
"y": "76.2"
},
"noseRightAlarOutTip": {
"x": "315.2",
"y": "76.6"
},
"upperLipTop": {
"x": "307.3",
"y": "84.0"
},
"upperLipBottom": {
"x": "306.6",
"y": "86.4"
},
"underLipTop": {
"x": "305.5",
"y": "89.6"
},
"underLipBottom": {
"x": "304.1",
"y": "94.0"
}
},
"attributes": {
"age": 24,
"gender": "female",
"headPose": {
"roll": "4.0",
"yaw": "31.3",
"pitch": "0.0"
}
}
}
]
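To show how an app might consume a response like the one above, here is a small sketch that pulls out the bounding box, age, and gender for each face. The sample string is a trimmed copy of the response (landmarks omitted), and the function name `summarize_faces` is made up for this example.

```python
import json

# Trimmed copy of the sample response above (faceLandmarks omitted).
SAMPLE_RESPONSE = """
[
  {
    "faceId": "5af35e84-ec20-4897-9795-8b3d4512a1f9",
    "faceRectangle": {"width": 60, "height": 60, "left": 276, "top": 43},
    "attributes": {
      "age": 24,
      "gender": "female",
      "headPose": {"roll": "4.0", "yaw": "31.3", "pitch": "0.0"}
    }
  }
]
"""


def summarize_faces(response_text):
    """Return one small dict per detected face."""
    faces = []
    for face in json.loads(response_text):
        rect = face["faceRectangle"]
        attrs = face["attributes"]
        faces.append({
            # Box as (left, top, right, bottom) in pixels, handy for drawing.
            "box": (
                rect["left"],
                rect["top"],
                rect["left"] + rect["width"],
                rect["top"] + rect["height"],
            ),
            "age": attrs["age"],
            "gender": attrs["gender"],
            # Head pose values arrive as strings in the sample, so convert.
            "yaw": float(attrs["headPose"]["yaw"]),
        })
    return faces


for face in summarize_faces(SAMPLE_RESPONSE):
    print(face["gender"], face["age"], face["box"])
```

Note that the response is a list, so an image with several people yields one entry per face, which is what lets a demo like #HowOldRobot label everyone in a group photo.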
How Old Do I Look? relies mainly on three key technologies: face detection, gender classification, and age detection. Face detection is the foundation for the other two. Age detection and gender classification are classic regression and classification problems in machine learning, respectively. They involve facial feature representation, collecting training data, building regression and classification models, and model optimization. There are plenty of publications in this area. Let me know if you are interested in going deeper.
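To make the regression versus classification distinction concrete, here is a toy sketch using k-nearest neighbours. This is not the method behind how-old.net; it is a stand-in chosen for brevity. The two-number feature vectors and the training labels are entirely made up; in a real system the features would come from a facial feature extractor applied to the detected face.

```python
import math
from collections import Counter

# Made-up training set: (feature_vector, age, gender).
# The numbers are purely illustrative, not real measurements.
TRAIN = [
    ((0.10, 0.80), 8, "female"),
    ((0.15, 0.20), 10, "male"),
    ((0.45, 0.85), 30, "female"),
    ((0.50, 0.15), 34, "male"),
    ((0.85, 0.90), 62, "female"),
    ((0.90, 0.10), 66, "male"),
]


def nearest(features, k):
    """Return the k training examples closest to `features` (Euclidean)."""
    return sorted(TRAIN, key=lambda ex: math.dist(ex[0], features))[:k]


def predict_age(features, k=2):
    """Regression: predict a number by averaging the neighbours' ages."""
    neighbours = nearest(features, k)
    return sum(age for _, age, _ in neighbours) / len(neighbours)


def predict_gender(features, k=3):
    """Classification: predict a label by majority vote of the neighbours."""
    votes = Counter(gender for _, _, gender in nearest(features, k))
    return votes.most_common(1)[0][0]
```

The same feature vector feeds both predictors: the age model outputs a continuous value, while the gender model outputs one of a fixed set of labels. Production systems would use far richer features, much more training data, and stronger models, but the split between the two problem types is the same.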
Beyond that, deep learning and large-scale data understanding have led to new breakthroughs in image understanding. This opens the door to more intelligent systems and APIs. You can check out my latest blog post to understand how the Image Graph works to power advanced scenarios.
http://blogs.bing.com/search-qua...
Source: Quora