Image recognition is one of the buzzwords of our times. Facebook has acquired Face.com, whose flagship product was a facial recognition app, and invested heavily to better suggest who's who in the images uploaded to the service by over a billion users. Google+ Photos uses image recognition technology to make photos searchable, and Google Goggles is a powerful new tool from the Mountain View giant that lets you find information about objects not by typing their names, but simply by "showing" Google what they look like.
Not as easy as you may think
Our mobile team at Goyello has recently faced a challenge of a similar type (on a smaller scale, though): how to recognize whether a photograph of a solved puzzle set matches one of the predefined templates. A task that seems so easy for the human brain is in fact a complex feat, and even highly sophisticated algorithms still fall short of what any of us humans is capable of. But the power of today's computers is tremendous and constantly growing, so we started looking for solutions to our conundrum, with two constraints. First, we don't want to send the image to a server for matching, so the algorithm must perform acceptably on a mobile device. Second, we'd love the code to be open source, so that we can freely play with it and adapt it to our specific demands.
After some research we found OpenCV, an open-source library of real-time image processing and manipulation functions, initially developed by Intel in C++ and since ported to the most popular mobile platforms. Being both BSD-licensed and native on Android/iOS, it seemed a perfect fit, so we started a feasibility analysis right away.
OpenCV includes methods useful for matching two images that produce similarity values as a result, as shown in the following sample:
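To make the idea concrete, here is a minimal pure-Python sketch of one such similarity measure, normalized cross-correlation: the score behind OpenCV's `cv2.matchTemplate` with the `TM_CCORR_NORMED` method. In the real app you would call the OpenCV function on full-size images; the toy 2x2 "images" below are only there to show how the score behaves:

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-sized grayscale
    images (lists of pixel rows): 1.0 for identical images, lower
    values the more they differ."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    num = sum(x * y for x, y in zip(flat_a, flat_b))
    den = (math.sqrt(sum(x * x for x in flat_a))
           * math.sqrt(sum(y * y for y in flat_b)))
    return num / den if den else 0.0

template = [[10, 200], [200, 10]]
photo    = [[12, 190], [205, 15]]   # slightly noisy version of the template
other    = [[200, 10], [10, 200]]   # inverted pattern

print(round(ncc(template, photo), 3))  # close to 1.0
print(round(ncc(template, other), 3))  # noticeably lower
```

What threshold counts as a "match" is a tuning decision; `TM_CCOEFF_NORMED`, which subtracts each image's mean first, is usually more robust to lighting differences than this simpler variant.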
The first of our problems stems from the fact that we are supposed to compare an ideal template with a photo taken with a relatively simple built-in phone camera, in unpredictable lighting conditions: random white balance, an unevenly lit surface, camera noise, poor focus and motion blur. Heavy blur will surely render the whole matching task impossible, so in that case we can only inform the user, based on edge detection, that the image is not sharp enough. Most of the other imperfections can be overcome with additional "cleaning", such as converting to grayscale and balancing the luminance levels.
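One common way to implement such a sharpness check is the variance of the Laplacian: a sharp image has strong local intensity changes, so the Laplacian response varies a lot, while a blurred image gives a nearly flat response. With OpenCV this is a one-liner (`cv2.Laplacian(img, cv2.CV_64F).var()`); below is a pure-Python sketch of the same idea, where the cut-off value would of course need tuning on real photos:

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over the interior pixels
    of a grayscale image (list of pixel rows). Low values suggest blur."""
    h, w = len(img), len(img[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1]
                   - 4 * img[y][x])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

sharp   = [[0, 0, 255, 255]] * 4   # hard edge down the middle
blurred = [[0, 85, 170, 255]] * 4  # same edge, smoothed out

print(laplacian_variance(sharp) > laplacian_variance(blurred))  # True
```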
To avoid cheating
The second problem looks even more challenging: every puzzle set comes with a template, so a user could "cheat" the app, e.g. by shooting the box cover rather than the solved puzzle. What helps us is the fact that the template on the cover is slightly spoilt by an overlay image on one of its sides, but that alone may not be enough for an algorithm to tell the box from the real puzzles. But look at what we found in the images below:
See the pattern? Voilà, edge detection is the way to go! At least in our "lab" environment, with lighting far from perfect, detecting the edges makes all the difference between a "flat" box image and a "bumpy" puzzle set.
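The intuition can be sketched as an edge-density measure: run a tiny gradient filter and count the fraction of pixels above a threshold. A bumpy puzzle surface produces many short edges along the piece borders, while a flat box cover yields far fewer. In the app this would be done with a proper detector such as `cv2.Canny`; the threshold below is a hypothetical value for the toy data, not a tuned one:

```python
def edge_density(img, threshold=50):
    """Fraction of pixels whose forward-difference gradient magnitude
    (L1 norm) exceeds the threshold, for a grayscale image."""
    h, w = len(img), len(img[0])
    edges = 0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]
            gy = img[y + 1][x] - img[y][x]
            if abs(gx) + abs(gy) > threshold:
                edges += 1
    return edges / ((h - 1) * (w - 1))

flat  = [[120] * 8 for _ in range(8)]                  # even box surface
bumpy = [[120 if (x // 2 + y // 2) % 2 == 0 else 40    # 2x2 "pieces"
          for x in range(8)] for y in range(8)]

print(edge_density(flat))   # 0.0
print(edge_density(bumpy))  # well above zero
```

Comparing the density against a calibrated cut-off is then enough to flag a suspiciously flat shot.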
So far, so good; it looks like we can see the light at the end of the tunnel. Stay tuned if you want to find out what comes next!