It used to be that computer vision - getting a computer to not only 'see', but also to 'understand', the objects around it - was one of the hardest problems for engineers and researchers to solve. However, with modern advances in neural networks and machine learning, computer vision has become commonplace. Now, Google has shared its latest big advancements in this space, which spell doom for humanity.
In a post on its research blog, Google open sourced the latest iteration of its image captioning system. A cross between computer vision and machine learning, the system analyzes images and then describes what it sees in captions, which are supposed to be easily understandable and as human-like as possible.
We’ve seen previous incarnations of such systems from both Google and Microsoft, not to mention other companies in the IT space. Microsoft even turned its captioning software into an app, CaptionBot, that’s available on the web.
But according to Google, this latest system is considerably better than previous incarnations. For one thing, accuracy has gone up significantly: the system can now identify and correctly caption images 93.9% of the time, compared to the original version, which scored only 89.6%. What’s even more impressive is that the time it takes to train the system by feeding it information has gone down dramatically, by 75%.
In other words, Google’s new system is faster and better at identifying (and hunting down) objects in images. Google released its image captioning system as open source, so you can even use it yourself, but you’ll need to train it first.
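For readers who want to try it, the released code lives in the TensorFlow models repository. The commands below are a rough setup sketch, not an official guide: the exact directory layout, script names, and flags may differ from the current repository, and the dataset and checkpoint paths are placeholders you'd need to fill in yourself.

```shell
# A hypothetical setup sketch for training the open-sourced captioning model.
# Paths, flag names, and the step count are assumptions based on the
# TensorFlow models repo; check the project README for the real values.
git clone https://github.com/tensorflow/models.git
cd models/research/im2txt

# Training assumes you have already downloaded the MSCOCO dataset and a
# pretrained Inception image-recognition checkpoint.
bazel build -c opt //im2txt/...
bazel-bin/im2txt/train \
  --input_file_pattern="${MSCOCO_DIR}/train-?????-of-00256" \
  --inception_checkpoint_file="${INCEPTION_CHECKPOINT}" \
  --train_dir="${MODEL_DIR}/train" \
  --train_inception=false \
  --number_of_steps=1000000
```

Training from scratch is the slow part Google says it has cut by 75%, but it still takes substantial GPU time and disk space for the dataset, so budget accordingly.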
Source: Google Research