Monday, November 28, 2016

Highlights and new discoveries in Computer Vision, Machine Learning, and AI (October 2016)

In the latest issue of this monthly digest series you can learn about Google's breakthrough with DeepMind, why you might soon see robocops on the streets of Dubai, how many quadcopters it takes to screw in a lightbulb, and much more.

Google breaks through on AI with DeepMind

DeepMind is now able to learn new things based on knowledge it already has. This brings about a radical change in the playing field for Google’s artificial intelligence.

This kind of artificial intelligence is not only capable of learning. It can even teach itself based on data that it has already acquired. A technology called the Differentiable Neural Computer (DNC) makes this possible. The system combines a neural network, which can rapidly analyze data, with the kind of external memory storage found in traditional computers. As a result, the computer can learn to use its memory to answer questions about complex, structured data, including artificially generated stories, family trees, and even a map of the London Underground.

Illustration of the DNC architecture. The neural network controller receives external inputs and, based on these, interacts with the memory using read and write operations known as 'heads'. To help the controller navigate the memory, DNC stores 'temporal links' to keep track of the order things were written in, and records the current 'usage' level of each memory location.
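To make the caption's vocabulary concrete, here is a heavily simplified sketch of an external memory with a write head, a content-based read head, a usage record, and a write-order record standing in for temporal links. It is an illustration only, not DeepMind's implementation: a real DNC uses soft, differentiable operations that a neural network controller learns to drive, whereas this toy uses hard-coded rules. The class name `ToyMemory`, its methods, and the `sharpness` parameter are all hypothetical.

```python
import math


class ToyMemory:
    """Toy external memory (hypothetical): hard writes, soft content-based reads."""

    def __init__(self, slots, width):
        self.memory = [[0.0] * width for _ in range(slots)]
        self.usage = [0.0] * slots   # 0.0 = free, 1.0 = in use
        self.write_order = []        # crude stand-in for temporal links

    def write(self, vector):
        # Write head: store the vector in the least-used slot (first on ties).
        if min(self.usage) >= 1.0:
            raise RuntimeError("memory is full")
        slot = self.usage.index(min(self.usage))
        self.memory[slot] = list(vector)
        self.usage[slot] = 1.0
        self.write_order.append(slot)
        return slot

    def read_by_content(self, key, sharpness=10.0):
        # Read head: weight every slot by its cosine similarity to the key,
        # then return the weighted blend of all slots plus the weights.
        key_norm = math.sqrt(sum(k * k for k in key))
        scores = []
        for row in self.memory:
            dot = sum(r * k for r, k in zip(row, key))
            norm = math.sqrt(sum(r * r for r in row)) * key_norm
            scores.append(sharpness * dot / max(norm, 1e-8))
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        width = len(self.memory[0])
        blended = [
            sum(w * row[i] for w, row in zip(weights, self.memory))
            for i in range(width)
        ]
        return blended, weights

    def next_written(self, slot):
        # Follow the write order forward: which slot was written right after this one?
        position = self.write_order.index(slot)
        if position + 1 < len(self.write_order):
            return self.write_order[position + 1]
        return None
```

Writing two vectors and then querying with the first one puts most of the read weight on the first slot, while `next_written` recovers the order in which the slots were filled.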

In a DNC, nodes are linked together to form a network that retrieves the stored data needed to perform a given task. Training optimizes these nodes in order to arrive at the best solution. Over time, DeepMind's system accumulates enough data to become more skillful at finding the right answers or solutions than it was before.

The DeepMind team wanted to test DNCs on problems that involved constructing data structures and using those data structures to answer questions. Graph data structures are very important for representing data items that can be arbitrarily connected to form paths and cycles. In a recent Nature paper, the researchers showed that a DNC can learn on its own to write down a description of an arbitrary graph and answer questions about it. Once the stations and lines of the London Underground had been described to it, a DNC could answer questions like, “Starting at Bond street, and taking the Central line in a direction one stop, the Circle line in a direction for four stops, and the Jubilee line in a direction for two stops, at what stop do you wind up?” Or, the DNC could plan routes given questions like “How do you get from Moorgate to Piccadilly Circus?”

DNC was trained using randomly generated graphs (left). After training it was tested to see if it could navigate the London Underground (right). The (from, to, edge) triples used to define the graph for the network are shown below, along with examples of two kinds of task: 'traversal', where it is asked to start at a station and follow a sequence of lines; and 'shortest path' where it is asked to find the quickest route between two stations.
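For readers unfamiliar with the two task types, the sketch below shows what 'traversal' and 'shortest path' mean on a graph defined by (from, to, edge) triples. This is ordinary hand-written graph code, not what the DNC does internally: the point of the paper is that the network learns such behavior from examples rather than being programmed. The station names, line names, and function names are hypothetical, and "shortest" here means fewest stops, which may differ from the quickest route.

```python
from collections import deque

# Hypothetical toy network, not the real Underground map.
TRIPLES = [
    ("A", "B", "red"),
    ("B", "C", "red"),
    ("C", "D", "blue"),
    ("A", "D", "green"),
    ("D", "E", "blue"),
]


def build_graph(triples):
    # Map each station to its outgoing (neighbor, line) pairs.
    graph = {}
    for source, target, line in triples:
        graph.setdefault(source, []).append((target, line))
        graph.setdefault(target, [])
    return graph


def traverse(graph, start, lines):
    # Traversal task: follow the given sequence of lines, one stop each.
    station = start
    for line in lines:
        matches = [target for target, name in graph[station] if name == line]
        if not matches:
            return None  # no such line leaves this station
        station = matches[0]
    return station


def shortest_path(graph, start, goal):
    # Shortest-path task: breadth-first search finds a route with the fewest stops.
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for target, _ in graph[path[-1]]:
            if target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None
```

With this toy graph, `traverse(build_graph(TRIPLES), "A", ["red", "red", "blue"])` ends at station "D", and `shortest_path(build_graph(TRIPLES), "A", "E")` goes through "D" rather than taking the longer red-line route.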

These results are really promising. Now it would be possible to establish a large number of such possible goals in a number of different tasks, and then ask the network to execute the actions that would produce one or another goal state on command. In this case, again like a computer, the DNC could store several subroutines in memory, one per possible goal, and execute one or another. The question of how human memory works is ancient, and our understanding is still developing. The DNC might thus provide both a new tool for computer science and a new metaphor for cognitive science and neuroscience: here is a learning machine that, without prior programming, can organize information into connected facts and use those facts to solve problems.

Source via

Dubai to replace real police with robocops in 2017

Dubai’s government will begin to introduce a “new fleet of intelligent police androids” that will patrol streets, malls and other crowded public spaces in 2017. By the end of the decade, these first-generation “Robocops” are intended to become a permanent part of the Dubai police force, alongside its fleet of McLarens, Lamborghinis and Bugatti Veyron hypercars.

Unlike the fictional, crime-fighting Robocop, though, these androids, made by PAL Robotics, are intended to act more like security guards and public information terminals than hunter-killers on a mission. “The robots will initially interact directly with people and tourists. They will include an interactive screen and microphone connected to the Dubai Police call centers and people will be able to ask questions and make complaints, but they will also have fun interacting with the robots,” said Colonel Khalid Nasser Alrazooqui, head of Dubai’s Smart Policing Unit. According to him, Dubai is planning on upgrading the robots in two to four years so they can interact with civilians without any human intervention. This is part of Dubai's long-term initiative to create the world's most advanced police force by combining artificial intelligence with robotics.


Speech recognition system reached human parity

Speech recognition software isn't perfect, but it is a little closer to human this month, as a Microsoft Artificial Intelligence and Research team reached a major milestone in speech-to-text development: The system reached a historically low word error rate of 5.9 percent, equal to the accuracy of a professional (human) transcriptionist. The system can discern words as clearly and accurately as two people having a conversation might understand one another.
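For context, word error rate (WER) is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the reference transcript into the system's output, divided by the number of words in the reference. A rate of 5.9 percent therefore corresponds to roughly six errors per hundred reference words. The sketch below is a minimal version of that calculation; the function name is hypothetical, and a real evaluation would typically also normalize the text (case, punctuation, and so on) before scoring, which is omitted here.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Word-level edit distance, keeping only the previous row of the table.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

For example, scoring the output "the cat sit on mat" against the reference "the cat sat on the mat" counts one substitution and one deletion over six reference words.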

By using Microsoft’s open-source Computational Network Toolkit, and by being a little bit over-obsessed with the project, the team reached human parity in a matter of months, years ahead of its goal, according to Microsoft's blog. They hit the parity milestone around 3:30 a.m., when Xuedong Huang, the company’s chief speech scientist, woke up to the breakthrough.

It's highly accurate, but still imperfect, much like human transcriptionists might be. The biggest area of disagreement between humans and the system was in more nuanced signals, as the researchers note in their paper (arXiv pre-print): "We find that the artificial errors are substantially the same as human ones, with one large exception: confusions between backchannel words and hesitations. The distinction is that backchannel words like “uh-huh” are an acknowledgment of the speaker, also signaling that the speaker should keep talking, while hesitations like “uh” are used to indicate that the current speaker has more to say and wants to keep his or her turn. As turn-management devices, these two classes of words therefore have exactly opposite functions."

Source via

MSDSE Summit

The Moore/Sloan Data Science Environments (MSDSE) program is an ongoing effort to enhance data-driven discovery by supporting cross-disciplinary academic data scientists at research institutions across the nation. Last month researchers from the University of Washington (UW), New York University (NYU), and University of California, Berkeley (UCB) came together to present their latest research and discuss the potential future of data science at a three-day summit. Researchers presented new developments on open-source software projects (e.g., Jupyter, Julia, ReproZip), addressed the reproducibility crisis, and reported back from workshops such as ImageXD and Data Science for Social Good (DSSG).

An extensive report of the event can be found here.


Joey Tribbiani, a popular character from the TV sitcom "Friends", is being immortalized as an artificial intelligence by researchers from the University of Leeds. The hope is to create a virtual talking avatar, much like today's Siri or Alexa. A first set of algorithms scans through various episodes of the TV show, tracking the different characters along with their body language, voice and facial expressions. Another algorithm then analyzes the scripts to see how the subject puts sentences together. Eventually, the team hopes to use the technique on characters from other TV shows, creating new scenes where they have new conversations. (University of Leeds)

A new US Robotics Roadmap released Oct 31 calls for better policy frameworks to safely integrate new technologies, such as self-driving cars and commercial drones, into everyday life. The document also advocates for increased research efforts in the field of human-robot interaction to develop intelligent machines that will empower people to stay in their homes as they age. (via RoboHub)

Canadian bank RBC announced that it is supporting two new initiatives at the University of Toronto to foster Canada as a leader in machine learning and AI. One of the initiatives is establishing the RBC Research in Machine Learning practice at the Banting Institute at the University of Toronto. RBC is also establishing a partnership with the Creative Destruction Lab, its seed-stage program for science-based companies, which is now home to 50 AI companies. (via BetaKit)

Scientific American spoke with Oren Etzioni, chief executive officer of the Allen Institute for Artificial Intelligence (AI2), at a recent AI conference in New York City, where he voiced his concerns about companies overselling the technology’s current capabilities, in particular deep learning. Etzioni also offered his thoughts on why a 10-year-old is smarter than Google DeepMind’s AlphaGo program, and on the need to eventually develop artificially intelligent “guardian” programs that can keep other AI programs from becoming dangerous. (ScientificAmerican)

Tesla announced that it would build and sell its cars with fully autonomous driving technology, but also said that drivers wouldn’t actually be able to use the technology yet. The company plans to calibrate and improve the system before actually enabling it—so those eager to try out this new piece of tech may have to wait just a little while longer. (via NYTimes)

Last but not least, we finally have an answer to the pestering question of how many quadcopters it takes to screw in a lightbulb:

The answer might surprise you. ;-)