In this digest you can hear Donald Trump sing about Obama, learn how machine learning can help autism screening, get the latest updates from the ICML and CVPR conferences, and much more.
Presenting "Obama Leaves", sung by Obama, Trump, Clinton
Style transfer is the technique of recomposing images in the style of other images. It takes traits from one piece of art, like the brushstrokes of a painting, and applies them to another image. It's the software behind the popular photo app Prisma and Twitter bots like the now-defunct DeepForger.
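Under the hood, apps like Prisma build on the neural style transfer idea of matching the correlation statistics of CNN feature maps: "style" is captured by a Gram matrix of channel correlations, independent of where things are in the image. A minimal NumPy sketch of the style loss, using random arrays as stand-ins for real CNN activations:

```python
import numpy as np

def gram_matrix(features):
    """Correlations between feature channels; captures texture-like
    'style' statistics independent of spatial layout."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

# Toy feature maps standing in for CNN activations of two images.
rng = np.random.default_rng(0)
style_feats = rng.standard_normal((8, 16, 16))
generated_feats = rng.standard_normal((8, 16, 16))

# Style loss: mismatch between the two images' feature correlations;
# style transfer optimizes the generated image to drive this down.
style_loss = np.mean(
    (gram_matrix(style_feats) - gram_matrix(generated_feats)) ** 2)
```

In the full method this loss is summed over several CNN layers and combined with a content loss, then minimized by gradient descent on the generated image's pixels.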
The same idea can be applied to speech. WowTune VR has developed a technology that uses sound processing and machine learning to turn speech into singing. And like any good breakthrough in 2016, they applied it to Donald Trump (see below).
International Conference on Machine Learning (ICML)
The International Conference on Machine Learning (ICML) and Computer Vision and Pattern Recognition (CVPR) conferences ran back-to-back this year. Whereas ICML tends to focus on fundamental research in an intimate setting, CVPR is all about applied research. Both featured copious amounts of deep learning applied to many different areas.
Kaiming He’s Deep Residual Networks tutorial revealed tips and tricks for training ultra-deep residual networks. First presented at ICCV 2015, where his team won the ImageNet competition, Kaiming kept the talk engaging by cutting to a video demonstrating a layer-by-layer variance visualization. CVPR would later confirm how important ResNets are: practitioners must put them into practice to obtain leading results.
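The core trick that makes these ultra-deep networks trainable is the identity shortcut: each block computes y = x + F(x), so the signal (and its gradient) always has an unimpeded path past the learned transformation F. A toy NumPy sketch of a single residual block (illustrative only, not He et al.'s exact architecture, which also uses convolutions and batch normalization):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(x + F(x)) with F a small two-layer transform.
    The '+ x' shortcut lets gradients flow through unchanged."""
    return relu(x + w2 @ relu(w1 @ x))

rng = np.random.default_rng(0)
x = rng.standard_normal(64)
# Near-zero weights: the block then behaves almost like the identity,
# so stacking hundreds of them cannot easily degrade the signal.
w1 = 0.01 * rng.standard_normal((64, 64))
w2 = 0.01 * rng.standard_normal((64, 64))
y = residual_block(x, w1, w2)
```

Because a block with small weights defaults to (approximately) the identity, adding depth starts from "do nothing" rather than from a random transformation, which is what lets 100+ layer networks optimize at all.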
Leon Bottou’s presentation on accelerating DNN training used curvature information from the optimization surface to speed up optimization, descending more quickly where the topology allows. He showed numerical results confirming his theorems that natural gradient methods allow scaling proportional to weights instead of neurons.
Ivo Danihelka presented associative Long Short-Term Memory units (LSTMs), which use complex-valued vectors as a general, low-cost, and parallelizable way of adding memory to LSTMs for longer recall (for instance, improving recall quality through noise reduction from storing multiple copies).
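The underlying idea, sketched here as a generic holographic/associative memory rather than Danihelka et al.'s exact architecture, is to bind each value to a unit-modulus complex key, superpose all bindings into one trace, and retrieve with the conjugate key; storing redundant copies under independent keys averages away the crosstalk noise:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 256

def random_key(d, rng):
    # Unit-modulus complex key: multiplying rotates phases, and
    # multiplying by the conjugate undoes the rotation exactly.
    return np.exp(1j * rng.uniform(0, 2 * np.pi, d))

# Store several key-value pairs superposed in a single trace.
keys = [random_key(d, rng) for _ in range(3)]
values = [rng.standard_normal(d) for _ in range(3)]
trace = sum(k * v for k, v in zip(keys, values))

# Retrieval: the conjugate key recovers its value plus crosstalk
# noise contributed by the other stored pairs.
recovered = (np.conj(keys[0]) * trace).real

# Redundant copies: store the same pairs under C independent key
# sets and average the retrievals; noise shrinks roughly as 1/sqrt(C).
C = 8
retrievals = []
for _ in range(C):
    ks = [random_key(d, rng) for _ in range(3)]
    tr = sum(k * v for k, v in zip(ks, values))
    retrievals.append((np.conj(ks[0]) * tr).real)
avg = np.mean(retrievals, axis=0)
```

The single-copy retrieval is noticeably noisy; the averaged retrieval correlates much more strongly with the stored value, which is the "noise reduction from multiple copies" mentioned above.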
Computer Vision and Pattern Recognition (CVPR)
In contrast to ICML, CVPR focuses more on applications of computer vision, where deep learning techniques lead in image classification, object detection, semantic segmentation, and multi-modal analysis. An excellent collection of open-source tools released at CVPR is available at TensorTalk.
This year brought a few interesting new layer types, such as the sub-pixel convolutional layer from Wenzhe Shi's Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network, Gaussian Conditional Random Fields in many talks, and a dynamic parameter hashing layer.
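The sub-pixel layer's rearrangement step (often called "pixel shuffle") is simple to state: a convolution produces C·r² channels at low resolution, and those channels are reshuffled into an r×-upscaled image with C channels, so upscaling is learned rather than interpolated. A minimal NumPy sketch of the rearrangement, following the usual (channel, height, width) layout:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r^2, H, W) feature map into (C, H*r, W*r):
    each group of r^2 channels fills one r x r sub-pixel block."""
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)    # split channels into an r x r grid
    x = x.transpose(0, 3, 1, 4, 2)  # interleave: (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)

# 4 channels at 2x2 resolution -> 1 channel at 4x4 resolution (r=2).
x = np.arange(4 * 2 * 2, dtype=float).reshape(4, 2, 2)
y = pixel_shuffle(x, 2)
```

In the full super-resolution network this layer sits at the very end, so all convolutions run at low resolution, which is what makes the method real-time.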
Wonmin Byeon presented PyraMiD-LSTMs (first shown at NIPS) which had a warm reception here. PyraMiD-LSTMs have one of the top results on medical image segmentation and are highly parallelizable for 3D scans.
LocNet is used to improve bounding-box accuracy. Next time you’re inspecting an object detection demo, pay attention to the tightness of the bounding boxes!
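That "tightness" is usually quantified by intersection-over-union (IoU) between a predicted box and the ground-truth box, the standard localization metric that refinement methods like LocNet aim to push up. A minimal sketch:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2); 1.0 is a perfect match, 0.0 is no overlap."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)
```

Detection benchmarks typically count a prediction as correct only when IoU exceeds a threshold such as 0.5 (or the stricter 0.7), which is why a slightly loose box can sink a detector's score.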
Do we need a new Turing test?
A study published in the Journal of Experimental and Theoretical Artificial Intelligence exposed a major flaw in the standard test for AI—the Turing test. The test, devised in 1950 by pioneering computer scientist Alan Turing, assesses a machine’s ability to exhibit intelligent behaviour indistinguishable from that of a human. Also known as the "imitation game", it requires a human judge to converse with two hidden entities, a human and a machine, and then determine which is which.
However, authors Kevin Warwick and Huma Shah of Coventry University argue that if a machine were to "take the Fifth Amendment" (that is, exercise the right to remain silent throughout the test), it could potentially pass the test and thus be regarded as a thinking entity. If that is the case, any silent entity could pass, even one clearly incapable of thought.
"This begs the question: what exactly does it mean to pass the Turing test?" asked Kevin Warwick. "Turing introduced his imitation game as a replacement for the question 'Can machines think?' and the end conclusion of this is that if an entity passes the test then we have to regard it as a thinking entity."
"However, if an entity can pass the test by remaining silent, this cannot be seen as an indication it is a thinking entity, otherwise objects such as stones or rocks, which clearly do not think, could pass the test. Therefore, we must conclude that 'taking the Fifth' fleshes out a serious flaw in the Turing test."
Using machine learning to improve autism screening and diagnostic instruments
Researchers from the USC Signal Analysis and Interpretation Laboratory (SAIL) at the USC Viterbi School of Engineering's Ming Hsieh Department of Electrical Engineering, along with autism research leaders Catherine Lord (of Weill Cornell Medical College) and Somer Bishop (of the University of California, San Francisco), are applying machine learning to the output of early diagnostic tests to improve the efficiency and effectiveness of autism detection. One of the fundamental questions driving this research, said co-author and SAIL director Shri Narayanan, was: "How can we support and enhance experts' decision-making beyond human capability? How can we make sense of data and patterns not able to be detected by a single person?"
Study authors Daniel Bone, Somer Bishop, Matthew P. Black, Matthew Goodwin, Catherine Lord and Shrikanth S. Narayanan looked at two established instruments: the Autism Diagnostic Interview-Revised (ADI-R) and the Social Responsiveness Scale (SRS), both exams in which parents are interviewed about their children's behaviors. The scholars then applied machine learning techniques to analyze how parents' responses on individual items and combinations of items matched up with the child's overall clinical diagnosis of ASD vs. non-ASD. By using machine learning to analyze thousands of caregiver responses, the researchers identified redundancies in the questions asked of caregivers. By eliminating these redundancies, the authors found five ADI-R questions that appeared capable of maintaining 95% of the instrument's performance.
Flow diagram of machine learning-based algorithm development (Source). First, an ML classifier is used to design an algorithm that maps Instrument Codes to BEC Diagnoses; this is the training phase. It requires a set of data independent of the held-out portion used for testing (evaluation). In testing, a Predicted BEC Diagnosis is derived from Instrument Codes and then compared to the previously known BEC Diagnosis.
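The general recipe, sketched below on synthetic data (not the authors' actual pipeline, classifier, or real ADI-R codes), is: train a classifier on one split of the data, evaluate on a held-out split, then check how much predictive performance survives when redundant items are pruned to a handful of the most informative ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for instrument codes: a few informative items,
# redundant near-copies of them, and many uninformative ones.
n = 600
info = rng.standard_normal((n, 5))
y = (info.sum(axis=1) > 0).astype(float)  # stand-in "BEC diagnosis"
X = np.hstack([info,
               info + 0.1 * rng.standard_normal((n, 5)),  # redundant items
               rng.standard_normal((n, 30))])             # noise items

# Training phase on one split; evaluation on held-out data.
X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]

def fit_logreg(X, y, steps=2000, lr=0.1):
    """Plain gradient-descent logistic regression (no regularizer)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return float(np.mean(((X @ w) > 0) == (y == 1)))

acc_full = accuracy(fit_logreg(X_tr, y_tr), X_te, y_te)

# Prune to the 5 items most correlated with the diagnosis, mirroring
# the redundancy-elimination step; most predictive power survives.
corr = np.abs([np.corrcoef(X_tr[:, j], y_tr)[0, 1]
               for j in range(X.shape[1])])
top5 = np.argsort(corr)[-5:]
acc_short = accuracy(fit_logreg(X_tr[:, top5], y_tr),
                     X_te[:, top5], y_te)
```

On data like this the five-item model retains most of the full model's held-out accuracy, the same qualitative effect the study reports for the shortened ADI-R.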
The authors also believe they can use machine learning to provide another lens on autism, offering a picture that is clearer, more distilled, and overall more data-informed for caregivers and practitioners. This, the authors believe, could be revolutionary in that it "takes out the guesswork or subjectivity involved even in trusted, industry-wide instruments."
Google's DeepMind AI, the same one that beat Go master Lee Sedol, is now working towards more practical ends: it is using its virtual brains to cut electricity costs at Google data centers by more than 40%. DeepMind accomplished this by taking the historical data already collected by thousands of sensors within the data center (e.g., temperatures, power, pump speeds, setpoints) and using it to train an ensemble of deep neural networks to optimize a metric called Power Usage Effectiveness. (DeepMind Blog)
Honda and SoftBank are partnering up to develop an artificial intelligence that can converse with drivers and assist them in a way that seeks to harmonize mobility with people, so that drivers can feel a kind of friendship with their vehicles. Honda's walking robot Asimo (left), first shown in 1996, walks, runs, dances and grips things. SoftBank's Pepper (right), which went on sale last year, doesn't have legs but is programmed to recognize mood swings in people it interacts with.
Nvidia combines eye-tracking with an understanding of peripheral vision in the human eye to sharpen virtual worlds and improve rendering in virtual reality. (via TechnologyReview)
Microsoft’s new Pix app applies machine learning algorithms to automatically select and edit the best picture from a sequence of photos, improving the resultant picture. (Microsoft via Yahoo)
The White House issued a call to technologists to leverage big data, artificial intelligence, and machine learning to resolve the United States' incarceration crisis (via DigitalTrends).
A Tesla blog post describes the first fatality involving a self-driving system. A Tesla was driving on Autopilot down a divided highway; like Nissan's system, Autopilot can maintain a set speed and keep the car within its lane. Suddenly a white truck crossed the highway perpendicular to the car (for unknown reasons). Undetected by the Tesla's vision system, the car went "under" the truck, so that the windshield was the first part of the Tesla to hit the truck body, with fatal consequences for the "driver." Tesla points out that the Autopilot system has driven 130 million miles, while human drivers in the USA have a fatality about every 94 million miles (though it’s a longer interval on the highway). (Tesla Blog)
Google launched two new machine learning APIs into open beta: Cloud Natural Language API for parsing plaintext sentences and Speech API for speech recognition. (via ZDNet)
UK-based FiveAI received a $2.7 million round of funding to build new AI for self-driving cars, promising a more autonomous approach less reliant on pre-made maps. (via VentureBeat)
Former NASA chief Daniel Goldin unveiled KnuEdge, a secretive start-up developing new chips to resolve the von Neumann bottleneck and support new learning algorithms. (NASA via VentureBeat, WSJ, PCWorld)