## Wednesday, May 24, 2017

### DEvol - Automated deep neural network design with genetic programming

Joe Davison has made available an interesting implementation of an automated tool for deep neural network design using genetic programming (h/t François). From the page:

## DEvol - Deep Neural Network Evolution

DEvol (DeepEvolution) utilizes genetic programming to automatically architect a deep neural network with optimal hyperparameters for a given dataset using the Keras library. This approach should design an equal or superior model to what a human could design when working under the same constraints as are imposed upon the genetic program (e.g., maximum number of layers, maximum number of convolutional filters per layer, etc.). The current setup is designed for classification problems, though this could be extended to include any other output type as well.
See demo.ipynb for a simple example.

### Evolution

Each model is represented as a fixed-width genome encoding information about the network's structure. In the current setup, a model contains a number of convolutional layers, a number of dense layers, and an optimizer. The convolutional layers can be evolved to include varying numbers of feature maps, different activation functions, varying proportions of dropout, and whether to perform batch normalization and/or max pooling. The same options are available for the dense layers with the exception of max pooling. The complexity of these models could easily be extended beyond these capabilities to include any parameters included in Keras, allowing the creation of more complex architectures.
Below is a highly simplified visualization of how genetic crossover might take place between two models.
Genetic crossover and mutation of neural networks
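
To make the crossover and mutation steps concrete, here is a minimal sketch in plain Python. The gene layout in `GENE_CHOICES`, the single-point crossover, and the mutation rate are all assumptions chosen for illustration; they are not taken from DEvol's source and the real genome encoding may differ.

```python
import random

# Hypothetical fixed-width genome: one slot per hyperparameter,
# each slot drawn from its own list of allowed values.
GENE_CHOICES = [
    [16, 32, 64, 128],               # feature maps in a conv layer
    ["relu", "sigmoid"],             # activation function
    [0.0, 0.25, 0.5],                # dropout proportion
    [0, 1],                          # batch normalization off/on
    [0, 1],                          # max pooling off/on
    ["adam", "adagrad", "rmsprop"],  # optimizer
]

def random_genome(rng):
    """Draw one allowed value for every slot."""
    return [rng.choice(choices) for choices in GENE_CHOICES]

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: prefix from parent_a, suffix from parent_b."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(genome, rng, rate=0.1):
    """Resample each slot independently with probability `rate`."""
    return [
        rng.choice(GENE_CHOICES[i]) if rng.random() < rate else gene
        for i, gene in enumerate(genome)
    ]

rng = random.Random(0)
mom, dad = random_genome(rng), random_genome(rng)
child = mutate(crossover(mom, dad, rng), rng)
print(mom, dad, child, sep="\n")
```

Because every slot has its own list of legal values, any child produced this way still decodes to a valid network, which is the practical appeal of a fixed-width encoding.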

### Results

For demonstration, we ran our program on the MNIST dataset (see demo.ipynb for an example setup) with 20 generations and a population size of 50. We allowed the model up to 6 convolutional layers and 4 dense layers (including the softmax layer). The best accuracy we attained with 10 epochs of training under these constraints was 99.4%, which is higher than we were able to achieve when manually constructing our own models under the same constraints. The graphic below displays the running maximum accuracy for all 1000 nets as they evolve over 20 generations.
Keep in mind that these results are obtained with simple, relatively shallow neural networks with no data augmentation, transfer learning, ensembling, fine-tuning, or other optimization techniques. However, virtually any of these methods could be incorporated into the genetic program.
Running max of MNIST accuracies across 20 generations
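
The "running maximum" in that plot is simply the best accuracy seen so far as models are evaluated one after another (20 generations of 50 models gives the 1000 nets). A small sketch, with made-up accuracy values used only as an example:

```python
from itertools import accumulate

def running_max(accuracies):
    """Best accuracy seen so far after each evaluated model."""
    return list(accumulate(accuracies, max))

# Illustrative values only, not results from DEvol.
accs = [0.91, 0.95, 0.93, 0.97, 0.96]
print(running_max(accs))  # [0.91, 0.95, 0.95, 0.97, 0.97]
```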

### Application

The most significant barrier in using DEvol on a real problem is the complexity of the algorithm. Because training neural networks is often such a computationally expensive process, training hundreds or thousands of different models to evaluate the fitness of each is not always feasible. Below are some approaches to combat this issue:
• Early Stopping - There's no need to train a model for 10 epochs if it stops improving after 3; cut it off early.
• Train on Fewer Epochs - Training in a genetic program serves one purpose: to evaluate a model's fitness in relation to other models. It may not be necessary to train to convergence to make this comparison; you may only need 2 or 3 epochs. However, it is important you exercise caution in decreasing training time because doing so could create evolutionary pressure toward simpler models that converge quickly. This creates a trade-off between training time and accuracy which, depending on the application, may or may not be desirable.
• Parameter Selection - The more robust you allow your models to be, the longer it will take to converge; i.e., don't allow horizontal flipping on a character recognition problem even though the genetic program will eventually learn not to include it. The less space the program has to explore, the faster it will arrive at an optimal solution.
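
The early-stopping idea from the list above can be sketched without any deep learning library. Here `epoch_scores` is a stand-in for the validation accuracy a model would report after each epoch, and `patience` is a hypothetical knob for how many non-improving epochs to tolerate; neither name comes from DEvol.

```python
def train_with_early_stopping(epoch_scores, patience=2):
    """Consume per-epoch scores and stop once `patience` epochs in a row
    fail to improve on the best score. Returns (best_score, epochs_run)."""
    best = float("-inf")
    stale = 0
    epochs_run = 0
    for score in epoch_scores:
        epochs_run += 1
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best, epochs_run

# A model that plateaus after its third epoch out of ten.
scores = [0.90, 0.95, 0.97, 0.97, 0.96, 0.96, 0.96, 0.96, 0.96, 0.96]
print(train_with_early_stopping(scores))  # (0.97, 5)
```

In this toy run half of the training budget is saved, and across hundreds of candidate models that saving adds up.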
For some problems, it may be ideal to simply plug the data into DEvol and let the program build a complete model for you, but for others, this hands-off approach may not be feasible. In either case, DEvol could give you insights into optimal model design that you may not have considered on your own. For the MNIST digit classification problem, we found that ReLU does far better than a sigmoid function in convolutional layers, but they work about equally well in dense layers. We also found that ADAGRAD was the highest-performing prebuilt optimizer and gained insight on the number of nodes to include in each dense layer.
At worst, DEvol could give you insight into improving your model architecture. At best, it could give you a beautiful, finely-tuned model.

### Wanna Try It?

DEvol is pretty straightforward to use for basic classification problems. See demo.ipynb for an example. There are three basic steps:
1. Prep your dataset. DEvol expects a classification problem with labels that are one-hot encoded as it uses categorical_crossentropy for its loss function. Otherwise, you can prep your data however you'd like. Just pass your input shape into GenomeHandler.
2. Create a GenomeHandler. The GenomeHandler defines the constraints that you apply to your models. Specify the maximum number of convolutional and dense layers, the max dense nodes and feature maps, and the input shape. You can also specify whether you'd like to allow batch_normalization, dropout, and max_pooling, which are included by default. You can also pass in a list of optimizers and activation functions you'd like to allow.
3. Create and run the DEvol. Pass your GenomeHandler to the DEvol constructor, and run it. Here you have a few more options such as the number of generations, the population size, epochs used for fitness evaluation, and an (optional) fitness function which converts a model's accuracy into a fitness score.
See demo.ipynb for a basic example.
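
Two pieces of the steps above are easy to illustrate without touching DEvol itself: the one-hot label encoding expected in step 1 and the optional accuracy-to-fitness function from step 3. The sketch below is plain Python; the helper names and the particular fitness formula are assumptions for illustration, and the actual GenomeHandler and DEvol calls are deliberately left to demo.ipynb.

```python
def one_hot(labels, n_classes):
    """Turn integer class labels into one-hot rows."""
    return [
        [1.0 if k == label else 0.0 for k in range(n_classes)]
        for label in labels
    ]

def fitness(accuracy, eps=1e-6):
    """Hypothetical fitness: inverse error, which spreads out models
    whose accuracies are all close to 1."""
    return 1.0 / (1.0 - accuracy + eps)

print(one_hot([0, 2], n_classes=3))  # [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(fitness(0.98) < fitness(0.99))  # True
```

A fitness function of this kind only has to be increasing in accuracy; how strongly it rewards small gains near the top is a design choice.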

## Tuesday, May 23, 2017

### Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk

Credit: NASA, h/t Sarah Horst

Vlad just sent me the following:

Hi Igor,

I'm writing regarding my recent paper with Paul Hand. It's about combining principles of compressed sensing with deep generative priors, which has been previously shown empirically to require 10X fewer measurements than traditional CS in certain scenarios. As deep generative priors (such as those obtained via GANs) get better, this may improve the performance of CS and other inverse problems across the board.

In this paper we prove that the non-convex empirical risk objective for enforcing random deep generative priors subject to compressive random linear observations of the activations of the last layer has no spurious local minima, and for a fixed depth, these guarantees hold at order-optimal sample complexity.

Best,

We examine the theoretical properties of enforcing priors provided by generative deep neural networks via empirical risk minimization. In particular we consider two models, one in which the task is to invert a generative neural network given access to its last layer and another which entails recovering a latent code in the domain of a generative neural network from compressive linear observations of its last layer. We establish that in both cases, in suitable regimes of network layer sizes and a randomness assumption on the network weights, that the non-convex objective function given by empirical risk minimization does not have any spurious stationary points. That is, we establish that with high probability, at any point away from small neighborhoods around two scalar multiples of the desired solution, there is a descent direction. These results constitute the first theoretical guarantees which establish the favorable global geometry of these non-convex optimization problems, and bridge the gap between the empirical success of deep learning and a rigorous understanding of non-linear inverse problems.
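
For readers who want the objective written out, here is a sketch in standard notation. The symbols are chosen here and may differ from the paper's: $G$ is assumed to be a $d$-layer ReLU network with weight matrices $W_1, \dots, W_d$, $A$ is the measurement matrix, and $x_0$ is the latent code to recover.

```latex
G(x) = \mathrm{relu}\bigl(W_d \cdots \mathrm{relu}(W_2\,\mathrm{relu}(W_1 x))\bigr),
\qquad
\min_{x \in \mathbb{R}^k} \; f(x) = \tfrac{1}{2}\,\bigl\| A\,G(x) - A\,G(x_0) \bigr\|_2^2 .
```

Under this reading, the inversion model corresponds to $A = I$ and the compressive model to a random $A$ with fewer rows than the last layer has units. The result quoted above then says that, with high probability, $f$ has a descent direction at every $x$ outside small neighborhoods of two scalar multiples of $x_0$.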

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### A 2.9 TOPS/W Deep Convolutional Neural Network SoC in FD-SOI 28nm for Intelligent Embedded Systems (and a Highly Technical Reference page on Neural Networks in silicon.)

So last night I was talking to Thomas at the STMicroelectronics Techno day at Opera de Paris. He was featuring a recent architecture they are designing, which was presented at the last ISSCC conference.

A 2.9 TOPS/W Deep Convolutional Neural Network SoC in FD-SOI 28nm for Intelligent Embedded Systems by Giuseppe Desoli, Nitin Chawla, Thomas Boesch, Surinder-pal Singh, Elio Guidetti, Fabio De Ambroggi, Tommaso Majo, Paolo Zambotti, Manuj Ayodhyawasi, Harvinder Singh, Nalin Aggarwal

I also discovered a Highly Technical Reference page on Neural Networks in Silicon by Fengbin Tu. The page is here: https://github.com/fengbintu/Neural-Networks-on-Silicon

The page has been added to the Highly Technical Reference Page.


## Monday, May 22, 2017

### Short and Deep: Sketching and Neural Networks

This is a revised version of an earlier preprint.
Short and Deep: Sketching and Neural Networks by Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar
Data-independent methods for dimensionality reduction such as random projections, sketches, and feature hashing have become increasingly popular in recent years. These methods often seek to reduce dimensionality while preserving the hypothesis class, resulting in inherent lower bounds on the size of projected data. For example, preserving linear separability requires Ω(1/γ²) dimensions, where γ is the margin, and in the case of polynomial functions, the number of required dimensions has an exponential dependence on the polynomial degree. Despite these limitations, we show that the dimensionality can be reduced further while maintaining performance guarantees, using improper learning with a slightly larger hypothesis class. In particular, we show that any sparse polynomial function of a sparse binary vector can be computed from a compact sketch by a single-layer neural network, where the sketch size has a logarithmic dependence on the polynomial degree. A practical consequence is that networks trained on sketched data are compact, and therefore suitable for settings with memory and power constraints. We empirically show that our approach leads to networks with fewer parameters than related methods such as feature hashing, at equal or better performance.
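
To give a feel for what a compact sketch of a sparse binary vector looks like, here is a generic hashing sketch in plain Python. It is a Bloom-filter-style illustration, not the construction analyzed in the paper: the number of hash tables, the table size, and the use of SHA-256 as the hash are all assumptions.

```python
import hashlib

def _bucket(seed, index, modulus):
    """Deterministic hash of (seed, index) into range(modulus)."""
    digest = hashlib.sha256(f"{seed}:{index}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus

def sketch(active_indices, size, n_hashes=3):
    """Map the nonzero coordinates of a sparse binary vector to a short
    binary vector made of `n_hashes` tables with `size` buckets each."""
    out = [0] * (n_hashes * size)
    for table in range(n_hashes):
        for index in active_indices:
            out[table * size + _bucket(table, index, size)] = 1
    return out

# A binary vector with three nonzeros in a very large ambient dimension.
x = {3, 1_000_003, 42_000_000}
s = sketch(x, size=16)
print(len(s), sum(s))
```

The sketch length depends on the number of nonzeros and hash tables rather than on the ambient dimension, so a network trained on `s` needs far fewer input weights than one trained on the raw vector.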


## Friday, May 19, 2017

### DeepArchitect: Automatically Designing and Training Deep Architectures - implementation -

So now that Learning to Learn is becoming more trendy thanks to Sundar Pichai, we now need better methods to look around for models, because who has 800 GPUs [1] ! Here is one with an implementation, woohoo !

In deep learning, performance is strongly affected by the choice of architecture and hyperparameters. While there has been extensive work on automatic hyperparameter optimization for simple spaces, complex spaces such as the space of deep architectures remain largely unexplored. As a result, the choice of architecture is done manually by the human expert through a slow trial and error process guided mainly by intuition. In this paper we describe a framework for automatically designing and training deep models. We propose an extensible and modular language that allows the human expert to compactly represent complex search spaces over architectures and their hyperparameters. The resulting search spaces are tree-structured and therefore easy to traverse. Models can be automatically compiled to computational graphs once values for all hyperparameters have been chosen. We can leverage the structure of the search space to introduce different model search algorithms, such as random search, Monte Carlo tree search (MCTS), and sequential model-based optimization (SMBO). We present experiments comparing the different algorithms on CIFAR-10 and show that MCTS and SMBO outperform random search. In addition, these experiments show that our framework can be used effectively for model discovery, as it is possible to describe expressive search spaces and discover competitive models without much effort from the human expert. Code for our framework and experiments has been made publicly available.
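
The simplest of the search algorithms mentioned in the abstract, random search over a tree-structured space, can be sketched in a few lines. The nested-dict representation, the hyperparameter names, and the toy scoring function below are all hypothetical; this is not DeepArchitect's language or API.

```python
import random

# Hypothetical search space: dicts are modules with named hyperparameters,
# lists are choices among alternatives, anything else is a fixed value.
SPACE = {
    "conv_block": {
        "filters": [32, 64, 128],
        "kernel": [3, 5],
        "norm": [{"type": "batchnorm"}, {"type": "none"}],
    },
    "dense": {"units": [128, 256], "dropout": [0.0, 0.5]},
    "optimizer": [
        {"name": "sgd", "lr": [0.1, 0.01]},
        {"name": "adam", "lr": [0.001]},
    ],
}

def sample(node, rng):
    """Walk the tree top-down, resolving every choice at random."""
    if isinstance(node, dict):
        return {key: sample(value, rng) for key, value in node.items()}
    if isinstance(node, list):
        return sample(rng.choice(node), rng)
    return node

def random_search(space, evaluate, n_trials, seed=0):
    """Return the (score, config) pair with the highest score."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = sample(space, rng)
        score = evaluate(config)
        if best is None or score > best[0]:
            best = (score, config)
    return best

def toy_score(config):
    """Placeholder for 'train the model and return validation accuracy'."""
    return config["conv_block"]["filters"] / 128 - config["dense"]["dropout"]

print(random_search(SPACE, toy_score, n_trials=20))
```

MCTS and SMBO would reuse the same tree walk but replace the uniform `rng.choice` with a choice informed by the scores of earlier trials.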

An implementation of DeepArchitect is here: https://github.com/negrinho/deep_architect

[1] ICLR video: Neural Architecture Search with Reinforcement Learning Video starts at 1:08:44 see Learning the structure of learning.


### Soft Recovery With General Atomic Norms

This paper describes a dual certificate condition on a linear measurement operator A (defined on a Hilbert space H and having finite-dimensional range) which guarantees that an atomic norm minimization, in a certain sense, will be able to approximately recover a structured signal v0∈H from measurements Av0. Put very streamlined, the condition implies that peaks in a sparse decomposition of v0 are close to the support of the atomic decomposition of the solution v∗. The condition applies in a relatively general context - in particular, the space H can be infinite-dimensional. The abstract framework is applied to several concrete examples, one example being super-resolution. In this process, several novel results which are interesting on their own are obtained.


## Thursday, May 18, 2017

### Guaranteed recovery of quantum processes from few measurements / Improving compressed sensing with the diamond norm

Jens mentioned this a while back and I am just playing catch-up on this topic before the upcoming massive ArXiv NIPS2017 release. Enjoy !

Guaranteed recovery of quantum processes from few measurements by Martin Kliesch, Richard Kueng, Jens Eisert, David Gross

Quantum process tomography is the task of reconstructing unknown quantum channels from measured data. In this work, we introduce compressed sensing-based methods that facilitate the reconstruction of quantum channels of low Kraus rank. Our main contribution is the analysis of a natural measurement model for this task: We assume that data is obtained by sending pure states into the channel and measuring expectation values on the output. Neither ancilla systems nor coherent operations across multiple channel uses are required. Most previous results on compressed process reconstruction reduce the problem to quantum state tomography on the channel's Choi matrix. While this ansatz yields recovery guarantees from an essentially minimal number of measurements, physical implementations of such schemes would typically involve ancilla systems. A priori, it is unclear whether a measurement model tailored directly to quantum process tomography might require more measurements. We establish that this is not the case. Technically, we prove recovery guarantees for three different reconstruction algorithms. The reconstructions are based on a trace, diamond, and ℓ2-norm minimization, respectively. Our recovery guarantees are uniform in the sense that with one random choice of measurement settings all quantum channels can be recovered equally well. Moreover, stability against arbitrary measurement noise and robustness against violations of the low-rank assumption is guaranteed. Numerical studies demonstrate the feasibility of the approach.

Improving compressed sensing with the diamond norm by Martin Kliesch, Richard Kueng, Jens Eisert, David Gross
In low-rank matrix recovery, one aims to reconstruct a low-rank matrix from a minimal number of linear measurements. Within the paradigm of compressed sensing, this is made computationally efficient by minimizing the nuclear norm as a convex surrogate for rank.
In this work, we identify an improved regularizer based on the so-called diamond norm, a concept imported from quantum information theory. We show that -for a class of matrices saturating a certain norm inequality- the descent cone of the diamond norm is contained in that of the nuclear norm. This suggests superior reconstruction properties for these matrices. We explicitly characterize this set of matrices. Moreover, we demonstrate numerically that the diamond norm indeed outperforms the nuclear norm in a number of relevant applications: These include signal analysis tasks such as blind matrix deconvolution or the retrieval of certain unitary basis changes, as well as the quantum information problem of process tomography with random measurements.
The diamond norm is defined for matrices that can be interpreted as order-4 tensors and it turns out that the above condition depends crucially on that tensorial structure. In this sense, this work touches on an aspect of the notoriously difficult tensor completion problem.
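
As a reminder of the baseline being improved upon, the standard nuclear norm program for low-rank recovery reads as follows. The notation is an assumption made here: $\mathcal{A}$ is the linear measurement map, $y$ the measurements, and $\sigma_i(X)$ the singular values of $X$.

```latex
\min_{X \in \mathbb{R}^{n_1 \times n_2}} \; \|X\|_*
\quad \text{subject to} \quad \mathcal{A}(X) = y,
\qquad
\|X\|_* = \sum_i \sigma_i(X).
```

The approach described in the abstract keeps the same constraint and swaps the nuclear norm for the diamond norm as the regularizer.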


## Wednesday, May 17, 2017

### Thesis (Honors): Improved Genomic Selection using VowpalWabbit with Random Fourier Features, Jiaqin Jaslyn Zhang

Image Credit: NASA/JPL-Caltech/Space Science Institute, Ian Regan

Combining Random Features and Vowpal Wabbit in this honors thesis. Congratulations Jiaqin !

Nonlinear regression models are often used in statistics and machine learning due to greater accuracy than linear models. In this work, we present a novel modeling framework that is both computationally efficient for high-dimensional datasets, and predicts more accurately than most of the classic state-of-the-art predictive models. Here, we couple a nonlinear random Fourier feature data transformation with an intrinsically fast learning algorithm called Vowpal Wabbit or VW. The key idea we develop is that by introducing nonlinear structure to an otherwise linear framework, we are able to consider all possible higher-order interactions between entries in a string. The utility of our nonlinear VW extension is examined, in some detail, under an important problem in statistical genetics: genomic selection (i.e. the prediction of phenotype from genotype). We illustrate the benefits of our method and its robustness to underlying genetic architecture on a real dataset, which includes 129 quantitative heterogeneous stock mice traits from the Wellcome Trust Centre for Human Genetics.
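
The random Fourier feature transformation itself is short enough to sketch in plain Python. This follows the usual recipe for the Gaussian (RBF) kernel $k(x, y) = \exp(-\gamma \|x - y\|^2)$; the choice of kernel, the function names, and the parameter values are assumptions here and are not taken from the thesis.

```python
import math
import random

def make_rff(dim, n_features, gamma, seed=0):
    """Random Fourier features for k(x, y) = exp(-gamma * ||x - y||^2):
    z(x) = sqrt(2 / D) * cos(W x + b), W ~ N(0, 2 * gamma), b ~ U[0, 2*pi]."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def transform(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return transform

phi = make_rff(dim=3, n_features=2000, gamma=0.5)
x, y = [1.0, 0.0, -1.0], [0.5, 0.5, -0.5]
approx = sum(a * c for a, c in zip(phi(x), phi(y)))
exact = math.exp(-0.5 * sum((a - c) ** 2 for a, c in zip(x, y)))
print(approx, exact)
```

The inner product of the transformed vectors approximates the kernel value, so a fast linear learner fed with `phi(x)` behaves like a kernel method at a fraction of the cost.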


## Tuesday, May 16, 2017

### Thesis: Robust Low-rank and Sparse Decomposition for Moving Object Detection: From Matrices to Tensors by Andrews Cordolino Sobral

Here is what Andrews (whom we have followed for a while now) just sent me (Congratulations Dr. Sobral !)
Hi Igor,

First of all, I would like to congratulate you for your excellent blog.
I would like to share with you my thesis presentation about Robust Low-rank and Sparse Decomposition for Moving Object Detection: From Matrices to Tensors. I think this research work may be of interest to your blog. Please find below the slide presentation and the thesis manuscript:
Thesis presentation (SlideShare):
https://www.slideshare.net/andrewssobral/thesis-presentation-robust-lowrank-and-sparse-decomposition-for-moving-object-detection-from-matrices-to-tensors
Thesis manuscript (ResearchGate):
https://www.researchgate.net/publication/316967304_Robust_Low-rank_and_Sparse_Decomposition_for_Moving_Object_Detection_From_Matrices_to_Tensors
Many thanks,

Andrews Cordolino Sobral
Ph.D. on Computer Vision and Machine Learning
http://andrewssobral.wix.com/home
This thesis introduces the recent advances on decomposition into low-rank plus sparse matrices and tensors, as well as the main contributions to face the principal issues in moving object detection. First, we present an overview of the state-of-the-art methods for low-rank and sparse decomposition, as well as their application to background modeling and foreground segmentation tasks. Next, we address the problem of background model initialization as a reconstruction process from missing/corrupted data. A novel methodology is presented showing an attractive potential for background modeling initialization in video surveillance. Subsequently, we propose a double-constrained version of robust principal component analysis to improve the foreground detection in maritime environments for automated video-surveillance applications. The algorithm makes use of double constraints extracted from spatial saliency maps to enhance object foreground detection in dynamic scenes. We also developed two incremental tensor-based algorithms in order to perform background/foreground separation from multidimensional streaming data. These works address the problem of low-rank and sparse decomposition on tensors. Finally, we present a particular work realized in conjunction with the Computer Vision Center (CVC) at Autonomous University of Barcelona (UAB).
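
For context, the classical low-rank plus sparse decomposition that this line of work builds on can be written as the convex program below. The notation is assumed here: $M$ stacks the video frames as columns, $L$ is the low-rank background, $S$ is the sparse foreground, and $\lambda > 0$ balances the two terms. The constrained and tensor variants developed in the thesis modify this basic formulation.

```latex
\min_{L,\,S} \; \|L\|_* + \lambda \,\|S\|_1
\quad \text{subject to} \quad M = L + S .
```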


### Thesis: Fast Algorithms on Random Matrices and Structured Matrices by Liang Zhao

Congratulations Dr. Zhao !

Fast Algorithms on Random Matrices and Structured Matrices by Liang Zhao
Randomization of matrix computations has become a hot research area in the big data era. Sampling with randomly generated matrices has enabled fast algorithms to perform well for some of the most fundamental problems of numerical algebra with probability close to 1. The dissertation develops a set of algorithms with random and structured matrices for the following applications: 1) We prove that using random sparse and structured sampling enables rank-r approximation of the average input matrix having numerical rank r. 2) We prove that Gaussian elimination with no pivoting (GENP) is numerically safe for the average nonsingular and well-conditioned matrix preprocessed with a nonsingular and well-conditioned f-Circulant or another structured multiplier. This can be an attractive alternative to the customary Gaussian elimination with partial pivoting (GEPP). 3) By using structured matrices of a large family we compress large-scale neural networks while retaining high accuracy. The results of our extensive tests are in good accordance with those of our theoretical study.
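
Item 2 can be illustrated on a tiny example: a matrix on which elimination without pivoting fails outright, and the same system after multiplication by a circulant matrix. This is a toy sketch in plain Python with a hand-picked multiplier; the dissertation's results concern random structured multipliers and average-case inputs, which this example does not reproduce.

```python
def f_circulant(first_col, f=1.0):
    """n x n f-circulant matrix with the given first column
    (f = 1 gives an ordinary circulant matrix)."""
    n = len(first_col)
    return [
        [first_col[i - j] if i >= j else f * first_col[n + i - j]
         for j in range(n)]
        for i in range(n)
    ]

def matmul(A, B):
    """Dense matrix product of two lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B)))
         for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def genp_solve(A, b):
    """Solve A x = b by Gaussian elimination with no pivoting.
    Raises ZeroDivisionError when a pivot is exactly zero."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for k in range(n):
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

A = [[0.0, 1.0], [1.0, 0.0]]  # leading pivot is zero, so GENP fails on A
b = [3.0, 5.0]
C = f_circulant([1.0, 2.0])   # multiplier chosen by hand for this example
CA = matmul(C, A)
Cb = [row[0] for row in matmul(C, [[b_i] for b_i in b])]
x = genp_solve(CA, Cb)        # solves (C A) x = C b, hence A x = b
print(x)  # [5.0, 3.0]
```

Calling `genp_solve(A, b)` directly raises `ZeroDivisionError`, whereas the preprocessed system is solved without any row exchanges.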

Image Credit: NASA/JPL-Caltech/Space Science Institute
N00281695.jpg was taken on 2017-05-14 19:21 (PDT) and received on Earth 2017-05-15 06:09 (PDT). The camera was pointing toward Saturn-rings, and the image was taken using the CL1 and CL2 filters.


## Monday, May 15, 2017

### Highly Technical Reference Page: "the GAN Zoo" and "Delving deep into Generative Adversarial Networks (GANs)"

Much like what happened with the Advanced Matrix Factorization Jungle, here is a new Highly Technical Reference page, on a subject of increased interest that is difficult to follow for even a specialist: GANs

If you wonder what GANs are, take a look at the tutorial on Generative Adversarial Networks by Ian Goodfellow (and his NIPS slides) or John Glover's entry last August on the subject (with TF code).

Avinash Hindupur, who is behind deephunt.in, recently listed the long series of GAN techniques in the GAN Zoo. From the page:

Every week, new papers on Generative Adversarial Networks (GAN) are coming out and it’s hard to keep track of them all, not to mention the incredibly creative ways in which researchers are naming these GANs! You can read more about GANs in this Generative Models post by OpenAI or this overview tutorial in KDNuggets.

Avinash also mentions that the list can be expanded:
You can visit the Github repository to add more links via pull requests or create an issue to lemme know something I missed or to start a discussion.
The subject is so hot that there is an earlier and somewhat more complete page on the subject:

Delving deep into Generative Adversarial Networks (GANs) by Grigorios Kalliatakis
A curated list of state-of-the-art publications and resources about Generative Adversarial Networks (GANs) and their applications.....

Contributions are welcome !! If you have any suggestions (missing or new papers, missing repos or typos) you can pull a request or start a discussion.

### Jobs: Four Engineering positions at NVIDIA

Anita just sent me the following just before GTC2017 and subsequent discussion of their three billion dollars investment in the new chip effort (Tesla V100), the Volta Tensor Unit, Inference optimizers and their new cloud. Looks like the ML Hardware is eating the world !
Hi Igor,

Thanks for posting last time! I really appreciate. I have some other exciting roles I wanted to see if you are interested in posting, including a manager role! Thanks!

Thanks! Best Regards,

Anita Rexinger

NVIDIA Corporation


## Friday, May 12, 2017

### Speckle-based hyperspectral imaging combining multiple scattering and compressive sensing in nanowire mats

The paper has been published but it was also just put on arXiv since Optics Letters is a Romeo green journal and allows preprints and even postprints to be archived. Enjoy !

Encoding of spectral information onto monochrome imaging cameras is of interest for wavelength multiplexing and hyperspectral imaging applications. Here, the complex spatio-spectral response of a disordered material is used to demonstrate retrieval of a number of discrete wavelengths over a wide spectral range. Strong, diffuse light scattering in a semiconductor nanowire mat is used to achieve a highly compact spectrometer of micrometer thickness, transforming different wavelengths into distinct speckle patterns with nanometer sensitivity. Spatial multiplexing is achieved through the use of a microlens array, allowing simultaneous imaging of many speckles, ultimately limited by the size of the diffuse spot area. The performance of different information retrieval algorithms is compared. A compressive sensing algorithm exhibits efficient reconstruction capability in noisy environments and with only a few measurements.


## Thursday, May 11, 2017

### Paris Machine Learning Meetup #9 Season 4 @ IHP, Bias, Ethics & Fair Algorithms

video of the streaming is here:

Tonight's meetup will take place at IHP. The evening will be chaired by Cédric Villani.

Thanks to IHP for hosting us and to Quantmetry for sponsoring the closing buffet.
For this evening, we want to focus on the social, ethical, philosophical and legal aspects.

The goal is to shift our point of view as data scientists and step back from the technical side.

Recent developments (Amazon Go, autonomous cars, conversational bots, connected objects, autonomous military drones, predictive sales and policing, ...) show that it is hard to design predictive algorithms without questioning their purpose, their biases, and the social repercussions of these techniques.

Policy makers in every country are looking into these questions, which are becoming central. In France, the OPECST commission has published a report whose recommendations are given here (the full report is at http://www.senat.fr/notice-rapport/2016/r16-464-1-notice.html and http://www.senat.fr/notice-rapport/2016/r16-464-2-notice.html ).

There will be a short introduction to the topic by Franck Bardol and me.

The guests

Program

Each speaker will have the floor for 20 minutes.

Followed by questions and answers with the audience


## Tuesday, May 09, 2017

### Video: "Can the brain do back-propagation?" Geoff Hinton

The talk by Geoff was given a year ago at Stanford. I liked that sentence at 20 minutes and 30 seconds:

...I think it is a good idea trying to always try make the data look small by using a huge model, now this relies on you having more almost free computations...
I added below two papers mentioned in the talk

Learning Representations by Recirculation by Geoffrey E. Hinton, James L. McClelland
We describe a new learning procedure for networks that contain groups of nonlinear units arranged in a closed loop. The aim of the learning is to discover codes that allow the activity vectors in a "visible" group to be represented by activity vectors in a "hidden" group. One way to test whether a code is an accurate representation is to try to reconstruct the visible vector from the hidden vector. The difference between the original and the reconstructed visible vectors is called the reconstruction error, and the learning procedure aims to minimize this error. The learning procedure has two passes. On the first pass, the original visible vector is passed around the loop, and on the second pass an average of the original vector and the reconstructed vector is passed around the loop. The learning procedure changes each weight by an amount proportional to the product of the "presynaptic" activity and the difference in the post-synaptic activity on the two passes. This procedure is much simpler to implement than methods like back-propagation. Simulations in simple networks show that it usually converges rapidly on a good set of codes, and analysis shows that in certain restricted cases it performs gradient descent in the squared reconstruction error.
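To make the two-pass procedure in this abstract a bit more concrete, here is a minimal sketch in plain Python. Several things are my assumptions and not taken from the abstract: the sigmoid units, the bias handled as a weight from a constant activity of 1.0, the mixing weight `lam` used to average the original and reconstructed vectors, the learning rate, the toy one-hot patterns, and the choice of the second-pass visible activity as the "presynaptic" term for the visible-to-hidden weights. Read it as an illustration of the rule "presynaptic activity times the change in post-synaptic activity between the two passes", not as the authors' code.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recirculation(patterns, n_hid=2, lam=0.75, lr=1.0, epochs=300, seed=0):
    """Toy recirculation learner; returns the summed squared
    reconstruction error measured during each epoch."""
    rng = random.Random(seed)
    n_vis = len(patterns[0])
    # The extra column in each matrix is a bias weight (assumption).
    W_vh = [[rng.uniform(-0.5, 0.5) for _ in range(n_vis + 1)]
            for _ in range(n_hid)]
    W_hv = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)]
            for _ in range(n_vis)]

    def layer(W, acts):
        inp = list(acts) + [1.0]  # constant 1.0 drives the bias weight
        return [sigmoid(sum(w * a for w, a in zip(row, inp))) for row in W]

    history = []
    for _ in range(epochs):
        total = 0.0
        for v0 in patterns:
            # First pass: original visible vector goes around the loop.
            h1 = layer(W_vh, v0)
            recon = layer(W_hv, h1)
            # Second pass: an average of original and reconstruction.
            v2 = [lam * a + (1.0 - lam) * r for a, r in zip(v0, recon)]
            h3 = layer(W_vh, v2)
            total += sum((a - r) ** 2 for a, r in zip(v0, recon))
            # Hidden -> visible: presynaptic h1, post-synaptic change v0 - v2.
            for i in range(n_vis):
                for j, pre in enumerate(h1 + [1.0]):
                    W_hv[i][j] += lr * pre * (v0[i] - v2[i])
            # Visible -> hidden: presynaptic v2, post-synaptic change h1 - h3.
            for j in range(n_hid):
                for i, pre in enumerate(v2 + [1.0]):
                    W_vh[j][i] += lr * pre * (h1[j] - h3[j])
        history.append(total)
    return history


if __name__ == "__main__":
    one_hot = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    errors = recirculation(one_hot)
    print("first epoch error:", errors[0], "last epoch error:", errors[-1])
```

Note that no derivative of the activation function and no transposed weight matrix appear anywhere in the updates, which is the sense in which the procedure is simpler than back-propagation.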

Random feedback weights support learning in deep neural networks by Timothy P. Lillicrap, Daniel Cownden, Douglas B. Tweed, Colin J. Akerman
The brain processes information through many layers of neurons. This deep architecture is representationally powerful, but it complicates learning by making it hard to identify the responsible neurons when a mistake is made. In machine learning, the backpropagation algorithm assigns blame to a neuron by computing exactly how it contributed to an error. To do this, it multiplies error signals by matrices consisting of all the synaptic weights on the neuron's axon and farther downstream. This operation requires a precisely choreographed transport of synaptic weight information, which is thought to be impossible in the brain. Here we present a surprisingly simple algorithm for deep learning, which assigns blame by multiplying error signals by random synaptic weights. We show that a network can learn to extract useful information from signals sent through these random feedback connections. In essence, the network learns to learn. We demonstrate that this new mechanism performs as quickly and accurately as backpropagation on a variety of problems and describe the principles which underlie its function. Our demonstration provides a plausible basis for how a neuron can be adapted using error signals generated at distal locations in the brain, and thus dispels long-held assumptions about the algorithmic constraints on learning in neural circuits.
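The idea in this abstract, sending the error backward through fixed random weights instead of the transposed forward weights, fits in a few lines. Below is a minimal sketch in plain Python under my own assumptions: one tanh hidden layer, a single linear output, a hypothetical linear target function, online updates, and arbitrary values for the layer size, learning rate and number of steps. None of these come from the abstract. The only departure from ordinary back-propagation is the line computing `delta_h`, where the fixed random vector `B` stands in for the output weights `W2`.

```python
import math
import random


def train_feedback_alignment(steps=2000, lr=0.05, n_hid=8, seed=0):
    """Train a one-hidden-layer network on a toy regression task,
    assigning hidden-unit blame through fixed random feedback weights.
    Returns the per-step squared-error losses."""
    rng = random.Random(seed)
    target = [1.0, -2.0, 0.5]  # hypothetical linear function to learn
    n_in = len(target)
    # Forward weights (learned).
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hid)]
    # Feedback weights: random, fixed, never updated.
    B = [rng.uniform(-0.5, 0.5) for _ in range(n_hid)]

    losses = []
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n_in)]
        y = sum(t * xi for t, xi in zip(target, x))
        # Forward pass.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
        y_hat = sum(w * hj for w, hj in zip(W2, h))
        e = y_hat - y
        losses.append(0.5 * e * e)
        # Back-propagation would use W2[j] here; random feedback uses B[j].
        delta_h = [B[j] * e * (1.0 - h[j] ** 2) for j in range(n_hid)]
        # Weight updates.
        for j in range(n_hid):
            W2[j] -= lr * e * h[j]
            for i in range(n_in):
                W1[j][i] -= lr * delta_h[j] * x[i]
    return losses


if __name__ == "__main__":
    losses = train_feedback_alignment()
    print("mean loss, first 100 steps:", sum(losses[:100]) / 100)
    print("mean loss, last 100 steps:", sum(losses[-100:]) / 100)
```

The output weights still follow the usual delta rule, so this toy says nothing about how well the approach scales to deeper networks; it only shows that the backward pass needs no knowledge of the forward weights.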


## Monday, May 08, 2017

### SPARS 2017: The program is out !

Mario just sent me the following:

Dear Igor,
The program of SPARS 2017 is now available at the workshop website:
This may be of interest to the readers of Nuit Blanche (many of whom will be attending SPARS). SPARS 2017 will feature an excellent set of 8 plenary speakers, 34 oral presentations, and 111 posters, on the general area of sparsity-related techniques and computational methods for high-dimensional data analysis, signal processing, and related applications.
Best regards,
Mario

## Sunday, May 07, 2017

### Sunday Morning Videos: Deep Learning and Artificial Intelligence symposium at NAS 154th Annual Meeting

Yann points to this series of videos from a symposium at the National Academy of Sciences. Noteworthy is that Bill Press introduces the symposium. This is quite fitting, as Bill is a major figure behind Numerical Recipes, which changed how algorithms were used in engineering and science in the mid-90s.

## Deep Learning and Artificial Intelligence

In less than a decade, the field of “artificial intelligence” or “AI” has been jolted by the extraordinary and unexpected success of a set of techniques now called “Deep Learning”. These methods (with some other related rapidly advancing technologies) already exceed average human performance in some kinds of image understanding; spoken word recognition and language translation; and indeed some tasks, like the game of Go, previously thought to require generalized human intelligence. AI may soon replace humans in driving cars, coding new software, robotic caregiving, and making healthcare decisions. The societal implications are enormous. In this session, experts in the field discuss this revolution from five different perspectives.
