Page Views on Nuit Blanche since July 2010


You can also join the Google+ Community (380), the CompressiveSensing subreddit (114), the LinkedIn Compressive Sensing group (2259) or the Matrix Factorization (657) and post there !
Reference pages include The Big Picture in Compressive Sensing, the Advanced Matrix Factorization Jungle Page and the Reproducible Research page

Friday, May 17, 2013

Ghost Imaging does 3D and multispectral Imaging




Sylvain Gigan (also in Science) let me know of the following paper that is also making the rounds in Science. If you recall, it was shown earlier that compressive sensing was speeding up the process of acquiring the data. The physics itself is not changed and revolves around this formula:



3D Computational Ghost Imaging by Baoqing Sun, Matthew P. Edgar, Richard Bowman, Liberty E. Vittert,Stephen S. Welsh, Ardrian Bowman, Miles J. Padgett
Computational ghost imaging retrieves the spatial information of a scene using a single pixel detector. By projecting a series of known random patterns and measuring the back reflected intensity for each one, it is possible to reconstruct a 2D image of the scene. In this work we overcome previous limitations of computational ghost imaging and capture the 3D spatial form of an object by using several single pixel detectors in different locations. From each detector we derive a 2D image of the object that appears to be illuminated from a different direction, using only a single digital projector as illumination. Comparing the shading of the images allows the surface gradient and hence the 3D form of the object to be reconstructed. We compare our result to that obtained from a stereo- photogrammetric system utilizing multiple high resolution cameras. Our low cost approach is compatible with consumer applications and can readily be extended to non-visible wavebands.
The 3D of this paper has little to do with the 2D extraction from Three-dimensional ghost imaging ladar and we measure reflectance difference here as opposed to time of flight information. Here is the multispectral paper:



Multi-wavelength compressive computational ghost imaging by Stephen S. Welsh, Matthew P. Edgar, Phillip Jonathan, Baoqing Sun, Miles. J. Padgett
The field of ghost imaging encompasses systems which can retrieve the spatial information of an object through correlated measurements of a projected light eld, having spatial resolution, and the associated reflected or transmitted light intensity measured by a photodetector. By employing a digital light projector in a computational ghost imaging system with multiple spectrally fi ltered photodetectors we obtain high-quality multi-wavelength reconstructions of real macroscopic objects. We compare di erent reconstruction algorithms and reveal the use of compressive sensing techniques for achieving sub-Nyquist performance. Furthermore, we demonstrate the use of this technology in non-visible and fluorescence imaging applications.

Thanks Sylvain !






Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Sparse FFT implementations

Dmitry Savostyanov pointed me to the location of an implementation of the other super Fast FFT we mentioned last September. It is on GitHub here as part of the very promising TT-Toolbox

Let us note that the sparse FFT "looks" slower than MIT's sFFT and that the MIT sFFT has currently only version 1 and 2 while Piotr mentioned results for version 3 and 4 on Wednsday at the "Big data: theoretical and practical challenges" workshop.

We are waiting for the Berkeley implementation mentioned previously for the end of the summer. The Ann Arbor FFT (AAFFT) is here.  

With regards to the comparison between sFFT [2] and the TT_toolbox version [1]


It looks like sFFT version 3.0 scales better for K sparse signals than the TT_toolvox one which scales as K^3. But I wouldn't mind seeing comparison for compressible signals (I think it is version 4.0 for sFFT) and the TT_toolbox one. An interesting comparison should eventually entail not integer frequencies and higher dimensions as well. 




[1] Superfast Fourier Transform Using QTT Approximation by Sergey Dolgov, Boris Khoromskij and Dmitry Savostyanov.
[2] http://groups.csail.mit.edu/netmit/sFFT/results.html

Thursday, May 16, 2013

Another day at the Big Data: Theoretical and Practical Challenges Workshop

Yesterday was the second day at the Big data: theoretical and practical challenges - May, 14-15, 2013 - IHP, Paris. A big thank you to both Francis Bach and Michael Jordan for organizing the meetingFrancis tells me the slides will be available at some in time so we can come back to specifics of some presentations but for the time being, here are just a few notes that drew my attention because of their relevance to some of the subjects we discussed here before. 

In Piotr Indyk's presentation on how the sFFT work (I very much liked the Bins and Balls explanation) , this is the first time I heard about an Hadamard based algorithm with good running time. I asked Piotr about it and here are the references for that work:   

The improved algorithm appeared in
L.A. Levin. Randomness and non-determinism. J. Symb. Logic, 58(3):1102?1103, 1993.
Alas, it is deeply buried in there. A good explanation is given in: O. Goldreich. Modern cryptography, probabilistic proofs and pseudorandomness. Algorithms and Combinatorics, 17, 1999. Appendix C.2.3 Improved Implementation of Algorithm. Alas, there is no actual implementation (i.e., code) available, to the best of my knowledge.
Thanks Piotr ! If now we could compare sFFT with the new one from Berkeley (see below) and the other superfast FFT for compressible signals, that would be great :-) [ of note all the Slides of the recent Workshop on Sparse Fourier Transform organized by Piotr are here.  ]

Given an $n$-length input signal $\mbf{x}$, it is well known that its Discrete Fourier Transform (DFT), $\mbf{X}$, can be computed in $O(n \log n)$ complexity using a Fast Fourier Transform (FFT). If the spectrum $\mbf{X}$ is exactly $k$-sparse (where $k<0:99); and (ii) computational complexity: we can reliably compute the DFT X using O(k logk) operations, where the constants in the big Oh are small and are related to the constants involved in computing a small number of DFTs of length approximately equal to the sparsity parameter k. Our algorithm succeeds with high probability, with the probability of failure vanishing to zero asymptotically in the number of samples acquired, M. Our approach is based on filterless subsampling of the input signal x using a small set of carefully chosen uniform subsampling patterns guided by the Chinese Remainder Theorem (CRT). The idea is to cleverly exploit, rather than avoid, the resulting aliasing artifacts induced by this subsampling. Specifically, our subsampling operation on x is designed to create aliasing patterns on the spectrum X that “look like” parity-check constraints of good erasure-correcting sparse-graph codes. We show how computing the sparse DFT X is equivalent to decoding of these sparse-graph codes. These codes further allow for fast peeling-style decoding, similar to that of fountain codes. The resulting DFT computation is low in both sample complexity and decoding complexity. We accordingly dub our algorithm the FFAST (Fast Fourier Aliasing-based Sparse Transform) algorithm. We analytically connect our proposed CRT-based aliasing framework to random sparse-graph codes, and analyze the performance of our algorithm using density evolution techniques from coding theory. Additionally our approach uncovers a new deterministic and efficient class of compressive sensing measurement matrices as an alternative to the popular choice of random matrices. Based on the qualitative nature of the subsampling patterns needed, our analysis is decomposed into two regimes: i) “very-sparse” (0 lt 1=3), where M < 2:5k, and ii) “less-sparse” (1=3 lt 1), where M lt 4k for lt 0:99. While our theory is asymptotic, our simulation results reveal that in practice our algorithm works even for moderate values of k and n, e.g. k = 200 and n = 4080. Further, when k = 300, and n is approximately 3:8 million, our algorithm achieves computational savings by a factor of more than 6000, and savings in the number of input samples by a factor of more than 3900 over the standard FFT. We analyze the 1-D DFT here, but our approach can be extended in a straightforward manner to multi-dimensional DFTs. Further, while our analysis is for the exactly sparse case, our approach can be extended to be robust to the more general approximately sparse and noisy measurement cases as well, albeit at the cost of increased sample and computational complexity. We provide extensive simulation results in Section VIII that corroborate our theoretical findings, and validate the empirical performance of the FFAST algorithm for a wide variety of 1D and 2D DFT settings for signals having an exactly or approximately sparse Fourier spectrum.

Kannan tells me that in the next version of their papers sFFT will be included in the comparison and that an implementation should be out by the end of the summer. woohoo! 

Coming back to the meeting, I was surprised to see such a good scores for the Netflix problem in Lester Mackey's talk on Divide-and-Conquer Matrix Factorization which we have featured here before (and is listed in the Advanced Matrix Factorization Jungle page page). I'll wait for the slides to be sure of what I saw. Since DCF goes so fast I wonder if Cable and I shouldn't be using it in our CAI adventures

Finally, I note that the talk by Mike Mahoney on Revisiting the Nystrom Method for Improved Large-Scale Machine Learning is the rebirth of randomization thanks to the use of theoretically stronger mathematical results (Johnson-Lindenstrauss) and is behind the Randomized Numerical Linear Algebra movement (RandNLA tag on Nuit Blanche). during the presentation I think I heard Mike saying because of the rounding errors, the randomized approach had a better accuracy than full LAPACK which reminded me this entry: Slowly but surely they'll join our side of the Force... Anyway, I have seen most of the figures of Mike's presentation in the following two papers: 


We reconsider randomized algorithms for the low-rank approximation of symmetric positive semi-definite (SPSD) matrices such as Laplacian and kernel matrices that arise in data analysis and machine learning applications. Our main results consist of an empirical evaluation of the performance quality and running time of sampling and projection methods on a diverse suite of SPSD matrices. Our results highlight complementary aspects of sampling versus projection methods based on leverage scores. We complement our empirical results with a suite of worst-case theoretical bounds for both random sampling and random projections methods. These bounds are qualitatively superior to existing bounds---e.g., improved additive-error bounds for the spectral and Frobenius norm errors and relative-error bounds for the trace norm error. 

and the attendant ICML version.




Image Credit: NASA/JPL-Caltech

This image was taken by Front Hazcam: Right B (FHAZ_RIGHT_B) onboard NASA's Mars rover Curiosity on Sol 275 (2013-05-15 15:12:59 UTC).




Full Resolution

Tuesday, May 14, 2013

A day at the Big Data: Theoretical and Practical Challenges Workshop

Today, I was at the Big data: theoretical and practical challenges workshop at IHP and liked it very much. In particular, I was pleasantly surprised that this issue of phase transition being center to demonstration arguments. Michael Jordan talked about a new kind of (simple but seemingly powerful) K-means algorithm. Alexandre D'Aspremont presented some new work on the approximation bounds for sparse principal component analysis. Alfred Hero described phase transition issues on his talk on information gathered from correlation mining. Finally, Martin Wainwright, provided a theoretical analysis as to why some gradient descent algorithm work well for particular non convex loss and regularization functions. I particularly liked this specific figure excerpted from paper [1] on which the presentation was partially excerpted from:


Why ? because it reminded me of this discussion that Donoho, Johnstone, Hoch and Stern had back in 1992 about the different regularizers they were using to obtain a sparse spectrum back [2]. In it, they compared the L1 norm, least squares and the maximum entropy functional. In the end, L_1 was found superior based on an empirical error minimum. As we seen recently in How to find real-world applications for compressive sensing, it's only when we got to learn which ensemble worked well with L_1 that compressive sensing was born. I wonder if like Maximum entropy, SCAD or MCP could yield different ensembles of interest to the sensing hardware makers. I also look forward to using the work of Alfred Hero for calibration purposes as he mentioned in his presentation. I am sure the slides will be made available later. 

[1] Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima
Po-Ling Loh, Martin J. Wainwright
We establish theoretical results concerning all local optima of various regularized M-estimators, where both loss and penalty functions are allowed to be nonconvex. Our results show that as long as the loss function satisfies restricted strong convexity and the penalty function satisfies suitable regularity conditions, any local optimum of the composite objective function lies within statistical precision of the true parameter vector. Our theory covers a broad class of nonconvex objective functions, including corrected versions of the Lasso for error-in-variables linear models; regression in generalized linear models using nonconvex regularizers such as SCAD and MCP; and graph and inverse covariance matrix estimation. On the optimization side, we show that a simple adaptation of composite gradient descent may be used to compute a global optimum up to the statistical precision epsilon in log(1/epsilon) iterations, which is the fastest possible rate of any first-order method. We provide a variety of simulations to illustrate the sharpness of our theoretical predictions.
[2] Maximum Entropy and the Nearly Black Object by Donoho, Johnstone, Hoch and Stern



Image Credit: NASA/Carla Cioffi
Expedition 35 Landing

The Soyuz TMA-07M spacecraft is seen as it lands with Expedition 35 Commander Chris Hadfield of the Canadian Space Agency (CSA), NASA Flight Engineer Tom Marshburn and Russian Flight Engineer Roman Romanenko of the Russian Federal Space Agency (Roscosmos) in a remote area near the town of Zhezkazgan, Kazakhstan, on Tuesday, May 14, 2013. Hadfield, Marshburn and Romanenko returned from five months onboard the International Space Station where they served as members of the Expedition 34 and 35 crews.

Monday, May 13, 2013

Please take the time to nominate somebody or something today

Good morning everyone, first some space goodness from Chris Hadfield


h/t to Chris Hadfield's G+ feed and Mario's G+ feed

I have known astronauts. The thing they all really really really want to do is to go back. If in some forum or if you have a position of influence, of if NASA, CSA or any space agency ask you who should go up, tell them Chris, or anybody with that profile. fits that bill. Chris is coming back to Earth today. I am rooting for him for a second extended trip on Space Station. 

If you are not in a position of influence with regards to selecting the next astronauts to go up but you read this blog and you are a US person, it means you care about some of the most advanced research there is and you are yourself an agent of change (If you are not a US person,, I don't think you can nominate anybody as the form seems to require a US address. Please note that you are a US person if you have a US address).  It is important to raise this community's recognition by nominating some of its champion of change. As you know The OSTP is seeking an Outstanding “Open Science” Champion of Change  Before tomorrow May 14th. please take the time to go to the White House website and nominate your Champion of Change in the Open Science category. Once you are done, please ask your colleagues to also a nominate their champion of change in the Open science category at: http://www.whitehouse.gov/champions/nominate. Here is a list of people, organization or efforts that you could choose from.




Standalone peer review platforms:

annotatr: “citeulike+disqus mashup: a place for online journal clubs; Find abstracts and comment on them with your friends.”


Arxiliv: Reddit clone pulling in articles from ArXiv.




Faculty of 1000: “identifies and evaluates the most important articles in biology and medical research publications. Articles are selected by a peer-nominated global 'Faculty' of the world's leading scientists and clinicians”.




Haldane’s Sieve - “Discussing preprints in population and evolutionary ecology”



OpenPub: “We’re launching a website to host scientific preprints alongside open discussion and review. We’re throwing in a reputation system to boot, so authors and readers can better assess the usefulness of reviews and comments. And it’s all going to be done out in the open.”


PaperCritic: “ offers researchers a way of obtaining and providing feedback for each others [sic] work in a fully open and transparent environment.” Heavy Mendeley integration


PaperRater.org: “open review and collaborative reading of research papers”


Rate, tag, and comment papers as they appear on arXiv.

Rate, tag, and comment papers as they appear on arXiv.


Peer Evaluation: “Curate the peer review of your scholarly communications;  make your work visible to scholarly search engines; track the impact & reuse of what you share online; disseminate anywhere and collect all feedback here...” [may go below][This is an interesting question. I’ve used PE as one publication venue for an article that I’d like to get peer reviewed. I also use it just to get feedback on stuff already published in peer-reviewed journals and for stuff like blog posts. So, where does it belong?]


Peerage of Science: Closed to public, access to community via invitation or by submitting a manuscript. Sells access to reviews and evaluation data to publishers, free for scientists.


PubPeer: “a website where open peer review is not intimidating to users, while maintaining the rigor and anonymity of the closed review process currently used by the major journals.”  Centralized database on which all first and last authors of biomedical articles can comment on biomedical articles.  Promoting a post-publication conversation.


PubUp: “an open access online platform for researchers to discover & share journal articles that are worth reading, to discuss scientific ideas that are worth spreading, and to connect with people who share similar interests.            


Rubriq: Closed review, charges authors to submit manuscripts. “As an independent, for-benefit organization, Rubriq can provide rigorous reviews by the same qualified peers who review for journals, but with a standardized scorecard that can be used in any publishing model. Our system will enable faster, more consistent reviews, and will help match papers with the right journals”.


Sympoze: Private, blind, review with votes for reject or accept - ratio between these determines accept, reject, request revisions. Like a journal in some respects. Accepted papers go online (OA), print/ebook for sale. "By crowd-sourcing peer-review, referees need not worry about typing up a detailed referee report. Referees can read the paper and vote for acceptance or rejection and write one or two comments about the paper, instead of an entire report."


SciOR: “Online registry service promoting efficient and accountable open peer review, effective reviewer participation incentives and reputation metrics, and rapid dissemination of discovery in science”.


The Third Reviewer: “a forum for scientists to share opinions about recently published research.” Now defunct as best I can tell


TiNYARM: “TinyARM is a simple tool to share the papers you read with your colleagues. You can suggest papers to your friends and as an extra bonus, you can keep track of what you read, skimmed, planned to read, and got as suggestions.”


Publons: “...a public peer-review platform that enables academics to gain reputation for their reviews.”


Journal Lab: “ database of figure-by-figure comments on published life sciences and biomedical research. Grad students and post-docs contribute by sharing the insights about published data that regularly occur in lab meetings, lunchroom discussions, and journal clubs.”


Publishing portals built around post-publication review

I’ve added this list because these products often get thrown in with the ones above, since they both manage post-publication reviews. There’s a fundamental difference, though: these systems only support review of articles they host themselves. In this way, they’re much closer to traditional publishing, since they put dissemination, archiving, and certification all under one roof.


F1000 Research: “F1000 Research will offer immediate publication; open, post-publication peer review; open revisioning of work including ongoing updates; and encourage raw data deposition and publication....broad range of article formats and content types.”


WebMed Central: “...we publish a range of articles without any prepublication peer review in virtually every biomedical discipline...within 48 hours of submission. WMC encourages scientists and readers to submit their reviews on the published material freely.”