Monday, February 12, 2007

Finding Jim Gray: Quantifying the state of our knowledge / Quantifying the state of our ignorance



The more I look at some of the multispectral images, the more I am convinced that obstruction by clouds should not be discounted. More importantly, another issue is the fusion of data from different sensors.
Thanks to both the Johns Hopkins and the University of Texas websites, we have data from radar (RADARSAT) and from the visible wavelength regime (ER-2, Ikonos, Coast Guard sightings). Every sensor has a different spatial and spectral resolution, yet some can see through clouds whereas others cannot. Multispectral sensors could be added to this mix, but they suffer from low spatial resolution (lower than that of the radar) while offering higher spectral resolution. Other information, such as human sightings from private airplane parties, should also be merged with the rest. [As a side note, I have a hard time convincing the remote sensing people that spatial resolution is not an issue as long as we can detect something different from the rest of the background.]
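To make the fusion problem concrete, here is a minimal sketch in Python of what I have in mind: project each sensor pass onto a common grid at the finest resolution, and carefully distinguish "looked and saw nothing" from "could not look" (clouds). The grid size, resolutions, and footprint positions below are all illustrative assumptions, not the real coverage data.

```python
import numpy as np

# Illustrative sketch: fuse sensors of different spatial resolutions onto
# one common grid whose cell size equals the finest resolution (assumed
# 1 m here). Values: 1 = possible target, 0 = observed clear,
# NaN = no data (cloud cover or never imaged).

FINE_RES_M = 1.0                      # assumed best spatial resolution
grid_shape = (1000, 1000)             # toy search area, 1 km x 1 km

def blank_layer():
    """One observation layer per sensor pass, initialized to 'no data'."""
    return np.full(grid_shape, np.nan)

def record_pass(layer, row, col, res_m, value):
    """Mark a sensor footprint: one coarse pixel of size res_m covers a
    block of fine cells, all receiving the same observed value."""
    k = int(res_m / FINE_RES_M)       # fine cells per coarse pixel side
    layer[row:row + k, col:col + k] = value

radarsat = blank_layer()              # radar: sees through clouds
ikonos = blank_layer()                # visible: blocked by clouds

record_pass(radarsat, 100, 200, res_m=8.0, value=1.0)   # radar hit
record_pass(ikonos, 100, 200, res_m=1.0, value=0.0)     # clear, no target
# Cells never written stay NaN: clouds or no coverage, not "empty sea".

both_seen = ~np.isnan(radarsat) & ~np.isnan(ikonos)
print("cells observed by both sensors:", both_seen.sum())
```

The key design point is that NaN cells never count as negative evidence; only an actual look at clear water does.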

Finally, the other variable is time. Some areas have been covered by different sensors at different times. This is where the importance of the drift model becomes apparent.
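Between two sensor passes, whatever probability mass we have must be moved and spread by the drift. A crude sketch of such a step, with a made-up drift vector and diffusion rate standing in for real current and wind data:

```python
import numpy as np

# Toy drift model: between two sensor passes, advect the probability grid
# by the mean current, then blur it to represent growing uncertainty.
# The drift offset and smoothing count below are placeholder assumptions.

def drift_step(p, shift_cells=(1, 2), n_smooth=3):
    """Advect the grid by a fixed cell offset, then apply a box blur as
    crude diffusion; renormalize so total probability stays 1."""
    p = np.roll(p, shift_cells, axis=(0, 1))   # advection by mean current
    for _ in range(n_smooth):                  # diffusion: 5-point average
        p = (p
             + np.roll(p, 1, 0) + np.roll(p, -1, 0)
             + np.roll(p, 1, 1) + np.roll(p, -1, 1)) / 5.0
    return p / p.sum()

p = np.zeros((200, 200))
p[100, 100] = 1.0            # boat believed here at the last sighting
for hour in range(12):       # twelve hours of drift between observations
    p = drift_step(p)
print("probability now spread over", (p > 1e-6).sum(), "cells")
```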



The state of our knowledge of what is known and what is not known becomes important because, as time passes, it becomes harder to marshal the resources of search and rescue teams. I have been thinking about modeling this with a Maximum Entropy (MaxEnt) approach, but any other modeling approach would be welcome. The point is that when a measurement is taken at one spatial point, we should treat it as a measurement whose value decays with time: the longer we wait, the less we know about whether Tenacious is there or not.
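One simple way to picture this decay: a cell observed empty some hours ago should drift back toward the prior, since the boat could have moved into it since. The time constant below is an assumed drift scale, nothing more.

```python
import math

# Sketch of how one observation's information decays with time: a cell
# found empty at time t0 relaxes back toward the prior probability.
# tau_hours is an assumed drift time scale, purely illustrative.

def p_occupied(prior, hours_since_negative_look, tau_hours=24.0):
    """Probability the cell holds the target now, given it looked empty
    t hours ago; decays exponentially from ~0 back up to the prior."""
    w = math.exp(-hours_since_negative_look / tau_hours)  # trust in obs
    return (1.0 - w) * prior   # fresh look -> ~0; stale look -> ~prior

prior = 1e-4   # per-cell prior probability, made up for illustration
for t in (0, 6, 24, 72):
    print(f"{t:3d} h after a negative look: p = {p_occupied(prior, t):.2e}")
```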
For those points where we have identified potential targets, we need to assign some probability that Tenacious is there, but we also know that if we wait long enough, there is a non-zero probability that she has drifted away from that point. This formalism also needs to let us represent the fact that no measurements were taken over certain points in a region where other points were observed (the issue of clouds).

This is why I was thinking of implementing a small model based on the concept of the Probabilistic Hypersurface, a tool designed to store and exploit the limited information obtained from a small number of experiments (a simplified construction of it can be found here). In our case, the phase space is pretty large: each pixel is a dimension (a pixel being the smallest area resolvable at the spatial resolution of the best instrument), all pixels together represent the spatial map investigated (a large set), and the last dimension is time.

In this approach the results of JHU and UCSB, as well as those of the Mechanical Turk, could be merged fairly simply. This would let us figure out whether any of the hits on Ikonos can be correlated with the hits on RADARSAT. More importantly, all the negative visual sightings by average boaters could be integrated as well, because a negative sighting is as important as a positive one in this search. And if the computational burden becomes an issue for the modeling, I am told that San Diego State is willing to help out big time.
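As a sketch of how positive and negative sightings would both carry weight, here is a per-cell Bayes update over the grid. The detection and false-alarm probabilities per sensor are invented for illustration; cells with no observation (clouds, no coverage) are simply left unchanged.

```python
import numpy as np

# Hedged sketch: merge positive and negative sightings with Bayes' rule.
# (P(detect | target), P(false alarm)) per sensor are assumed numbers.
# NaN cells (clouds / never imaged) receive no update at all.

SENSORS = {
    "radarsat": (0.6, 0.05),
    "ikonos":   (0.8, 0.02),
    "boater":   (0.9, 0.01),
}

def update(prior, layer, sensor):
    """Per-cell Bayes update from one observation layer:
    1 = hit, 0 = looked and saw nothing, NaN = not observed."""
    pd, pfa = SENSORS[sensor]
    post = prior.copy()
    hit = layer == 1.0
    miss = layer == 0.0
    # positive sighting: evidence ratio pd / pfa in favor of the target
    post[hit] = pd * prior[hit] / (pd * prior[hit] + pfa * (1 - prior[hit]))
    # negative sighting: ratio (1 - pd) / (1 - pfa) against the target
    post[miss] = ((1 - pd) * prior[miss]
                  / ((1 - pd) * prior[miss]
                     + (1 - pfa) * (1 - prior[miss])))
    return post

p = np.full((200, 200), 1e-4)          # flat prior over a toy grid
obs = np.full((200, 200), np.nan)      # one hypothetical Ikonos pass
obs[50:60, 50:60] = 0.0                # clear water seen here...
obs[55, 55] = 1.0                      # ...except one possible hit
p = update(p, obs, "ikonos")
print("max posterior:", p.max(), " min:", p.min())
```

Interleaving these updates with drift steps between passes is exactly where the time dimension of the phase space would come in.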
[Added note:
What I am proposing may already have been implemented by somebody working in Bayesian statistics or maximum entropy techniques. Anybody?]

