Tuesday, March 20, 2007

Project Kraken and EarthBrowser circa 1998

For the past two weeks I've been working on a secret project (codename Kraken), and it will be the perfect complement to the next version of EarthBrowser. I'll give out some hints and details in the coming weeks about its potential uses and capabilities. However, before I reveal the future, I wanted to share the earliest snapshot of EarthBrowser caught in the wild...

I haven't kept many records from the time, but I looked up my old website from Dec 6, 1998 in the Wayback Machine. It featured the first version of EarthBrowser, which was called Planet Earth at the time. I couldn't get the domain name for that, so I changed the name to EarthBrowser at the end of 1999. Anyone who is interested can take a look in the Wayback Machine here. Unfortunately there are no images saved, but they were pretty cheesy anyway. Once Planet Earth started selling, I quit my job and have been working on it full time ever since.

Here's the main text:

Planet Earth 1.0
Have you ever wanted to see the Earth from space? Now you can see what it looks like right on your desktop. Planet Earth displays a realtime 3-dimensional model of the Earth with the current cloud information downloaded directly over the internet. Night and day shadows are updated continuously and you can rotate it to view any spot on the Earth.

Features
  • Cloud information updated every 6 hours.
  • Areas of night and day are updated continuously.
  • Real 3-dimensional model can be rotated to reveal any part of the earth.
  • Zoom in or out
  • Position viewing location over any spot on Earth
  • Rotates freely or stays in fixed position

Requirements
  • Macintosh with PowerPC processor
  • Internet connection (for clouds)
  • System 7.5 or above
  • 1 MB Disk
  • 4 MB Ram


Mac System 7.5, that's old!! I remember trying to get it to work on the Motorola 68000 processors, but they were just too underpowered in floating point...

Monday, March 19, 2007

15 meters of Landsat

I've been working on a new Landsat 15 meter per pixel base map for version 3.0 of EarthBrowser which is due out later this year. Tasks like this make it painfully obvious that I am crazy to be doing this all by myself. Here is where I'm at with that...

A couple of years ago I decided that I needed a royalty free base map for EarthBrowser that looks better than the one available through the OnEarth WMS service which EarthBrowser currently uses. My only option at the time was to download the raw data and process it myself. I wrote a script that downloaded the raw tiles from Landsat.org slowly so as not to overtax their servers. I was looking at a long, uphill battle to get what I wanted. I now have all of the raw data, but it needs a lot of pre-processing to become a seamless mosaic without abrupt changes in contrast and quality from tile to tile.
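For anyone wanting to do the same kind of polite bulk download, here is a minimal sketch of the idea, not the original script. The function name, the five second delay and the injectable `fetch` and `sleep` arguments are all assumptions made for illustration; the point is simply one request at a time with a pause in between.

```python
import time
import urllib.request


def download_slowly(urls, fetch=None, delay_seconds=5.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests.

    fetch and sleep are injectable (hypothetical design choice) so the
    loop can be exercised without touching the network.
    """
    if fetch is None:
        # Default fetcher: plain HTTP GET returning the response bytes.
        def fetch(url):
            with urllib.request.urlopen(url) as response:
                return response.read()

    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results[url] = fetch(url)
    return results
```

The delay is fixed here to keep things short. A real run over thousands of tiles would probably also want retries and a way to resume after an interruption.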

Then in November of last year I found just what I was looking for: a set of pre-mosaiced 15 meter Landsat data from NASA's John C. Stennis Space Center. Two of the color channels were in the infrared spectrum, which is great because many of the visible color channels have haze in them, which makes it difficult to get a seamless looking image, but the infrared cuts through the haze. However, this means that it looks strange because it isn't a true color image. More image processing needed...

I kept going and began taking scenes in different regions of the world, re-coloring them by hand, and using the differences between the two images to "train" an algorithm to color the entire dataset in a more realistic color scheme. First I tried using principal component analysis to reduce the dimensionality of the problem. This didn't work too well since the principal components changed from region to region. I finally settled on using an RGB to HSV conversion to separate the color components. I then apply a fairly simple piecewise polynomial approximation using Jacobian matrix decomposition in three dimensions, after which I re-project it back into RGB space and it doesn't look too bad!
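To give a feel for the "train on a hand-colored scene, then apply everywhere" idea, here is a deliberately simplified sketch. It is not the algorithm described above: it swaps in an independent piecewise linear least-squares fit per HSV channel instead of a three dimensional polynomial fit, it ignores the fact that hue wraps around, and every name in it (`fit_line`, `train`, `recolor`, the choice of four segments) is made up for illustration. Pixels are assumed to be RGB tuples with components between 0 and 1.

```python
import colorsys


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        # All samples share one x value: fall back to a constant.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var
    return a, mean_y - a * mean_x


def segment_of(value, segments):
    """Index of the equal-width segment of [0, 1] that value falls in."""
    return min(int(value * segments), segments - 1)


def train(source_pixels, target_pixels, segments=4):
    """Fit one line per HSV channel per segment from paired RGB pixels."""
    src = [colorsys.rgb_to_hsv(*p) for p in source_pixels]
    dst = [colorsys.rgb_to_hsv(*p) for p in target_pixels]
    model = []
    for channel in range(3):
        fits = []
        for seg in range(segments):
            pairs = [(s[channel], d[channel]) for s, d in zip(src, dst)
                     if segment_of(s[channel], segments) == seg]
            if pairs:
                fits.append(fit_line([p[0] for p in pairs],
                                     [p[1] for p in pairs]))
            else:
                fits.append((1.0, 0.0))  # no training data: leave unchanged
        model.append(fits)
    return model


def recolor(model, pixel):
    """Apply the trained per-channel mapping and return an RGB tuple."""
    segments = len(model[0])
    out = []
    for channel, value in enumerate(colorsys.rgb_to_hsv(*pixel)):
        a, b = model[channel][segment_of(value, segments)]
        out.append(min(1.0, max(0.0, a * value + b)))  # clamp to [0, 1]
    return colorsys.hsv_to_rgb(*out)
```

Training pairs would come from the same pixel positions in the original scene and the hand-colored version of it. Treating the three channels independently is the biggest shortcut here, since in practice a change in hue usually depends on saturation and brightness too.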

However, I am now running up against a problem that I didn't think about until I got to this step (doesn't it always work that way?). This dataset has been compressed using a "lossy" wavelet compression algorithm. That means that my polynomial approximations will work for one trained tile but won't transfer well to other areas of the mosaic due to the inconsistency of color values. I'm getting little yellow dots appearing randomly and big swaths of green where it should be dark blue due to compression artifacts. Arrrrgh!

As an example, I was recently contacted by an environmental organization from the Greek island of Lesbos asking if I could donate a high resolution image of Lesbos for a mural at their center. Happy to help an environmental organization, and also as a first real test of some of the new capabilities of EarthBrowser, I made a high res image to their specifications and unfortunately had to do a lot of hand editing in order to make it look good. Here is a before and after showing my color transform algorithm in operation.



Unprocessed NASA Landsat Lesbos

EarthBrowser Processed NASA Landsat Lesbos

Hand Edited EarthBrowser processed Lesbos



It is clear from this test that I need the data in an uncompressed format in order for my algorithms to work properly. After inquiring around, I have finally found someone within NASA who can sell me the uncompressed data on hard drives, but it costs $16,000 and they are not quite clear on whether it is the exact dataset that I am referencing. Needless to say, I can't afford that. So I'm kind of stuck.

There is some hope though. Just recently I read on the excellent Bull's Rambles Blog that a fantastic new Landsat 15m dataset has been "donated... to the public domain." One would assume that the use of NASA resources to produce this dataset would have a prerequisite of the product being in the public domain. I've been in contact with the WorldWind project lead Patrick Hogan to inquire about its availability, and they understandably want to have it "premiered" by WorldWind. This leads me to believe that it will be generally available at some point, but there has been no definitive answer yet.

I'm keeping my fingers crossed because I'd really rather spend my time programming.

Thursday, March 08, 2007

Sweet Sweet Amazon S3

Another gigantic hurdle overcome! Last night I woke up around 4am and lay there thinking about all of the sweet simplifications possible. I've just signed up with Amazon S3 (Simple Storage Service) and I think it just lowered my blood pressure a bit. The Coding Horror blog, which is fast becoming a favorite, details their efforts at reducing server load by putting static images on S3. I've heard about S3 since it came out but thought it too complicated and didn't want to be an early adopter, but now I'm sold.

I run two separate servers for EarthBrowser, one for the website and one as the dedicated data server for all of the EarthBrowser instances out there. Whenever I put out a release, my web server would get slammed and become unresponsive with thousands of people each downloading the 10 megabyte installer file. I have actually written and implemented, in PHP, a distributed, caching data server toolset that would allow me to rent several cheap $7 DreamHost setups and plop them in to increase my capacity. They would talk to each other and update their internal datasets for more frequently requested items. I was even considering finding some European or Australian hosting sites to get closer to the rest of the world in anticipation of EarthBrowser v3 coming out later this year. It was working pretty nicely, but the administrative and logistical overhead was getting to be a bit overwhelming. Now I don't have to worry about storing and managing a group of servers to distribute EarthBrowser v3 data. With Amazon S3 I have virtually limitless capacity and limitless download bandwidth for about $0.20 a gigabyte.
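To put that price in perspective, here is a back-of-the-envelope calculation. The 10 megabyte installer and the roughly $0.20 per gigabyte figure come from above; the download count in the usage note is a made-up number, and a gigabyte is taken as 1000 megabytes for simplicity.

```python
INSTALLER_MB = 10      # installer size mentioned above
PRICE_PER_GB = 0.20    # approximate price per gigabyte mentioned above, USD


def transfer_cost(downloads, size_mb=INSTALLER_MB, price_per_gb=PRICE_PER_GB):
    """Rough cost in dollars of serving `downloads` copies of a file."""
    gigabytes = downloads * size_mb / 1000.0  # assumes 1 GB = 1000 MB
    return gigabytes * price_per_gb
```

With a hypothetical 5,000 downloads on release day, `transfer_cost(5000)` works out to 50 gigabytes, or about $10.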

Now all that work can be thrown away, which makes me both cringe and sigh in relief at the same time. Often I work for a week or two on a technology and wind up throwing it away when something better comes along. Sometimes only through the act of creating and exploring a problem space can better alternatives be discovered. That is why product specs should never be static documents. This time the better solution came along and whacked me on the head.

It is a little tricky to get working well since it uses a SOAP or REST interface with some quirks. I wrote a Python script that works with the excellent boto module to allow anyone with an S3 account to create/delete/update objects and buckets from the command line. I'm making it open source and downloadable for anyone who wants it. If you have any suggestions, let me know. There are some issues if you are using Python 2.3 or below; contact me and I'll tell you how to fix them.
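To show the general shape of such a tool, here is a minimal sketch. It is not the script mentioned above: the program name `s3tool` and the subcommand names are invented for illustration, it is written for a current Python with `argparse` rather than Python 2.3, and it assumes the legacy boto 2 package is installed with credentials available in the environment or boto config file.

```python
import argparse


def build_parser():
    """Command-line parser; 's3tool' and the subcommands are hypothetical."""
    parser = argparse.ArgumentParser(prog="s3tool")
    sub = parser.add_subparsers(dest="command")

    for name in ("mkbucket", "rmbucket"):
        p = sub.add_parser(name)
        p.add_argument("bucket")

    p = sub.add_parser("put")  # create or overwrite an object from a file
    p.add_argument("bucket")
    p.add_argument("key")
    p.add_argument("path")

    p = sub.add_parser("rm")   # delete an object
    p.add_argument("bucket")
    p.add_argument("key")
    return parser


def run(args, conn):
    """Dispatch a parsed command to a boto 2 style S3 connection."""
    if args.command == "mkbucket":
        conn.create_bucket(args.bucket)
    elif args.command == "rmbucket":
        conn.delete_bucket(args.bucket)
    elif args.command == "put":
        key = conn.get_bucket(args.bucket).new_key(args.key)
        key.set_contents_from_filename(args.path)
    elif args.command == "rm":
        conn.get_bucket(args.bucket).delete_key(args.key)


def main(argv=None):
    # Imported here so the parser can be used without boto installed.
    import boto
    args = build_parser().parse_args(argv)
    run(args, boto.connect_s3())
```

Calling `main(["put", "my-bucket", "installer.dmg", "/tmp/installer.dmg"])` would upload a local file, where the bucket, key and path are placeholders. Keeping `run` separate from `main` makes it easy to try the dispatch logic against a stand-in connection object first.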

Monday, March 05, 2007

The virtual globe as a software platform

Web browsers are a very interesting and instructive case of a software platform that is completely dependent upon the network but lives on the desktop, like virtual globes should. Browsers have a language that controls the content (HTML), a language that can provide functional logic (JavaScript) and a plugin architecture. The magnitude of the world wide web is a testament to just how powerful a platform the web browser is. But what elements make it so successful? A display language is a given since the browser would not exist without it, but that is not sufficient. The network is where the full power becomes apparent. The ability to algorithmically define content for local display on remote computers is where the real power comes from. However, server side control of content has many limitations. HTTP being a stateless protocol has been the headache of many web developers. We are now starting to see some of the real potential of the web browser become unleashed with the new "Web 2.0" hype surrounding the JavaScript language. What we are really seeing is the control of the display being performed by the client rather than a remote server. Imagine that, client side control of an application, we're back to the 1980s and it's about time!

I have no doubt that each of the development teams of the major virtual globes wants their creation to become the default platform for geospatial content. Why shoot for anything less? Let's take a look at where things stand for the big guys and give a little update on where EarthBrowser is headed.

Google Earth: The KML format gives novice users the ability to add their own simple content to Google Earth, much like a static web page. The network link provides the ability to get dynamic, server based content into GE, but it is very "Web 1.0," with no client side processing. They are currently at the top of the game. They have a tremendous asset with their enormous and growing dataset. However, as I mentioned in a previous post, they will be running up against some limitations of their design if they don't change course, but their biggest problem right now is how to make money from it.

WorldWind: This product has advanced the farthest in the key platform categories. It supports KML and has a plugin architecture as well. There have been some recent developments which might allow one to use Python for client side processing. Their early decision to use .NET has delayed it becoming cross platform; however, a Java port is on the way. I am generally underwhelmed by Java applications but am hopeful they will pull it off. Being from a government agency, they don't have the same pressures as a commercial entity. This, and being open source, gives them an advantage in the platform building arena. WorldWind has the best shot at becoming the platform of choice for researchers, if they can just convince them to write in Java.

ArcGIS Explorer: It seems to me the primary purpose for ESRI putting this out is to keep their customers from migrating to Google Earth for certain tasks. I understand that they are licensing the engine from another company (Skyline Globe or GeoFusion?) and are therefore not in control of their own product. Apparently they allow some sort of scripting (a restricted ArcScript?) functionality. It has the potential to be something great, but ESRI doesn't really have the DNA to be a mass market product leader. Future enhancements are clouded by not owning the source code.

Microsoft's Virtual Earth: VE is a strange bird as it is an ActiveX browser plugin and not a full-fledged application, but I see little difference in the installation procedure. Like ESRI, it seems they are trying to stave off Google Earth and keep the market open by providing an alternative. However, they want to lock you in to using Windows and Internet Explorer at the same time. It has some really neat features, it is scriptable, and I'm certain it will be pushing Google to come up with bigger and better solutions. They, like Google, are going to have a hard time monetizing this product; the whole billboard thing is kind of stupid and belongs in something like Second Life. Perhaps that is what they are ultimately shooting for. The battle for the mass consumer geospatial market will be between Microsoft and Google.

Where is EarthBrowser in this list? Right now it is not in the same league as the other offerings. It does have some nice features that are just now being added to some of the other offerings like live cloud animations, earthquake data, sea surface temperature animations and weather forecasts. Version 3 is a complete re-write that is years in the making. It will have all of the features that are currently in version 2 integrated into a full hardware accelerated 3D environment. You will be able to smoothly fly through mountains, into and out of the atmosphere. It will support most of the popular raster and vector formats and enable you to link to sources of data on the internet. There are quite a few other neat little tricks in store that I'll reveal as it gets closer to the release date.