Thursday, September 20, 2007

Fast Steradian Intersection Correction

I just wanted to go back and mention that if the sum of the steradian angles is greater than 180 degrees, the simple formula breaks down: cos(a + b) is no longer a valid intersection threshold once a + b passes pi, since cosine stops being monotonic there. However, we know that if the sum of the angles is greater than 180 degrees then by definition the caps intersect. Because of this, we should also store the steradian angle itself along with the angle's sine and cosine.
So the revised intersection test algorithm becomes:
angle_a + angle_b > pi or dot(normal_a, normal_b) >= cos_a*cos_b - sin_a*sin_b

Not as pretty. A speedup for large coverages :-) but a slowdown for everything else... :-(
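Putting the revised test together, here is a sketch in Python (the real implementation is presumably C++; normals are assumed to be unit vectors and the angles are the caps' half-angles in radians):

```python
import math

def caps_intersect(normal_a, angle_a, normal_b, angle_b):
    """Spherical-cap ("steradian") intersection test with the
    wide-angle fix described above. Normals must be unit vectors;
    angles are angular radii in radians (pi = global coverage)."""
    # If the half-angles sum past pi, the caps must overlap,
    # and cos(a + b) is no longer a valid threshold, so test that first.
    if angle_a + angle_b > math.pi:
        return True
    dot = sum(a * b for a, b in zip(normal_a, normal_b))
    # cos(a + b) = cos(a)cos(b) - sin(a)sin(b); a real implementation
    # would use the precomputed sines and cosines stored per dataset.
    return dot >= (math.cos(angle_a) * math.cos(angle_b)
                   - math.sin(angle_a) * math.sin(angle_b))
```

The early-out branch is what costs the extra compare for small coverages.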

Simplified GeoJSON proposal

I've come up with an extremely simplified GeoJSON example which can do everything that the current GeoJSON specification can do, but with only 4 core elements. Every element is a feature and can contain a list of points, lines and polygons, or other features. Everything else in the GeoJSON spec is pretty much taken from some OGC standard or another. It's important not to get caught up in the past; we are at the beginning of a potential standard, and as Sean said in his post: "there's only one first chance to get a standard right." The only thing that I think might be questionably useful about this would be the nesting of features.

[x0,y0], [x1,y1], ..., [xn,yn]
[x0,y0, x1,y1, ..., xn,yn],
[x0,y0, x1,y1, ..., xn,yn],
[x0,y0, x1,y1, ..., xn,yn],
[x0,y0], [x1,y1], ..., [xn,yn]
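The example above has lost its surrounding JSON structure in this copy. As an illustration only, here is one shape the four-element proposal could take; every key name here is my guess at the idea, not the original example:

```python
import json

# A sketch of the simplified proposal: every element is a "feature"
# that may carry points, lines, polygons and nested child features.
# All key names and coordinates are illustrative, not a published spec.
feature = {
    "points":   [[-122.6, 45.5], [-122.7, 45.6]],              # [x0,y0], [x1,y1], ...
    "lines":    [[-122.6, 45.5, -122.7, 45.6, -122.8, 45.7]],  # flat x,y runs
    "polygons": [[-122.6, 45.5, -122.7, 45.6, -122.8, 45.4]],
    "features": [
        {"points": [[-70.0, 12.0]]}                            # nested child feature
    ],
}

print(json.dumps(feature))
```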

Also, an optional "crs" coordinate reference system can contain an EPSG code, an ESRI WKT (Well Known Text) string, a PROJ.4 projection string, or all three. If none is specified, the default is decimal degrees in the WGS84 datum. The coordinate reference system of the parent cascades down to all children until a child specifies one.

"wkt":"COMPD_CS["OSGB36 / British National Grid...",
"proj4":"+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83 +units=m +no_defs"

Another useful item would be an optional "bounds" element that specifies the bounding envelope in the element's crs.
"bounds":[min_lon, min_lat, max_lon, max_lat]

Minimalistic and precise.

GeoJSON redux

Sean Gillies has a post responding to my call for a simplified coordinate representation in GeoJSON. His argument is that clarity of representation is more important than implementation overhead, which I agree with to some extent.

However, it seems to me that the reason that JSON is so much better than XML for many purposes is exactly that it *does* take into account implementation overhead, thereby making it easier to exchange internal data structures without the overhead of XML.

For any sane implementation of JSON, the following must be extracted as a list of lists:

[ [x1, y1, z1], ..., [xn, yn, zn] ]

So a list of 10,000 coordinates will give you 10,000 lists of 3 floating point numbers each, or 10,000 list structures and 30,000 floats. Each list takes time and memory to create, address, and extract elements from. Whereas:

[ x1, y1, z1, ..., xn, yn, zn ]

gives you the same 30,000 floats but only one list. A big win for any standardized JSON reading algorithm. Creating a custom, context-sensitive JSON parser to ignore the structure is more than a little implementation detail.
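A quick way to see the difference in object counts (structure, not a benchmark):

```python
import json

n = 10_000
nested_doc = json.dumps([[float(i), float(i), 0.0] for i in range(n)])
flat_doc = json.dumps([v for i in range(n) for v in (float(i), float(i), 0.0)])

nested = json.loads(nested_doc)   # n inner lists plus 1 outer list
flat = json.loads(flat_doc)       # a single list of 3n floats

assert len(nested) == n and len(nested[0]) == 3
assert len(flat) == 3 * n

# Recovering (x, y, z) triples from the flat form is still one line:
triples = list(zip(flat[0::3], flat[1::3], flat[2::3]))
assert triples[0] == (0.0, 0.0, 0.0)
```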

Strangely, after Sean argues against removing context information to improve clarity, he then requests just that. He suggests that the "type" element that describes what kind of GeoJSON element you are looking at be removed since it should be obvious from the type of request that you made to receive the GeoJSON content. What if one were to receive a set of files with various different sets of data, some single features, some feature sets and perhaps some just geometry elements? If you remove the type field from the geometry elements, how would you know what kind of geometry you have? You couldn't tell a Point from a Polygon without the type field.

All that said, I do have another simplification for GeoJSON that is unrelated to the previous issues. How about we do away with the 'Point', 'LineString' and 'Polygon' geometry types? No really. They are just special cases of 'MultiPoint', 'MultiLineString' and 'MultiPolygon' with one element. I find myself writing code to take the special-case single element entities and put them into the more general multi-element entities. That is code I would much rather not write, since it introduces complexity and potential bugs. The only real difference between the two is the lack of a "Multi" and an extra set of enclosing brackets.
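For what it's worth, here is the sort of promotion shim the paragraph above is tired of writing; the type names follow the GeoJSON spec:

```python
def promote_to_multi(geometry):
    """Wrap single-geometry coordinates so that everything downstream
    only ever sees the 'Multi' form. A minimal sketch; real code would
    also handle GeometryCollection and validate inputs."""
    singles = {"Point": "MultiPoint",
               "LineString": "MultiLineString",
               "Polygon": "MultiPolygon"}
    gtype = geometry["type"]
    if gtype in singles:
        return {"type": singles[gtype],
                "coordinates": [geometry["coordinates"]]}
    return geometry  # already a Multi type: pass through unchanged

assert promote_to_multi({"type": "Point", "coordinates": [1.0, 2.0]}) == \
    {"type": "MultiPoint", "coordinates": [[1.0, 2.0]]}
```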

Wednesday, September 19, 2007

Fast Steradian Intersection Testing

I've come up with a pretty neat inclusion testing algorithm and I wanted to share it. The need is to test for inclusion of datasets within a camera view. Datasets are normally constrained to a certain area on the globe, but can cover the entire globe in some cases.

The simple way to do this is to make a bounding box with longitude and latitude coordinates and then just do intersection testing against a view bounding box. However, as you get farther from the equator the bounding boxes get progressively more distorted, and you can be either over- or under-inclusive in your datasets.

My solution is to use steradians, or solid angles. All you need is a normal vector and an angle: four floating point numbers, the same as a bounding box (actually I add a fifth to speed up calculation, but I'll explain that later). It performs uniformly anywhere on the sphere, and to specify global coverage you just set the angle to 180 degrees.

Now all you need to do is to calculate the steradian coverage for a dataset, and one for the visible camera volume and you can do inclusion testing with just five multiplies, three adds and a compare. It may be a little more work than a bounding box intersection test, but not much more and I think that it is an acceptable tradeoff for the properties that it provides.
It's simple really: the dot product between the steradian normals gives you the cosine of the angle between them. Intersection is indicated by an angle less than or equal to the sum of the two coverage angles, or:
dot(normal_a, normal_b) >= cos(angle_a + angle_b)

There is a nice formula for the cosine of a sum:
cos(a + b) = cos(a)cos(b) - sin(a)sin(b)

So if you store the sine and cosine of the angle rather than the angle you get:
dot(normal_a, normal_b) >= cos_a*cos_b - sin_a*sin_b

Simple, fast and effective. I haven't found a sub-O(N^2) algorithm to determine the minimum steradian for a dataset yet. It is fast and easy to use lat/lon bounding boxes, but you are going to have the same over-coverage problems at high and low latitudes, so it is best to calculate the minimum steradian for each dataset. At least you only have to do it once for static data.
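Lacking an exact minimum-cap algorithm, here is a cheap O(N) approximation one could use per dataset. This is my own sketch, not the post's method, and it assumes the points don't cancel out (i.e. aren't spread over the whole sphere):

```python
import math

def bounding_cone(points):
    """Approximate a dataset's bounding cap: use the normalized mean
    of the unit vectors as the cone axis, then take the largest angle
    from it. Not the true minimum cap, but a usable stand-in for the
    per-dataset steradian, computed in one pass."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    center = (cx / norm, cy / norm, cz / norm)

    def angle_to(p):
        d = sum(c * v for c, v in zip(center, p))
        return math.acos(max(-1.0, min(1.0, d)))  # clamp float noise

    return center, max(angle_to(p) for p in points)
```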

libkml: wtf?

So I saw mention that Google is soon to be releasing an open source library for the kml format.
Google will be releasing an open-source KML library in C++ that implements and tracks the standard as it progresses.

I can see two intended audiences for this library: kml content creators and content consumers. I just don't think it makes sense for either of them. For kml creators, why would you need to interface with a C++ library in order to create kml files? The answer is, you shouldn't need to. It's kind of like my post on Dreamweaver: if you know what you are doing it just gets in the way, and if you don't know what you are doing it is way too complex. A C++ library seems like overkill for writing out some xml text. I guess it could keep track of external and document-wide styles? Big deal.

If you are a kml consumer then it makes a little more sense to use a C++ library, but not much more. Using external libraries requires you to build a bridge between your code and the library's concepts. So libkml will be dictating what types of entities you support and how they are interrelated within your code. This restricts how you develop your internal classes: either you make a class structure identical to libkml's, or you build a conceptual bridge between your internal structure and libkml's in order to be compatible. And once you have either one, why would you need an external library just to parse the XML entities?
By providing a reference library it allows developers to more easily keep up to date with KML without having to maintain their own library and track standards changes.

So developers won't have to support any changes in the standard if they are using libkml? I guess it sounds more like it's for the kml content producers.
I guess an alternate explanation is that they are trying hard to make it seem like the standard will be truly open. Of course I'll take a look at it when it comes out to see what it's all about, but the whole concept seems like an exercise in futility.

Monday, September 17, 2007

JSON beats XML, some comments on GeoJSON

I've been setting up some new data services, and after experimenting with formatting my data in JSON (JavaScript Object Notation), I'm hooked. Programming mostly in C++ has made XML the easiest data format to use, but I'm very unhappy with the hurdles that one must go through to get from XML to the internal representation of the data. I've been doing a lot of Javascript and Python programming lately and am just blown away by how easy it is to create, maintain and share data through the JSON format.

For XML, on the server end one would take an internal data representation in Python and create custom classes to format each data item with the appropriate XML tag, convert it to a string and output the XML tree. Not too hard if it is a relatively shallow dataset without too much data nesting. On the client end, you would write classes that would decode the tree structure of the tags with full knowledge of each tag's data type (string, integer, floating point, etc...) and create a new data structure to mirror the server side data.

For JSON, you take your data structure, usually a dictionary structure with a name associated with a value, dump it directly to JSON with a single procedure call and save it to a file. On the client end, you load the data directly into a similar data structure with a single call. No special decoding classes needed, no XML tag data types (string, integer, float) to have foreknowledge of.
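The whole round trip the two paragraphs describe really is two calls; the quake record here is an invented example:

```python
import json

# "Server" side: one call dumps the internal structure.
quake = {"id": "us2007abc", "mag": 5.1, "lon": -122.3, "lat": 45.5}
payload = json.dumps(quake)

# "Client" side: one call loads it back, types intact -- no
# per-tag decoding classes, no schema foreknowledge needed.
restored = json.loads(payload)
assert restored == quake
assert isinstance(restored["mag"], float)
```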

I've been looking at the GeoJSON specification and it looks pretty nice, but with a few caveats. Say you have a LineString feature type: the coordinates are a list of lists, each containing 2 coordinate elements (longitude and latitude). This is extremely inefficient in space and processing time, which matters a lot to me with large datasets. I suggest using a single flat list of coordinates, as is done in other formats like GML and KML. Also, it would be nice if there were some sort of feature style information specified, like color, line width, line style and placemark icons.

I'll definitely be using my modified GeoJSON format when I return to C++ programming for EarthBrowser v3 soon. Sorry XML, please don't feel bad. It's not you, it's me. I just feel like we need a little space.

Tuesday, September 11, 2007

The Economist snubs EarthBrowser

Amongst yet another "gee whiz" article about the "geoweb," it is claimed that: 'Keyhole, an American firm, released the first commercial “geobrowser” in 2001.'

I guess it's too much to ask for a magazine to research each claim made in each article, but EarthBrowser, then called Planet Earth, was the first commercial "geobrowser." It predated Keyhole's existence by several years. In fact someone from Keyhole contacted me back then and inquired about purchasing the domain name, the EarthBrowser trademark and my customer list. The amount they offered was laughably low considering EarthBrowser had over 2 million downloads and sold over 20,000 copies by then.

On a different note, several months ago I alluded to working on a new project that was a bit of a diversion from the next great OpenGL version of EarthBrowser. Project Kraken is nearing completion now and it has become better than I'd imagined it could be. It's been hard not blabbing all about it on the #worldwind or #planetgeospatial irc rooms to get useful feedback, but I've restrained myself.

Here is some newly declassified information about Project Kraken:
  • It will be released within the next 30 days
  • There will be a free version
  • It is easily customizable
  • It will compete for mindshare with Google Earth

Wednesday, July 18, 2007

Reviews: Django, CSS, jQuery and DreamWeaver oh joy

One saying from my youth that has stuck with me was from my drumming instructor: "It's good for your beat to be ballsy, but the balls have to have hair on them." I took the meaning to be that a great idea isn't enough, it must be executed with finesse.

I am near completion of the secret project that I undertook a few months ago. With luck it will be unveiled in the next few weeks. Right now I am rewriting my website, which hasn't had a major update since 2003. I've had to update my skill set in the web design arena, since I have to do everything in this little business, and I've learned some little nuggets that may be of use to someone. On to the tool review...

Let me just say right now that I don't understand why Dreamweaver exists. It has a very complex interface and set of features, so it looks like it is for professional web designers, right? After trying to use it and spending all my time in its incredibly weak text editor to edit the HTML code, I just started using TextMate. I conclude that any expert web designer will just edit the raw html source, while beginners will be overwhelmed by the options and the difficulty of mapping the dialog box interface to the raw code. Again, who is the target market for Dreamweaver?

On to the server logic. I've looked at several CMS (Content Management System) packages such as Drupal, Joomla, ExpressionEngine, TurboGears and Django. I've decided that I'm pretty much done with PHP because it is so lame. Does anyone use Perl anymore? That leaves me with my preferred scripting language of Python and Django won the initial test phase hands down.

Django is actually very simple: it basically consists of models, views, url parsing and a templating system, which are all nicely intertwined. Your server parses the incoming url request and sends all the request info to a view function that you define. From there you choose what models (if any) are involved in the request along with any other data, perform operations on that data like sorting and filtering, then pass it to the templating system where it fills in subsections of your pre-defined HTML template. It took about 5 minutes to make a webpage that listed all of the recent earthquakes over magnitude 4, sorted by most recent, with a relevant link and title for each. Of course that doesn't include getting that quake information updated in the database; that is a whole other story. There are also nice little modules to do RSS or Atom feeds, blog posts and many other neat features. I'm now rewriting my purchasing system in Python instead of PHP and I couldn't be happier about it. If you are in search of a CMS for your site and don't have your heart set on PHP, you must try Django.
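That five-minute earthquake page boils down to the filter, sort and template pipeline just described. Here is a framework-free sketch of the view logic; the quake records and field names are invented for illustration, and in real Django the ORM query and template engine take the place of the list comprehension and string.Template:

```python
from string import Template

# Stand-in for the model layer: a few invented quake records.
quakes = [
    {"id": "q1", "mag": 4.6, "time": 20070917, "title": "Off Oregon coast"},
    {"id": "q2", "mag": 3.9, "time": 20070916, "title": "Central Nevada"},
    {"id": "q3", "mag": 5.2, "time": 20070915, "title": "Kuril Islands"},
]

def recent_big_quakes(quakes, min_mag=4.0):
    """The 'view': filter the model data, then sort newest-first."""
    big = [q for q in quakes if q["mag"] > min_mag]
    return sorted(big, key=lambda q: q["time"], reverse=True)

# The 'template': fill a row per quake with a link and title.
row = Template('<li><a href="/quake/$id/">M$mag $title</a></li>')
html = "\n".join(row.substitute(q) for q in recent_big_quakes(quakes))
print(html)
```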

I've decided that giving my site that Web 2.0 feel isn't enough; I'm going with web 2.1. Learning CSS is now mandatory. It isn't that complex: just the ability to set up some inherited styling options for HTML elements. You just have to know what tags you are using and what styling options look best for what you want your site to look like. jQuery is fast becoming my friend and I haven't even done much with it yet. It is a set of Javascript functions that lets you alter the structure of your site when certain events happen, like the page loading or a button or link being clicked. It abstracts away the XMLHttpRequest, handles JSON data, animates elements and a lot more. If you design websites, use it!

In conclusion, if someone were to ask me what they should use to create a fairly complex website, I would say: set up a dedicated server with Apache and MySQL running the Django framework for the server-side logic, edit your HTML and CSS with a nice text editor like TextMate, and use jQuery for the client-side logic. Don't bother with Dreamweaver.

Wednesday, July 11, 2007

Chewing Tobacco with Bill Gates

I've seen some famous people in my time, but have talked to very few. Perhaps meeting Linus Pauling, who was a personal hero, and making a complete fool of myself has made me a bit shy. Mr. Pauling gave a talk to my first year chemistry class at Reed College in 1985. After the lecture, I approached him and he gave me his full attention with everyone watching the exchange. I could tell that everyone was expecting a thought-provoking question with an insightful and wise response from him. I kind of froze and asked a confused, misinformed clarification about his position on vitamin C. He corrected me politely and moved on, leaving me groaning inside for not having a prepared question.

In 1989 during summer break, I worked for Microsoft and was quite the wiseass. The marketing folks asked for quotes from the CalTech summer hires to put in the school newspaper the next year, and I thought up some ultra-geeky quote that had a hidden, and very rude, message in it; everyone had a chuckle over the fact that the marketing people thought it was real. Toward the end of the summer, all of the interns were invited to Bill Gates' house for an afternoon barbeque. At that time in my life I thought I was being a rebel by smoking nasty convenience store cigars and chewing tobacco. So there I was at Mr. Gates' house with perhaps my second "wad of chew" ever in my mouth; it was my last. I went to the bathroom to spit it out as it was making me feel sick, but there was a line about 6 people deep. I looked behind me and Bill himself was behind me in line, and he struck up a conversation with me about what I was doing that summer outside of work. I told him that I was taking windsurfing lessons and asked him about the garage for boats he has near the lake edge. I realized that I'd flubbed another chance to ask someone famous a probing and insightful question. Feeling sicker and sicker as the nicotine leached into my bloodstream through my raw gums, it was finally my turn and I made sure there was no residue for Mr. Gates to find as I spit it into the sink. The fact that one of the richest men in the world was waiting in line to use his own bathroom so he could talk to his young employees is amazing. For that, I will always have a great deal of respect for him.

I'm not even going to go into what happened when I met Vatos, the drummer from Oingo Boingo.

Tuesday, June 05, 2007

Google getting into GPU/multicore programming

No one can say that they are dumb. I've written about GPU programming and its enormous potential in previous posts. However, the news that Google is purchasing PeakStream is still a little surprising.

Personally I prefer the open source libsh library to any commercial offerings. However, I don't know if it is still being actively developed since McCool started up RapidMind, the biggest PeakStream competitor. Both RapidMind and PeakStream are very lame company names, by the way. Where does this leave NVIDIA's CUDA?

PeakStream's and RapidMind's technologies are probably more geared toward multi-core processors since that is where the high end bucks are, which is usually where smaller companies aim. However it seems to me that the whole stream processing market will eventually be dominated by the graphics card makers since they comprise the low end and ubiquitous commodity market. Either that or multi-core and graphics chipsets will merge eventually, which is probably the most likely scenario.

Thursday, May 31, 2007

Blue Moon Tonight

See a rare blue moon tonight.

A blue moon is the second full moon within the same calendar month. The moon will not actually be blue...

New version of Blue Marble Next Generation

Just got word from Reto Stockli that there's a new version of BMNG out. It hasn't hit the mirrors yet, but it should be available shortly.

Changes include:

1) Replacement of the standard BMNG Antarctica by use of a colorized version of the high quality MODIS Mosaic of Antarctica by Terry Haran and Ted Scambos (NSIDC)
2) Fix of Southern Hemisphere sea floor topography errors (GEBCO version 1.02, and bug in my code)
3) Fix of Arctic 80N-90N land mask inconsistencies between the MOD12Q1 and GEBCO sea floor topography. Remaining 80N-90N inconsistencies are due to the fact that there are no MOD09A1 land surface reflectances in this area and it was painted with permanent snow in BMNG during all months.

I'm excited for the new dataset! It isn't mentioned, but I hope that they fix the hillshading in southern Crete.

Thursday, May 17, 2007

Dynamic Relief Mapping

EarthBrowser 3.0 is starting to shape up nicely. I've created a normal map generator using a Sobel filter and SRTM data. Normal maps are just textures representing compressed normal directions, and they enable one to create the illusion of very detailed geometry with the help of hardware accelerated shaders.
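Here is a minimal sketch of the Sobel-based generation step, pure Python for clarity; an SRTM-sized tile would want numpy or C++, and the 8-bit-per-channel packing convention is the common one, assumed rather than taken from EarthBrowser:

```python
import math

def normal_map(height, spacing=1.0, strength=1.0):
    """Build per-pixel surface normals from a heightfield using 3x3
    Sobel gradients, packed 0-255 per channel the way normal-map
    textures usually store them."""
    h, w = len(height), len(height[0])

    def at(y, x):  # clamp samples at the edges
        return height[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Sobel X and Y gradients of the height surface
            gx = (at(y-1, x+1) + 2*at(y, x+1) + at(y+1, x+1)
                  - at(y-1, x-1) - 2*at(y, x-1) - at(y+1, x-1)) / (8*spacing)
            gy = (at(y+1, x-1) + 2*at(y+1, x) + at(y+1, x+1)
                  - at(y-1, x-1) - 2*at(y-1, x) - at(y-1, x+1)) / (8*spacing)
            nx, ny, nz = -gx * strength, -gy * strength, 1.0
            inv = 1.0 / math.sqrt(nx*nx + ny*ny + nz*nz)
            # compress each component from [-1, 1] into an 8-bit channel
            row.append(tuple(int(127.5 * (c * inv + 1.0)) for c in (nx, ny, nz)))
        out.append(row)
    return out
```

A flat heightfield packs to the familiar uniform light-blue normal-map color.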

Africa Relief

Asia Relief

Using the mosaic and raster classes that I described in my previous post, I generate a normal map in the gnomonic projection and put that together with the texture map and a WGS84 ellipsoid, and it is starting to look pretty nice. The great thing about it is that the shadows move where they should when the light source moves. Someday I'll make a YouTube video to demonstrate that effect.

Thursday, May 10, 2007

Hexagonal Dataset Rundown

The Icosahedral Hexagonal grid system for the next version of EarthBrowser is nearly complete. This was no trivial task, and it required several helper technologies in order to be possible. One of the primary motivations was to be able to represent the northern and southern latitudes as accurately as any other area on the globe, which it does seamlessly.

Unfortunately the size of the datasets is not significantly smaller as I had hoped. The size of the BMNG (Blue Marble Next Generation) dataset in the generic Plate Carrée projection is just over 6 Gigabytes uncompressed, and in the hexagonal grid format it comes in at roughly the same. The full 500 meter per pixel hex grid consists of 162 tiles roughly 4096x4096 in size. Each hex tile image is a square in the Gnomonic projection centered on the hexagon tile center. A similar resolution tile set in Plate Carrée would be 128 tiles of size 5700x5700. Theoretically the dataset size should be much smaller, but I added about 5% padding on each edge, and each corner of the square image holds redundant data that is also represented in adjacent hexagons.

Compressed into Jpeg2000 format, I can get the whole dataset down to around 250 Megabytes without noticeable compression artifacts. I think that is better than you can get from Plate Carrée, since land is over-represented there and is less compressible than ocean areas.

In order to make all of this work, I reused a nice little tool I had built for my Landsat 15 meter dataset. The class is called mosaic, and it will take a set of geo-referenced datasets in any projection and build a new geo-referenced image in any other projection. This has made creating the hexagonal tile dataset very simple. The mosaic class can read Jpeg2000, MrSID, GeoTIFF, ECW, jpeg, png or even raw images. It also takes advantage of the ability of the Jpeg2000 and MrSID formats to supply reduced resolution subsets of images in order to speed up the processing. In future versions I'll add netCDF and some other neat formats out there. It's all very fast too in optimized C++ code, much faster than gdalwarp.

The mosaic class is possible due to another class that has become the very heart of the new EarthBrowser program. The raster class is useful not only for importing and exporting data from image files, but also in the graphics "game" engine for vertex buffers. Raster can be used to represent any block of data based on its height, width, depth and storage type (8-bit, 16-bit unsigned, float, double, unsigned 64-bit, etc.). It can also have an arbitrary interleaved ordering: interleaved by pixel, by line or by plane (bip, bil and bsq). It can be geo-referenced with a supplied origin, resolution, rotation, projection and datum. It can also be subsampled with various sampling kernels like bilinear, cubic convolution, cubic spline and even nearest neighbor! I tried Lanczos but failed and gave up.
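A sketch of what such a raster abstraction might look like; the field and method names are my own invention, not EarthBrowser's actual C++ API, with Python standing in for brevity:

```python
from dataclasses import dataclass

@dataclass
class Raster:
    """A typed, geo-referenced block of data: dimensions, storage
    type, interleave order, plus the geo-referencing described in
    the post. Illustrative only."""
    width: int
    height: int
    bands: int
    dtype: str          # e.g. "uint8", "uint16", "float32"
    interleave: str     # "bip", "bil" or "bsq"
    origin: tuple       # (x, y) of the top-left corner in the CRS
    resolution: tuple   # units per pixel in (x, y)
    projection: str     # e.g. a PROJ.4 string
    data: bytes

    def pixel_to_geo(self, col, row):
        """Map a pixel coordinate into the raster's CRS (north-up,
        no rotation term in this sketch)."""
        return (self.origin[0] + col * self.resolution[0],
                self.origin[1] - row * self.resolution[1])
```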

Due to the flexibility of the mosaic and raster classes, I build the Hexagonal dataset from BMNG and SRTM Plus (fused elevation and bathymetry) with the very same set of calls, with just a different source dataset name and output raster format. I decided to use bilinear on SRTM and cubic convolution on BMNG too. Of course all of this functionality will be available in EarthBrowser 3.0 from the Python console or from your own imported Python scripts.

EarthBrowser 3 is going to be a quantum leap from version 2. Now if I can just nail down the rights to use the i-cubed 15 meter Landsat dataset I won't have to waste my time on that again!

Thanks for the (indirect) mention Mr. Hanke

The much awaited blog from the Google earth/maps team has now arrived, and it doesn't disappoint. I look forward to hearing more about many things Google is doing or planning, especially details of the collaboration with NASA. I sure hope that NASA puts any generated earth data in the public domain. John Hanke kicks it off with his perception of today's online geo-referenced world:
I don't think that there is agreement on what the geoweb is, but I think there is a lot of enthusiasm and energy across many fronts to make it happen. I expect the "it" will evolve substantially over the next few months and years as we (the geo ecosystem on the web) collectively figure out how "earth browsers," embedded maps, local search, geo-tagged photos, blogs, the traditional GIS world, wikis, and other user-generated geo content all interrelate.

EarthBrowser, the original "earth browser" on the web, will continue to be a part of the evolving "geoweb," for lack of a better term. I'll be posting an update later today on some of the cool new technologies coming out in EarthBrowser 3.0.

Tuesday, March 20, 2007

Project Kraken and EarthBrowser circa 1998

For the past two weeks I've been working on a secret project (codename Kraken), and it will be the perfect complement to the next version of EarthBrowser. I'll give out some hints and details in the coming weeks about its potential uses and capabilities. However, before I reveal the future, I wanted to share the earliest snapshot of EarthBrowser caught in the wild...

I haven't kept many records from the time, but I looked up my old website from Dec 6, 1998 in the wayback machine. It featured the first version of EarthBrowser which was called Planet Earth at the time. I couldn't get the domain name for that so I changed the name to EarthBrowser at the end of 1999. Anyone who is interested can take a look in the wayback machine here. Unfortunately there are no images saved, they were pretty cheesy anyway. Once Planet Earth started selling, I quit my job and have been working on it full time ever since.

Here's the main text:

Planet Earth 1.0
Have you ever wanted to see the Earth from space? Now you can see what it looks like right on your desktop. Planet Earth displays a realtime 3-dimensional model of the Earth with the current cloud information downloaded directly over the internet. Night and day shadows are updated continuously and you can rotate it to view any spot on the Earth.

  • Cloud information updated every 6 hours.
  • Areas of night and day are updated continuously.
  • Real 3-dimensional model can be rotated to reveal any part of the earth.
  • Zoom in or out
  • Position viewing location over any spot on Earth
  • Rotates freely or stays in fixed position

  • Macintosh with PowerPC processor
  • Internet connection (for clouds)
  • System 7.5 or above
  • 1 MB Disk
  • 4 MB Ram

Mac System 7.5, that's old!! I remember trying to get it to work on the Motorola 68000 processors, but they were just too underpowered in floating point...

Monday, March 19, 2007

15 meters of Landsat

I've been working on a new Landsat 15 meter per pixel base map for version 3.0 of EarthBrowser which is due out later this year. Tasks like this make it painfully obvious that I am crazy to be doing this all by myself. Here is where I'm at with that...

A couple of years ago I decided that I needed a royalty-free base map for EarthBrowser that looks better than the one available through the OnEarth WMS service, which EarthBrowser currently uses. My only option at the time was to download the raw data and process it myself. I wrote a script that downloaded the raw tiles slowly, so as not to overtax the servers. I was looking at a long, uphill battle to get what I wanted. Now I have all of the raw data, but it needs a lot of pre-processing to become a seamless mosaic without abrupt changes in contrast and quality from tile to tile.

Then in November of last year I found just what I was looking for: a set of pre-mosaiced 15 meter Landsat data from NASA's John C. Stennis Space Center. Two of the color channels are in the infrared spectrum, which is great because the visible color channels often have haze in them, making it difficult to get a seamless looking image, while the infrared cuts through the haze. However, this means that it looks strange because it isn't a true color image. More image processing needed...

I kept going and began taking scenes in different regions of the world, re-coloring them by hand and using the differences between the two images to "train" an algorithm to color the entire dataset in a more realistic color scheme. First I tried using principal component analysis to reduce the dimensionality of the problem. This didn't work too well since the principal components changed from region to region. I finally settled on using an RGB to HSV conversion to separate the color components. I then use a fairly simple piecewise polynomial approximation using Jacobian matrix decomposition in three dimensions, after which I re-project back into RGB space, and it doesn't look too bad!
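The HSV-space recoloring step can be sketched with the standard library's colorsys. The trivial hue shift below is just a stand-in for the trained piecewise polynomial fit described above, which I don't have:

```python
import colorsys

def recolor(pixels, hue_map):
    """Convert each RGB pixel to HSV, remap the hue with a supplied
    1-D function, and convert back. A real version would remap all
    three HSV dimensions with the fitted polynomials."""
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        r2, g2, b2 = colorsys.hsv_to_rgb(hue_map(h), s, v)
        out.append((round(r2 * 255), round(g2 * 255), round(b2 * 255)))
    return out

# e.g. nudge infrared-looking reds around the hue wheel
shifted = recolor([(200, 80, 60)], lambda h: (h + 0.25) % 1.0)
```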

However, I am now running up against a problem that I didn't think about until I got to this step (doesn't it always work that way). This dataset has been compressed using a "lossy" wavelet compression algorithm. That means that my polynomial approximations will work for one trained tile but won't transfer well to other areas of the mosaic due to the inconsistency of color values. I'm getting little yellow dots appearing randomly, and big swaths of green where it should be dark blue, due to compression artifacts. Arrrrgh!

As an example, just recently I was contacted by an environmental organization from the Greek island of Lesbos asking if I could donate a high resolution image of Lesbos for a mural at their center. Happy to help an environmental organization, and as a first real test of some of the new capabilities of EarthBrowser, I made a high res image to their specifications and unfortunately had to do a lot of hand editing in order to make it look good. Here is a before and after showing my color transform algorithm in operation.

Unprocessed NASA Landsat Lesbos

EarthBrowser Processed NASA Landsat Lesbos

Hand Edited EarthBrowser processed Lesbos

It is clear from this test that I need the data in an uncompressed format in order for my algorithms to work properly. After inquiring around, I have finally found someone within NASA that can sell me the uncompressed data on hard drives but it is $16,000 and they are not quite clear on whether it is the exact dataset that I am referencing. Needless to say, I can't afford that. So I'm kind of stuck.

There is some hope though: just recently I read on the excellent Bull's Rambles Blog that a fantastic new Landsat 15m dataset has been "donated... to the public domain." One would assume that the use of NASA resources to produce this dataset would make the product's being in the public domain a prerequisite. I've been in contact with the WorldWind project lead Patrick Hogan to inquire about its availability, and they understandably want to have it "premiered" by WorldWind. This leads me to believe that it will be generally available at some point, but there has been no definitive answer yet.

I'm keeping my fingers crossed because I'd really rather spend my time programming.

Thursday, March 08, 2007

Sweet Sweet Amazon S3

Another gigantic hurdle overcome! Last night I woke up around 4am and lay there thinking about all of the sweet simplifications possible. I've just signed up with Amazon S3 (Simple Storage Service) and I think it just lowered my blood pressure a bit. The Coding Horror blog, which is fast becoming a favorite, details their efforts at reducing server load by putting static images on S3. I've heard about S3 since it came out but thought it too complicated and didn't want to be an early adopter; now I'm sold.

I run two separate servers for EarthBrowser, one for the website and one as the dedicated data server for all of the EarthBrowser instances out there. Whenever I put out a release, my web server would get slammed and become unresponsive as thousands of people each downloaded the 10 megabyte installer file. I have actually written and implemented, in PHP, a distributed, caching data server toolset that would allow me to rent several cheap $7 Dreamhost setups and plop it in to increase my capacity. The servers would talk to each other and update their internal datasets for more frequently requested items. I was even considering finding some European or Australian hosting sites to get closer to the rest of the world in anticipation of EarthBrowser v3 coming out later this year. It was working pretty nicely, but the administrative and logistical overhead was getting to be a bit overwhelming. Now I don't have to worry about storing and managing a group of servers to distribute EarthBrowser v3 data. With Amazon S3 I have virtually limitless capacity and limitless download bandwidth for about $0.20 a gigabyte.

Now all that work can be thrown away, which makes me both cringe and sigh in relief at the same time. Often I work for a week or two on a technology and wind up throwing it away when something better comes along. Sometimes only through the act of creating and exploring a problem space can better alternatives be discovered. That is why product specs should never be static documents. This time the better solution came along and whacked me on the head.

It is a little tricky to get working well since it uses a SOAP or REST interface with some quirks. I wrote a Python script that works with the excellent boto module to allow anyone with an S3 account to create/delete/update objects and buckets from the command line. I'm making it open source and downloadable for anyone who wants it. If you have any suggestions, let me know. There are some issues if you are using Python 2.3 or below; contact me and I'll tell you how to fix them.
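The quirkiest part of the REST interface is request signing: every request carries a base64-encoded HMAC-SHA1 of a canonical "string to sign" built from the HTTP verb, content headers, date, and resource path. Here is a minimal standard-library sketch of that scheme (the key and bucket/object names are made up for illustration):

```python
import base64
import hashlib
import hmac

def s3_signature(secret_key, verb, content_md5, content_type, date, resource):
    """Compute the S3 REST Authorization signature:
    base64(HMAC-SHA1(secret, StringToSign))."""
    # Canonical string: verb, MD5, type, date, and resource, newline-joined
    string_to_sign = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")

# Hypothetical credentials and request -- not a real key or bucket
sig = s3_signature("EXAMPLE_SECRET_KEY", "GET", "", "",
                   "Tue, 27 Mar 2007 19:36:42 +0000", "/mybucket/clouds.jpg")
# The request header then looks like: Authorization: AWS <access_key_id>:<sig>
```

Libraries like boto build and attach this header for you, which is exactly the tedium worth outsourcing.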

Monday, March 05, 2007

The virtual globe as a software platform

Web browsers are a very interesting and instructive case of a software platform that is completely dependent upon the network but lives on the desktop, as virtual globes should. Browsers have a language that controls the content (HTML), a language that provides functional logic (Javascript) and a plugin architecture. The magnitude of the world wide web is a testament to just how powerful a platform the web browser is. But what elements make it so successful? A display language is a given, since the browser would not exist without it, but that is not sufficient. The network is where the full power becomes apparent: the ability to algorithmically define content for local display on remote computers is where the real power comes from. However, server-side control of content has many limitations; HTTP's stateless connections have been the headache of many a web developer. We are now starting to see some of the real potential of the web browser unleashed with the new "Web 2.0" hype surrounding the Javascript language. What we are really seeing is control of the display being performed by the client rather than a remote server. Imagine that: client-side control of an application. We're back to the 1980s, and it's about time!

I have no doubt that each of the development teams of the major virtual globes want their creation to become the default platform for geospatial content. Why shoot for anything less? Let's take a look at where things stand for the big guys and give a little update on where EarthBrowser is headed.

Google Earth: The KML format gives novice users the ability to add their own simple content to Google Earth, much like a static web page. The network link provides the ability to get dynamic, server-based content into GE, but it is very "Web 1.0," with no client-side processing. They are currently at the top of the game. They have a tremendous asset in their enormous and growing dataset. However, as I mentioned in a previous post, they will run up against some limitations of their design if they don't change course; their biggest problem right now is how to make money from it.

WorldWind: This product has advanced the farthest in the key platform categories. It supports KML and has a plugin architecture as well. There have been some recent developments which might allow one to use Python for client-side processing. Their early decision to use .NET has delayed its becoming cross-platform, though a Java port is on the way. I am generally underwhelmed by Java applications but am hopeful they will pull it off. Being from a government agency, they don't have the same pressures as a commercial entity. This, and being open source, gives them an advantage in the platform building arena. WorldWind has the best shot at becoming the platform of choice for researchers, if they can just convince them to write in Java.

ArcGIS Explorer: It seems to me the primary purpose for ESRI putting this out is to keep their customers from migrating to Google Earth for certain tasks. I understand that they are licensing the engine from another company (Skyline Globe or GeoFusion?) and are therefore not in control of their own product. Apparently they allow some sort of scripting (a restricted ArcScript?) functionality. It has the potential to be something great, but ESRI doesn't really have the DNA to be a mass market product leader. Future enhancements are clouded by not owning the source code.

Microsoft's Virtual Earth: VE is a strange bird in that it is an ActiveX browser plugin and not a full-fledged application, though I see little difference in the installation procedure. Like ESRI, they seem to be trying to stave off Google Earth and keep the market open by providing an alternative. However, they want to lock you in to using Windows and Internet Explorer at the same time. It has some really neat features, it is scriptable, and I'm certain it will push Google to come up with bigger and better solutions. They, like Google, are going to have a hard time monetizing this product; the whole billboard thing is kind of stupid and belongs in something like Second Life. Perhaps that is what they are ultimately shooting for. The battle for the mass consumer geospatial market will be between Microsoft and Google.

Where is EarthBrowser in this list? Right now it is not in the same league as the other offerings. It does have some nice features that are just now being added to some of the other offerings like live cloud animations, earthquake data, sea surface temperature animations and weather forecasts. Version 3 is a complete re-write that is years in the making. It will have all of the features that are currently in version 2 integrated into a full hardware accelerated 3D environment. You will be able to smoothly fly through mountains, into and out of the atmosphere. It will support most of the popular raster and vector formats and enable you to link to sources of data on the internet. There are quite a few other neat little tricks in store that I'll reveal as it gets closer to the release date.

Wednesday, February 21, 2007

Global Grid: The Icosahedral Hexagonal Grid

I've been working with high resolution earth images for almost a decade now in various forms. In the late '90s I flew down to visit Tom Van Sant, creator of the GeoSphere Project and the first person, to my knowledge, to have created a high resolution satellite composite of the entire earth without cloud cover. Early versions of EarthBrowser used the GeoSphere base map and were featured in several museum kiosks around the world as part of the GeoSphere project. Since then EarthBrowser has moved on to higher resolution datasets, but the basic limitations that were present then remain.

The need to represent a sphere with a series of rectangles has offended my aesthetic senses for a long time. It seems simple enough, images are two dimensional and the surface of an ellipsoid is also two dimensional. The basic problem is that a planar representation of an ellipsoid requires distortion, generally around the poles. This isn't a huge problem since you can re-project the image to eliminate this distortion. The problem for me is that there is an unacceptable amount of waste. More data bins (pixels) have to be created via a resampling stretch near the poles only to be resampled down again when used. Yuck!
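The waste can be quantified. In an equirectangular grid, a row of pixels at latitude φ covers only cos(φ) of the ground distance that an equatorial row covers, so averaged over all rows, only about 2/π ≈ 64% of the stored pixels carry independent ground information. A quick sketch of that calculation:

```python
import math

def useful_fraction(rows):
    """Average of cos(latitude) over the pixel rows of an equirectangular
    grid -- the fraction of stored pixels that map to distinct ground area."""
    total = 0.0
    for i in range(rows):
        # latitude at the center of row i, ranging from -90 to +90 degrees
        lat = math.pi * ((i + 0.5) / rows - 0.5)
        total += math.cos(lat)
    return total / rows

print(round(useful_fraction(1800), 4))  # approaches 2/pi ~ 0.6366
```

In other words, roughly a third of an equirectangular global mosaic is redundant pole-stretch.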

I've been obsessing about this for the past year and came up with what I thought was an elegant solution, only to discover later that others had been there before (of course!). The basic idea is that the plane can be tiled by only three regular polygons: the equilateral triangle, the square and the hexagon.

The basis of the EarthBrowser version 3 dataset will be the dodecahedron. From the dodecahedron you can surround each of its 12 pentagons with hexagons to generate a soccer ball-like solid (sometimes referred to as a buckyball). Higher and higher numbers of hexagons can be filled in between the 12 pentagons to give you a set of seamless spherical hexagons, plus the original 12 pentagons. This allows you to cover the globe at varying resolutions.
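The cell counts for this family of grids follow a simple pattern. Assuming the simplest "class I" construction (the Goldberg polyhedron GP(n, 0), where n = 1 is the dodecahedron itself), there are always exactly 12 pentagons, and the hexagon count grows as 10(n² − 1):

```python
def hex_grid_cells(n):
    """Cell counts for a class-I Goldberg grid GP(n, 0):
    always 12 pentagons, plus 10*(n*n - 1) hexagons."""
    pentagons = 12
    hexagons = 10 * (n * n - 1)
    faces = pentagons + hexagons           # = 10*n*n + 2 total cells
    vertices = 20 * n * n
    edges = 30 * n * n
    assert vertices - edges + faces == 2   # Euler's formula sanity check
    return pentagons, hexagons, faces

# n = 1 gives (12, 0, 12): the plain dodecahedron
for n in (1, 2, 4, 8):
    print(n, hex_grid_cells(n))
```

So each doubling of the resolution parameter roughly quadruples the cell count, while the 12 pentagonal "seams" stay fixed at the original dodecahedron's face centers.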

This solution is a very elegant one and is already beginning to be used in several areas like climate and ocean modeling. Getting away from the rectangular grid creates some real complexity issues with indexing, dataset management and data projection. However the benefits of minimizing cell to cell distortion often outweigh the extra complexity for certain uses.

For global earth viewers like EarthBrowser, Google Earth and World Wind, I'm not sure the extra complexity would normally be worth the savings in data size. The data tiles have to be in special map projections and are hard to edit and manage. However, in EarthBrowser v3 I have created some tools and techniques that complement its hexagonal grid in such a way as to eliminate many of the problems and turn several negatives into positives. I'll save discussion of those components for future posts. Having this global grid technology makes EarthBrowser very flexible for future enhancements that I hope will include the ability to perform data modeling on global datasets.

But best of all it satisfies my aesthetic sensibilities.

Tuesday, February 20, 2007

More on GPU processing

Things are moving rapidly in the Graphics Processing Unit (GPU) programming arena. NVIDIA just released CUDA, a GPU compiler toolkit that allows applications to offload work to the GPU. Most GPUs run about 10 times faster than their host CPU on the workloads they are designed for, and are ideal for running uniform mathematical operations on large sets of data.

NVIDIA provides a nice compiler that does the hard work of translating your math into GPU assembly code: you provide the source code, compile it, upload the program and data to the graphics processor, and read back the results. ATI is apparently going with a much more barebones approach, letting you program the GPU directly in assembly. That is more flexible, but it leaves someone else to develop and support a higher-level toolset. I like NVIDIA's solution better, though I don't think they allow direct access to the assembly, so it is a little less flexible.

GPU programming is going to become a big deal in scientific computing in the next few years.