Thursday, September 20, 2007

Fast Steradian Intersection Correction

I just wanted to go back and mention that if the sum of the two steradian coverage angles is greater than 180 degrees, the simple formula breaks down, because cosine is only monotonically decreasing between 0 and 180 degrees. However, we know that if the sum of the angles is greater than 180 degrees then by definition the coverages intersect. Because of this, we should store the steradian angle itself along with its sine and cosine.
So the revised intersection test algorithm becomes:
angle_a + angle_b > pi or dot(normal_a, normal_b) >= cos_a*cos_b - sin_a*sin_b

Not as pretty. A speedup for large coverages :-) but a slowdown for everything else... :-(
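A sketch of the revised test in Python (the names are my own for illustration, not anything from EarthBrowser): each coverage stores its normal, its angle, and the angle's sine and cosine, and the wrap-around case short-circuits before the dot product.

```python
import math

def make_coverage(normal, angle):
    # Store the angle itself alongside its sine and cosine, so the
    # wrap-around case can be detected without inverse trig.
    return (normal, angle, math.cos(angle), math.sin(angle))

def intersects(a, b):
    na, angle_a, cos_a, sin_a = a
    nb, angle_b, cos_b, sin_b = b
    if angle_a + angle_b > math.pi:
        return True  # combined coverage spans more than 180 degrees: guaranteed intersection
    dot = na[0]*nb[0] + na[1]*nb[1] + na[2]*nb[2]
    return dot >= cos_a*cos_b - sin_a*sin_b
```

The extra angle comparison runs on every test, which is the small across-the-board slowdown mentioned above; it only returns early for the very large coverages.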

Simplified GeoJSON proposal

I've come up with an extremely simplified GeoJSON example which can do everything that the current GeoJSON specification can do, but with only 4 core elements. Every element is a feature and can contain a list of points, lines and polygons, or other features. Everything else in the GeoJSON spec is pretty much taken from some OGC standard or another. It's important not to get caught up in the past; we are at the beginning of a potential standard, and as Sean said in his post: "there's only one first chance to get a standard right." The only part of this that I think might be of questionable usefulness is the nesting of features.

{
    "points":[
        [x0,y0], [x1,y1], ..., [xn,yn]
    ],
    "lines":[
        [x0,y0, x1,y1, ..., xn,yn],
        ...
    ],
    "polygons":[
        [
            [x0,y0, x1,y1, ..., xn,yn],
            ...
        ],
        [
            [x0,y0, x1,y1, ..., xn,yn],
            ...
        ],
        ...
    ],
    "features":[
        {
            "points":[
                [x0,y0], [x1,y1], ..., [xn,yn]
            ],
            "lines":[
                ...
            ]
        },
        ...
    ]
}
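One nice property of the simplified layout is that it round-trips through a stock JSON library with no special handling. A quick Python example (the feature and its coordinate values are made up for illustration):

```python
import json

# A hypothetical feature in the simplified format: every element is a
# feature, and may hold points, lines, polygons or nested features.
feature = {
    "points": [[-122.68, 45.52], [-122.66, 45.53]],
    "lines": [[-122.68, 45.52, -122.66, 45.53, -122.65, 45.51]],
    "features": [
        {"points": [[0.0, 0.0], [1.0, 1.0]]}
    ]
}

text = json.dumps(feature)   # serialize in one call
parsed = json.loads(text)    # parse in one call, no custom decoder
```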


Also, an optional "crs" coordinate reference system element can contain an EPSG code, ESRI WKT (Well Known Text), a PROJ.4 projection string, or all three. If none is specified, the default is decimal degrees in the WGS84 datum. The coordinate reference system of the parent cascades down to all children until a child specifies its own.

"crs":{
    "epsg":"4326",
    "wkt":"COMPD_CS[\"OSGB36 / British National Grid...",
    "proj4":"+proj=utm +zone=15 +ellps=GRS80 +datum=NAD83 +units=m +no_defs"
}
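The cascading rule takes only a few lines to implement. A Python sketch of the idea (the function name and traversal are mine, not part of the proposal):

```python
DEFAULT_CRS = {"epsg": "4326"}  # decimal degrees, WGS84 datum

def walk_with_crs(feature, inherited=DEFAULT_CRS):
    # Yield each feature with its effective crs: a feature's own "crs"
    # overrides the inherited one, otherwise the parent's cascades down.
    crs = feature.get("crs", inherited)
    yield feature, crs
    for child in feature.get("features", []):
        yield from walk_with_crs(child, crs)
```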



Another useful item would be an optional "bounds" element that specifies the bounding envelope in the element's crs.
"bounds":[min_lon, min_lat, max_lon, max_lat]
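Computing the bounds element for a flat coordinate list is trivial; a small Python sketch (function name mine):

```python
def bounds(flat_coords):
    # flat_coords is a flat [x0, y0, x1, y1, ...] list; returns
    # [min_lon, min_lat, max_lon, max_lat] in the element's crs.
    xs = flat_coords[0::2]
    ys = flat_coords[1::2]
    return [min(xs), min(ys), max(xs), max(ys)]
```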


Minimalistic and precise.

GeoJSON redux

Sean Gillies has a post responding to my call for a simplified coordinate representation in GeoJSON. His argument is that clarity of representation is more important than implementation overhead, which I agree with to some extent.

However, it seems to me that the reason that JSON is so much better than XML for many purposes is exactly that it *does* take into account implementation overhead, thereby making it easier to exchange internal data structures without the overhead of XML.

For any sane implementation of JSON, the following must be extracted as a list of lists:

[ [x1, y1, z1], ..., [xn, yn, zn] ]

So a list of 10,000 coordinates will give you 10,000 lists of 3 floating point numbers each: 10,000 list structures and 30,000 floats. Each list takes time and memory to create, to address and to extract elements from. Whereas:

[ x1, y1, z1, ..., xn, yn, zn ]

gives you the same 30,000 floats but only one list. A big win for any standardized JSON reading algorithm. Creating a custom, context-sensitive JSON parser to ignore the structure is more than a little implementation detail.
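To make the point concrete, here is a small Python sketch: the flat layout parses as a single list, and coordinate tuples can still be recovered with cheap slicing when, and only when, they are needed (the data values are made up):

```python
# One flat list: a single list object holding 3n floats.
flat = [1.0, 2.0, 3.0,
        4.0, 5.0, 6.0,
        7.0, 8.0, 9.0]

# When tuples are actually needed, slicing recovers them cheaply;
# until then nothing beyond the one list is allocated.
points = list(zip(flat[0::3], flat[1::3], flat[2::3]))
```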

Strangely, after arguing against removing context information, Sean then requests just that. He suggests removing the "type" element that describes what kind of GeoJSON element you are looking at, since it should be obvious from the type of request you made to receive the GeoJSON content. What if you were to receive a set of files with different kinds of data: some single features, some feature sets, and perhaps some bare geometry elements? If you remove the type field from the geometry elements, how would you know what kind of geometry you have? You couldn't tell a Point from a Polygon without it.

All that said, I do have another simplification for GeoJSON that is unrelated to the previous issues. How about we do away with the 'Point', 'LineString' and 'Polygon' geometry types? No really. They are just special cases of 'MultiPoint', 'MultiLineString' and 'MultiPolygon' with one element. I find myself writing code to take the special-case single-element entities and put them into the more general multi-element entities. That is code I would much rather not write, since it introduces complexity and potential bugs. The only real difference between the two is the lack of a 'Multi' prefix and an extra set of enclosing brackets.
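The code I keep rewriting looks roughly like this normalization shim, sketched in Python (the function name is mine; the type names are from the GeoJSON spec):

```python
def as_multi(geometry):
    # Promote single-element geometries to their Multi equivalents so
    # downstream code only ever handles three cases.
    singles = {"Point": "MultiPoint",
               "LineString": "MultiLineString",
               "Polygon": "MultiPolygon"}
    gtype = geometry["type"]
    if gtype in singles:
        return {"type": singles[gtype],
                "coordinates": [geometry["coordinates"]]}
    return geometry
```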

Wednesday, September 19, 2007

Fast Steradian Intersection Testing

I've come up with a pretty neat inclusion testing algorithm and I wanted to share it. The need is to test for inclusion of datasets within a camera view. Datasets are normally constrained to a certain area on the globe, but can cover the entire globe in some cases.

The simple way to do this is to make a bounding box with longitude and latitude coordinates and then just do intersection testing from a view bounding box. However as you get farther from the equator, the bounding boxes get progressively more distorted and you can be either over or under inclusive in your datasets.

My solution is to use steradians, or solid angles. All you need is a normal vector and an angle: four floating point numbers, the same as a bounding box (actually I add another to speed up the calculation, but I'll explain that later). It performs uniformly anywhere on the sphere, and to specify global coverage you just set the angle to 180 degrees.

Now all you need to do is calculate the steradian coverage for a dataset and one for the visible camera volume, and you can do inclusion testing with just five multiplies, three adds and a compare. It may be a little more work than a bounding box intersection test, but not much more, and I think that is an acceptable tradeoff for the properties it provides.
It's simple really: the dot product between the steradian normals gives you the cosine of the angle between them. Intersection is indicated by an angle less than the sum of the two coverage angles, or:
dot(normal_a, normal_b) >= cos(angle_a + angle_b)

There is a nice formula for the cosine of a sum:
cos(a + b) = cos(a)cos(b) - sin(a)sin(b)

So if you store the sine and cosine of the angle rather than the angle you get:
dot(normal_a, normal_b) >= cos_a*cos_b - sin_a*sin_b
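Putting the pieces together, a Python sketch of the test (the names are mine; EarthBrowser's actual code is C++ and may differ):

```python
import math

def make_coverage(normal, angle):
    # Precompute sine and cosine of the coverage angle so the test
    # needs no trig calls at query time.
    return (normal, math.cos(angle), math.sin(angle))

def intersects(a, b):
    na, cos_a, sin_a = a
    nb, cos_b, sin_b = b
    # dot product: 3 multiplies, 2 adds
    dot = na[0]*nb[0] + na[1]*nb[1] + na[2]*nb[2]
    # right-hand side: 2 multiplies, 1 subtract, then the compare
    return dot >= cos_a*cos_b - sin_a*sin_b
```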

Simple, fast and effective. I haven't found a non-O(N^2) algorithm to determine the minimum steradian for a dataset yet. Deriving the coverage from a lat/lon bounding box is fast and easy, but you get the same over-coverage problems at higher and lower latitudes, so it is best to calculate the minimum steradian for each dataset. At least you only have to do it once for static data.
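Short of the true minimum, a cheap O(N) approximation works as a first pass: average the unit vectors to get an axis, then take the largest angular distance from the axis to any point as the coverage angle. A Python sketch (my own approximation, not a minimal cone; assumes the points are unit vectors and not spread so evenly that their mean vanishes):

```python
import math

def bounding_cone(points):
    # Axis: normalized mean of the unit vectors.
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    length = math.sqrt(cx*cx + cy*cy + cz*cz)
    axis = (cx / length, cy / length, cz / length)
    # Angle: largest angular distance from the axis to any point,
    # i.e. the arccos of the smallest dot product.
    min_dot = min(p[0]*axis[0] + p[1]*axis[1] + p[2]*axis[2] for p in points)
    return axis, math.acos(max(-1.0, min(1.0, min_dot)))
```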

libkml: wtf?

So I saw mention that Google is soon to be releasing an open source library for the kml format.
"Google will be releasing an open-source KML library in C++ that implements and tracks the standard as it progresses."

I can see two intended audiences for this library: kml content creators and content consumers. I just don't think it makes sense for either of them. For kml creators, why would you need to interface with a C++ library in order to create kml files? The answer is, you shouldn't need to. It's kind of like my post on Dreamweaver: if you know what you are doing it just gets in the way, and if you don't know what you are doing it is way too complex. A C++ library seems like overkill for writing out some xml text. I guess it could keep track of external and document-wide styles? Big deal.

If you are a kml consumer then it makes a little more sense to use a C++ library, but not much more. Using an external library requires you to build a bridge between your code and the library's concepts, so libkml will be dictating what types of entities you support and how they are interrelated within your code. That restricts how you develop your internal classes: either you make a class structure identical to libkml's, or you build a conceptual bridge between your internal structure and the libkml structure in order to stay compatible. Once you have either one, why would you need an external library just to parse the XML entities?
"By providing a reference library it allows developers to more easily keep up to date with KML without having to maintain their own library and track standards changes."

So developers won't have to support any changes in the standard if they are using libkml? I guess it sounds more like it's for the kml content producers.
I guess an alternate explanation is that they are trying hard to make it seem like the standard will be truly open. Of course I'll take a look at it when it comes out to see what it's all about, but the whole concept seems like an exercise in futility.

Monday, September 17, 2007

JSON beats XML, some comments on GeoJSON

I've been setting up some new data services, and after experimenting with formatting my data in JSON (JavaScript Object Notation), I'm hooked. Programming mostly in C++ has made XML the easiest data format to use, but I'm very unhappy with the hurdles one must jump through to go from XML to the internal representation of the data. I've been doing a lot of JavaScript and Python programming lately and am just blown away by how easy it is to create, maintain and share data through the JSON format.

For XML, on the server end one would take an internal data representation in Python and create custom classes to format each data item with the appropriate XML tag, convert it to a string and output the XML tree. Not too hard if it is a relatively shallow dataset without too much data nesting. On the client end, you would write classes that would decode the tree structure of the tags with full knowledge of each tag's data type (string, integer, floating point, etc...) and create a new data structure to mirror the server side data.

For JSON, you take your data structure, usually a dictionary structure with a name associated with a value, dump it directly to JSON with a single procedure call and save it to a file. On the client end, you load the data directly into a similar data structure with a single call. No special decoding classes needed, no XML tag data types (string, integer, float) to have foreknowledge of.
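In Python this really is a single call in each direction (the record below is a made-up example):

```python
import json

# Server side: one call turns the internal structure into JSON text.
record = {"name": "station-42", "lon": -122.18, "lat": 46.2,
          "tags": ["hypothetical", "example"]}
payload = json.dumps(record)

# Client side: one call rebuilds an equivalent structure; the types
# (str, float, list, dict) come from the text itself.
restored = json.loads(payload)
```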

I've been looking at the GeoJSON specification and it looks pretty nice, but with a few caveats. Say you have a LineString feature type: the coordinates are a list of lists, each containing 2 coordinate elements (longitude and latitude). This is extremely inefficient in both space and processing time, which matters a great deal to me with large datasets. I suggest using a standard single list of coordinates, as is done in other formats like GML and KML. Also, it would be nice if there were some sort of feature style information specified, like color, line width, line style and placemark icons.

I'll definitely be using my modified GeoJSON format when I return to C++ programming for EarthBrowser v3 soon. Sorry XML, please don't feel bad. It's not you, it's me. I just feel like we need a little space.

Tuesday, September 11, 2007

The Economist snubs EarthBrowser

Amongst yet another "gee whiz" article about the "geoweb," it is claimed that: 'Keyhole, an American firm, released the first commercial “geobrowser” in 2001.'

I guess it's too much to ask for a magazine to research each claim made in each article, but EarthBrowser, then called Planet Earth, was the first commercial "geobrowser." It predated Keyhole's existence by several years. In fact someone from Keyhole contacted me back then and inquired about purchasing the earthbrowser.com domain name, the EarthBrowser trademark and my customer list. The amount they offered was laughably low considering EarthBrowser had over 2 million downloads and sold over 20,000 copies by then.

On a different note, several months ago I alluded to working on a new project that was a bit of a diversion from the next great OpenGL version of EarthBrowser. Project Kraken is nearing completion now and it has become better than I'd imagined it could be. It's been hard not blabbing all about it on the #worldwind or #planetgeospatial irc rooms to get useful feedback, but I've restrained myself.

Here is some newly declassified information about Project Kraken:
  • It will be released within the next 30 days
  • There will be a free version
  • It is easily customizable
  • It will compete for mindshare with Google Earth