Setting up aerial photos (orthophotos) in GeoServer

We are currently working on a project where we are setting up aerial photos, or orthophotos, in GeoServer. We first tried to do this naively, by adding the original data just as we got it from our supplier. The original data is 1.4 TB. This might not be big for Google, but it is quite big for us, and definitely for our server. The problem wasn’t just the total size of the data; it was also how the data was composed.
The data was delivered as JPEG images accompanied by world files. The images were around 50-80 MB each (we used them as an ImageMosaic), with dimensions of 10 000×10 000 pixels. We use our own map client written in JavaScript (not OpenLayers in this case), which requests tiles of 256×256 pixels. To cover a page in the client we do around 30 requests. In order for GeoServer to get us a tile it needs to load the full image of 10 000×10 000 px. This is obviously quite silly, since we won’t use most of those pixels. Most of us don’t have a screen resolution higher than 1920×1200 (many don’t have that!), so why load all that data?
OK, so what to do? Well, our legacy in-house map server stores its tiles as 1500×1500 by default. But it occurred to us that most (citation needed) of the world seems to use tile sizes that are powers of 2, so we went with 2048×2048.
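To put a number on how silly this is, here is a back-of-the-envelope sketch in Python. It assumes that each 256×256 request falls inside a single source image (a tile on a border would touch more than one), and the function name is made up for the illustration.

```python
# Pixels the server has to decode to answer one 256x256 tile request,
# assuming the tile falls inside a single source image.
TILE = 256 * 256
ORIGINAL = 10_000 * 10_000   # one image as delivered
RETILED = 2048 * 2048        # one image after retiling


def used_fraction(source_pixels, tile_pixels=TILE):
    """Share of the decoded source pixels that end up in the tile."""
    return tile_pixels / source_pixels


print(f"original: {used_fraction(ORIGINAL):.4%} of the decoded pixels are used")
print(f"retiled:  {used_fraction(RETILED):.4%} of the decoded pixels are used")
print(f"pixels decoded per request shrink by a factor of {ORIGINAL / RETILED:.1f}")
```

With the original images, well under a tenth of a percent of what gets loaded ends up in the tile; with 2048×2048 sources, one pixel in 64 does.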
Great, we have a plan, now, how do we transform all our tiles to smaller ones? Also, we might want to create a pyramid if we want to be able to match the scale layers of our map data. The solution is gdal_retile.py - GDAL is an amazing open source project that does a lot of useful stuff, like retiling mosaics and creating pyramids, among plenty of other things.
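As a rough sketch of what such a call can look like, the Python snippet below only assembles a gdal_retile.py command line. The file and folder names and the number of pyramid levels are assumptions made up for the example, not the values we actually used, and the output format is left at the GDAL default.

```python
import subprocess


def retile_command(sources, target_dir, tile_size=2048, levels=4):
    """Build a gdal_retile.py command line for a list of source images."""
    return [
        "gdal_retile.py",
        "-ps", str(tile_size), str(tile_size),  # output tile size in pixels
        "-levels", str(levels),                 # pyramid levels to build (assumed value)
        "-tileIndex", "tiles.shp",              # shapefile index of the new tiles
        "-tileIndexField", "location",          # attribute that holds each tile path
        "-targetDir", str(target_dir),          # should exist before the run
        *[str(s) for s in sources],
    ]


if __name__ == "__main__":
    # Hypothetical file and folder names, for illustration only.
    cmd = retile_command(["area01/photo_001.jpg", "area01/photo_002.jpg"],
                         "retiled/area01")
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually run the retile
```

Printing the command first makes it easy to check the arguments before committing to a run that takes hours.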

I did a test on 3.1 GB of data and got 998 MB out, roughly a third of the original size. So, the original files are kind of bloated.
The retile took 2 hours. A quick calculation tells us that retiling all of the data at that rate would take somewhere around 900 to 1000 hours, or approximately 40 days. However, it’s a Python script that runs as a single process and doesn’t spread its work over several threads, so how can we speed it up?
The data was split into 10 areas. If we run 10 instances, one per area, the operating system should be able to schedule them on different CPU cores. This would cut the running time by a factor of 10, so the job should finish in about 4 days.
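The estimate and the "one process per area" idea can be sketched like this. Scaling the 2-hour test run linearly is an assumption, 1.4 TB is taken as 1400 GB, and `run_all` is a hypothetical helper that expects one ready-made command line per area.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def estimate_hours(total_gb, sample_gb=3.1, sample_hours=2.0, workers=1):
    """Scale the measured test run linearly to the full dataset."""
    return total_gb / sample_gb * sample_hours / workers


def run_all(commands, workers=10):
    """Run one external process per command, at most `workers` at a time.

    The threads only wait for the child processes, so the real work is
    done in separate processes that can each use their own core.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))


serial = estimate_hours(1400)
parallel = estimate_hours(1400, workers=10)
print(f"one process:  {serial:.0f} h, about {serial / 24:.0f} days")
print(f"ten at once:  {parallel:.0f} h, about {parallel / 24:.1f} days")
```

This lands in the same ballpark as the quick calculation above. It also assumes that the areas are about equally large and that disk I/O doesn’t become the bottleneck, which is optimistic.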

After starting this, going to FOSS4G, and coming back (it took longer than 4 days), it was done :)

Now all I have to do is combine all the tile index files somehow. I fired up ogr2ogr to combine them and thought I was all set, but obviously I didn’t think about the paths. When creating the new tiles I was at a different level of the directory structure, and now I have to prepend the superfolders to the location column in the dbf files. So far I haven’t figured out how to do this in a good way… I tried to do it in LibreOffice, but I became frustrated because that just moved the data into dbt files, and I don’t even know if GeoServer understands those…
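One possible way out, which I have not tried on the real data, is to rewrite the location column with GDAL’s own Python bindings, once per area index and before merging them. The sketch below assumes the column is called location and that the stored paths are relative; the folder names are hypothetical.

```python
import posixpath


def prefix_location(location, superfolder):
    """Prepend a parent folder to a relative tile path from the index."""
    if posixpath.isabs(location):
        return location  # already absolute, leave it alone
    return posixpath.join(superfolder, location)


def rewrite_index(shapefile, superfolder, field="location"):
    """Update the path column of one tile index in place (needs GDAL's Python bindings)."""
    from osgeo import ogr  # imported here so prefix_location works without GDAL

    datasource = ogr.Open(shapefile, 1)  # 1 = open for update
    if datasource is None:
        raise IOError(f"could not open {shapefile} for update")
    layer = datasource.GetLayer(0)
    for feature in layer:
        new_value = prefix_location(feature.GetField(field), superfolder)
        feature.SetField(field, new_value)
        layer.SetFeature(feature)  # write the changed feature back
    datasource = None  # dropping the reference closes and flushes the file


# Hypothetical folder and tile names, for illustration only.
print(prefix_location("1/photo_001_1_1.tif", "retiled/area01"))
```

One thing to watch out for: text columns in a dbf file have a fixed width, so a longer path only fits if the column is wide enough.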

More on the result in a later post.

 

FOSS4G 2011 – The Projects You Shouldn’t Miss

This is sort of a follow-up to my earlier post, Traditional and modern approaches to GIS – short summary of FOSS4G 2011. I will try to sum up some of the projects I thought were most interesting from the sessions I went to.

On the conservative side, GeoTools is getting its own scripting API, GeoScript, which makes much of the GeoTools functionality available for JavaScript, Python, Scala and Groovy. From the demo, it looks like a productive way to experiment with geometries (GeoScript ships with an integrated viewer), and maybe a way to generate SLDs programmatically. Along the same lines, GeoTools’ Java API has been overhauled, so some of the powerful functionality can now be accessed in very few lines of code. Not everything must be related to OGC standards and XML, and that’s a great thing.

For what I call the modern approach, there are just too many projects to talk about them all, but I’ll try to summarize some of the ones that I really want to look closer at.

TileMill is a web application for designing web maps: basically, it lets you work out a design for your vector and raster data. This design is then used to render the actual tiles using Mapnik. In contrast to all the map design tools I’ve seen before, the focus in TileMill is on designing for the web; the other tools have not been suitable for styling huge datasets and multiple zoom levels. TileMill doesn’t use SLD, but Carto, a CSS-like styling language. After seeing some of the designs AJ Ashton from MapBox has done in TileMill, I’m convinced this is something we will have to try out.

For tiling, a lot of alternatives to GeoWebCache have been mentioned. I have no specifics on them, but we will check them out: TileStache (used by TileMill, as I understand it), TileCache, MapProxy and MapCache. In the same area there’s also TileStream, a service that hosts and serves your tiles.

Two projects from Vizzuality with the Carto prefix seem really interesting. They’re building a stack with PostGIS as the spatial database, which they have packaged as a cloud service called CartoDB that can be accessed through an HTTP API. For the server application, they have CartoSet, a Ruby on Rails application available on GitHub. See UNESCOplaces.org for a neat example of what that might look like.

Those were a few of my favourites from what I’ve seen. I hope to dig deeper into them, and perhaps into some of the other things we’ve seen, in future posts.

Traditional and modern approaches to GIS – short summary of FOSS4G 2011

We’ve reached the end of FOSS4G 2011 in Denver, and I’m going to write down some thoughts after three days of sessions.

I’m going to try to do this without turning this post into a rant about some of the more traditional GIS software out there. It might be that I’m not originally coming from a GIS background, but digging through hundreds of lines of verbose XML config files is not really my idea of fun, and don’t get me started on SLDs. (<ogc:IsThisTagNameTooLongForYou>? Yes, it is.) Don’t get me wrong, configuring and coding against GeoServer/GeoWebCache is way more productive than working with some of the legacy products we traditionally used, but it still feels too complicated.

The division between traditional and modern is hardly specific to GIS, but rather something we see all across the board: traditional, big, monolithic chunks of software, expecting your full attention and demanding complex setup and configuration to get started, against the new philosophy of sane defaults without configuration, the simplest thing that could possibly work, and customization through extensibility and integration. Compiled, statically typed languages against the scripted, dynamically typed ones. RDBMSs against NoSQL, and so on.

Since much of my prior contact with open source GIS has been with OGC standards (WMS, WFS, SLD) and the software on the more traditional end of the scale (GeoServer, GeoWebCache, GeoTools), I had a picture of open source GIS as unnecessarily heavyweight and complex for many uses. (I don’t mean to be overly critical of the software mentioned; these are truly amazing tools and great achievements in open source. My critique is more about how they and their APIs are packaged.)

Judging from the projects being presented at FOSS4G 2011, there’s a huge push for the lightweight, or modern, approach. Every other session is talking about scripting, Node.js or NoSQL databases, and it appears that even the core developers of GeoTools/Server are getting fed up with SLDs. That’s great news.

I’m going to follow this up with a post with specifics about some of the new projects I’ve run into during the conference.

Flying to FOSS4G

I am on a plane.

The coffee was terrible. Christopher insisted on telling us that it smelled like refuse, and refused to drink it.

We all suffer from Wikipedia withdrawal. Examples: kosher food, the Bermuda Triangle, stalling airplanes, what actually happens if you forget to turn off the radio on your phone on a plane. We all agreed it would be safe to turn my computer on and then turn off the wifi during the flight. It was.

The Boeing inflight map was styled worse than our own maps and didn’t show any relevant information whatsoever. Also, the plane on the map covered whole countries so you could barely see where you were.

Blogging at Kartena

In our work to improve ourselves and the world, the developers of Kartena have decided to take up writing.
We will write about open source software that we like and use (and how), and in what ways we would like to improve those projects. We will write about our own ideas and projects, and maybe about what we happen to learn at the upcoming FOSS4G conference.