Setting up aerial photos (orthophotos) in GeoServer

We are currently working on a project where we are setting up aerial photos, or orthophotos, in GeoServer. Naively, we first tried adding the original data exactly as we got it from our supplier. The original data is 1.4 TB. This might not be big for Google, but it is quite big for us, and definitely for our server. The problem wasn’t just the total size of the data; it was also how the data was composed.
The data was delivered as JPEG images accompanied by world files, which we used as an ImageMosaic. The images were around 50-80 MB each, with dimensions of 10 000×10 000 pixels. We use our own map client written in JavaScript (not OpenLayers in this case), which requests tiles of 256×256 pixels. To cover a page in the client we make around 30 requests. For GeoServer to give us a tile, it needs to load the full 10 000×10 000 px image. This is obviously quite silly since we won’t use most of those pixels. Most of us don’t have a screen resolution higher than 1920×1200 (many don’t even have that!), so why load all that data?
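To put a number on "most of those pixels", here is a rough bit of arithmetic using the figures above. It assumes the view is at the native resolution of the photos and that all 30 tiles happen to fall inside a single source image, which is a simplification.

```python
# Rough pixel accounting for one map view, using the figures from the text.
TILE_SIZE = 256          # client tile edge, in pixels
REQUESTS_PER_VIEW = 30   # approximate tile requests per page
SOURCE_SIZE = 10_000     # source image edge, in pixels

pixels_shown = REQUESTS_PER_VIEW * TILE_SIZE * TILE_SIZE
pixels_in_source = SOURCE_SIZE * SOURCE_SIZE

# Share of one source image that a whole screen of tiles actually needs.
share_used = pixels_shown / pixels_in_source

print(f"pixels shown per view:  {pixels_shown:,}")
print(f"pixels in one source:   {pixels_in_source:,}")
print(f"share of source needed: {share_used:.1%}")
```

So a full screen of tiles needs only a couple of percent of the pixels in one source image, and every single tile request has to open an image of that size.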
OK, so what to do? Well, our legacy in-house map server stores its tiles at 1500×1500 by default. But it occurred to us that most (citation needed) of the world seems to use tile sizes that are powers of 2, so we went with 2048×2048.
Great, we have a plan. Now, how do we transform all our images into smaller tiles? Also, we might want to create a pyramid if we want to be able to match the scale layers of our map data. The solution is gdal_retile.py. GDAL is an amazing open source project that does a lot of useful stuff, like retiling mosaics and creating pyramids, among plenty of other things.
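Roughly, the kind of call we are talking about looks like the sketch below, written as a small Python helper that only builds the command line. The -ps, -levels and -targetDir options are gdal_retile.py's own; the helper name, the file names, the target folder and the choice of 4 pyramid levels are made up for illustration and are not the values we actually used.

```python
import shlex

def retile_command(source_files, target_dir, tile_size=2048, levels=4):
    """Build a gdal_retile.py command line as an argument list (hypothetical helper)."""
    return [
        "gdal_retile.py",
        "-ps", str(tile_size), str(tile_size),  # output tile width and height in pixels
        "-levels", str(levels),                  # number of pyramid levels to build
        "-targetDir", target_dir,                # directory the tiles are written to
        *source_files,                           # the original images to retile
    ]

# Made-up file names, just to show the shape of the command.
cmd = retile_command(["area01/a.jpg", "area01/b.jpg"], "tiles/area01")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string means it can be handed straight to subprocess.run without worrying about quoting.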

I did a test on 3.1 GB of data and got 998 MB out, roughly thirty percent of the original size. So, the original files are kind of bloated.
The retile took 2 hours. A quick calculation tells us that retiling all of the data would take around 1000h, or approximately 41 days. However, it’s a Python script that can’t utilize threading, so it only keeps one CPU core busy. How can we speed it up?
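Spelled out, the back-of-the-envelope estimate looks something like this. It assumes that retile time scales linearly with input size and that 1.4 TB means 1400 GB; both are assumptions, not measurements. With those numbers the estimate lands a bit under the round 1000h figure.

```python
SAMPLE_GB = 3.1        # size of the test input
SAMPLE_OUT_MB = 998    # size of the retiled test output
SAMPLE_HOURS = 2       # time the test retile took
TOTAL_GB = 1400        # 1.4 TB, assuming decimal units

# Output size relative to input size for the test run.
size_ratio = SAMPLE_OUT_MB / (SAMPLE_GB * 1000)

# Linear extrapolation from the test run to the full data set.
total_hours = TOTAL_GB / SAMPLE_GB * SAMPLE_HOURS
total_days = total_hours / 24

print(f"output is {size_ratio:.0%} of the input")
print(f"estimated retile time: {total_hours:.0f} h, about {total_days:.0f} days")
```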
The data was split into 10 areas. If we run 10 instances, one per area, each instance should be able to use a different CPU core. This would cut the running time by a factor of 10, so it would finish in about 4 days.
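One way to start one instance per area is sketched below. This is not necessarily how we launched ours (ten terminals work too). The folder layout (source/areaNN and tiles/areaNN) and the helper names are hypothetical. The threads only wait on child processes, so each gdal_retile.py instance is a separate OS process that can sit on its own core.

```python
import glob
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_all(commands, max_workers=10):
    """Run each command as its own OS process, at most max_workers at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run_one(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))

def area_command(area):
    """Build the retile command for one area (hypothetical folder layout)."""
    sources = sorted(glob.glob(f"source/{area}/*.jpg"))
    return ["gdal_retile.py", "-ps", "2048", "2048",
            "-targetDir", f"tiles/{area}", *sources]

areas = [f"area{n:02d}" for n in range(1, 11)]
# Uncomment on a machine where GDAL is installed and the folders exist:
# exit_codes = run_all([area_command(a) for a in areas])
```

The factor of 10 only holds if the machine has at least 10 free cores and the disk can keep up, which may explain why a run takes longer than the estimate.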

Start this, go to FOSS4G, come back (it took longer than 4 days), and you’re done :)

Now all I have to do is combine all the tile index files somehow. I fired up ogr2ogr to combine them and thought I was all set, but obviously I didn’t think about the paths. When creating the new tiles I was at a different level of the directory structure, and now I have to prepend the superfolders to the location column in the dbf files. So far I haven’t figured out how to do this in a good way… I tried to do it in LibreOffice, but I became frustrated because that just moved the data into dbt files, and I don’t even know if GeoServer understands that…
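One approach that might work, and that I have not tested against our data, is to rewrite the location column with GDAL's Python bindings instead of a spreadsheet. The sketch below assumes the field is called location, as in our index files, and that the osgeo package is installed. The function names and example paths are made up. Work on a copy of the shapefile, since long paths could run into the dbf field width.

```python
import posixpath

def prepend_folder(location, prefix):
    """Return location with prefix in front, leaving absolute paths untouched."""
    location = location.replace("\\", "/")
    if posixpath.isabs(location):
        return location
    return posixpath.normpath(posixpath.join(prefix, location))

def rewrite_index(shapefile, prefix, field="location"):
    """Prepend prefix to the path field of every feature in a tile index."""
    from osgeo import ogr  # GDAL's Python bindings, imported lazily

    ds = ogr.Open(shapefile, 1)  # 1 opens the data source for update
    if ds is None:
        raise OSError(f"could not open {shapefile}")
    layer = ds.GetLayer(0)
    for feature in layer:
        new_value = prepend_folder(feature.GetField(field), prefix)
        feature.SetField(field, new_value)
        layer.SetFeature(feature)  # write the changed feature back
    layer = None
    ds = None  # dropping the references flushes and closes the file

# Example of the path handling on its own, with made-up names:
print(prepend_folder("1/tile_1_1.tif", "area01"))
```

Running rewrite_index on each area's index before merging them with ogr2ogr would give every row a path relative to the common parent folder.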

More on the result in a later post.

 

Flying to FOSS4G

I am on a plane.

The coffee was terrible. Christopher insisted on telling us that it smelled like refuse and refused to drink it.

We all suffer from Wikipedia withdrawal. Examples: kosher food, the Bermuda Triangle, stalling airplanes, and what actually happens if you forget to turn off the radio on your phone on a plane. We all agreed it would be safe to turn my computer on and then turn off the Wi-Fi during the flight. It was.

The Boeing inflight map was styled worse than our own maps and didn’t show any relevant information whatsoever. Also, the plane on the map covered whole countries so you could barely see where you were.

Blogging at Kartena

In our work to improve ourselves and the world, the developers of Kartena have decided to take up writing.
We will write about open source software that we like and use (and how we use it), and in what ways we would like to improve those projects. We will write about our own ideas and projects, and maybe about what we happen to learn at the upcoming FOSS4G conference.