Do you get what you pay for?

Ed Parsons (CTO of Ordnance Survey in the UK, the premier provider of GIS data there) has a post on his blog commenting on a story in the Guardian about the price of data.

Ed argues that if you don't pay directly for your GIS data, you're going to get the out-of-date data that is our (the US) national GIS database, with roads and satellite imagery that have led to a significant amount of amusement (such as the "where is Apple?" comments that he references).

Of course, comparing data (especially satellite imagery) between a country that measures 241,590 sq km and one that measures 9,161,923 sq km is a bit unfair. The entire UK can be shot from a single satellite in geosynchronous orbit without too much difficulty, whereas the US requires 2 satellites just to get reasonable weather coverage. Granted, that's weather, but detailed land data is no easier to capture for either country. The US has a population of about 300M, about five times the UK's 60M, but its land area is nearly 40 times the size, and there's not much in the way of economies of scale, especially when you factor in Alaska and Hawaii.
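To put rough numbers on that comparison, here is a minimal sketch in Python that works out the ratios from the figures quoted above. It assumes the two area figures are land areas in square kilometres and uses the rounded population figures as given; the variable names are mine, purely for illustration.

```python
# Figures quoted in the post. Assumption: the areas are land areas in sq km.
UK_AREA_KM2 = 241_590
US_AREA_KM2 = 9_161_923
UK_POPULATION = 60_000_000   # rounded, as quoted
US_POPULATION = 300_000_000  # rounded, as quoted

# How many times larger the US is than the UK on each measure.
area_ratio = US_AREA_KM2 / UK_AREA_KM2
population_ratio = US_POPULATION / UK_POPULATION

# Land area that has to be covered per resident, US relative to UK.
area_per_person_ratio = area_ratio / population_ratio

print(f"Land area ratio (US/UK):      {area_ratio:.1f}")
print(f"Population ratio (US/UK):     {population_ratio:.1f}")
print(f"Area per person ratio (US/UK): {area_per_person_ratio:.1f}")
```

The last figure is the one that matters for the argument: on these numbers, each US resident "supports" several times as much land to be surveyed as each UK resident does.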

Unfortunately, I don't have a good set of statistics about either the age of our government-captured data or the accuracy of the vector data (such as the TIGER databases used as a basis for most programs).

However, I do know that the question of free vs. quality data doesn't have to be a step function. In the US, you can get pretty-good (if a bit out of date) data for the entire country. If you want up-to-date information, you contact one of the worldwide providers of geospatial data and they'll be happy to sell you a dataset that they've created and updated last week.

For those on a limited budget, the out-of-date information can provide a necessary starting point for base funding or proof of concept which is, in effect, not available in the UK.

All things considered, I can't argue with the better quality of the data coming from Ordnance Survey, but at the prices that they are charging, I would think that there are a variety of organizations and people who do not benefit from the use of data. The price for a boundary line set for the UK is £7,140 for internal use at a corporation. If you have fewer than 100 terminals (anything with a screen that can display the data, by their definition), the price is discounted down to 12.5% of that (or £892.50, about US$1,549.29). Unfortunately, that doesn't include roads. Roads come from the OS MasterMap ITN (Integrated Transport Network) layer, and the area surrounding London (bounded by the M25) will run you £2,548 per year for 2 terminals (and £33,236 for unlimited). Again, the data is very accurate (updated at least twice a year and backed by a staff of over 1,400 people, according to the web site), but there's little that you can do if you just want to do something simple.
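As a sanity check on those prices, the following sketch in Python backs out the discount fraction and the exchange rate implied by the figures quoted above, then applies that rate to the London road data. The exchange rate is not an official one; it is simply derived from the pound and dollar amounts in this post, and the variable names are hypothetical.

```python
# Prices quoted in the post (GBP unless noted).
BOUNDARY_LIST_PRICE = 7140.00        # boundary line set, internal corporate use
BOUNDARY_DISCOUNTED = 892.50         # fewer than 100 terminals
BOUNDARY_DISCOUNTED_USD = 1549.29    # dollar figure quoted alongside it
ITN_LONDON_TWO_TERMINALS = 2548.00   # roads inside the M25, per year

# Fraction of the list price that the discounted figure represents.
discount_fraction = BOUNDARY_DISCOUNTED / BOUNDARY_LIST_PRICE

# Assumption: the exchange rate is whatever the two quoted amounts imply.
implied_usd_per_gbp = BOUNDARY_DISCOUNTED_USD / BOUNDARY_DISCOUNTED

# The London road data converted at that same implied rate.
itn_london_usd = ITN_LONDON_TWO_TERMINALS * implied_usd_per_gbp

print(f"Discounted price as a share of list: {discount_fraction:.1%}")
print(f"Implied exchange rate: {implied_usd_per_gbp:.4f} USD per GBP")
print(f"London ITN, 2 terminals: about US${itn_london_usd:,.0f} per year")
```

Running the numbers this way makes it easy to swap in a current exchange rate or a different terminal band if the price list changes.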

Of particular concern is the availability of lower resolution data for purposes of research, illustration and more basic rendering. The data that you get for your US$4,423 of the area surrounding London is very accurate and contains every street. I'm curious as to whether most small companies would be willing to put down the cost of software plus the cost of the street data in order to see if they can gain benefit from using geographic data. Obviously, if you have a specific use (you're a company doing routing or road maintenance), it makes sense. However, for a company that would like to play around with the idea of locating their customers to see if there are locations that might serve them better, it raises serious questions about pricing viability. This is where the US system shines: it provides data for proof-of-concept work and for uses where high levels of accuracy may not be necessary. Education, research, and small businesses all benefit from this.

In the end, the creation of services such as Google Earth makes it much easier for people to use the kinds of data that would otherwise cost them thousands of dollars to license themselves, and may be the right way for people to gain access to data in areas of the world where such data isn't available at prices that fit within normal people's budgets.

Unfortunately, it takes a company like Google (with really deep pockets) to be able to shrug off the charges for data access in order to make this information available on their web site. Smaller organizations would be hard pressed to do so, even as a beta test.

Fortunately, for higher-education users, there is a service run by the University of Edinburgh, in conjunction with Ordnance Survey, that provides access to data on an annual subscription basis. Although I couldn't find a price anywhere officially, a little googling got me information indicating that it runs from about £600 per year to about £1,100 per year, which is not bad if you get access to shape data.

And, for lower education, Crown copyright provides for free use by most schools (teaching students under 16) and low-cost use (£300) by schools teaching students above that age.

In the end, the argument may come down to the one outlined in the 2002 report by Peter Weiss of the US National Weather Service: Borders in Cyberspace: Conflicting Public Sector Information Policies and their Economic Impacts (PDF). In this report (the one referenced by the Guardian article), he outlines the differences between the European and US models of data access. Obviously, he believes that the US model makes more sense, but the report doesn't address questions such as the accuracy of such data for GIS applications in the US.

Although the GIS data is certainly not nearly as useful as the weather data that comes from the US government, it's very hard to say that it doesn't serve a useful purpose or that it impedes the creation of data from other sources. Even in the noted case of "where is Apple?", the faux pas was resolved by purchasing a small number of tiles from another provider.