Wanderings, musings and kinetic chatter

GITA 2009: Map data for free (or fee)?

Back in August of 2008, I thought I’d be a hotshot and volunteer an idea for a talk at the GITA Infrastructure 2009 conference. I didn’t know much about the provision of map data - from the government or elsewhere - but figured nine months would be more than enough time to educate myself and stumble upon the A-HA moment: the catharsis when (Eureka!) everything would become clear and the answer would be obvious.

Nonsense! The question of government-provided map data is more tangled up than that basket of yarn I bought at a yard sale seven years ago when I was considering taking up knitting. (I bought a cat instead.)

It’s a polarizing topic with global policy and socio-economic implications, and I would certainly do it a disservice by trying to be remotely thorough within a 45-minute session.

Fast forward to April 22nd, 2009. Sweaty palms, fluttery stomach, and sixteen slides consisting merely of somewhat related, mildly amusing pictures (directive: minimum boredom, maximum retention).

Much to my initial chagrin, the talk was actually well-attended. Yikes. Suffice it to say, everyone in the room knew more about the actual circumstances in play than I did. Than I do. Than I ever will. I’m not a practitioner. At first I was merely suspicious that that was the case; but about 10 minutes prior to mic time, “Dave” came to the podium to ask me about some legal battle in Santa Barbara… Huh? As it turns out, a case had recently been decided that prohibited the county from withholding map data, whether for financial or security reasons. The case set far-reaching legal precedent against both arguments. I managed to work it into my talk at a surprisingly appropriate place [insert horn-tooting here]. [While I’m at it, here’s a toot for Dave too!!]

I think that not-being-a-practitioner thing worked to my advantage, because I was able to work with the pragmatic details and not get embroiled in legitimate daily work-related frustration about where to source data that I know exists but that is - for whatever reason - unavailable, or unusable. (‘I pay my taxes, so what gives?’)

That said, I couldn’t be less concerned at the end of the day with the pricing structure of government-provided map data - or the absence of one - or even if the data is made available at all. What I do find fascinating (and worth celebrating) is Capitalism at work, effectively removing the government from the equation. People are resolving map data sourcing obstacles, because their government has not (cannot, will not, should not… whatever the case may be).

When asked at the end of the talk what I think people should do when the government does not provide map data for free (implying astronomical pricing that, for all intents and purposes, shuts small businesses and independent consultants out of the commercial marketplace for adding value to such data), I simply said: wait and see.

The marketing professional in me realizes that there are business models for deriving revenue streams from the provision of map data that are entirely inappropriate for a government entity in the digital age - but perfectly suitable for a business entity. And, I may not be a geographer, but I did muddle through econ501 in grad school, so I’m pretty confident in saying that demand will ultimately be met by supply, and the price will be dictated by the market. (Not by a ‘cost-recovery’ pricing scheme that, insofar as one measures incremental costs, generates 99% margin, or - as is the case for the OS in the UK - a profitability mandate to justify partial taxpayer funding…). And all of that other chart-and-graph stuff.

It’s only a matter of time.

Posted by Nancy Carter Mon, 04 May 2009 14:28:00 GMT


Ruminations on the Geo-Semantic Web

Along with several other SNI-ites, I recently gave a presentation at the 2009 GITA Geospatial Infrastructure Solutions Conference. Mine was entitled “The Geo-Semantic Web, Looking Beyond the Buzzwords,” a topic that has been on my mind for some months. I was rather hoping that I would be able to post a link to the recorded session video; however, the recordings are not yet available, and apparently GITA is charging for them. Not quite sure how I feel about that, but perhaps that’s a topic for another time.

In any case, I thought I would finally take the time to summarize some of my thoughts and opinions on the matter, and why I think the notion of the semantic web, in general, is important, as well as what we might stand to gain by adding a little dash of geo into it.

Read past the jump for all the fun.

I’m going to try to steer clear of delving too deeply into the technical mumbo-jumbo here and instead summarize some of the main points from my GITA presentation.

Firstly, it seems like we’re all constantly inundated with some fancy-schmancy new set of buzzwords or catch phrases that we’re supposed to be on the lookout for, since they will all apparently be the “next big thing.”

Web 3.0 anybody? Heck, I’m still reeling over all the hoopla surrounding Web 2.0! But, I think it’s important to take a bit of a step back here and look at what this Web 2.0 stuff really is (or was… should I be using the past tense already? I’ll leave that as an exercise to the reader).

You can read through that linked wikipedia article for what amounts to the official definition, I suppose, but here’s my take on it:

Technologically, not much changed or happened in the underlying web infrastructure.

Now, before I start a flame war there, yes, I know there were some changes, some things evolved and were made at least different, if not better, but when you look at the actual nuts and bolts of “The Web” it’s pretty much the same as it was during the glory days of Web 1.0 (before we even knew it had a version number, ah… ignorance was bliss, was it not?).

However, most of what really occurred was a fundamental shift in thinking. People and organizations started seeing the web more as an application platform, and not just a platform, but oftentimes a preferred platform. What that led to was the creation of actual web applications as opposed to web sites. Instead of just reading articles, searching for information, or browsing picture galleries, we’re actually using web applications to do, well, you know, real stuff.

This is not unlike all of the brouhaha surrounding Web 3.0 and the Semantic Web (which are often lumped together). The technology to do most of this “semantic stuff” already exists, and has for quite some time. RDF, which serves as an underlying structural framework for most things semantic, for example, was a W3C recommendation back in 1999. And, in fact, even though the more recent definitions (specifically Wikipedia, which is, after all, pretty much the font of all human knowledge :) ) note that the acronym RSS is:

“most commonly translated as ‘Really Simple Syndication’”

the original version of the specification, published by Netscape way back in 1999, notes that the acronym stands for “RDF Site Summary.” So, I suppose if you want to get all technical about it (as I am wont to do, being some sort of über geek and all), if you were subscribing to and/or publishing RSS feeds back in 1999, you were pretty much using the Semantic Web version 1.0, so give yourself a big pat on the back for being such a forward-thinking early adopter!

True, we are starting to see some of the technologies mature (triple stores for example), and see wider adoption, but the underlying ideas and frameworks have been around for quite some time.

If the semantic web is to take off in a big way and become ubiquitous (as I am fairly convinced it will), it will be the result of a shift in thinking and perception more so than a rapid and radical evolutionary leap in technology. Granted, that leap is bound to occur coincident with all that semantic ubiquity (oooh… Semantic Ubiquity, band name?), but that shift in thinking is the important bit.

And what would that shift involve (in my opinion, at least)? Well, it’s really about starting to blur the lines between “data” (things lying around in relational databases, spreadsheets, XML documents, etc.) and “content.” After all, we’ve got quite a bit of both lying around in one form or another, but in a rather substantial preponderance of cases, one would find it difficult to use that data without first combining it with some form of “content,” or converting that content into some sort of normalized “data” that can be manipulated, queried, sorted, reported on, yada3.

What the whole semantic web movement is attempting to do is blur those lines and build those bridges. The idea is to break out of the typical mold of seeing “data” as something in tabular format and “content” as a big blob of words or numbers without any structure and, most importantly, without any meaning (meaning to a computer, that is). If the two can be transparently intermingled and used together, that ultimately makes everyone’s lives easier (knock on wood) and reduces the amount of time and effort it takes to combine all of these disparate bits of content and data into actionable intelligence.

That being said, this is not an easy problem to solve, especially when we’re talking about the “plain” web, as opposed to the GeoWeb (yeah, that’s right, I made you read allll the way down here before I even started getting to the “geo” part!).

As I mentioned, I’m not going to delve into the technical nitty gritty of some of the current and/or proposed work being done on the “plain” web side in that respect (with one exception, DBpedia, which I happen to think is pretty spiffy, given my obvious affection for all things semantic and Wikipedia). However, getting back to the “geo” part, I think we may be a bit ahead of the game here.

The non-geo web is going to continue to evolve more and more towards a semantically enabled and linked infrastructure, but as those efforts march on, I think the geospatial crowd would be doing both themselves and the rest of our web family a service by beginning to think about our data (and content!) in terms of semantics.

Sound hard? Well, to be honest, yeah, there are challenges. However, think about it. Your average bits and pieces of geodata are already structured by their very nature. Whether that’s a shapefile, a KML file (with SchemaData of course!), GeoRSS, or anything else along those lines, a good bit of the work required to “semantify” the data is already there! The hardest part is starting to think about things not in terms of structured tables and databases, but as semantic graphs. Or, put more succinctly, think about embedding more meaning into your data. Beginning to think about things in this way is by far the biggest challenge I have had as I have been digging into, researching, and attempting to use and build things around semantic technologies, as I find myself so used to thinking about things in terms of tables and joins and rows and columns and so on and so forth. It has proven incredibly difficult to re-train my brain to think about what things mean and how to describe those relationships to a computer, even though one would think it might be the more natural way of doing it.
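To make that table-versus-graph shift a little more concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the hydrant record, the `ex:` prefix, and the property names are assumptions, and a real project would more likely lean on an RDF library and a published vocabulary. It simply shows one flat attribute row being restated as subject-predicate-object triples.

```python
# A hypothetical attribute row, as it might come out of a shapefile table
# or a KML SchemaData block. None of these field names come from a real dataset.
row = {
    "id": "hydrant-42",
    "type": "FireHydrant",
    "flow_gpm": 1000,
    "lon": -104.99,
    "lat": 39.74,
}


def row_to_triples(row, prefix="ex:"):
    """Restate one flat record as (subject, predicate, object) triples."""
    subject = prefix + row["id"]
    triples = []
    for key, value in row.items():
        if key == "id":
            continue  # the id becomes the subject, not a property
        triples.append((subject, prefix + key, value))
    return triples


triples = row_to_triples(row)
for triple in triples:
    print(triple)
```

The row and the triples carry the same facts; the difference is that each triple can now be linked to, extended, or merged with triples from somebody else's data without first agreeing on a table layout.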

If we can get there, though, it opens up all kinds of doors for future applications, data interoperability, and analysis. Think about the sorts of things you might be able to do if, for example (disclaimer: I am notoriously terrible at coming up with relevant, let alone good, examples on the spur of the moment), instead of having a big ol’ geodataset full of various bits and pieces of information on fire hydrants and their locations, you were also embedding or linking meaning, or had some already there, automagically imported from another source. Meaning such as what fire hydrants do, that their flow rates relate to water, which comes from municipal sources, which are also used to provide H2O to nearby houses, which affects the pressure in those locations…
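As a rough sketch of what that hydrant example could look like once the relationships are spelled out, here is a tiny hand-built graph in plain Python. The identifiers and the `ex:drawsFrom` and `ex:suppliedBy` predicates are hypothetical, invented purely for illustration, and the "query" is just a list comprehension standing in for what a triple store would do.

```python
# Hypothetical triples linking a hydrant, some houses, and the mains they share.
TRIPLES = [
    ("ex:hydrant-42", "ex:drawsFrom", "ex:main-7"),
    ("ex:main-7", "ex:suppliedBy", "ex:municipal-water"),
    ("ex:house-12", "ex:drawsFrom", "ex:main-7"),
    ("ex:house-13", "ex:drawsFrom", "ex:main-7"),
    ("ex:house-99", "ex:drawsFrom", "ex:main-8"),
]


def shares_main_with(subject, triples):
    """Return everything else that draws from the same main(s) as `subject`."""
    mains = {o for s, p, o in triples if s == subject and p == "ex:drawsFrom"}
    return sorted(
        s
        for s, p, o in triples
        if p == "ex:drawsFrom" and o in mains and s != subject
    )


# Which locations might see a pressure change if hydrant-42 is opened?
affected = shares_main_with("ex:hydrant-42", TRIPLES)
print(affected)
```

The point is less the code than where the knowledge lives: the link between hydrants and nearby houses sits in the data itself rather than in somebody's head or in a one-off join.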

These are the sorts of things we might be doing and attempting to figure out and analyze today, but it is still us, the humans, who have to ultimately use our noggins or other pieces of unwieldy or complex task-specific tools and software to derive, describe, and/or use that meaning in order to make decisions. If we can cut down that lead time by even 5-10%, wouldn’t that give us a lot more time to get into doing some really crazy, new, difficult, and interesting stuff?

That’s my $0.02 worth anyway, and, granted, I glossed over a LOT of stuff here and it still ended up being a short novel, but I’ll do my best to start elaborating on some more specific examples and topics here in the near future :)

Posted by Chris Wed, 20 May 2009 12:41:00 GMT