Advanced Issue with Metadata in Spatial Information Systems

5.0 Future Engine

Creating the semantic functionality involves several tasks. First, the metadata elements that are both included in the search engine and subject to frequent contextual errors need to be identified. Whether the metadata follows FGDC or ISO, the first choice would be the keywords element, because the same keywords may be entered for several different datasets. Examples of keywords with serious contextual problems are roads, transportation, range, and habitat: essentially any keyword that can be used in many different contexts. For each keyword entry, a set of secondary “semantic” tags could be created, giving each word context identifiers. For example, a GIS coverage of forest service roads would likely have the keyword “roads.” The associated secondary semantic tags would include words like logging, dirt, unpaved, rough, back roads, etc. These secondary tags would be cross-referenced with the corresponding input from the user’s search parameters. Of course, this leads to the trickiest part of developing a semantic search engine: how to design the user interface so that it accepts enough detail in the search parameters without confusing the user (as per the paradox presented earlier).
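As a rough illustration of the keyword-plus-secondary-tag idea, the forest service roads example could be stored and read as follows. This is only a sketch: the element names (dataset, keyword, semantic) and the function load_keyword_tags are hypothetical and are not taken from the FGDC or ISO standards.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: element and attribute names are illustrative
# only and do not come from FGDC or ISO.
RECORD_XML = """
<dataset title="Forest service roads">
  <keyword value="roads">
    <semantic>logging</semantic>
    <semantic>dirt</semantic>
    <semantic>unpaved</semantic>
    <semantic>rough</semantic>
    <semantic>back roads</semantic>
  </keyword>
</dataset>
"""


def load_keyword_tags(xml_text):
    """Return {keyword: set of secondary semantic tags} for one dataset record."""
    root = ET.fromstring(xml_text)
    tags = {}
    for kw in root.findall("keyword"):
        # Normalise each secondary tag so later comparisons are case-insensitive.
        tags[kw.get("value")] = {
            s.text.strip().lower() for s in kw.findall("semantic")
        }
    return tags


keyword_tags = load_keyword_tags(RECORD_XML)
```

Keeping the secondary tags nested under their keyword means the same keyword, such as “roads,” can carry a different context in each dataset record.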

There is nothing to the interface of a basic web search engine: type a word in the text box, click SEARCH, and about 100,000 items are found, of which perhaps 10 or 20 are actually useful. For a semantic search engine, the user interface cannot be this basic. The search needs multiple input values in order to work properly; otherwise, the system would not know how to use the secondary and tertiary tags on the metadata elements. To address this, the user interface could be designed to accept multiple keywords as separate entities in the search, and to include contextual parameters, such as activity types. For example, a user searching for a coverage depicting rivers may be interested in hydrological information, such as flow rate. The user could specify one or more of the following as keywords: water, rivers, streams, hydrology, flow, etc. Additionally, or alternatively, the user could choose “scientific data” from a drop-down list of activities or study fields. Whatever the user chooses would be searched for among the keywords AND cross-referenced with the secondary tags.
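The matching step described above might be sketched as follows. The dataset titles, the tag values, and the function semantic_search are all invented for illustration, and the scoring rule (counting how many user terms appear among the secondary tags) is one assumed way of doing the cross-referencing, not a method specified here.

```python
# Hypothetical in-memory index: dataset title -> {primary keyword: secondary tags}.
DATASETS = {
    "River flow gauges": {"rivers": {"hydrology", "flow", "scientific data"}},
    "Recreational paddling routes": {"rivers": {"canoe", "recreation", "tourism"}},
    "Forest service roads": {"roads": {"logging", "unpaved"}},
}


def semantic_search(keywords, activity=None, datasets=DATASETS):
    """Rank datasets that match a primary keyword by secondary-tag overlap."""
    terms = {k.strip().lower() for k in keywords}
    if activity:
        # The activity chosen from the list is treated as one more search term.
        terms.add(activity.strip().lower())

    results = []
    for title, keyword_tags in datasets.items():
        # Step 1: search the user's terms among the primary keywords.
        matched = terms & set(keyword_tags)
        if not matched:
            continue
        # Step 2: cross-reference the same terms with the secondary tags.
        secondary = set()
        for keyword in matched:
            secondary |= keyword_tags[keyword]
        score = len(terms & secondary)
        results.append((score, title))

    # Highest overlap first; ties are broken alphabetically by title.
    results.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in results]


hits = semantic_search(["rivers", "flow"], activity="scientific data")
```

In this sketch both river datasets match the keyword “rivers,” but the one whose secondary tags share the user's context is ranked first.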

While semantic search engines offer many advantages, especially with respect to spatial data dictionaries, there is one major constraint on their practicality: the amount of time and effort required by an administrator to add or modify entries. Because what is proposed is an XML file structure with three classes of tags, the time required to populate all of those tags with values may discourage implementation of the system. Even with a small number of primary metadata elements, the system could potentially need hundreds of secondary and tertiary tags to work successfully. One solution to this problem would be to pre-define contextual definitions for a library of common geographical terms. A comprehensive set of secondary and tertiary tags could be assigned to words like road, field, lake, etc. During metadata entry, the user would type in a word, and the interface would inform the user whether a set of pre-defined semantic and ontological tags exists for that word. This would not cover every possible word, but it would address the problem in sufficient detail.
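A minimal sketch of the pre-defined library lookup is given below. The library contents and the function suggest_tags are hypothetical, and the sketch assumes that the secondary tags are the semantic ones and the tertiary tags are the ontological ones.

```python
# Hypothetical pre-defined library: common geographic term -> tag sets.
# All tag values are invented for illustration.
TAG_LIBRARY = {
    "road": {
        "secondary": {"paved", "unpaved", "highway"},
        "tertiary": {"transportation"},
    },
    "lake": {
        "secondary": {"freshwater", "shoreline"},
        "tertiary": {"hydrology"},
    },
}


def suggest_tags(word, library=TAG_LIBRARY):
    """Return a copy of the pre-defined tags for a word, or None if absent."""
    entry = library.get(word.strip().lower())
    if entry is None:
        return None
    # Copy the sets so that edits to one record do not alter the shared library.
    return {tag_class: set(tags) for tag_class, tags in entry.items()}


suggestion = suggest_tags("Road")
missing = suggest_tags("glacier")
```

A metadata entry form could call such a lookup as the user types, pre-filling the tags when an entry exists and asking for manual entry when it does not.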
