4.2 Problems and Errors
Probably the most significant problem that we encountered in this project was referring to the ISO tags in the XML files with Metadata Explorer. Metadata Explorer draws upon the information stored in the XML files using JavaServer Pages (JSP) files.
It is easy, in theory, to modify these files to produce a simplified metadata scheme. Some of these simplified products could be one sheet showing "basic" metadata, another showing "details," one showing only technical metadata (geographic bounds, projection, datum, etc.), and so on. Because the XML files contain both FGDC and ISO metadata tags, a separate metadata wizard exists for each metadata scheme. Although Metadata Explorer appears to support ISO metadata, it is not easily configurable for use with ISO metadata. In fact, when we attempted to modify the JSP file "…/Include_details_output.JSP," we encountered a severe problem regarding the structure of the ISO tags in the XML files. The problem is that there are many duplicate XML tags (i.e., several tags with the same name), many with several child elements and tags. ArcIMS version 4.0 has bugs that do not allow it to recognize higher-level XPath statements (XPath being the path language used to declare XML tag locations). This causes confusion within the metadata sheet generation process, for there is no way to tell the program which tag to draw data from.
The ISO metadata structure is not devoid of an intended method for sorting the different tags. In every case of duplicate XML tags, there is a child tag with a "value" attribute, and each tag is assigned a different value. The intended use of this attribute is as a unique identifier for each tag, such that tags can have the same name but be referred to as different classes of the same tag type. An example of this would be a set of tags named keywords: there may be thematic keywords, place keywords, context keywords, and others. This naming convention causes problems with XPath statements. In order to refer to the appropriate XML tags in the JSP file, an XPath statement needs to be written for each element. XPath statements are very easy to write: it is not difficult to write an expression that selects a tag from a group of tags with the same name, provided that there is a unique identifier embedded within the tag's data. Figure 4-2-1 shows the principal problem, and Figure 4-2-2 shows a simple selection based on the value of the attribute "ID."
<AAA>
<BBB>
<CCC ID="1">
<DDD>####</DDD>
</CCC>
<CCC ID="2">
<DDD>####</DDD>
</CCC>
</BBB>
</AAA>
Figure 4-2-1. The selection resulting from the string /AAA/BBB/CCC.
<AAA>
<BBB>
<CCC ID="1">
<DDD>####</DDD>
</CCC>
<CCC ID="2">
<DDD>####</DDD>
</CCC>
</BBB>
</AAA>
Figure 4-2-2. The selection resulting from the string /AAA/BBB/CCC[@ID = 1]/DDD.
Metadata Explorer, however, does not understand an XPath statement like this one. XPath Visualizer can be found at:
http://www.vbxml.com/xpathvisualizer/default.asp
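The behaviour shown in the two figures can be reproduced with a short script. This is only an illustrative sketch using Python's standard xml.etree.ElementTree library as a stand-in XPath engine (it is not part of Metadata Explorer or our project code), with placeholder element content:

```python
# Hypothetical reproduction of the structure in Figures 4-2-1 and 4-2-2,
# showing how an XPath predicate on the ID attribute selects one of two
# duplicate tags with the same name.
import xml.etree.ElementTree as ET

doc = """
<AAA>
  <BBB>
    <CCC ID="1"><DDD>first</DDD></CCC>
    <CCC ID="2"><DDD>second</DDD></CCC>
  </BBB>
</AAA>
"""
root = ET.fromstring(doc)

# /AAA/BBB/CCC -- the bare path matches BOTH duplicate tags (Figure 4-2-1).
print(len(root.findall("./BBB/CCC")))            # 2

# /AAA/BBB/CCC[@ID='1']/DDD -- the predicate narrows the selection to one
# tag (Figure 4-2-2).
print(root.find("./BBB/CCC[@ID='1']/DDD").text)  # first
```

The predicate form in the last line is exactly the kind of statement Metadata Explorer failed to interpret.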
Regarding the current structure of ISO metadata in XML, our view is that the tag system should be re-designed. The presence of multiple XML tags with the same name is not a good solution. The use of ID attributes within the tags is a source of much error and grief when making specific references in the file. XPath statements are very easy to write but, unfortunately, not easy for some software packages to understand. Software problems are another issue, yes; however, we must state that such a system should have a tag-hierarchy structure in which no duplicate tags exist. The exception to this is that duplicate tag names can be used if they are located within parent tags that have different names. The net result should be that every ISO metadata element in the XML file has a unique XPath statement referring to it. This will ensure that the standard can be implemented without multiple tag-reference issues, such as those that we experienced.
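As an illustration of this proposed re-design, the following sketch uses hypothetical tag names (themeKeywords and placeKeywords are inventions for this example, not actual ISO elements) to show how uniquely named tags give every element a plain, predicate-free XPath:

```python
# Sketch of the proposed re-design: give each class of keyword its own
# element name instead of duplicate tags distinguished by an attribute.
# Every element then has a unique XPath with no [@...] predicate needed.
import xml.etree.ElementTree as ET

doc = """
<metadata>
  <themeKeywords><keyword>hydrology</keyword></themeKeywords>
  <placeKeywords><keyword>Burnaby</keyword></placeKeywords>
</metadata>
"""
root = ET.fromstring(doc)

# Each element is reachable by a plain path, which even software with a
# limited XPath implementation can resolve unambiguously.
print(root.find("./themeKeywords/keyword").text)  # hydrology
print(root.find("./placeKeywords/keyword").text)  # Burnaby
```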
Other Problems and Issues
We encountered numerous problems throughout our project. The first one is the quality of the metadata: many of the datasets from the SIS server and Research Data Library do not have high-quality metadata, so it must be researched and created.
This affects the reliability of our datasets as well, for we are filling in the blanks for the required elements with no certainty that what we are writing is truly correct. The quality of the metadata plays an important role, since users rely on this information; if the metadata displays incorrect information about the datasets they are using, it affects the quality of their research. Even though a large number of spatial datasets are available to SFU students and researchers, information on how to obtain or access these datasets is not always readily available. This is especially true of the HTML interface, which is the most common one; the exchange of data among academic department information systems is problematic because the context and tags are inconsistent.
Our approaches to solving these problems were varied. One of our initial efforts was to establish a metadata schema conforming to the FGDC metadata standard. The FGDC standard was chosen because it is a leading worldwide standard and is strongly recommended for the future open GIS environment at SFU. Standardization matters because the people and organizations that provide data to users should share the same metadata format, so that there is no confusion between different datasets. If all providers comply with the same format of metadata, it is much easier for users to understand the data and put it to proper use. Moreover, it is a solution to interoperability issues. One of the focuses of our project is to attempt to minimize the gap between diverse datasets by setting standards, in hopes of achieving interoperability between providers and users in the campus community. This will allow users to draw on the diverse collection of data for research without having to question whether the datasets are compatible. On the other hand, we should be realistic and understand that this problem may never be fully resolved, because of the semantic and ontological problems relating to this topic. Since our group consists of only four people, we cannot think exactly as the tens of thousands of other students and faculty do; we can only assume.
This leads to problems implementing semantics and ontologies in a rigid programming environment. For example, the keywords used in our data dictionary may not match those used by others, because we may choose one of several words that share the same meaning (e.g., car and vehicle) or a single word with many meanings, such as "range." In addition, semantics is interrelated with ontological problems. Because we are working with a collection of diverse datasets, each has a different ontological schema, and we are incorporating those schemas into our project. We are therefore assuming that the users of this service share the same ontological schema, which is usually not the case. This is more evident in certain terms used by different departments within SFU. Ontologies are locally unique, and the problem may therefore never be fully solved, but we hope to shrink the gap by standardizing the metadata.