3.2 Philosophical Issues
To produce good metadata for the “SFU Online Data
Dictionary,” it is important to understand semantics and
ontologies. A solid understanding and proper application of
these fields makes the composition of good metadata
possible. An understanding of Semantic Query Language (SemQL)
and Wordnet is also necessary, since these tools are intended
to reduce semantic differences and ontological problems.
Semantics
Semantics concerns the fact that words can have “different
meaning for different purposes for different people.” In
creating definitions for metadata elements, semantic
differences often create problems, because the definitions are
usually short, simple phrases that can be interpreted
differently depending on each prospective user’s experience
and knowledge. This leads to the problem of querying metadata
with regard to the semantics of the search parameter. When
querying many data sources, how do you know that your search
parameter has been interpreted in the same way by each source,
with the same meaning and in the same context? This problem of
semantic differences can be reduced when ontologies are
properly used. In our project, we have addressed this issue
using XML tags (refer to technical issue). Semantic
interoperability has been identified as a key issue in
geographic data sharing between different geo-spatial
information communities.
Ontology
Ontology is a process of
“achieving a clear and concise description of terms and
concepts.” Ontologies can also be characterized as shared
vocabularies or conceptualizations of a specific subject
matter. Producing an ontology for terms with semantic
differences can therefore help reduce problems of
interpretation. Proper use of ontologies promotes the
integration of data from different sources into a single
system and improves “access to and sharing of existing
geographical information resources.” An ontology provides a
logical definition of concepts and their properties, which is
beneficial in typical application scenarios. One means of
achieving this is to standardize a set of ontologies suited to
describing the semantics of a dataset. In our project, we need
to address ontologies in order to improve the metadata search
engine through the use of XML tags. Because our group is
responsible for determining the search keywords, we must
identify the set of ontologies that best suits the dataset in
order to achieve an efficient metadata search engine.
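As a rough illustration of how XML-tagged keywords and a shared vocabulary might work together in a search, consider the sketch below. It is not the project's implementation: the tag names (catalogue, dataset, keyword), the sample records and the ONTOLOGY table are all hypothetical and chosen only to show the idea of mapping variant terms to one agreed concept before matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared vocabulary: each variant term maps to one agreed concept.
ONTOLOGY = {
    "road": "transportation_route",
    "street": "transportation_route",
    "highway": "transportation_route",
    "river": "watercourse",
    "stream": "watercourse",
}

# Hypothetical metadata records; the tag names are illustrative only.
METADATA_XML = """
<catalogue>
  <dataset id="d1">
    <title>City street network</title>
    <keyword>street</keyword>
  </dataset>
  <dataset id="d2">
    <title>Regional stream survey</title>
    <keyword>stream</keyword>
  </dataset>
  <dataset id="d3">
    <title>Highway inventory</title>
    <keyword>highway</keyword>
  </dataset>
</catalogue>
"""


def canonical(term):
    """Return the shared concept for a term, or the term itself if unmapped."""
    term = term.strip().lower()
    return ONTOLOGY.get(term, term)


def search(xml_text, query):
    """Return ids of datasets whose keyword shares a concept with the query."""
    root = ET.fromstring(xml_text)
    wanted = canonical(query)
    hits = []
    for dataset in root.iter("dataset"):
        keywords = [kw.text or "" for kw in dataset.findall("keyword")]
        if any(canonical(kw) == wanted for kw in keywords):
            hits.append(dataset.get("id"))
    return hits
```

With this sketch, a search for "road" would return the street and highway datasets even though neither record contains the word "road", because all three terms resolve to the same concept in the assumed vocabulary.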
Semantic Query Language and Wordnet
SemQL is a
semantic version of Structured Query Language (SQL). It is
similar to SQL, except that it has no FROM clause, and it is
being developed to search multi-database systems in a way that
considers the context of the query parameters. For such a query
to be successful, the system must know two important things:
where to find relevant information in the component databases,
and which entities, elements and attributes within the
component databases meet the semantic requirements of the
search. With the use of Wordnet, semantic networks can be
established for use in the search, a process also known as
semantic heterogeneity classification. Wordnet and semantic
heterogeneity are important factors to examine when developing
search engines of this nature. “Because meaningful sentences
are composed of meaningful words, any system that hopes to
process natural languages as people do must have information
about words and their meanings.” Wordnet uses many semantic
relationships in determining the meaning of a group of words.
These relationships include synonymy (words with the same
meaning), antonymy (opposites), troponymy (manner), and
several others.
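The two requirements described above (knowing where the information lives, and which attributes satisfy the meaning of the request) can be sketched as follows. This is only an illustration of the idea of a query without a FROM clause; it does not use actual SemQL syntax. The CATALOGUE table, the database, table and column names, and the resolve and to_sql helpers are all hypothetical.

```python
# Hypothetical catalogue: concept -> (database, table, column) entries holding it.
CATALOGUE = {
    "road_name": [("city_db", "streets", "st_name"),
                  ("province_db", "highways", "hwy_label")],
    "road_length": [("city_db", "streets", "len_m"),
                    ("province_db", "highways", "length_km")],
    "population": [("census_db", "tracts", "pop_total")],
}


def resolve(concepts):
    """Find each source table that can supply every requested concept."""
    per_source = {}
    for concept in concepts:
        for database, table, column in CATALOGUE.get(concept, []):
            per_source.setdefault((database, table), {})[concept] = column
    # Keep only the sources that cover all requested concepts.
    return {src: cols for src, cols in per_source.items()
            if len(cols) == len(concepts)}


def to_sql(concepts):
    """Build one ordinary SQL statement per qualifying source."""
    statements = []
    for (database, table), cols in sorted(resolve(concepts).items()):
        select_list = ", ".join(f"{cols[c]} AS {c}" for c in concepts)
        statements.append(f"SELECT {select_list} FROM {database}.{table}")
    return statements
```

In this sketch the user names only the concepts wanted, for example to_sql(["road_name", "road_length"]), and the FROM clause of each generated statement is filled in from the catalogue rather than written by the user.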
Wordnet is a
very powerful tool for finding semantic relationships within
groups of words, but the validity of its results may be of
concern. Questions arise about how Wordnet deals with
differences in cultural expression. Consider the variations in
vocabulary that exist within the English language alone. For
example, in England, large trucks are not known as trucks;
rather, they are known as “lorries.” Many other
culture-language discrepancies exist that could result in
Wordnet producing output that is not useful to the search
engine, thus lowering its reliability. More research needs to be
undertaken on Wordnet, especially on how its code works
internally.
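The truck and lorry example can be made concrete with a small sketch of synonym expansion. It does not call Wordnet itself; the hand-built SYNSETS list is a hypothetical stand-in for the synonym sets such a tool would supply, and the expand and matches helpers are illustrative names only.

```python
# Hand-built stand-in for synonym sets; not taken from Wordnet itself.
SYNSETS = [
    {"truck", "lorry"},
    {"elevator", "lift"},
    {"apartment", "flat"},
]


def expand(term):
    """Return the term together with every synonym listed alongside it."""
    term = term.lower()
    expanded = {term}
    for synset in SYNSETS:
        if term in synset:
            expanded |= synset
    return expanded


def matches(query, description):
    """True if the description contains the query term or one of its synonyms."""
    words = set(description.lower().split())
    return bool(expand(query) & words)
```

Here a search for "truck" would also match a record described with "lorry". The same sketch hints at the reliability concern: "flat" is listed as a synonym of "apartment", yet it can also mean level ground, so expansion without context could return irrelevant records.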
The process
resulting from the combination of SemQL and semantic
heterogeneity classification is known as “semantic query
processing,” and has proven to be quite successful in finding
specific entries in groups of small and moderately sized
databases. However, it is unclear whether this search process
would work in a very large environment, where the key
difficulty is the construction of the semantic heterogeneity
classification. Semantic relationships would be much harder to
form, and in some cases would probably not be formed correctly
at all. SemQL is a very intriguing language, and it deserves
more attention from the geo-spatial community as a possible
replacement for SQL as the query language of choice for search
engines dealing with geo-spatial data.