go to part 5: Discovery, Transmission, and Storage of RDF data
Contents
A vocabulary is a list of terms and their definitions. (1) In this context, "vocabulary" refers specifically to a controlled vocabulary in the sense that terms are included in the list by the declaration of some sponsoring body. Within a vocabulary, the best practice is to have only one term for a single thing and to have only a single definition for each term. Terms are often identified by a URI created from a text string which is generally understandable by humans to be related to the concept that is embodied in the term.
Although different terms may be named by URIs that use the same string after a prefix, e.g. ac:derivedFrom
and dsw:derivedFrom
, they do not necessarily mean the same thing. The term text strings are qualified by prefixing them in such a way that they form a URI which unambiguously relates them to their defining vocabulary. What each term means is defined in the vocabulary from which they came, and the data in the RDF cannot be interpreted without reference to that definition.
For example, the string "derivedFrom" can represent "a reference to an original resource from which the current one is derived" (the sense in which it is used in the draft Audubon Core standard (2) ). However, "derivedFrom" can also be a term representing the organism or Token from which another Token is derived (the sense in which it is used in darwin-sw (3) ). To differentiate between different definitions of terms, a namespace abbreviation is placed in front of the term to make it unique. In this example, ac:
can abbreviate "http://rs.tdwg.org/ac/terms/
" and dsw:
can abbreviate "http://purl.org/dsw/
". So when one writes ac:derivedFrom
it is an abbreviation for the URI http://rs.tdwg.org/ac/terms/derivedFrom
while dsw:derivedFrom
is an abbreviation for the URI http://purl.org/dsw/derivedFrom
. In this convention, the namespace provides a way to identify the defining vocabulary while the entire URI provides a globally unique identifier for the term. Often a variety of the namespace itself serves as an identifier for the entire vocabulary, known as the vocabulary URI.
Commonly vocabulary namespaces are either a "hash namespace" or a "slash namespace". URIs for terms in vocabularies that use hash namespaces are constructed by appending a "#" character to the end of the vocabulary URI, then appending the term string. For example, the SKOS core vocabulary has the URI http://www.w3.org/2004/02/skos/core
. The URI for the term skos:closeMatch
is formed by concatenating the vocabulary URI, "#", and the term name "closeMatch": http://www.w3.org/2004/02/skos/core#closeMatch
. URIs for terms in vocabularies that use hash namespaces are constructed by appending the term name to the end of the vocabulary URI. For example, the Dublin Core "terms" vocabulary has the vocabulary URI http://purl.org/dc/terms/
. The URI for the term dcterms:subject
is formed by concatenating the vocabulary URI and the term name "subject": http://purl.org/dc/terms/subject
. The choice of whether to use a hash or slash namespace depends on several factors, including the size of the vocabulary and the method by which the RDF describing the vocabulary will be served. This is described in detail in Appendix B of Best Practice Recipes for Publishing RDF Vocabularies.
Even though a vocabulary may use a URI to identify its terms, it should not be presumed that attempting to dereference the URI using a web browser or semantic client will necessarily return any useful information. If a URI does dereference, the information returned can range from a minimal human-readable text description of the term to a complex RDF description of the properties of that term and how it is related to other terms.
Although there is no strict definition of an ontology, it is generally recognized that an ontology describes the formal constraints of the terms in a vocabulary and expresses relationships among terms in the vocabulary using some kind of ontology representation language. An ontology can describe hierarchical relationships among terms (as in a taxonomy), describe associative relationships among terms (as in a thesaurus), and relate terms within the vocabulary to other terms described outside of the vocabulary. Basic RDF makes few distinctions in the ways that URIs can be used to represent predicates and objects. But other systems have been developed to extend basic RDF and provide ways to indicate more clearly the nature of resources and the relationships between them. RDFS (section 4.4.) and OWL (section 7.4.) provide increasingly complex means of describing such relationships.
There are tradeoffs involved in choosing to describe terms using an ontology rather than a simpler controlled vocabulary. Terms that are described as part of an ontology carry much more meaning and can be used to do more sophisticated machine reasoning (i.e. to "generate" new triples from existing ones). On the other hand, terms that are described by an ontology are far more encumbered and may have a much more restricted utility than terms that are described in a simple controlled vocabulary.
For example, Audubon Core is very vague about the types of resources that can be the subjects and objects of ac:derivedFrom
. Because Audubon Core is concerned with media items, one might assume that it is intended to describe how one media item (e.g. an image) is related to another. For instance a particular StillImage might be ac:derivedFrom
some particular MovingImage . However, there is nothing in the formal description of ac:derivedFrom
that would prohibit someone from making the statement that an offspring is ac:derivedFrom
its mother. Thus a simple description of a term allows it to be adapted to uses that were not conceived by the original creators of the term.
In contrast, the darwin-sw ontology defines dsw:derivedFrom
to be a transitive property (i.e. if A dsw:derivedFrom B
and B dsw:derivedFrom C
, then A dsw:derivedFrom C
) and is also defined to have the inverse property dsw:hasDerivative
(i.e. if A dsw:derivedFrom B
, then B dsw:hasDerivative A
). So persons who are not careful in using dsw:derivedFrom
may end up saying things that they do not intend. The problem of making unintended statements is particularly acute when an ontology declares terms to have range and domain properties (see sections 4.3.2.5. and 4.3.2.6.).
Ontologies and a language to describe them (OWL) are described in more detail in section 7. of this guide.
In 2007, the Dublin Core Metadata Initiative (DCMI) (4) produced a Recommendation describing an abstract model for Dublin Core (DC) metadata. (5) Because DC is a widely recognized and used vocabulary, the DCMI Abstract Model (DCAM) has been used as a model for other vocabularies outside DCMI, with Darwin Core as a notable example within our community. The proposed Audubon Core TDWG standard is also modeled after the DCMI Abstract Model. As an abstract model, the DCMI Abstract Model is independent of any particular coding scheme. Thus it can be used as a model for vocabularies such as Audubon Core, whose normative definition is in human-readable terms and assumes no particular serialization. Nevertheless, the DCAM was built upon concepts defined in the World Wide Web Consortium's (W3C's) Resource Description Framework (RDF). So even vocabularies which are based on the DCAM but which are not expressed in RDF are structured in a way analogous to the structure of RDF.
Resources that are described using the DCAM are identified by URIs or are expressed as string literals. The nature of literals can be clarified by applying an ISO language tag or by indicating the structure of the string by associating it with a syntax encoding scheme (e.g. it is an integer or formatted date). See this for details of DCAM literal value surrogates.
A subject resource is described by a set of statements. Each statement is made up of a property-value pair.
In a statement about a subject resource which consists of a property-value pair, the property (also known as an element) is a term from the vocabulary (e.g. the DC vocabulary) which relates the subject resource to the value. The value of the statement can be represented by either a URI or a literal. For example a book (the subject) could be described by the property dcterms:title
and the literal value "A Tale of Two Cities
". Optionally, the literal could have the tag xml:lang="en"
or xml:lang="es"
for property-value pair dcterms:title
"Historia De Dos Ciudades
".
The value surrogate (which represents the value resource) could also be a URI as in the property-value pair:
dcterms:type http://purl.org/dc/dcmitype/Text
where http://purl.org/dc/dcmitype/Text
is a URI which identifies the type of things that consist primarily of words for reading.
The DCAM defines a special type of vocabulary term called a class. A class is a mechanism defining a group of resources of a particular type. A resource which is a member of a particular class of resources is called an instance of that class.
Note: a common convention is to capitalize the names of classes and to write the names of properties in camelCase
. However, this convention is one of convenience and it is not safe to make assumptions about the nature of a term based on its capitalization.
Terms in a vocabulary which are properties that can describe all instances of a particular class are said to have a domain relationship with that class. For example, the property term dcterms:medium
has a domain which is the class dcterms:PhysicalResource
. So the property dcterms:medium
will apply to all dcterms:PhysicalResource
instances.
Many DC terms which are properties do not have domain assignments. Thus, although a resource which is an instance of the class dcterms:BibliographicResource
may have the property dcterms:publisher
, there is nothing which says that a thing which has the property dcterms:publisher
must be an instance of the dcterms:BibliographicResource
class. However, any thing which has the property dcterms:medium
can be inferred to be an instance of the class dcterms:PhysicalResource
because of the domain assignment in the definition of dcterms:medium
.
In some cases, it may be assumed that all values associated with a particular property would be instances of a particular class. This is called a range relationship between the property and the class. For example, if a subject resource has a creator, it can be inferred that the creator is some kind of agent which is capable of creating. Therefore, the definition of the property dcterms:creator
states that it has the range dcterms:Agent
. So if one makes a statement about the painting Starry Night that
dcterms:creator people:Vincent_van_Gogh
one can infer that people:Vincent_van_Gogh
is a dcterms:Agent
As was the case with domain, there is no requirement that any property term have a range assignment.
The DCMI Abstract Model allows a class to be related to another class by a sub-class relationship. One class is a sub-class of another class if all instances of the first class are also instances of the second class. For example dcterms:FileFormat
is defined to be a sub-class of dcterms:MediaType
. Therefore, anything which is a dcterms:FileFormat
is also a dcterms:MediaType
. It is NOT true that anything which is a dcterms:MediaType
will also be a dcterms:FileFormat
.
The DCAM also allows a property to be related to another property by a sub-property relationship. A property is a sub-property of another property if all instances of the first property also have the same relationship as the second property. For example dcterms:creator
is a sub-property of dcterms:contributor
. So if the painting Starry Night is described by the property value pair:
dcterms:creator people:Vincent_van_Gogh
then it can also be known that the property value pair:
dcterms:contributor people:Vincent_van_Gogh
also applies to Starry Night even if it is not stated explicitly.
If the DCAM is used as the basis for a vocabulary, one can assume:
- that the purpose of the vocabulary is to describe resources using sets of property-value pairs.
- that a described (subject) resource may be identified by a URI, although this is not required.
- that resources which are values may be described using URI references or literals.
- that the properties will be identified by qualified URIs.
Although the DCAM assumes the existence of classes, it does NOT require that a vocabulary following its model define any classes. If the vocabulary defines classes the DCAM allows but does NOT require that those classes be associated with any property terms through domain or range assignments.
Although the DCAM was designed to be compatible with RDF, it is a generic information model and does not assume any particular encoding syntax.
RDF was developed as a general purpose model for representing information on the Web. After the introduction of RDF, the W3C published a Recommendation known as the RDF Schema (RDFS) for specifying how RDF could be used to describe vocabularies. (6) RDFS extends the basic RDF syntax by defining additional terms which facilitate the categorization of resources into classes, and which describe how properties are related to those classes. Because the DCMI Abstract Model was developed based on the concepts defined in RDF and RDFS, there is a direct relationship between the concepts discussed in section 4.3. and those which are described formally in RDFS. In fact, although the DCAM does not assume any particular encoding syntax for vocabularies which follow it, the model itself is formally defined based on RDF and RDFS.
A described resource of the DCAM corresponds to the subject of an RDF triple. The property-value pair of the DCAM corresponds to the predicate and object of an RDF triple. The DCAM states that the description of a resource is made up of one or more instances of property-value pairs which describe that single resource. RDF does not demand that resources be described in sets of triples related to a single resource. Rather, RDF triples stand as independent statements of fact which do not have to be grouped in any particular way. However, as a practical matter when RDF is serialized using XML or N3, the triples which describe a single resource are grouped together. For example the following DCAM description:
Described resource: http://bioimages.vanderbilt.edu/kirchoff/ac1490
Property-value pairs:
dcterms:title "Quercus michauxii (Fagaceae) - leaf"
dcterms:created "2010-09-01T03:41:31"
dcterms:dateCopyrighted "2011"
can be expressed in RDF/XML as:
<rdf:Description rdf:about="http://bioimages.vanderbilt.edu/kirchoff/ac1490">
<dcterms:title>Quercus michauxii (Fagaceae) - leaf</dcterms:title>
<dcterms:created>2010-09-01T03:41:31</dcterms:created>
<dcterms:dateCopyrighted>2011</dcterms:dateCopyrighted>
</rdf:Description>
or in N3 as
<http://bioimages.vanderbilt.edu/kirchoff/ac1490>
dcterms:title "Quercus michauxii (Fagaceae) - leaf"
dcterms:created "2010-09-01T03:41:31";
dcterms:dateCopyrighted "2011".
In both of these RDF serializations, the predicates and objects which describe the subject are presented in a list of pairs which are analogous to property-value pairs as described in the DCAM.
Because of the basis of the DCMI Abstract Model on the ideas embodied in RDF and RDFS, there are equivalencies between the abstract terminology used in the DCMI Abstract Model and the formally defined terms of RDF and RDFS. The following table summarizes these equivalencies (7):
DCAM | RDF/RDFS |
---|---|
resource | rdfs:Resource |
property or element | rdf:Property |
class | rdfs:Class |
syntax encoding scheme | rdfs:Datatype |
has domain | rdfs:domain |
has range | rdfs:range |
sub-property of | rdfs:subPropertyOf |
sub-class of | rdfs:subClassOf |
The DCAM model presents the concept that a vocabulary term can be a class. However, it does not provide a direct means of declaring that a resource is an instance of a class. This can be done explicitly in RDF using the predicate rdf:type
. If a triple states that the subject resource has rdf:type
C, then C is inferred to be an rdfs:Class
and the triple states that the subject resource is an instance of that class. (8)
4.4.4.2. Inferring an instance of a class based on rdfs:range
and rdfs:domain
properties of a RDF predicate
Because the RDFS terms rdfs:range
and rdfs:domain
correspond directly to the has range and has domain relationships described in the DCAM, the explanation provided in sections 4.3.2.5. and 4.3.2.6. apply to RDFS as well. (9) In the following example,
foaf: = "http://xmlns.com/foaf/0.1/" and
kimage: = "http://bioimages.vanderbilt.edu/kirchoff/
The definition of the term foaf:depicts
in RDF (10) contains the following triple:
foaf:depicts rdfs:domain foaf:Image
which says that a subject of the property foaf:depicts
is an foaf:Image
. If we make the statement:
kimage:ac1490 foaf:depicts http://bioimages.vanderbilt.edu/uncg/14
then the following triple can be inferred even if it is not explicitly stated in any RDF describing kimage:ac1490
since kimage:ac1490
is the subject of a triple having foaf:depicts
as its predicate.
kimage:ac1490 rdf:type foaf:Image
Thus applying the property foaf:depicts
to a resource implicitly asserts that the resource is an instance of the class foaf:Image
.
Similarly, if a property term T is defined to have range C, then if resource O is the object of a triple containing the T as the predicate, it is implied that O rdf:type C
and therefore O is an instance of class C.
The following example illustrates the danger of using a property term having an assigned domain as the predicate of a statement about a resource of a type not intended to be described by that term. If I would like to describe the nature of novels using terms I have created in a vocabulary defined under the namespace myNmspc:
I might make the statement
http://dbpedia.org/resource/The_Lord_of_the_Rings foaf:depicts myNmspc:goodVsEvil
However, making this statement asserts that my descriptive term for novels myNmspc:goodVsEvil
is an instance of the class foaf:Image
which doesn't make very much sense.
4.4.4.3. Implying an instance of a class using the class name as the name of a container element in XML serialization of RDF
When the name of a class is used as the name of the XML container element for the description of a resource (instead of the generic rdf:Description
element), a rdf:type
statement is implied which indicates that the resource is an instance of that class. See section 3.3.3. for an example.
The definition of the class foaf:Image
in RDF (10) makes the statement
foaf:Image rdfs:subClassOf foaf:Document
Thus if one declares that
kimage:ac1481 rdf:type foaf:Image
it can also be inferred that
kimage:ac1481 rdf:type foaf:Document
Note that nothing prohibits a particular resource from being an instance of several classes. Also note that in section 4.4.4.2. it was shown that the triple
kimage:ac1490 foaf:depicts http://bioimages.vanderbilt.edu/uncg/14
implies that kimage:ac1490
is an instance of the foaf:Image
class. Given the rdfs:subClassOf
statement in the definition of foaf:Image
, the triple also implies that kimage:ac1490
is also an instance of the foaf:Document
class.
The examples given above show that careless application of the rdfs:subClassOf
, rdfs:range
, and rdfs:domain
properties in the definitions of terms or the careless use of terms having such properties can result in unintended consequences if users are not aware of the implications of their actions. For this reason, the three properties listed above (as well as rdfs:subPropertyOf
) are generally used sparingly in term definitions of general purpose vocabularies.
The Darwin Core vocabulary (11) is a TDWG technical specification standard. (12) It follows the DCMI Abstract Model (13). In accordance with the DCAM, Darwin Core has terms that are classes or properties. Those properties could have domains which are Darwin Core class terms (although they don't have any that are explicitly declared in the term definitions). The properties are associated with values to form property-value pairs that describe the subject.
However, this particular version of the abstract model is described in the context of the guidelines for implementing application schemas as XML, so some of the statements made in those guidelines are more restrictive than the general DCAM. In particular, they restrict a property to describe at most one class. This is to promote consistency in implementation of the containment system that is described in the guide. The XML guide also restricts values to be literal strings contained within property elements as opposed to allowing values to be specified as XML attributes. So although the model described is considered an abstract model, it is focused on describing "flat" records ("Simple Darwin Core"). The guidelines specifically state that they apply in situations where RDF/XML are not appropriate and that they are NOT guidelines for the implementation of RDF/XML. (14) So the restrictive modifications of the DCAM described in the Darwin Core abstract model should not be considered to apply to any RDF implementation of Darwin Core.
Although there is no existing implementation guide or statement of best practices indicating how Darwin Core should be implemented as RDF, the standard itself is described using RDF in the normative document for Darwin Core (15). Much of the information about the Darwin Core terms which are defined in this document is in the form of literal strings intended for human consumption, namely rdfs:label
, rdfs:comment
, and dcterms:description
. These properties can be viewed at the human-readable Quick Reference Guide. (16) Each term has the property rdfs:isDefinedBy
which has the value of the URI of the vocabulary ( http://rs.tdwg.org/dwc/terms/
). Several additional well-known informational properties are given for each term: dcterms:issued
and dcterms:modified
which are untyped strings but are in ISO 8601 format, rdf:type
which is either rdfs:Property
or rdfs:Class
, dcterms:hasVersion
which references a permanent record in the term history (17), and dcterms:replaces
which references dated terms in the term namespace which are not explicitly described there as subject resources. The properties also include three terms that are described in the Darwin Core attributes vocabulary (18): dwcattributes:abcdEquivalence
, dwcattributes:status
(with values of recommended
or deprecated
), and dwcattributes:organizedInClass
. The first of these three terms maps Darwin Core terms to similar terms in the Access to Biological Collections Data (ABCD) schema.(19) The second suggests to humans whether the term is still recommended for use. The last term suggests to humans the class which a property term might describe; Darwin core terms have no formal rdfs:domain properties
(and no rdfs:range
properties as well).
Of the possible DCAM/RDFS relationships which allow a machine to infer additional facts from a property statement (i.e. range, domain, subclass, and subproperty), only rdfs:subProperty
is used in the Darwin Core term definitions. There are several types of rdfs:subProperty
declarations:
- ID properties which describe identifiers (e.g.
dwc:acceptedNameUsageID
anddwc:datasetID
) which are declared to berdfs:subProperty
ofdcterms:identifier
. - properties involving dates (e.g.
dwc:eventDate
anddwc:georeferencedDate
) which are declared to berdfs:subProperty
ofdcterms:date
. dwc:namePublishedIn
which is declared anrdfs:subProperty
ofdcterms:bibliographicCitation
- "By" terms (
dwc:georeferencedBy
,dwc:identifiedBy
,dwc:locationAccordingTo
,dwc:measurementDeterminedBy
,dwc:nameAccordingTo
,dwc:recordedBy
,dwc:relationshipAccordingTo
) which are declared to berdfs:subProperty
ofdwc:accordingTo
, a Darwin Core-defined property which is an "Abstract term to attribute information to a source." dwc:collectionCode
which is declared anrdfs:subProperty
ofhttp://rs.tdwg.org/ncd/terms/Collection#Code
dwc:collectionID
which is declared anrdfs:subProperty
ofhttp://rs.tdwg.org/ncd/terms/Collection#Id
dwc:institutionCode
which is declared anrdfs:subProperty
ofhttp://rs.tdwg.org/ncd/terms/Institution#Code
dwc:institutionID
which is declared anrdfs:subProperty
ofhttp://rs.tdwg.org/ncd/terms/Institution#Id
dwc:ownerInstitutionCode
which is declared anrdfs:subProperty
ofhttp://rs.tdwg.org/ncd/terms/Institution#Code
dwc:ResourceRelationship
(a Darwin Core class) which is declared anrdfs:subProperty
ofdcterms:relation
(20)
Note: the URIs having namespace http://rs.tdwg.org/ncd/terms/Institution#
seem to be incorrect as the namespace for the Natural Collection Description vocabulary is defined as http://rs.tdwg.org/ontology/voc/Collection#
in the draft standard. (21)
There are a number of terms included in the Darwin Core standard which are not defined by Darwin Core itself. Most of these terms are record-level property terms that are defined by Dublin Core (22), e.g. dcterms:type
, dcterms:modified
, etc. The exception is dcterms:Location
which serves as one of the classes under which Darwin Core terms are organized.
A significant aspect of these terms is that unlike the terms defined in the Darwin Core namespace, the Dublin core terms may have domains and ranges specified in their definitions. This is significant in the case of dcterms:type
which will be discussed in section 4.7.3.1. See also a discussion of problems involving Darwin Core guidelines for using Dublin Core terms.
There are two ways to assert that a resource is a class
As described in section 4.4.4.1., one can express that a term represents a class if it is declared explicitly to be a class in its description. For example, in the definition of dcterms:Agent
the following RDF triple is used to declare it to be a class:
http://purl.org/dc/terms/Agent rdf:type http://www.w3.org/2000/01/rdf-schema#Class
If a term is declared to be an owl:Class
, then it is also implicitly declared to be an rdfs:Class
because owl:Class rdfs:subClassOf rdfs:Class
. See section 7.6.2. of this guide for more on owl:Class
.
The RDFS schema defines the relationship between rdf:type
and rdfs:Class
.(23) The schema says that rdf:type
is used to state that a resource is an instance of an rdfs:Class
. Significantly, the range of rdf:type
is defined to be rdfs:Class
. Therefore, when one uses rdf:type
to declare that a resource is an instance of a particular class, one is also asserting that the object of the triple IS an rdfs:Class
even if the term which is the object isn't already explicitly defined to be one. (So the in the previous example, the triple not only asserts that dcterms:Agent
is an instance of rdfs:Class
, it is also asserting indirectly that rdfs:Class
is an instance of rdfs:Class
!)
For example, if one asserts that
http://bioimages.vanderbilt.edu/kirchoff/ac1490 rdf:type http://purl.org/dc/dcmitype/StillImage
or in abbreviated form
kimage:ac1490 rdf:type dctype:StillImage
not only does it assert that kimage:ac1490
is an instance of dctype:StillImage
, but it also declares dctype:StillImage
to be an rdfs:Class
. As it turns out, dctype:StillImage
is already declared to be an rdfs:Class
in its term definition (24), but if it hadn't already been a class, one would be asserting that it was one by using it as the object of the type declaration.
The RDFS schema is silent about what constitutes appropriate classes or types in a particular situation. Users are free to use any pre-existing types/classes or to define their own.
However, since the purpose of having classes seems to be to allow different users to know that they are both talking about the same kind of resource, it is a best practice to use well-known types rather than making up ad hoc types.(25) The wiki page ClassInventory lists classes defined in some well-known vocabularies and ontologies.
Both Dublin Core (26) and Darwin Core (27) define type vocabularies. In the RDF which defines both of these vocabularies, each of the listed types is declared to be an rdfs:Class
. So as a practical matter, from the standpoint of RDF these terms are no different than the classes defined within the regular Dublin Core and Darwin Core term vocabularies. The classes in the type vocabularies are also listed on the page ClassInventory.
These type vocabulary terms are defined separately from the classes defined in the general terms vocabularies because these terms are intended to be used as controlled values for other property terms in the respective vocabularies. The DCMI Type Vocabulary provides a list of approved terms that may be used as controlled values for the dcterms:type
property, while the Darwin Core Type Vocabulary "extends the DCMI Type Vocabulary and provides a list of approved values that may be used for the dwc:basisOfRecord
term to identify the specific type of a resource." (27) In the case of a Simple Darwin Core (28) serialization, values are expressed as text literals. dcterms:type
and dwc:basisOfRecord
can be used as field names in a text file with the literals from the type vocabularies given as values for that field in a record. In Darwin Core XML (29), the type values are expressed as string literal content of dcterms:type
and dwc:basisOfRecord
elements.
In RDF, a type vocabulary term could be used as an object of dcterms:type
and dwc:basisOfRecord
properties. For example:
http://bioimages.vanderbilt.edu/kirchoff/ac1490 dcterms:type dctype:StillImage
http://bioimages.vanderbilt.edu/specimen/ncu592805 dwc:basisOfRecord dwctype:LivingSpecimen
However, as an rdfs:Class
each of these type terms could also be used as the object of an rdf:type
property as well.
http://bioimages.vanderbilt.edu/kirchoff/ac1490 rdf:type dctype:StillImage
http://bioimages.vanderbilt.edu/specimen/ncu592805 rdf:type dwctype:LivingSpecimen
The guidelines for expressing Dublin Core metadata as RDF (30) state that "It is recommended that RDF applications implementing this specification primarily use and understand rdf:type
in place of dcterms:type
when expressing Dublin Core metadata in RDF, as most RDF processors come with built-in knowledge of rdf:type
." It also notes that dcterms:type
has semantics that are similar to rdf:type
, but that the precise relationship between those properties is undecided. There is no similar guideline for use of dwc:basisOfRecord
vs. rdf:type
, but using the same rationale, it would seem preferable to use rdf:type
when Darwin Core is expressed in RDF.
Interestingly, Darwin Core includes two classes borrowed from Dublin Core: dcterms:Location
as a class defined in the Darwin Core Terms vocabulary and dctype:Event
in the Darwin Core Type vocabulary. However, Darwin Core also defines classes dwctype:Location
and dwc:Event
. Nothing in the RDF ties these similarly-named classes together, so a machine would not consider a resource with rdf:type dctype:Event
to be an instance of the same class as another resource with rdf:type dwc:Event
; likewise resources typed as dcterms:Location
and dwctype:Location
would be considered instances of different classes. In a serialization where the types were expressed as text literals rather than URIs, they would simple be "Event
" and "Location
" and not differentiable.
The creators of RDFS and the DCMI Abstract Model considered identification of the class of which a resource was an instance to be such a significant aspect of describing a resource that special structures and vocabulary were developed to declare or infer the type of a resource. In order for effective querying of an RDF database, or to conduct reasoning based on RDF data, it is important to be able to know whether resources described by different people or organizations are actually the same kind of thing. Thus it should be an important task of the community to reach a consensus about the precise URIs which should be used for typing, or if several URIs are acceptable, to define equivalency among terms from different vocabularies.
For some descriptions of what a vocabulary is and how it is related to terms such as taxonomy, thesaurus, and ontology, see
http://infogrid.org/wiki/Reference/PidcockArticle
http://semwebtec.wordpress.com/2010/11/23/contolled-vocabulary-vs-ontology/
http://species-id.net/wiki/Audubon_Core_Term_List#derivedFrom
http://code.google.com/p/darwin-sw/wiki/ClassToken
There are important distinctions between the legacy terms in the dc: = http://purl.org/dc/elements/1.1/
namespace and the terms in the dcterms: = http://purl.org/dc/terms/
namespace. See the sections "Addition of domains and ranges for existing properties and creation of new properties" and "Subproperty relations between terms in 'the Dublin Core' " at http://dublincore.org/usage/decisions/2008/dcterms-changes/#sect-2 as well as http://dublincore.org/documents/2008/01/14/domain-range/ for details.
http://dublincore.org/documents/abstract-model/
http://www.w3.org/TR/rdf-schema/
The information in this table is taken from
http://dublincore.org/documents/abstract-model/#sect-5
using the following namespace abbreviations to simplify the table:
rdf: = http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: = http://www.w3.org/2000/01/rdf-schema#
http://www.w3.org/TR/rdf-schema/#ch_type
http://www.w3.org/TR/rdf-schema/#ch_range
http://xmlns.com/foaf/spec/index.rdf
http://www.tdwg.org/standards/status-and-categories/
http://rs.tdwg.org/dwc/terms/guides/xml/index.htm#abstractmodel
http://rs.tdwg.org/dwc/terms/guides/xml/index.htm#introduction
can be downloaded from
http://rs.tdwg.org/dwc/rdf/dwcterms.rdf
using most web browsers. However, it is not possible to view the file using most web browsers.
The file at http://code.google.com/p/darwincore/source/browse/trunk/rdf/dwcterms.rdf (viewable in a browser) is similar but may be an outdated version.
The official normative document (i.e. Type 1 document sensu the draft TDWG documentation standard) for Darwin Core is http://rs.tdwg.org/dwc/rdf/dwctermshistory.rdf viewable at http://code.google.com/p/darwincore/source/browse/trunk/rdf/dwctermshistory.rdf but as a practical matter, dereferencing term URIs leads to the dwcterms.rdf file since the terms in the dwctermshistory.rdf file have dates appended for versioning purposes.
http://rs.tdwg.org/dwc/terms/index.htm
can be downloaded from
http://rs.tdwg.org/dwc/rdf/dwctermshistory.rdf
using most web browsers and which can be viewed in human-readable form at
http://darwincore.googlecode.com/svn/trunk/terms/history/index.htm
but which is not necessarily up to date.
http://rs.tdwg.org/dwc/terms/attributes/
which is described by the document
http://rs.tdwg.org/dwc/terms/attributes/dwcattributes.rdf
The ABCD schema is a TDWG Standard (http://www.tdwg.org/standards/115/) which can be used to describe collections, specimens, and observations in a highly structured fashion. It is in the form of an XML schema and its terms are not identified by URIs nor defined in RDF, so they cannot be used as URI object references in RDF.
see http://code.google.com/p/darwincore/issues/detail?id=135
Human readable http://www.tdwg.org/standards/312/
RDF at
http://rs.tdwg.org/ontology/voc/Collection
http://rs.tdwg.org/ontology/voc/Institution
http://rs.tdwg.org/ontology/voc/InstitutionType
http://rs.tdwg.org/ontology/voc/ContactDetails
Use view page source to view RDF; however, a note at that page says management of the vocabulary is now at http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/
http://dublincore.org/documents/dcmi-terms/
http://www.w3.org/TR/rdf-schema/#ch_type
http://dublincore.org/2010/10/11/dctype.rdf in RDF
http://dublincore.org/documents/dcmi-type-vocabulary/ in human-readable form
The TDWG GUID Applicability Standard http://www.tdwg.org/standards/150/ states in Recommendation 11 that "Objects in the biodiversity domain that are identified by a GUID should be typed using the TDWG ontology or other well-known vocabularies in accordance with the TDWG common architecture." Thus in our community, it is considered to be a best practice to use a well-known vocabulary.
Human readable at http://dublincore.org/documents/dcmi-type-vocabulary/#H7
RDF at http://dublincore.org/2010/10/11/dctype.rdf
See the section "The Vocabulary Encoding Scheme 'DCMI Type Vocabulary' and its member terms" at http://dublincore.org/usage/decisions/2008/dcterms-changes/#sect-2 for a history of the evolution of the DCMI Type vocabulary
Human readable at http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm
RDF at http://code.google.com/p/darwincore/source/browse/trunk/rdf/dwctype.rdf
http://rs.tdwg.org/dwc/terms/simple/index.htm
http://rs.tdwg.org/dwc/terms/guides/xml/index.htm
http://dublincore.org/documents/dc-rdf/#sect-5
Thanks to Paul Murray for helpful comments and suggestions on this page, including part of the definition of term.
Questions? Comments? Contact Steve Baskauf .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.