How to link datasets to physical samples #16

ashepherd · 2018-03-19T17:26:05Z

{
  "@type": "Dataset",
  ...
  "hasPart": [
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... first sample...
    },
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... second sample ...
    }
  ]
}
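
For anyone consuming markup shaped like this, here is a minimal sketch of how the samples could be pulled back out of `hasPart` by filtering on `additionalType`. The function name `physical_samples`, the `name` values, and the extra non-sample part are made up for illustration; only the `hasPart` / `additionalType` pattern comes from the example above.

```python
import json

# GeoLink class IRI used as additionalType in the example above.
PHYSICAL_SAMPLE = "http://schema.geolink.org/1.0/base/main#PhysicalSample"


def physical_samples(dataset):
    """Return the hasPart entries that are typed as a GeoLink PhysicalSample."""
    parts = dataset.get("hasPart", [])
    if isinstance(parts, dict):
        # JSON-LD allows a single object where a list is expected.
        parts = [parts]
    samples = []
    for part in parts:
        extra_types = part.get("additionalType", [])
        if isinstance(extra_types, str):
            extra_types = [extra_types]
        if PHYSICAL_SAMPLE in extra_types:
            samples.append(part)
    return samples


# Hypothetical dataset: the "name" values and the third part are illustrative only.
doc = json.loads("""
{
  "@type": "Dataset",
  "hasPart": [
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      "name": "first sample"
    },
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      "name": "second sample"
    },
    {
      "@type": "CreativeWork",
      "name": "not a sample"
    }
  ]
}
""")

for sample in physical_samples(doc):
    print(sample["name"])
```

Because `hasPart` can hold parts that are not samples, the filter on `additionalType` is what distinguishes them; a plain schema.org consumer would see only generic `CreativeWork` parts.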

smrgeoinfo · 2018-03-21T19:59:28Z

Given the limited scope of schema.org, 'hasPart' seems like the most plausible relation available for now, as a proof of concept.

In the long run, we really need to think about the purpose of this schema.org markup. There are already lots of better-defined metadata vocabularies and standards in wide use for scientific data. What are we gaining by ad hoc use of relationships like this to get the information we're interested in into the SDO markup, when there are already mechanisms to publish the metadata in XML or using other RDF vocabularies (PROV, GeoDCAT-AP) that are designed for that information? If the commercial search engines are interested in data, wouldn't it make more sense for them to figure out how to index existing metadata?

In the meantime, I'll go ahead and implement 'hasPart' for linking the Earthchem Library datasets to IGSNs where the information exists.

extended example with IGSN:

"hasPart": [
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... first sample...
      "identifier": {
        "@type": "PropertyValue",
        "additionalType": ["http://schema.geolink.org/1.0/base/main#Identifier", "http://purl.org/spar/datacite/Identifier"],
        "name": "IGSN goes here",
        "propertyID": "IGSN",
        "url": "https://app.geosamples.org/sample/igsn/WHO000A52",
        "value": "WHO000A52"
      },
      ...
    },
    {
      "@type": "CreativeWork",
      "additionalType": "http://schema.geolink.org/1.0/base/main#PhysicalSample",
      ... second sample ...
      "identifier": {
        "@type": "PropertyValue",
        "additionalType": ["http://schema.geolink.org/1.0/base/main#Identifier", "http://purl.org/spar/datacite/Identifier"],
        "name": "IGSN goes here",
        "propertyID": "IGSN",
        "url": "https://app.geosamples.org/sample/igsn/WHO000A53",
        "value": "WHO000A53"
      },
      ...
    }
  ]
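
If it helps, the `hasPart` array above could be generated from a list of IGSNs rather than written by hand. This is only a sketch under assumptions: the helper names `sample_part` and `has_part` are hypothetical, the URL pattern is inferred from the two example URLs above, and the `name` field is left out because the example only shows a placeholder for it.

```python
import json

# IRIs copied from the example above.
PHYSICAL_SAMPLE = "http://schema.geolink.org/1.0/base/main#PhysicalSample"
IDENTIFIER_TYPES = [
    "http://schema.geolink.org/1.0/base/main#Identifier",
    "http://purl.org/spar/datacite/Identifier",
]
# Assumption: every IGSN resolves under the same prefix as the two example URLs.
IGSN_URL_PREFIX = "https://app.geosamples.org/sample/igsn/"


def sample_part(igsn):
    """Build one hasPart entry for a physical sample identified by an IGSN."""
    return {
        "@type": "CreativeWork",
        "additionalType": PHYSICAL_SAMPLE,
        "identifier": {
            "@type": "PropertyValue",
            "additionalType": IDENTIFIER_TYPES,
            "propertyID": "IGSN",
            "url": IGSN_URL_PREFIX + igsn,
            "value": igsn,
        },
    }


def has_part(igsns):
    """Build the hasPart list for a dataset from an iterable of IGSN strings."""
    return [sample_part(igsn) for igsn in igsns]


# The two IGSNs from the example above.
markup = {"hasPart": has_part(["WHO000A52", "WHO000A53"])}
print(json.dumps(markup, indent=2))
```

The resulting list would still need to be merged into the full `Dataset` object, along with whatever other sample properties are available.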

mbjones · 2018-03-22T04:27:23Z

@smrgeoinfo totally agree on needing to decide what SDO is "for". I think domain metadata standards are too niche for the big search engines to grapple with. But they're happy to deal with something the size of Wikipedia, and I think happier still if we spend the time mapping domain info onto their chosen model. That it isn't as precise as the domain metadata probably isn't their biggest concern. The one nice thing about everyone mapping to SDO is that we seem to be achieving pretty broad consensus on vocabularies, just because everyone wants to be compatible with the search engines, although we are losing precision along the way.
