HTTP headers vs dialects #449

Closed
JeniT opened this issue Apr 5, 2015 · 7 comments · Fixed by #465

JeniT commented Apr 5, 2015

The algorithm for generating annotated tables in the model document currently includes:

If the tabular data file was retrieved with Content-Type including the header=absent parameter, set header to false in DD.

There are a few issues with this:

  1. why not also use charset to set the encoding property?
  2. shouldn't it be possible for users to override the information from the HTTP headers, given that publishers often don't have much control over them and they may therefore be wrong on occasion?
  3. are there other HTTP headers which could supply information and thus should be treated as part of the embedded metadata?

I think it would be better to only use the parameters (both header and encoding) in the Content-Type header if using the default dialect.
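
As a rough illustration of reading both parameters from Content-Type, here is a minimal Python sketch. The function name `dialect_hints_from_content_type` and the shape of the returned dict are hypothetical, not taken from the spec; only the `header=absent` and `charset` parameters come from the discussion above.

```python
from email.message import Message


def dialect_hints_from_content_type(content_type):
    """Return dialect hints implied by a Content-Type header value.

    Hypothetical helper: the keys mirror the dialect properties `header`
    and `encoding` discussed in this issue.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    hints = {}

    # header=absent means the file has no header row.
    header_param = msg.get_param("header")
    if isinstance(header_param, str) and header_param.lower() == "absent":
        hints["header"] = False

    # charset, when present, maps onto the dialect's encoding property.
    charset = msg.get_param("charset")
    if isinstance(charset, str):
        hints["encoding"] = charset

    return hints
```

For example, `dialect_hints_from_content_type("text/csv; charset=ISO-8859-1; header=absent")` yields hints for both `header` and `encoding`, while a bare `text/csv` yields none.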


gkellogg commented Apr 5, 2015

I could see using charset (from Content-Type). And, the language could be changed so that these fields are set in DD unless values are explicitly defined in the merged metadata.

Could also use Content-Language to set lang inherited property.
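
One way to picture the "unless explicitly defined" rule is as a three-layer merge. This is only a sketch: `DEFAULT_DIALECT` lists a few illustrative properties, and `resolve_dialect` is a hypothetical name, not something defined in the model document.

```python
# Illustrative subset of dialect defaults (assumption: the real dialect
# description has more properties than these three).
DEFAULT_DIALECT = {"header": True, "encoding": "utf-8", "delimiter": ","}


def resolve_dialect(header_hints, explicit_dialect):
    """Layer HTTP-derived hints over defaults, then explicit metadata on top."""
    dialect = dict(DEFAULT_DIALECT)
    dialect.update(header_hints)      # values derived from HTTP headers
    dialect.update(explicit_dialect)  # explicit metadata always wins
    return dialect
```

With this ordering, a wrong `charset` sent by a misconfigured server can still be corrected by a publisher or user who declares `encoding` explicitly.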


iherman commented Apr 6, 2015

I think @gkellogg answered all three points; +1 to his answers.


JeniT commented Apr 6, 2015

+1 on use of the Content-Type.

I suggest that we consider using the HTTP headers from retrieving a file as part of creating the embedded metadata for that file.

As well as the Content-Language for lang, we could use the Link headers (except for any used to discover metadata) to set common properties. This could head us back into the #297 issue about what URLs to use for link relations unless we restrict which link relations are recognised.

Similarly, we could use Last-Modified to set a common property indicating the last modified date (eg dc:modified) but this would mean we had to pick a metadata vocabulary which we avoided doing.

My proposal would be that we only use Content-Language as the others will fall into the "too hard" category, but I wanted to flag the options.
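
A small sketch of the Content-Language option, under two assumptions that are mine rather than settled in this thread: an explicitly declared `lang` takes precedence, and a header listing several languages is ignored because it does not map onto a single `lang` value. The function name is hypothetical.

```python
def lang_from_content_language(content_language, explicit_lang=None):
    """Pick a value for the `lang` inherited property.

    Assumption: explicit metadata wins, and only a single language tag
    in the header is usable.
    """
    if explicit_lang is not None:
        return explicit_lang
    if not content_language:
        return None
    tags = [tag.strip() for tag in content_language.split(",") if tag.strip()]
    # Several tags cannot be reduced to one `lang` value, so give up.
    return tags[0] if len(tags) == 1 else None
```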


iherman commented Apr 6, 2015

On 06 Apr 2015, at 11:37, Jeni Tennison wrote:

> My proposal would be that we only use Content-Language as the others will fall into the "too hard" category, but I wanted to flag the options.

+1 to your last remark :-). I.e., we would re-use Content-Type and Content-Language.




gkellogg commented Apr 7, 2015

We should also look for Content-Type: text/tsv and set the delimiter to <TAB> rather than a comma (,) in the dialect.
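
A minimal sketch of that check, assuming the media type is compared case-insensitively and any parameters are ignored; `delimiter_for` is a hypothetical name.

```python
from email.message import Message


def delimiter_for(content_type, default=","):
    """Return the delimiter implied by the Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_type() lowercases the media type and drops parameters.
    return "\t" if msg.get_content_type() == "text/tsv" else default
```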

gkellogg added a commit that referenced this issue Apr 7, 2015
…t in Content-Type.

Set `lang` inherited property in `EM` from Content-Language.
Fixes #449.

gkellogg commented Apr 7, 2015

I updated as discussed, but did not see a value in adding DD to EM, as it's used for parsing the tabular data file in this algorithm.

I also did nothing with Last-Modified or Link headers at this time, but an issue for LCCR may be appropriate.


iherman commented Apr 8, 2015

On 8 Apr 2015, at 01:08, Gregg Kellogg wrote:

> I updated as discussed, but did not see a value in adding DD to EM, as it's used for parsing the tabular data file in this algorithm.

The PR looks fine to me, but I'll let @JeniT decide on the merge.

> I also did nothing with Last-Modified or Link headers at this time, but an issue for LCCR may be appropriate.

Please open a separate issue...

Ivan

