Apium is an API to access all public Center for Digital Research in the Humanities resources. It is also an invasive weed in Nebraska.
The CDRH has the metadata and text of several thousand documents such as letters, posters, novels, and images in an Elasticsearch index. This API is a wrapper around that index which provides convenient ways to search and filter those items.
Below, you can find instructions about basic functionality like sorting, pagination (start / num), and selecting the fields you want to get back.
A couple of features may need a brief introduction.
Facets provide you a way of combining and counting the values of a field related to a query. For example, if you search for "horse" and get 100 results, a facet on the author name field might tell you that 90 of those results were by Buffalo Bill, 9 from Meriwether Lewis, and 1 from Jane Austen. You can add facets on keyword fields and date fields, but not on text fields.
Highlights are a cool way to preview the results of a text query. For example, if you searched for "horse," highlights might look like "...Oglala Sioux Nation. American Horse was the son of Sitting Bear..." and "...Stout as a horse, affectionate, haughty, electrical..." This preview helps users decide which result is most relevant to them. You can add highlighting to any text field.
Lists number of documents matching keyword fields
Defaults:
- no defaults
Standard fields
facet[]=keyword_field
facet[]=category
facet[]=category&facet[]=title
Nested fields
facet[]=nested_field.keyword_field
facet[]=creator.name
facet[]=creator.name&facet[]=creator.role
Date ranges (currently supports days or years)
facet[]=date_field.range
facet[]=date.year
#=> { 1889 : 10, 1890 : 20 }
facet[]=date
#=> { 01-02-1889 : 2, 03-04-1889 : 8 }
Number of facets returned and sorting alphabetically (by default sorts by count)
facet_limit=number&facet_sort=term|direction
facet_limit=100
facet_sort=term|asc
facet_limit=30&facet_sort=term|desc
Sorting facets
Defaults:
- no selection: score|desc
- term selection, no order: term|desc
Always defaults to score descending. If you wish to sort alphabetically, add "term" and a direction. If you wish to sort score ascending, use "score" and a direction. Multiple sorts for single facets, and distinct sorts for separate facets are not supported at this time.
facet_sort=type|direction
facet_sort=term|desc
facet_sort=score|asc
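As a rough illustration of how the facet parameters above combine into one query string, here is a minimal Python sketch. The helper name facet_query is hypothetical and not part of the API; it only assembles the parameter names and the type|direction syntax documented above.

```python
from urllib.parse import urlencode


def facet_query(fields, limit=None, sort=None):
    """Build the facet portion of a query string (hypothetical helper).

    fields: list of field names, e.g. ["creator.name", "date.year"]
    limit:  optional integer sent as facet_limit
    sort:   optional (type, direction) tuple, e.g. ("term", "asc")
    """
    # One facet[] entry per requested field, in the order given
    params = [("facet[]", field) for field in fields]
    if limit is not None:
        params.append(("facet_limit", limit))
    if sort is not None:
        # The API expects type and direction joined by a pipe
        params.append(("facet_sort", "|".join(sort)))
    # Keep brackets and pipes unescaped so the result reads like the examples above
    return urlencode(params, safe="[]|")


print(facet_query(["creator.name", "creator.role"], limit=30, sort=("term", "desc")))
```

Only one facet_sort is sent, because separate sorts per facet are not supported.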
The fields returned by a query
Defaults:
- returns all possible fields
Restrict the fields displayed per document in the response. Use ! to exclude a field. Wildcards in field names are supported.
fl=yes,!no
fl=title,!date*,date_written
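To make the fl syntax concrete, the sketch below shows one way a client could split an fl value into included and excluded patterns. The function name split_field_list is hypothetical, and this is only a reading of the syntax shown above, not the API's own parsing code. How the server resolves a conflict between an inclusion and a wildcard exclusion is not covered here.

```python
def split_field_list(fl_value):
    """Split an fl value into (included, excluded) field patterns.

    Hypothetical helper: entries starting with "!" are treated as
    exclusions, everything else as inclusions. Wildcards are left as-is.
    """
    included, excluded = [], []
    for entry in fl_value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty entries such as a trailing comma
        if entry.startswith("!"):
            excluded.append(entry[1:])
        else:
            included.append(entry)
    return included, excluded


included, excluded = split_field_list("title,!date*,date_written")
print(included)
print(excluded)
```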
Filters by keyword field across the possible documents
Defaults:
- no filters applied except _type for collection
Standard fields
f[]=field|type
f[]=category|Writings
f[]=category|Writings&f[]=format|manuscript
Nested fields
f[]=nested.keyword|type
f[]=creator.name|Cather, Willa
f[]=contributor.role|Editor
Date field
If given one date, the API will use it as both start and end.
You can give a year range or specify a full date range.
f[]=field|range_start|(range_end)
f[]=date|1884
#=> 01-01-1884 to 12-31-1884
f[]=date|1884|1887
#=> 01-01-1884 to 12-31-1887
f[]=date|1884-02-01|1887-03-01
#=> 02-01-1884 to 03-01-1887
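The following sketch illustrates the filter syntax and the date expansion described above. Both function names are hypothetical, and expand_date_range only mirrors the documented behavior on the client side (a bare year widens to the whole year, a single value serves as both start and end); it is not the API's implementation. Dates are written here in YYYY-MM-DD form.

```python
from urllib.parse import urlencode


def filter_param(field, *values):
    """Build one f[] parameter, e.g. f[]=category|Writings (hypothetical helper)."""
    value = "|".join([field, *values])
    # Keep brackets and pipes readable; other reserved characters are escaped
    return urlencode([("f[]", value)], safe="[]|")


def expand_date_range(start, end=None):
    """Mirror the documented expansion of a date filter into (first_day, last_day)."""
    if end is None:
        end = start  # one date is used as both start and end
    if len(start) == 4:
        start += "-01-01"  # a bare year starts on January 1
    if len(end) == 4:
        end += "-12-31"  # a bare year ends on December 31
    return start, end


print(filter_param("date", "1884", "1887"))
print(expand_date_range("1884", "1887"))
```

Values containing spaces or commas, such as "Cather, Willa", are URL-encoded by filter_param before being sent.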
Returns context of text match results
Defaults:
hl=true
hl_chars=100
hl_fl=text
hl_num=3
Disabling Highlighting
If you wish to turn highlighting off:
hl=false
Characters
This sets the number of characters that will be returned around a highlight match
hl_chars=number
hl_chars=100
Field List
Highlights will always be returned for the text field, but if you are searching multiple fields, you may wish to see highlights on those fields as well. You do not need to send text when specifying additional fields.
hl_fl=field1,field2,field3
hl_fl=annotations
hl_fl=annotations,catherwords
Number
The number of highlights returned per field. If you set hl_num=3 for text and annotations, you could receive up to 6 highlights, 3 from each field.
hl_num=number
hl_num=1
hl_num=5
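The arithmetic behind hl_num can be sketched as follows. The helper names are hypothetical; the only facts taken from the text above are that text is always highlighted and that hl_num applies per field.

```python
from urllib.parse import urlencode


def highlight_params(extra_fields=(), num=3, chars=100):
    """Build the highlight portion of a query string (hypothetical helper)."""
    params = []
    if extra_fields:
        # text does not need to be listed; it is always highlighted
        params.append(("hl_fl", ",".join(extra_fields)))
    params.append(("hl_num", num))
    params.append(("hl_chars", chars))
    return urlencode(params, safe=",")


def max_highlights(extra_fields=(), num=3):
    """Upper bound on highlights per document: hl_num for each highlighted field."""
    fields = {"text", *extra_fields}  # a set, so listing text twice is harmless
    return len(fields) * num


print(highlight_params(["annotations"], num=3))
print(max_highlights(["annotations"], num=3))
```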
Specify the order of results
Defaults:
When no sort or partial sort is supplied
- query present: sort by "relevancy" descending
- given term is "relevancy", no order provided: sort descending
- given term is not "relevancy", no order provided: sort ascending
You may pass multiple fields to be sorted. The first one appearing in the URL parameters will take precedence over the other(s).
sort[]=field|direction
sort[]=date|desc&sort[]=title|asc
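The default directions listed above can be made explicit on the client side. This is a minimal sketch with a hypothetical helper name; it simply fills in the direction the defaults describe when none is given, so that the request states the order outright.

```python
def sort_param(field, direction=None):
    """Build one sort[] parameter, filling in the documented default direction."""
    if direction is None:
        # relevancy defaults to descending, every other field to ascending
        direction = "desc" if field == "relevancy" else "asc"
    return f"sort[]={field}|{direction}"


# Earlier parameters take precedence, so list the primary sort first
query = "&".join([sort_param("date", "desc"), sort_param("title")])
print(query)
```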
Sorting facets
Please refer to the section on facets for information about how to sort facets, specifically.
Manual pagination of results
Defaults:
- start=0
- num=50
Note: Zero indexed
start=number
num=number
start=0&num=50 # returns first 50 results
start=50&num=50 # returns second 50 results
start=10&num=10 # returns second 10 results
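Because start is a zero-indexed offset, the value for a given page is (page - 1) * num. A minimal sketch, assuming pages are numbered from 1 (the helper name page_params is hypothetical):

```python
def page_params(page, num=50):
    """Return the start/num parameters for a 1-based page number."""
    if page < 1 or num < 1:
        raise ValueError("page and num must be positive integers")
    start = (page - 1) * num  # zero-indexed offset of the first result on the page
    return f"start={start}&num={num}"


print(page_params(1))
print(page_params(2))
print(page_params(2, num=10))
```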
Please refer to the Elasticsearch query string syntax for a list of all possibilities for text searching.
Basic search
q=word
q=multiple words
Multiple fields
By default, this will search the "text" field. You can specify a different field or multiple fields. If you add fields, make sure that your highlights include fields beyond "text".
q=field:word
q=field:word AND otherfield:other
q=field:word OR otherfield:other
Advanced search
q="phrase of words"
q=wildcard*
q=word OR other
q=word AND other
q=(word OR other) OR -nothanks
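Queries with spaces, quotes, colons, or parentheses need to be URL-encoded before they are sent. The sketch below shows one way to do that in Python; the helper name q_param is hypothetical, and the field names in the example are only illustrative.

```python
from urllib.parse import quote


def q_param(query):
    """URL-encode a query string expression for use as the q parameter."""
    # safe="" escapes every reserved character, including / : ( ) " and *
    return "q=" + quote(query, safe="")


print(q_param('"phrase of words"'))
print(q_param("title:horse AND text:sioux"))
```

Most HTTP clients perform this encoding automatically when parameters are passed separately from the URL.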