diff --git a/contentreport-filteredcollections.md b/contentreport-filteredcollections.md new file mode 100644 index 00000000..0df282b3 --- /dev/null +++ b/contentreport-filteredcollections.md @@ -0,0 +1,149 @@ +# Filtered Collections report +[Back to the list of all defined endpoints](endpoints.md) + +This endpoint provides aggregated statistics about the number of items per collection according to selected filters. + +NOTE: This is currently a beta feature. + + +**GET /api/contentreport/filteredcollections** + +The endpoint takes a `filters` query parameter whose value is a comma-separated list of filters +like the following: +``` +?filters=is_discoverable,has_multiple_originals,has_pdf_original +``` + +Alternatively, the comma-separated list can be replaced by a repetition of the `filters` parameter +for each requested filter: +``` +?filters=is_discoverable&filters=has_multiple_originals&filters=has_pdf_original +``` + + +Please see [below](#available-filters) for the list of available filters. + +## Report contents + +For each collection, the basic report consists of: +* name (label) and handle of the collection +* name (label) and handle of the parent community +* total number of items +* number of items matching all selected filters + +In addition, a `summary` element provides the total number of items and the total number of items matching all filters +for the whole repository. 
+ +An example JSON response document to `/api/contentreport/filteredcollections`: +```json +{ + "id": "filteredcollections", + "collections": [ + { + "label": "Collection 1", + "handle": "100/1", + "values": { + "is_discoverable": 23, + "has_multiple_originals": 3, + "has_pdf_original": 14 + }, + "community_label": "Community 1", + "community_handle": "20.500.11794/1", + "nb_total_items": 23, + "all_filters_value": 3 + }, + { + "label": "Collection 2", + "handle": "100/2", + "values": { + "is_discoverable": 1, + "has_multiple_originals": 0, + "has_pdf_original": 0 + }, + "community_label": "Community 1", + "community_handle": "20.500.11794/1", + "nb_total_items": 1, + "all_filters_value": 0 + }, + { + "label": "Collection 3", + "handle": "100/3", + "values": { + "is_discoverable": 1, + "has_multiple_originals": 0, + "has_pdf_original": 1 + }, + "community_label": "Community 1", + "community_handle": "20.500.11794/1", + "nb_total_items": 1, + "all_filters_value": 0 + } + ], + "summary": { + "label": null, + "handle": null, + "values": { + "is_discoverable": 25, + "has_multiple_originals": 3, + "has_pdf_original": 15 + }, + "community_label": null, + "community_handle": null, + "nb_total_items": 25, + "all_filters_value": 3 + }, + "type": "filtered-collections", + "_links": { + "self": { + "href": "http://localhost:8080/dspace-server/api/contentreport/filtered-collections" + } + } +} +``` + +## Available filters + +The available filters are as follows: + +* Item Property Filters + * `is_item`: Is Item - always true + * `is_withdrawn`: Withdrawn Items + * `is_not_withdrawn`: Available Items - Not Withdrawn + * `is_discoverable`: Discoverable Items - Not Private + * `is_not_discoverable`: Not Discoverable - Private Item +* Basic Bitstream Filters + * `has_multiple_originals`: Item has Multiple Original Bitstreams + * `has_no_originals`: Item has No Original Bitstreams + * `has_one_original`: Item has One Original Bitstream +* Bitstream Filters by MIME Type + * 
`has_doc_original`: Item has a Doc Original Bitstream (PDF, Office, Text, HTML, XML, etc) + * `has_image_original`: Item has an Image Original Bitstream + * `has_unsupp_type`: Has Other Bitstream Types (not Doc or Image) + * `has_mixed_original`: Item has multiple types of Original Bitstreams (Doc, Image, Other) + * `has_pdf_original`: Item has a PDF Original Bitstream + * `has_jpg_original`: Item has JPG Original Bitstream + * `has_small_pdf`: Has unusually small PDF + * `has_large_pdf`: Has unusually large PDF + * `has_doc_without_text`: Has document bitstream without TEXT item +* Supported MIME Type Filters + * `has_only_supp_image_type`: Item Image Bitstreams are Supported + * `has_unsupp_image_type`: Item has Image Bitstream that is Unsupported + * `has_only_supp_doc_type`: Item Document Bitstreams are Supported + * `has_unsupp_doc_type`: Item has Document Bitstream that is Unsupported +* Bitstream Bundle Filters + * `has_unsupported_bundle`: Has bitstream in an unsupported bundle + * `has_small_thumbnail`: Has unusually small thumbnail + * `has_original_without_thumbnail`: Has original bitstream without thumbnail + * `has_invalid_thumbnail_name`: Has invalid thumbnail name (assumes one thumbnail for each original) + * `has_non_generated_thumb`: Has non-generated thumbnail + * `no_license`: Doesn't have a license + * `has_license_documentation`: Has documentation in the license bundle +* Permission Filters + * `has_restricted_original`: Item has Restricted Original Bitstream + * `has_restricted_thumbnail`: Item has Restricted Thumbnail + * `has_restricted_metadata`: Item has Restricted Metadata + +Possible response status: + +* 200 OK - The specific report data was found, and the data has been properly returned. +* 403 Forbidden - In case of unauthorized user session. 
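
The two ways of passing `filters` described for `/api/contentreport/filteredcollections` can be sketched as follows. This is a minimal illustration, not part of the change itself: the host and context path (`http://localhost:8080/dspace-server`) are assumed from the example response, and the helper names `comma_separated_url` and `repeated_param_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: host and context path copied from the example response;
# adjust them for a real deployment.
BASE_URL = "http://localhost:8080/dspace-server/api/contentreport/filteredcollections"


def comma_separated_url(filters):
    """Build the URL with a single `filters` parameter holding a comma-separated list."""
    # safe="," keeps the commas literal instead of percent-encoding them.
    query = urlencode({"filters": ",".join(filters)}, safe=",")
    return f"{BASE_URL}?{query}"


def repeated_param_url(filters):
    """Build the URL with one `filters` parameter per requested filter."""
    # doseq=True expands the list into repeated key=value pairs.
    query = urlencode({"filters": list(filters)}, doseq=True)
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    selected = ["is_discoverable", "has_multiple_originals", "has_pdf_original"]
    print(comma_separated_url(selected))
    print(repeated_param_url(selected))
```

Both forms request the same set of filters; which one to use is mostly a matter of what the HTTP client makes convenient.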
diff --git a/contentreport-filtereditems.md b/contentreport-filtereditems.md new file mode 100644 index 00000000..1c7f807f --- /dev/null +++ b/contentreport-filtereditems.md @@ -0,0 +1,140 @@ +# Metadata query (aka Filtered Items) report +[Back to the list of all defined endpoints](endpoints.md) + +This endpoint provides a custom query API to select items from existing collections, +according to given Boolean and metadata filters. + +NOTE: This is currently a beta feature. + + +**GET /api/contentreport/filtereditems** + +The report parameters are described [below](#report-parameterization). + +Additionally, a `pageNumber` parameter is available to retrieve results starting at a given page +(according to `pageLimit`, the maximum number of items per page). Page numbering starts at 0. + +All parameters except `pageNumber` and `pageLimit` are repeatable. Multiple values can be expressed either +by repeating the corresponding parameter, e.g.: +``` +?filters=is_discoverable&filters=has_multiple_originals&filters=has_pdf_original +``` + +or by using a comma-separated value, e.g.: + +``` +?filters=is_discoverable,has_multiple_originals,has_pdf_original +``` + +except the `queryPredicates` parameter, which supports only parameter repetition for multiple values +to avoid any ambiguity in case a predicate value contains commas. + +Please see [below](#report-parameterization) for parameterization details. 
+ +## Report contents + +An example JSON response document to `/api/contentreport/filtereditems` (metadata removed for brevity): +```json +{ + "id": "filtereditems", + "items": [ + { + "id": "07e388ff-f22b-4d4f-8275-acab5c3edacc", + "uuid": "07e388ff-f22b-4d4f-8275-acab5c3edacc", + "name": "Enhancing the lubricity of an environmentally friendly Swedish diesel fuel MK1", + "handle": "20.500.11794/42", + "metadata": { + "dc.contributor.author": [ + { + "value": "Smith, John", + "language": null, + "authority": "6eee383a-f126-4705-9ffb-b4aa4832070e", + "confidence": 600, + "place": 0 + } + ], + "dc.publisher": [ + { + "value": "Elsevier", + "language": "fr_CA", + "authority": null, + "confidence": -1, + "place": 0 + } + ], + }, + "inArchive": true, + "discoverable": true, + "withdrawn": false, + "lastModified": "2015-11-23T17:30:21.463+00:00", + "entityType": "Publication", + "owningCollection": { + "id": "d98a828c-45c2-43d9-9861-6b9800bf14f5", + "uuid": "d98a828c-45c2-43d9-9861-6b9800bf14f5", + "name": "Articles publiés dans des revues avec comité de lecture", + "handle": "100/1", + "metadata": { + "dc.identifier.uri": [ + { + "value": "http://localhost:4000/handle/100/1", + "language": null, + "authority": null, + "confidence": -1, + "place": 0 + } + ], + "dspace.entity.type": [ + { + "value": "Publication", + "language": null, + "authority": null, + "confidence": -1, + "place": 0 + } + ] + }, + "type": "collection" + }, + "type": "item" + }, + { + ... + } + ], + "itemCount": 40, + "type": "filtereditemsreport", + "_links": { + "self": { + "href": "http://localhost:8080/dspace-server/api/contentreport/filtereditems" + } + } +} +``` + +## Report parameterization + +The parameters are specified as follows: + +* `collections`: The collection UUIDs where to search items. If none are provided, the whole repository is searched. +* `presetQuery`: This parameter is not used on the REST API side. It defines a predefined set of query predicates + defined in the Angular layer. 
+* `queryPredicates`: Predicates used to filter matching items. They can be predefined (see `presetQuery` above) + or defined specifically by the user. As mentioned above, this is the only parameter whose multiple values + cannot be given as a comma-separated list. +* `pageLimit`: Maximum number of items per page. +* `filters`: Supplementary filters. These are the same as those available in the Filtered Collections report. + Please see [/api/contentreport/filteredcollections](contentreport-filteredcollections.md#available-filters) for details. +* `additionalFields`: Fields to add to the basic report for each item included in the report. + +The _basic report_ mentioned above includes, for each item: + +* Sequential number (order of appearance in the report) +* UUID +* Parent collection +* Handle +* Title + +Possible response status: + +* 200 OK - The specific report data was found, and the data has been properly returned. +* 403 Forbidden - In case of unauthorized user session. diff --git a/endpoints.md b/endpoints.md index f67f45aa..fb26516e 100644 --- a/endpoints.md +++ b/endpoints.md @@ -56,6 +56,8 @@ * [/api/authz/features](features.md) * [/api/statistics](statistics.md) * [/api/tools/itemrequests](item-requests.md) +* [/api/contentreport/filteredcollections](contentreport-filteredcollections.md) +* [/api/contentreport/filtereditems](contentreport-filtereditems.md) ## Actuator endpoints The following endpoints are implemented using [Spring Boot Actuator](https://docs.spring.io/spring-boot/docs/current/reference/html/actuator.html#actuator.enabling) and are enabled by default: