FHIRPath is a path based extraction language, somewhat like XPath. It is optimised to work on FHIR resources. Operations are expressed in terms of the logical content of the resources, rather than their XML or JSON reprsentation. The expressions can (in theory) be converted to XPath, JSON or OCL equivalents
All FHIRPath operations result in a collection of Elements of various types. When the expression begins evaluating, there is a collection with one element in focus.
- Usage
FHIR Path is used in 5 places within the FHIR specifications
- search parameter paths - used to define what contents the parameter refers to
- slicing discriminator - used to indicate what element(s) define uniqueness
- invariants in ElementDefinition, used to apply co-occurance and other rules to the contents
- error message locations in OperationOutcome
- URL templates in Smart on FHIR's cds-hooks
Implementations may find other uses for this as well
Some features are only allowed in invariants - either because they are hard to implement, or because they make constraints on the contexts in which the path statements can be used.
- Path selection
The first fundamental operation is to select a set of elements by their path:
path.subPath.subPath - select all the elements on the path
e.g. To select all the phone numbers for a patient
telecom.value
when the focus is "Patient".
Paths and polymorphic items:
path.value[x].subPath - all kinds of value
path.valueQuantity.subPath - only values that are quantity
* - any child
** - any descendent
name* - recursive uses of the element
$context - the original context (see below for usage)
$resource - the original container resource (e.g. skip contained resources, but do not go past a root resource into a bundle, if it is contained in a bundle)
$parent - the element that contains $contex
Note: $resource and $parent are only allowed in invariants
There is a special case around the entry point, where the type of the entry point can be represented, but is optional. To illustrate this point, take the path
telecom.where(use = 'work').value
This can be evaluated as an expression on a Patient resource, or other kind of resources. However, for natural human use, expressions are often prefixed with the name of the context in which they are used:
Patient.telecom.where(use = 'work').value
These 2 expressions have the same outcome, but when evaluating the second, the evaluation will only produce results when used on a Patient resource.
- FHIR Types
FHIR Paths are always run against an Element or a Resource, and always produce collections of Elements or Resources. All elements in a collection will always have a FHIR type. The evaluation engine will automatically elevate constants to the right FHIR type.
To illustrate this point, take the following path expression:
Patient.name.text
This produces a collection of elements with the type "string", as defined by the FHIR data types. Literal values such as
"test string"
0
0.1
true
false
are automatically converted to the appropriate FHIR types (string, integer, decimal, and boolean respectively).
Note: string constants are surrounded by either " or ', and use json type escaping internally. String constants are the only place that non-ascii characters are allowed in the expression. Unicode characters may, but do not need to be, escaped using the \u syntax.
- Boolean evaluations
Collections can be evaluated as booleans in logical tests in criteria. When a collection is implicited converted to a boolean then:
- if it has a single item that is a boolean:
- it has the value of the boolean
- if it is empty
- it is false
- else
- it is true
Note that collections never contain null objects
This same principle applies when using the path statement in invariants.
- Functions
In addition to selecting subelements, functions can be performed on the list. Functions are names that are followed by a () with zero or more parameters, separated by ','
As an example:
telecom.where(use = 'home').value.empty()
This returns a collection of a single boolean element that contains true if there is no home telephone numbers.
The parameters are expressions that are evaluated with respect to the collection that the operation applies to, which can be modified by the special paths given in section 2.
The following operations are defined:
true if the collection is empty
Returns the opposite of the boolean evaluation of a collection
Filter the collection to only those elements that meet the stated criteria expression. Expressions are evaluated with respect to the elements in the context. If the criteria are true, they are included in the result collection.
true if all items in the collection meet the criteria (also true if the collection is empty). The criteria are evaluated for each item in the collection.
true if any items in the collection meet the criteria (and false if the collection is empty). The criteria is evaluated for each item in the collection.
Returns the indexth item in the collection (0 based index)
Returns a collection containing the first item in the list. Equivalent to .item(0)
Returns a collection containing the last item in the list
Returns a collection containing all but the first item in the list
Returns a collection containing all but the first num items in the list
Returns a collection containing the first num items in the list
Returns a part of the string.
Returns a collection with a single value which is the integer count of the collection
Converts a string to an integer (empty collection if it's not a proper integer)
Filters the list to only include elements with a string value that starts with the specified content. Note that only primitive elements have a string representation
Returns the length of characters used to represent the value (primitive types only) (does not include syntactical escapes). Returns the longest item in the collection Note that only primitive elements have a string representations
Returns a boolean for whether all the items in the collection match the given regex (which is a string - usually a constant)
Note that only primitive elements have a string representation
A simpler variation of matches that returns a boolean for a matching character sequence
Returns true if all the elements in the list are distinct when using the relative paths (simple paths only with no functions). If the elements in the list are primitives, this can be used with no paths (e.g. .distinct())
for each item in the collection, if it is a Reference, locate the target of the reference, and add it to the resulting collection. Else, ignore it
Note: distinct() and resolve() are only allowed in the context of invariants, and need only be implemented by path evaluators that are used for testing invariants
- Operations
The following operators are allowed to be used between path expressions (e.g. expr op expr):
True if the left collection is equal to the right collection
- equality is determined by comparing all the properties of the children
- todo: does order matter in sub-collections?
- todo: does precision matter?
- typically, this operator is used with a single fixed values. This means that Patient.telecom.system = 'phone' will return an empty collection if there is more than one telecom with a use typically, you'd want Patient.telecom.where(system = 'phone')
true if the collections are the same
- string evaluation is not case sensitive
- order doesn't matter
the inverse of the equals reaction
the inverse of the equivalent reaction
- this and the other 3 order related operations can only be used for strings, codes, integers, and decimals
- unless there is only one item in each collection (left and right) this returns an empty collection
- code evaluation is strictly lexical, not based on any defined meaning of order
- comparisons involving other types always return an empty collection
merge the two collections into a single list, eliminating any duplicate values (e.g. equal)
test whether all the itesm in the left collection are in the right collection.
- order does not matter
- if the right collection is a URI, and it refers to a value set, value set membership testing will be performed
left and right are converted to booleans (see #3 above) and a boolean true or false is the outcome following normal and rules
left and right are converted to booleans (see #3 above) and a boolean true or false is the outcome following normal or rules
left and right are converted to booleans (see #3 above) and a boolean true or false is the outcome following normal xor rules
if left and right are collections with a single element, and the element type is the same, add them together (integers, decimals, strings only)
if left and right are collections with a single element, and the element type is the same, subtract right from left (integers, decimals)
if left and right are collections with a single element, and the element type is the same, concatenate them (strings only)
Note that operations may be grouped in the classical sense by surrounding them with ()
Operator precedence:
#1 . (path/function invocation)
#2: *, /
#3: +, -, &, |
#4: =, , !=, !, >, <, >=, <=, in
#5: and, xor, or
- Fixed constants
A token introduced by a % defines a fixed constant that is automatically expanded into it's agreed value by the parser. Tokens consist of an introducing %, and then a sequence of characters that conforms to the id data type (1-64, a..z, A..Z, 0..9, -, .)
The following fixed values are set for all contexts:
%sct - url for snomed ct
%loinc - url for loinc
%ucum - url for ucum
%vs-[name] - full url for the provided HL7 value set with id [name]
%ext-[name] - full url for the provided HL7 extension with id [name]
Implementers should note that defining additional fixed constants is a formal extension point for the langauge. Implementation Guides are allowed to define their own fixed constants, and implementers should provide some appropriate configuration framework to allow these constants to be provided to the evaluation engine at run time.
e.g. %us-zip = "[0-9]{5}(-[0-9]{4}){0,1}"
- Formal grammar
expression = term (righthand)*
righthand = op term | "." function
term = "(" expression ")" | primitive | predicate
primitive = number | "true" | "false" | constant | string
predicate = item ('.' item)*
item = element "*"? | function | axis_spec
axis_spec = "*" | "**" | "$context" | "$parent" | "$resource"
element = ("a".."z" | "A".."Z") ("a".."z" | "A".."Z" | "0".."9")* "[x]"?
function = funcname "(" param_list? ")"
param_list = expression ("," expression)*
funcname = "empty" | "not" | "where" | "all" | "any" | "first" | "last" | "tail" | "count" | "asInteger" | "startsWith" | "length" | "matches" | "distinct" | "resolve" | "contains" | "substring" | "skip" | "take"
op = "*" | "/" | "+" | "-" | "|" | "&" | "=" | "~" | "!=" | "!~" | ">" | "<" | "<=" | ">=" | "in" | "and" | "or" | "xor"
number = number as used in json
string = sequence of unicode characters, surrounded by either ' or ", using json escaping.
constant = "%" ("a"-"z" | "A-Z" | "0-9") ("a-z" | "A-Z" | "0-9" | "-" | ".")*
Note:
- Unicode whitespace is not explicitly shown in the grammar, but may be present around all terms and tokens in the grammar.
- Non ascii (e.g. unicode) characters only allowed in constants (they do not need to be escaped in constants, but can be)
- Mapping Extensions
** This section is speculative **
There is an additional mode for this language, where it is used for mapping from one content model to another.
When performing mapping:
a. path definition
The primary statement is used to define a path that will be instantiated
when the mapping is performed. The elements referred to in the mspping
statement are created rather then selected.
example:
Composition.section.title
will cause a Composition to be created, then a section to be added to
it, and then a title will be created in that (though no value is yet
assigned to the title)
When creating items, if an item has cardinality ..1, and it already
exist, the existing one will be reused. For items that have
cardinality > 1, new items are created.
No functions or operators are allowed in the defining path.
The first name in a defining expression may be either a variable name
(see below) or a resource or datat type type name - in which case a
new resource or datatype is created
b. value being mapped: a new special value is defined that refers to the
value being mapped:
$value - this is an object that is an instance of a fhir data type.
The type of $value is only known at run-time. When mapping from
XML or json, only primitives types can be mapped. When mapping from
CDA or v2, see the mapping table below for data type mapping
c. conditional mappings
Conditional mappings are encountered faily commonly. There is a special
mapping criteria:
if (expression) defining-expression
The expression has no context. Typically, the expression references $value
example:
if ($value = "23446-4") Composition.section.title
A section will be added only if the value (a CDA code in this case)
has the given value
d. crossing resource boundaries
mapping statements are allowed to cross resource boundaries, and
create new resources, using ->
The first name after a -> is always a resource type name that
defines what type of resource is being referenced (this is always
true, even if only one type of resource is valid)
example:
Composition.section.entry->AllergyIntolerance
When the section is added, an entry will be created. Also, an
AllergyIntolerance resource will be created, and the entry will
be updated to point to the newly created resource
Mapping statements do not assign ids to the resource - this is
done by the run time infrastructure
e. variables
mapping paths can be assigned to variables for later reuse
by prefixing the defining path with
[var-name] :=
This retains the item for later re-use in other mappings.
A variable name can be any combination of letters and digits
starting with a lowercase letter
example:
cmp := Composition
create a Composition, and remember it as "cmp"
ai1 := cmp.section.entry->AllergyIntolerance
take the composition already created, add a section to it,
then create an AllergyIntolerance resource - and remember that
as ai1
It is an error to re-use variable names in a set of mappings
(is it? - be conservative for now, make it an error, and see)
f. properties
properties of items in a defined path can have additional
properties assigned to them as they are created. This is done by:
owner[name1 := value; name2 := value].child
rules:
names refer to elements defined on the owner
values must be constants, expressions that start with $value, or constants
when executing, the following type conversions are implicit:
string -> integer (if it is an integer)
example:
lc := Coding[system = %loinc; code = "12345-6"]
cmp := Composition[code := lc]
g, Formal Grammar:
grammar:
mapping = (condition) (assignment) map (";" mapping)
condition = if "(" (expression) ")"
assignment = varname ":="
varname = ("a".."z") ("a".."z" | "A".."Z" | "0".."9")*
map = (step | type) ( ("." step) | ( "->" type) )*
step = element ("[" properties "]")?
type = typename ("[" properties "]")?
properties = property (";" property)*
property = element ":=" term
typename = ("A".."Z") ("a".."z" | "A".."Z" | "0".."9")*
inside term, one change:
- add axis_spec : $value
h. data type mapping table
v2:
v3:
ANY -> e := Element
ANY.nullFlavor -> e.extension[url = http://hl7.org/fhir/StructureDefinition/iso21090-nullFlavor; code = $value]
BL -> boolean
BN -> boolean
ED -> a := Attachment
mediaType -> ct = a.contentType[$value]
charset -> ct[ct + $value]
language -> a[language := $value]
compression -> n/a // have to uncompress
reference -> a[url := $value]
integrityCheck -> a[hash := $value]
integrityCheckAlgorithm -> // have to check this and recalcuate as SHA1
thumbnail -> n/a
ST -> string
CD -> CodeableConcept
CS -> code
CO -> ??
CV -> Coding
CE -> CodeableConcept
SC -> CodeableConcept
II -> Identifier
TEL -> Telecom
AD -> Address
EN -> HumanName
TN -> HumanName
PN -> HumanName
ON -> HumanName
INT -> integer
REAL -> decimal
RTO -> Ratio
PQ -> PhysicalQuantity
MO -> Quantity
TS -> dateTime
SET -> n/a
LIST -> n/a
BAG -> n/a
IVL<TS> -> Period
IVL<PQ> -> Range
HIST -> n/a
UVP -> n/a
PIVL -> Timing
EIVL -> Timing
GTS -> Timing
PPD -> n/a
i. known problems with mappings:
* when 2 different elements contribute to the same primitive value
* when 2 different paths use the same variable, and the variable cross a resource boundary, and either is optional