Skip to content

Commit

Permalink
Merge pull request #23 from JuliaMolSim/less_abstract
Browse files Browse the repository at this point in the history
simplify types
  • Loading branch information
rkurchin authored Nov 17, 2021
2 parents 0007177 + f0896f8 commit b3b7a45
Show file tree
Hide file tree
Showing 9 changed files with 226 additions and 202 deletions.
61 changes: 36 additions & 25 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,46 +12,57 @@ AtomsBase is an abstract interface for representation of atomic geometries in Ju
* file I/O with standard formats (.cif, .xyz, ...)
* numerical tools: sampling, integration schemes, etc.
* automatic differentiation and machine learning systems
* visualization (e.g. plot recipes)

Currently, the design philosophy is to be as lightweight as possible, with only a small set of required function dispatches. We will also provide a couple of standard concrete implementations of the interface that we envision could be broadly applicable, but encourage developers to provide their own implementations as needed in new or existing packages.
Currently, the design philosophy is to be as lightweight as possible, with only a small set of required function dispatches to make adopting the interface into existing packages easy. We also provide a couple of standard concrete implementations of the interface that we envision could be broadly applicable, but encourage developers to provide their own implementations as needed in new or existing packages.

## Overview
AtomsBase defines a few abstract types used for specifying an atomic system. We will describe them briefly here, from the "top down." Users and/or prospective developers may also find `implementation_simple.jl` a useful reference for a simple concrete implementation of the interface.
The main abstract type introduced in AtomsBase `AbstractSystem{D,S}`. The `D` parameter indicates the number of spatial dimensions in the system, and `S` indicates the type identifying an individual species in this system.

**A remark on SoA vs. AoS:** The "struct-of-arrays" (SoA) vs. "array-of-structs" (AoS) is a common design dilemma in representations of systems such as these. We have deliberately designed this interface to be _agnostic_ to how a concrete implementation chooses to structure its data. Some specific notes regarding how implementations might differ for these two paradigms are included below.
The main power of the interface comes from predictable behavior of several core functions to access information about a system. Various categories of such functions are described below.

A way to think about this broadly is that the difference amounts to the ordering of function calls. For example, to get the position of a single particle in an AoS implementation, the explicit funciton chaining would be `position(getindex(sys))` (i.e. extract the single struct representing the particle of interest and query its position), while for SoA, one should prefer `getindex(position(sys))` (extract the array of positions, then index into it for a single particle). The beauty of an abstract interface in Julia is that these details can be, in large part, abstracted away through method dispatch such that the end user sees the same expected behavior irrespective of how things are implemented "under the hood."
### System geometry
Functions that need to be dispatched:
* `bounding_box(::AbstractSystem{D})::SVector{D,SVector{D,<:Unitful.Length}}`: returns `D` vectors of length `D` that describe the "box" in which the system lives
* `boundary_conditions(::AbstractSystem{D})::SVector{D,BoundaryCondition})`: returns a vector of length `D` of `BoundaryCondition` objects to describe what happens at the edges of the box

### System
An object describing a system should be a subtype of `AbstractSystem` and will in general store identifiers, positions, and (if relevant) velocities of the particles that make it up, as well as a set of coordinate bounds.
Functions that will work automatically:
* `get_periodic`: returns a vector of length `D` of `Bool`s for whether each dimension of the system has periodic boundary conditions
* `n_dimensions`: returns `D`, the number of spatial dimensions of the system

An `AbstractSystem` takes several type parameters: `D` (an integer representing the dimensionality of the system), `ET<:AbstractElement`, and `AT<:AbstractParticle` (see below) to indicate what types of particles these are, and requires dispatch of the following functions:
* `bounding_box(::AbstractSystem{D})::SVector{D, SVector{D, <:Unitful.Length}}`: should return a set of basis vectors describing the boundaries of the coordinates in which the system resides
* `boundary_conditions(::AbstractSystem{D})::SVector{D, BoundaryCondition}`: returns the boundary conditions corresponding to each spatial dimension of the system (see below for more on the `BoundaryCondition` type)
### Iteration and Indexing over systems
There is a presumption of the ability to somehow extract an individual component (e.g. a single atom or molecule) of this system, though there are no constraints on the type of this component. To achieve this, an `AbstractSystem` object is expected to implement the Julia interfaces for [iteration](https://docs.julialang.org/en/v1/manual/interfaces/#man-interface-iteration) and [indexing](https://docs.julialang.org/en/v1/manual/interfaces/#Indexing) in order to access representations of individual components of the system. Some default dispatches of parts of these interfaces are already included, so the minimal set of functions to dispatch in a concrete implementation is `Base.getindex` and `Base.length`, though it may be desirable to customize additional behavior depending on context.

`AbstractSystem` implements [Julia's indexing interface](https://docs.julialang.org/en/v1/manual/interfaces/#Indexing), thus the following functions must also be dispatched:
* `Base.getindex(::AbstractSystem, ::Int)`
* `Base.size(::AbstractSystem)`
### System state and properties
The only required properties to be specified of the system is the species of each component of the system and the positions and velocities associated with each component. These are accessed through the functions `species`, `position`, and `velocity`, respectively. The default dispatch of these functions onto an `AbstractSystem` object is as a broadcast over it, which will "just work" provided the indexing/iteration interfaces have been implemented (see above) and the functions are defined on individual system components.

By default, the `position` and `velocity` functions dispatch onto `AbstractSystem` objects as a broadcast over the system (e.g. `position(sys::AbstractSystem) = position.(sys)`, invoking the dispatch of `position` onto the `AbstractParticle` type parameter of the system). In an SoA implementation, custom dispatches should probably be included to avoid the construction of the particle objects.
As a concrete example, AtomsBase provides the `StaticAtom` type as this is anticipated to be a commonly needed representation. Its implementation looks as follows:
```julia
struct StaticAtom{D,L<:Unitful.Length}
position::SVector{D,L}
element::Element
end
StaticAtom(position, element) = StaticAtom{length(position)}(position, element)
position(atom::StaticAtom) = atom.position
species(atom::StaticAtom) = atom.element
```
Note that the default behavior of `velocity` is to return `missing`, so it doesn't need to be explicitly dispatched here.

**A remark on fractional coordinates:** In many contexts, it is desirable to work with fractional coordinates, i.e. unitless multiples of some reference coordinate system (typically this reference is what would be returned by the `box` function above). Because a focus of this interface is interoperability and unitless, non-Cartesian coordinates introduce ambiguity, we've chosen to impose that coordinates returned by functions in the interface be unitful quantities in a Cartesian frame. Of course, a concrete implementation of the interface could (and likely would need to) also dispatch additional functions for working with fractional coordinates.
The two sample implementations provided in this repo are both "composed" of `StaticAtom` objects; refer to them as well as `sandbox/aos_vs_soa.jl` to see how this can work in practice.

### Particles
Particle objects are subtypes of `AbstractParticle`, and also take a type parameter that is `<:AbstractElement` (see below) to indicate how particles are identified.
The `position`, `velocity`, and `species` functions also have indexed signatures to extract a given element directly, as in, for example:

In the SoA case, particle objects would only ever be explicitly constructed when `getindex` is invoked on the system, as a "view" into the system.
```julia
position(sys, i) # position of `i`th particle in `sys`
```
Currently, this syntax only supports linear indexing.

Particle objects should dispatch methods of the following functions:
* `position(::AbstractParticle)::AbstractVector{<: Unitful.Length}`
* `element(::AbstractParticle)::AbstractElement`
### Struct-of-Arrays vs. Array-of-Structs
The "struct-of-arrays" (SoA) vs. "array-of-structs" (AoS) is a common design dilemma in representations of systems such as these. We have deliberately designed this interface to be _agnostic_ to how a concrete implementation chooses to structure its data. Some specific notes regarding how implementations might differ for these two paradigms are included below.

And, optionally,
* `velocity(::AbstractParticle)::AbstractVector{<: Unitful.Velocity}`, which defaults to returning `nothing` if not dispatched.
### Elements
Subtypes of `AbstractElement` serve as identifiers of particles. As the name suggests, a common case would be a chemical element (e.g. for a DFT simulation). However, it could also contain more detailed isotopic/spin information if that is necessary, or be a molecule (e.g. in MD), or even a much larger-scale object!
A way to think about this broadly is that the difference amounts to the ordering of function calls. For example, to get the position of a single particle in an AoS implementation, the explicit function chaining would be `position(getindex(sys))` (i.e. extract the single struct representing the particle of interest and query its position), while for SoA, one should prefer an implementation like `getindex(position(sys))` (extract the array of positions, then index into it for a single particle). The beauty of an abstract interface in Julia is that these details can be, in large part, abstracted away through method dispatch such that the end user sees the same expected behavior irrespective of how things are implemented "under the hood."

For simulation purposes, the utility of this object would be to serve as a sufficiently specific "index" into a database of simulation parameters (e.g. a pseudopotential library for DFT, or interparticle potential parameters for MD). Because we envision a chemical element being an extremely common case, we provide an explicit subtype in the form of `Element`, which makes use of [PeriodicTable.jl](https://github.com/JuliaPhysics/PeriodicTable.jl) to access information such as atomic numbers and masses.
To demonstrate this, we provide two simple concrete implementations of the interface in `implementation_soa.jl` and `implementation_aos.jl` to show how analogous systems could be constructed within these paradigms (including the `getindex` implementations mentioned above). See also `sandbox/aos_vs_soa.jl` for how they can actually be constructed and queried.

### Boundary Conditions
Finally, we include support for defining boundary conditions. Currently included are `Periodic` and `DirichletZero`. There should be one boundary condition specified for each spatial dimension represented.
10 changes: 7 additions & 3 deletions sandbox/aos_vs_soa.jl
Original file line number Diff line number Diff line change
Expand Up @@ -2,13 +2,17 @@ using AtomsBase
using StaticArrays
using Unitful

import PeriodicTable

periodic_table = PeriodicTable.elements

box = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]u"m"
bcs = [Periodic(), Periodic(), Periodic()]
positions = [0.25 0.25 0.25; 0.75 0.75 0.75]u"m"
elements = [ChemicalElement(:C), ChemicalElement(:C)]
elements = [periodic_table[:C], periodic_table[:C]]

atom1 = SimpleAtom(SVector{3}(positions[1,:]),elements[1])
atom2 = SimpleAtom(SVector{3}(positions[2,:]),elements[2])
atom1 = StaticAtom(SVector{3}(positions[1,:]),elements[1])
atom2 = StaticAtom(SVector{3}(positions[2,:]),elements[2])

aos = FlexibleSystem(box, bcs, [atom1, atom2])
soa = FastSystem(box, bcs, positions, elements)
Expand Down
3 changes: 1 addition & 2 deletions src/AtomsBase.jl
Original file line number Diff line number Diff line change
@@ -1,9 +1,8 @@
module AtomsBase

include("interface.jl")
include("atoms.jl")
include("implementation_soa.jl")
include("implementation_aos.jl")

# Write your package code here.

end
35 changes: 35 additions & 0 deletions src/atoms.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
#
# Extra stuff only for Systems composed of atoms
#

export StaticAtom, AbstractAtomicSystem
export atomic_mass, atomic_number, atomic_symbol, atomic_property

struct StaticAtom{D,L<:Unitful.Length}
position::SVector{D,L}
element::Element
end
StaticAtom(position, element) = StaticAtom{length(position)}(position, element)
position(atom::StaticAtom) = atom.position
species(atom::StaticAtom) = atom.element

function StaticAtom(position, symbol::Union{Integer,AbstractString,Symbol,AbstractVector})
StaticAtom(position, elements[symbol])
end

function Base.show(io::IO, a::StaticAtom)
print(io, "StaticAtom: $(a.element.symbol)")
end

const AbstractAtomicSystem{D} = AbstractSystem{D,Element}

atomic_symbol(a::StaticAtom) = a.element.symbol
atomic_mass(a::StaticAtom) = a.element.atomic_mass
atomic_number(a::StaticAtom) = a.element.number
atomic_property(a::StaticAtom, property::Symbol) = getproperty(a.element, property)

atomic_symbol(sys::AbstractAtomicSystem) = atomic_symbol.(sys)
atomic_number(sys::AbstractAtomicSystem) = atomic_number.(sys)
atomic_mass(sys::AbstractAtomicSystem) = atomic_mass.(sys)
atomic_property(sys::AbstractAtomicSystem, property::Symbol)::Vector{Any} =
atomic_property.(sys, property)
30 changes: 23 additions & 7 deletions src/implementation_aos.jl
Original file line number Diff line number Diff line change
@@ -1,32 +1,48 @@
# Example implementation using as few function definitions as possible
#
# Implementation of AtomsBase interface in an array-of-structs style
#

using StaticArrays

export FlexibleSystem

# TODO Switch order of type arguments?
struct FlexibleSystem{D,ET<:AbstractElement,AT<:AbstractParticle{ET}} <: AbstractSystem{D,ET,AT}
box::SVector{D,<:SVector{D,<:Unitful.Length}}
struct FlexibleSystem{D,S,L<:Unitful.Length,AT} <: AbstractSystem{D,S}
box::SVector{D,<:SVector{D,L}}
boundary_conditions::SVector{D,<:BoundaryCondition}
particles::Vector{AT}
FlexibleSystem(box, boundary_conditions, particles) = new{
length(boundary_conditions),
eltype(elements),
eltype(eltype(box)),
eltype(particles),
}(
box,
boundary_conditions,
particles,
)
end
# convenience constructor where we don't have to preconstruct all the static stuff...
function FlexibleSystem(
box::Vector{Vector{L}},
box::Vector{<:AbstractVector{L}},
boundary_conditions::Vector{BC},
particles::Vector{AT},
) where {BC<:BoundaryCondition,L<:Unitful.Length,AT<:AbstractParticle}
) where {BC<:BoundaryCondition,L<:Unitful.Length,AT}
D = length(box)
if !all(length.(box) .== D)
throw(ArgumentError("box must have D vectors of length D"))
end
FlexibleSystem(SVector{D,SVector{D,L}}(box), SVector{D,BoundaryCondition}(boundary_conditions), particles)
FlexibleSystem(
SVector{D,SVector{D,L}}(box),
SVector{D,BoundaryCondition}(boundary_conditions),
particles,
)
end

bounding_box(sys::FlexibleSystem) = sys.box
boundary_conditions(sys::FlexibleSystem) = sys.boundary_conditions

function Base.show(io::IO, ::MIME"text/plain", sys::FlexibleSystem)
function Base.show(io::IO, sys::FlexibleSystem)
print(io, "FlexibleSystem with ", length(sys), " particles")
end

Expand Down
40 changes: 0 additions & 40 deletions src/implementation_simple.jl

This file was deleted.

33 changes: 20 additions & 13 deletions src/implementation_soa.jl
Original file line number Diff line number Diff line change
@@ -1,18 +1,19 @@
# Example implementation using as few function definitions as possible
#
# Implementation of AtomsBase interface in a struct-of-arrays style.
#

using StaticArrays

export FastSystem

struct FastSystem{D,ET<:AbstractElement,AT<:AbstractParticle{ET},L<:Unitful.Length} <:
AbstractSystem{D,ET,AT}
struct FastSystem{D,S,L<:Unitful.Length} <: AbstractSystem{D,S}
box::SVector{D,SVector{D,L}}
boundary_conditions::SVector{D,BoundaryCondition}
positions::Vector{SVector{D,L}}
elements::Vector{ET}
elements::Vector{S}
# janky inner constructor that we need for some reason
FastSystem(box, boundary_conditions, positions, elements) =
new{length(boundary_conditions),eltype(elements),SimpleAtom,eltype(eltype(positions))}(
new{length(boundary_conditions),eltype(elements),eltype(eltype(positions))}(
box,
boundary_conditions,
positions,
Expand All @@ -22,11 +23,11 @@ end

# convenience constructor where we don't have to preconstruct all the static stuff...
function FastSystem(
box::AbstractVector{Vector{L}},
box::Vector{<:AbstractVector{L}},
boundary_conditions::AbstractVector{BC},
positions::AbstractMatrix{M},
elements::AbstractVector{ET},
) where {L<:Unitful.Length,BC<:BoundaryCondition,M<:Unitful.Length,ET<:AbstractElement}
elements::AbstractVector{S},
) where {L<:Unitful.Length,BC<:BoundaryCondition,M<:Unitful.Length,S}
N = length(elements)
D = length(box)
if !all(length.(box) .== D)
Expand All @@ -46,16 +47,22 @@ function FastSystem(
)
end

function Base.show(io::IO, ::MIME"text/plain", sys::FastSystem)
function Base.show(io::IO, sys::FastSystem)
print(io, "FastSystem with ", length(sys), " particles")
end

bounding_box(sys::FastSystem) = sys.box
boundary_conditions(sys::FastSystem) = sys.boundary_conditions

# Base.size(sys::FastSystem) = size(sys.particles)
Base.length(sys::FastSystem{D,ET,AT}) where {D,ET,AT} = length(sys.elements)
Base.length(sys::FastSystem{D,S}) where {D,S} = length(sys.elements)

Base.getindex(sys::FastSystem{D,S,L}, i::Int) where {D,S,L} =
StaticAtom{D,L}(sys.positions[i], sys.elements[i])

# these dispatches aren't strictly necessary, but they make these functions ~2x faster
position(s::FastSystem) = s.positions
species(s::FastSystem) = s.elements

# first piece of trickiness: can't do a totally abstract dispatch here because we need to know the signature of the constructor for AT
Base.getindex(sys::FastSystem{D,ET,SimpleAtom}, i::Int) where {D,ET} =
SimpleAtom{D}(sys.positions[i], sys.elements[i])
position(s::FastSystem, i) = s.positions[i]
species(s::FastSystem, i) = s.elements[i]
Loading

0 comments on commit b3b7a45

Please sign in to comment.