
RFC: Relax FieldHandler #625

Closed · termi-official wants to merge 5 commits into master from do/relax-fieldhandler

Conversation

termi-official (Member) commented Mar 21, 2023

This change should eliminate the internal collect calls. If a user strongly depends on the ordering while only having a cellset::Set{Int}, then the collect should be done manually, once, during construction (instead of several times internally), as sketched below.
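
For illustration, a minimal sketch of collecting once during setup (addcellset!/getcellset are existing Ferrite API; the cellset name "inclusions" and the predicate are hypothetical):

using Ferrite

grid = generate_grid(Quadrilateral, (10, 10))
addcellset!(grid, "inclusions", x -> x[1] > 0)   # hypothetical subdomain
cellset = getcellset(grid, "inclusions")         # ::Set{Int}, unordered
ordered_cells = sort!(collect(cellset))          # collect (and sort) once, up front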

I also noticed that the FieldHandler was marked mutable. Is this legacy, or is there some deeper reason downstream in some framework utilizing Ferrite as a backend?

TODOs

  • OrderedSet for grid cellset
  • Test coverage
  • Docs
  • Devdocs

codecov-commenter commented Mar 21, 2023

Codecov Report

Patch coverage: 57.14% and project coverage change: +0.08% 🎉

Comparison is base (cc81e6c) 92.39% compared to head (309d385) 92.48%.


Additional details and impacted files
@@            Coverage Diff             @@
##           master     #625      +/-   ##
==========================================
+ Coverage   92.39%   92.48%   +0.08%     
==========================================
  Files          29       29              
  Lines        4446     4444       -2     
==========================================
+ Hits         4108     4110       +2     
+ Misses        338      334       -4     
Impacted Files                      Coverage Δ
src/PointEval/PointEvalHandler.jl   92.69% <ø> (ø)
src/Dofs/MixedDofHandler.jl         82.22% <36.36%> (+2.30%) ⬆️
src/Dofs/ConstraintHandler.jl       95.85% <75.00%> (-0.20%) ⬇️
src/L2_projection.jl                100.00% <100.00%> (ø)


@@ -306,7 +306,7 @@ function add_prescribed_dof!(ch::ConstraintHandler, constrained_dof::Int, inhomo
     return ch
 end

-function _add!(ch::ConstraintHandler, dbc::Dirichlet, bcfaces::Set{Index}, interpolation::Interpolation, field_dim::Int, offset::Int, bcvalue::BCValues, cellset::Set{Int}=Set{Int}(1:getncells(ch.dh.grid))) where {Index<:BoundaryIndex}
+function _add!(ch::ConstraintHandler, dbc::Dirichlet, bcfaces::Set{Index}, interpolation::Interpolation, field_dim::Int, offset::Int, bcvalue::BCValues, cellset=1:getncells(ch.dh.grid)) where {Index<:BoundaryIndex}
Member:

Here it might be nice to keep the Set, since it is used on L319 to check cellidx ∉ cellset. Still fast with a UnitRange, but if you pass a vector it will be slow(er).
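
For illustration, a rough membership-cost sketch (sizes are hypothetical):

using BenchmarkTools

haystack_range = 1:10^6
haystack_vec   = collect(haystack_range)
@btime 999_999 ∉ $haystack_range   # O(1): two bound comparisons for a UnitRange
@btime 999_999 ∉ $haystack_vec     # O(N): linear scan through the Vector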

Member:

Some rough benchmarks seem to suggest that it is beneficial to create the set if length(needles) > length(haystack), i.e. if the number of lookups we will do is larger than the size of the collection. Still pretty fast though, so perhaps just leave it as a vector if one passes that. In most cases it will already be a set, since it will come from a grid cellset, which are sets.
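
A minimal sketch of that heuristic (maybe_set is a hypothetical helper, not Ferrite API):

# Build a Set only when the expected number of lookups exceeds the
# collection size; otherwise the O(N) construction cost is not amortized.
function maybe_set(haystack::AbstractVector{Int}, nlookups::Int)
    return nlookups > length(haystack) ? Set(haystack) : haystack
end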

Member Author:

I also thought about this, but decided to always create the Set. We gain significant performance benefits if the Vector is large, while we definitely get some overhead from the Set construction. However, for small vectors (where it may be enough to not convert to a Set upfront) the overhead is rather small, so searching for a good heuristic might not really be worth it here. I would still go for the conversion to Set, as the speedup can be really significant for large problems (O(N) vs. O(1) expected per lookup).
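
A rough sketch of the tradeoff being described (sizes are hypothetical):

using BenchmarkTools

needles      = rand(1:10^6, 10^5)
haystack_vec = collect(1:10^6)
haystack_set = Set(haystack_vec)                   # one-time O(N) construction cost
@btime count(n -> n ∈ $haystack_vec, $needles);    # linear scan per lookup
@btime count(n -> n ∈ $haystack_set, $needles);    # expected O(1) per lookup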

Comment on lines 243 to 246

     # cellnumbers = issorted(fh.cellset) ? sort(collect(fh.cellset)) : fh.cellset
     nextdof = _close!(
         dh,
-        cellnumbers,
+        fh.cellset,
Member Author:

@fredrikekre I am not sure what we should do here. I can add another dispatch to _close! which just creates a sorted vector if the input cellset is of type Set{Int}, to preserve the exact same dof assignment across all cases. However, the closing algorithm does not require a vector to work, which is why I decided to allow different dof assignments depending on the specific dof handler at hand.

Member Author:

If you think that the sorted order is really that essential, we can also recommend SortedSet or OrderedSet as data structures.

Member:

Perhaps it would be good to make it more deterministic. Maybe we should make the cellsets in the grid vectors, or keep both a vector version and a set version of the data.

Member Author:

Mhh. Maybe we can add the set type as a type parameter to the grid.

Member:

But then all sets need to have the same type. I am thinking we could maintain both data structures but lazily initialize them as needed; alternatively, use sorted vectors and make sets when needed. Probably we can get away with doing it only in a few places, such as in close! etc.

Member Author:

I think it makes sense to have the slow path for Set and optimized paths for UnitRanges, which I will document as the recommended way to set up subdomains, if we agree on this. UnitRanges keep the caches hot, are small, and all operations are fast (see the sketch below). What do you think?
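
A small sketch of why UnitRanges are attractive here (the subdomain split is hypothetical): with contiguous cell numbering each subdomain stores only two integers and supports constant-time membership checks.

n = 500_000
subdomain_a = 1:n                  # two Ints, ordered, O(1) membership
subdomain_b = (n + 1):1_000_000
@assert 42 ∈ subdomain_a && 42 ∉ subdomain_b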

Member:

We can use OrderedSet (after JuliaCollections/OrderedCollections.jl#100) and make sure to sort! in addcellset! for example.
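
A minimal sketch of that approach, assuming an OrderedCollections release that includes the sort!(::OrderedSet) method added by that PR:

using OrderedCollections

cells = OrderedSet([5, 2, 9, 1])   # iterates in insertion order
sort!(cells)                       # reorder in place to ascending order
collect(cells)                     # [1, 2, 5, 9]: deterministic iteration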

Dennis Ogiermann added 2 commits March 22, 2023 13:16
…ribution across different data types for the cell numbering.
termi-official (Member Author) commented Mar 22, 2023

Following the guidelines in #629, here is the first set of benchmarks.

Benchmark                  | Master                                  | PR
---------------------------|-----------------------------------------|------------------------------------------
Construction               |                                         |
MixedDofHandler(grid)      | 1.671 ms (8 allocations: 15.26 MiB)     | 1.544 ms (8 allocations: 15.26 MiB)
add!(dh, name, dim[, ip])  | 34.745 ms (11 allocations: 18.00 MiB)   | 1.610 μs (3 allocations: 192 bytes)
close!(dh)                 | 2.318 s (297 allocations: 791.71 MiB)   | 2.200 s (294 allocations: 776.46 MiB)
Constraints                |                                         |
ConstraintHandler(dh)      | 9.930 μs (12 allocations: 992 bytes)    | 9.500 μs (12 allocations: 992 bytes)
add!(ch, dbc::Dirichlet)   | 3.885 ms (8121 allocations: 4.04 MiB)   | 40.786 ms (8132 allocations: 22.05 MiB) → 3.437 ms (8127 allocations: 4.04 MiB)
close!(ch)                 | 2.156 s (12071 allocations: 288.21 MiB) | 1.894 s (12071 allocations: 288.21 MiB)

Reproducer

using Ferrite
using BenchmarkTools

grid = generate_grid(Quadrilateral, (1000, 1000));
∂Ω = union(
    getfaceset(grid, "left"),
    getfaceset(grid, "right"),
    getfaceset(grid, "top"),
    getfaceset(grid, "bottom"),
);
dbc = Dirichlet(:v, ∂Ω, (x, t) -> [0, 0]);
ip_v = Lagrange{2,RefCube,2}();
ip_s = Lagrange{2,RefCube,1}();

@btime MixedDofHandler($grid);
@btime add!(dh, $(:v), $2, $ip_v) setup=(dh=MixedDofHandler($grid)) evals=1;

function setup_dh(grid, T)
    dh = T(grid)
    add!(dh, :v, 2, Lagrange{2,RefCube,2}()) 
    add!(dh, :s, 1, Lagrange{2,RefCube,1}()) 
    return dh
end

@btime close!(dh) setup=(dh=setup_dh($grid, $MixedDofHandler)) evals=1;

function setup_dhclosed(grid, T)
    dh = T(grid)
    add!(dh, :v, 2, Lagrange{2,RefCube,2}()) 
    add!(dh, :s, 1, Lagrange{2,RefCube,1}())
    close!(dh)
    return dh
end
@btime ConstraintHandler(dh)  setup=(dh=setup_dhclosed($grid, $MixedDofHandler)) evals=1;

function setup_ch(grid, T)
    dh = setup_dhclosed(grid, T)
    return ConstraintHandler(dh)
end
@btime add!(ch, $dbc) setup=(ch = setup_ch($grid, $MixedDofHandler)) evals=1;

function setup_ch2(grid, T, dbc)
    dh = setup_dhclosed(grid, T)
    ch = ConstraintHandler(dh)
    add!(ch, dbc)
    return ch
end
@btime close!(ch) setup=(ch = setup_ch2($grid, $MixedDofHandler, $dbc)) evals=1;

@fredrikekre fredrikekre added this to the 0.4.0 milestone Mar 23, 2023
fredrikekre added a commit that referenced this pull request Mar 29, 2023
This changes `FieldHandler.cellset` to be a sorted `OrderedSet` instead
of a `Set`. This ensures that loops over sub-domains are done in
ascending cell order.

Since e.g. cells, node coordinates, and dofs are stored in ascending
cell order, this gives a significant performance boost to loops over
sub-domains, i.e. assembly-style loops. In particular, this removes the
performance gap between `MixedDofHandler` and `DofHandler` in the
`create_sparsity_pattern` benchmark in #629.

This is a minimal/initial step towards #625 that can be done before the
`DofHandler` merge and rework of `FieldHandler`/`SubDofHandler`.
fredrikekre added a commit that referenced this pull request Mar 29, 2023
This changes `FieldHandler.cellset` to be a `BitSet` (which is sorted)
instead of a `Set`. This ensures that loops over sub-domains are done in
ascending cell order.

Since e.g. cells, node coordinates and dofs are stored in ascending cell
order, this gives a significant performance boost to loops over
sub-domains, i.e. assembly-style loops. In particular, this removes the
performance gap between `MixedDofHandler` and `DofHandler` in the
`create_sparsity_pattern` benchmark in #629.

This is a minimal/initial step towards #625 that can be done before the
`DofHandler` merge and rework of `FieldHandler`/`SubDofHandler`.
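
For reference, a quick sketch of the property this relies on: BitSet stores its members in a bit vector, so iteration is always in ascending order.

bs = BitSet([9, 1, 5])
@assert collect(bs) == [1, 5, 9]   # members come out sorted regardless of insertion order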
fredrikekre added a commit that referenced this pull request Mar 30, 2023
This patch creates a `BitSet` of `FieldHandler.cellset` in loops, in
particular in `close!(::DofHandler)` and
`create_sparsity_pattern(::DofHandler)`. Since `BitSet`s are sorted, this
ensures that these loops are done in ascending cell order, which gives a
performance boost due to better memory locality.

This is an even smaller change than #654 (and #625), which should be
completely non-breaking since the type of `FieldHandler.cellset` is not
changed. Larger refactoring, such as using `BitSet` or `OrderedSet`, will
be done in the `FieldHandler`/`SubDofHandler` rework.
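
A minimal sketch of the loop-time pattern described (set contents hypothetical):

cellset = Set([3, 1, 2])           # unordered, as currently stored in FieldHandler
for cellidx in BitSet(cellset)     # sorted copy built once, at loop time
    # cells are visited in ascending order: 1, 2, 3
end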
termi-official (Member Author):
I will redo this on master if there is still interest.

@termi-official termi-official deleted the do/relax-fieldhandler branch October 23, 2023 14:28
termi-official (Member Author):
Okay, we should really address this:

using Ferrite, SparseArrays, BenchmarkTools
grid = generate_grid(Quadrilateral, (1000, 1000));

ip = Lagrange{RefQuadrilateral, 1}()
qr = QuadratureRule{RefQuadrilateral}(2)
cellvalues = CellValues(qr, ip);

dh = DofHandler(grid)
add!(dh, :u, ip)
close!(dh);

K = create_sparsity_pattern(dh)

ch = ConstraintHandler(dh);

∂Ω = union(
    getfaceset(grid, "left"),
    getfaceset(grid, "right"),
    getfaceset(grid, "top"),
    getfaceset(grid, "bottom"),
);

dbc = Dirichlet(:u, ∂Ω, (x, t) -> 0)
add!(ch, dbc);

close!(ch)

function assemble_element!(Ke::Matrix, fe::Vector, cellvalues::CellValues)
    n_basefuncs = getnbasefunctions(cellvalues)
    ## Reset to 0
    fill!(Ke, 0)
    fill!(fe, 0)
    ## Loop over quadrature points
    for q_point in 1:getnquadpoints(cellvalues)
        ## Get the quadrature weight
        dΩ = getdetJdV(cellvalues, q_point)
        ## Loop over test shape functions
        for i in 1:n_basefuncs
            δu  = shape_value(cellvalues, q_point, i)
            ∇δu = shape_gradient(cellvalues, q_point, i)
            ## Add contribution to fe
            fe[i] += δu * dΩ
            ## Loop over trial shape functions
            for j in 1:n_basefuncs
                ∇u = shape_gradient(cellvalues, q_point, j)
                ## Add contribution to Ke
                Ke[i, j] += (∇δu ⋅ ∇u) * dΩ
            end
        end
    end
    return Ke, fe
end

function assemble_global(cellvalues::CellValues, K::SparseMatrixCSC, dh::DofHandler)
    ## Allocate the element stiffness matrix and element force vector
    n_basefuncs = getnbasefunctions(cellvalues)
    Ke = zeros(n_basefuncs, n_basefuncs)
    fe = zeros(n_basefuncs)
    ## Allocate global force vector f
    f = zeros(ndofs(dh))
    ## Create an assembler
    assembler = start_assemble(K, f)
    ## Loop over all cells
    for cell in CellIterator(dh)
        ## Reinitialize cellvalues for this cell
        reinit!(cellvalues, cell)
        ## Compute element contribution
        assemble_element!(Ke, fe, cellvalues)
        ## Assemble Ke and fe into K and f
        assemble!(assembler, celldofs(cell), Ke, fe)
    end
    return nothing
end


function assemble_global2(cellvalues::CellValues, K::SparseMatrixCSC, dh::DofHandler)
    ## Allocate the element stiffness matrix and element force vector
    n_basefuncs = getnbasefunctions(cellvalues)
    Ke = zeros(n_basefuncs, n_basefuncs)
    fe = zeros(n_basefuncs)
    ## Allocate global force vector f
    f = zeros(ndofs(dh))
    ## Create an assembler
    assembler = start_assemble(K, f)
    ## Loop over all cells
    for cell in CellIterator(dh.subdofhandlers[1])
        ## Reinitialize cellvalues for this cell
        reinit!(cellvalues, cell)
        ## Compute element contribution
        assemble_element!(Ke, fe, cellvalues)
        ## Assemble Ke and fe into K and f
        assemble!(assembler, celldofs(cell), Ke, fe)
    end
    return nothing
end

@btime assemble_global(cellvalues, K, dh);
@btime assemble_global2(cellvalues, K, dh);

528.541 ms (12 allocations: 7.65 MiB)
1.499 s (15 allocations: 7.65 MiB)
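
A hedged sketch of a possible workaround in the spirit of the earlier discussion, assuming CellIterator(dh, cells) accepts a vector of cell indices (as used for subdomain iteration): sort the subdomain's cellset once outside the hot loop and iterate that, restoring ascending-order traversal.

## Hypothetical workaround: collect and sort the subdomain's cellset once,
## then iterate in ascending cell order.
sorted_cells = sort!(collect(dh.subdofhandlers[1].cellset))
for cell in CellIterator(dh, sorted_cells)
    reinit!(cellvalues, cell)
    ## ... assemble as in assemble_global above
end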
