-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VarInfo refactor #5
Comments
So the There is room for having alternative implementations of |
This is also in line with TuringLang/Turing.jl#888. |
I think we should prioritize this issue, see also my comment in TuringLang/Turing.jl#1199 (comment):
Regarding @mohamed82008's comment above,
I don't think it's necessary to know anything about the order or shape or number of random variables ahead of time. We just shouldn't try to create a container ahead of time but add samples whenever they appear. We just run the model function, and whenever a new sample is needed, we add it to the "NamedTuple on steroids". At this time point we know the type and the value of the sample. |
"NamedTuple on steroids": so, persistent, strongly-typed extensible records? With lenses? |
Closing in favour of AbstractPPL discussion. |
Directly copied from discussion in TuringLang/Turing.jl#965 to avoid side-tracking the PR:
I'm trying to work through this compiler change and have some thoughts on VarInfo that I would appreciate some feedback on. I've also left some naming-related comments that I believe would be helpful to address for readability's sake.
Regarding the internals of VarInfo / MetaData, there are a lot of things where you need to pull elements out of a field of metadata based on metadata.idcs[vn] e.g.
md.ranges[md.idcs[vn]]]
md.orders[md.idcs[vn]]
md.gids[md.idcs[vn]]
md.dists[md.idcs[vn]]
It's also the case that everything is linearised when running the model. While it's clearly necessary to be able to linearise the state of the model at some point for the current implementation of HMC, it's not clear to me that it need to be the default mode of operation, and appears to add considerable complexity to the internals.
As such I would like to propose the following design change for discussion (perhaps a separate issue is required - again, advice appreciated):
introduce a new type Variable with fields val, gid (possibly not this? Presumably it really belongs in the gibbs sampler object), name, dist, and any other information that needs to be associated with a particular random variable on a particular run. The key point here is that information specific to a particular RV is stored with a particular RV, so there would be no need to linearise the data into a vector or construct an index map from location(s) in vector to random variable.
refactor VarInfo to contain a Vector (or other container, obviously one that is amenable to Zygote-based AD and type-stability) of Variables, in addition to logp and num_produce.
This would:
enhance readability and reduce mental overhead (imho)
enable better mixing of types. Currently if the values of every variable have the same type (usually Float64) the vals vector is very simple. The proposed design avoids the need to store all of the things in one monolithic vector, thus avoiding the need for a widely typed Vector of values.
separate out linearisation of model state from running of the model. We would implement a separate function, call it to_vec, which would efficiently convert a VarInfo instance into a vector of numbers (potentially of a number of different types, or perhaps to a single mutually acceptable type), in addition to a function that converts a vector of the same length as the one produced into a VarInfo object that is identical to the current VarInfo object in every way other than the vals of the Variables. Separating things out like this has the distinct advantage that if linearisation is not required for a particular algorithm (e.g. MH, SMC, Gibbs sampling, particle MCMC etc) then linearisation never needs to happen. Conversely, if linearisation is required then it can still be achieved efficiently. Moreover, I anticipate linearisation becoming a less important feature over time as the AD systems develop and begin to rely on ChainRules / ChainRulesCore, which play nicely with interestingly-structured data types. It's also worth noting that indexing, especially scalar, is a particularly inefficient operation for most reverse-mode AD systems (inc. Zygote, Tracker) and this is unlikely to fundamentally change any time soon. As such, avoiding the conversion between linear Vector and structured data may well be desirable from a performance perspective as it is indexing-heavy.
create opportunities for mixing static and dynamic structure. It seems like this aught to make mixing of static and dynamic model components more straightforward, but this is more of a hunch than anything else as I'm not sure what that mixing really looks like at the minute.
Choice over the type of the Variable "vector". It could actually be a Vector, but this doesn't seem likely to lead to type stable code. Alternatively the variables could be stored in a NamedTuple if we're able to enumerate all possible variable names that could crop up in a programme from the programme code itself (this seems feasible to me, but I might well be missing something). If there exists an example where this doesn't hold (i.e. the number of possible variable names is dynamic -- it's fine for some of them not to exist dynamically) that would be very much appreciated.
Would appreciate any feedback. If I'm missing something glaringly obvious here, then possibly I'm missing some general piece of understanding regarding the internals.
The text was updated successfully, but these errors were encountered: