
Support for multiple views? #196

Closed
vigji opened this issue May 28, 2024 · 11 comments
Labels
enhancement New optional feature

Comments

@vigji
Collaborator

vigji commented May 28, 2024

Hi people! Thanks for getting this going; a tool like this is truly needed!

Is your feature request related to a problem? Please describe.
I have a dataset with single animals detected from multiple cameras. As far as I can see, there is no option for a "views" dimension in the default data format. (I tried to look for discussions of this issue; sorry if I missed something.)

Describe the solution you'd like
A syntax within the movement framework to load together multiple files referring to different cameras in the same experiment. (I don't know what the long-term plans are in terms of file structures to load, so I am not necessarily committed to my current organisation.)

Describe alternatives you've considered
I am currently loading them and concatenating the xarrays after loading (which is totally reasonable, so no rush for this!)

Additional context
This kind of data structure would be needed by anyone in the process of extracting 3D poses. It is only needed as long as you do not yet have 3D coordinates, which, in my case (but I assume in anyone's), is the end goal of having multiple views. So, if you want to support only the final, "clean" data where some triangulation of sorts has already happened, I would understand!

@vigji vigji added the enhancement New optional feature label May 28, 2024
@niksirbi
Member

Hi @vigji, thanks for bringing this up!
We should have this discussion. It was always on our roadmap to also support 3D poses (see #124), but so far I've always conceptualised it as importing the already triangulated data in 3D (x,y,z) space.

What you are asking for is different: you want to be able to load multiple 2D camera views of the same animal and keypoints, right? I see why this is helpful. For example, it would enable us (or users) to apply custom triangulation functionalities, or to employ clever outlier detection tools that rely on multi-view consistency (see #145). It would be useful to hear what sort of functionalities you have in mind for such data in movement.
We've already had some related discussions with @neuroinformatics-unit/behaviour, so would be keen to hear others' opinions on this.

On the technical level, it would be feasible. Probably the "cleanest" option would be to add an optional views dimension to the existing xarray objects, as you suggested. This would be the most logically consistent with our data model, I think.

@niksirbi
Member

A question regarding the practicalities:

Are such multi-view predictions typically stored in a single file or as separate files? In the second case, the loader would have to accept a list of files as input.

An alternative would be to load one view the "usual" way, and then have an add_view() method that would load one additional file/view at a time. This method could also take care of creating the views dimension the first time it was called.
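
A rough sketch of what that could look like (untested; the function name, signature, and the "view_0" placeholder for the first view are all made up):

```python
from pathlib import Path

import xarray as xr

from movement.io import load_poses


def add_view(
    ds: xr.Dataset, file_path: Path, view_name: str, source_software: str, fps: float
) -> xr.Dataset:
    """Append one camera view to a dataset, creating the "views" dim if needed."""
    new_view = load_poses.from_file(file_path, source_software, fps=fps)
    if "views" not in ds.dims:
        # First call: give the existing single-view dataset a "views" dimension.
        # The original view needs a name too; "view_0" is just a placeholder.
        ds = ds.expand_dims(views=["view_0"])
    return xr.concat([ds, new_view.expand_dims(views=[view_name])], dim="views")
```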

@vigji
Collaborator Author

vigji commented May 29, 2024

It would be useful to hear what sort of functionalities you have in mind for such data in movement.

My idea would be to stick to the movement data representation for a pipeline for coordinate triangulation and (later) filtering. The pipeline would roughly look like:
DLC labelling (multiple camera views in, 2D predictions per view out) -> triangulation (2D predictions from multiple views in, 3D coords out).

Probably the "cleanest" option would be to add an optional views dimension to the existing xarray objects, as you suggested. This would be the most logically consistent with our data model, I think.
💯

Are such multi-view predictions typically stored in a single file or as separate files? In the 2nd case the loader would have to accept a list of files as inputs.

Yes, they are separate files! A loader accepting an iterable would be my favourite option, although a dict would also allow me to include the view names in the same variable. But again, I don't want to impose my own idiosyncratic constraints here & I'd be happy to go with any solution (including an .add_view() method, although that would have the slightly annoying step of requiring a view name for the first file, if the dataset is created without a views dimension by default).

@niksirbi
Member

Hmm, I like the dict option for passing view_name: filepath pairs. It could look something like this:

def from_multi_view(files_dict: dict[str, Path], source_software: str, fps: float):

Under the hood, this would call load_poses.from_file() for each dict entry, concatenate the loaded datasets along a new views axis, and return the concatenated dataset.
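
Roughly like this (an untested sketch; it assumes xarray.concat handles our datasets cleanly):

```python
from pathlib import Path

import xarray as xr

from movement.io import load_poses


def from_multi_view(
    files_dict: dict[str, Path], source_software: str, fps: float
) -> xr.Dataset:
    """Load one file per camera view and stack them along a new "views" dim."""
    datasets = [
        load_poses.from_file(path, source_software, fps=fps)
        for path in files_dict.values()
    ]
    # The dict keys become the coordinate labels of the new "views" dimension
    return xr.concat(datasets, dim="views").assign_coords(
        views=list(files_dict.keys())
    )
```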

I like this option, because it means that using from_file() will keep working the same way, returning a dataset without a views dim. You only get the extra dim when you explicitly call from_multi_view(), so it won't be a breaking change.

Do you want to take a shot at implementing it, @vigji? You already seem to have relevant test data and some concatenation code that works. I'm also fine with you opening a draft PR with a quick-and-dirty implementation, and we can help you refine it (or we can do the refinement for you if you wish).

With regards to the triangulation, I guess you'd also need to load camera calibration parameters for that to work. If you end up doing something like that with movement, and you think your solution can be sufficiently generalised, we may also be interested in incorporating that as a feature. I know that anipose and sleap-anipose already do this, so we may be duplicating effort, but it could be convenient for users to do everything within the movement data framework (as you seem to be willing).
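
Just to illustrate the triangulation step itself, here's a minimal two-view sketch using OpenCV (not movement or anipose API; P0/P1 would come from your calibration):

```python
import cv2

# P0, P1: 3x4 camera projection matrices from calibration.
# pts0, pts1: 2xN float arrays of matching 2D keypoints from each view.
def triangulate_two_views(P0, P1, pts0, pts1):
    points_h = cv2.triangulatePoints(P0, P1, pts0, pts1)  # 4xN homogeneous coords
    return (points_h[:3] / points_h[3]).T  # Nx3 Euclidean 3D coordinates
```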

@niksirbi
Member

I know that anipose and sleap-anipose already do this, so we may be duplicating effort, but it could be convenient for users to do everything within the movement data framework (as you seem to be willing).

To clarify, if this can be achieved by movement depending on, say, anipose, that would be the preferred option.

@vigji
Collaborator Author

vigji commented May 29, 2024

Do you want to take a shot at implementing it, @vigji? You already seem to have relevant test data and some concatenation code that works. I'm also fine with you opening a draft PR with a quick-and-dirty implementation, and we can help you refine it (or we can do the refinement for you if you wish).

Yes, happy to try that! :)

With regards to the triangulation, I guess you'd also need to load camera calibration parameters for that to work. If you end up doing something like that with movement, and you think your solution can be sufficiently generalised, we may also be interested in incorporating that as a feature. I know that anipose and sleap-anipose already do this, so we may be duplicating effort, but it could be convenient for users to do everything within the movement data framework (as you seem to be willing).

I am currently using some triangulation code that people have kindly shared with me. I'll bring the topic up with them and get back to you on this point, as I do not want to appropriate other people's pipeline! Will let you know.

@vigji vigji closed this as completed May 29, 2024
@github-project-automation github-project-automation bot moved this from 🤔 Triage to ✅ Done in movement progress tracker May 29, 2024
@niksirbi
Member

I am currently using some triangulation code that people have kindly shared with me. I'll bring the topic up with them and get back to you on this point, as I do not want to appropriate other people's pipeline! Will let you know.

Of course! There's also no absolute need or rush on this.

@adamltyson
Member

Is it worth reopening this? Others may want to chime in about this general concept.

@vigji vigji reopened this May 30, 2024
@vigji
Collaborator Author

vigji commented May 30, 2024

Also, I know that this is not the right place for the question (feel free to redirect me): would you consider a native way of dropping the individuals axis for people loading single-animal data? (Sorry if it exists and I've missed it.)

@niksirbi
Member

Also, I know that this is not the right place for the question (feel free to redirect me): would you consider a native way of dropping the individuals axis for people loading single-animal data? (Sorry if it exists and I've missed it.)

There are some xarray-native methods to do that: xarray.Dataset.squeeze() will drop all dimensions of length 1 (you can also pass a specific dimension name to be explicit), while xarray.Dataset.drop_dims() removes a dimension together with all variables that use it. Maybe we should make that more prominent in the docs.
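
For example, assuming `ds` is a movement dataset holding a single individual:

```python
# Drop ALL dimensions of length 1:
ds = ds.squeeze()

# Or target only the "individuals" dimension, also discarding
# its now-scalar coordinate:
ds = ds.squeeze("individuals", drop=True)
```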

@niksirbi niksirbi moved this from ✅ Done to 📝 Todo in movement progress tracker May 30, 2024
@niksirbi
Member

Also, FYI we have a zulip channel where you are welcome to ask all such questions, or to discuss anything movement-related.

@sfmig sfmig moved this from 📝 Todo to 🚧 In Progress in movement progress tracker Aug 27, 2024
@sfmig sfmig linked a pull request Aug 27, 2024 that will close this issue
@niksirbi niksirbi moved this from 🚧 In Progress to ✅ Done in movement progress tracker Nov 27, 2024
@niksirbi niksirbi closed this as completed by moving to ✅ Done in movement progress tracker Nov 27, 2024