
Read video from memory newapi #6771

Merged
merged 20 commits into from
Oct 21, 2022

Conversation

jdsgomes
Contributor

Adds read-from-memory functionality to the new video API.

Addresses #6603

@jdsgomes jdsgomes changed the title Read video from mew newapi Read video from memory newapi Oct 15, 2022
@vedantroy

vedantroy commented Oct 19, 2022

This segfaults on certain videos. Here's a partial-reproduction script (the videos are stored in a pandas dataframe):

import pandas as pd

# df2 = pd.read_pickle("df2.pkl")
df = pd.read_pickle("df.pkl")
# print the # of rows in the df
print(len(df))
# print the keys in the df
print(df.keys())
# print the first row in the df
print(df.iloc[0])


# print the type of the 1st value in the 1st row
first_vid = df.iloc[0][0]
print(f"Length: {len(first_vid)}")
print(f"Type: {type(first_vid)}")

# write first_vid to a file
with open("test.mp4", "wb") as f:
    f.write(first_vid)

import itertools
import copy

import torch
from torchvision.io import VideoReader
import torchvision

def clip_from_start(buf: bytes, expected_frames: int):
    # import av
    # import io
    # buffer = io.BytesIO(buf)
    # container = av.open(buffer)
    # i = 0 
    # for frame in container.decode(video=0):
    #     print(type(frame))
    #     i += 1
    #     print(i)
    #     pass

    tensor = torch.frombuffer(buf, dtype=torch.uint8)
    tensor = copy.deepcopy(tensor)
    # torchvision.io.read_video()
    rdr = VideoReader(tensor)
    sampled_frames = list(itertools.islice(iter(rdr), expected_frames))
    if len(sampled_frames) != expected_frames:
        return None
    data = []
    for frame in sampled_frames:
        data.append(frame["data"])
    return torch.stack(data, dim=0)

clip = clip_from_start(first_vid, 2)
print(clip.shape)

I'm working to get approval of the public copy of the data.

In the meantime, the error is:

test.py:41: UserWarning: The given buffer is not writable, and PyTorch does not support non-writable tensors. This means you can write to the underlying (supposedly non-writable) buffer using the tensor. You may want to copy the buffer to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:1563.)
  tensor = torch.frombuffer(buf, dtype=torch.uint8)
malloc(): corrupted top size
Aborted (core dumped)
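Regarding the UserWarning (separate from the crash itself): a minimal sketch of one way to avoid it, assuming the video bytes arrive as a read-only `bytes` object, is to hand `torch.frombuffer` a writable `bytearray` copy instead of deep-copying the resulting tensor afterwards. The buffer contents here are a hypothetical stand-in:

```python
# `buf` stands in for the read-only video bytes from the dataframe.
buf = b"\x00\x01\x02\x03"

# An independent, writable copy of the bytes: torch.frombuffer accepts
# it without warning, and writes through the tensor cannot corrupt the
# original object.
writable = bytearray(buf)

assert memoryview(buf).readonly           # `bytes` is read-only
assert not memoryview(writable).readonly  # `bytearray` is writable

# tensor = torch.frombuffer(writable, dtype=torch.uint8)  # no warning
```

This only addresses the warning; the `malloc(): corrupted top size` abort is a separate problem in the decoder path.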

@jdsgomes
Contributor Author

jdsgomes commented Oct 19, 2022

This segfaults on certain videos. Here's a partial-reproduction script (the videos are stored in a pandas dataframe):

@vedantroy Thanks for reporting this.
I cannot reproduce the error with my test videos. Is there any possibility that the data is corrupt?
Otherwise, please let me know when you can share the sample video so we can verify and debug.
Another option is to save the video to a file and read it from the path, to see whether the same behaviour occurs.

Thanks
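The save-to-file suggestion above can be sketched as a quick sanity check (the byte contents are a hypothetical stand-in for the dataframe value): dump the in-memory bytes to a temporary file and confirm the round trip is lossless before blaming the decoder.

```python
import os
import tempfile

# Hypothetical stand-in for the bytes pulled out of the dataframe.
video_bytes = b"\x00\x00\x00\x18ftypmp42"

# Write the bytes to a temporary .mp4 file.
with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
    f.write(video_bytes)
    path = f.name

# Confirm the bytes survive the round trip.
with open(path, "rb") as f:
    assert f.read() == video_bytes
os.unlink(path)

# From here one could compare VideoReader(path) against the in-memory
# VideoReader(tensor) frame by frame.
```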

Contributor

@bjuncek bjuncek left a comment

Looks good to me. I'd love to make the tests stricter but otherwise good to go :)

assert len(vr_pts) == len(vr_pts_mem)

# compare the frames and ptss
for i in range(len(vr_frames)):
Contributor

Given that this is like-for-like decoding (using verifiably the same backend), we should probably insist on either a max delta or exact equality.

Contributor Author

Actually, there are two videos for which this is not true; one is SOX5yA1l24A.mp4.
I verified with the code below, reading the same file twice; even then the results are not the same:

In [3]: reader1 = VideoReader('test/assets/videos/SOX5yA1l24A.mp4')
In [4]: reader2 = VideoReader('test/assets/videos/SOX5yA1l24A.mp4')
In [5]: reader1_frames = []
In [6]: reader2_frames = []

In [8]: for r1frame in reader1:
   ...:     reader1_frames.append(r1frame["data"])

In [9]: for r2frame in reader2:
   ...:     reader2_frames.append(r2frame["data"])

In [12]: for i in range(len(reader1_frames)):
    ...:     if torch.all(reader1_frames[i].eq(reader2_frames[i])):
    ...:         print(f"Frame {i} is not equal")

Frame 0 is not equal
Frame 2 is not equal
Frame 3 is not equal
Frame 4 is not equal
Frame 5 is not equal
Frame 6 is not equal
Frame 7 is not equal
Frame 8 is not equal
Frame 9 is not equal
Frame 10 is not equal
Frame 11 is not equal
Frame 12 is not equal
....

Do you know whether decoding could be non-deterministic for this particular format?

Contributor

This is actually surprising to me. Does this non-deterministic behaviour also occur on the stable API (if we specify the same pts)?

Contributor

Yup; we use the same backend.
@jdsgomes can you check if it happens on the pyav?

Maybe this is a cue that we should reconsider our selection of testing videos.
Specifically, I think we should re-sample them from whatever the most-used datasets are these days, plus some FFmpeg FATE samples.

Contributor Author

I confirmed that this is the case.

Contributor Author

Please note that there is a bug in the test code above (it checks for equality but prints "not equal"); even so, some frames genuinely differ. This happens with both the pyav and video_reader backends.
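For reference, a fixed sketch of that comparison loop: plain Python lists stand in for the decoded frame tensors here, and with real tensors the condition would be `not torch.equal(a, b)`.

```python
# Hypothetical decoded frames from two reads of the same file.
reader1_frames = [[1, 2], [3, 4], [5, 6]]
reader2_frames = [[1, 2], [9, 9], [5, 6]]

# Collect indices of frames that are NOT equal (the original loop
# printed "not equal" inside the branch where the frames matched).
mismatched = [
    i
    for i, (a, b) in enumerate(zip(reader1_frames, reader2_frames))
    if a != b  # with tensors: not torch.equal(a, b)
]
print(mismatched)  # [1]
```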

Contributor

might be related: https://video.stackexchange.com/questions/18902/why-ffmpeg-works-non-deterministically

In this case, I think this test is fine, since it can handle the non-deterministic behaviour.
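The "max delta" idea raised earlier in the review could look like the following sketch. The helper name and thresholds are made up for illustration; with real tensors one would reach for `torch.testing.assert_close(a, b, atol=..., rtol=0)` instead.

```python
def frames_close(a, b, max_delta=2):
    """Return True if every per-element difference is within max_delta."""
    return all(abs(x - y) <= max_delta for x, y in zip(a, b))

# Small decoder jitter passes...
assert frames_close([100, 101, 102], [101, 100, 102], max_delta=2)
# ...but a genuinely different frame fails.
assert not frames_close([100, 110, 102], [100, 100, 102], max_delta=2)
```

A bounded tolerance keeps the test strict about real regressions while absorbing the benign non-determinism discussed above.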

@@ -165,7 +165,7 @@ struct MediaFormat {
struct DecoderParameters {
// local file, remote file, http url, rtmp stream uri, etc. anything that
// ffmpeg can recognize
std::string uri;
std::string uri{std::string()};
Contributor

Why this?
(I don't know C++ well, so I'm just curious.)

Contributor Author

Just to initialise uri as an empty string. (A default-constructed std::string is already empty, so the explicit initialiser is redundant but harmless.)

@@ -76,8 +76,14 @@ def test_frame_reading(self, test_video):

# compare the frames and ptss
for i in range(len(vr_frames)):
assert av_pts[i] == vr_pts[i]
torch.test.assert_equal(av_frames[i], vr_frames[i])
Contributor Author

This was added in the wrong section, so I reverted the commit. My comment about why we cannot make the read-from-memory tests strict still holds.

Contributor

@YosuaMichael YosuaMichael left a comment

Thanks @jdsgomes, I think overall the PR looks good. I will approve first to unblock, but let's wait for approval from @bjuncek before merging.

Also, regarding @vedantroy's comment: I can't reproduce it on my side. In my opinion, we can merge first and open an issue for the possible segfault so we can come back to it later.

Contributor

@bjuncek bjuncek left a comment

Approving to unblock, and starting a new issue to track video assets.

@jdsgomes jdsgomes merged commit 06ad05f into pytorch:main Oct 21, 2022
@github-actions

Hey @jdsgomes!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

facebook-github-bot pushed a commit that referenced this pull request Oct 21, 2022
Summary:
* add tensor as optional param

* add init from memory

* fix bug

* fix bug

* first working version

* apply formatting and add tests

* simplify tests

* fix tests

* fix wrong variable name

* add path as optional parameter

* add src as optional

* address pr comments

* Fix warning messages

* address pr comments

* make tests stricter

* Revert "make tests stricter"

This reverts commit 6c92e94.

Reviewed By: YosuaMichael

Differential Revision: D40588171

fbshipit-source-id: 1c55b5ebd0bfdd3308931d218269a19481eb3ca9
5 participants