[Request] Add ability to save generated maps to a folder and reuse for a second run #97
I have received several requests for this. I am planning to support it. However, the processing time for depth estimation is a small part of the overall processing time. |
The main slowdown is the generation of the depth map, and if we can cache/save/store that, it can save HOURS when taking the source image and loading the depth map to create the final SBS image. I do believe this would be a HUGE bonus to this awesome software. Thank you for considering it |
My recommended workflow for testing or previewing is to set Max FPS=0.25.
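For example, assuming the GUI's Max FPS setting is spelled --max-fps on the CLI (a sketch, not verified against the current option list), a quick preview run might look like:

```
python -m iw3.cli -i ./input.mp4 -o ./preview --max-fps 0.25
```
|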
Would you consider it for the CLI tool, then? I had a look at the code, and adding this support to the feature where it reads the images from a folder (--input) could maybe be the starting point. Then, when looping through the source images, it could load from a MAPS folder if it exists and then produce the SBS image. Just thinking of ways to make it quicker for users. If this works, we could then investigate approaches for video. I would be happy to test.
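To make the suggestion concrete, here is a minimal sketch of that caching loop. This is not iw3's actual API: estimate_depth() and make_sbs() are hypothetical stand-ins for the real depth model and SBS renderer, and the folder names are made up.

```python
import os
from PIL import Image

def estimate_depth(rgb):
    # Hypothetical placeholder: the real code would run the depth model here.
    return rgb.convert("L")

def make_sbs(rgb, depth):
    # Hypothetical placeholder: the real code would render a proper SBS image here.
    sbs = Image.new("RGB", (rgb.width * 2, rgb.height))
    sbs.paste(rgb, (0, 0))
    sbs.paste(depth.convert("RGB"), (rgb.width, 0))
    return sbs

INPUT_DIR, MAPS_DIR, OUTPUT_DIR = "input", "maps", "output"
os.makedirs(MAPS_DIR, exist_ok=True)
os.makedirs(OUTPUT_DIR, exist_ok=True)

for name in sorted(os.listdir(INPUT_DIR)):
    rgb = Image.open(os.path.join(INPUT_DIR, name))
    map_path = os.path.join(MAPS_DIR, name)
    if os.path.exists(map_path):
        depth = Image.open(map_path)    # 2nd/3rd run: reuse the cached map
    else:
        depth = estimate_depth(rgb)     # first run: generate and cache it
        depth.save(map_path)
    make_sbs(rgb, depth).save(os.path.join(OUTPUT_DIR, name))
```
|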
For now, I am planning to develop two CLI features: export depth, and generate video from depth, frame, and audio. Also, here is a benchmark. I think depth estimation is much faster than you think; it is faster than saving and loading an image.

```python
import os
import time

import torch
import PIL.Image
import torchvision.transforms.functional as TF


def bench_any():
    # Measure Depth-Anything inference throughput on random input batches.
    model = torch.hub.load("nagadomi/Depth-Anything_iw3", "DepthAnything",
                           encoder="vitb",  # Any_B
                           trust_repo=True).cuda()
    RES = 392  # default resolution in iw3, 518 is also supported
    BATCH = 4
    N = 25
    x = torch.rand((BATCH, 3, RES, RES)).cuda() - 0.5
    with torch.inference_mode():
        start = time.perf_counter()
        for i in range(N):
            with torch.autocast(device_type="cuda"):
                depth = model(x)
        torch.cuda.synchronize()
        print(f"depth inference time per image: {round((time.perf_counter() - start) / (N * BATCH), 4)}s")


def bench_png():
    # Measure the PNG encode/save/load/decode round trip for frame-sized images.
    OUTPUT_DIR = "png_bench"
    FRAME_RES = (1920, 1080)  # consider frame images
    img = torch.rand((3, *FRAME_RES))
    img = TF.to_pil_image(img)
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    start = time.perf_counter()
    N = 100
    for i in range(N):
        file_path = os.path.join(OUTPUT_DIR, f"{i}.png")
        img.save(file_path)
        PIL.Image.open(file_path).load()
    print(f"PNG encode/save/load/decode time per image: {round((time.perf_counter() - start) / N, 4)}s")


if __name__ == '__main__':
    bench_any()
    bench_png()
```

result (Linux, RTX 3070ti)
result with FRAME_RES=(392,392)
|
Will check this out tomorrow (night here), thanks for the share :) |
I chose to simply add an --export option to the existing iw3.cli. It can also be used in the GUI. |
Sounds like a plan, can't wait to check it out :) PS: Wouldn't nag for features if we didn't think this is a GREAT piece of software :) |
I added it.
CLI example for exporting:
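A sketch of what this could look like, assuming iw3's usual -i/-o options together with the --export flag mentioned above:

```
python -m iw3.cli --export -i ./input.mp4 -o ./export_dir
```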
In the GUI, select the Export format.
CLI example for generating video/image from exported data:
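Again a sketch, assuming the exported yml is named iw3_export.yml (the actual file name may differ) and iw3's usual -i/-o options:

```
python -m iw3.cli -i ./export_dir/iw3_export.yml -o ./output_videos
```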
Simply specify the yml file as input.

When exporting with the Export format, the output directory structure is described by the exported yml file:

```yaml
type: video
basename: filename
fps: 30.0
rgb_dir: rgb
depth_dir: depth
audio_file: audio.m4a
mapper: none
skip_mapper: false
skip_edge_dilation: false
updated_at: '2024-03-29T21:02:32.996779'
user_data:
  export_options:
    depth_model: Any_B
    export_disparity: false
    mapper: none
    edge_dilation: 2
    max_fps: 30
    ema_normalize: false
```
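The directory layout itself can be inferred from the rgb_dir, depth_dir, and audio_file fields above; in the sketch below, the yml file name and frame file names are only illustrative:

```
<output dir>/
├── iw3_export.yml   # name assumed; pass this file to -i when generating
├── audio.m4a
├── rgb/
│   ├── 000000000000.png
│   └── ...
└── depth/
    ├── 000000000000.png
    └── ...
```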
rgb and depth files will be used as sequential frames, in ascending order of file name, when type is "video".

depth, disparity and mapper function

This is maybe a confusing point. The output depthmap is in different formats/scales depending on the depth estimation model, and the --mapper option that is used depends on the depth model and --foreground-scale.
If you use a program other than iw3 to output depthmaps, there may not be a proper conversion function. EDIT: See https://github.com/nagadomi/nunif/blob/dev/iw3/docs/colorspace.md for colorspace details. |
I haven't tested this much, so if something strange happens, it's most likely a bug. |
link to #87
The number in the filename is the frame PTS. There is no problem as long as the filenames are in frame order. The other problems do not happen in my environment.
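For anyone assembling rgb/depth folders by hand, a small hedged check that ascending file-name order matches frame order, assuming each name begins with its numeric PTS as described above:

```python
import os
import re

def check_frame_order(dirname):
    # Assumes each frame file name starts with its numeric PTS, e.g. "000123.png".
    names = sorted(n for n in os.listdir(dirname) if n.endswith(".png"))
    pts = [int(re.match(r"\d+", n).group()) for n in names]
    # If name order disagrees with PTS order (e.g. "9.png" sorting after "10.png"),
    # zero-pad the numbers so that name order equals frame order.
    assert pts == sorted(pts), "zero-pad frame numbers so name order == frame order"
```
|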
Running part 2 after the first part: if I use the GUI app, it seems to work fine. I'm going to test your changes using the GUI and get back to you. Also, I used "Installer for Windows.bat" and "Open Prompt.bat" to update and get command-line access. |
Firstly, I must say thank you for the continued assistance here :) Using the GUI on a different video file, a 1080p MP4 YouTube video, I am getting the same result: the images in the DEPTH folder are legit PNGs but empty, while the RGB folder is spot on. Can you think of anything I might have broken in my environment, given that I used the MASTER.ZIP download from here on a new setup? Not sure what else I can do to help you. |
On Windows, I have only checked nunif-windows-package. If you don't want to use it, let me know your Python environment.
This is an ffmpeg problem that occurs with floating-point FPS. |
Seems to be good now :) |
If you find a problem, post a new issue. |
I'd like to save the DepthMap images used to another folder on the first run, so that if I do another run I can use the pre-generated maps instead of re-generating them for a 2nd/3rd run.
This would make it easier and faster to test the different options available.