[GSOC] Add Image-to-Image and Image-to-Text generation with GUI support for OpenVINO GenAI #740

Closed
wants to merge 50 commits into from

@chux0519 chux0519 commented Aug 5, 2024

Hi all,

Here is the draft PR for the first part of the GSOC project. This PR includes:

  • The OpenPose detector implemented in pure C++
  • A new image-to-image pipeline using ControlNet-OpenPose
I will create new PRs for the remaining tasks, which include the GUI client and the CLIP model in OpenVINO.

chux0519 added 30 commits May 21, 2024 11:01
- add convert script for model conversion
- part of post process logic
- python implementation of visualization
using the same latent from numpy, and the results are just the same
Contributor

Could we add a guide for OpenCV installation?
I downloaded OpenCV 4.9.0 and got a CMake error.
log_cmake_error_opencv.txt

Author

Sure, I will add a guide for OpenCV.
I just checked OpenCV 4.9.0 and it works fine.
I'm using the "x64 Native Tools Command Prompt for VS 2022" instead of the default cmd.exe for the build.
The OpenCV-related CMake output should look like this:

-- Found Threads: TRUE
-- OpenCV ARCH: x64
-- OpenCV RUNTIME: vc16
-- OpenCV STATIC: OFF
-- Found OpenCV: C:/Users/chuxd/Downloads/opencv/build (found suitable version "4.9.0", minimum required is "4.9.0")
-- Found OpenCV 4.9.0 in C:/Users/chuxd/Downloads/opencv/build/x64/vc16/lib
-- You might need to add C:\Users\chuxd\Downloads\opencv\build\x64\vc16\bin to your PATH to be able to run your applications.

It seems the OpenCV RUNTIME was not detected correctly in your setup.

Contributor

Add x64\vc16\lib to the path passed to -DOpenCV_DIR:
cmake -S . -B build -DOpenCV_DIR="C:\Users\S590\Downloads\opencv\build\x64\vc16\lib" -DOpenVINO_DIR="C:\llm\LG\w_openvino_toolkit_windows_2024.3.0.16041.1e3b88e4e3f_x86_64\w_openvino_toolkit_windows_2024.3.0.16041.1e3b88e4e3f_x86_64\runtime\cmake"
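
Before running the command above, a small script can confirm that the directory passed to -DOpenCV_DIR actually contains a CMake package config file, which is what find_package(OpenCV) searches for in config mode. This is only a sketch: the helper name has_opencv_config is hypothetical and not part of this PR, and the example path is simply the one from the comment above.

```python
from pathlib import Path

# File names that CMake's find_package(OpenCV) accepts in config mode.
CONFIG_NAMES = ("OpenCVConfig.cmake", "opencv-config.cmake")


def has_opencv_config(opencv_dir: str) -> bool:
    """Return True if opencv_dir contains an OpenCV CMake package config file.

    Hypothetical helper for illustration; it only checks for the file and
    does not validate the OpenCV version or architecture.
    """
    directory = Path(opencv_dir)
    return any((directory / name).is_file() for name in CONFIG_NAMES)


if __name__ == "__main__":
    # Path taken from the review comment; adjust to the local OpenCV location.
    candidate = r"C:\Users\S590\Downloads\opencv\build\x64\vc16\lib"
    print(candidate, "->", has_opencv_config(candidate))
```

If the check returns False, the directory is probably not one that find_package can use directly as OpenCV_DIR.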

Contributor

Currently, the verification in the scripts works only on Linux.

Please also provide Windows verification.
For Windows verification, some modifications are required:

  • Add a requirements.txt for the Python script and notebook.
  • Copy “stable_diffusion_1_5_controlnet\cpp\scripts\pose.png” to detectors\scripts.
  • Pass the OpenCV path to CMake: cmake -S . -B build -DOpenCV_DIR="C:\Users\S590\Downloads\opencv\build\x64\vc16\lib"
  • Add "set(CMAKE_CXX_STANDARD 17)" to CMakeLists.txt.
  • Debug "GoogleTestAddTests" on Windows.
  • Upload images to ./media.
  • Update the executable path in verify.ipynb for Windows: ../build/Release/detectors_bridge
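
For the last item, one option is to pick the executable path by platform so the same notebook cell works on both systems. The sketch below is an assumption about how that could look, not the notebook's actual code: the function name detector_executable is hypothetical, the Windows path is the one given above, and the Linux path (../build/detectors_bridge) is a guess at a single-config build layout.

```python
import platform
from pathlib import Path
from typing import Optional


def detector_executable(build_dir: str = "../build",
                        system: Optional[str] = None) -> Path:
    """Return the expected location of the detectors_bridge binary.

    Hypothetical helper: multi-config generators such as Visual Studio put
    binaries under <build>/Release, while the Linux layout assumed here
    places them directly in <build>.
    """
    system = system or platform.system()
    if system == "Windows":
        # Path from the review comment above.
        return Path(build_dir) / "Release" / "detectors_bridge"
    # Assumed Linux/macOS layout; adjust if the build places it elsewhere.
    return Path(build_dir) / "detectors_bridge"


if __name__ == "__main__":
    print(detector_executable())
```

The optional system argument exists only so the path logic can be checked on one platform for the other.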

Contributor

To validate your detector implementation more precisely, please add quantitative, pixel-level comparisons of the output images.
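
A pixel-level comparison could report a few simple statistics per image pair. The sketch below is one possible shape for this, assuming both detector outputs have already been decoded to raw 8-bit buffers of the same size (for example via an image library's tobytes-style call). The function name pixel_diff_stats and the sample buffers are hypothetical, and the choice of metrics (max/mean absolute difference, mismatch count, PSNR) is a suggestion rather than something this PR specifies.

```python
import math


def pixel_diff_stats(a: bytes, b: bytes) -> dict:
    """Compare two raw 8-bit image buffers of identical size, value by value."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    if not a:
        raise ValueError("buffers must not be empty")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    mse = sum(d * d for d in diffs) / len(diffs)
    return {
        "max_abs_diff": max(diffs),
        "mean_abs_diff": sum(diffs) / len(diffs),
        "mismatched": sum(1 for d in diffs if d > 0),
        # PSNR for 8-bit data; infinite when the buffers are identical.
        "psnr_db": math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse),
    }


# Hypothetical 2x2 grayscale buffers standing in for the C++ and Python outputs.
cpp_out = bytes([0, 10, 200, 255])
py_out = bytes([0, 12, 200, 254])
stats = pixel_diff_stats(cpp_out, py_out)
print(stats)
```

A verification script could then assert thresholds on these values (for instance, a small max_abs_diff) instead of relying on visual inspection alone.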

Contributor

@yangsu2022 yangsu2022 left a comment

This PR currently has no GUI.
If you are more concerned about the Windows GUI, I would recommend first checking and merging the Windows ControlNet pipeline without the GUI.

@yangsu2022
Contributor

The blog post “Stable Diffusion ControlNet Pipeline with OpenVINO™ In C++” analyzes the differences between the full C++ and Python pipelines, with pixel-level comparison and visualization.

3 participants