chore: Remove Skia ❌🎨 #1740

Merged
merged 10 commits into from
Sep 1, 2023

mrousavy
Owner

@mrousavy mrousavy commented Sep 1, 2023

What

This has been a hard decision to make, but I decided that the Skia integration has to be removed from VisionCamera for various reasons.

I worked long nights, 10h-14h days, weekends and in total about 300 hours on the V3 Skia integration, but I decided that this will not land in production because it significantly increases the complexity of the VisionCamera codebase (see Why down below).
VisionCamera will stay as lean as possible, and my focus is to make it as stable and powerful as possible.

My work here has proven that it is possible to draw onto a Camera Frame in realtime using VisionCamera/React Native/JS/Skia (see for example my blog post on this here https://mrousavy.com/blog/VisionCamera-Pose-Detection-TFLite), and it is also possible for the drawn Frames to end up in the resulting photo or video recording. (See the Stori face filters example on our website, that's 60+FPS in React Native)

I can build this customized solution for you or your business for $$$ (contact us -> https://margelo.io), but I decided to not add this to the public VisionCamera repo because of the significantly increased complexity of also supporting the non-skia pipeline.

Right now, the semi-working version is on npm under [email protected]. Maybe I will create a separate repo/fork of that to provide a version of VisionCamera which uses only the Skia pipeline; this would simplify stuff a lot.

Note: V3 is still a huge upgrade, there's a ton of changes on the iOS codebase, Frame Processors have been rewritten from scratch, and the entire Android codebase has been rewritten and now uses a GPU accelerated OpenGL pipeline. Stay tuned for the V3 release (next week?)


Journey

I already felt like I was a pretty good developer, but getting into realtime graphics, OpenGL, shaders, surfaces, etc. was tough. There's almost no documentation on the internet for what I was trying to do, and even ChatGPT just hallucinated most of the time. OpenGL is tough to set up, passing Camera frames through to surfaces is not easy, and drawing onto them in-between while always staying fully on the GPU is even trickier.

I learned so many incredible things about Graphics, Android Surfaces, OpenGL, C++, memory management, Cameras (Camera2), Skia etc, and I feel like I now finally have a full understanding of how this all works together.

I wrote my own OpenGL context from scratch that could render an EXTERNAL texture to an offscreen 2D texture, then pass that through using a pass-through shader to three output surfaces (preview, video recorder, frame processor) - that sh*t wasn't easy! But it was pretty damn cool once I got it running. Really fast as well because it's fully built on the GPU.

Why

The normal, non-skia Camera session is already pretty complex. There's multiple outputs that the Camera streams to (PREVIEW, PHOTO, VIDEO), and everything is happening in parallel.

To now draw onto the Camera Frame (and have the drawing show in realtime on the PREVIEW, as well as in the resulting VIDEO and PHOTO), we need to introduce a step in-between here. This step in-between is the Frame Processor, that now returns a texture which everything was rendered into, so we need a separate render pass.
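
As a rough illustration of that in-between step (not the actual VisionCamera code), the two flows can be modelled in plain C++. `Texture`, `DrawingFrameProcessor` and `renderFrame` are hypothetical names, and a texture is just a string label here rather than a GPU handle.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in: a "texture" is only a label here, not a GL handle.
struct Texture {
  std::string contents;
};

// A drawing Frame Processor takes the camera texture and returns a new
// texture that holds the camera image plus whatever was drawn onto it.
using DrawingFrameProcessor = std::function<Texture(const Texture&)>;

// Renders one camera frame to every output and returns what each output got.
// Without a drawing processor the camera texture goes straight to the outputs.
// With one, an extra render pass runs first and its result is fanned out.
std::vector<std::string> renderFrame(const Texture& cameraTexture,
                                     const std::vector<std::string>& outputs,
                                     const DrawingFrameProcessor& processor) {
  Texture source = cameraTexture;
  if (processor) {
    source = processor(cameraTexture);  // the additional in-between pass
  }
  std::vector<std::string> rendered;
  for (const std::string& output : outputs) {
    rendered.push_back(output + " <- " + source.contents);
  }
  return rendered;
}
```

Passing an empty processor models the default pipeline. Passing one that returns a modified texture models the Skia flow, where every output receives the drawn-on result instead of the raw camera texture.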

On iOS this was somewhat complex but thanks to CMSampleBuffers being GPU accessible, I got it working in just 2 weeks. This does not yet draw to the resulting video recording though, but that isn't too difficult to add, just some extra flows.

On Android, I had to scrap everything on the video pipeline and rewrite it from scratch using OpenGL. I then had two flows, one for skia and one for the default native camera pipeline. The skia one had the extra step in there where the frame processor ran first and rendered everything into a texture, and then I rendered that texture into the outputs.
I finally got this working after 4 weeks, but now the orientation is messed up since Skia doesn't use the same OpenGL transform matrix (float[16]) as the native OpenGL pipeline. Also there's some issues where this goes out of sync and crashes if you flip the Camera, or fast-refresh in react native.

There's just so many added extra ifs/branches that this significantly increases the complexity of the VisionCamera codebase, making it much harder to maintain for me.

Also, the buildscript got much more complex. I now need to check if Skia is installed, and if it is I need to include all Skia headers, link against Skia and its dependencies, and then link against RN Skia. If anything changes on their end (eg a new header in a new folder that I don't know yet), VisionCamera's build will break (unless you disable Skia or stay on an older version that worked).
This is just significantly increasing my maintenance cost, all for something that is not directly Camera related but instead Skia related.

Honestly this is fun and cool and all but only a very small fraction of actual VisionCamera users are going to use this.

Some examples of where the code got really complex:
  1. There's a huge added codebranch for Skia rendering here since it no longer renders to all outputs in parallel, but instead first to the FP offscreen 2D texture, then the result of that to all outputs in parallel - but not with my default OpenGL rendering pipeline, now it has to use Skia (since otherwise the OpenGL context is messed up):
    auto isSkiaFrameProcessor = _frameProcessor != nullptr && _frameProcessor->isInstanceOf(JSkiaFrameProcessor::javaClassStatic());
    if (isSkiaFrameProcessor) {
      // 4.1. If we have a Skia Frame Processor, prepare to render to an offscreen surface using Skia
      jni::global_ref<JSkiaFrameProcessor::javaobject> skiaFrameProcessor = jni::static_ref_cast<JSkiaFrameProcessor::javaobject>(_frameProcessor);
      SkiaRenderer& skiaRenderer = skiaFrameProcessor->cthis()->getSkiaRenderer();
      auto drawCallback = [=](SkCanvas* canvas) {
        // Create a JFrame instance (this uses queues/recycling)
        auto frame = JFrame::create(texture.width,
                                    texture.height,
                                    texture.width * 4,
                                    _context->getCurrentPresentationTime(),
                                    "portrait",
                                    false);
        // Fill the Frame with the contents of the GL surface
        _context->getPixelsOfTexture(texture,
                                     &frame->cthis()->pixelsSize,
                                     &frame->cthis()->pixels);
        // Call the Frame processor with the Frame
        frame->cthis()->incrementRefCount();
        skiaFrameProcessor->cthis()->call(frame, canvas);
        frame->cthis()->decrementRefCount();
      };
      // 4.2. Render to the offscreen surface using Skia
      __android_log_print(ANDROID_LOG_INFO, TAG, "Rendering using Skia..");
      OpenGLTexture offscreenTexture = skiaRenderer.renderTextureToOffscreenSurface(*_context,
                                                                                    texture,
                                                                                    transformMatrix,
                                                                                    drawCallback);
      // 4.3. Now render the result of the offscreen surface to all output surfaces!
      if (_previewOutput) {
        __android_log_print(ANDROID_LOG_INFO, TAG, "Rendering to Preview..");
        skiaRenderer.renderTextureToSurface(*_context, offscreenTexture, _previewOutput->getEGLSurface());
      }
      if (_recordingSessionOutput) {
        __android_log_print(ANDROID_LOG_INFO, TAG, "Rendering to RecordingSession..");
        skiaRenderer.renderTextureToSurface(*_context, offscreenTexture, _recordingSessionOutput->getEGLSurface());
      }
    } else {
      // 4.1. If we have a Frame Processor, call it
      if (_frameProcessor != nullptr) {
        // Create a JFrame instance (this uses queues/recycling)
        auto frame = JFrame::create(texture.width,
                                    texture.height,
                                    texture.width * 4,
                                    _context->getCurrentPresentationTime(),
                                    "portrait",
                                    false);
        // Fill the Frame with the contents of the GL surface
        _context->getPixelsOfTexture(texture,
                                     &frame->cthis()->pixelsSize,
                                     &frame->cthis()->pixels);
        // Call the Frame processor with the Frame
        frame->cthis()->incrementRefCount();
        _frameProcessor->cthis()->call(frame);
        frame->cthis()->decrementRefCount();
      }
  2. The 4x4 float[16] OpenGL matrix works in OpenGL, but Skia uses a different coordinate system. I spent 10h trying to get the orientation/rotation to work without success
    // TODO: Apply Matrix. No idea how though.
    SkM44 matrix = SkM44::ColMajor(transformMatrix);
  3. I can no longer use android.media.Image. This is a really big change since most frameworks like MLKit or barcode detectors just take an android.media.Image as an input. Now, I had to create a custom Frame object, which had a fully native implementation and used a ByteBuffer of the pixels as its backing array. This is a GPU -> CPU copy of the Frame.
    public class Frame {
        private final HybridData mHybridData;

        private Frame(HybridData hybridData) {
            mHybridData = hybridData;
        }

        @Override
        protected void finalize() throws Throwable {
            super.finalize();
            mHybridData.resetNative();
        }

        /**
         * Get the width of the Frame, in its sensor orientation. (in pixels)
         */
        public native int getWidth();

        /**
         * Get the height of the Frame, in its sensor orientation. (in pixels)
         */
        public native int getHeight();

        /**
         * Get the number of bytes per row.
         * * To get the number of components per pixel you can divide this by the Frame's width.
         * * To get the total size of the byte buffer you can multiply this by the Frame's height.
         */
        public native int getBytesPerRow();

        /**
         * Get the local timestamp of this Frame. This is always monotonically increasing for each Frame.
         */
        public native long getTimestamp();

        /**
         * Get the Orientation of this Frame. The return value is the result of `Orientation.toUnionValue()`.
         */
        public native String getOrientation();

        /**
         * Return whether this Frame is mirrored or not. Frames from the front-facing Camera are often mirrored.
         */
        public native boolean getIsMirrored();

        /**
         * Get the pixel-format of this Frame. The return value is the result of `PixelFormat.toUnionValue()`.
         */
        public native String getPixelFormat();

        /**
         * Get the actual backing pixel data of this Frame using a zero-copy C++ ByteBuffer.
         */
        public native ByteBuffer getByteBuffer();

        /**
         * Get whether this Frame is still valid.
         * A Frame is valid as long as it hasn't been closed by the Frame Processor Runtime Manager
         * (either because it ran out of Frames in its queue and needs to close old ones, or because
         * a Frame Processor finished executing and you're still trying to hold onto this Frame in native)
         */
        public native boolean getIsValid();
    1. This is how I then manually get the pixels
      // Fill the Frame with the contents of the GL surface
      _context->getPixelsOfTexture(texture,
                                   &frame->cthis()->pixelsSize,
                                   &frame->cthis()->pixels);
    2. And I also needed to implement a custom queue/pool of pixel buffers (this is what ImageReader does) to avoid the malloc/free of the large pixel buffer on every frame. Right now, it just malloc's a new pixel buffer on every single Camera frame, aka 30 to 60 times a second a 3-10MB CPU buffer gets allocated and freed again. Even if you don't need it.
    3. I tried to make use of the OpenGL texture as a backing pixel store for the Image, but this wouldn't work since the OpenGL texture is already invalid once the Frame Processor is done executing, so I had to make a copy. I thought about AHardwareBuffer*, but this is again an additional renderpass - you have to set up an OpenGL framebuffer that renders to an EGLImageKHR, which is just wrapping the HardwareBuffer. Then, after I finally rendered into a GPU HardwareBuffer, what is the user going to do with it? Most people that want to run processing (eg barcode scanning) want to read pixels on the CPU. Only a small fraction of users are actually writing GPU processing code, so all of this additional overhead is unnecessary. (Also, creating an android.media.Image is really weird, I create an ImageWriter that streams into an ImageReader, then get the images there, fill them, and close them again)
  4. I somehow had to get the texture context in here so that you can render it multiple times
    if (name == "render") {
      auto render = JSI_HOST_FUNCTION_LAMBDA {
        if (_canvas == nullptr) {
          throw jsi::JSError(runtime, "Trying to render a Frame without a Skia Canvas! Did you install Skia?");
        }
        throw std::runtime_error("render() is not yet implemented!");
        return jsi::Value::undefined();
      };
      return jsi::Function::createFromHostFunction(runtime, jsi::PropNameID::forUtf8(runtime, "render"), 1, render);
    }
  5. I had to add ifs and checks everywhere regarding the PreviewView - if there is a Skia Frame Processor I want to add the PreviewView to the pipeline so it streams the frames with the drawings into the Preview View. If not, I want to add the PreviewView to the Camera so it can efficiently stream lower-res Frames into the Preview in parallel
    videoPipeline.setPreviewOutput(previewOutput?.surface)
  6. Finally, even if I would come up with a really cool abstract interface that allows me to quickly swap those pipelines and target surfaces out without anything breaking or too much nested ifs, the buildscript is still getting really complex. Just look at this lol
    # Optionally also add Skia Integration here
    if(ENABLE_SKIA)
      set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DSK_GL -DSK_GANESH -DSK_BUILD_FOR_ANDROID")
      find_package(shopify_react-native-skia REQUIRED CONFIG)
      set(SKIA_PACKAGE shopify_react-native-skia::rnskia)
      set(RNSKIA_PATH ${NODE_MODULES_DIR}/@shopify/react-native-skia)
      set(SKIA_LIBS_PATH "${RNSKIA_PATH}/libs/android/${ANDROID_ABI}")
      add_library(skia STATIC IMPORTED)
      set_property(TARGET skia PROPERTY IMPORTED_LOCATION "${SKIA_LIBS_PATH}/libskia.a")
      add_library(svg STATIC IMPORTED)
      set_property(TARGET svg PROPERTY IMPORTED_LOCATION "${SKIA_LIBS_PATH}/libsvg.a")
      add_library(skshaper STATIC IMPORTED)
      set_property(TARGET skshaper PROPERTY IMPORTED_LOCATION "${SKIA_LIBS_PATH}/libskshaper.a")

      # We need to include the headers from skia
      # (Note: rnskia includes all their files without any relative path
      # so for example "include/core/SkImage.h" becomes #include "SkImage.h".
      # That's why for the prefab of rnskia, we flatten all cpp files into
      # just one directory. HOWEVER, skia itself uses relative paths in
      # their include statements, and so we have to include the path to skia)
      target_include_directories(
        ${PACKAGE_NAME}
        PRIVATE
        "${RNSKIA_PATH}/cpp/api/"
        "${RNSKIA_PATH}/cpp/jsi/"
        "${RNSKIA_PATH}/cpp/rnskia/"
        "${RNSKIA_PATH}/cpp/skia"
        "${RNSKIA_PATH}/cpp/skia/include/"
        "${RNSKIA_PATH}/cpp/skia/include/config/"
        "${RNSKIA_PATH}/cpp/skia/include/core/"
        "${RNSKIA_PATH}/cpp/skia/include/effects/"
        "${RNSKIA_PATH}/cpp/skia/include/utils/"
        "${RNSKIA_PATH}/cpp/skia/include/pathops/"
        "${RNSKIA_PATH}/cpp/skia/modules/"
        "${RNSKIA_PATH}/cpp/utils/"
      )

      target_link_libraries(
        ${PACKAGE_NAME}
        GLESv2           # <-- Optional: OpenGL (for Skia)
        EGL              # <-- Optional: OpenGL (EGL) (for Skia)
        ${SKIA_PACKAGE}  # <-- Optional: RN Skia
        jnigraphics
        skia
        svg
        skshaper
      )

      message("VisionCamera: Skia enabled!")
    endif()
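
The pixel-buffer pool mentioned under item 3 above (avoiding a malloc/free of the large pixel buffer on every frame) could look roughly like the sketch below. `PixelBufferPool` and its methods are hypothetical and not part of VisionCamera, and the frame size used in the note afterwards is an assumed example.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical pool that recycles fixed-size pixel buffers between frames.
class PixelBufferPool {
 public:
  using Buffer = std::vector<std::uint8_t>;

  PixelBufferPool(std::size_t width, std::size_t height, std::size_t bytesPerPixel)
      : _bufferSize(width * bytesPerPixel * height) {}

  // Size of one buffer: bytes per row (width * bytesPerPixel) times height.
  std::size_t bufferSize() const { return _bufferSize; }

  // Number of real allocations performed so far.
  std::size_t allocationCount() const {
    std::lock_guard<std::mutex> lock(_mutex);
    return _allocations;
  }

  // Returns a recycled buffer if one is free, otherwise allocates a new one.
  std::unique_ptr<Buffer> acquire() {
    std::lock_guard<std::mutex> lock(_mutex);
    if (!_free.empty()) {
      std::unique_ptr<Buffer> buffer = std::move(_free.back());
      _free.pop_back();
      return buffer;
    }
    ++_allocations;
    return std::make_unique<Buffer>(_bufferSize);
  }

  // Hands a buffer back so that a later frame can reuse it.
  void release(std::unique_ptr<Buffer> buffer) {
    if (buffer == nullptr || buffer->size() != _bufferSize) {
      return;  // foreign or empty buffers are simply dropped
    }
    std::lock_guard<std::mutex> lock(_mutex);
    _free.push_back(std::move(buffer));
  }

 private:
  const std::size_t _bufferSize;
  mutable std::mutex _mutex;
  std::vector<std::unique_ptr<Buffer>> _free;
  std::size_t _allocations = 0;
};
```

With an assumed 1920x1080 RGBA frame, `bufferSize()` is 1920 * 4 * 1080 = 8,294,400 bytes (about 8.3 MB), which sits inside the 3-10MB range quoted above. Acquiring and releasing one buffer per frame then costs a single allocation in total instead of one per frame.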

@agrittiwari

Adding Skia and making it work with an advanced Camera is itself an OG accomplishment.
Decision is thought through and is right. Your maintenance cost would go in making sure you handle ever-growing edge cases.
Great and inspiring work Marc!

@thorbenprimke

Such a thoughtful decision. Totally agree with the direction to remove the overhead for maintenance and reducing complexity.

A fork or consulting seems like the right direction.

What are the use cases for this? I can primarily think of livestream / video calls where it is used to enhance / augment the stream.

@Alexispap

I could help with the matrix problem. I should be able to figure that out since I am an applied mathematician. Could you provide me with the references that you used or details about what you are trying to do there?

@mrousavy
Owner Author

mrousavy commented Sep 4, 2023

Thanks @Alexispap, I'll create a fork/new repo with that code where we can experiment then :) Should be possible to just play around with it to figure it out

@mrousavy
Owner Author

mrousavy commented Sep 4, 2023

@thorbenprimke some use-cases that are all possible with VisionCamera + Skia (with this code that is on that branch actually!):

  • Snapchat/Instagram-style dog masks or beauty filters (see our website margelo.io for an example ("Stori"), that is also using a modified version of VisionCamera)
  • Realtime blurring faces or license plates
  • Implementing filters like enhancing colors, inverting colors, making stars appear on a dark night sky, flame filters (again, Instagram/Snapchat has that)
  • Implementing VHS filters that actually render text as well
  • Implementing overlays like "Free version" or your company name - or dashcam stuff like current speed etc
  • Applying color correction eg for software based night/HDR modes
  • Fun 😄

@tomerh2001

I mainly wanted to create overlays on my camera, i.e. bounding box around an object I detected. Is it not possible now that Skia is not in react-native-vision-camera? How can I do it then?

@medet-mattr

medet-mattr commented Mar 29, 2024

@mrousavy So there is no way to draw text or SVG on a frame ?

@tomerh2001

@mrousavy So there is no way to draw text or SVG on a frame ?

Nope. Yet it's in the docs, go figure. Figured it out along with similar issues too late and had major issues due to that.

@mrousavy
Owner Author

Yet it's in the docs, go figure

The docs say "proof of concept":

Screenshot 2024-03-29 at 12 25 59

..and I even explained why I "removed it again", and how everybody who absolutely needs this feature can reach out to my agency so I can develop this feature for your company.

Screenshot 2024-03-29 at 12 26 18

Figured it out along with similar issues too late and had major issues due to that.

As I said, if you had major issues and blockers but absolutely need this feature, you could reach out to my agency to get a custom solution developed for your app. Or sponsor the project to collectively fund the development of the Skia integration.

@islemsyw

@tomerh2001 I have been trying to do the same task as you, did you figure out a way to draw the bounding boxes please?

@mrousavy
Owner Author

VisionCamera V4 now has Skia Frame Processors, so this now finally works 🎉

isaaccolson pushed a commit to isaaccolson/deliveries-mobile that referenced this pull request Oct 30, 2024
* Revert "feat: Skia for Android (mrousavy#1731)"

This reverts commit a7c137d.

* Remove some skia

* Remove all the Skia stuff.

* Update useFrameProcessor.ts

* Update lockfiles

* fix: Use native Preview again

* Use `OpenGLTexture&` again

* Remove `PreviewOutput` (we use `SurfaceView` in parallel)

* fix: Log photo widths

* fix: Fix cpplint