`DataLoader`s 5: add support for external binary `DataLoader`s (PATH)
#4521
Merged
14 commits
4b5f528 implement External DataLoader (teh-cmc)
0ce4381 add rust example (teh-cmc)
bc07ea2 public iter_external_loaders (teh-cmc)
c75c936 update file dialog (teh-cmc)
357e68a add cpp example (oh my god the pain) (teh-cmc)
185292f add python example (teh-cmc)
3ef06e6 add example assets (teh-cmc)
a0283be lints (teh-cmc)
f218d5f all the nasty screenshot business (teh-cmc)
b5de0fe typo (teh-cmc)
9d836e3 try to somehow please clang (teh-cmc)
fea2a31 review and stuff (teh-cmc)
db96b99 lints (teh-cmc)
f83cf44 more lints (teh-cmc)
190 changes: 190 additions & 0 deletions
crates/re_data_source/src/data_loader/loader_external.rs
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,190 @@ | ||
use std::io::Read; | ||
|
||
use once_cell::sync::Lazy; | ||
|
||
/// To register a new external data loader, simply add an executable in your $PATH whose name | ||
/// starts with this prefix. | ||
pub const EXTERNAL_DATA_LOADER_PREFIX: &str = "rerun-loader-"; | ||
|
||
/// Keeps track of the paths of all external executable [`crate::DataLoader`]s. | ||
/// | ||
/// Lazily initialized the first time a file is opened by running a full scan of the `$PATH`. | ||
/// | ||
/// External loaders are _not_ registered on a per-extension basis: we want users to be able to | ||
/// filter data on a much more fine-grained basis than just file extensions (e.g. checking the file | ||
/// itself for magic bytes). | ||
pub static EXTERNAL_LOADER_PATHS: Lazy<Vec<std::path::PathBuf>> = Lazy::new(|| { | ||
re_tracing::profile_function!(); | ||
|
||
use walkdir::WalkDir; | ||
|
||
let dirpaths = std::env::var("PATH") | ||
.ok() | ||
.into_iter() | ||
.flat_map(|paths| paths.split(':').map(ToOwned::to_owned).collect::<Vec<_>>()) | ||
.map(std::path::PathBuf::from); | ||
|
||
let executables: ahash::HashSet<_> = dirpaths | ||
.into_iter() | ||
.flat_map(|dirpath| { | ||
WalkDir::new(dirpath).into_iter().filter_map(|entry| { | ||
let Ok(entry) = entry else { | ||
return None; | ||
}; | ||
let filepath = entry.path(); | ||
let is_rerun_loader = filepath.file_name().map_or(false, |filename| { | ||
filename | ||
.to_string_lossy() | ||
.starts_with(EXTERNAL_DATA_LOADER_PREFIX) | ||
}); | ||
(filepath.is_file() && is_rerun_loader).then(|| filepath.to_owned()) | ||
}) | ||
}) | ||
.collect(); | ||
|
||
// NOTE: We call all available loaders and do so in parallel: order is irrelevant here. | ||
executables.into_iter().collect() | ||
}); | ||
|
||
/// Iterator over all registered external [`crate::DataLoader`]s. | ||
#[inline] | ||
pub fn iter_external_loaders() -> impl ExactSizeIterator<Item = std::path::PathBuf> { | ||
EXTERNAL_LOADER_PATHS.iter().cloned() | ||
} | ||
|
||
// --- | ||
|
||
/// A [`crate::DataLoader`] that forwards the path to load to all executables present in | ||
/// the user's `PATH` with a name that starts with `EXTERNAL_DATA_LOADER_PREFIX`. | ||
/// | ||
/// The external loaders are expected to log rrd data to their standard output. | ||
/// | ||
/// Refer to our `external_data_loader` example for more information. | ||
pub struct ExternalLoader; | ||
|
||
impl crate::DataLoader for ExternalLoader { | ||
#[inline] | ||
fn name(&self) -> String { | ||
"rerun.data_loaders.External".into() | ||
} | ||
|
||
fn load_from_path( | ||
&self, | ||
store_id: re_log_types::StoreId, | ||
filepath: std::path::PathBuf, | ||
tx: std::sync::mpsc::Sender<crate::LoadedData>, | ||
) -> Result<(), crate::DataLoaderError> { | ||
use std::process::{Command, Stdio}; | ||
|
||
re_tracing::profile_function!(filepath.display().to_string()); | ||
|
||
for exe in EXTERNAL_LOADER_PATHS.iter() { | ||
let store_id = store_id.clone(); | ||
let filepath = filepath.clone(); | ||
let tx = tx.clone(); | ||
|
||
// NOTE: spawn is fine, the entire loader is native-only. | ||
rayon::spawn(move || { | ||
re_tracing::profile_function!(); | ||
|
||
let child = Command::new(exe) | ||
.arg(filepath.clone()) | ||
.args(["--recording-id".to_owned(), store_id.to_string()]) | ||
.stdout(Stdio::piped()) | ||
.stderr(Stdio::piped()) | ||
.spawn(); | ||
|
||
let mut child = match child { | ||
Ok(child) => child, | ||
Err(err) => { | ||
re_log::error!(?filepath, loader = ?exe, %err, "Failed to execute external loader"); | ||
return; | ||
} | ||
}; | ||
|
||
let Some(stdout) = child.stdout.take() else { | ||
let reason = "stdout unreachable"; | ||
re_log::error!(?filepath, loader = ?exe, %reason, "Failed to execute external loader"); | ||
return; | ||
}; | ||
let Some(stderr) = child.stderr.take() else { | ||
let reason = "stderr unreachable"; | ||
re_log::error!(?filepath, loader = ?exe, %reason, "Failed to execute external loader"); | ||
return; | ||
}; | ||
|
||
re_log::debug!(?filepath, loader = ?exe, "Loading data from filesystem using external loader…",); | ||
|
||
let version_policy = re_log_encoding::decoder::VersionPolicy::Warn; | ||
let stdout = std::io::BufReader::new(stdout); | ||
match re_log_encoding::decoder::Decoder::new(version_policy, stdout) { | ||
Ok(decoder) => { | ||
decode_and_stream(&filepath, &tx, decoder); | ||
} | ||
Err(re_log_encoding::decoder::DecodeError::Read(_)) => { | ||
// The child was not interested in that file and left without logging | ||
// anything. | ||
// That's fine, we just need to make sure to check its exit status further | ||
// down, still. | ||
return; | ||
} | ||
Err(err) => { | ||
re_log::error!(?filepath, loader = ?exe, %err, "Failed to decode external loader's output"); | ||
return; | ||
} | ||
}; | ||
|
||
let status = match child.wait() { | ||
Ok(output) => output, | ||
Err(err) => { | ||
re_log::error!(?filepath, loader = ?exe, %err, "Failed to execute external loader"); | ||
return; | ||
} | ||
}; | ||
|
||
if !status.success() { | ||
let mut stderr = std::io::BufReader::new(stderr); | ||
let mut reason = String::new(); | ||
stderr.read_to_string(&mut reason).ok(); | ||
re_log::error!(?filepath, loader = ?exe, %reason, "Failed to execute external loader"); | ||
} | ||
}); | ||
} | ||
|
||
Ok(()) | ||
} | ||
|
||
#[inline] | ||
fn load_from_file_contents( | ||
&self, | ||
_store_id: re_log_types::StoreId, | ||
_path: std::path::PathBuf, | ||
_contents: std::borrow::Cow<'_, [u8]>, | ||
_tx: std::sync::mpsc::Sender<crate::LoadedData>, | ||
) -> Result<(), crate::DataLoaderError> { | ||
// TODO(cmc): You could imagine a world where plugins can be streamed rrd data via their | ||
// standard input… but today is not that world. | ||
Ok(()) // simply not interested | ||
} | ||
} | ||
|
||
fn decode_and_stream<R: std::io::Read>( | ||
filepath: &std::path::Path, | ||
tx: &std::sync::mpsc::Sender<crate::LoadedData>, | ||
decoder: re_log_encoding::decoder::Decoder<R>, | ||
) { | ||
re_tracing::profile_function!(filepath.display().to_string()); | ||
|
||
for msg in decoder { | ||
let msg = match msg { | ||
Ok(msg) => msg, | ||
Err(err) => { | ||
re_log::warn_once!("Failed to decode message in {filepath:?}: {err}"); | ||
continue; | ||
} | ||
}; | ||
if tx.send(msg.into()).is_err() { | ||
break; // The other end has decided to hang up, not our problem. | ||
} | ||
} | ||
} |
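The `$PATH` scan in `EXTERNAL_LOADER_PATHS` splits on `':'`, which is the Unix separator only. As a minimal sketch (not the code merged in this PR), the same discovery step could lean on `std::env::split_paths`, which handles the platform's separator. The function name `find_external_loaders` is made up for illustration, and unlike the `WalkDir`-based version above this sketch only looks at the top level of each directory.

```rust
use std::collections::BTreeSet;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Same prefix as in the diff above.
pub const EXTERNAL_DATA_LOADER_PREFIX: &str = "rerun-loader-";

/// Hypothetical helper: returns every regular file directly inside a
/// `PATH`-style list of directories whose name starts with the loader prefix.
pub fn find_external_loaders(path_var: &OsStr) -> Vec<PathBuf> {
    // A set deduplicates directories that are listed more than once.
    let mut found = BTreeSet::new();

    // `split_paths` uses ':' on Unix and ';' on Windows.
    for dir in std::env::split_paths(path_var) {
        // Missing or unreadable directories are skipped silently.
        let Ok(entries) = std::fs::read_dir(&dir) else {
            continue;
        };
        for entry in entries.flatten() {
            let path = entry.path();
            let is_loader = path.file_name().map_or(false, |name| {
                name.to_string_lossy()
                    .starts_with(EXTERNAL_DATA_LOADER_PREFIX)
            });
            if is_loader && path.is_file() {
                found.insert(path);
            }
        }
    }

    found.into_iter().collect()
}
```

A caller would pass the value of `std::env::var_os("PATH")`. Whether a non-recursive scan is acceptable here is a design choice this sketch does not settle.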
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
int main() { | ||
std::cout << "That will only work with the right plugin in your $PATH!" << std::endl; | ||
std::cout << "Check out the `external_data_loader` C++ example." << std::endl; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
from __future__ import annotations | ||
|
||
print("That will only work with the right plugin in your $PATH!") | ||
print("Check out the `external_data_loader` Python example.") |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
fn main() { | ||
println!("That will only work with the right plugin in your $PATH!"); | ||
println!("Check out the `external_data_loader` Rust example."); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
cmake_minimum_required(VERSION 3.16...3.27) | ||
|
||
# If you use the example outside of the Rerun SDK you need to specify | ||
# where the rerun_c build is to be found by setting the `RERUN_CPP_URL` variable. | ||
# This can be done by passing `-DRERUN_CPP_URL=<path to rerun_sdk_cpp zip>` to cmake. | ||
if(DEFINED RERUN_REPOSITORY) | ||
add_executable(rerun-loader-cpp-file main.cpp) | ||
rerun_strict_warning_settings(rerun-loader-cpp-file) | ||
else() | ||
project(rerun-loader-cpp-file LANGUAGES CXX) | ||
|
||
add_executable(rerun-loader-cpp-file main.cpp) | ||
|
||
# Set the path to the rerun_c build. | ||
set(RERUN_CPP_URL "https://github.com/rerun-io/rerun/releases/latest/download/rerun_cpp_sdk.zip" CACHE STRING "URL to the rerun_cpp zip.") | ||
option(RERUN_FIND_PACKAGE "Whether to use find_package to find a preinstalled rerun package (instead of using FetchContent)." OFF) | ||
|
||
if(RERUN_FIND_PACKAGE) | ||
find_package(rerun_sdk REQUIRED) | ||
else() | ||
# Download the rerun_sdk | ||
include(FetchContent) | ||
FetchContent_Declare(rerun_sdk URL ${RERUN_CPP_URL}) | ||
FetchContent_MakeAvailable(rerun_sdk) | ||
endif() | ||
|
||
# Rerun requires at least C++17, but it should be compatible with newer versions. | ||
set_property(TARGET rerun-loader-cpp-file PROPERTY CXX_STANDARD 17) | ||
endif() | ||
|
||
# Link against rerun_sdk. | ||
target_link_libraries(rerun-loader-cpp-file PRIVATE rerun_sdk) |
You should document somewhere what we expect from a loader, e.g. in a `//!`-level docstring. In particular:
- … (`--recording-id`)
- … (`stdout`)

This should also be added as a how-to guide on rerun.io/docs!
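For reference, the contract that `load_from_path` implies for the loader side can be read off the diff: the executable receives the file path as a positional argument followed by `--recording-id <id>`, is expected to write rrd data to its standard output, and can exit without writing anything if it is not interested in the file. A rough sketch of the argument handling such a loader might do is below. `LoaderArgs` and `parse_loader_args` are hypothetical names, and the actual rrd logging (which needs the Rerun SDK) is left out.

```rust
use std::path::PathBuf;

/// Hypothetical container for what the viewer passes to a loader.
#[derive(Debug, PartialEq, Eq)]
pub struct LoaderArgs {
    /// Positional argument: the file the viewer wants loaded.
    pub filepath: PathBuf,
    /// Value of `--recording-id`, if present.
    pub recording_id: Option<String>,
}

/// Parses `<filepath> [--recording-id <id>]`, in any order.
/// `args` should not include the program name.
pub fn parse_loader_args<I>(args: I) -> Result<LoaderArgs, String>
where
    I: IntoIterator<Item = String>,
{
    let mut filepath = None;
    let mut recording_id = None;

    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        if arg == "--recording-id" {
            // The flag must be followed by its value.
            let value = args.next().ok_or("`--recording-id` expects a value")?;
            recording_id = Some(value);
        } else if filepath.is_none() {
            filepath = Some(PathBuf::from(arg));
        } else {
            return Err(format!("unexpected argument: {arg}"));
        }
    }

    let filepath = filepath.ok_or("missing <filepath> argument")?;
    Ok(LoaderArgs {
        filepath,
        recording_id,
    })
}
```

In a real loader, `main` would call this with `std::env::args().skip(1)`, decide whether the file is of interest (by extension, magic bytes, or anything else), and then either log to stdout through the SDK or return early. Per the diff, a non-zero exit status makes the viewer log the loader's stderr as an error.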
I have an upcoming PR that adds a web guide; I'll add a link to that guide in the docstring then.
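Until that guide exists, the viewer side of the exchange can be sketched with the standard library alone. The snippet below is an illustration under assumptions, not the PR's implementation: `run_loader` and `LoaderOutput` are hypothetical names, and it buffers stdout in memory instead of streaming it into the rrd decoder. It also drains stderr on a separate thread, since reading stderr only after stdout reaches EOF may stall if a loader writes a lot of diagnostics.

```rust
use std::io::Read;
use std::path::Path;
use std::process::{Command, Stdio};

/// Hypothetical result type for this sketch.
#[derive(Debug)]
pub struct LoaderOutput {
    /// Raw bytes the loader wrote to stdout (rrd data in the real setup).
    pub stdout: Vec<u8>,
    /// Everything the loader wrote to stderr.
    pub stderr: String,
    /// Whether the loader exited with a success status.
    pub success: bool,
}

/// Runs `exe <filepath> --recording-id <recording_id>` and collects its output.
pub fn run_loader(
    exe: &Path,
    filepath: &Path,
    recording_id: &str,
) -> std::io::Result<LoaderOutput> {
    let mut child = Command::new(exe)
        .arg(filepath)
        .args(["--recording-id", recording_id])
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    // Drain stderr concurrently so a chatty loader cannot fill the pipe
    // and block while we are still reading its stdout.
    let mut stderr_pipe = child.stderr.take().expect("stderr was piped");
    let stderr_thread = std::thread::spawn(move || {
        let mut buf = String::new();
        stderr_pipe.read_to_string(&mut buf).ok();
        buf
    });

    // Read stdout to EOF; the PR feeds this stream to the decoder instead.
    let mut stdout = Vec::new();
    child
        .stdout
        .take()
        .expect("stdout was piped")
        .read_to_end(&mut stdout)?;

    let status = child.wait()?;
    let stderr = stderr_thread.join().unwrap_or_default();

    Ok(LoaderOutput {
        stdout,
        stderr,
        success: status.success(),
    })
}
```

An empty `stdout` together with `success == true` corresponds to the "loader was not interested in this file" case described in the diff's comments.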