# v2.0.0-rc.1
## Value specialization

The `Value` struct has been refactored into multiple strongly-typed structs: `Tensor<T>`, `Map<K, V>`, and `Sequence<T>`, and their type-erased variants: `DynTensor`, `DynMap`, and `DynSequence`.

Values returned by session inference are now `DynValue`s, which behave exactly the same as `Value` did in previous versions.

Tensors created from Rust, such as via the new `Tensor::new` function, can be directly and infallibly extracted into their underlying data via `extract_tensor` (no `try_`):

```rust
let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDAPinned, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;
let array = tensor.extract_tensor();
// No need to specify the type or handle errors - a Tensor<f32> can only extract into an f32 ArrayView.
```
You can still extract tensors, maps, or sequence values normally from a `DynValue` using `try_extract_*`:

```rust
let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;
```
A `DynValue` can be `downcast()` to a more specialized type, like `DynMap` or `Tensor<T>`:

```rust
let tensor: Tensor<f32> = value.downcast()?;
let map: DynMap = value.downcast()?;
```

Similarly, a strongly-typed value like `Tensor<T>` can be upcast back into a `DynValue` or `DynTensor`:

```rust
let dyn_tensor: DynTensor = tensor.upcast();
let dyn_value: DynValue = tensor.into_dyn();
```
## Tensor extraction directly returns an `ArrayView`

`extract_tensor` (and now `try_extract_tensor`) return an `ndarray::ArrayView` directly, instead of putting it behind the old `ort::Tensor<T>` type (not to be confused with the new specialized value type). This means you don't have to call `.view()` on the result:

```diff
-let generated_tokens: Tensor<f32> = outputs["output1"].extract_tensor()?;
-let generated_tokens = generated_tokens.view();
+let generated_tokens: ArrayViewD<f32> = outputs["output1"].try_extract_tensor()?;
```
## Full support for sequence & map values

You can now construct and extract `Sequence`/`Map` values.
## Value views

You can now obtain a view of any `Value` via the new `view()` and `view_mut()` functions, which operate similarly to `ndarray`'s own view system. These views can also now be passed as session inputs.
## Mutable tensor extraction

You can extract a mutable `ArrayViewMut` or `&mut [T]` from a mutable reference to a tensor:

```rust
let (raw_shape, raw_data) = tensor.extract_raw_tensor_mut();
```
## Device-allocated tensors

You can now create a tensor in device memory with `Tensor::new` & an allocator:

```rust
let allocator = Allocator::new(&session, MemoryInfo::new(AllocationDevice::CUDAPinned, 0, AllocatorType::Device, MemoryType::CPUInput)?)?;
let tensor = Tensor::<f32>::new(&allocator, [1, 128, 128, 3])?;
```

The data will be allocated on the device specified by the allocator. You can then use the new mutable tensor extraction to modify the tensor's data.
## What if custom operators were 🚀 blazingly 🔥 fast 🦀?

You can now write custom operator kernels in Rust. Check out the `custom-ops` example.
## Custom operator library feature change

Since custom operators can now be written completely in Rust, the old `custom-ops` feature, which enabled loading custom operators from an external dynamic library, has been renamed to `operator-libraries`.

Additionally, `Session::with_custom_ops_lib` has been renamed to `Session::with_operator_library`, and the confusingly named `Session::with_enable_custom_ops` (which does not enable custom operators in general, but rather attempts to load `onnxruntime-extensions`) has been renamed to `Session::with_extensions` to reflect its actual behavior.
## Asynchronous inference

`Session` introduces a new `run_async` method which returns inference results via a future. It's also cancel-safe, so you can simply cancel inference with something like `tokio::select!` or `tokio::time::timeout`.
If you have any questions about this release, we're here to help!

Love `ort`? Consider supporting us on Open Collective 💖

❤️💚💙💛