Run ONNX models using the ONNX Runtime included in Windows ML

The ONNX Runtime shipped with Windows ML allows apps to run inference on ONNX models locally.

Creating an inference session

The APIs are the same as when using ONNX Runtime directly. For example, to create an inference session:

// Create inference session using compiled model
using InferenceSession session = new(compiledModelPath, sessionOptions);

We recommend reading the ONNX Runtime documentation for more information about how to use the ONNX Runtime APIs within Windows ML. Model inference code will be different for every model.
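
As an illustrative sketch only, the following shows one common inference pattern: create the session over a compiled model, feed a single float tensor, and read the first output. The input name ("input"), the tensor shape, and the compiledModelPath and sessionOptions variables are placeholders for illustration; substitute the names and shapes your model actually uses.

using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Assumed variables: compiledModelPath points at a compiled model on disk, and
// sessionOptions was configured earlier (for example, with an execution provider).
using InferenceSession session = new(compiledModelPath, sessionOptions);

// Hypothetical input: a 1x3x224x224 float tensor named "input".
// Replace the name, shape, and data with those your model expects.
var inputTensor = new DenseTensor<float>(new[] { 1, 3, 224, 224 });
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor("input", inputTensor)
};

// Run inference and read the first output as a float tensor.
using IDisposableReadOnlyCollection<DisposableNamedOnnxValue> results = session.Run(inputs);
Tensor<float> output = results.First().AsTensor<float>();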

Compile models

Before an ONNX model can be used in an inference session, it often must be compiled into an optimized representation that can be executed efficiently on the device's underlying hardware.

As of ONNX Runtime 1.22, there are new APIs that better encapsulate the compilation steps. More details are available in the ONNX Runtime compile documentation (see OrtCompileApi struct).

// Prepare compilation options
OrtModelCompilationOptions compileOptions = new(sessionOptions);
compileOptions.SetInputModelPath(modelPath);
compileOptions.SetOutputModelPath(compiledModelPath);

// Compile the model
compileOptions.CompileModel();

Note

Compilation can take several minutes to complete. To keep your app's UI responsive, consider running compilation as a background operation in your application.
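
As a sketch of that background approach, the slow compile call can be moved off the UI thread with Task.Run. The sessionOptions, modelPath, and compiledModelPath variables are assumed to be set up as shown above.

using System.Threading.Tasks;
using Microsoft.ML.OnnxRuntime;

// Sketch: run the potentially slow compile step on a thread-pool thread
// so the UI thread stays responsive.
async Task CompileModelInBackgroundAsync()
{
    await Task.Run(() =>
    {
        // sessionOptions, modelPath, and compiledModelPath are assumed
        // to be defined elsewhere in your app.
        OrtModelCompilationOptions compileOptions = new(sessionOptions);
        compileOptions.SetInputModelPath(modelPath);
        compileOptions.SetOutputModelPath(compiledModelPath);
        compileOptions.CompileModel();
    });
}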

Tip

For optimal performance, compile your models once and reuse the compiled version. Store compiled models in your app's local data folder for subsequent runs. Note that updates to the execution providers (EPs) or to the runtime might require recompiling the model.
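
A minimal sketch of that caching pattern follows. It assumes a packaged app where Windows.Storage.ApplicationData is available, and it uses a hypothetical compiled file name of model.compiled.onnx; the model is compiled only when no cached copy exists in the app's local data folder.

using System.IO;
using Microsoft.ML.OnnxRuntime;
using Windows.Storage;

// Cache the compiled model in the app's local data folder and reuse it on later runs.
// "model.compiled.onnx" is a placeholder file name; sessionOptions and modelPath
// are assumed to be set up as shown above.
string localFolder = ApplicationData.Current.LocalFolder.Path;
string compiledModelPath = Path.Combine(localFolder, "model.compiled.onnx");

if (!File.Exists(compiledModelPath))
{
    // No cached copy yet (for example, first run, or it was deleted after an
    // EP or runtime update), so compile the model once.
    OrtModelCompilationOptions compileOptions = new(sessionOptions);
    compileOptions.SetInputModelPath(modelPath);
    compileOptions.SetOutputModelPath(compiledModelPath);
    compileOptions.CompileModel();
}

// Reuse the cached compiled model for the inference session.
using InferenceSession session = new(compiledModelPath, sessionOptions);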

See also