Change cmakelists and documentation
Romanov committed Jun 1, 2021
1 parent 6e9bbd7 commit ec95383
Showing 5 changed files with 26 additions and 46 deletions.
25 changes: 12 additions & 13 deletions inference-engine/samples/speech_sample/README.md
@@ -2,7 +2,7 @@

This sample demonstrates how to execute an Asynchronous Inference of acoustic model based on Kaldi\* neural networks and speech feature vectors.

- The sample works with Kaldi ARK files only, so it does not cover an end-to-end speech recognition scenario (speech to text), requiring additional preprocessing (feature extraction) to get a feature vector from a speech signal, as well as postprocessing (decoding) to produce text from scores.
+ The sample works with Kaldi ARK or Numpy* uncompressed NPZ files, so it does not cover an end-to-end speech recognition scenario (speech to text), requiring additional preprocessing (feature extraction) to get a feature vector from a speech signal, as well as postprocessing (decoding) to produce text from scores.

Automatic Speech Recognition C++ sample application demonstrates how to use the following Inference Engine C++ API in applications:

@@ -27,8 +27,8 @@ Basic Inference Engine API is covered by [Hello Classification C++ sample](../he

## How It Works

- Upon the start-up, the application reads command line parameters and loads a Kaldi-trained neural network along with Kaldi ARK speech feature vector file to the Inference Engine plugin. Then it performs inference on all speech utterances stored in the input ARK file. Context-windowed speech frames are processed in batches of 1-8
- frames according to the `-bs` parameter. Batching across utterances is not supported by this sample. When inference is done, the application creates an output ARK file. If the `-r` option is given, error
+ Upon the start-up, the application reads command line parameters, loads a specified model and input data to the Inference Engine plugin, and performs synchronous inference on all speech utterances stored in the input file. Context-windowed speech frames are processed in batches of 1-8
+ frames according to the `-bs` parameter. Batching across utterances is not supported by this sample. When inference is done, the application creates an output file. If the `-r` option is given, error
statistics are provided for each speech utterance as shown above.

You can see the explicit description of
@@ -43,7 +43,7 @@ Several parameters control neural network quantization. The `-q` flag determines
Three modes are supported:

- *static* - The first
- utterance in the input ARK file is scanned for dynamic range. The scale factor (floating point scalar multiplier) required to scale the maximum input value of the first utterance to 16384 (15 bits) is used
+ utterance in the input file is scanned for dynamic range. The scale factor (floating point scalar multiplier) required to scale the maximum input value of the first utterance to 16384 (15 bits) is used
for all subsequent inputs. The neural network is quantized to accommodate the scaled input dynamic range.
- *dynamic* - The scale factor for each input batch is computed just before inference.
- *user-defined* - The user may specify a scale factor via the `-sf` flag that will be used for static quantization.
@@ -99,17 +99,17 @@ speech_sample [OPTION]
Options:

-h Print a usage message.
- -i "<path>" Required. Paths to .ark files. Example of usage: <file1.ark,file2.ark> or <file.ark>.
+ -i "<path>" Required. Paths to input files. Example of usage: <file1.ark,file2.ark> or <file.ark> or <file.npz>.
-m "<path>" Required. Path to an .xml file with a trained model (required if -rg is missing).
- -o "<path>" Optional. Output file name to save ark scores.
+ -o "<path>" Optional. Output file name to save scores. Example of usage: <output.ark> or <output.npz>
-d "<device>" Optional. Specify a target device to infer on. CPU, GPU, MYRIAD, GNA_AUTO, GNA_HW, GNA_SW_FP32, GNA_SW_EXACT and HETERO with combination of GNA
as the primary device and CPU as a secondary (e.g. HETERO:GNA,CPU) are supported. The list of available devices is shown below. The sample will look for a suitable plugin for device specified.
-pc Optional. Enables per-layer performance report.
- -q "<mode>" Optional. Input quantization mode: "static" (default), "dynamic", or "user" (use with -sf).
+ -q "<mode>" Optional. Input quantization mode: static (default), dynamic, or user (use with -sf).
-qb "<integer>" Optional. Weight bits for quantization: 8 or 16 (default)
-sf "<double>" Optional. User-specified input scale factor for quantization (use with -q user). If the network contains multiple inputs, provide scale factors by separating them with commas.
-bs "<integer>" Optional. Batch size 1-8 (default 1)
- -r "<path>" Optional. Read reference score .ark file and compare scores.
+ -r "<path>" Optional. Read reference score file and compare scores. Example of usage: <reference.ark> or <reference.npz>
-rg "<path>" Read GNA model from file using path/filename provided (required if -m is missing).
-wg "<path>" Optional. Write GNA model to file using path/filename provided.
-we "<path>" Optional. Write GNA embedded model to file using path/filename provided.
Expand All @@ -118,10 +118,9 @@ Options:
If you use the cw_l or cw_r flag, then batch size and nthreads arguments are ignored.
-cw_r "<integer>" Optional. Number of frames for right context windows (default is 0). Works only with context window networks.
If you use the cw_r or cw_l flag, then batch size and nthreads arguments are ignored.
- -oname "<outputs>" Optional. Layer names for output blobs. The names are separated with ",". Allows to change the order of output layers for -o flag.
- Example: Output1:port,Output2:port.
- -iname "<inputs>" Optional. Layer names for input blobs. The names are separated with ",". Allows to change the order of input layers for -i flag.
- Example: Input1,Input2
+ -oname "<string>" Optional. Layer names for output blobs. The names are separated with "," Example: Output1:port,Output2:port
+ -iname "<string>" Optional. Layer names for input blobs. The names are separated with "," Example: Input1,Input2
+ -pwl_me "<double>" Optional. The maximum percent of error for PWL function. The value must be in <0, 100> range. The default value is 1.0.

Available target devices: <devices>

@@ -169,7 +168,7 @@ All of them can be downloaded from [https://storage.openvinotoolkit.org/models_c
## Sample Output
- The acoustic log likelihood sequences for all utterances are stored in the Kaldi ARK file, `scores.ark`. If the `-r` option is used, a report on the statistical score error is generated for each utterance such as
+ The acoustic log likelihood sequences for all utterances are stored in the output file, for example `scores.ark` or `scores.npz`. If the `-r` option is used, a report on the statistical score error is generated for each utterance such as
the following:
```sh
```
8 changes: 3 additions & 5 deletions inference-engine/samples/speech_sample/fileutils.cpp
@@ -31,8 +31,7 @@ void ArkFile::GetFileInfo(const char* fileName, uint32_t numArrayToFindSize, uin
}
in_file.close();
} else {
- fprintf(stderr, "Failed to open %s for reading in GetKaldiArkInfo()!\n", fileName);
- exit(-1);
+ throw std::runtime_error(std::string("Failed to open ") + fileName + " for reading in GetFileInfo()!");
}

if (ptrNumArrays != NULL)
@@ -76,8 +75,7 @@ void ArkFile::LoadFile(const char* fileName, uint32_t arrayIndex, std::string& p
}
in_file.close();
} else {
- fprintf(stderr, "Failed to open %s for reading in GetKaldiArkInfo()!\n", fileName);
- exit(-1);
+ throw std::runtime_error(std::string("Failed to open ") + fileName + " for reading in LoadFile()!");
}

*ptrNumBytesPerElement = sizeof(float);
@@ -100,7 +98,7 @@ void ArkFile::SaveFile(const char* fileName, bool shouldAppend, std::string name
out_file.write(reinterpret_cast<char*>(ptrMemory), numRows * numColumns * sizeof(float));
out_file.close();
} else {
- throw std::runtime_error(std::string("Failed to open %s for writing in SaveKaldiArkArray()!\n") + fileName);
+ throw std::runtime_error(std::string("Failed to open ") + fileName + " for writing in SaveFile()!");
}
}

6 changes: 3 additions & 3 deletions inference-engine/samples/speech_sample/speech_sample.hpp
@@ -14,7 +14,7 @@
static const char help_message[] = "Print a usage message.";

/// @brief message for images argument
- static const char input_message[] = "Required. Paths to .ark files. Example of usage: <file1.ark,file2.ark> or <file.ark>.";
+ static const char input_message[] = "Required. Paths to input files. Example of usage: <file1.ark,file2.ark> or <file.ark> or <file.npz>.";

/// @brief message for model argument
static const char model_message[] = "Required. Path to an .xml file with a trained model (required if -rg is missing).";
@@ -49,10 +49,10 @@ static const char custom_cpu_library_message[] = "Required for CPU plugin custom
"Absolute path to a shared library with the kernels implementations.";

/// @brief message for score output argument
- static const char output_message[] = "Optional. Output file name to save ark scores.";
+ static const char output_message[] = "Optional. Output file name to save scores. Example of usage: <output.ark> or <output.npz>";

/// @brief message for reference score file argument
- static const char reference_score_message[] = "Optional. Read reference score .ark file and compare scores.";
+ static const char reference_score_message[] = "Optional. Read reference score file and compare scores. Example of usage: <reference.ark> or <reference.npz>";

/// @brief message for read GNA model argument
static const char read_gna_model_message[] = "Read GNA model from file using path/filename provided (required if -m is missing).";
4 changes: 1 addition & 3 deletions thirdparty/cnpy/CMakeLists.txt
@@ -5,12 +5,10 @@ endif(COMMAND cmake_policy)

project(CNPY)

- set(BUILD_SHARED_LIBS OFF)
set(TARGET_NAME "cnpy")
- set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")
add_library(cnpy STATIC "cnpy.cpp")

- if(NOT WIN32)
+ if(NOT ${CMAKE_CXX_COMPILER_ID} STREQUAL "MSVC")
set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-all")
set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-all")
target_compile_options(${TARGET_NAME} PUBLIC -Wno-unused-variable)
29 changes: 7 additions & 22 deletions thirdparty/zlib/CMakeLists.txt
@@ -1,23 +1,16 @@
PROJECT(zlib)

if(NOT WIN32)
-     set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-all")
-     set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-all")
+     set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-all")
+     set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Wno-all")
endif()

- if (MSVC)
-     # Build with multiple processes
-     add_definitions(/MP)
-     # MSVC warning suppressions
-     add_definitions(
-         /wd4996 # The compiler encountered a deprecated declaration.
-     )
- endif (MSVC)
+ if(CMAKE_C_COMPILER_ID STREQUAL "MSVC")
+     set (CMAKE_C_FLAGS "${CMAKE_C_FLAGS} /MP /wd4996 /W3")
+ endif()

- set(BUILD_SHARED_LIBS OFF)
set(TARGET_NAME "zlib")
- include_directories("${CMAKE_CURRENT_SOURCE_DIR}/zlib")

set(lib_srcs
zlib/adler32.c
@@ -51,17 +44,9 @@ set(lib_hdrs

set(lib_ext_hdrs "zlib/zlib.h" "zlib/zconf.h")
add_library(${TARGET_NAME} STATIC ${lib_srcs} ${lib_hdrs} ${lib_ext_hdrs})
+ target_include_directories(${TARGET_NAME} PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}/zlib"
+                                                  "${CMAKE_CURRENT_SOURCE_DIR}/zlib/..")

- if(MSVC)
-     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} /W3")
- endif()

- if(UNIX)
-     if(CMAKE_COMPILER_IS_GNUCXX OR CV_ICC)
-         set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fPIC")
-     endif()
- endif()
- target_include_directories(${TARGET_NAME} PUBLIC "${CMAKE_CURRENT_SOURCE_DIR}/zlib"
-                                                  "${CMAKE_CURRENT_SOURCE_DIR}/zlib/..")

set_target_properties(zlib PROPERTIES FOLDER thirdparty)
