diff --git a/README.md b/README.md index 5854535..88b55c0 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -# AI Verification: Constrained Deep Learning [![Open in MATLAB Online](https://www.mathworks.com/images/responsive/global/open-in-matlab-online.svg)](https://matlab.mathworks.com/open/github/v1?repo=matlab-deep-learning/constrained-deep-learning) +# AI Verification: Constrained Deep Learning Constrained deep learning is an advanced approach to training deep neural networks by incorporating domain-specific constraints into the learning process. By integrating these constraints into the construction and training of neural networks, you can guarantee desirable behaviour in safety-critical scenarios where such guarantees are paramount. @@ -6,7 +6,8 @@ This project aims to develop and evaluate deep learning models that adhere to pr

- +

@@ -32,12 +33,12 @@ The repository contains several introductory, interactive examples as well as lo ### Introductory Examples (Short) Below are links for markdown versions of MATLAB Live Scripts that you can view in GitHub®. -- [Fully Input Convex Neural Networks in 1-Dimension](examples/convex/introductory/PoC_Ex1_1DFICNN.md) -- [Fully Input Convex Neural Networks in n-Dimensions](examples/convex/introductory/PoC_Ex2_nDFICNN.md) -- [Partially Input Convex Neural Networks in n-Dimensions](examples/convex/introductory/PoC_Ex3_nDPICNN.md) -- [Fully Input Monotonic Neural Networks in 1-Dimension](examples/monotonic/introductory/PoC_Ex1_1DFMNN.md) -- [Fully Input Monotonic Neural Networks in n-Dimensions](examples/monotonic/introductory/PoC_Ex2_nDFMNN.md) -- [Lipschitz Continuous Neural Networks in 1-Dimension](examples/lipschitz/introductory/PoC_Ex1_1DLNN.md) +- [Fully input convex neural networks in 1-dimension](examples/convex/introductory/PoC_Ex1_1DFICNN.md) +- [Fully input convex neural networks in n-dimensions](examples/convex/introductory/PoC_Ex2_nDFICNN.md) +- [Partially input convex neural networks in n-dimensions](examples/convex/introductory/PoC_Ex3_nDPICNN.md) +- [Fully input monotonic neural networks in 1-dimension](examples/monotonic/introductory/PoC_Ex1_1DFMNN.md) +- [Fully input monotonic neural networks in n-dimensions](examples/monotonic/introductory/PoC_Ex2_nDFMNN.md) +- [Lipschitz continuous neural networks in 1-dimensions](examples/lipschitz/introductory/PoC_Ex1_1DLNN.md) These examples make use of [custom training loops](https://uk.mathworks.com/help/deeplearning/deep-learning-custom-training-loops.html) and the [`arrayDatastore`](https://uk.mathworks.com/help/matlab/ref/matlab.io.datastore.arraydatastore.html) object. To learn more, click the links. @@ -70,13 +71,7 @@ As discussed in [1] (see 3.4.1.5), in certain situations, small violations in th ## Technical Articles -This repository focuses on the development and evaluation of deep learning models that adhere to constraints crucial for safety-critical applications, such as predictive maintenance for industrial machinery and equipment. Specifically, it focuses on enforcing monotonicity, convexity, and Lipschitz continuity within neural networks to ensure predictable and controlled behavior. - -By emphasizing constraints like monotonicity, constrained neural networks ensure that predictions of the Remaining Useful Life (RUL) of components behave intuitively: as a machine's condition deteriorates, the estimated RUL should monotonically decrease. This is crucial in applications like aerospace or manufacturing, where an accurate and reliable estimation of RUL can prevent failures and save costs. - -Alongside monotonicity, Lipschitz continuity is also enforced to guarantee model robustness and controlled behavior. This is essential in environments where safety and precision are paramount such as control systems in autonomous vehicles or precision equipment in healthcare. - -Convexity is especially beneficial for control systems as it inherently provides boundedness properties. For instance, by ensuring that the output of a neural network lies within a convex hull, it is possible to guarantee that the control commands remain within a safe and predefined operational space, preventing erratic or unsafe system behaviors. This boundedness property, derived from the convex nature of the model's output space, is critical for maintaining the integrity and safety of control systems under various conditions. 
+This repository focuses on the development and evaluation of deep learning models that adhere to constraints crucial for safety-critical applications, such as predictive maintenance for industrial machinery and equipment. Specifically, it focuses on enforcing monotonicity, convexity, and Lipschitz continuity within neural networks to ensure predictable and controlled behavior. By emphasizing constraints like monotonicity, constrained neural networks ensure that predictions of the Remaining Useful Life (RUL) of components behave intuitively: as a machine's condition deteriorates, the estimated RUL should monotonically decrease. This is crucial in applications like aerospace or manufacturing, where an accurate and reliable estimation of RUL can prevent failures and save costs. Alongside monotonicity, Lipschitz continuity is also enforced to guarantee model robustness and controlled behavior. This is essential in environments where safety and precision are paramount such as control systems in autonomous vehicles or precision equipment in healthcare. Convexity is especially beneficial for control systems as it inherently provides boundedness properties. For instance, by ensuring that the output of a neural network lies within a convex hull, it is possible to guarantee that the control commands remain within a safe and predefined operational space, preventing erratic or unsafe system behaviors. This boundedness property, derived from the convex nature of the model's output space, is critical for maintaining the integrity and safety of control systems under various conditions. These technical articles explain key concepts of AI verification in the context of constrained deep learning. They include discussions on how to achieve the specified constraints in neural networks at construction and training time, as well as deriving and proving useful properties of constrained networks in AI verification applications. It is not necessary to go through these articles in order to explore this repository, however, you can find references and more in depth discussion here. @@ -90,4 +85,4 @@ These technical articles explain key concepts of AI verification in the context - [3] Gouk, Henry, et al. “Regularisation of Neural Networks by Enforcing Lipschitz Continuity.” Machine Learning, vol. 110, no. 2, Feb. 2021, pp. 393–416. DOI.org (Crossref), https://doi.org/10.1007/s10994-020-05929-w - [4] Kitouni, Ouail, et al. Expressive Monotonic Neural Networks. arXiv:2307.07512, arXiv, 14 July 2023. arXiv.org, http://arxiv.org/abs/2307.07512. -Copyright 2024, The MathWorks, Inc. +Copyright (c) 2024, The MathWorks, Inc. diff --git a/conslearn/+conslearn/+convex/buildFICNN.m b/conslearn/+conslearn/+convex/buildFICNN.m index b141ce4..3a20e70 100644 --- a/conslearn/+conslearn/+convex/buildFICNN.m +++ b/conslearn/+conslearn/+convex/buildFICNN.m @@ -13,13 +13,13 @@ % % BUILDFICNN name-value arguments: % -% 'PositiveNonDecreasingActivation' - Specify the positive, convex, +% 'ConvexNonDecreasingActivation' - Specify the convex, % non-decreasing activation functions. % The options are 'softplus' or 'relu'. % The default is 'softplus'. % % The construction of this network corresponds to Eq 2 in [1] with the -% exception that the application of the positive, non-decreasing activation +% exception that the application of the convex, non-decreasing activation % function on the network output is not applied. This maintains convexity % but permits positive and negative network outputs. 
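For reference, a minimal usage sketch of the renamed name-value argument. The call below mirrors the introductory examples later in this diff; the layer sizes are illustrative assumptions rather than recommended values.

```matlab
% Build a small fully input convex network using the renamed
% ConvexNonDecreasingActivation option (formerly PositiveNonDecreasingActivation).
inputSize = 1;                       % illustrative sizes only
numHiddenUnits = [16 8 4 1];
ficnnet = buildConstrainedNetwork("fully-convex", inputSize, numHiddenUnits, ...
    ConvexNonDecreasingActivation="relu");
```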
% @@ -31,7 +31,7 @@ arguments inputSize (1,:) numHiddenUnits (1,:) - options.PositiveNonDecreasingActivation = 'softplus' + options.ConvexNonDecreasingActivation = 'softplus' end % Construct the correct input layer @@ -43,7 +43,7 @@ end % Loop over construction of hidden units -switch options.PositiveNonDecreasingActivation +switch options.ConvexNonDecreasingActivation case 'relu' pndFcn = @(k)reluLayer(Name="pnd_" + k); case 'softplus' @@ -68,10 +68,10 @@ % Add a cascading residual connection for ii = 2:depth - tempLayers = fullyConnectedLayer(numHiddenUnits(ii),Name="fc_y_+_" + ii); + tempLayers = fullyConnectedLayer(numHiddenUnits(ii),Name="fc_y_" + ii); lgraph = addLayers(lgraph,tempLayers); - lgraph = connectLayers(lgraph,"input","fc_y_+_" + ii); - lgraph = connectLayers(lgraph,"fc_y_+_" + ii,"add_" + ii + "/in2"); + lgraph = connectLayers(lgraph,"input","fc_y_" + ii); + lgraph = connectLayers(lgraph,"fc_y_" + ii,"add_" + ii + "/in2"); end % Initialize dlnetwork diff --git a/conslearn/+conslearn/+convex/buildPICNN.m b/conslearn/+conslearn/+convex/buildPICNN.m index 37384d5..ed9032a 100644 --- a/conslearn/+conslearn/+convex/buildPICNN.m +++ b/conslearn/+conslearn/+convex/buildPICNN.m @@ -13,7 +13,7 @@ % % BUILDPICNN name-value arguments: % -% 'PositiveNonDecreasingActivation' - Specify the positive, convex, +% 'ConvexNonDecreasingActivation' - Specify the convex, % non-decreasing activation functions. % The options are 'softplus' or 'relu'. % The default is 'softplus'. @@ -32,7 +32,7 @@ % default value is 1. % % The construction of this network corresponds to Eq 3 in [1] with the -% exception that the application of the positive, non-decreasing activation +% exception that the application of the convex, non-decreasing activation % function on the network output is not applied. This maintains convexity % but permits positive and negative network outputs. Additionally, and in % keeping with the notation used in the reference, in this implementation @@ -50,7 +50,7 @@ arguments inputSize (1,:) {iValidateInputSize(inputSize)} numHiddenUnits (1,:) - options.PositiveNonDecreasingActivation = 'softplus' + options.ConvexNonDecreasingActivation = 'softplus' options.Activation = 'tanh' options.ConvexChannelIdx = 1 end @@ -63,7 +63,7 @@ convexInputSize = numel(convexChannels); % Prepare the two types of valid activation functions -switch options.PositiveNonDecreasingActivation +switch options.ConvexNonDecreasingActivation case 'relu' pndFcn = @(k)reluLayer(Name="pnd_" + k); case 'softplus' diff --git a/conslearn/buildConstrainedNetwork.m b/conslearn/buildConstrainedNetwork.m index a85e264..1a3d852 100644 --- a/conslearn/buildConstrainedNetwork.m +++ b/conslearn/buildConstrainedNetwork.m @@ -19,7 +19,7 @@ % % These options and default values apply to convex constrained networks: % -% PositiveNonDecreasingActivation - Positive, convex, non-decreasing +% ConvexNonDecreasingActivation - Convex, non-decreasing % ("fully-convex") activation functions. % ("partially-convex") The options are "softplus" or "relu". % The default is "softplus". @@ -96,10 +96,10 @@ iValidateInputSize(inputSize)} numHiddenUnits (1,:) {mustBeInteger,mustBeReal,mustBePositive} % Convex - options.PositiveNonDecreasingActivation {... + options.ConvexNonDecreasingActivation {... mustBeTextScalar, ... - mustBeMember(options.PositiveNonDecreasingActivation,["relu","softplus"]),... 
- iValidateConstraintWithPositiveNonDecreasingActivation(options.PositiveNonDecreasingActivation, constraint)} + mustBeMember(options.ConvexNonDecreasingActivation,["relu","softplus"]),... + iValidateConstraintWithConvexNonDecreasingActivation(options.ConvexNonDecreasingActivation, constraint)} options.ConvexChannelIdx (1,:) {... iValidateConstraintWithConvexChannelIdx(options.ConvexChannelIdx, inputSize, constraint), ... mustBeNumeric,mustBePositive,mustBeInteger} @@ -131,15 +131,15 @@ switch constraint case "fully-convex" % Set defaults - if ~any(fields(options) == "PositiveNonDecreasingActivation") - options.PositiveNonDecreasingActivation = "softplus"; + if ~any(fields(options) == "ConvexNonDecreasingActivation") + options.ConvexNonDecreasingActivation = "softplus"; end net = conslearn.convex.buildFICNN(inputSize, numHiddenUnits, ... - PositiveNonDecreasingActivation=options.PositiveNonDecreasingActivation); + ConvexNonDecreasingActivation=options.ConvexNonDecreasingActivation); case "partially-convex" % Set defaults - if ~any(fields(options) == "PositiveNonDecreasingActivation") - options.PositiveNonDecreasingActivation = "softplus"; + if ~any(fields(options) == "ConvexNonDecreasingActivation") + options.ConvexNonDecreasingActivation = "softplus"; end if ~any(fields(options) == "Activation") options.Activation = "tanh"; @@ -148,7 +148,7 @@ options.ConvexChannelIdx = 1; end net = conslearn.convex.buildPICNN(inputSize, numHiddenUnits,... - PositiveNonDecreasingActivation=options.PositiveNonDecreasingActivation,... + ConvexNonDecreasingActivation=options.ConvexNonDecreasingActivation,... Activation=options.Activation,... ConvexChannelIdx=options.ConvexChannelIdx); case "fully-monotonic" @@ -259,9 +259,9 @@ function iValidateConstraintWithMonotonicTrend(param, constraint) end end -function iValidateConstraintWithPositiveNonDecreasingActivation(param, constraint) +function iValidateConstraintWithConvexNonDecreasingActivation(param, constraint) if ( ~isequal(constraint, "fully-convex") && ~isequal(constraint,"partially-convex") ) && ~isempty(param) - error("'PositiveNonDecreasingActivation' is not an option for constraint " + constraint); + error("'ConvexNonDecreasingActivation' is not an option for constraint " + constraint); end end diff --git a/conslearn/trainConstrainedNetwork.m b/conslearn/trainConstrainedNetwork.m index e627007..15c7e92 100644 --- a/conslearn/trainConstrainedNetwork.m +++ b/conslearn/trainConstrainedNetwork.m @@ -167,6 +167,16 @@ end end end + +% Update the training monitor status +if trainingOptions.TrainingMonitor + if monitor.Stop == 1 + monitor.Status = "Training stopped"; + else + monitor.Status = "Training complete"; + end +end + end %% Helpers diff --git a/documentation/AI-Verification-Convexity.md b/documentation/AI-Verification-Convexity.md index bd58099..db8bf4f 100644 --- a/documentation/AI-Verification-Convexity.md +++ b/documentation/AI-Verification-Convexity.md @@ -36,7 +36,7 @@ and remain within the set. A function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is convex on $S\subset \mathbb{R}^n$ provided $S$ is a convex set, and for any $\lambda\in[0, 1]$, the following holds: -$$ f((1−\lambda)x+\lambda y) \leq (1−\lambda)f(x)+ \lambda f(y) $$ +$f((1−\lambda)x+\lambda y) \leq (1−\lambda)f(x)+ \lambda f(y)$ This means that the line segment connecting any two points on the graph of the function lies above or on the graph. 
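As a quick numeric illustration of the convexity inequality above, the following sketch checks it for a single pair of points; the quadratic is an arbitrary stand-in for a convex function.

```matlab
% Check f((1-lambda)*x + lambda*y) <= (1-lambda)*f(x) + lambda*f(y)
f = @(x) x.^2;                       % arbitrary known-convex function (assumption)
x = -1; y = 2; lambda = 0.3;
lhs = f((1-lambda)*x + lambda*y);
rhs = (1-lambda)*f(x) + lambda*f(y);
assert(lhs <= rhs)                   % the chord lies on or above the graph
```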
@@ -90,9 +90,9 @@ This means that if you take any two inputs to the network and any convex combination of them, then the resulting outputs will respect the convexity inequality. -The recurrence equation defined in Eq. 2 in [1] gives a fully input convex neural network 'k-layer' architecture and is transcribed here for brevity: +The recurrence equation defined in Eq. 2 in [1] gives a fully input convex neural network '$k$-layer' architecture and is transcribed here for brevity: -$$ z_{i+1} = g_i (W_i^{(z)}z_i + W_i^{(y)} + b_i) $$ +$ z_{i+1} = g_i (W_i^{(z)}z_i + W_i^{(y)} y + b_i) $ Here, the network input is denoted $y$, $z_0,W_0^{(z)}=0$, and $g_i$ is an activation function. You can view a ‘2-layer’ FICNN architecture in Figure 3. @@ -105,14 +105,14 @@ architecture in Figure 3. To guarantee convexity of the network, FICNNs require activation -functions $g_i$ that are positive and non-decreasing. For example, see the -positive, non-decreasing relu layer “pnd\_1” in Fig 3. Another common +functions $g_i$ that are convex and non-decreasing. For example, see the +convex, non-decreasing relu layer “pnd\_1” in Fig 3. Another common choice of activation function is the softplus function. Additionally, -the weights in certain parts of the network, particularly those -associated with the input or the input's interaction with latent layers, -are constrained to be non-negative to maintain the convexity property. -In the figure above, the weight matrices for the fully connected layers -“fc\_z\_+\_2” and “fc\_y\_+\_2” are constrained to be positive (as +the weights of all fully-connected layers, except those directly connected +to the input, must be constrained to be non-negative to preserve the +convexity property. +In the figure above, the weight matrix for the fully connected layer +“fc\_z\_+\_2” is constrained to be positive (as indicated by the “\_+\_” in the layer name). Note that in this implementation, the final activation function, $g_k$, is not applied. This still guarantees convexity but removes the restriction that outputs of the network must be non-negative. **Partially Input Convex Neural Network (PICNN)** @@ -141,13 +141,13 @@ Here, $\tilde{g}_i$ is any activation function, $u_0=x$ where $x$ are the set of To guarantee convexity of the network, PICNNs require activation -functions in the $z$ ‘output’ evolution to be positive and +functions in the $z$ ‘output’ evolution to be convex and non-decreasing (see layer “pnd\_0” in Fig 4), but allow freedom for activation functions evolving the state, such as $tanh$  activation layers (see layer “nca\_0” in Fig 4). As with FICNNs, the weights in certain parts of the network are constrained to be non-negative to maintain the partial convexity property. In the figure above, the weight matrix for -the fully connected layer “fc\_z\_+\_1” are constrained to be positive +the fully connected layer “fc\_z\_+\_1” is constrained to be positive (as indicated by the “\_+\_” in the layer name). All other fully connected weight matrices in Fig 4 are unconstrained, giving freedom to fit any purely feedforward network – see proposition 2 [1]. Note again that in our implementation, the final activation function, $g_k$, is not applied. This still guarantees partial convexity but removes the restriction that outputs of the network must be non-negative. @@ -176,11 +176,11 @@ discussed above. 
**One-Dimensional ICNN** Recall that a function $f:\mathbb{R}\rightarrow\mathbb{R}$ is convex on $S\subset \mathbb{R}$ provided $S$ is a convex set and if for all $x,y\in S$ and for any $\lambda\in[0, 1]$, the following inequality holds, -$f((1−\lambda)x+\lambda y) \leq (1−\lambda)f(x)+ \lambda f(y)$. Intervals are convex sets in $\mathbb{R}$ and it immediately follows from the definition of convexity that for $S = [a,b]$, the upper bound on the interval is, +$f((1−\lambda)x+\lambda y) \leq (1−\lambda)f(x)+ \lambda f(y)$. Intervals are convex sets in $\mathbb{R}$ and it immediately follows from the definition of convexity that for $S = [a,b]$, the upper bound on the interval is, -$$ f(x) \leq max(f(a),f(b)) $$ +$ f(x) \leq max(f(a),f(b))$ -To find the minimum of $f$ on the interval, you could use an optimization routine, such as projected gradient descent, interior-point +To find the minimum of $f$ on the interval, you could use an optimization routine, such as projected gradient descent, interior-point methods, or barrier methods. However, you can use the properties of convex functions to accelerate the search in certain scenarios. @@ -190,15 +190,15 @@ If $f(a) \gt f(b)$, then either the minimum is at $x=b$ or the minimum lies strictly in the interior of the interval, $x \in (a,b)$. To assess whether the minimum is at $x=b$, look at the derivative, $\nabla f(x)$, at the interval bounds. If $f$ is not differentiable at the interval bounds, for example the network has relu activation -functions that define a set of non-differentiable points in $\mathbb{R}$, evaluate +functions that define a set of non-differentiable points in $\mathbb{R}$, evaluate both the left and right derivative of $f$ at the interval bounds instead. Then examine the sign of the directional derivatives at the interval bounds, -directed to the interior of the interval: $sgn( \nabla f(a), -\nabla f(b) ) = (\pm , \pm)$. Note that the sign of 0 is taken as positive in this discussion. +directed to the interior of the interval: $sgn(\nabla f(a), -\nabla f(b)) = (\pm,\pm)$. Note that the sign of 0 is taken as positive in this discussion. If $f$ is differentiable at the interval bounds, then there are two possible sign -combinations since $\nabla f(a) \leq m \lt 0$ where $m$ is the gradient of the chord. +combinations since $ \nabla f(a) \leq m \lt 0 $ where $m$ is the gradient of the chord. -- $sgn(\nabla f(a), -\nabla f(b)) = (−,+)$, then the minimum must lie at $x = b$, i.e., $f(x) \geq f(b)$. +- $sgn(\nabla f(a), -\nabla f(b)) = (−,+)$, then the minimum must lie at $x = b$, i.e., $f(x) \geq f(b)$. - $sgn(\nabla f(a), -\nabla f(b)) = (-,−)$, then the minimum must lie in the interior of the interval, $x \in (a,b)$. If $f$ is not differentiable at the interval bounds, then there are still two @@ -240,7 +240,7 @@ possible sign combinations since, at $x=b$, convexity means that $-\nabla f(b+\e In the case that $f(a) = f(b)$, the function must either be constant and the minimum is $f(a) = f(b)$. Or the minimum again -lies in the interior. If $sgn(\nabla f(a)) = +$, then $\nabla f(a) = 0$ else this violates convexity since $f(a) = f(b)$. Similar is true for +lies in the interior. If $sgn(\nabla f(a)) = +$, then $\nabla f(a) = 0$ else this violates convexity since $f(a) = f(b)$. Similar is true for $-sgn(\nabla f(b)) = +$. In this case, all sign combinations are possible owing to possible non-differentiability of $f$ at the interval bounds: @@ -262,8 +262,8 @@ convex functions. This idea can be extended to many intervals. 
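A minimal sketch of the interval-bound logic described above for a differentiable 1-D convex function; the quadratic and the use of fminbnd are illustrative assumptions, not the routine used by this repository.

```matlab
% Upper and lower bounds of a differentiable convex f on [a,b] using only
% endpoint values and endpoint derivatives, falling back to optimization
% when the minimum is interior.
f  = @(x) (x - 0.2).^2;              % stand-in for a 1-D convex network
df = @(x) 2*(x - 0.2);               % its derivative
a = -1; b = 1;

upperBound = max(f(a), f(b));        % f(x) <= max(f(a), f(b)) on [a,b]

if df(a) >= 0                        % inward derivative at a (sign of 0 taken as +)
    lowerBound = f(a);
elseif -df(b) >= 0                   % inward derivative at b
    lowerBound = f(b);
else
    lowerBound = f(fminbnd(f, a, b));    % minimum lies in the interior
end
```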
Take a 1-dimensional ICNN. Consider subdividing the operational design domain into a union of intervals $I_i$, where $I_i = [a_i,a_{i+1}]$ and $a_i \lt a_{i+1}$. A tight lower and upper bound on each interval can be computed with a -single forward pass through the network of all interval boundary values in the union of intervals, a -single backward pass through the network to compute derivatives at the interval boundary values, and +single forward pass through the network of all interval bound values in the union of intervals, a +single backward pass through the network to compute derivatives at the interval bound values, and one final convex optimization on the interval containing the global minimum. Furthermore, since bounds are computed at forward and backward passes through the network, you can compute a 'boundedness metric' during @@ -279,35 +279,36 @@ and $sgn(0) = +$. The previous discussion focused on 1-dimensional convex functions, however, this idea extends to n-dimensional convex functions, $f:\mathbb{R}^n \rightarrow \mathbb{R}$. Note that a vector valued convex function is convex in each output, so it is sufficient to keep the target as $\mathbb{R}$. In the discussion in this section, take the convex set to be the n-dimensional hypercube, $H_n$, with vertices, $V_n = {(\pm 1,\pm 1, \dots,\pm 1)}$. General convex hulls will be discussed later. -An important property of convex functions in n-dimensions is that every 1-dimensional restriction also defines a convex function. This is easily seen from the -definition. Define $g:\mathbb{R} \rightarrow \mathbb{R}$ as $g(t) = f(t\hat{n}) \text{ where } \hat{n}$ is +An important property of convex functions in n-dimensions is that every 1-dimensional restriction also defines a convex function. This is easily seen from the +definition. Define $g:\mathbb{R} \rightarrow \mathbb{R}$ as $g(t) = f(t\hat{n})$ where $\hat{n}$ is some unit vector in $\mathbb{R}^n$. Then, by definition of convexity of $f$, letting $x = t\hat{n}$ and $y = t'\hat{n}$, it follows that, -$$ g((1−\lambda)t+\lambda t') \leq (1−\lambda)g(t)+ \lambda g(t') $$ +$g((1−\lambda)t+\lambda t') \leq (1−\lambda)g(t)+ \lambda g(t')$ -Note that the restriction to 1-dimensional convex functions will be used several times in the following discussion. +Note that the restriction to 1-dimensional convex functions will be used several times in the following discussion. To determine an upper bound of $f$ on the hypercube, note that any point in $H_n$ can be expressed as a convex combination of its vertices, i.e., for $z \in H_n$, it follows that $z = \sum_i \lambda_i v_i$ where $\sum_i \lambda_i = 1$ and $v_i \in V_n$. Therefore, using the definition of convexity in the first inequality and that $\lambda_i \leq 1$ in the second inequality, -$$ f(z) = f(\sum_i \lambda_i v_i) \leq \sum \lambda_i f(v_i) \leq \underset{v \in V_n}{\text{max }} f(v) $$ +$ f(z) = f(\sum_i \lambda_i v_i) \leq \sum \lambda_i f(v_i) \leq \underset{v \in V_n}{\text{max }} f(v) $. -Consider now the lower bound of $f$ over a hypercubic grid. Here we take the -approach of looking for hypercubes where there is a guarantee that the -minimum lies at a vertex of the hypercube and when this guarantee is not met, fall back to solving the convex optimization over that particular -hypercubic. For the n-dimensional approach, we will split the +Consider now the lower bound of $f$ over the hypercube. 
Here we take the +approach of looking for cases where there is a guarantee that the +minimum lies at a vertex of the hypercube and when this guarantee cannot +be met, falling back to solving the convex optimization over this +hypercubic domain. For the n-dimensional approach, we will split the discussion into differentiable and non-differentiable $f$, and consider these separately. **Multi-Dimensional Differentiable Convex Functions** -Consider the derivatives evaluated at each vertex of a hypercube. For each $\nabla f(v)$, $v \in V_n$, take the directional derivatives, +Consider the derivatives evaluated at each vertex of the hypercube. For each $\nabla f(v)$, $v \in V_n$, take the directional derivatives, pointing inward along a hypercubic edge. Without loss of generality, recall $V_n = \{(±1,±1,…,±1) \in \mathbb{R}^n\}$ and therefore the hypercube is aligned along the standard basis vectors $e_i$. The $\text{i}^{\text{th}}$-directional derivative, pointing inward, is defined as, -$$ −sgn(v_i)e_i\cdot \nabla f(v) e_i = −sgn(v_i) \nabla_i f(v) $$ +$ −sgn(v_i)e_i\cdot \nabla f(v) e_i = −sgn(v_i) \nabla_i f(v)$ where $sgn(v_i)$ denotes the sign of $\text{i}^{\text{th}}$ component of the vertex $v$, and the minus ensures the directional @@ -321,7 +322,8 @@ construction on a cube.
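Referring back to the vertex-maximum upper bound derived earlier, the bound is cheap to evaluate in practice; a small sketch with an illustrative convex function follows.

```matlab
% Upper bound of a convex f over the hypercube [-1,1]^n from its 2^n vertices.
f = @(x) sum(x.^2, 1);                   % illustrative convex function, columns are points
n = 3;
V = 2*((dec2bin(0:2^n-1) - '0')') - 1;   % n-by-2^n matrix of vertices (+/-1 entries)
upperBound = max(f(V));                  % f(z) <= max over vertices for any z in the cube
```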

-Analogous to the 1-dimensional case, analyze the signatures of the derivatives at the vertices. The notation $(\pm,\pm,…,\pm)_v$ denotes the overall sign of $−sgn(v_i)\nabla_i f(v)$ at $v$ for each $i$, and is used in the rest of this article. +Analogous to the 1-dimensional case, analyze the +signatures of the derivatives at the vertices. The notation $(\pm,...,\pm)_v $ denotes the overall sign of $−sgn(v_i)\nabla_i f(v)$ at $v$ for each $i$, and is used in the rest of this article. **Lemma**: @@ -337,10 +339,12 @@ vector in direction $z-w$. Since the directional derivatives at $w$ pointing inwards are all positive, and $f$ is differentiable, the derivative along the line at $w$, pointing inwards, is given by, -$$ \hat{n} \cdot \nabla f(w) = \sum_i -|n_i|\cdot sgn(w_i) \cdot \nabla_i f(w) = \sum_i |n_i| \cdot (-sgn(w_i) \cdot \nabla_i f(w)) \geq 0 $$ +$ \hat{n} \cdot \nabla f(w) = \sum_i -|n_i|\cdot sgn(w_i) \cdot \nabla_i f(w) = \sum_i |n_i| \cdot (-sgn(w_i) \cdot \nabla_i f(w)) \geq 0 $ -and is positive, as $\hat{n} = - |n_i| \cdot sgn(w_i) \cdot e_i $. -The properties proved previously can then be applied to this 1-dimensional restriction. Hence, a vertex with inward +is positive, as $\hat{n} = - |n_i| \cdot sgn(w_i) \cdot e_i $. +The properties proved previously can then be applied to this 1-dimensional restriction, i.e., if the +gradient of $f$ at an interval bound, directed into the interval, is positive, then $f$ has +a minimum value at this interval bound. Hence, a vertex with inward directional derivative signature $(+,+,…,+)$ is a lower bound for $f$ over the hypercube. ◼ If there are multiple vertices sharing this signature, then since every @@ -351,9 +355,9 @@ at vertices sharing these signatures so it is sufficient to select any. If no vertex has signature $(+,+,…,+)$, solve for the minimum using a convex optimization routine over this hypercube. Since all local minima are -global minima, there is at least one hypercube requiring this approach. +global minima, there is at least one hypercube requiring this solution. If the function has a flat section at its minima, there may be other -hypercubes, also without a vertex with all positive signature. Note that empirically, +hypercubes in the operational design domain, also without a vertex with all positive signature. Note that empirically, this seldom happens for convex neural networks as it requires fine tuning of the parameters to create such a landscape. @@ -377,7 +381,7 @@ As depicted in figure 7, the vertices $w$ of the square (hypercube of dimension bisecting these directional derivatives, into the interior of the square, has a negative gradient. This is because the vertex is at the intersection of two planes and is a non-differentiable point, so the derivative through this point is path -dependent. This is a well-known property of non-differentiable functions and breaks the assertion that this vertex is the minimum of $f$ over this +dependent. This is a well-known observation, but it breaks the assertion that this vertex is the minimum of $f$ over this square region. From this example, it is clear the minimum lies at the apex at $(0,0)$. To ameliorate this issue, in the case that the convex function is @@ -388,13 +392,17 @@ $relu$ operations. In practice, this means that a vertex may be a non-differentiable point if the network has pre-activations to $relu$ layers that have exact zeros. In practice, this is seldom the case. 
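A compact sketch of the vertex signature test from the lemma above, with a fallback to a constrained solver when no vertex has the all-positive signature; the function, gradient, and use of fmincon (Optimization Toolbox) are illustrative assumptions rather than this repository's implementation.

```matlab
% Lower bound of a differentiable convex f over the hypercube [-1,1]^n.
f     = @(x) sum((x - 0.3).^2);          % stand-in convex function (column input)
gradf = @(x) 2*(x - 0.3);                % its gradient
n = 2;
V = 2*((dec2bin(0:2^n-1) - '0')') - 1;   % vertices as columns

lowerBound = [];
for k = 1:size(V, 2)
    v = V(:, k);
    inward = -sign(v).*gradf(v);         % inward directional derivatives at vertex v
    if all(inward >= 0)                  % signature (+,+,...,+), sign of 0 taken as +
        lowerBound = f(v);               % this vertex attains the lower bound
        break
    end
end
if isempty(lowerBound)                   % minimum is interior: solve the convex problem
    xmin = fmincon(f, zeros(n,1), [], [], [], [], -ones(n,1), ones(n,1));
    lowerBound = f(xmin);
end
```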
The probability of this occurring can be further reduced by offsetting any -hypercube or hypercubic grid origin by a small random perturbation. If there are -any zeros in these pre-activations, lower bounds for hypercubes that contain that vertex can be recomputed using -a convex optimization routine instead. +hypercube or hypercubic grid origin by a small random perturbation. It +is assumed, for efficiency of computing bounds during training, that the convex neural network is differentiable everywhere. For final post-training analysis, this implementation checks the $relu$ +pre-activations for any exact zeros for all vertices. If there are +any zeros in these pre-activations, lower bounds for hypercubes that contain that vertex are recomputed using +a minimization routine. As a demonstration that these bounds are +correct, in the examples, we also run the minimization routine on every +hypercube to show that the bounds agree. As a final comment, for general convex hulls, the argument for the upper bound value of the function over the convex hull trivially extends, defined as the largest function value over the set of points defining the hull. The lower bound should be determined using an optimization routine, constrained to the set of points in the convex hull. **References** - [1] Amos, Brandon, et al. Input Convex Neural Networks. arXiv:1609.07152, arXiv, 14 June 2017. arXiv.org, https://doi.org/10.48550/arXiv.1609.07152. -- [2] Ławryńczuk, Maciej. “Input Convex Neural Networks in Nonlinear Predictive Control: A Multi-Model Approach.” Neurocomputing, vol. 513, Nov. 2022, pp. 273–93. ScienceDirect, https://doi.org/10.1016/j.neucom.2022.09.108. +- [2] Ławryńczuk, Maciej. “Input Convex Neural Networks in Nonlinear Predictive Control: A Multi-Model Approach.” Neurocomputing, vol. 513, Nov. 2022, pp. 273–93. ScienceDirect, https://doi.org/10.1016/j.neucom.2022.09.108. \ No newline at end of file diff --git a/documentation/AI-Verification-Lipschitz.md b/documentation/AI-Verification-Lipschitz.md index 24864bd..ff9b3be 100644 --- a/documentation/AI-Verification-Lipschitz.md +++ b/documentation/AI-Verification-Lipschitz.md @@ -4,7 +4,7 @@ In the field of deep learning, neural networks have demonstrated remarkable succ Lipschitz continuity is a mathematical concept that describes the rate at which a function's output can change with respect to changes in its input. Formally, a function $f:X \rightarrow Y$ is said to be Lipschitz continuous if there exists a constant $\lambda \geq 0$ such that for all $x_1$ and $x_2$ in the domain *X*, the following inequality holds: -$$|f(x_1) - f(x_2)| \leq \lambda |x_1 - x_2|$$ +$|f(x_1) - f(x_2)| \leq \lambda |x_1 - x_2|$ Here, $\lambda$ is referred to as the Lipschitz constant. It essentially bounds the gradient (or the steepness) of the function, ensuring that the output does not change too dramatically for small changes in the input. @@ -18,19 +18,19 @@ Enforcing Lipschitz continuity in neural networks is not straightforward. Calcul The choice of the p-norm in the context of Lipschitz constraints has a significant impact on the way distances are measured between points and consequently how to define and enforce Lipschitz continuity in neural networks. The p-norm (or Lp norm) is a generalization of the Euclidean distance and is defined for a vector *x* in a real or complex space as: -$$||x||_p = (|x_1|^p + |x_2|^p + ... 
+ |x_n|^p)^{(1/p)}$ where $|x_i|$ denotes the absolute value of the i-th component of the vector *x*, and $p \geq 1$. When talking about Lipschitz continuity using a p-norm, it corresponds to the inequality: -$$||f(x_1) - f(x_2)||_p \leq \lambda_p ||x_1 - x_2||_p$$ +$||f(x_1) - f(x_2)||_p \leq \lambda_p ||x_1 - x_2||_p$ where *f* is the function representing the neural network, and $\lambda_p$ is the Lipschitz constant for the choice of norm *p*. This choice of *p* determines the geometry of the space in which to measure the distances and can have several implications: - $\ell_1$-Norm (Manhattan Distance) -When $p = 1$, the $\ell_1$-norm sums the absolute values of the components of the vector. This norm is less sensitive to outliers than the $\ell_2$-norm and can lead to sparser solutions in optimization problems. In the context of Lipschitz continuity, using the 1-norm can result in a model that is robust to small changes in many input dimensions simultaneously. +When `p = 1`, the $\ell_1$-norm sums the absolute values of the components of the vector. This norm is less sensitive to outliers than the $\ell_2$-norm and can lead to sparser solutions in optimization problems. In the context of Lipschitz continuity, using the 1-norm can result in a model that is robust to small changes in many input dimensions simultaneously. - $\ell_2$-Norm (Euclidean Distance) -The $\ell_2$-norm $p = 2$ is the most commonly used norm, representing the straight-line distance between two points. It is rotationally invariant and often leads to smoother and more isotropic gradients. When enforcing Lipschitz continuity with the $\ell_2$-norm, the model is encouraged to be robust to perturbations in any direction in the input space. +The $\ell_2$-norm (`p = 2`) is the most commonly used norm, representing the straight-line distance between two points. It is rotationally invariant and often leads to smoother and more isotropic gradients. When enforcing Lipschitz continuity with the $\ell_2$-norm, the model is encouraged to be robust to perturbations in any direction in the input space. - $\ell_\infty$-Norm (Maximum Norm) The $\infty$-norm takes the maximum absolute value among the components of the vector. It measures the largest change in any single dimension. In the context of Lipschitz continuity, this norm is concerned with the worst-case scenario, where the model is robust to the largest change in any single input dimension. @@ -60,7 +60,7 @@ As an explicit example, consider the $\ell_p$-Lipschitz constrained network with You can compute an upper bound Lipschitz constant for this network by taking the product of the Lipschitz constants for each layer. For the relu activation, $\lambda_p = 1$. For the fully connected layers, the Lipschitz constant is given by $||W||_p$, and a suitable proximal operator that ensures the network has upper bound Lipschitz constant, $\lambda_p = 2$, is -$$W \rightarrow \frac{1}{max(1,||W||_p/\sqrt{\lambda_p})}W$$ +$W \rightarrow \frac{1}{max(1,||W||_p/\sqrt{\lambda_p})}W$. This ensures that the product of Lipschitz constants is at most $\lambda_p$. There are alternative proximal operators, some of which depend on the p-norm, for example using the $\ell_1$-norm as discussed in [2]. @@ -73,4 +73,4 @@ Lipschitz continuity offers a mathematical framework to understand and potential **References** - [1] Gouk, Henry, et al. “Regularisation of Neural Networks by Enforcing Lipschitz Continuity.” Machine Learning, vol. 110, no. 2, Feb. 2021, pp. 393–416. 
DOI.org (Crossref), https://doi.org/10.1007/s10994-020-05929-w -- [2] Kitouni, Ouail, et al. Expressive Monotonic Neural Networks. arXiv:2307.07512, arXiv, 14 July 2023. arXiv.org, http://arxiv.org/abs/2307.07512. +- [2] Kitouni, Ouail, et al. Expressive Monotonic Neural Networks. arXiv:2307.07512, arXiv, 14 July 2023. arXiv.org, http://arxiv.org/abs/2307.07512. \ No newline at end of file diff --git a/documentation/AI-Verification-Monotonicity.md b/documentation/AI-Verification-Monotonicity.md index 747e5b0..2d67f04 100644 --- a/documentation/AI-Verification-Monotonicity.md +++ b/documentation/AI-Verification-Monotonicity.md @@ -27,9 +27,15 @@ To circumvent these challenges, an alternative approach is to construct neural n - **Constrained Weights**: Ensuring that all weights in the network are non-negative can guarantee monotonicity. You can achieve this by using techniques like weight clipping or transforming weights during training. - **Architectural Considerations**: Designing network architectures that facilitate monotonic behavior. For example, architectures that avoid certain types of skip connections or layer types that could introduce non-monotonic behavior. -The approach taken in this repository is to utilize a combination of activation function, weight and architectural restrictions and is based on the construction outlined in [1]. Ref [1] discusses the derivation in the context of row vector representations of network inputs however MATLAB utilizes a column vector representation of network inputs. This means that the 1-norm discussed in [1] is replaced by the $\infty$-norm for implementations in MATLAB. +The approach taken in this repository is to utilize a combination of these three aspects and is based on the construction outlined in [1]. Because [1] discusses the derivation in the context of row vector representations of network inputs, we derive the result for column vector inputs here, as MATLAB utilizes a column vector representation of network inputs. -Note that for different choices of p-norm, the derivation in [1] still yields a monotonic function $f$, however there may be couplings between the magnitudes of the partial derivatives (shown for p=2 in [1]). By default, the implementation in this repository sets $p=\infty$ for monotonic networks but other values are explored as these may yield better fits. +Consider a scalar network $f:\mathbb{R}^n \rightarrow \mathbb{R}$ where $f(x) = g(x) + \lambda \sum_{k \in S} x_k$, $S$ denotes the set of monotonically dependent input indices and $g:\mathbb{R}^n \rightarrow \mathbb{R}$ is a Lipschitz continuous network, i.e., $\forall x,y \in \mathbb{R}^n$, $|g(x)-g(y)| \leq \lambda ||x-y||_p$. For a monotonically decreasing network, $f(x) = g(x) - \lambda \sum_{k \in S} x_k$. + +Take $p=\infty$. The matrix $\infty$-norm is the maximum absolute sum of each row, i.e., $||A||_\infty = max_i \sum_j |a_{ij}| $. Therefore, for multi-layer perceptron networks $g$ (as discussed in [1]), an upper bound on $\lambda$ is $\prod_i ||W^{(i)}||_\infty$ where $W^{(i)}$ is the weight matrix of the $i$-th fully connected layer. It follows from Lipschitz continuity that $|| \nabla g ||_\infty \leq \lambda$ where, since $\nabla g$ is also taken as a column vector, $|| \nabla g ||_\infty = max_k |\partial g/\partial x_k| \leq \lambda$. 
Hence the choice of $\infty$-norm decouples the magnitudes of the directional derivatives in the monotonic features for column vector inputs and column vector gradients, or in other words, each partial derivative is free to take any value in the interval $[-\lambda, \lambda]$. + +From the definition of $f$, $\partial f/\partial x_k = \partial g/\partial x_k + \lambda \geq 0$ for $k \in S$ and hence the network $f$ is monotonic in $x_k$ for $k \in S$ by construction. The decoupling of the magnitudes of the directional derivatives means that the partial derivatives of $f$ can be as large as $2\lambda$ in each monotonic direction. + +Note that for different choices of p-norm, the derivation above still yields a monotonic function $f$, however there may be couplings between the magnitudes of the partial derivatives (shown for p=2 in [1]). By default, the implementation in this repository sets $p=\infty$ for monotonic networks but other values are explored as these may yield better fits. A simple monotonic architecture is shown in Figure 1. @@ -50,7 +56,7 @@ The main challenge with expressive monotonic networks is to balance the inherent For networks constructed to be monotonic, verification becomes more straightforward and comes down to architectural and weight inspection, i.e., provided the network architecture is of a specified monotonic topology, and that the weights in the network are appropriately related - see [1] - then the network is monotonic. -In summary, while verifying monotonicity in general neural networks is complex due to non-linearities and high dimensionality, constructing networks with inherent monotonic properties simplifies verification. By using constrained architectures and weights, you can design networks that are guaranteed to be monotonic, thus facilitating the verification process and making the network more suitable for applications where monotonic behavior is essential. +In summary, while verifying monotonicity in general neural networks is complex due to non-linearities and high dimensionality, constructing networks with inherent monotonic properties simplifies verification. By using monotonic activation functions and ensuring non-negative weights, you can design networks that are guaranteed to be monotonic, thus facilitating the verification process and making the network more suitable for applications where monotonic behavior is essential. **References** diff --git a/documentation/figures/ficnn_network.jpg b/documentation/figures/ficnn_network.jpg index 384d5b2..5625902 100644 Binary files a/documentation/figures/ficnn_network.jpg and b/documentation/figures/ficnn_network.jpg differ diff --git a/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.md b/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.md index bef92d1..fad1e0b 100644 --- a/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.md +++ b/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.md @@ -76,9 +76,9 @@ plot(ficnnet) ```
-

- -

+

+ +

# Specify Training Options @@ -127,9 +127,9 @@ trained_ficnnet = trainConstrainedNetwork("fully-convex",ficnnet,mbqTrain,... ```
-

- -

+

+ +

# Evaluate Trained Network @@ -160,7 +160,7 @@ disp("Training accuracy: " + (1-trainError)*100 + "%") ``` ```matlabTextOutput -Training accuracy: 97.7364% +Training accuracy: 90.4848% ``` Compute the accuracy on the test set. @@ -173,7 +173,7 @@ disp("Test accuracy: " + (1-testError)*100 + "%") ``` ```matlabTextOutput -Test accuracy: 31.5848% +Test accuracy: 27.4554% ``` The network's output has been constrained to be convex in every pixel in every colour. Even with this level of restriction, the network is able to fit reasonably well to the training data. You can see poor accuracy on the test data set but, as discussed at the start of the example, it is not anticipated that such a fully input convex network comprising fully connected operations should generalize well to natural image classification. @@ -190,9 +190,9 @@ cm.RowSummary = "row-normalized"; ```
-

- -

+

+ +

To summarise, the fully input convex network is able to fit to the training data set, which is labelled natural images. The training can take a considerable amount of time owing to the weight projection to the constrained set after each gradient update, which slows down training convergence. Nevertheless, this example illustrates the flexibility and expressivity convex neural networks have to correctly classifying natural images. diff --git a/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.mlx b/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.mlx index b09e1c8..d495b3e 100644 Binary files a/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.mlx and b/examples/convex/classificationCIFAR10/TrainICNNOnCIFAR10Example.mlx differ diff --git a/examples/convex/classificationCIFAR10/downloadCIFARData.m b/examples/convex/classificationCIFAR10/downloadCIFARData.m index 5370538..095c383 100644 --- a/examples/convex/classificationCIFAR10/downloadCIFARData.m +++ b/examples/convex/classificationCIFAR10/downloadCIFARData.m @@ -1,7 +1,5 @@ function downloadCIFARData(destination) -% Copyright 2024 The MathWorks, Inc. - url = 'https://www.cs.toronto.edu/~kriz/cifar-10-matlab.tar.gz'; unpackedData = fullfile(destination,'cifar-10-batches-mat'); diff --git a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig1.jpg b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig1.jpg deleted file mode 100644 index 4ad884c..0000000 Binary files a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig1.jpg and /dev/null differ diff --git a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig1.png b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig1.png new file mode 100644 index 0000000..6f49779 Binary files /dev/null and b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig1.png differ diff --git a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig2.jpg b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig2.jpg deleted file mode 100644 index 4d819c8..0000000 Binary files a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig2.jpg and /dev/null differ diff --git a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig2.png b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig2.png new file mode 100644 index 0000000..b11b61f Binary files /dev/null and b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig2.png differ diff --git a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig3.jpg b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig3.jpg deleted file mode 100644 index 4530e3d..0000000 Binary files a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig3.jpg and /dev/null differ diff --git a/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig3.png b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig3.png new file mode 100644 index 0000000..6f1bea2 Binary files /dev/null and b/examples/convex/classificationCIFAR10/figures/TrainICNN_Fig3.png differ diff --git a/examples/convex/classificationCIFAR10/loadCIFARData.m b/examples/convex/classificationCIFAR10/loadCIFARData.m index 7a23aa7..a896d98 100644 --- a/examples/convex/classificationCIFAR10/loadCIFARData.m +++ b/examples/convex/classificationCIFAR10/loadCIFARData.m @@ -1,7 +1,5 @@ function [XTrain,YTrain,XTest,YTest] = loadCIFARData(location) -% Copyright 2024 The MathWorks, Inc. 
- location = fullfile(location,'cifar-10-batches-mat'); [XTrain1,YTrain1] = loadBatchAsFourDimensionalArray(location,'data_batch_1.mat'); diff --git a/examples/convex/introductory/PoC_Ex1_1DFICNN.md b/examples/convex/introductory/PoC_Ex1_1DFICNN.md index f9daa44..4a41724 100644 --- a/examples/convex/introductory/PoC_Ex1_1DFICNN.md +++ b/examples/convex/introductory/PoC_Ex1_1DFICNN.md @@ -33,9 +33,9 @@ xlabel("x") ```
-

- -

+

+ +

# Prepare Data @@ -60,7 +60,7 @@ As discussed in [AI Verification: Convex](../../../documentation/AI-Verification inputSize = 1; numHiddenUnits = [16 8 4 1]; ficnnet = buildConstrainedNetwork("fully-convex",inputSize,numHiddenUnits, ... - PositiveNonDecreasingActivation="relu") + ConvexNonDecreasingActivation="relu") ``` ```matlabTextOutput @@ -92,9 +92,9 @@ end ```
-

- -

+

+ +

# Train FICNN @@ -119,9 +119,9 @@ trained_ficnnet = trainConstrainedNetwork("fully-convex",ficnnet,mbqTrain, ... ```
-

- -

+

+ +

Evaluate the accuracy on the true underlying convex function from an independent random sampling from the interval [-1,1]. @@ -138,7 +138,7 @@ lossAgainstUnderlyingSignal = gpuArray single - 0.0338 + 0.0362 ``` # Train Unconstrained MLP @@ -180,9 +180,9 @@ end ```
-

- -

+

+ +

Specify the training options and then train the network using the trainnet function. @@ -204,39 +204,40 @@ trained_mlpnet = trainnet(mbqTrain,mlpnet,lossMetric,options); Iteration Epoch TimeElapsed LearnRate TrainingLoss _________ _____ ___________ _________ ____________ 1 1 00:00:00 0.05 0.26302 - 50 50 00:00:07 0.045 0.12781 - 100 100 00:00:11 0.03645 0.12262 - 150 150 00:00:14 0.032805 0.10849 - 200 200 00:00:18 0.026572 0.1102 - 250 250 00:00:22 0.021523 0.11806 - 300 300 00:00:25 0.019371 0.10301 - 350 350 00:00:29 0.015691 0.096023 - 400 400 00:00:32 0.012709 0.10675 - 450 450 00:00:36 0.011438 0.097555 - 500 500 00:00:39 0.0092651 0.094147 - 550 550 00:00:43 0.0075047 0.090284 - 600 600 00:00:46 0.0067543 0.088997 - 650 650 00:00:50 0.0054709 0.086944 - 700 700 00:00:53 0.0044315 0.085979 - 750 750 00:00:57 0.0039883 0.085362 - 800 800 00:01:00 0.0032305 0.08497 - 850 850 00:01:04 0.0026167 0.08464 - 900 900 00:01:07 0.0023551 0.084311 - 950 950 00:01:11 0.0019076 0.084135 - 1000 1000 00:01:15 0.0015452 0.083962 - 1050 1050 00:01:19 0.0013906 0.083793 - 1100 1100 00:01:22 0.0011264 0.08367 - 1150 1150 00:01:26 0.0009124 0.08356 - 1200 1200 00:01:29 0.00082116 0.083461 + 50 50 00:00:05 0.045 0.12781 + 100 100 00:00:08 0.03645 0.12262 + 150 150 00:00:11 0.032805 0.10938 + 200 200 00:00:14 0.026572 0.10655 + 250 250 00:00:17 0.021523 0.11237 + 300 300 00:00:20 0.019371 0.104 + 350 350 00:00:23 0.015691 0.10177 + 400 400 00:00:26 0.012709 0.097083 + 450 450 00:00:28 0.011438 0.094851 + 500 500 00:00:31 0.0092651 0.092311 + 550 550 00:00:34 0.0075047 0.093058 + 600 600 00:00:36 0.0067543 0.089904 + 650 650 00:00:39 0.0054709 0.088938 + 700 700 00:00:42 0.0044315 0.087454 + 750 750 00:00:45 0.0039883 0.086143 + 800 800 00:00:48 0.0032305 0.085586 + 850 850 00:00:51 0.0026167 0.085192 + 900 900 00:00:54 0.0023551 0.08487 + 950 950 00:00:57 0.0019076 0.084659 + 1000 1000 00:00:59 0.0015452 0.084424 + 1050 1050 00:01:02 0.0013906 0.084303 + 1100 1100 00:01:05 0.0011264 0.084138 + 1150 1150 00:01:08 0.0009124 0.084049 + 1200 1200 00:01:11 0.00082116 0.083947 Training stopped: Max epochs completed ```
-

- -

+

+ +

+ Evaluate the accuracy on an independent random sampling from the interval [-1,1]. Observe that the loss against the underlying monotonic signal here is higher as the network has fitted to the sinusoidal contamination. ```matlab @@ -244,7 +245,7 @@ lossAgainstUnderlyingSignal = computeLoss(trained_mlpnet,xTest,tTest,lossMetric) ``` ```matlabTextOutput -lossAgainstUnderlyingSignal = 0.0699 +lossAgainstUnderlyingSignal = 0.0696 ``` # Network Comparison @@ -264,11 +265,12 @@ legend("FICNN","MLP","Training Data") ```
-

- -

+

+ +

+ It is visually evident that the MLP solution is not convex over the interval but the FICNN is convex, owing to its convex construction and constrained learning. # Guaranteed Bounds for FICNN @@ -318,9 +320,9 @@ title("Guarantees of upper and lower bounds for FICNN network"); ```
-

- -

+

+ +

# Violated Bounds for MLP @@ -351,9 +353,9 @@ grid on; ```
-

- -

+

+ +

# Helper Functions diff --git a/examples/convex/introductory/PoC_Ex1_1DFICNN.mlx b/examples/convex/introductory/PoC_Ex1_1DFICNN.mlx index 7b1e4be..87ac560 100644 Binary files a/examples/convex/introductory/PoC_Ex1_1DFICNN.mlx and b/examples/convex/introductory/PoC_Ex1_1DFICNN.mlx differ diff --git a/examples/convex/introductory/PoC_Ex2_nDFICNN.md b/examples/convex/introductory/PoC_Ex2_nDFICNN.md index ef691e0..e9a16e4 100644 --- a/examples/convex/introductory/PoC_Ex2_nDFICNN.md +++ b/examples/convex/introductory/PoC_Ex2_nDFICNN.md @@ -31,9 +31,9 @@ ylabel("x2") ```
-

- -

+

+ +

# Prepare Data @@ -58,7 +58,7 @@ In this proof of concept example, build a 2\-dimensional FICNN using fully conne inputSize = 2; numHiddenUnits = [16 8 4 1]; ficnnet = buildConstrainedNetwork("fully-convex",inputSize,numHiddenUnits,... - PositiveNonDecreasingActivation="softplus") + ConvexNonDecreasingActivation="softplus") ``` ```matlabTextOutput @@ -90,10 +90,8 @@ end ```
-

- -

-
+

+ # Train FICNN @@ -117,9 +115,9 @@ trained_ficnnet = trainConstrainedNetwork("fully-convex",ficnnet,mbqTrain,... ```

-

- -

+

+ +

Evaluate the accuracy on the training set. @@ -133,7 +131,7 @@ loss = gpuArray single - 0.0134 + 0.0156 ``` Plot the network predictions with the training data. @@ -151,9 +149,9 @@ legend("Training Data","Network Prediction",Location="northwest") ```
-

- -

+

+ +

# Guaranteed Bounds for 2\-D FICNN @@ -191,9 +189,9 @@ hold off ```
-

- -

+

+ +

# Helper Functions diff --git a/examples/convex/introductory/PoC_Ex2_nDFICNN.mlx b/examples/convex/introductory/PoC_Ex2_nDFICNN.mlx index db1e5be..dfdc467 100644 Binary files a/examples/convex/introductory/PoC_Ex2_nDFICNN.mlx and b/examples/convex/introductory/PoC_Ex2_nDFICNN.mlx differ diff --git a/examples/convex/introductory/PoC_Ex3_nDPICNN.md b/examples/convex/introductory/PoC_Ex3_nDPICNN.md index 78b74b1..5e705cc 100644 --- a/examples/convex/introductory/PoC_Ex3_nDPICNN.md +++ b/examples/convex/introductory/PoC_Ex3_nDPICNN.md @@ -31,11 +31,12 @@ ylabel("x2") ```
-

- -

+

+ +

+ Observe the overall underlying convex behavior in x2 given x1, and non\-convex behavior in x1 given x2. # Prepare Data @@ -61,7 +62,7 @@ In this proof of concept example, build a 2\-dimensional PICNN using fully conne inputSize = 2; numHiddenUnits = [32 8 1]; picnnet = buildConstrainedNetwork("partially-convex",inputSize,numHiddenUnits,... - PositiveNonDecreasingActivation="softplus",... + ConvexNonDecreasingActivation="softplus",... Activation="tanh",... ConvexChannelIdx=2) ``` @@ -95,9 +96,9 @@ end ```
-

- -

+

+ +

# Train PICNN @@ -122,9 +123,9 @@ trained_picnnet = trainConstrainedNetwork("partially-convex",picnnet,mbqTrain,.. ```
-

- -

+

+ +

Evaluate the accuracy on the training set. @@ -138,7 +139,7 @@ loss = gpuArray single - 0.0265 + 0.0275 ``` Plot the network predictions with the training data. @@ -156,9 +157,9 @@ legend("Training Data","Network Prediction",Location="northwest") ```
-

- -

+

+ +

# Guaranteed Bounds for 2\-D PICNN in 1\-D Restrictions @@ -187,9 +188,9 @@ xlabel("x2") ```
-

- -

+

+ +

As in the 1\-dimensional convex case, compute bounds for 1\-dimensional restrictions for fixed x1. @@ -235,9 +236,9 @@ title("Guarantees of upper and lower bounds for PICNN network for fixed x1=" + x ```
-

- -

+

+ +

# Helper Functions diff --git a/examples/convex/introductory/PoC_Ex3_nDPICNN.mlx b/examples/convex/introductory/PoC_Ex3_nDPICNN.mlx index 962295f..6ffd164 100644 Binary files a/examples/convex/introductory/PoC_Ex3_nDPICNN.mlx and b/examples/convex/introductory/PoC_Ex3_nDPICNN.mlx differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig1.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig1.jpg deleted file mode 100644 index ee27a8c..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig1.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig1.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig1.png new file mode 100644 index 0000000..fe7189b Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig1.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig2.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig2.jpg deleted file mode 100644 index 010d41f..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig2.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig2.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig2.png new file mode 100644 index 0000000..bc3612c Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig2.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig3.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig3.jpg deleted file mode 100644 index 85ae223..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig3.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig3.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig3.png new file mode 100644 index 0000000..92c3b24 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig3.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig4.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig4.jpg deleted file mode 100644 index ccb3b25..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig4.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig4.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig4.png new file mode 100644 index 0000000..f48a056 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig4.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig5.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig5.jpg deleted file mode 100644 index a56182c..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig5.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig5.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig5.png new file mode 100644 index 0000000..6ee4cbe Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig5.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig6.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig6.jpg deleted file mode 100644 index bf57a1b..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig6.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig6.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig6.png new file 
mode 100644 index 0000000..18de7e5 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig6.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig7.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig7.jpg deleted file mode 100644 index 52248a6..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig7.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig7.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig7.png new file mode 100644 index 0000000..b6ea101 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig7.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig8.jpg b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig8.jpg deleted file mode 100644 index 8b9595e..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig8.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig8.png b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig8.png new file mode 100644 index 0000000..43ccf46 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex1_1DFICNN_Fig8.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig1.jpg b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig1.jpg deleted file mode 100644 index 3f4a295..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig1.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig1.png b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig1.png new file mode 100644 index 0000000..18f4234 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig1.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig2.jpg b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig2.jpg deleted file mode 100644 index 010d41f..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig2.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig2.png b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig2.png new file mode 100644 index 0000000..49731f3 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig2.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig3.jpg b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig3.jpg deleted file mode 100644 index 4bb16a4..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig3.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig3.png b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig3.png new file mode 100644 index 0000000..3759686 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig3.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig4.jpg b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig4.jpg deleted file mode 100644 index 3a5c37b..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig4.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig4.png b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig4.png new file mode 100644 index 0000000..8c0aeb3 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig4.png differ diff --git 
a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig5.jpg b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig5.jpg deleted file mode 100644 index 4f9fc15..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig5.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig5.png b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig5.png new file mode 100644 index 0000000..5c0712d Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex2_nDFICNN_Fig5.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig1.jpg b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig1.jpg deleted file mode 100644 index 4a0dbc9..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig1.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig1.png b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig1.png new file mode 100644 index 0000000..bc28a12 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig1.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig2.jpg b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig2.jpg deleted file mode 100644 index 188fd11..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig2.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig2.png b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig2.png new file mode 100644 index 0000000..890a058 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig2.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig3.jpg b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig3.jpg deleted file mode 100644 index eb97474..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig3.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig3.png b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig3.png new file mode 100644 index 0000000..54d4879 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig3.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig4.jpg b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig4.jpg deleted file mode 100644 index a01dfa3..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig4.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig4.png b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig4.png new file mode 100644 index 0000000..be4e800 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig4.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig5.jpg b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig5.jpg deleted file mode 100644 index 77a2db2..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig5.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig5.png b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig5.png new file mode 100644 index 0000000..0e2bd78 Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig5.png differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig6.jpg b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig6.jpg deleted file mode 100644 index 
4f6ccff..0000000 Binary files a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig6.jpg and /dev/null differ diff --git a/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig6.png b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig6.png new file mode 100644 index 0000000..aed903e Binary files /dev/null and b/examples/convex/introductory/figures/PoC_Ex3_nDPICNN_Fig6.png differ diff --git a/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.md b/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.md index dc3bde7..1106c69 100644 --- a/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.md +++ b/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.md @@ -23,22 +23,23 @@ where $A$ is a 2\-by\-2 matrix. The neural network of this example takes as input an initial condition and computes the ODE solution through the learned neural ODE model.
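For reference, the target trajectories that the network learns from can be generated by integrating the linear ODE directly. The following is only a minimal sketch; the matrix $A$, initial condition, and time grid are illustrative, and the example's own synthesis code appears in the Synthesize Data of Target Dynamics section.

```matlab
% Integrate dx/dt = A*x with ode45 to obtain ground-truth trajectories.
% Values below are illustrative, not the ones used by the example.
A  = [-0.1 -1; 1 -0.1];                 % 2-by-2 system matrix
x0 = [2; 0];                            % initial condition
t  = linspace(0, 15, 400);              % time grid
[~, x] = ode45(@(t,x) A*x, t, x0);
xTrain = x.';                           % state components as rows, so that
                                        % size(xTrain,1) gives the state size
plot(xTrain(1,:), xTrain(2,:)); grid on
```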
The neural ODE operation, given an initial condition, outputs the solution of an ODE model. In this example, specify the ODE model as a fully input convex neural network (FICNN) block of '2\-layer' depth, i.e., a fully connected layer, a softplus layer, and a second fully connected layer whose output is then combined with the input via a residual fully connected operation.
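As a rough sketch, the FICNN block described above amounts to the following forward pass (hypothetical parameter names; in the example the block is built and initialised with buildConstrainedNetwork rather than written by hand):

```matlab
% "2-layer" FICNN block used as the ODE model. Convexity of each output in
% the input holds because softplus is convex and non-decreasing and the
% weights acting on the hidden activations (p.Wz) are kept non-negative.
function dy = ficnnOdeModel(y, p)
    softplus = @(x) log(1 + exp(x));                 % convex, non-decreasing
    z  = softplus(fullyconnect(y, p.Wy, p.by));      % state -> hidden
    dy = fullyconnect(z, p.Wz, p.bz) + ...           % hidden -> state, p.Wz >= 0
         fullyconnect(y, p.Wres, p.bres);            % residual path on the input
end
```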
In this example, the ODE that defines the model is solved numerically using the Euler method. Unlike the higher\-order Runge\-Kutta (4,5) pair of Dormand and Prince \[2\], the Euler method for solving ODEs is a first\-order, linear procedure and so preserves convexity. That is, an Euler update procedure with a convex network governing the dynamics of the physical system preserves overall convexity of the output, $y(t+1)$, with respect to the input, $y(t)$, i.e., $y(t+1)=g(y(t))$ where $g:{\mathbb{R}}^2 \to {\mathbb{R}}^2$ is fully input convex in each output.

-In this example, you use forwardEuler, an implementation of a forward Euler method for dlarray that behaves similar to dlode45. For more information, see [forwardEuler](./forwardEuler.md).
+
+In this example, you use forwardEuler, an implementation of a forward Euler method for dlarray that behaves similarly to dlode45. For more information, see [forwardEuler](./forwardEuler.m).

# Synthesize Data of Target Dynamics

@@ -69,9 +70,9 @@ grid on
```
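The convexity-preserving Euler update described above can be sketched as a simple rollout loop (a hypothetical helper shown only for illustration; the example itself uses forwardEuler):

```matlab
% Forward Euler rollout of a learned ODE model odeModel(y,params).
% Each update y <- y + dt*odeModel(y,params) adds a convex function of y to
% a linear term, so every one-step map y(t) -> y(t+1) remains input convex.
function y = eulerRollout(odeModel, y0, params, dt, numSteps)
    y = y0;
    for k = 1:numSteps
        y = y + dt*odeModel(y, params);   % first-order, linear update
    end
end
```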
# Define and Initialize Model Parameters

@@ -84,7 +85,7 @@ dt = t(2);
timesteps = (0:neuralOdeTimesteps)*dt;
```

-Construct a 2\-dimensional FICNN using fully connected layers and softplus activation functions. For more information on the architectural construction, see [AI Verification: Convex](../../../documentation/AI-Verification-Convexity.md), or, for a proof\-of\-concept example, see [PoC_Ex2_nDFICNN](../introductory/PoC_Ex2_nDFICNN.md). The first fully connected operation takes as input a vector of size stateSize and increases its length to hiddenSize. Conversely, the subsequent fully connected operation takes as input a vector of length hiddenSize and decreases its length to stateSize. The residual connection applies a fully connected operation that takes stateSize to stateSize.
+Construct a 2\-dimensional FICNN using fully connected layers and softplus activation functions. For more information on the architectural construction, see [AI Verification: Convex](../../../documentation/AI-Verification-Convexity.md), or, for a proof\-of\-concept example, see [PoC_Ex2_nDFICNN](../ProofOfConcept/PoC_Ex2_nDFICNN.md). The first fully connected operation takes as input a vector of size stateSize and increases its length to hiddenSize. Conversely, the subsequent fully connected operation takes as input a vector of length hiddenSize and decreases its length to stateSize. The residual connection applies a fully connected operation that takes stateSize to stateSize.

```matlab
stateSize = size(xTrain,1);
@@ -92,7 +93,7 @@ hiddenSize = 20;
numHiddenUnits = [hiddenSize stateSize];
neuralOdeFICNN = buildConstrainedNetwork("fully-convex",stateSize,numHiddenUnits,...
- PositiveNonDecreasingActivation="softplus")
+ ConvexNonDecreasingActivation="softplus")
```

```matlabTextOutput
@@ -116,9 +117,9 @@ plot(neuralOdeFICNN)
```
## Define Model Function

@@ -226,15 +227,15 @@ end
```
# Evaluate Model

@@ -293,9 +294,9 @@ plotTrueAndPredictedSolutions(xTrue4, xPred4);
```
# Formal Boundedness Guarantees

@@ -392,9 +393,9 @@ end
```
From the figure, you observe that the true trajectory (black line) is always within the red bounding box at any point in time. This guarantees the bounded behaviour of all trajectories for a given region of initial conditions, time step sizes and total time evolution. diff --git a/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.mlx b/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.mlx index b3f7f1b..45881ab 100644 Binary files a/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.mlx and b/examples/convex/neuralODE/TrainConvexNeuralODENetworkWithEulerODESolverExample.mlx differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig2.jpg b/examples/convex/neuralODE/figures/neuralODE_Fig2.jpg deleted file mode 100644 index b7e1cef..0000000 Binary files a/examples/convex/neuralODE/figures/neuralODE_Fig2.jpg and /dev/null differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig2.png b/examples/convex/neuralODE/figures/neuralODE_Fig2.png new file mode 100644 index 0000000..a17d8e3 Binary files /dev/null and b/examples/convex/neuralODE/figures/neuralODE_Fig2.png differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig3.jpg b/examples/convex/neuralODE/figures/neuralODE_Fig3.jpg deleted file mode 100644 index 3cb89e6..0000000 Binary files a/examples/convex/neuralODE/figures/neuralODE_Fig3.jpg and /dev/null differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig3.png b/examples/convex/neuralODE/figures/neuralODE_Fig3.png new file mode 100644 index 0000000..54b231a Binary files /dev/null and b/examples/convex/neuralODE/figures/neuralODE_Fig3.png differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig4.jpg b/examples/convex/neuralODE/figures/neuralODE_Fig4.jpg deleted file mode 100644 index 4b81673..0000000 Binary files a/examples/convex/neuralODE/figures/neuralODE_Fig4.jpg and /dev/null differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig4.png b/examples/convex/neuralODE/figures/neuralODE_Fig4.png new file mode 100644 index 0000000..0b31e3c Binary files /dev/null and b/examples/convex/neuralODE/figures/neuralODE_Fig4.png differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig5.jpg b/examples/convex/neuralODE/figures/neuralODE_Fig5.jpg deleted file mode 100644 index cc11a8c..0000000 Binary files a/examples/convex/neuralODE/figures/neuralODE_Fig5.jpg and /dev/null differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig5.png b/examples/convex/neuralODE/figures/neuralODE_Fig5.png new file mode 100644 index 0000000..a78e561 Binary files /dev/null and b/examples/convex/neuralODE/figures/neuralODE_Fig5.png differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig6.jpg b/examples/convex/neuralODE/figures/neuralODE_Fig6.jpg deleted file mode 100644 index d7d6962..0000000 Binary files a/examples/convex/neuralODE/figures/neuralODE_Fig6.jpg and /dev/null differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig6.png b/examples/convex/neuralODE/figures/neuralODE_Fig6.png new file mode 100644 index 0000000..9a606b0 Binary files /dev/null and b/examples/convex/neuralODE/figures/neuralODE_Fig6.png differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig7.jpg b/examples/convex/neuralODE/figures/neuralODE_Fig7.jpg deleted file mode 100644 index e05cf40..0000000 Binary files a/examples/convex/neuralODE/figures/neuralODE_Fig7.jpg and /dev/null differ diff --git 
a/examples/convex/neuralODE/figures/neuralODE_Fig7.png b/examples/convex/neuralODE/figures/neuralODE_Fig7.png new file mode 100644 index 0000000..e4b95c0 Binary files /dev/null and b/examples/convex/neuralODE/figures/neuralODE_Fig7.png differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig8.jpg b/examples/convex/neuralODE/figures/neuralODE_Fig8.jpg deleted file mode 100644 index 750b263..0000000 Binary files a/examples/convex/neuralODE/figures/neuralODE_Fig8.jpg and /dev/null differ diff --git a/examples/convex/neuralODE/figures/neuralODE_Fig8.png b/examples/convex/neuralODE/figures/neuralODE_Fig8.png new file mode 100644 index 0000000..43a1f45 Binary files /dev/null and b/examples/convex/neuralODE/figures/neuralODE_Fig8.png differ diff --git a/tests/system/tFullyInputConvexNetwork.m b/tests/system/tFullyInputConvexNetwork.m index e74d543..953c096 100644 --- a/tests/system/tFullyInputConvexNetwork.m +++ b/tests/system/tFullyInputConvexNetwork.m @@ -28,7 +28,7 @@ function verifyNetworkOutputIsFullyConvex(testCase, PndActivationFunctionSet, Ta inputSize = 1; numHiddenUnits = [16 8 4 1]; ficnn = buildConstrainedNetwork("fully-convex",inputSize,numHiddenUnits, ... - PositiveNonDecreasingActivation=PndActivationFunctionSet); + ConvexNonDecreasingActivation=PndActivationFunctionSet); % Train fully convex network. Use just 1 epoch. maxEpochs = 1; diff --git a/tests/system/tPartiallyInputConvexNetwork.m b/tests/system/tPartiallyInputConvexNetwork.m index a10615a..47251ba 100644 --- a/tests/system/tPartiallyInputConvexNetwork.m +++ b/tests/system/tPartiallyInputConvexNetwork.m @@ -44,7 +44,7 @@ function verifyNetworkIsPartiallyConvex(testCase, PndActivationFunctionSet, Acti inputSize = 2; numHiddenUnits = [32 8 1]; picnn = buildConstrainedNetwork("partially-convex",inputSize,numHiddenUnits,... - PositiveNonDecreasingActivation=PndActivationFunctionSet,... + ConvexNonDecreasingActivation=PndActivationFunctionSet,... Activation=ActivationFunctionSet,... ConvexChannelIdx=2); diff --git a/tests/unit/conslearn/convex/tbuildFICNN.m b/tests/unit/conslearn/convex/tbuildFICNN.m index 14dcd8b..ac96bf9 100644 --- a/tests/unit/conslearn/convex/tbuildFICNN.m +++ b/tests/unit/conslearn/convex/tbuildFICNN.m @@ -103,7 +103,7 @@ function verifyActivationLayersAreCorrect(testCase, FullyConnectedLayerSizesSet, % Build convex neural network net = conslearn.convex.buildFICNN([28, 28, 1], FullyConnectedLayerSizesSet, ... - PositiveNonDecreasingActivation = ActivationFunctionSet.Input); + ConvexNonDecreasingActivation = ActivationFunctionSet.Input); % Get indices for activation layers pndLayerIdx = find(contains({net.Layers.Name}, "pnd")); diff --git a/tests/unit/conslearn/convex/tbuildPICNN.m b/tests/unit/conslearn/convex/tbuildPICNN.m index b8e399c..f8ed11f 100644 --- a/tests/unit/conslearn/convex/tbuildPICNN.m +++ b/tests/unit/conslearn/convex/tbuildPICNN.m @@ -71,7 +71,7 @@ function verifyPndActivationLayersAreCorrect(testCase, FullyConnectedLayerSizesS % Build network net = conslearn.convex.buildPICNN(inputSize, numHiddenUnits, ... 
- PositiveNonDecreasingActivation=PndActivationFunctionSet.Input); + ConvexNonDecreasingActivation=PndActivationFunctionSet.Input); % Get indices for activation layers pndLayerIdx = iFindLayerIdxWithName(net, "pnd"); diff --git a/tests/unit/conslearn/monotonic/tmakeParametersMonotonic.m b/tests/unit/conslearn/monotonic/tmakeParametersMonotonic.m index ee975ec..a3a6b87 100644 --- a/tests/unit/conslearn/monotonic/tmakeParametersMonotonic.m +++ b/tests/unit/conslearn/monotonic/tmakeParametersMonotonic.m @@ -10,7 +10,7 @@ function verifyCorrectResultForSimpleCase(testCase, ValidInputs) out = conslearn.monotonic.makeParametersMonotonic( ... ValidInputs.W, ValidInputs.lambda, ValidInputs.pNorm); - testCase.verifyEqual(extractdata(out), extractdata(ValidInputs.ExpectedOutput), AbsTol=1e-12); + testCase.verifyEqual(out, ValidInputs.ExpectedOutput, AbsTol=1e-12); end end diff --git a/tests/unit/tbuildConstrainedNetwork.m b/tests/unit/tbuildConstrainedNetwork.m index d5e6c00..f5cea01 100644 --- a/tests/unit/tbuildConstrainedNetwork.m +++ b/tests/unit/tbuildConstrainedNetwork.m @@ -47,7 +47,7 @@ function canBuildFullyConvexNetworkWithOptionalInputs(testCase, ValidInputSizeSe pndActivation = ValidPndActivationSet; fcn = @() buildConstrainedNetwork(constraint, inputSize, numHiddenUnits, ... - "PositiveNonDecreasingActivation", pndActivation); + "ConvexNonDecreasingActivation", pndActivation); net = testCase.verifyWarningFree(fcn); @@ -63,7 +63,7 @@ function canBuildPartiallyConvexNetworkWithOptionalInputs(testCase, ValidInputSi convexChannelIdx = ValidConvexChannelIdxSet; fcn = @() buildConstrainedNetwork(constraint, inputSize, numHiddenUnits, ... - "PositiveNonDecreasingActivation", pndActivation, ... + "ConvexNonDecreasingActivation", pndActivation, ... "Activation", activation, ... "ConvexChannelIdx", convexChannelIdx); @@ -355,8 +355,8 @@ function errorsForLipschitzConstrainedAndInvalidNameValuePairs(testCase, Invalid Name = "UpperBoundLipschitzConstant", ... Value = 1); -param.PositiveNonDecreasingActivation = struct( ... - "Name", "PositiveNonDecreasingActivation", ... +param.ConvexNonDecreasingActivation = struct( ... + "Name", "ConvexNonDecreasingActivation", ... "Value", "relu"); param.ConvexChannelIdx = struct( ... @@ -373,8 +373,8 @@ function errorsForLipschitzConstrainedAndInvalidNameValuePairs(testCase, Invalid Name = "UpperBoundLipschitzConstant", ... Value = 1); -param.PositiveNonDecreasingActivation = struct( ... - "Name", "PositiveNonDecreasingActivation", ... +param.ConvexNonDecreasingActivation = struct( ... + "Name", "ConvexNonDecreasingActivation", ... "Value", "relu"); param.ConvexChannelIdx = struct( ... @@ -387,8 +387,8 @@ function errorsForLipschitzConstrainedAndInvalidNameValuePairs(testCase, Invalid Name = "ResidualScaling", ... Value = 1); -param.PositiveNonDecreasingActivation = struct( ... - "Name", "PositiveNonDecreasingActivation", ... +param.ConvexNonDecreasingActivation = struct( ... + "Name", "ConvexNonDecreasingActivation", ... "Value", "relu"); param.ConvexChannelIdx = struct( ...
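For reference, a minimal sketch of building a fully convex network with the renamed ConvexNonDecreasingActivation name-value argument (the input size, hidden unit sizes, and activation below are illustrative):

```matlab
% Build a fully input convex network; softplus is a convex,
% non-decreasing activation.
inputSize = 1;
numHiddenUnits = [16 8 4 1];
ficnn = buildConstrainedNetwork("fully-convex", inputSize, numHiddenUnits, ...
    ConvexNonDecreasingActivation="softplus");
```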