Since nutrient uptake and distribution are affected by interactions within the plant, multivariate approaches have been developed to overcome the limitations of the univariate and bivariate approaches. Several studies have shown that nutrient diagnosis using univariate and bivariate approaches produces conflicting diagnoses (Huang et al., 2012; Wairegi and van Asten, 2012; Blanco-Macias et al., 2009; Silva et al., 2004). As such, nutrient norms from these approaches are numerically biased (Parent et al., 2012a). In 1992, Parent and Dafir proposed a modified DRIS that uses the centered log-ratio technique, proposed by Aitchison (1986) for compositional data, to conduct Compositional Nutrient Diagnosis (CND). CND and the univariate and bivariate indices have been shown to be moderately to closely related to each other (Parent et al., 1994a; Parent et al., 1994b; Wairegi and van Asten, 2011; Parent, 2011; Wairegi and van Asten, 2012). With the multivariate approach, other inferential techniques such as principal component analysis (PCA), canonical correlation and the chi-square test are often used to improve efficiency and provide an accurate diagnosis by recognizing the dynamics of nutrient interaction (Serra et al., 2016).

CND is a multivariate approach that was developed to improve on nutrient diagnosis by univariate or bivariate approaches. It was proposed and developed by Parent and Dafir (1992) and is based on the principles of Compositional Data Analysis (CDA). CND takes into consideration the interdependence of nutrient concentrations in plants and the fact that all dry matter concentrations always sum to 100 % (or to 1 when expressed as proportions).

The CND method has a clearly defined covariance matrix, allowing for the computation of ratios originating from nutrient concentrations that are mutually exclusive (Parent, 2011), as opposed to the DRIS approach, which is empirical and has no clear and distinct outline of the covariance matrix for conducting multivariate analysis (Barlog, 2016).

Nutrient indices are used for the interpretation of compositional nutrient data. Each index represents the difference between a particular nutrient and its geometric mean relative to the difference of the same nutrient to the geometric mean of the high-yielding subpopulation.

2.5.3.1 Establishing CND Norms

CND norms are established first by forming a database of nutrient concentration and yield of the crop in question. According to Serra et al. (2016), the nutrient concentration database must show a normal distribution, making it necessary to transform the nutrient concentrations to correct non-normal distributions. Several methods of nutrient concentration transformation have been proposed, including the row-centered log ratio and the isometric log ratio. Row-centered log ratios (clr) have been the most widely used method of transformation.

Recently, Parent et al. (2013) demonstrated certain difficulties in CND-clr computation. A singular matrix occurs in multivariate analysis computations because the indices close to a zero sum, and the geometric mean of the whole unstructured vector is affected by large variations in micronutrient concentrations due to fungicide applications. Both problems make clr an inappropriate transformation in such cases.

Because of these limitations, a modified CND approach (CND-ilr) was proposed by Parent et al. (2013); it uses the isometric log-ratio (ilr) transformation instead of row-centered log ratios. This approach generates linearly independent variables computed as structured balances of components or groups of components (Egozcue et al., 2003). To date, CND-ilr has been used to classify the nutrient composition of several crops (Parent, 2011; Parent et al., 2012b; Hernandes et al., 2012; Marchand et al., 2013; Parent et al., 2013).
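
As an illustration of a structured balance, the sketch below computes a single ilr balance between two groups of components, using the standard balance form sqrt(rs/(r+s)) ln(g(numerator)/g(denominator)), where r and s are the group sizes and g is the geometric mean. The function names, the grouping of N and P against K, and the concentrations are hypothetical and are not taken from the cited studies.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def ilr_balance(numerator, denominator):
    """Balance between two groups of components.

    r and s are the numbers of components in each group; the
    normalising coefficient is sqrt(r*s / (r + s)).
    """
    r, s = len(numerator), len(denominator)
    coefficient = math.sqrt(r * s / (r + s))
    return coefficient * math.log(
        geometric_mean(numerator) / geometric_mean(denominator)
    )

# Hypothetical leaf concentrations (% dry matter), for illustration only.
n, p, k = 3.0, 0.3, 2.0
balance_np_vs_k = ilr_balance([n, p], [k])
```

Because the balance is a log ratio, multiplying every concentration by the same factor leaves it unchanged, which is the scale invariance that compositional methods rely on.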

After transforming the nutrient concentrations, the high-yielding population of healthy leaves with no damage is selected. The database may be divided into two subpopulations using the mean + 0.5 standard deviation as the criterion to separate a high-yielding group from a low-yielding group (Serra et al., 2010). Alternatively, a cumulative variance function can be fitted to a cubic (Khiari et al., 2001) or Boltzmann equation (Hernandez et al., 2008). Parent et al. (1994) proposed the chi-square distribution function to define a CND threshold value for nutrient imbalance.
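
The mean + 0.5 standard deviation criterion can be sketched as follows. This is a minimal illustration: the function name and yield values are hypothetical, and the use of the sample standard deviation and of "greater than or equal to" at the cutoff are assumptions, since the source does not specify either.

```python
import statistics

def split_by_yield(yields, k=0.5):
    """Split yields into high and low groups at mean + k * SD.

    Assumption: SD is the sample standard deviation, and values equal
    to the cutoff are placed in the high-yielding group.
    """
    cutoff = statistics.mean(yields) + k * statistics.stdev(yields)
    high = [y for y in yields if y >= cutoff]
    low = [y for y in yields if y < cutoff]
    return cutoff, high, low

# Hypothetical yields (t/ha), for illustration only.
yields = [4.0, 5.0, 6.0, 7.0, 8.0]
cutoff, high, low = split_by_yield(yields)
```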

2.5.3.2 Mathematical approach for establishing the CND norms

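Parent and Dafir (1992) indicated that plant tissue composition forms a d-dimensional nutrient arrangement, i.e., a simplex ($S^d$) made of d + 1 nutrient proportions, comprising d nutrients and a filling value, defined as follows:

$$S^d = \{(N, P, K, \ldots, R_d) : N > 0,\ P > 0,\ K > 0,\ \ldots,\ R_d > 0,\ N + P + K + \cdots + R_d = 100\} \qquad (10)$$

Where 100 is the dry matter concentration (%)

N, P, K… = Nutrient proportion (%)

$R_d$ = The filling value computed as

$$R_d = 100 - (N + P + K + \cdots) \qquad (11)$$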

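The nutrient proportions become scale invariant after they have been divided by the geometric mean (G) of the d + 1 components, including the filling value $R_d$ (Aitchison, 1986), as follows:

$$G = (N \times P \times K \times \cdots \times R_d)^{1/(d+1)} \qquad (12)$$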

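After calculation of the geometric mean (G), the new expression for the multi-nutrient composition is log-transformed to generate the row-centered log ratios as follows:

$$V_N = \ln\left(\frac{N}{G}\right),\quad V_P = \ln\left(\frac{P}{G}\right),\quad V_K = \ln\left(\frac{K}{G}\right),\quad \ldots,\quad V_{R_d} = \ln\left(\frac{R_d}{G}\right) \qquad (13)$$

The sum of the row-centered log ratios must be equal to zero, i.e.,

$$V_N + V_P + V_K + \cdots + V_{R_d} = 0 \qquad (14)$$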
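
The steps above (filling value, geometric mean, log transformation and the zero-sum check) can be sketched in a few lines. The function name, the "Rd" key and the leaf concentrations are hypothetical and serve only to illustrate the computation.

```python
import math

def clr_transform(nutrients):
    """Row-centered log ratios for one tissue sample.

    nutrients: dict mapping nutrient name to concentration (% dry matter).
    The filling value (100 minus the sum of nutrients) is added under the
    hypothetical key "Rd" before the transformation.
    """
    filling = 100.0 - sum(nutrients.values())
    parts = dict(nutrients, Rd=filling)
    # ln(G): mean of the logs of all d + 1 components
    log_g = sum(math.log(v) for v in parts.values()) / len(parts)
    # ln(X / G) = ln(X) - ln(G)
    return {name: math.log(v) - log_g for name, v in parts.items()}

# Hypothetical leaf concentrations (% dry matter), for illustration only.
leaf = {"N": 3.0, "P": 0.3, "K": 2.0}
clr = clr_transform(leaf)
zero_sum = sum(clr.values())  # zero up to floating-point rounding
```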

From this, the CND norms are the means and standard deviations (SD) of the row-centered log ratios of the high-yielding subpopulation from the yield and nutrient concentration database. The additivity or independence among the compositional data is ascertained using the clr transformation (Aitchison, 1986).

After obtaining the clr values, the database is iteratively partitioned into two subpopulations using the Cate-Nelson procedure, once the observations have been ranked in decreasing order of yield, as described by Khiari et al. (2001).

In the first partition, the two highest yield values form one group, and the remainder of the yield values form the other; in the second partition, the three highest yield values form one group, and the remainder form the other. This process is repeated until the two lowest yield values form one group, and the remainder of the yield values form the other. At each iteration, the first subpopulation comprises $n_1$ observations, and the second comprises $n_2$ observations, for a total of n observations ($n = n_1 + n_2$) in the whole database. For the two subpopulations obtained at each iteration, one must compute the variance of the CND clr values ($V_X$).

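The variance ratio for component X at the ith iteration can be estimated as follows:

$$f_i(V_X) = \frac{\text{variance of } V_X \text{ of the } n_1 \text{ observations}}{\text{variance of } V_X \text{ of the } n_2 \text{ observations}} \qquad (15)$$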

The cumulative variance ratio function, $F_i^C(V_X)$, is then computed as the sum of variance ratios at the ith iteration from the top. The cumulated variance ratio for a given iteration is expressed as a proportion of the total sum of variance ratios across all iterations, so that the discrimination power of $V_X$ between low-yield and high-yield subpopulations can be compared on a common scale. It is computed as:

$$F_i^C(V_X) = \frac{\sum_{j=1}^{n_1 - 1} f_j(V_X)}{\sum_{j=1}^{n - 3} f_j(V_X)} \times 100 \qquad (16)$$
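
The iterative partition and the cumulative variance ratio function can be sketched as below. This is a minimal illustration under stated assumptions: the function name and data are hypothetical, the sample variance is used, and no guard is included for a subgroup with zero variance (which would raise a division error).

```python
import statistics

def cumulative_variance_ratios(yields, clr_values):
    """Variance ratios and cumulative function (%) for one nutrient.

    Observations are ranked by decreasing yield; the top group grows
    from 2 to n - 2 observations, giving n - 3 partitions.
    """
    ranked = sorted(zip(yields, clr_values), key=lambda t: t[0], reverse=True)
    v = [value for _, value in ranked]
    n = len(v)
    ratios = []
    for n1 in range(2, n - 1):  # n1 = 2 .. n - 2, so each group has >= 2 values
        top_variance = statistics.variance(v[:n1])
        rest_variance = statistics.variance(v[n1:])
        ratios.append(top_variance / rest_variance)
    total = sum(ratios)
    cumulative, running = [], 0.0
    for f in ratios:
        running += f
        cumulative.append(100.0 * running / total)
    return ratios, cumulative

# Hypothetical yields and clr values for one nutrient, for illustration only.
yields = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0]
clr_n = [0.10, 0.12, 0.08, 0.30, -0.10, 0.25]
ratios, cumulative = cumulative_variance_ratios(yields, clr_n)
```

With six observations the loop produces three partitions, and the last cumulative value is 100 % by construction, since the denominator is the sum over all partitions.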

Where $n_1 - 1$ is the partition number and n is the total number of observations ($n_1 + n_2$). The denominator is the sum of variance ratios across all iterations and is thus a constant for nutrient X. The cumulative function $F_i^C(V_X)$ related to yield (Y) shows a cubic pattern:

$$F_i^C(V_X) = aY^3 + bY^2 + cY + h \qquad (17)$$

Where

h = intercept

a, b and c = parameter coefficients.

The optimum partition between the two subpopulations is defined as the inflection point (IP), i.e., the point where the model changes concavity. It is obtained by first taking the derivative of the cubic function of yield (Y) above:

$$\frac{\partial F_i^C(V_X)}{\partial Y} = 3aY^2 + 2bY + c \qquad (18)$$

And then the second derivative, which is set to zero:

$$\frac{\partial^2 F_i^C(V_X)}{\partial Y^2} = 6aY + 2b = 0 \qquad (19)$$

The yield cutoff value is obtained as $Y = -b/(3a)$, and the highest yield cutoff value across nutrient expressions can be selected to ensure that the minimum yield target for a high-yield subpopulation will be classified as high yield whatever the nutrient expression.
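
The inflection point of a cubic follows directly from setting its second derivative, 6aY + 2b, to zero. The sketch below computes the resulting cutoff and picks the highest one across nutrients. The function names and coefficients are hypothetical, and the fitting of the cubic itself is not shown.

```python
def yield_cutoff(a, b):
    """Inflection point of F = a*Y**3 + b*Y**2 + c*Y + h, where 6aY + 2b = 0."""
    if a == 0:
        raise ValueError("a must be non-zero for a cubic inflection point")
    return -b / (3.0 * a)

def second_derivative(a, b, y):
    """Second derivative of the cubic with respect to Y."""
    return 6.0 * a * y + 2.0 * b

# Hypothetical fitted coefficients (a, b) per nutrient, for illustration only.
coefficients = {"N": (0.5, -9.0), "P": (0.4, -6.0)}
cutoffs = {x: yield_cutoff(a, b) for x, (a, b) in coefficients.items()}

# Retain the highest cutoff across nutrient expressions.
selected_cutoff = max(cutoffs.values())
```

The second derivative changes sign on either side of the cutoff, which confirms that the point is a change in concavity rather than a maximum or minimum of the function.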

2.5.3.3 Establishing the CND Index

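After establishing the CND norms as the means and standard deviations of the clr values of the nutrient concentrations in the maize ear leaf tissue, denoted as $V_N^*, V_P^*, V_K^*, \ldots, V_{R_d}^*$ and $SD_N^*, SD_P^*, SD_K^*, \ldots, SD_{R_d}^*$, the CND indices (I), denoted as $I_N, I_P, I_K, \ldots, I_{R_d}$, were calculated from the clr values as follows:

$$I_N = \frac{V_N - V_N^*}{SD_N^*},\quad I_P = \frac{V_P - V_P^*}{SD_P^*},\quad I_K = \frac{V_K - V_K^*}{SD_K^*},\quad \ldots,\quad I_{R_d} = \frac{V_{R_d} - V_{R_d}^*}{SD_{R_d}^*} \qquad (20)$$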

The index is defined as the distance of a given nutrient $X_i$ from its geometric mean, relative to the distance of the same nutrient from the geometric mean of the target population (the reference population with high yield).

From this point of view, it is expected that when the CND index of nutrient $X_i$ is closer to zero, that nutrient is less imbalanced than the others in the analysis. Serra et al. (2010a, b) observed that CND indices close to zero indicated higher nutritional balance.
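
A CND index is a standardized deviation of a sample's clr value from the norm. The sketch below illustrates this with hypothetical clr values and norms; none of the numbers or names come from the cited studies.

```python
def cnd_indices(clr, norm_mean, norm_sd):
    """Standardized deviation of each clr value from its norm."""
    return {x: (clr[x] - norm_mean[x]) / norm_sd[x] for x in clr}

# Hypothetical clr values of one sample and hypothetical norms (mean, SD)
# of a high-yield subpopulation, for illustration only.
sample_clr = {"N": 0.50, "P": -1.80, "K": 0.10, "Rd": 1.20}
norm_mean = {"N": 0.40, "P": -1.70, "K": 0.20, "Rd": 1.10}
norm_sd = {"N": 0.10, "P": 0.20, "K": 0.10, "Rd": 0.05}

indices = cnd_indices(sample_clr, norm_mean, norm_sd)

# The component with the largest absolute index is the most imbalanced.
most_imbalanced = max(indices, key=lambda x: abs(indices[x]))
```

In this example the indices for N and K have the same magnitude but opposite signs, so a positive or negative index indicates relative excess or shortage with respect to the norm.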

The CND indices are standardized and linearized variables as dimensions of a circle (d + 1 = 2), a sphere (d + 1 = 3), or a hypersphere (d + 1 > 3) in a d-dimensional space.

2.5.3.4 CND nutrient imbalance index (NII)

The NII is the CND $r^2$, as recommended by Parent and Dafir (1992), and is given as

$$r^2 = I_N^2 + I_P^2 + I_K^2 + \cdots + I_{R_d}^2 \qquad (21)$$

Its radius, r, computed from the CND nutrient indices, thus characterizes each specimen. The sum of d + 1 squared independent, unit-normal variables produces a new variable having a $\chi^2$ distribution with d + 1 degrees of freedom (Ross, 1987). Because CND indices are independent, unit-normal variables, the CND $r^2$ values must have a $\chi^2$ distribution function. This is why it is recommended that the highest yield cutoff value (highest discrimination power) among the d + 1 nutrient computations be retained to calculate the proportion of the low-yield subpopulation below the yield cutoff, which is used as the critical value for the cumulative distribution function. As defined by equations 20 and 21, the closer to zero the CND indices are, and thus the CND $r^2$ or $\chi^2$ values are, the higher the probability of obtaining a high yield.
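
The imbalance index is the sum of the squared CND indices. The sketch below computes it for hypothetical indices and then uses a small simulation to illustrate the distributional claim: the mean of a sum of k squared independent unit-normal variables is close to k, the mean of a chi-square variable with k degrees of freedom. The function name, seed and sample size are arbitrary choices for illustration.

```python
import random

def cnd_r2(indices):
    """Nutrient imbalance index: sum of squared CND indices."""
    return sum(i * i for i in indices)

# Hypothetical indices for d + 1 = 4 components, for illustration only.
r2 = cnd_r2([1.0, -0.5, -1.0, 2.0])  # 1 + 0.25 + 1 + 4 = 6.25

# Simulation: sums of k squared unit-normal draws should average about k.
rng = random.Random(0)  # fixed seed so the run is repeatable
k = 4
samples = [cnd_r2([rng.gauss(0.0, 1.0) for _ in range(k)]) for _ in range(20000)]
mean_r2 = sum(samples) / len(samples)
```

A sample whose r2 lies far in the upper tail of this distribution is unlikely to belong to the balanced, high-yield reference population.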