Computers chem. Engng Vol. 19, No. 12, pp. 1287-1300, 1995
Copyright © 1995 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
Pergamon 0098-1354(94)00119-7
0098-1354/95 $9.50 + 0.00

FACTOR ANALYTICAL MODELING OF BIOCHEMICAL DATA

J. L. HARMON,1 PH. DUBOC2,† and D. BONVIN1,‡
1 Institut d'Automatique, Ecole Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
2 Laboratoire de Génie Biologique, Ecole Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
(Received
8 October
1994; final revision received November
26 October 1994)
1994; received for publication
21
Abstract: Factor Analysis, a multivariate technique for determining the major trends or factors in a data matrix, is shown in this paper to be appropriate for resolving biochemical reaction networks. As opposed to an algorithmic approach, the methods presented in this article are intended to be a highly interactive set of tools. The researcher can use these tools to investigate a data matrix of concentration-change measurements by proposing different reaction networks. Several tools are adapted from other fields, and a few new techniques are proposed. The new techniques involve the estimation (or extraction) of reaction stoichiometries and reaction extents when not all the reactions are present at all times. This article presents theoretical elements and simulation results, as well as an application of the method to experimental data from the fed-batch production of Baker's yeast grown on glucose. Reaction stoichiometries and reaction extents are estimated for the reactions of glucose fermentation, glucose oxidation and ethanol oxidation.
1. INTRODUCTION
In general,
data analysis
is characterized redundant researcher order
in biochemical
by a large
number
measurements.
ing all measurements would
pretation ration
of
of unspecific
is generally
simply
model.
in biochemical these
Thus,
research
measurements
into existing
modeling,
and
control
tion
and the
one of the
The problem and spawned man, FA
of redundant
sciences With spread
the advent
to the field of chemometrics,
for the elucidation good
overview,
method been
adapted
Target
Bonvin,
(Bonvin
1992).
The
these
several
accept
reactions.
researchers
For
a
of chemical
reac-
not only
or reject the
have proposed
aim at
reactions
stoichiometries
biochemical schemes
field, to adapt
† Currently at: Institut de Génie Chimique, Ecole Polytechnique Fédérale de Lausanne, Switzerland.
‡ To whom all correspondence should be sent.
than
a
(1991)
method two
(1989) reac-
assume
may
coefficients
set
is difficult be
to
and
the
unclear.
a method
of
drawback
the value
reactions.
reactions,
proposed
minimum
The major
In that
stoichiometric
of this approach of several
that are rarely
is
stoichio-
accurately
avail-
able a priori. As opposed
to a specific
in this paper
a priori
algorithm,
are meant allows
scenarios
Once
the stoichiometry
the reseacher a reaction
of any newly
unobserved
states and reaction
While niques
are
involve
the extraction
chiometries reaction Section 1287
its
is
data,
to estimate
reactor
operation
1992).
from other fields,
also
network
rates (Stephanopou-
many of the techniques
are adaptations
and
retrieved
19841, or to monitor
and Bonvin,
to pro-
certain types of
can be used to test the
consistency
(Prinz
the methods
to be an interac-
by including
information.
resolved,
10s and San,
1990; Prinz and
does
Saner
define
Hamer
irreversible
stoichiometries
pose different
e.g.
has
of independent
to
more
tive set of tools which
age, (for
work,
presented
(Har-
(TFA),
Analysis
and Rippin, method
in
1991). A particular
Factor
the number
but also at helping for
measurements
to the investigation
tion networks determining
of spectral
see Malinowski,
of FA,
(FA)
of the computer
of
another
metric
part of this century
the field of Factor Analysis
1970).
quickly
data was confronted
in the early
for
that it must
and optimiza-
of
this geometrical
coefficients.
integ-
to bioreactors.
that tries to find a sparse
consisting
selection tries
is the inter-
tion methods. the social
network
implement
the data in
their
a method
However,
incorporat-
not available,
like to analyze
FA techniques
presented
experiments
Since a model
to form a meaningful
new challenges
diverse
proposed.
without
shown
These
of reaction having
to
in this work
several
new tech-
new
extents resolve
methods and stoithe
entire
network. 2 describes application
some to
basic properties
biochemical
reactors.
of FA The
description
of
approximate of resolving sections:
the individual
a network.
problem
determination mation
tools
is given
order of their application
of
This is broken
formulation,
information.
and
tion extents from
reaction
using
production
4. Conclusions
and
Problem
the methods
Basically,
FA
experimental of Baker’s
further
yeast
research
in
ideas
=
common
R
D contains
to observations
extents
the matrix
The
different
reactions.
FA
be
can
common
on
applied
the
the matrix
with the rows represent
information)
the and
(species-dependent factors
As
the complete
in
many
found
1970; Malinowski, only
by
in time and the col-
C the stoichiometries
information).
of
and interrelated
(time-dependent
are
then
the
description
references
of
(Harman,
1991), this paper will concentrate
problem
formulation
to bioprocesses,
and
and on several
solution
techniques
of interest. Basic where
In
model.
a set of R reactions
of reaction
of a particular
can be expressed
2
a general
are taking
experiment
place,
the rate
species at a particular
rrrr is the stoichiometric in the rth reaction,
+ Fo,,(t)cs (f). (2)
the rth reaction volume
expressed
of the system,
concentration
the feed
the
and
input
three
terms
represent
in mollh species
form equation
X is the unknown (with the
r= I
volumetric
on the right
hand
rate of
rth
reaction),
stoichiometric
F,,,
rates.
side of equation
the rate of accumulation,
is produced
species
is con-
need to have
biomass,
protein
the input
are The (2) rate
,
(4)
B x S data matrix, of reaction
x,, representing
and
N
is the
(with
senting
the stoichiometry
number
of reactions
the problem
Yr,nT=XN
(measured)
matrix
matrix
the extent of
unknown
R Xs
n:,
repre-
the rth row,
of the rth reaction).
R may also be unknown.
consists of estimating
The Thus,
R, X and N from
D. Data pre-treatment Atomic
At this point it may be
and other balances.
desirable
to inspect
checking
different
For
balance
each
the consistency balances
type,
of the data
(e.g.
a vector
which satisfies the following
carbon,
to be
species.
For
balanced
m can be formed
relationship:
found
the number
of
in the corresponding
balance,
for
the number
found in each of the species. (5),
(5)
of m contains
the carbon
vector contains
by
redox).
Nm=O.
instance,
of carbon
From equations
the
atoms (4) and
one obtains: $Dm = XNm = 0$.
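As a hedged illustration of the balance relationship Dm = XNm = 0, the following NumPy sketch builds a small data matrix that satisfies a carbon balance, corrupts it with noise, and then restores the balance by projecting the rows onto the null space of M^T, the projection D(I - MM+) that the paper gives as equation (8). The species list, the carbon contents in `m`, the two reactions in `N`, the noise level and the name `D_rec` are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed species: glucose, ethanol, CO2, biomass (per C-mol).
m = np.array([6.0, 2.0, 1.0, 1.0])          # carbon atoms per mole of each species

# Two illustrative reactions (rows) that each conserve carbon, so N m = 0.
N = np.array([[-1.0, 2.0, 2.0, 0.0],
              [-1.0, 0.0, 2.0, 4.0]])

# Monotonically increasing extents for B = 30 observations.
X = np.cumsum(rng.uniform(0.1, 1.0, size=(30, 2)), axis=0)
D = X @ N                                   # noise-free data, D = X N

# Additive measurement noise breaks the balance.
D_noisy = D + 0.05 * rng.standard_normal(D.shape)

# Reconciliation: project each row of the data onto the null space of M^T.
M = m[:, None]                              # S x C constraint matrix (C = 1 here)
P_bal = np.eye(M.shape[0]) - M @ np.linalg.pinv(M)
D_rec = D_noisy @ P_bal

print("max |D m| (noise-free):", np.abs(D @ m).max())
print("max |D m| (noisy):     ", np.abs(D_noisy @ m).max())
print("max |D m| (reconciled):", np.abs(D_rec @ m).max())
```

Because the noise-free data already lie in the null space of M^T, the orthogonal projection can only remove error components and never moves the reconciled data further from the noise-free matrix.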
or mass
the overbar
F,,, and flow
or
(3) reads:
B x R extents
the rth column,
or g/h. V is the
with
concentration.
output
of the sth
ir is the reaction
and c, is the molar
of the sth
indicating
coefficient
(e.g.
moles
An amount
species if the
structure
D is the known
where
balance
Here,
and species.
can be used).
D-2
units
r=I
species
defined
Each of the S elements
- Fi,(t)c
(in
do not necessarily
time
as:
=d(vyt))
i,(t)n,
reactor
The species
D contains
matrix
consumed
decreases
character-
where
The
or
(1)
and the columns
R could
the matrix
respectively. produced
if the corresponding
a chemically
(3)
dt,
t,, the. times of the first and the bth
conversely,
In matrix
.
measurements
F,,(t)E, dt + ” F,,&)c&) I 1x1 trrr
or pseudo-species
CDl”rn”S
separate
In an application
to species,
reaction
fzxbxsr
the rows
composition
corresponding umns
c
rowsxfactors
with
an observed
- v(GefM&e‘)
for each observation
increases
to express
D, respectively,
factors.
amount
grams)
of two matrices:
R and C represent
the data matrix
rrcr and
observations,
sumed.
D
istics associated
with
and,
rows x columns
The matrices
-I
data
BACKGROUND
attempts
of
lh
the
as the product
(2) can
the bs-elements
f,, > trer):
V(t*)cs(tb)
=
in
and reac-
formulation
data matrix
D (for
Equation
to give
of
are also presented. 2. THEORETICAL
with time
transfor-
extraction
stoichiometries
are estimated
the fed-batch
Section
the data matrix
into five
results using both noise-free
and noisy data are used to illustrate Section 3. Finally,
rate, respectively.
down
of reactions,
quantities,
Simulated
and the output be integrated
data pre-treatment,
of the number abstract
in the
in the process
Note
also that equation
(6) implies
matrix satisfies the balance. metries derived
that if the data
then any set of stoichio-
from D will also satisfy that balance.
For the case of C constaints reads:
(6)
(balances),
equation
(6)
Factor analytical modeling of biochemical data (7)
where
Equation
(7)
singular
values,
lie in the
null
values),
U is the B x S orthonormal
DM=O, with M a SX C matrix indicates space
that each
of
errors, The
owing
not necessarily
can
constraints
of D must
However,
MT.
D does
data
of constraints.
row
then
be
(Bonvin
to
measurement
satisfy
reconciliated
by projecting
as follows
equation
(7).
to verify
the
D on the null space
and Rippin,
of MT
1990):
D,=D(I-MM+), where
M+
represents Factor
Scaling. magnitude
are many The
ways
the standard
deviation
species
column
weight more
accurately
analysis
measurement
measured
making
There
elements.
species
of
error for each giving
errors.
equal
This has of the
(9).)
their interpretation
with
more diffi-
space
An
alternative
ties (i.e.
way
consists
to the range
the variability
Once
in scaling
so that rank(D)
values
N must be post multiplied physically
meaningful,
unscaled
(in
cause
these singular
rank
of
D
to
determine
After
the
according
of the number data
have
S.
to equation
independent
properly
(3) and scaled,
This can be done
1).
i=
largest initial
The
reactions
occurring
of
simultaneously
must be determined. In the case of no measurement gives
the
number
Since noise cannot
be avoided
tial
noisy
rank
of
determined. portion
only errors.
no
Singular Value
loss
Decomposition
of generality
than the number
B>
then
(Horn
hopefully,
to
be of a
contains (SVD)
is
the
SVD
and Johnson, D=
(SVD) . Assuming
Decomposition
greater S),
the essen-
needs
entails the removal
which,
Value
matrix
R.
for that purpose.
Singular with
this matrix
the rank of reactions
in practice,
data
This necessarily
of
a method
the
noise,
of independent
that the sample of measured
of
D can
be
1985):
LEV==
size is
species expressed
(i.e. as
aiuiv:,
are zero,
of noise will
subsequent
step
values
can
is to
be
neg-
data.
Singular
are
reconstructed
and
are
values
variance
Malinowski
value
with
compared
are added
the
to the
in one
at a
until the data is adequately ratio
(1987)
whether
test using
developed
expressed the singular
an F function
a particular
singular
value,
similar to the S - n smaller
singular
(see Appendix). qualitative
method
representing
true signals
bit rapid
fluctuations.
the autocorrelation
tive measure
Shrager
correlation
between
to different
singular
vector random)
overview number
of many
other
of factors
the authors’ no
small (e.g.
Malinowski
opinion,
one amount
of
(1991) methods
present
these
will
of trial and error
if a
measure-
between
smaller
its
than 0.5).
gives an excellent to determine
the
Overall,
generally
It is usually
methods
corres-
its autocorre-
in noisy data.
+- 1 factor.
these
with
is expected and hence,
its
as there will
In contrast,
associated
elements,
lation will be relatively In addition,
large
If a
factor,
the elements
ment noise, little correlation (mostly
true
observations.
is mostly
(1982)
as a quantita-
(see Appendix).
will be relatively
ponding
be smooth
and Hendler
to a
of
of this
noise will exhi-
function
corresponds
autocorrelation be some
should
of this smoothness
vector
vectors
with random
proposed singular
is the inspection
of the matrix U. Column
while others associated
certain
(9) using only singular
This idea can be quantitatively
a Fisher
a,, , is statistically
that
by reconstructing
to equation
with the largest
values
same result within 2
values
qualitatively
data
singular
matrix
arranged
of
to be non zero and the
singular
the data matrix according
the columns
the number
the number
singular
The
many
cc&
the rank
lected and set to zero.
values
of reactions
been
as they
In the case of no measure-
values
be
how
Another
Determination
S in
to as-
important
this application,
the last S-R
to investigate
stoichiome-
it is necessary
of the corresponding
reactions).
ment noise,
explained.
tries.
space
replace
and the rank of D gives R. The presence
with
of W to
the row
B, L? would
urns of U and V on D and, thus, can reveal
values.
with the inverse
set of vectors
spanning
are notably
the significance
spc-
the computed
the
decreasing
2 R.
The singular
data
in each column).
X and N have been estimated,
obtain
the
of the corresponding
by
of D, and V is the S x S
S>
time in the summation
with respect
containing
sume that both B and S are larger than or equal to R
(i.e.
the species
matrix
(ordered
Furthermore,
of decreasing
cult.
D
equation
the term associated
the
D
set of vectors
in the numerical
of the results regarding
a low weight,
W.
the importance
of D, but the disadvantage
significance
the data
matrix,
of
of D. (In the case where
independent
essentiaHy
of increasing
orthonormal
of the system
of the expected
to the different
the advantage
of
a,,
the column
indicate
is to use the inverse
of D),
spanning
be normalized.
the diagonal
method
Z is the S X S diagonal
it is imperative
multiplying
weighting
of choosing
most common (i.e.
Hence,
by post
with a diagonal
of M.
by the order
of the data matrix
is accomplished
matrix
is affected
of the variables.
that the columns This
(8)
the pseudo-inverse
analysis
give
in the
accepted
is perfect,
and
must be used.
a
Once been
the number
determined,
of independent
reactions
the last S - R singular
set to zero, and D is approximated
R has
values
are
using R factors
as
last rows sponds ward
of D. Note
now
matrices
observation D&+r=g
o,u,vT,
(IO)
,=I
The
where
the superscript
^ denotes
singular
value
noise-induced
may
be considered not imply
only error.
vant
the
Some
reaction since
Conversely, S-dimensional
the
(10)
The
insignificant
and
that its removal
will
loss of information error
is
rele-
unavoidable.
spans
the
in the reduced-rank
strategy,
however,
than valuable
relationship
(4)
is that more
from
(II)
N, = irT,
(12)
and N. a RX
S abstract
“a”
is used
abstract and represent true matrices In
the
matrix
factors.
this (10)
Hence, time
tions.
It is obvious
some
means
pearance
of
paper, by
profiles
reactions.
for
Factor
evaluated
vation.
For each such matrix,
for
independent
reaction
of a singular
vector
correlations
vs time
of independent
This method direction dent formed
be plotted
reactions.
the disappearance
The
by successively
measure-
backward adding
data rows
spanned
a
Bonvin
physically
and Rippin
meaningful
(1990)
proposed
ntarr i.e. a known vector, solution
stoichi-
t, can be calcuto
(15)
of the target vector
the +
each
the
superscript
indicates
the Moore-Penrose 1985)
and
(16)
the pseudo-inverse
conditions
P is the
stoichiometric of
advantage when
the
coefficients reactions.
R
of TFA
particular
missing
is the ability elements
(Bonvin
and
be
great
to test targets
even
of the target
Rippin,
sensitivity of indepenare
of the
analysis
In general, to biochemical target
termed
free-floating,
(1992)
have
elements.
In
developed
a
of TFA.
target testing is more than chemical
stoichiometries
sely. However,
are
Malinowski,
for the unknown
et al.
vector
1990;
to TFA,
Harmon
for
a
1991). This extension addition,
known
Consequently,
also yields predictions
in the reverse
and
projection
matrix T, at least
must
assothe
(Horn
resulting
To solve for the transformation
obserof
onto the space
by the rows of N,:
matrix.
the R
matrices
N..
(14)
The auto-
in front
and
for a parti-
nT=nT tarN+N. il * =n? fdlP .
obser-
to resolve
a match
n:, = nT + &T= tTN, + eT,
Johnson,
reactions.
can also be employed
to resolve
to find
preTarget
i.e. a row of N is
the transformation
Here
formerly
significantly.
been
method,
stoichiometries
that given a target vector,
satisfying
The appear-
ciated with noise to increase appearance
stoichiometry.
with this
will cause
have
attempts
n is the transformed,
where
Analysis
the autocorrelation
in U are computed.
RxR
nT = tTNar
to have
each
the
the transformation
cular stoichiometry,
as the projection
In this method,
(13)
finding
where E is the model error. Thus, n can be estimated
reac-
the data between
to
popular
known
(13),
and disap-
and the present
is successively
can
equation
= XN.
methods
One
(TFA),
a priori
the D.
detecting
of spectral
experiments.
vation
autocorrelation
only
of notation,
et al., 1987) was developed
vectors
Analysis
T.
lated as the least-squares
full-rank
be helpful
Evolving
of the experiment
of a new
the
the appearance
data matrix containing
the singular
Factor
approximated
retaining
that it would
in titration
beginning
are
for simplicity
idea in mind for the elucidation forward
The
D will also be labeled
of detecting
(Gampp
ments
matrix. matrices
bases for the
systematically
matrix
Autocorrelation
(EFA)
of
be
the reduced-rank
to identify
and are there-
T.
Many
sented
ometry,
to equation
dominant
these
in (4). will
according
stoichiometric because
matrix
testing.
of N,
of those matrices:
reduces
extents matrix
only orthogonal
remainder
D
reaction
now
transformation
From
to
and
by the rows
of X,, respectively,
combinations
problem
between
x,=&z
will
vector
stoichiometries
spanned
X,N, = X,T-‘TN,
Target
be obtained
reaction
a singular
meaningful
lie in the spaces
and the columns
The
noise
of
of abstract quantities
physically
The extents
information.
can now
X, is a B x R abstract
subscript
Transformation
whole approx-
an
significantly.
fore linear
as follows:
where
ance
fact that a
back-
data between
of an independent
autocorrelation
row space of D, some error will natur-
will be removed The
associated
The
network
ally have to be accepted imation.
a matrix
matrix.
does
eradicate to
data
the original
time correHence,
instant and the end of the experiment.
the
decrease with the reduced-rank
contain
disappearance
cause
that the reference
to that of the first row.
target
are
difficult to apply
data since macroscopic often
testing
not
known
can be used
preci-
to check
available the
reaction
literature
stoichiometries
or
compatibility
computed
(available
by
with measured
other
simultaneous
reaction,
substrates
cose and oxygen) by-product)
for
of two
and ethanol,
or of two products
glu-
(biomass
to the value
main substrate.
one
constraints
e.g. the exclusion
(glucose
in addition
reaction’s
processes,
two stoichiometric
each macroscopic
for
data.
Reduced target. In fermentation can often propose
from
means)
Therefore,
subset
with
reactions
an alternative
common
the
natural
progression
involve
all
possible
to divide
the
of reactions.
to apply
of an
reactions
experiment times.
the data in order
As shown
below,
It
not
method
known.
More
a researcher
may
be
able
projected
onto
(e.g.
oxygen
From
the extracted of reaction
reaction
along
with
extents leads to the extraction
equation
(4) and (13)
give:
computed,
(17)
quantities
X, and N, can be
but not the physically
meaningful
(or
one obtains:
X’N’P ~‘N’P+X;;_,N,_,P
I’ 1
=[-&-P=[$-]N, where part
the tilde indicates of the matrix
(=)
a projected
which
matrix,
is orthogonal
have used the fact that N’P = 0. Note only
information
associated
X and
The
matrices
respectively.
i.e. that
to N’.
with the remaining
autocorrelation
appearance
or disappearance
D’,
data
matrix,
only
that section
reactions ometry,
can
can
N:,
be
of rows
is present.
Using
reveal
constructed
by
of D where
a subset
SVD,
including
an abstract
quantities
according
A comparison
to equations
of equations
abstract
(lo)-(12):
I-l 0
of r will
set of r
(r < R):
and
XL
both describe the same space, of
b.
If
only
one
reaction
= X’N’.
(18)
XL and N: can be computed, matrix
D can be written
the single
partitioned
i.e. the column is present
space
in D
(i.e.
L-1 x”
as:
is simply
[;I
= [&J
(19)
[C]
proportional
measurement ment
where
2,
0
but not X’ and
The above D=XN=
that
R - r = 1). its extent
N’. The
(24)
(23) and (24) shows
the matrices
stoichi-
D’ which
from
the
another
space of this reduced
D’ = X:N: Again,
profiles
of reaction(s),
be calculated
span the stoicbiometric reactions
time
set
R and R-r,
of D gives the following
fi = f,A,. the
We
that D contains
D and D are of rank
SVD
N. If
N:)
of R - r reactions.
D = X,N, = XN. that the abstract
to N’
(22)
D=DP= ~!N'+X;;_,NR_,
in fermen-
is
D can be
matrix:
X’N’
from
Extraction of reaction extents. For a set of data D
Note
(19)-(22),
to
once
r reactions
matrix
orthogonal
projection
may
stoichiometries.
with R reactions,
(21) it is possible
the data
the space
equations
in a certain
does not participate
This type of a priori information
tation).
(20)
P=I-(N’)+N’=I-(N:)+N:.
to indicate
specific species which did not participate
set-up,
we can write:
extraction,
precisely,
=
reaction
of
using the following
the data. Similarly.
the available
the extent space for the R - r reactions
to isolate subsets extents
the r
N,=N’. the
is then
this information
or extract reaction
be used to identify
will
the
that the
with
x:=x’
information
For instance,
at all
of
itself and to
priori
a
systems.
of
to bioche-
is to take advantage
in the data matrix
types
in biochemcial
From
the stoichiometric space for the other
testing is difficult
information
incorporate
associated
definitions,
the identifica-
Extraction of information
inherent
respectively.
and with the above
compute
Since target
are
in D’ and with the remaining R - r
present
reactions,
where
mical systems,
quantities
that
simultaneously.
the The
r and R - r are used to indicate
corresponding
With
occur
D”
subscripts
and a
- 1 for
represents
up to R reactions.
data with possibly
tion of T using target testing is often limited to cases only ?WO reactions
r reactions;
only
remaining
and
double
primes
set of data: D’ is associated
represent
noise
to t..
derivation noise. would
the
(computed
from
with the data
chiometric
space.
assumed
the presence
In experimental cause
D’)
the space
to deviate
Subsequently.
of no
data, measurespanned
from
by N:
the true stoi-
this would
cause
errors in the projection given by equation (23). Nevertheless, the hope is that the rank reduction of D' from S to r according to equation (10) removes much of this noise.
Extracting reaction stoichiometries. The extraction of reaction stoichiometries can be handled in the same fashion as extent extraction. The method can be used to extract the stoichiometric space for R - r reactions once the extent space for the other r reactions is known. Basically, if a set of r reaction extents $X_r$ can be formed, for example using the extraction technique described above, the data matrix D can be projected on the space orthogonal to $X_r$ (or $X_{a,r}$) using the following projection matrix:
$P = I - X_r X_r^{+} = I - X_{a,r} X_{a,r}^{+}$.
(25)
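The projection of equation (25) can be sketched numerically as follows. This is a minimal NumPy illustration under stated assumptions: the dimensions, the random extents `X`, the stoichiometric matrix `N`, and the choice of the first species as key species are all hypothetical, and the data are noise-free. It treats the case R - r = 1, where the dominant right singular vector of the projected data is proportional to the remaining stoichiometry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical system: B = 50 observations, S = 4 species, R = 3 reactions.
B, S, R = 50, 4, 3
X = np.cumsum(rng.uniform(0.1, 1.0, size=(B, R)), axis=0)   # extents (B x R)
N = np.array([[-1.0, 2.0, 0.0, 1.0],
              [0.0, -1.0, 1.0, 2.0],
              [-1.0, 0.0, -2.0, 3.0]])                       # stoichiometries (R x S)
D = X @ N                                                    # noise-free data, D = X N

# Assume the extents of the first r = 2 reactions are already known.
r = 2
X_r = X[:, :r]

# Projection onto the space orthogonal to the columns of X_r, as in equation (25).
P = np.eye(B) - X_r @ np.linalg.pinv(X_r)
D_proj = P @ D                                               # rank R - r = 1

# The dominant right singular vector spans the remaining stoichiometric space.
U, s, Vt = np.linalg.svd(D_proj, full_matrices=False)
n_a = Vt[0]

# Scale with respect to a key species (assumed: first species, coefficient -1).
n_est = -n_a / n_a[0]

print("singular values of projected data:", s)
print("extracted stoichiometry:", n_est)
print("true stoichiometry:     ", N[2])
```

With noisy data the second and later singular values would not vanish, and the quality of the extraction would depend on how well the known extents span the true extent space.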
$\begin{bmatrix} N_r \\ N_{R-r} \end{bmatrix} = P X_r N_r + P X_{R-r} N_{R-r}$
The extents of four independent reactions are generated to form the matrix $X = [x_1\; x_2\; x_3\; x_4]$. By construction, only three reactions occur simultaneously, giving three subsets of data according to Table 1. Each subset of data contains 100 points.

Table 1. Reactions present in the data subsets D1, D2 and D3
Data subset: D1, D2, D3
Reactions: R1, R2, R3, R4
Number of reactions: 3, 3, 3
-1
The matrix $\tilde{D}$ is of rank R - r. SVD of $\tilde{D}$ gives the following abstract quantities:
N=
[I 0 7 d
x and $\tilde{x}_a$, and between n and $\tilde{n}_a$). However, only the stoichiometries or the extents, but not both, can be scaled arbitrarily. If both were, their product would not equal D. Usually, the stoichiometries are scaled with respect to a key species (e.g. the value -1 for one of the substrates). Furthermore, since the stoichiometries are extracted using subsets of the data, it is useful to check their compatibility with the complete data set, for example by using the extracted stoichiometries as targets in a TFA scheme. In the case of a discrepancy, the projected target can be used to form the matrix N. The corresponding reaction extents can then easily be computed as follows: (28)
0
-1
-1
-3 0
[ -2
(27)
A comparison of equations (26) and (27) shows that the matrices $N_{R-r}$ and $\tilde{N}_a$ both describe the same space, i.e. the row space of $\tilde{D}$. If only one reaction is present in $\tilde{D}$ (i.e. R - r = 1), its stoichiometry n is simply proportional to $\tilde{n}_a$. Reconciliation of extracted quantities. As part of the extraction procedures described above, the reaction extents and the reaction stoichiometries are both scaled arbitrarily (proportionality factors between
$X = D N^{+}$.
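The reconciliation step, scaling the stoichiometries with respect to a key species and then computing the extents from X = DN+, might look as follows in NumPy. The matrices, the arbitrary scale factors and the key-species indices in `key` are hypothetical choices made only for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical noise-free system: B = 40 observations, S = 5 species, R = 3 reactions.
B, S, R = 40, 5, 3
X_true = np.cumsum(rng.uniform(0.0, 1.0, size=(B, R)), axis=0)
N_true = np.array([[-1.0, 0.0, 1.0, 2.0, 0.5],
                   [0.0, -1.0, 2.0, 1.0, 1.0],
                   [-1.0, -2.0, 0.0, 1.0, 3.0]])
D = X_true @ N_true

# Extraction returns each stoichiometry only up to an arbitrary scale factor.
scales = np.array([2.5, -0.7, 4.0])
N_extracted = scales[:, None] * N_true

# Fix the scale: set the coefficient of a key species to -1 in each reaction
# (assumed key species indices: 0, 1 and 0).
key = [0, 1, 0]
N_scaled = np.array([-row / row[k] for row, k in zip(N_extracted, key)])

# Extents then follow from the pseudo-inverse, X = D N^+ (equation (28)).
X_est = D @ np.linalg.pinv(N_scaled)

print("max extent error:", np.abs(X_est - X_true).max())
```

Scaling only the stoichiometries and letting the extents follow from the data keeps the product of the two equal to D, which is the point made in the text about not scaling both quantities independently.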
in the data subsets
Reaction
$\tilde{D} = \tilde{X}_a \tilde{N}_a$. (26)
twesent
The four independent stoichiometries involve five species:
$= P X_{R-r} N_{R-r} = \tilde{X}_{R-r} N_{R-r}$,
since $P X_r = 0$.
SIMULATED RESULTS
Data generation
Number
The projected matrix $\tilde{D}$ becomes: $\tilde{D} = P D = P\,[X_r \mid X_{R-r}]$
3.
1
3
1
2
2
-3
-4
1
1.
-1
-1
0
3I
The data matrix is simply generated as the product of X and N according to equation (4). We assume that the first two stoichiometric coefficients for each reaction are known a priori. As three reactions occur at all times, we need to propose three stoichiometric coefficients for each reaction in order to solve directly for the transformation matrix T according to equation (13). Since only two coefficients are known for each reaction, the extraction procedure described above is necessary to identify the stoichiometries from D. First, we will illustrate the techniques of factor analysis and extraction on noise-free data. Then, a simulation with 2% noise on each variable will be analyzed. The only available information is the data matrix D and the a priori knowledge of the first two columns of N.
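A data set with this structure can be generated along the following lines. This is only a sketch: the entries of `N`, the random extents, and the pattern in `absent` (which reaction is missing from which subset) are assumptions made for illustration, and absent reactions are simply given zero extent within a subset. The 2% noise is applied multiplicatively.

```python
import numpy as np

rng = np.random.default_rng(42)
n_per_subset, R, S = 100, 4, 5

# Illustrative stoichiometric matrix (4 reactions x 5 species), assumed values.
N = np.array([[0.0, -1.0, -1.0, 3.0, 1.0],
              [-1.0, 0.0, 1.0, 2.0, 2.0],
              [0.0, -3.0, -4.0, 1.0, 1.0],
              [-2.0, -1.0, -1.0, 0.0, 3.0]])

# Assumed pattern: index of the reaction absent from D1, D2 and D3.
absent = [2, 1, 3]

blocks = []
for k in absent:
    Xk = rng.uniform(0.5, 1.5, size=(n_per_subset, R))
    Xk[:, k] = 0.0                       # reaction k does not occur in this subset
    blocks.append(Xk)
X = np.vstack(blocks)                    # 300 x 4 extents

D = X @ N                                # noise-free data, D = X N
D_noisy = D * (1.0 + 0.02 * rng.standard_normal(D.shape))   # 2% multiplicative noise

subsets = [D[i * n_per_subset:(i + 1) * n_per_subset] for i in range(3)]
print("rank of each subset:", [int(np.linalg.matrix_rank(Dk)) for Dk in subsets])
print("rank of complete D: ", int(np.linalg.matrix_rank(D)))
```

Each subset then has rank three while the complete matrix has rank four, which is the situation the extraction procedure is designed to exploit.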
Extraction procedure in the case of noise-free data
The forward autocorrelation profiles (Fig. 1) clearly indicate the presence of three reactions initially. The sharp increase in the autocorrelation of the fourth singular vector after about 100 observations is an indication of the appearance of a fourth reaction. Similarly, the backward autocorrelation profiles (Fig. 2) show that three reactions are present towards the end of the experiment. The decrease of the autocorrelation of the fourth singular vector after about 200 observations is an indication of the disappearance of a reaction.
which
vector
by NI.~.
of the
and equation
is an indication
is easily (12).
ing to equations fi has only
dominant
loo
300
200
the above
Now n, by
projecting
By
by
the
The autocorrelation profiles of the singular vectors (noise-free data).
D
two
stoichiometric
1.0
determined
from
1
times
-0.5
300
200
103
The
300
obsavalion
autocorrelation profiles of the singular vectors (noise-free data). and
backward
tions. Three
data subsets
of
the
profiles:
D, (observations
lOl-200), can
be
shown
to contain
m,
x1 can
(22)
four
obtained.
is based
some
of
one reaction.
Figure
The extraction
we propose
individually
the
can be
has
tion (28).
been
and three (x2, xX). A for n,
obtained
of the data and procedure
with
et al., 1992). x,. From
the
spanned
by
according
to
space
This
and extents may
have
not
be
been scaled
to first scale the stoichio-
with
respect
to a key
the extents according
procedure
original
noise-free
allowed and
species, to equa-
us to reproduce
the extents
of the
data.
D2 (observations
201-300).
Each
three
data set
Extraction
of extents
and
and compar-
by the occurrence
of
the procedure.
of x, is first realized
the data sets DI and D which were
by comparing
shown
respectively.
metric space for the three reactions
in the case of noisy data
simultaneous
on the analysis
3 illustrates
three and four reactions,
values
them
these
that differ
stoichio-
n,
computed
stoichiometries
consistently,
noisy
The
stoichiometries
two
and (23).
As
metries
be
the stoichiometries
strategy. The extraction
ison of data subsets
to
can be
This way,
at extracting
exactly
reactions. Extraction
the
reactions.
and (nl,n4)
of the stoichiometric
pro-
only
of
extraction
reac-
l-100),
D3 (observations
space
necessary
data (Harmon
autocorrelation from
T
of the consistency
of four independent can be defined
other
(x2, x_,), (x3, ~4) and
and then to compute the presence
the
determine
stoichiometry
final step is aimed
equations
*
to that
stoichiometries
(nl,n2)
to the available
n2, q 3 and
200
two
knowledge
of the different
sensitivity
The
to
can
matrix
the
in turn
knowledge
Ica
the
(n,,n3),
to R2, R,
and (26).
for each reaction.
that
using
respect
-1.0
one
for
the
can be indicative the
forward
x2
to extract
four times: once using (x2, xj, G)
comparison
I
!
The
obtains
from (x2, x4), (x3, x4) and (x2, x3), respecti-
Note
extracted
0.0
(25)
perpendicularly
meaningful
coefficients
vely.
files indicate
one
perpendicularly
extents,
physically
computed
3 0.5
data
space
stoichiometries
Fig. 2. Backward
1). SVD to xg.
it is possible
2X2-transformation
obtain metric
0
procedure,
is
(i.e.
the data sets Dz and D and 4
to equations
projecting
spanned
0
reaction
is proportional
determined,
space according
-1.0
fourth
matrix which
that the extent space corresponding
and R4 has been
4
value
and D, respectively.
Otservatiw
Fig. 1. Forward
accord-
in D,, see Table
%. which
and ~4 by comparing
to N,,.
singular
not present
of b gives directly Repeating
of D1
The projected
with the additional
R3, the reaction
via SVD
D perpendicular
(22) and (23).
one
associated
determined
The next step consists in project-
ing the data matrix
of a reaction.
0
1293
The
to contain stoichio-
in D1 is spanned
data
matrix
noise-free
measurements
corrupted
with
The
number
determined lation
gaussian
noise.
of time
by the forward
and backward
autocorre-
of
of noise,
The
multiplicative
from changes
as a function
ing the appearance tion.
is constructed
reactions
profiles
Because
2%
of
D
of concentration
data
the
left
singular
vectors
there is some uncertainty or the disappearance
where
the
number
of
of
is D.
in locatof a reac-
reactions
is
Fig. 3. Extraction procedure for the reaction system given in Table 1 (X ext = extent extraction; N ext = stoichiometry extraction; T = transformation).
The last step in the extraction procedure deals with the reconciliation of stoichiometries and extents. TFA is applied to the four extracted stoichiometries. n1 is chosen as the average of the three extractions with (x2, x4), (x3, x4) and (x2, x3). There is no noticeable discrepancy between the spaces spanned by the extracted stoichiometries and by the first four left singular vectors of D. The extents of the corresponding reactions are computed using
ambiguous are simply discarded. Consequently, the data subsets D1, D2 and D3 are of slightly smaller size than in the noise-free case. The procedure is then the same as that described above for the noise-free case and given in Fig. 3. The extracted and original stoichiometries are listed in Table 2. Because of the addition of noise, the computed stoichiometries differ slightly from those used in the simulation.
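The forward autocorrelation profiles used to delimit the data subsets can be sketched as below. The paper's exact autocorrelation function is defined in its Appendix, which is not reproduced here, so this sketch assumes the common lag-one form (the sum of products of neighbouring elements of a unit-norm singular vector). The function names and the small two-reaction test system are hypothetical.

```python
import numpy as np

def lag1_autocorrelation(u):
    """Lag-one autocorrelation of a vector after scaling it to unit norm (assumed form)."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    return float(np.sum(u[:-1] * u[1:]))

def forward_profiles(D, n_vectors, start):
    """Autocorrelation of the first n_vectors left singular vectors of D[:b]
    for b = start, ..., B, i.e. using data from the beginning up to row b."""
    B = D.shape[0]
    profiles = np.full((B - start + 1, n_vectors), np.nan)
    for i, b in enumerate(range(start, B + 1)):
        U, s, Vt = np.linalg.svd(D[:b], full_matrices=False)
        for k in range(n_vectors):
            profiles[i, k] = lag1_autocorrelation(U[:, k])
    return profiles

# Hypothetical example: the second reaction only starts halfway through.
rng = np.random.default_rng(7)
t = np.linspace(0.0, 1.0, 60)
X = np.column_stack([t, np.clip(t - 0.5, 0.0, None)])
N = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 2.0]])
D = X @ N + 1e-3 * rng.standard_normal((60, 3))

profiles = forward_profiles(D, n_vectors=2, start=5)
print(profiles[[0, 25, -1]])
```

A backward profile follows by applying the same function to the row-reversed matrix `D[::-1]`. Smooth singular vectors give values near one, while noise-dominated vectors give values near zero, so a rise in a profile suggests the appearance of a reaction.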
Table 2. Original and extracted stoichiometries in the case of noisy simulated data
Reaction
RI
Extraction using
h
x4)
Stoichiometry
n,
0 0 0 0 0 0
-1
RI RI RI RI RI
(x2. x3) (x3. -4) average above (x21x3.Qrq) original
a1 ml nl n1 01
-1 -1 -1 -1 -1
Rz R2
(x3. xq) original
% a2
--t -1
R3 R.,
(x2. xq) original
03 03
Rd R,
(X2.%I original
14 nr
0 0 -2 -2
extraction;
1;
-1.00 -1.00 - 1.00 - 1.00 -1.00 -1.00
2.98 2.98 2.98 2.98 2.98 3.00
1.a0 1.00 1.oll 1.00 1.00 1.00
1.02 1.00
2.00 2.00
1.99 2.00
-3 -3
-4.01 -4.00
1.13 1.00
1.20 1.00
-1 -1
-1.27 -1.00
0.14 0.00
2.85 3.00
equation
(28).
In order
fit of the various cent between
to evaluate
extents,
an extracted
in the original
the goodness
the relative
error
of
in per-
data is computed
Factor analysis of data The raw data are first converted
extent and its counterpart
noise-free
equation
as:
(3).
numerically
The
gas
integrated
flow
(29) The relative
errors
are 6.4, 5.3, 6.7 and 3.7%
for x,,
x2. xj and x4, respectively.
ethanol
ments.
to molar
convert
26.5 g/C-m01 (1986),
proposed
species. The
Material and methods The
factor
methods
were
also
data collected
from
a fed-batch
using
experiment
Baker’s
yeast.
Saccharomyces cerevisiae ATCC on a semi-complex with
glucose
to avoid
The KLF
extract
added
each
log
Lab
10-15
min to determine
Mannheim
enzyme
filtration
branes
of
aeration
(1 l/min).
sitions mined were
was used
the glucose
flow
and
of
was grown
that
were
time,
entirely.
reased
and reached
sharply
after
production
the hypothesis F-test
zero.
immediately
the
concentration The
feeding of
set in.
dec-
307.1 for two reactions
correlation
was
0.315 I/h.
of 3.51 is
5.99
is normally
and
significance
(for
2
level),
distributed
compared
2.40 for
in the factor
matrix is projected containing to equation
three
value
is
reactions
com-
of 18.15. The
auto-
of the U matrix are
Consequently,
the influence analytical
R = 3.
of measurement
procedure,
and redox balances
This way,
fied so as to verify
value
with the threshold
the data
onto the null space of the matrix
the carbon (8).
indicates
The first three columns
significant.
to reduce
level
function
values of the columns
of U are found errors
of
test error
measure-
value
significance
computed
[0.92,0.76,0.67,0.46,0.36].
after 6.7 h and the glu-
7.5 h at a rate
at the 5%
reactions:
with the threshold
a
species
value
that the error
of 10.13,
had reached
each function
and a 95%
value
3 h after
with a Chi-square 1983). A relative
threshold
pared
produced
of gross meas-
be rejected.
In order
Ethanol
the
at a rate of
only
during
for
to wait for an
ethanol
of the culture
was stopped
of freedom
cannot
species
Detection
Since the computed than
An
taken the
is assumed
measure-
in small amounts
is investigated
degrees
three
of 1.55 g/l. A feed
errors
than
balances
of unmeasured
and Stephanopoulos,
smaller
at 4°C
of 750 mg/l.
cose consumed restarted
flasks
produced
and
again,
species.
by imprecise
fermentation).
5 mol%
ment.
of 1.6 1 to obtain
at 1.42 h. In order
the first phase
The feeding
5E
of 3;o”C for a
was centrifuged
concentration
samples
At
deter-
in conical
at a temperature
of 37.4 g/l was fed to the reactor
concentration
Ethanol
were
gas measurements
to an initial volume
culture,
during
(Wang
Here
in the substrates
all the measured
be explained
the glycerol
urement
the compo-
22P and Oxymat
The
of 12 h. This inoculum
active
and
oxygen
Ultramat
the biomass
inoculation.
for
30 s.
0.064 l/h beginning
then
nitrogen
meter,
respectively.
an initial biomass
can
elec-
balance,
biomass
error.
that the two
and by the presence
and 79%
with 20 g/l of glucose
solution
ments
mem-
(e.g.
dioxide
taken every
and transferred
0.45,um
The inlet gas flow was measured
of carbon
period
Errors
was measured
on
A gas stream
using Siemens
Initially,
broth
for 48 h at 100°C.
5878
gas analyzers,
of
redox
ethanol,
Note
are 4c +
of available
The
electrons
involve
formed electrons
C,H*O,N,
a - 11.2%
products.
together
Boehringer
with
consumed
the available
glucose,
exhibits
in the
error
i.e. there are
CO*).
compound
there are more
and ethanol
using
kits. The biomass
oxygen
a Brooks
Glucose
measured
10 ml
after drying
of 20.%% with
were
ethanol
during a fed batch of 9 h
for a total of 29 data points. concentrations
consumed,
balance,
involves
to the
than in the products
ethanol,
oxygen,
respect
shows a - 8.9%
must be conserved.
which
1, 11.
is checked:
h - 20 - 3n. The total number trons
1,
atoms in the substrates
For the redox
and pH
the glucose,
balance
in the chemical
The system was sampled
concentrations
with
quantities
ethanol)
(biomass,
Ferm
is used for measured
are [2,4,
of the data
to the glucose
(glucose,
of
weights
of certain
more carbon
deficiencies.
2000, was kept at constant temperature
and biomass
by
yeast,
In addition
for
a 21 Bioengineering
of 30°C and 5, respectively, every
consistency
respect
grown
value
and Käppeli
for the ash content,
selected
the carbon
et al., 1990)
substrate.
was
any medium
fermenter,
The
9763, was
(Randolph
as the limiting
0.5 g of yeast glucose
medium
tested
the
range of the corresponding
The
conservation
analytical
using experimental
measure-
deviation,
The data are scaled with respect to the
approximative 4. EXPERIMENTAL RESULTS
and biomass
by Sonnleitner
and corrected
the biomass.
with a
at the times correspond-
ing to the glucose, To
of are
and then interpolated
cubic spline to be available
_
to the form
measurements
exactly
according
the data is slightly modithose two balances.
The
Fig. 4. Molar deviations in the reconciliated data matrix (after projection to meet the constraints; reference value taken after about 3 h). molar
deviations
matrix
are
singular
forming
plotted
are necessary
of constraints
involved
here:
5 - 2 = 3 independent cies (Bonvin ciliated
reactions
and Rippin.
data
matrix
tions, there is no longer of
singular
the
noise
values
problem,
autocorrelation
disappearance reaction. mately
A
profile soon
second
time profiles 5) indicate
ethanol
Glucose
ficients:
after
oxidation The
are:
oxidation glucose
is the only
glucose is
1
8
the
9
the same procedure of
formed
that contains This
metabolism
coefficients:
the
A
is is
between
3
and ethanol
of data
extent
data matrix
data
the measurements
only two factors,
fermentation
as described
simulated
data.
the glucose subset
glu-
= 0.
analysis
oxidation
is approximated
by
and it is used to extract shown
appears
to be qualitatively
remains
close to zero until glucose
For the extraction
in
correct,
Fig. i.e.
6.
This
the extent
fermentation
containing
of the ethanol
oxidation
the data between
sets
extent,
3 and 5 h and
-._00 0.06 -
of
metabo-
~0.02-
ethanol the
= - 1. Glucose
m
do.ooa.02
stoichiometric by
GF
jo.c&
is exhausted,
and
to the anaerobic
to the experimental
time
oxidative
Tat-k 3. Time
7
{h)
the stoichiometric
applied
is produced.
characterized
for
a matrix
by
expected
known
= 0 and ethanol
T&c
after 7.5 h
The
= - 1
“5
in.
3.
a priori
with
the
approxi-
autocorrelation
has stopped.
corresponds
retaining
and
-to be consumed
mentation
reactions.
In this via data
is initially a single
and ethanol
in Table
lism of glucose. coefficients
(10).
there is a clear disappearance
the feeding
5
and 7 h involving
of the columns
after about 7 h when ethanol
events are given
Ethanol
to eliminate
is detected
the backward
4
In what follows,
non-dominant
appears
begins
_ *. .:. II4
“3
.:
_
of glucose
above
reac-
the appearance
There
* _ .: .
- . -
1
cose = - 1 and oxygen
spe-
three
by equation
reaction
(not shown), after
those
3 contains
A third reaction from
a reaction
between
has been achieved
the feed is restarted
Likewise,
to
there are at most
discarding
of reactions.
5 h when
the culture.
(balances
using the two balances_
of the U matrix (Fig.
when
by
noise reduction
reconciliation
and 0.0. A
the possibility
as indicated
.
* : I :
-
Fig. 5. Forward autocorrelation time profiles.
1990). (ii) Since the recon-
of rank
Q<.
3
(i) since the matrix
species,
data
corresponding
two constraints
be met) for five measured
The
reconciliated
4. The
values are: 1.30, 0.39, 0.022, 0.0
few remarks
some
the
in Fig.
= 0. coeffer-
Fig. 6. Extracted extents for the three reactions.

Table 3. Time and corresponding reactions in the fermentation process
Time (h)    Ethanol metabolism         Glucose metabolism                         Number of factors
3 to 5      Ethanol not metabolized    Glucose oxidation                          1 factor
5 to 7      Ethanol consumed           Glucose oxidation                          2 factors
7 to 7.5    No ethanol                 No glucose                                 0 factor
7.5 to 9    Ethanol produced           Glucose oxidation, glucose fermentation    2 factors
Table 4. Extracted stoichiometries
Reaction      Glucose   Ethanol   Biomass   O2       CO2
GO (a)        -1.0       0.0      3.29      -2.60    2.71
EO (a)         0.0      -1.0      1.14      -1.83    0.86
GO (b)        -1.0       0.0      3.25      -2.64    2.75
GF (b)        -1.0       1.7      0.87       0.00    1.73
GO average    -1.0       0.0      3.27      -2.62    2.73
(a) Stoichiometries extracted using glucose fermentation extent; (b) stoichiometries extracted using ethanol oxidation extent. GO: glucose oxidation; EO: ethanol oxidation; GF: glucose fermentation.
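The stoichiometries of Table 4 can be checked for internal consistency with a few lines of arithmetic. The sketch below is only an illustration, not part of the original procedure. It assumes that the table columns are ordered glucose, ethanol, biomass, O2, CO2, that biomass is expressed in C-mol, and that the respiratory quotient is the ratio of CO2 produced to O2 consumed. The names `STOICH`, `carbon_residual` and `respiratory_quotient` are hypothetical.

```python
# Stoichiometries read from Table 4, per mole of substrate consumed.
# Assumed column order: glucose, ethanol, biomass (C-mol), O2, CO2.
STOICH = {
    "GO": [-1.0, 0.0, 3.27, -2.62, 2.73],   # glucose oxidation (average row)
    "EO": [0.0, -1.0, 1.14, -1.83, 0.86],   # ethanol oxidation
    "GF": [-1.0, 1.7, 0.87, 0.00, 1.73],    # glucose fermentation
}

# Carbon atoms per mole of each species (biomass counted per C-mol).
CARBON = [6, 2, 1, 0, 1]


def carbon_residual(coeffs):
    """Net carbon produced by the reaction; zero if carbon is conserved."""
    return sum(c * n for c, n in zip(CARBON, coeffs))


def respiratory_quotient(coeffs):
    """Ratio of CO2 produced to O2 consumed; infinite if no O2 is consumed."""
    o2, co2 = coeffs[3], coeffs[4]
    return float("inf") if o2 == 0 else co2 / -o2


for name, coeffs in STOICH.items():
    print(name, round(carbon_residual(coeffs), 2),
          round(respiratory_quotient(coeffs), 2))
```

Under these assumptions the carbon balance closes for all three reactions, and the quotients obtained for glucose oxidation and ethanol oxidation round to 1.04 and 0.47, the values reported in Table 5. Glucose fermentation consumes no oxygen, so its quotient is infinite.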
between
and
7.5
extracted
9 h is formed.
extent
oxidation
in Fig.
The
plot
6 indicates
takes place mainly
of
that
between
the
ethanol
5.5 and 7 h, as
expected. Knowledge allows the
of
Furthermore, known
matrix
ethanol
since
two
for
(Table dation
stoichiometric the
and
that
has been
glucose
oxidation
from the
and
glucose
the
a second-level generated
error
and
this point,
mentation extract
and
the stoichiometries ethanol
the extent
sents a third-level Fig. 6, indicates occurs resumes
until
oxidation
of glucose extraction.
with the feeding
of
can
be
oxidation. 7he
that the glucose
exhaustion
of glucose
plot,
of glucose
shown
Table
7h
The
is expected
for
to be about 0.45
anaerobic
is consumed).
the observed
growth
For
ratio should to the pure
that the computed
quotients
to increase on glucose,
growth
pure
for
characterizes
on ethanol,
pure aerobic
infinite
5 shows
coefficient
which
a mixed
lie between metabolisms.
and the expected
for the three reactions
are very
close indeed. In addition,
often
the stoichiometric
to values
from
ficients different
strain
continuous
be ferto in and
after 7.5 h.
that only (aerobic tion,
one
aerobic
trast,
and
on glucose
growth
the methods
(1989) dation.
Glucose lower
derably
higher The
possible
differences
as only
producsource
found
in
and CO2
aim at
by Axelsson in Table
for glucose this
yields,
work
response
oxihas
the bio-
is also lower
are due to noise
6.
but a consi-
yield. Furthermore, oxidation
of
In con-
simultaneously.
agreement
the fact that the metabolic
three
at a time
ethanol
are compared
biomass
mass yield for ethanol
from
in this article
fermentation
ethanol
coef-
a slightly
on glucose).
the stoichiometries
slightly
work.
was
without
occurring
is a very good
For
in such conditions
presented
and in this work
There
both
obtained
growth
reactions
Nevertheless,
for
on ethanol
and anaerobic
uncoupling
(but
performed
metabolism
are
experiments.
gives stoichiometric
yeast
medium)
cultures
growth
carbon
(1989)
the same
can be
However,
in the literature
in independent
Axelsson for
coefficients
the literature.
presented
determined
example,
in this
but also to
may be different
when two metabolisms occur simultaneously.
Discussion validity
be checked
ratio,
corresponding
respiratory
reaction
after
stoichiometric
no oxygen
when
a0 m
0.47 0.4%a.55
growth
1.1 during
the values
This repre-
oxidation
glucose
used
the
aerobic
metabolism,
expected. At
of
to become
(i.e.
extracmust
pure
to about
extraction.
propagation
ratio
the type of metabolism,
extraction,
by
the
during
ethanol
used for this stoichiometric some
i.e.
GF (b)
EO (a)
1.04 1.0-1.2
CO2 to that for Oz. This
also that this stoichiometric
themselves
Consequently,
for
twice, once each using
extent
represents
were
oxi-
can be deter-
stoichiometry
extracted
the
RQ stoichiometry RQ theoretical
the stoichiometries
of the ethanol
of glucose
the
extent. Notice
the extents
stoichiometries
for
Go average
compared
can be computed
knowledge
fermentation
identification
coefficients
transformation
stoichiometries
fermentation
Notice
oxidation
reactions,
extent
space for reactions.
oxidation
reaction,
oxidation
extent,
mined.
tion
each
4). Similarly,
oxidation
i.e.
fermentation
T, and with it the complete
for the two
the
glucose
of the stoichiometric
and
glucose
are
the
computation
Table 5. Respiratory quotients for the three reactions
stoichiometries
Ethanol
- 1.0 0.0 - 1.0 -1.0 -1.0
(a): (a): (b): (b): average
of the extracted
by computing
stoichiometries
the respiratory
can
quotient,
The largest discrepancies ficients.
Biomass
Table 6. Comparison of reaction stoichiometries found by Axelsson (continuous culture, a single mechanism at the time) and in this work (fed-batch culture, several mechanisms at the time)
Reaction   Source       Glucose   Ethanol   Biomass   O2       CO2     RQ
GO         Axelsson     -1.0       0.00               -2.33            1.07
GO         This work    -1.0       0.00               -2.62            1.04
EO         Axelsson      0.0      -1.00     1.32      -1.61    0.68    0.42
EO         This work     0.0      -1.00     1.14      -1.83    0.86    0.47
GF         Axelsson     -1.0       1.88     0.36       0.00
GF         This work    -1.0       1.70     0.87       0.00
are in the biomass
measurements
coef-
are the most likely
to
be
in error
possible
for
sample
because
each
volume
measurement more.
a single
sample.
would
In contrast,
repeated.
The
allows
increase
the accuracy
scaling
of the
measure-
measurements
of O2 and CO2 acqui-
noise
to be filtered
out.
The
factor
for biomass
would
not
regarding
biomass.
In this work, to the biomass
to the other
even
The
measured
affinity
residual
6Wmg/l
measurements
of the yeast for glucose glucose
respiratory
concentration
capacity
glucose
and
can
oxidative
be
fully
capacity
uptake
saturation
level,
the saturation 7.
It
additional
oxygen
is limited
6.67 and 7.5 h. Between glucose The
uses
specific
(Fig.
up most glucose
8) because
concentration
piratory
quotient, on both
extracted
observations:
uptake
rate
rate is nearly
zero glucose
in
remains between
(Fig.
and 6)
slowly and the
The observed
match
occurs
The
biological at all times
(the glucose
that interruption
is consumed
meta-
prevails.
these
resbelow
that an oxidative
oxidation
during
decreases
ethanol
of
capacity.
in Fig. 9. decreases
except when the feed is stopped the residual
and
uptake because
very rapidly),
Fig. 8. Specific glucose uptake rate.
ethanol when
oxidation
sets in after 5 h and stops after 7 h
all the ethanol
feeding
of glucose
city increases formation
and
is consumed. (Fig.
immediately
It is interesting to the
approximately
to note
reactor
resolve data
reaction
networks
using factor
analytical
The complete
where
trial and error.
seen
as a complete of desired
of extraction. FA
the
In factor analysis.
some
amount
this is not a step-by-
user
puts
new,
however,
drawback user input.
and
should
especially
in the case
reaction
is much room
extraction
not be a good
since the application
ment in many of the individual scaling
infor-
out at the
as it allows
and chemical there
in the
comes
most of the steps require
This,
Nonetheless,
to biochemical
is relatively
procedure
in Fig. 3.
at the top and the answer
bottom.
have none
CONCLUSION
was presented.
algorithm
mation
corres-
reactions
Consequently.
As stated in the introduction, step
fermen-
is negligible. 5.
techniques
uptake
and ethanol
that the extents
macroscopic
to help
for biochemical
glucose
sets in. The glucose
the same range.
approach
is described
The
capacity,
the capa-
these observations.
three
of these reactions
An
9).
the oxidative
tation extent supports ponding
Following
after 7.5 h, the respiratory
rapidly
rate is far above
offer
steps. much
of
networks
for improveIn particular, potential
for
research.
up to
rate is shown
rate is constant
increases.
glucose
glucose
is oxidized
3 and 6.47 h, oxidation
shown
extents
is
than the
of the respiratory
one with time, an indication bolism
excess
feed is stopped
the feed
biomass
as the
consumption.
8 mmol/g/h
until the substrate
that
capacity.
consumption to
the
is lower
ethanol
limited
If the glucose
oxygen
uptake
than
means
as long
limit.
of the respiratory
The specific constant
without
if the glucose
is
This
only
this
is very high.
yeast
1986).
above
to ethanol
Conversely,
Baker’s
oxidized
we
noisier.
is smaller
.
2
as
of 37.5 g/l. The
is not saturated.
increases
reduced
of
Kappeli,
though
were
for an inlet concentration
(Sonnleitner
Fig.
variables,
that the biomass
I
0
we
chose to give the same importance
The
o-b-.-
the effect of noise but also the accuracy
of the results
knew
10
the
operation
ethanol
and suspect
high frequency
use of a small only reduce
and
is
increasing
the reactor
glucose
random
measurement
fact,
but also perturb
ments can be duplicated, sition
In
Fig. 7. Specific oxygen consumption rate (○) and specific CO2 production rate (□).
Fig. 9. Respiratory quotient.
Factor
Furthermore, biochemical
the application
data should
for the normal intermediary ing.
In
of
Although
no work
step
kinetic has
to realize
satisfactory mation
been
tics).
That
is, the
predicting
iterative
process
Moreover, vidual
interesting
FA
reactions
data,
could
as it is Once
provide be
infor(e.g.
reaction
allowing
could of
step
of purposes
could
facilitate
instead
in this
the retrieved
thus
which
process.
of the loop.
to model
modeling
raw
a method an
more
the modeling
a conglomerate
to
kineoverall
reliable. of indiof
reactions.
Acknowledgement: Financial support from the Swiss National Foundation for Scientific Research is gratefully acknowledged.
NOMENCLATURE
B = number of observations
C = number of balances (constraints)
c_s = molar concentration of the sth species
D = data matrix of dimension B x S containing the quantities consumed or produced
F = feed rate (mol/s)
m = vector containing the number of balanced units
M = matrix of constraints of dimension S x C
N = stoichiometric matrix of dimension R x S
n_rs = stoichiometric coefficient of the sth species in the rth reaction
R = number of independent reactions
RQ = respiratory quotient
S = number of species
T = transformation matrix of dimension R x R defined in equation (13)
t = time (h)
t_ref = reference time
U = matrix of left singular vectors defined in equation (9)
V = matrix of right singular vectors defined in equation (9)
V = volume (l)
X = reaction extents matrix of dimension B x R
ẋ_r = extent rate of the rth reaction defined in equation (2)

Greek symbols
Σ = matrix of singular values
σ_i = ith singular value

Subscripts
a = abstract
in = input
out = output
orig = original
ext = extracted

Superscripts
¯ (overbar) = indicates feed concentration
D̂ = approximation of D defined in equation (10)
D̃ = projection of D defined in equations (23) and (26)
D+ = pseudo-inverse of D
D' = submatrix of D defined in equation (19)
D'' = submatrix of D defined in equation (19)

REFERENCES
Axelsson J. P., Modelling and control of fermentation processes, PhD thesis. Lund Institute of Technology, Lund, Sweden (1989).
Bonvin D. and D. W. T. Rippin, Target factor analysis for the identification of stoichiometric models. Chem. Engng Sci. 45, 3417-3426 (1990).
Gampp H., M. Maeder, C. J. Meyer and A. D. Zuberbuehler, Evolving factor analysis. Comments Inorg. Chem. 6, 41-60 (1987).
Hamer J., Stoichiometric interpretation of multireaction data: application to fed-batch fermentation data. Chem. Engng Sci. 44, 2363-2374 (1989).
Harman H. H., Modern Factor Analysis. The University of Chicago Press, Chicago (1970).
Harmon J. H., Ph. Duboc and D. Bonvin, Application of Factor Analysis to the Resolution of Biochemical Reaction Networks, Internal Report-1992.08, Institut d'Automatique, EPFL (1992).
Horn R. A. and C. A. Johnson, Matrix Analysis. Cambridge University Press, Cambridge (1985).
Malinowski E. R., Theory of the distribution of the error eigenvalues resulting from principal component analysis with applications to spectroscopic data. J. Chemometrics 1, 33-40 (1987).
Malinowski E. R., Factor Analysis in Chemistry. John Wiley, New York (1991).
Prinz O. and D. Bonvin, Monitoring discontinuous reactors using factor-analytical methods, IFAC Symp. DYCORD+ '92, College Park, MD (1992).
Randolph T. W., I. W. Marison, D. E. Martens and U. von Stockar, Calorimetric control of fed-batch fermentations. Biotechnol. Bioengng 36, 678-684 (1990).
Saner U. M., Modelling and on-line estimation in a batch culture of Bacillus subtilis, PhD thesis. Swiss Federal Institute of Technology, Zurich (1991).
Shrager R. I. and R. W. Hendler, Titration of individual components in a mixture with resolution of difference spectra, pKs, and redox transitions. Anal. Chem. 54, 1147-1152 (1982).
Sonnleitner B. and O. Käppeli, Growth of Saccharomyces cerevisiae is controlled by its limited respiratory capacity: formulation and verification of a hypothesis. Biotechnol. Bioengng 28, 927-937 (1986).
Stephanopoulos G. and K.-Y. San, Studies on on-line bioreactor identification. I. Theory. Biotechnol. Bioengng 26, 1176-1188 (1984).
Wang N. S. and G. Stephanopoulos, Application of macroscopic balances to the identification of gross measurement errors. Biotechnol. Bioengng 25, 2177-2208 (1983).
possible
the
accomplished
the closure
coefficients,
adds an
the
in
results are obtained,
yield
to
more specific model-
indicates
can be used for a variety
calculate
analysis
It merely
modeling
this is an extremely
conceivable
of
allows
last
inclusion area,
process.
step which the
of factor
not be seen as a substitute
modeling
fact,
analytical modeling of biochemical data
APPENDIX
Indicators of rank
(a) From Malinowski (1987), the F function
$$F(1, S-n) = \frac{\sum_{j=n+1}^{S} (B-j+1)(S-j+1)}{(B-n+1)(S-n+1)} \, \frac{\sigma_n^2}{\sum_{j=n+1}^{S} \sigma_j^2} \qquad \text{(A1)}$$
is used to check the hypothesis that $\sigma_n$ is statistically similar to the pool of $S-n$ smallest singular values (i.e. $\sigma_{n+1}, \sigma_{n+2}, \ldots, \sigma_S$):
$$P\{F(1, S-n) > F(\alpha, 1, S-n)\} = \alpha. \qquad \text{(A2)}$$
A confidence level of 95% is typically used ($\alpha = 0.05$).
(b) Shrager and Hendler (1982) proposed the following first-order autocorrelation function as an indicator of rank:
$$\mathrm{AUTO}(u_i) = \sum_{j=1}^{B-1} u_{j,i} \, u_{(j+1),i}. \qquad \text{(A3)}$$
Autocorrelation values close to one are deemed to be highly significant, whereas values close to zero represent random signals. The authors suggested a cutoff value of 0.5, although this number may vary for different applications.
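The two rank indicators of this appendix are simple enough to sketch in a few lines. The code below is a minimal illustration, not the authors' implementation. It assumes that equation (A1) has the usual form of Malinowski's F statistic written in terms of squared singular values, and that the columns of U have unit norm. The function names `autocorrelation` and `malinowski_f` are hypothetical.

```python
def autocorrelation(column):
    """First-order autocorrelation of one column of U:
    the sum over j of u[j] * u[j + 1]."""
    return sum(a * b for a, b in zip(column, column[1:]))


def malinowski_f(singular_values, n, n_obs):
    """F(1, S - n) for the n-th singular value (n is 1-based).

    singular_values: the S singular values in decreasing order
    n_obs:           number of observations B, assumed >= S
    """
    n_species = len(singular_values)
    pool = range(n + 1, n_species + 1)          # indices j = n+1 .. S
    weight_sum = sum((n_obs - j + 1) * (n_species - j + 1) for j in pool)
    pooled = sum(s ** 2 for s in singular_values[n:])
    scale = weight_sum / ((n_obs - n + 1) * (n_species - n + 1))
    return scale * singular_values[n - 1] ** 2 / pooled


# A slowly varying unit-norm column scores high, an alternating one low.
smooth = [0.5, 0.5, 0.5, 0.5]
alternating = [0.5, -0.5, 0.5, -0.5]
print(autocorrelation(smooth))        # 0.75
print(autocorrelation(alternating))   # -0.75
```

The value returned by `malinowski_f` would then be compared with a tabulated F(α, 1, S−n) threshold, which is not computed here. If the pooled singular values are exactly zero, as can happen after reconciliation with exact balances, the statistic is undefined and the sketch raises a division error.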