Factor analytical modeling of biochemical data


Computers chem. Engng Vol. 19, No. 12, pp. 1287-1300, 1995. Copyright © 1995 Elsevier Science Ltd. Printed in Great Britain. All rights reserved.


Pergamon 0098-1354(94)00119-7

0098-1354/95 $9.50+0.00

FACTOR ANALYTICAL MODELING OF BIOCHEMICAL DATA

J. L. HARMON,¹ PH. DUBOC²,† and D. BONVIN¹,‡

¹Institut d'Automatique, Ecole Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
²Laboratoire de Génie Biologique, Ecole Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland

(Received

8 October

1994; final revision received November

26 October 1994)

1994; received for publication

21

Abstract-Factor Analysis, a multivariate technique for determining the major trends or factors in a data matrix, is shown in this paper to be appropriate for resolving biochemical reaction networks. As opposed to an algorithmic approach, the methods presented in this article are intended to be a highly interactive set of tools. The researcher can use these tools to investigate a data matrix of concentration-change measurements by proposing different reaction networks. Several tools are adapted from other fields, and a few new techniques are proposed. The new techniques involve the estimation (or extraction) of reaction stoichiometries and reaction extents when not all the reactions are present at all times. This article presents theoretical elements and simulation results, as well as an application of the method to experimental data from the fed-batch production of Baker's yeast grown on glucose. Reaction stoichiometries and reaction extents are estimated for the reactions of glucose fermentation, glucose oxidation and ethanol oxidation.

1. INTRODUCTION

In general,

data analysis

is characterized redundant researcher order

in biochemical

by a large

number

measurements.

ing all measurements would

pretation ration

of

of unspecific

is generally

simply

model.

in biochemical these

Thus,

research

measurements

into existing

modeling,

and

control

tion

and the

one of the

The problem and spawned man, FA

of redundant

sciences With spread

the advent

to the field of chemometrics,

for the elucidation good

overview,

method been

adapted

Target

Bonvin,

(Bonvin

1992).

The

these

several

accept

reactions.

researchers

For

a

of chemical

reac-

not only

or reject the

have proposed

aim at

reactions

stoichiometries

biochemical schemes

field, to adapt

†Currently at: Institut de Génie Chimique, Ecole Polytechnique Fédérale de Lausanne, Switzerland.
‡To whom all correspondence should be sent.

than

a

(1991)

method two

(1989) reac-

assume

may

coefficients

set

is difficult be

to

and

the

unclear.

a method

of

drawback

the value

reactions.

reactions,

proposed

minimum

The major

In that

stoichiometric

of this approach of several

that are rarely

is

stoichio-

accurately

avail-

able a priori. As opposed

to a specific

in this paper

a priori

algorithm,

are meant allows

scenarios

Once

the stoichiometry

the reseacher a reaction

of any newly

unobserved

states and reaction

While niques

are

involve

the extraction

chiometries reaction Section

its

is

data,

to estimate

reactor

operation

1992).

from other fields,

also

network

rates (Stephanopou-

many of the techniques

are adaptations

and

retrieved

19841, or to monitor

and Bonvin,

to pro-

certain types of

can be used to test the

consistency

(Prinz

the methods

to be an interac-

by including

information.

resolved,

10s and San,

1990; Prinz and

does

Saner

define

Hamer

irreversible

stoichiometries

pose different

e.g.

has

of independent

to

more

tive set of tools which

age, (for

work,

presented

(Har-

(TFA),

Analysis

and Rippin, method

in

1991). A particular

Factor

the number

but also at helping for

measurements

to the investigation

tion networks determining

of spectral

see Malinowski,

of FA,

(FA)

of the computer

of

another

metric

part of this century

the field of Factor Analysis

1970).

quickly

data was confronted

in the early

for

that it must

and optimiza-

of

this geometrical

coefficients.

integ-

to bioreactors.

that tries to find a sparse

consisting

selection tries

is the inter-

tion methods. the social

network

implement

the data in

their

a method

However,

incorporat-

not available,

like to analyze

FA techniques

presented

experiments

Since a model

to form a meaningful

new challenges

diverse

proposed.

without

shown

These

of reaction having

to

in this work

several

new tech-

new

extents resolve

methods and stoithe

entire

network. 2 describes application

some to

basic properties

biochemical

reactors.

of FA The

J. L. HARMON et al.

description

of

approximate of resolving sections:

the individual

a network.

problem

determination mation

tools

is given

order of their application

of

This is broken

formulation,

information.

and

tion extents from

reaction

using

production

4. Conclusions

and

Problem

the methods

Basically,

FA

experimental of Baker’s

further

yeast

research

in

ideas

=

common

R

D contains

to observations

extents

the matrix

The

different

reactions.

FA

be

can

common

on

applied

the

the matrix

with the rows represent

information)

the and

(species-dependent factors

As

the complete

in

many

found

1970; Malinowski, only

by

in time and the col-

C the stoichiometries

information).

of

and interrelated

(time-dependent

are

then

the

description

references

of

(Harman,

1991), this paper will concentrate

problem

formulation

to bioprocesses,

and

and on several

solution

techniques

of interest. Basic where

In

model.

a set of R reactions

of reaction

of a particular

can be expressed

2

a general

are taking

experiment

place,

the rate

species at a particular

rrrr is the stoichiometric in the rth reaction,

+ Fo,,(t)cs (f). (2)

the rth reaction volume

expressed

of the system,

concentration

the feed

the

and

input

three

terms

represent

in mollh species

form equation

X is the unknown (with the

r= I

volumetric

on the right

hand

rate of

rth

reaction),

stoichiometric

F,,,

rates.

side of equation

the rate of accumulation,

is produced

species

is con-

need to have

biomass,

protein

the input

are The (2) rate

,

(4)

B x S data matrix, of reaction

x,, representing

and

N

is the

(with

senting

the stoichiometry

number

of reactions

the problem

Yr,nT=XN

(measured)

matrix

matrix

the extent of

unknown

R Xs

n:,

repre-

the rth row,

of the rth reaction).

R may also be unknown.

consists of estimating

The Thus,

R, X and N from

D. Data pre-treatment Atomic

At this point it may be

and other balances.

desirable

to inspect

checking

different

For

balance

each

the consistency balances

type,

of the data

(e.g.

a vector

which satisfies the following

carbon,

to be

species.

For

balanced

m can be formed

relationship:

found

the number

of

in the corresponding

balance,

for

the number

found in each of the species. (5),

(5)

of m contains

the carbon

vector contains

by

redox).

Nm=O.

instance,

of carbon

From equations

the

atoms (4) and

one obtains: Dm=XNm=O.

or mass

the overbar

F,,, and flow

or

(3) reads:

B x R extents

the rth column,

or g/h. V is the

with

concentration.

output

of the sth

ir is the reaction

and c, is the molar

of the sth

indicating

coefficient

(e.g.

moles

An amount

species if the

structure

D is the known

where

balance

Here,

and species.

can be used).

D-2

units

r=I

species

defined

Each of the S elements

- Fi,(t)c

(in

do not necessarily

time

as:

=d(vyt))

i,(t)n,

reactor

The species

D contains

matrix

consumed

decreases

character-

where

The

or

(1)

and the columns

R could

the matrix

respectively. produced

if the corresponding

a chemically

(3)

dt,

t,, the. times of the first and the bth

conversely,

In matrix

.

measurements

F,,(t)E, dt + ” F,,&)c&) I 1x1 trrr

or pseudo-species

CDl”rn”S

separate

In an application

to species,

reaction

fzxbxsr

the rows

composition

corresponding umns

c

rowsxfactors

with

an observed

- v(GefM&e‘)

for each observation

increases

to express

D, respectively,

factors.

amount

grams)

of two matrices:

R and C represent

the data matrix

rrcr and

observations,

sumed.

D

istics associated

with

and,

rows x columns

The matrices

-I

data

BACKGROUND

attempts

of

lh

the

as the product

(2) can

the bs-elements

f,, > trer):

V(t*)cs(tb)

=

in

and reac-

formulation

data matrix

D (for

Equation

to give

of

are also presented. 2. THEORETICAL

with time

transfor-

extraction

stoichiometries

are estimated

the fed-batch

Section

the data matrix

into five

results using both noise-free

and noisy data are used to illustrate Section 3. Finally,

rate, respectively.

down

of reactions,

quantities,

Simulated

and the output be integrated

data pre-treatment,

of the number abstract

in the

in the process

(6)

Note also that equation (6) implies that if the data matrix satisfies the balance, then any set of stoichiometries derived from D will also satisfy that balance.

For the case of C constraints (balances), equation (6) reads:

(7)

where

Equation

(7)

singular

values,

lie in the

null

values),

U is the B x S orthonormal

$DM = 0$, with M a S × C matrix of constraints. Equation (7) indicates that each row of D must lie in the null space of $M^T$. However, owing to measurement errors, D does not necessarily satisfy equation (7). The data can then be reconciled to verify the constraints by projecting D on the null space of $M^T$ as follows (Bonvin and Rippin, 1990):

$D_r = D(I - MM^{+})$, (8)

where $M^{+}$ represents the pseudo-inverse of M.
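The reconciliation step of equation (8) can be illustrated with a minimal NumPy sketch. Everything numerical here is an assumption made for illustration: the four species, the single carbon balance in `M`, the two stoichiometries in `N`, the random extents and the additive noise level are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)           # fixed seed, for reproducibility only

# Hypothetical species: glucose, ethanol, CO2, biomass (per C-mol).
# M (S x C) holds one constraint: the number of carbon atoms in each species.
M = np.array([[6.0], [2.0], [1.0], [1.0]])

# Hypothetical carbon-balanced stoichiometries (R x S), chosen so that N M = 0.
N = np.array([
    [-1.0, 2.0, 2.0, 0.0],               # glucose -> 2 ethanol + 2 CO2
    [-1.0, 0.0, 2.4, 3.6],               # glucose -> 2.4 CO2 + 3.6 biomass
])

# Hypothetical extents (B x R) and the corresponding noise-free data D = X N.
X = np.cumsum(rng.random((50, 2)), axis=0)
D = X @ N

# Measured data: additive gaussian noise breaks the carbon balance.
D_meas = D + 0.05 * rng.standard_normal(D.shape)

# Equation (8): project the rows of the data onto the null space of M^T.
proj = np.eye(M.shape[0]) - M @ np.linalg.pinv(M)
D_rec = D_meas @ proj

print("balance residual before:", np.abs(D_meas @ M).max())
print("balance residual after: ", np.abs(D_rec @ M).max())
```

Because the projector is orthogonal, data that already satisfy the balance are left unchanged, and the reconciled matrix is never further from the balanced data than the raw measurements are.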

represents Factor

Scaling. magnitude

are many The

ways

the standard

deviation

species

column

weight more

accurately

analysis

measurement

measured

making

There

elements.

species

of

error for each giving

errors.

equal

This has of the

(9).)

their interpretation

with

more diffi-

space

An

alternative

ties (i.e.

way

consists

to the range

the variability

Once

in scaling

so that rank(D)

values

N must be post multiplied physically

meaningful,

unscaled

(in

cause

these singular

rank

of

D

to

determine

After

the

according

of the number data

have

S.

to equation

independent

properly

(3) and scaled,

This can be done

1).

i=

largest initial

The

reactions

occurring

of

simultaneously

must be determined. In the case of no measurement gives

the

number

Since noise cannot

be avoided

tial

noisy

rank

of

determined. portion

only errors.

no

Singular Value

loss

Decomposition

of generality

than the number

B>

then

(Horn

hopefully,

to

be of a

contains (SVD)

is

the

SVD

and Johnson, D=

(SVD) . Assuming

Decomposition

greater S),

the essen-

needs

entails the removal

which,

Value

matrix

R.

for that purpose.

Singular with

this matrix

the rank of reactions

in practice,

data

This necessarily

of

a method

the

noise,

of independent

that the sample of measured

of

D can

be

1985):

LEV==

size is

species expressed

(i.e. as

aiuiv:,

are zero,

of noise will

subsequent

step

values

can

is to

be

neg-

data.

Singular

are

reconstructed

and

are

values

variance

Malinowski

value

with

compared

are added

the

to the

in one

at a

until the data is adequately ratio

(1987)

whether

test using

developed

expressed the singular

an F function

a particular

singular

value,

similar to the S - n smaller

singular

(see Appendix). qualitative

method

representing

true signals

bit rapid

fluctuations.

the autocorrelation

tive measure

Shrager

correlation

between

to different

singular

vector random)

overview number

of many

other

of factors

the authors’ no

small (e.g.

Malinowski

opinion,

one amount

of

(1991) methods

present

these

will

of trial and error

if a

measure-

between

smaller

its

than 0.5).

gives an excellent to determine

the

Overall,

generally

It is usually

methods

corres-

its autocorre-

in noisy data.

+- 1 factor.

these

with

is expected and hence,

its

as there will

In contrast,

associated

elements,

lation will be relatively In addition,

large

If a

factor,

the elements

ment noise, little correlation (mostly

true

observations.

is mostly

(1982)

as a quantita-

(see Appendix).

will be relatively

ponding

be smooth

and Hendler

to a

of

of this

noise will exhi-

function

corresponds

autocorrelation be some

should

of this smoothness

vector

vectors

with random

proposed singular

is the inspection

of the matrix U. Column

while others associated

certain

(9) using only singular

This idea can be quantitatively

a Fisher

a,, , is statistically

that

by reconstructing

to equation

with the largest

values

same result within 2

values

qualitatively

data

singular

matrix

arranged

of

to be non zero and the

singular

the data matrix according

the columns

the number

the number

singular

The

many

cc&

the rank

lected and set to zero.

values

of reactions

been

as they

In the case of no measure-

values

be

how

Another

Determination

S in

to as-

important

this application,

the last S-R

to investigate

stoichiome-

it is necessary

of the corresponding

reactions).

ment noise,

explained.

tries.

space

replace

and the rank of D gives R. The presence

with

of W to

the row

B, L? would

urns of U and V on D and, thus, can reveal

values.

with the inverse

set of vectors

spanning

are notably

the significance

spc-

the computed

the

decreasing

2 R.

The singular

data

in each column).

X and N have been estimated,

obtain

the

of the corresponding

by

of D, and V is the S x S

S>

time in the summation

with respect

containing

sume that both B and S are larger than or equal to R

(i.e.

the species

matrix

(ordered

Furthermore,

of decreasing

cult.

D

equation

the term associated

the

D

set of vectors

in the numerical

of the results regarding

a low weight,

W.

the importance

of D, but the disadvantage

significance

the data

matrix,

of

of D. (In the case where

independent

essentiaHy

of increasing

orthonormal

of the system

of the expected

to the different

the advantage

of

a,,

the column

indicate

is to use the inverse

of D),

spanning

be normalized.

the diagonal

method

Z is the S X S diagonal

it is imperative

multiplying

weighting

of choosing

most common (i.e.

Hence,

by post

with a diagonal

of M.

by the order

of the data matrix

is accomplished

matrix

is affected

of the variables.

that the columns This

(8)

the pseudo-inverse

analysis


give

in the

accepted

is perfect,

and

must be used.

a


Once been

the number

determined,

of independent

reactions

the last S - R singular

set to zero, and D is approximated

R has

values

are

using R factors

as

last rows sponds ward

of D. Note

now

matrices

observation D&+r=g

o,u,vT,

(IO)

,=I

The

where

the superscript

^ denotes

singular

value

noise-induced

may

be considered not imply

only error.

vant

the

Some

reaction since

Conversely, S-dimensional

the

(10)

The

insignificant

and

that its removal

will

loss of information error

is

rele-

unavoidable.

spans

the

in the reduced-rank

strategy,

however,

than valuable

relationship

(4)

is that more

from

(II)

N, = irT,

(12)

and N. a RX

S abstract

“a”

is used

abstract and represent true matrices In

the

matrix

factors.

this (10)

Hence, time

tions.

It is obvious

some

means

pearance

of

paper, by

profiles

reactions.

for

Factor

evaluated

vation.

For each such matrix,

for

independent

reaction

of a singular

vector

correlations

vs time

of independent

This method direction dent formed

be plotted

reactions.

the disappearance

The

by successively

measure-

backward adding

data rows

spanned

a

Bonvin

physically

and Rippin

meaningful

(1990)

proposed

ntarr i.e. a known vector, solution

stoichi-

t, can be calcuto

(15)

of the target vector

the +

each

the

superscript

indicates

the Moore-Penrose 1985)

and

(16)

the pseudo-inverse

conditions

P is the

stoichiometric of

advantage when

the

coefficients reactions.

R

of TFA

particular

missing

is the ability elements

(Bonvin

and

be

great

to test targets

even

of the target

Rippin,

sensitivity of indepenare

of the

analysis

In general, to biochemical target

termed

free-floating,

(1992)

have

elements.

In

developed

a

of TFA.

target testing is more than chemical

stoichiometries

sely. However,

are

Malinowski,

for the unknown

et al.

vector

1990;

to TFA,

Harmon

for

a

1991). This extension addition,

known

Consequently,

also yields predictions

in the reverse

and

projection

matrix T, at least

must

assothe

(Horn

resulting

To solve for the transformation

obserof

onto the space

by the rows of N,:

matrix.

the R

matrices

N..

(14)

The auto-

in front

and

for a parti-

nT=nT tarN+N. il * =n? fdlP .

obser-

to resolve

a match

n:, = nT + &T= tTN, + eT,

Johnson,

reactions.

can also be employed

to resolve

to find

preTarget

i.e. a row of N is

the transformation

Here

formerly

significantly.

been

method,

stoichiometries

that given a target vector,

satisfying

The appear-

ciated with noise to increase appearance

stoichiometry.

with this

will cause

have

attempts

n is the transformed,

where

Analysis

the autocorrelation

in U are computed.

RxR

nT = tTNar

to have

each

the

the transformation

cular stoichiometry,

as the projection

In this method,

(13)

finding

where E is the model error. Thus, n can be estimated

reac-

the data between

to

popular

known

(13),

and disap-

and the present

is successively

can

equation

= XN.

methods

One

(TFA),

a priori

the D.

detecting

of spectral

experiments.

vation

autocorrelation

only

of notation,

et al., 1987) was developed

vectors

Analysis

T.

lated as the least-squares

full-rank

be helpful

Evolving

of the experiment

of a new

the

the appearance

data matrix containing

the singular

Factor

approximated

retaining

that it would

in titration

beginning

are

for simplicity

idea in mind for the elucidation forward

The

D will also be labeled

of detecting

(Gampp

ments

matrix. matrices

bases for the

systematically

matrix

Autocorrelation

(EFA)

of

be

the reduced-rank

to identify

and are there-

T.

Many

sented

ometry,

to equation

dominant

these

in (4). will

according

stoichiometric because

matrix

testing.

of N,

of those matrices:

reduces

extents matrix

only orthogonal

remainder

D

reaction

now

transformation

From

to

and

by the rows

of X,, respectively,

combinations

problem

between

x,=&z

will

vector

stoichiometries

spanned

X,N, = X,T-‘TN,

Target

be obtained

reaction

a singular

meaningful

lie in the spaces

and the columns

The

noise

of

of abstract quantities

physically

The extents

information.

can now

X, is a B x R abstract

subscript

Transformation

whole approx-

an

significantly.

fore linear

as follows:

where

ance

fact that a

back-

data between

of an independent

autocorrelation

row space of D, some error will natur-

will be removed The

associated

The

network

ally have to be accepted imation.

a matrix

matrix.

does

eradicate to

data

the original

time correHence,

instant and the end of the experiment.

the

decrease with the reduced-rank

contain

disappearance

cause

that the reference

to that of the first row.

target

are

difficult to apply

data since macroscopic often

testing

not

known

can be used

preci-

to check


available the

reaction

literature

stoichiometries

or

compatibility

computed

(available

by

with measured

other

simultaneous

reaction,

substrates

cose and oxygen) by-product)

for

of two

and ethanol,

or of two products

glu-

(biomass

to the value

main substrate.

one

constraints

e.g. the exclusion

(glucose

in addition

reaction’s

processes,

two stoichiometric

each macroscopic

for

data.

Reduced target. In fermentation can often propose

from

means)

Therefore,

subset

with

reactions

an alternative

common

the

natural

progression

involve

all

possible

to divide

the

of reactions.

to apply

of an

reactions

experiment times.

the data in order

As shown

below,

It

not

method

known.

More

a researcher

may

be

able

projected

onto

(e.g.

oxygen

From

the extracted of reaction

reaction

along

with

extents leads to the extraction

equation

(4) and (13)

give:

computed,

(17)

quantities

X, and N, can be

but not the physically

meaningful

(or

one obtains:

X’N’P ~‘N’P+X;;_,N,_,P

I’ 1

=[-&-P=[$-]N, where part

the tilde indicates of the matrix

(=)

a projected

which

matrix,

is orthogonal

have used the fact that N’P = 0. Note only

information

associated

X and

The

matrices

respectively.

i.e. that

to N’.

with the remaining

autocorrelation

appearance

or disappearance

D’,

data

matrix,

only

that section

reactions ometry,

can

can

N:,

be

of rows

is present.

Using

reveal

constructed

by

of D where

a subset

SVD,

including

an abstract

quantities

according

A comparison

to equations

of equations

abstract

(lo)-(12):

I-l 0

of r will

set of r

(r < R):

and

XL

both describe the same space, of

b.

If

only

one

reaction

= X’N’.

(18)

XL and N: can be computed, matrix

D can be written

the single

partitioned

i.e. the column is present

space

in D

(i.e.

L-1 x”

as:

is simply

[;I

= [&J

(19)

[C]

proportional

measurement ment

where

2,

0

but not X’ and

The above D=XN=

that

R - r = 1). its extent

N’. The

(24)

(23) and (24) shows

the matrices

stoichi-

D’ which

from

the

another

space of this reduced

D’ = X:N: Again,

profiles

of reaction(s),

be calculated

span the stoicbiometric reactions

time

set

R and R-r,

of D gives the following

fi = f,A,. the

We

that D contains

D and D are of rank

SVD

N. If

N:)

of R - r reactions.

D = X,N, = XN. that the abstract

to N’

(22)

D=DP= ~!N'+X;;_,NR_,

in fermen-

is

D can be

matrix:

X’N’

from

Extraction of reaction extents. For a set of data D

Note

(19)-(22),

to

once

r reactions

matrix

orthogonal

projection

may

stoichiometries.

with R reactions,

(21) it is possible

the data

the space

equations

in a certain

does not participate

This type of a priori information

tation).

(20)

P=I-(N’)+N’=I-(N:)+N:.

to indicate

specific species which did not participate

set-up,

we can write:

extraction,

precisely,

=

reaction

of

using the following

the data. Similarly.

the available

the extent space for the R - r reactions

to isolate subsets extents

the r

N,=N’. the

is then

this information

or extract reaction

be used to identify

will

the

that the

with

x:=x’

information

For instance,

at all

of

itself and to

priori

a

systems.

of

to bioche-

is to take advantage

in the data matrix

types

in biochemcial

From

the stoichiometric space for the other

testing is difficult

information

incorporate

associated

definitions,

the identifica-

Extraction of information

inherent

respectively.

and with the above

compute

Since target

are

in D’ and with the remaining R - r

present

reactions,

where

mical systems,

quantities

that

simultaneously.

the The

r and R - r are used to indicate

corresponding

With

occur

D”

subscripts

and a

- 1 for

represents

up to R reactions.

data with possibly

tion of T using target testing is often limited to cases only ?WO reactions

r reactions;

only

remaining

and

double

primes

set of data: D’ is associated

represent

noise

to t..

derivation noise. would

the

(computed

from

with the data

chiometric

space.

assumed

the presence

In experimental cause

D’)

the space

to deviate

Subsequently.

of no

data, measurespanned

from

by N:

the true stoi-

this would

cause


errors in the projection given by equation (23). Nevertheless, the hope is that the rank reduction of D' from S to r according to equation (10) removes much of this noise.

Extracting reaction stoichiometries. The extraction of reaction stoichiometries can be handled in the same fashion as extent extraction. The method can be used to extract the stoichiometric space for R - r reactions once the extent space for the other r reactions is known. Basically, if a set of r reaction extents $X_r$ can be formed, for example using the extraction technique described above, the data matrix D can be projected on the space orthogonal to $X_r$ (or $X_{a,r}$) using the following projection matrix:

$P = I - X_r X_r^{+} = I - X_{a,r} X_{a,r}^{+}$. (25)

The projected matrix $\tilde{D}$ becomes:

$\tilde{D} = PD = P\,[X_r \mid X_{R-r}] \begin{bmatrix} N_r \\ N_{R-r} \end{bmatrix} = [P X_r N_r + P X_{R-r} N_{R-r}] = P X_{R-r} N_{R-r} = \tilde{X}_{R-r} N_{R-r}$, (26)

since $P X_r = 0$. The matrix $\tilde{D}$ is of rank R - r. SVD of $\tilde{D}$ gives the following abstract quantities:

$\tilde{D} = \tilde{X}_a \tilde{N}_a$. (27)

A comparison of equations (26) and (27) shows that the matrices $N_{R-r}$ and $\tilde{N}_a$ both describe the same space, i.e. the row space of $\tilde{D}$. If only one reaction is present in $\tilde{D}$ (i.e. R - r = 1), its stoichiometry n is simply proportional to $\tilde{n}_a$.

Reconciliation of extracted quantities. As part of the extraction procedures described above, the reaction extents and the reaction stoichiometries are both scaled arbitrarily (proportionality factors between $x$ and $\tilde{x}_a$, and between $n$ and $\tilde{n}_a$). However, only the stoichiometries or the extents, but not both, can be scaled arbitrarily. If both are, their product would not equal D. Usually, the stoichiometries are scaled with respect to a key species (e.g. the value -1 for one of the substrates). Furthermore, since the stoichiometries are extracted using subsets of the data, it is useful to check their compatibility with the complete data set, for example by using the extracted stoichiometries as targets in a TFA scheme. In the case of a discrepancy, the projected target can be used to form the matrix N. The corresponding reaction extents can then easily be computed as follows:

$X = D N^{+}$. (28)

3. SIMULATED RESULTS

Data generation

The extents of four independent reactions are generated to form the matrix $X = [x_1\ x_2\ x_3\ x_4]$. By construction, only three reactions occur simultaneously, giving three subsets of data according to Table 1. Each subset of data contains 100 points.

Table 1. Reactions present in the data subsets $D_1$, $D_2$ and $D_3$

Data subset            D1   D2   D3
Reaction R1            +    +    +
Reaction R2            +    -    +
Reaction R3            -    +    +
Reaction R4            +    +    -
Number of reactions    3    3    3

The four independent stoichiometries involve five species:

N =

[I 0 7 d

0

-1

-1

-3 0

[ -2

-1

1

3

1

2

2

-3

-4

1

1.

-1

-1

0

3I

The data matrix is simply generated as the product of X and N according to equation (4). We assume that the first two stoichiometric coefficients for each reaction are known a priori. As three reactions occur at all times, we need to propose three stoichiometric coefficients for each reaction in order to solve directly for the transformation matrix T according to equation (13). Since only two coefficients are known for each reaction, the extraction procedure described above is necessary to identify the stoichiometries from D.

First, we will illustrate the techniques of factor analysis and extraction on noise-free data. Then, a simulation with 2% noise on each variable will be analyzed. The only available information is the data matrix D and the a priori knowledge of the first two columns of N.
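The data generation described above, together with the projection-based extraction of a stoichiometry and the final reconciliation $X = DN^{+}$, can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the authors' simulation. The rate profiles and the entries of `N` are hypothetical, not the paper's values. The presence pattern assumed here is one reaction missing from each block of 100 observations. The extents of R2, R3 and R4 are taken as known, whereas in the paper they are themselves extracted from data subsets.

```python
import numpy as np

# --- Hypothetical simulation set-up (values are NOT those of the paper) ---
B = 300                                  # number of observations
t = np.arange(1, B + 1)

# Assumed reaction rates: smooth and positive, one column per reaction.
rates = np.column_stack([
    1.0 + 0.5 * np.sin(t / 20.0),        # R1
    0.5 + t / 300.0,                     # R2
    np.exp(-t / 150.0),                  # R3
    0.8 + 0.3 * np.cos(t / 35.0),        # R4
])

# Assumed presence pattern: three reactions active in each block of 100 rows.
active = np.ones((B, 4))
active[:100, 2] = 0.0                    # R3 absent in observations 1-100
active[100:200, 1] = 0.0                 # R2 absent in observations 101-200
active[200:, 3] = 0.0                    # R4 absent in observations 201-300

# Extents are the running sums of the active rates (B x R).
X = np.cumsum(rates * active, axis=0)

# Hypothetical stoichiometric matrix N (R x S): four reactions, five species.
N = np.array([
    [-1.0,  0.0,  1.0,  2.0,  1.0],
    [-1.0, -2.0,  0.0,  1.0,  3.0],
    [ 0.0, -1.0,  2.0, -3.0,  1.0],
    [ 0.0, -2.0, -1.0,  1.0, -4.0],
])

D = X @ N                                # data matrix as the product of X and N

# Rank of the complete data set and of the first subset.
rank_D = np.linalg.matrix_rank(D)
rank_D1 = np.linalg.matrix_rank(D[:100])

# --- Stoichiometry extraction by projection (projector P = I - X_r X_r^+) ---
X_r = X[:, 1:]                           # extents of R2, R3, R4, taken as known
P = np.eye(B) - X_r @ np.linalg.pinv(X_r)
D_tilde = P @ D                          # only R1 survives the projection

U, s, Vt = np.linalg.svd(D_tilde, full_matrices=False)
n1_abstract = Vt[0]                      # spans the row space of D_tilde
n1 = -n1_abstract / n1_abstract[0]       # scale so the key species has value -1

# --- Reconciliation: extents from the data and the stoichiometries ---
X_rec = D @ np.linalg.pinv(N)

print(rank_D, rank_D1)
print(np.round(n1, 6))
```

Scaling the abstract vector so that the first species takes the value -1 mirrors the key-species convention mentioned in the text. The sign and length of a singular vector are otherwise arbitrary.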

Extraction procedure in the case of noise-free data

The forward autocorrelation profiles (Fig. 1) clearly indicate the presence of three reactions initially. The sharp increase in the autocorrelation of the fourth singular vector after about 100 observations is an indication of the appearance of a fourth reaction. Similarly, the backward autocorrelation profiles (Fig. 2) show that three reactions are present towards the end of the experiment. The decrease

of the autocorrelation after about

of the fourth

200 observations

disappearance

singular

which

vector

by NI.~.

of the

and equation

is an indication

is easily (12).

ing to equations fi has only

dominant


the above

Now n, by

projecting

By

by

the

The autocorrelation profiles of the singular vectors (noise-free data).

D

two

stoichiometric

1.0

determined

from

1

times


The

300

obsavalion

autocorrelation profiles of the singular vectors (noise-free data). and

backward

tions. Three

data subsets

of

the

profiles:

D, (observations

lOl-200), can

be

shown

to contain

m,

x1 can

(22)

four

obtained.

is based

some

of

one reaction.

Figure

The extraction

we propose

individually

the

can be

has

tion (28).

been

and three (x2, xX). A for n,

obtained

of the data and procedure

with

et al., 1992). x,. From

the

spanned

by

according

to

space

This

and extents may

have

not

be

been scaled

to first scale the stoichio-

with

respect

to a key

the extents according

procedure

original

noise-free

allowed and

species, to equa-

us to reproduce

the extents

of the

data.

D2 (observations

201-300).

Each

three

data set

Extraction

of extents

and

and compar-

by the occurrence

of

the procedure.

of x, is first realized

the data sets DI and D which were

by comparing

shown

respectively.

metric space for the three reactions

in the case of noisy data

simultaneous

on the analysis

3 illustrates

three and four reactions,

values

them

these

that differ

stoichio-

n,

computed

stoichiometries

consistently,

noisy

The

stoichiometries

two

and (23).

As

metries

be

the stoichiometries

strategy. The extraction

ison of data subsets

to

can be

This way,

at extracting

exactly

reactions. Extraction

the

reactions.

and (nl,n4)

of the stoichiometric

pro-

only

of

extraction

reac-

l-100),

D3 (observations

space

necessary

data (Harmon

autocorrelation from

T

of the consistency

of four independent can be defined

other

(x2, x_,), (x3, ~4) and

and then to compute the presence

the

determine

stoichiometry

final step is aimed

equations

*

to that

stoichiometries

(nl,n2)

to the available

n2, q 3 and

200

two

knowledge

of the different

sensitivity

The

to

can

matrix

the

in turn

knowledge

Ica

the

(n,,n3),

to R2, R,

and (26).

for each reaction.

that

using

respect

-1.0

one

for

the

can be indicative the

forward

x2

to extract

four times: once using (x2, xj, G)

comparison

I

!

The

obtains

from (x2, x4), (x3, x4) and (x2, x3), respecti-

Note

extracted

0.0

(25)

perpendicularly

meaningful

coefficients

vely.

files indicate

one

perpendicularly

extents,

physically

computed

3 0.5

data

space

stoichiometries

Fig. 2. Backward

1). SVD to xg.

it is possible

2X2-transformation

obtain metric

0

procedure,

is

(i.e.

the data sets Dz and D and 4

to equations

projecting

spanned

0

reaction

is proportional

determined,

space according

-1.0

fourth

matrix which

that the extent space corresponding

and R4 has been

4

value

and D, respectively.

Otservatiw

Fig. 1. Forward

accord-

in D,, see Table

%. which

and ~4 by comparing

to N,,.

singular

not present

of b gives directly Repeating

of D1

The projected

with the additional

R3, the reaction

via SVD

D perpendicular

(22) and (23).

one

associated

determined

The next step consists in project-

ing the data matrix

of a reaction.

0


The

to contain stoichio-

in D1 is spanned

data

matrix

noise-free

measurements

corrupted

with

The

number

determined lation

gaussian

noise.

of time

by the forward

and backward

autocorre-

of

of noise,

The

multiplicative

from changes

as a function

ing the appearance tion.

is constructed

reactions

profiles

Because

2%

of

D

of concentration

data

the

left

singular

vectors

there is some uncertainty or the disappearance

where

the

number

of

of

is D.

in locatof a reac-

reactions

is


Fig. 3. Extraction procedure for the reaction system given in Table 1 (Xext = extent extraction; Next = stoichiometry extraction; T = transformation).

The last step in the extraction procedure deals with the reconciliation of stoichiometries and extents. TFA is applied to the four extracted stoichiometries. n1 is chosen as the average of the three extractions with (x2, x4), (x3, x4) and (x2, x3). There is no noticeable discrepancy between the spaces spanned by the extracted stoichiometries and by the first four left singular vectors of D. The extents of the corresponding reactions are computed using

ambiguous are simply discarded. Consequently, the data subsets D,, D2 and D3 are of slightly smaller size than in the noise-free case. The procedure is then the same as that described above for the noise-free case and given in Fig. 3. The extracted and original stoichiometries are listed in Table 2. Because of the addition of noise, the computed stoichiometries differ slightly from those used in the simulation.
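The extraction of a single stoichiometry from a data subset can be illustrated with a minimal sketch. It assumes that only one reaction is active in the subset, so that the subset has rank one and its dominant right singular vector is proportional to the stoichiometry. The function name, the stoichiometry `n_true`, the extent profile `x` and the choice of the reference species are hypothetical and serve for illustration only.

```python
import numpy as np

def extract_single_stoichiometry(D_sub, ref_species, ref_value=-1.0):
    """Estimate one stoichiometry from a subset where a single reaction is active."""
    # With one active reaction, D_sub ~ x n^T has rank one, so the dominant
    # right singular vector is proportional to the stoichiometry n.
    _, _, Vt = np.linalg.svd(D_sub, full_matrices=False)
    n = Vt[0]
    # The scale and sign are arbitrary; fix them with one known coefficient.
    return n * (ref_value / n[ref_species])

rng = np.random.default_rng(0)
n_true = np.array([-1.0, 3.0, 1.0, 0.0])   # hypothetical stoichiometry
x = np.linspace(0.0, 1.0, 100)             # hypothetical extent profile
D_clean = np.outer(x, n_true)              # noise-free rank-one subset
# 2% multiplicative Gaussian noise, as in the noisy simulation
D_noisy = D_clean * (1.0 + 0.02 * rng.standard_normal(D_clean.shape))

n_est = extract_single_stoichiometry(D_noisy, ref_species=0)
print(np.round(n_est, 2))
```

With noisy data the estimate differs slightly from `n_true`, which mirrors the small discrepancies reported in Table 2.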

Table 2. Original and extracted stoichiometries in the case of noisy simulated data

Reaction

RI

Extraction using

h

x4)

Stoichiometry

n,

0 0 0 0 0 0

-1

RI RI RI RI RI

(x2. x3) (x3. -4) average above (x21x3.Qrq) original

a1 ml nl n1 01

-1 -1 -1 -1 -1

Rz R2

(x3. xq) original

% a2

--t -1

R3 R.,

(x2. xq) original

03 03

Rd R,

(X2.%I original

14 nr

0 0 -2 -2

extraction;

1;

-1.00 -1.00 - 1.00 - 1.00 -1.00 -1.00

2.98 2.98 2.98 2.98 2.98 3.00

1.a0 1.00 1.oll 1.00 1.00 1.00

1.02 1.00

2.00 2.00

1.99 2.00

-3 -3

-4.01 -4.00

1.13 1.00

1.20 1.00

-1 -1

-1.27 -1.00

0.14 0.00

2.85 3.00

equation

(28).
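The computation of extents from known stoichiometries can be sketched as a least-squares problem. The sketch assumes the model D = X N from the nomenclature (D of dimension B x S, X of dimension B x R, N of dimension R x S) and uses the pseudo-inverse of N; whether this coincides exactly with equation (28) is an assumption. The matrices `N` and `X_true` are hypothetical.

```python
import numpy as np

def estimate_extents(D, N):
    """Least-squares extents X (B x R) such that D ~ X N, with N of dimension R x S."""
    return D @ np.linalg.pinv(N)

# Hypothetical system: two reactions, four species
N = np.array([[-1.0, 0.0, 3.0, 1.0],
              [0.0, -1.0, 1.0, 2.0]])
t = np.linspace(0.0, 1.0, 50)
X_true = np.column_stack([t, t ** 2])   # hypothetical extent profiles
D = X_true @ N                          # noise-free data matrix

X_est = estimate_extents(D, N)
print(np.max(np.abs(X_est - X_true)))
```

For noise-free data and a stoichiometric matrix of full row rank, the extents are recovered up to rounding error.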

In order

fit of the various cent between

to evaluate

extents,

an extracted

in the original

the goodness

the relative

error

of

in per-

data is computed

Factor analysis of data

The raw data are first converted

extent and its counterpart

noise-free


equation

as:

(3).

numerically

The

gas

integrated

flow

(29)

The relative errors are 6.4, 5.3, 6.7 and 3.7% for x1, x2, x3 and x4, respectively.
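A relative error of this kind can be sketched as follows. The sketch assumes a ratio of Euclidean norms between the deviation and the noise-free extent, expressed in percent; this definition is an assumption and the exact form of equation (29) may differ. The two extent profiles are hypothetical.

```python
import math

def relative_error_percent(extracted, reference):
    """Relative error (%) between an extracted extent and its noise-free counterpart."""
    # Assumed definition: 100 * ||extracted - reference|| / ||reference||
    num = math.sqrt(sum((e - r) ** 2 for e, r in zip(extracted, reference)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    return 100.0 * num / den

reference = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical noise-free extent
extracted = [0.1, 1.0, 2.0, 3.0, 4.0]   # hypothetical extracted extent

error = relative_error_percent(extracted, reference)
print(error)
```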

ethanol

ments.

to molar

convert

26.5 g/C-m01 (1986),

proposed

species. The

Material and methods

The

factor

methods

were

also

data collected

from

a fed-batch

using

experiment

Baker’s

yeast.

Saccharomyces cerevisiae ATCC on a semi-complex with

glucose

to avoid

The KLF

extract

added

each

log

Lab

10-15

min to determine

Mannheim

enzyme

filtration

branes

of

aeration

(1 l/min).

sitions mined were

was used

the glucose

flow

and

of

was grown

that

were

time,

entirely.

reased

and reached

sharply

after

production

the hypothesis F-test

zero.

immediately

the

concentration The

feeding of

set in.

dec-

307.1 for two reactions

correlation

was

0.315 I/h.

of 3.51 is

5.99

is normally

and

significance

(for

2

level),

distributed

compared

2.40 for

in the factor

matrix is projected containing to equation

three

value

is

reactions

com-

of 18.15. The

auto-

of the U matrix are

Consequently,

the influence analytical

R = 3.

of measurement

procedure,

and redox balances

This way,

fied so as to verify

value

with the threshold

the data

onto the null space of the matrix

the carbon (8).

indicates

The first three columns

significant.

to reduce

level

function

values of the columns

of U are found errors

of

test error

measure-

value

significance

computed

[0.92,0.76,0.67,0.46,0.36].

after 6.7 h and the glu-

7.5 h at a rate

at the 5%

reactions:

with the threshold

a

species

value

that the error

of 10.13,

had reached

each function

and a 95%

value

3 h after

with a Chi-square 1983). A relative

threshold

pared

produced

of gross meas-

be rejected.

In order

Ethanol

the

at a rate of

only

during

for

to wait for an

ethanol

of the culture

was stopped

of freedom

cannot

species

Detection

Since the computed than

An

taken the

is assumed

measure-

in small amounts

is investigated

degrees

three

of 1.55 g/l. A feed

errors

than

balances

of unmeasured

and Stephanopoulos,

smaller

at 4°C

of 750 mg/l.

cose consumed restarted

flasks

produced

and

again,

species.

by imprecise

fermentation).

5 mol%

ment.

of 1.6 1 to obtain

at 1.42 h. In order

the first phase

The feeding

5E

of 3;o”C for a

was centrifuged

concentration

samples

At

deter-

in conical

at a temperature

of 37.4 g/l was fed to the reactor

concentration

Ethanol

were

gas measurements

to an initial volume

culture,

during

(Wang

Here

in the substrates

all the measured

be explained

the glycerol

urement

the compo-

22P and Oxymat

The

of 12 h. This inoculum

active

and

oxygen

Ultramat

the biomass

inoculation.

for

30 s.

0.064 l/h beginning

then

nitrogen

meter,

respectively.

an initial biomass

can

elec-

balance,

biomass

error.

that the two

and by the presence

and 79%

with 20 g/l of glucose

solution

ments

mem-

(e.g.

dioxide

taken every

and transferred

0.45,um

The inlet gas flow was measured

of carbon

period

Errors

was measured

on

A gas stream

using Siemens

Initially,

broth

for 48 h at 100°C.

5878

gas analyzers,

of

redox

ethanol,

Note

are 4c +

of available

The

electrons

involve

formed electrons

C,H*O,N,

a - 11.2%

products.

together

Boehringer

with

consumed

the available

glucose,

exhibits

in the

error

i.e. there are

CO*).

compound

there are more

and ethanol

using

kits. The biomass

oxygen

a Brooks

Glucose

measured

10 ml

after drying

of 20.%% with

were

ethanol

during a fed batch of 9 h

for a total of 29 data points. concentrations

consumed,

balance,

involves

to the

than in the products

ethanol,

oxygen,

respect

shows a - 8.9%

must be conserved.

which

1, 11.

is checked:

h - 20 - 3n. The total number trons

1,

atoms in the substrates

For the redox

and pH

the glucose,

balance

in the chemical

The system was sampled

concentrations

with

quantities

ethanol)

(biomass,

Ferm

is used for measured

are [2,4,

of the data

to the glucose

(glucose,

of

weights

of certain

more carbon

deficiencies.

2000, was kept at constant temperature

and biomass

by

yeast,

In addition

for

a 21 Bioengineering

of 30°C and 5, respectively, every

consistency

respect

grown

value

and KIppeli

for the ash content,

selected

the carbon

et al., 1990)

substrate.

was

any medium

fermenter,

The

9763, was

(Randolph

as the limiting

0.5 g of yeast glucose

medium

tested

the

range of the corresponding

The

conservation

analytical

using experimental

measure-

deviation,

The data are scaled with respect to the

approximative

4. EXPERIMENTAL RESULTS

and biomass

by Sonnleitner

and corrected

the biomass.

with a

at the times correspond-

ing to the glucose, To

of are

and then interpolated

cubic spline to be available

_

to the form

measurements

exactly

according

the data is slightly modithose two balances.

The


Fig. 4. Molar deviations in the reconciliated data matrix (after projection to meet the constraints; reference value taken after about 3 h).

molar
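The projection used to make the data meet the balances can be sketched as an orthogonal projection onto the null space of the constraint matrix M (dimension S x C), so that the reconciled data satisfy D M = 0. The species order, the carbon weights (chosen to be consistent with the carbon content of the extracted stoichiometries) and in particular the degree of reduction assumed for biomass are illustrative assumptions, and the data matrix is random rather than measured.

```python
import numpy as np

def reconcile(D, M):
    """Project the rows of D onto the null space of the constraints (D_rec @ M = 0)."""
    S = M.shape[0]
    P = np.eye(S) - M @ np.linalg.pinv(M)   # orthogonal projector onto null(M^T)
    return D @ P

# Species order (assumed): glucose, ethanol, biomass, O2, CO2
carbon = [6.0, 2.0, 1.0, 0.0, 1.0]      # carbon atoms per unit of each species
redox = [24.0, 12.0, 4.2, -4.0, 0.0]    # available electrons; biomass value is hypothetical
M = np.column_stack([carbon, redox])    # S x C constraint matrix, C = 2

rng = np.random.default_rng(1)
D = rng.standard_normal((29, 5))        # stand-in for 29 observations of 5 species
D_rec = reconcile(D, M)

print(np.linalg.matrix_rank(D_rec))
```

After the projection the data matrix has rank at most S - C, i.e. 5 - 2 = 3 here, which is the upper bound on the number of independent reactions.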

deviations

matrix

are

singular

forming

plotted

are necessary

of constraints

involved

here:

5 - 2 = 3 independent cies (Bonvin ciliated

reactions

and Rippin.

data

matrix

tions, there is no longer of

singular

the

noise

values

problem,

autocorrelation

disappearance reaction. mately

A

profile soon

second

time profiles 5) indicate

ethanol

Glucose

ficients:

after

oxidation The

are:

oxidation glucose

is the only

glucose is

1

8

the

9

the same procedure of

formed

that contains This

metabolism

coefficients:

the

A

is is

between

3

and ethanol

of data

extent

data matrix

data

the measurements

only two factors,

fermentation

as described

simulated

data.

the glucose subset

glu-

= 0.

analysis

oxidation

is approximated

by

and it is used to extract shown

appears

to be qualitatively

remains

close to zero until glucose

For the extraction

in

correct,

Fig. i.e.

6.

This

the extent

fermentation

containing

of the ethanol

oxidation

the data between

sets

extent,

3 and 5 h and


of

metabo-


ethanol the

= - 1. Glucose

m


stoichiometric by

GF


is exhausted,

and

to the anaerobic

to the experimental

time

oxidative

Tat-k 3. Time

7

{h)

the stoichiometric

applied

is produced.

characterized

for

a matrix

by

expected

known

= 0 and ethanol

T&c

after 7.5 h

The

= - 1

“5

in.

3.

a priori

with

the

approxi-

autocorrelation

has stopped.

corresponds

retaining

and

-to be consumed

mentation

reactions.

In this via data

is initially a single

and ethanol

in Table

lism of glucose. coefficients

(10).

there is a clear disappearance

the feeding

5

and 7 h involving

of the columns

after about 7 h when ethanol

events are given

Ethanol

to eliminate

is detected

the backward

4

In what follows,

non-dominant

appears

begins


of glucose

above

reac-

the appearance

There


1

cose = - 1 and oxygen

spe-

three

by equation

reaction

(not shown), after

those

3 contains

A third reaction from

a reaction

between

has been achieved

the feed is restarted

Likewise,

to

there are at most

discarding

of reactions.

5 h when

the culture.

(balances

using the two balances_

of the U matrix (Fig.

when

by

noise reduction

reconciliation

and 0.0. A

the possibility

as indicated


Fig. 5. Forward autocorrelation time profiles.
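Forward autocorrelation time profiles of this kind can be sketched by repeating the singular value decomposition on a growing window of observations and computing the first-order autocorrelation of each left singular vector. The window start `k_min`, the simulated extents and the stoichiometries are hypothetical; the sketch is not claimed to reproduce the exact procedure behind Fig. 5.

```python
import numpy as np

def lag_one_autocorrelation(U):
    """First-order autocorrelation of each (unit-norm) column of U."""
    return np.sum(U[:-1] * U[1:], axis=0)

def forward_autocorrelation_profiles(D, k_min):
    """Autocorrelations of the left singular vectors of D[:k] for k = k_min..B."""
    B, _ = D.shape
    profiles = []
    for k in range(k_min, B + 1):
        U, _, _ = np.linalg.svd(D[:k], full_matrices=False)
        profiles.append(lag_one_autocorrelation(U))
    return np.array(profiles)

rng = np.random.default_rng(2)
B, S = 120, 4
t = np.linspace(0.0, 1.0, B)
# Hypothetical extents: the second reaction sets in halfway through the run
X = np.column_stack([t, np.clip(t - 0.5, 0.0, None)])
N = np.array([[-1.0, 0.0, 3.0, 1.0],
              [0.0, -1.0, 1.0, 2.0]])
D = X @ N + 0.005 * rng.standard_normal((B, S))

profiles = forward_autocorrelation_profiles(D, k_min=10)
print(np.round(profiles[-1], 2))
```

The appearance of a reaction shows up as a rise of the corresponding profile once enough observations after the event are included in the window.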

1990). (ii) Since the recon-

of rank

Q<.

3

(i) since the matrix

species,


data


corresponding

two constraints

be met) for five measured

The

reconciliated

4. The

values are: 1.30,0.39,0.022,0.0

few remarks

some

the

in Fig.

__


= 0. coeffer-

Fig. 6. Extracted extents for the three reactions.

Table 3. Time events and corresponding reactions in the fermentation process

Time (h)             3-5                       5-7                 7-7.5        7.5-9
Ethanol metabolism   Ethanol not metabolized   Ethanol consumed    No ethanol   Ethanol produced
Glucose metabolism   Glucose oxidation         Glucose oxidation   No glucose   Glucose oxidation, glucose fermentation
Number of factors    1 factor                  2 factors           0 factor     2 factors

Table 4. Extracted stoichiometries

Reaction       Glucose   Ethanol   Biomass   O2      CO2
GO (a)         -1.0      0.0       3.29      -2.60   2.71
EO (a)         0.0       -1.0      1.14      -1.83   0.86
GO (b)         -1.0      0.0       3.25      -2.64   2.75
GF (b)         -1.0      1.7       0.87      0.00    1.73
GO (average)   -1.0      0.0       3.27      -2.62   2.73

(a) Stoichiometries extracted using glucose fermentation extent; (b) stoichiometries extracted using ethanol oxidation extent. GO: glucose oxidation; EO: ethanol oxidation; GF: glucose fermentation.
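The respiratory quotient associated with each extracted stoichiometry can be checked with a short computation. The sketch takes the O2 and CO2 coefficients listed in Table 4 and forms the ratio of the CO2 coefficient to the magnitude of the O2 coefficient; treating a reaction without oxygen consumption as having an infinite quotient is a convention adopted here, and the function name is hypothetical.

```python
import math

def respiratory_quotient(o2_coeff, co2_coeff):
    """Ratio of CO2 produced to O2 consumed for one reaction stoichiometry."""
    if o2_coeff == 0:
        return math.inf   # no oxygen consumed (anaerobic reaction)
    return co2_coeff / abs(o2_coeff)

# (O2, CO2) coefficients taken from Table 4
table4 = {
    "GO (average)": (-2.62, 2.73),
    "EO (a)": (-1.83, 0.86),
    "GF (b)": (0.00, 1.73),
}

rq = {name: respiratory_quotient(o2, co2) for name, (o2, co2) in table4.items()}
for name, value in rq.items():
    print(name, round(value, 2))
```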

between

and

7.5

extracted

9 h is formed.

extent

oxidation

in Fig.

The

plot

6 indicates

takes place mainly

of

that

between

the

ethanol

5.5 and 7 h, as

expected. Knowledge allows the

of

Furthermore, known

matrix

ethanol

since

two

for

(Table dation

stoichiometric the

and

that

has been

glucose

oxidation

from the

and

glucose

the

a second-level generated

error

and

this point,

mentation extract

and

the stoichiometries ethanol

the extent

sents a third-level Fig. 6, indicates occurs resumes

until

oxidation

of glucose extraction.

with the feeding

of

can

be

oxidation. 7he

that the glucose

exhaustion

of glucose

plot,

of glucose

shown

Table

7h

The

is expected

for

to be about 0.45

anaerobic

is consumed).

the observed

growth

For

ratio should to the pure

that the computed

quotients

to increase on glucose,

growth

pure

for

characterizes

on ethanol,

pure aerobic

infinite

5 shows

coefficient

which

a mixed

lie between metabolisms.

and the expected

for the three reactions

are very

close indeed. In addition,

often

the stoichiometric

to values

from

ficients different

strain

continuous

be ferto in and

after 7.5 h.

that only (aerobic tion,

one

aerobic

trast,

and

on glucose

growth

the methods

(1989) dation.

Glucose lower

derably

higher The

possible

differences

as only

producsource

found

in

and CO2

aim at

by Axelsson in Table

for glucose this

yields,

work

response

oxihas

the bio-

is also lower

are due to noise

6.

but a consi-

yield. Furthermore, oxidation

of

In con-

simultaneously.

agreement

the fact that the metabolic

three

at a time

ethanol

are compared

biomass

mass yield for ethanol

from

in this article

fermentation

ethanol

coef-

a slightly

on glucose).

the stoichiometries

slightly

work.

was

without

occurring

is a very good

For

in such conditions

presented

and in this work

There

both

obtained

growth

reactions

Nevertheless,

for

on ethanol

and anaerobic

uncoupling

(but

performed

metabolism

are

experiments.

gives stoichiometric

yeast

medium)

cultures

growth

carbon

(1989)

the same

can be

However,

in the literature

in independent

Axelsson for

coefficients

the literature.

presented

determined

example,

in this

but also to

may be different

when two metabolisms occur simultaneously.

Discussion

validity

be checked

ratio,

corresponding

respiratory

reaction

after

stoichiometric

no oxygen

when

a0 m

0.47 0.4%a.55

growth

1.1 during

the values

This repre-

oxidation

glucose

used

the

aerobic

metabolism,

expected. At

of

to become

(i.e.

extracmust

pure

to about

extraction.

propagation

ratio

the type of metabolism,

extraction,

by

the

during

ethanol

used for this stoichiometric some

i.e.

GF (b)

EO (a)

1.04 1.0-1.2

CO2 to that for Oz. This

also that this stoichiometric

themselves

Consequently,

for

twice, once each using

extent

represents

were

oxi-

can be deter-

stoichiometry

extracted

the

RQ stoichiometry RQ theoretical

the stoichiometries

of the ethanol

of glucose

the

extent. Notice

the extents

stoichiometries

for

Go average

compared

can be computed

knowledge

fermentation

identification

coefficients

transformation

stoichiometries

fermentation

Notice

oxidation

reactions,

extent

space for reactions.

oxidation

reaction,

oxidation

extent,

mined.

tion

each

4). Similarly,

oxidation

i.e.

fermentation

T, and with it the complete

for the two

the

glucose

of the stoichiometric

and

glucose

are

the

computation

Table 5. Respiratory quotients for the three reactions

stoichiometries

Ethanol

- 1.0 0.0 - 1.0 -1.0 -1.0

(a): (a): (b): (b): average


of the extracted

by computing

stoichiometries

the respiratory

can

quotient,

The largest discrepancies ficients.

Biomass

Table 6. Comparison of reaction stoichiometries found by Axelsson (continuous culture, a single mechanism at the time) and in this work (fed-batch culture, several mechanisms at the time)

Reaction   Source      Glucose   Ethanol   Biomass   O2      CO2    RQ
GO         Axelsson    -1.0      0.00                -2.33          1.07
GO         This work   -1.0      0.00                -2.62          1.04
EO         Axelsson    0.0       -1.00     1.32      -1.61   0.68   0.42
EO         This work   0.0       -1.00     1.14      -1.83   0.86   0.47
GF         Axelsson    -1.0      1.88      0.36      0.00
GF         This work   -1.0      1.70      0.87      0.00

are in the biomass

measurements

coef-

are the most likely


to

be

in error

possible

for

sample

because

each

volume

measurement more.

a single

sample.

would

In contrast,

repeated.

The

allows

increase

the accuracy

scaling

of the

measure-

measurements

of O2 and CO2 acqui-

noise

to be filtered

out.

The

factor

for biomass

would

not

regarding

biomass.

In this work, to the biomass

to the other

even

The

measured

affinity

residual

6Wmg/l

measurements

of the yeast for glucose glucose

respiratory

concentration

capacity

glucose

and

can

oxidative

be

fully

capacity

uptake

saturation

level,

the saturation 7.

It

additional

oxygen

is limited

6.67 and 7.5 h. Between glucose The

uses

specific

(Fig.

up most glucose

8) because

concentration

piratory

quotient, on both

extracted

observations:

uptake

rate

rate is nearly

zero glucose

in

remains between

(Fig.

and 6)

slowly and the

The observed

match

occurs

The

biological at all times

(the glucose

that interruption

is consumed

meta-

prevails.

these

resbelow

that an oxidative

oxidation

during

decreases

ethanol

of

capacity.

in Fig. 9. decreases

except when the feed is stopped the residual

and

uptake because

very rapidly),

Fig. 8. Specific glucose uptake rate.

ethanol when

oxidation

sets in after 5 h and stops after 7 h

all the ethanol

feeding

of glucose

city increases formation

and

is consumed. (Fig.

immediately

It is interesting to the

approximately

to note

reactor

resolve data

reaction

networks

using factor

analytical

The complete

where

trial and error.

seen

as a complete of desired

of extraction. FA

the

In factor analysis.

some

amount

this is not a step-by-

user

puts

new,

however,

drawback user input.

and

should

especially

in the case

reaction

is much room

extraction

not be a good

since the application

ment in many of the individual scaling

infor-

out at the

as it allows

and chemical there

in the

comes

most of the steps require

This,

Nonetheless,

to biochemical

is relatively

procedure

in Fig. 3.

at the top and the answer

bottom.

have none

CONCLUSION

was presented.

algorithm

mation

corres-

reactions

Consequently.

As stated in the introduction, step

fermen-

is negligible. 5.

techniques

uptake

and ethanol

that the extents

macroscopic

to help

for biochemical

glucose

sets in. The glucose

the same range.

approach

is described

The

capacity,

the capa-

these observations.

three

of these reactions

An

9).

the oxidative

tation extent supports ponding

Following

after 7.5 h, the respiratory

rapidly

rate is far above

offer

steps. much

of

networks

for improveIn particular, potential

for

research.


up to

rate is shown

rate is constant

increases.

glucose

glucose

is oxidized

3 and 6.47 h, oxidation

shown

extents

is

than the

of the respiratory

one with time, an indication bolism

excess

feed is stopped

the feed

biomass

as the

consumption.

8 mmol/g/h

until the substrate

that

capacity.

consumption to

the

is lower

ethanol

limited

If the glucose

oxygen

uptake

than

means

as long

limit.

of the respiratory

The specific constant

without

if the glucose

is

This

only

this

is very high.

yeast

1986).

above

to ethanol

Conversely,

Baker’s

oxidized

we

noisier.

is smaller

.

2

as

of 37.5 g/l. The

is not saturated.

increases

reduced

of

Kappeli,

though

were

for an inlet concentration

(Sonnleitner

Fig.

variables,

that the biomass

I

0

we

chose to give the same importance

The

o-b-.-

the effect of noise but also the accuracy

of the results

knew

10

the

operation

ethanol

and suspect

high frequency

use of a small only reduce

and

is

increasing

the reactor

glucose

random

measurement

fact,

but also perturb

ments can be duplicated, sition

In

Fig. 7. Specific oxygen consumption rate (○) and specific CO2 production rate (□).

Fig. 9. Respiratory quotient.


Furthermore, biochemical

the application

data should

for the normal intermediary ing.

In

of

Although

no work

step

kinetic has

to realize

satisfactory mation

been

tics).

That

is, the

predicting

iterative

process

Moreover, vidual

interesting

FA

reactions

data,

could

as it is Once

provide be

infor(e.g.

reaction

allowing

could of

step

of purposes

could

facilitate

instead

in this

the retrieved

thus

which

process.

of the loop.

to model

modeling

raw

a method an

more

the modeling

a conglomerate

to

kineoverall

reliable. of indiof

reactions.

Acknowledgement: Financial support from the Swiss National Foundation for Scientific Research is gratefully acknowledged.

NOMENCLATURE

B = number of observations
C = number of balances (constraints)
c_s = molar concentration of the sth species
D = data matrix of dimension B x S containing the quantities consumed or produced
F = feed rate (mol/s)
m = vector containing the number of balanced units
M = matrix of constraints of dimension S x C
N = stoichiometric matrix of dimension R x S
n_{r,s} = stoichiometric coefficient of the sth species in the rth reaction
R = number of independent reactions
RQ = respiratory quotient
S = number of species
T = transformation matrix of dimension R x R defined in equation (13)
t = time (h)
t_ref = reference time
U = matrix of left singular vectors defined in equation (9)
V = matrix of right singular vectors defined in equation (9)
V = volume (l)
X = reaction extents matrix of dimension B x R
ẋ_r = extent rate of the rth reaction defined in equation (2)

Greek symbols

Σ = matrix of singular values
σ_i = ith singular value

Subscripts

a = abstract
in = input
out = output
orig = original
ext = extracted

Superscripts

overbar = indicates feed concentration
D̂ = approximation of D defined in equation (10)
D̃ = projection of D defined in equations (23) and (26)
D+ = pseudo-inverse of D
D' = submatrix of D defined in equation (19)
D'' = submatrix of D defined in equation (19)

REFERENCES

Axelsson J. P., Modelling and control of fermentation processes, PhD thesis. Lund Institute of Technology, Lund, Sweden (1989).
Bonvin D. and D. W. T. Rippin, Target factor analysis for the identification of stoichiometric models. Chem. Engng Sci. 45, 3417-3426 (1990).
Gampp H., M. Maeder, C. J. Meyer and A. D. Zuberbuehler, Evolving factor analysis. Comments Inorg. Chem. 6, 41-60 (1987).
Hamer J., Stoichiometric interpretation of multireaction data: application to fed-batch fermentation data. Chem. Engng Sci. 44, 2363-2374 (1989).
Harman H. H., Modern Factor Analysis. The University of Chicago Press, Chicago (1970).
Harmon J. H., Ph. Duboc and D. Bonvin, Application of Factor Analysis to the Resolution of Biochemical Reaction Networks, Internal Report-1992.08, Institut d’Automatique, EPFL (1992).
Horn R. A. and C. A. Johnson, Matrix Analysis. Cambridge University Press, Cambridge (1985).
Malinowski E. R., Theory of the distribution of the error eigenvalues resulting from principal component analysis with applications to spectroscopic data. J. Chemometrics 1, 33-40 (1987).
Malinowski E. R., Factor Analysis in Chemistry. John Wiley, New York (1991).
Prinz O. and D. Bonvin, Monitoring discontinuous reactors using factor-analytical methods, IFAC Symp. DYCORD + ‘92, College Park, MD (1992).
Randolph T. W., I. W. Marison, D. E. Martens and U. von Stockar, Calorimetric control of fed-batch fermentations. Biotechnol. Bioengng 36, 678-684 (1990).
Saner U. M., Modelling and on-line estimation in a batch culture of Bacillus subtilis, PhD thesis. Swiss Federal Institute of Technology, Zurich (1991).
Shrager R. I. and R. W. Hendler, Titration of individual components in a mixture with resolution of difference spectra, pKs, and redox transitions. Anal. Chem. 54, 1147-1152 (1982).
Sonnleitner B. and O. Käppeli, Growth of Saccharomyces cerevisiae is controlled by its limited respiratory capacity: formulation and verification of a hypothesis. Biotechnol. Bioengng 28, 927-937 (1986).
Stephanopoulos G. and K.-Y. San, Studies on on-line bioreactor identification. I. Theory. Biotechnol. Bioengng 26, 1176-1188 (1984).
Wang N. S. and G. Stephanopoulos, Application of macroscopic balances to the identification of gross measurement errors. Biotechnol. Bioengng 25, 2177-2208 (1983).

possible

the

accomplished

the closure

coefficients,

adds an

the

in

results are obtained,

yield

to

more specific model-

indicates

can be used for a variety

calculate

analysis

It merely

modeling

this is an extremely

conceivable

of

allows

last

inclusion area,

process.

step which the

of factor

not be seen as a substitute

modeling

fact,


APPENDIX

Indicators of rank

(a) From Malinowski (1987), the F function

$$F(1, S-n) = \frac{\sum_{j=n+1}^{S} (B-j+1)(S-j+1)}{(B-n+1)(S-n+1)} \, \frac{\sigma_n^2}{\sum_{j=n+1}^{S} \sigma_j^2} \qquad (A1)$$


is used to check the hypothesis that $\sigma_n$ is statistically similar to the pool of $S-n$ smallest singular values (i.e. $\sigma_{n+1}, \sigma_{n+2}, \ldots, \sigma_S$):

$$P\{F(1, S-n) > F(\alpha; 1, S-n)\} = \alpha. \qquad (A2)$$

A confidence level of 95% is typically used ($\alpha = 0.05$).

(b) Shrager and Hendler (1982) proposed the following first-order autocorrelation function as an indicator of rank:

$$\mathrm{AUTO}(u_i) = \sum_{j=1}^{B-1} u_{j,i} \, u_{(j+1),i}. \qquad (A3)$$

Autocorrelation values close to one are deemed to be highly significant, whereas values close to zero represent random signals. The authors suggested a cutoff value of 0.5, although this number may vary for different applications.
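The use of the autocorrelation indicator can be illustrated with a minimal sketch. It computes the first-order autocorrelation of each left singular vector and counts the columns above the cutoff of 0.5. The simulated extents, stoichiometries and noise level are hypothetical and chosen only to give a data matrix with two smooth factors.

```python
import numpy as np

def autocorrelation_rank(D, cutoff=0.5):
    """Return the autocorrelation of each left singular vector and the implied rank."""
    U, _, _ = np.linalg.svd(D, full_matrices=False)
    # Sum over j of u[j, i] * u[j + 1, i] for each column i of U
    auto = np.sum(U[:-1] * U[1:], axis=0)
    return auto, int(np.sum(auto > cutoff))

rng = np.random.default_rng(3)
B, S = 200, 5
t = np.linspace(0.0, 1.0, B)
X = np.column_stack([t, np.sin(np.pi * t)])       # two hypothetical smooth extents
N = np.array([[-1.0, 0.0, 3.0, -2.6, 2.7],
              [0.0, -1.0, 1.1, -1.8, 0.9]])       # hypothetical stoichiometries
D = X @ N + 0.01 * rng.standard_normal((B, S))    # additive measurement noise

auto, rank = autocorrelation_rank(D)
print(np.round(auto, 2), rank)
```

Columns of U that carry a smooth signal give values close to one, whereas columns dominated by noise give values close to zero, so the count above the cutoff estimates the number of significant factors.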