Chapter 4  Partial Least Squares Analysis
4.1 Basic Concepts
4.2 NIPALS and SIMPLS Algorithms
4.3 Programming Method of Standard Partial Least Squares
4.4 Example Application
4.5 Stacked Partial Least Squares
4.1 Basic Concepts
4.1.1 Partial Least Squares
Consider the general setting of a linear PLS algorithm to model the relation between two data sets (blocks of variables). Denote by X ⊂ R^N an N-dimensional space of variables representing the first block, and similarly by Y ⊂ R^M a space representing the second block of variables. PLS models the relations between these two blocks by means of score vectors.
After observing n data samples from each block of variables, PLS decomposes the (n×N) matrix of zero-mean variables X and the (n×M) matrix of zero-mean variables Y into the form

X = TP^T + E,  Y = UQ^T + F        (4.1.1)

Graphically, it can be shown as Fig. 4.1, where T, U are (n×p) matrices of the p extracted score vectors (components, latent vectors), the (N×p) matrix P and the (M×p) matrix Q represent matrices of loadings, and the (n×N) matrix E and the (n×M) matrix F are the matrices of residuals. The PLS method, which in its classical form is based on the nonlinear iterative partial least squares (NIPALS) algorithm, finds weight vectors w, c such that

[cov(t, u)]^2 = [cov(Xw, Yc)]^2 = max_{|r|=|s|=1} [cov(Xr, Ys)]^2        (4.1.2)
where cov(t, u) = t^T u/n denotes the sample covariance between the score vectors t and u.
The NIPALS algorithm starts with a random initialization of the Y-space score vector u and repeats the following sequence of steps until convergence:

(1) w = X^T u/(u^T u)  (X-space weights)
(2) w = w/||w||  (normalize w to unit length)
(3) t = Xw  (X-space scores)
(4) c = Y^T t/(t^T t)  (Y-space weights)
(5) c = c/||c||  (normalize c to unit length)
(6) u = Yc  (Y-space scores)
Note that u = y if M = 1; that is, Y is a one-dimensional vector that we denote by y. In this case the NIPALS procedure converges in a single iteration.
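As an illustration, a minimal NumPy sketch of steps (1)-(6) above (the function name and the convergence test are our own; instead of a random start, a column of Y is used to initialize u, which is a common deterministic choice; X and Y are zero-mean data matrices):

import numpy as np

def nipals_weights(X, Y, tol=1e-10, max_iter=500):
    u = Y[:, [0]]                             # initialize the Y-space score u
    t_old = np.zeros((X.shape[0], 1))
    for _ in range(max_iter):
        w = np.dot(X.T, u) / np.dot(u.T, u)   # (1) X-space weights
        w = w / np.linalg.norm(w)             # (2) normalize w to unit length
        t = np.dot(X, w)                      # (3) X-space scores
        c = np.dot(Y.T, t) / np.dot(t.T, t)   # (4) Y-space weights
        c = c / np.linalg.norm(c)             # (5) normalize c to unit length
        u = np.dot(Y, c)                      # (6) Y-space scores
        if np.linalg.norm(t - t_old) < tol:   # stop when t no longer changes
            break
        t_old = t
    return w, c, t, u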
It can be shown that the weight vector w also corresponds to the first eigenvector of the following eigenvalue problem:

X^T Y Y^T X w = λw        (4.1.3)
The X- and Y-space score vectors t and u are then given as

t = Xw,  u = Yc        (4.1.4)
where the weight vector c is defined in steps (4) and (5) of NIPALS. Similarly, eigenvalue problems for the extraction of the t, u or c estimates can be derived. The user then solves one of these eigenvalue problems, and the other score or weight vectors are readily computable using the relations defined in NIPALS.
4.1.2 Forms of Partial Least Squares
PLS is an iterative process. After the extraction of the score vectors t, u, the matrices X and Y are deflated by subtracting their rank-one approximations based on t and u. Different forms of deflation define several variants of PLS.
Using equation (4.1.1), the vectors of loadings p and q are computed as coefficients of regressing X on t and Y on u, respectively:

p = X^T t/(t^T t),  q = Y^T u/(u^T u)
4.1.2.1 PLS Mode A
PLS Mode A is based on rank-one deflation of the individual block matrices using the corresponding score and loading vectors. In each iteration of PLS Mode A the X and Y matrices are deflated:

X = X - tp^T,  Y = Y - uq^T
4.1.2.2 PLS1, PLS2
PLS1 (one of the blocks of data consists of a single variable) and PLS2 (both blocks are multidimensional) are used as PLS regression methods. These variants of PLS are the most frequently used PLS approaches.
The relationship between X and Y is asymmetric. Two assumptions are made: i) the score vectors {t_i}, i = 1, …, p, are good predictors of Y, where p denotes the number of extracted score vectors (PLS iterations); ii) a linear inner relation between the score vectors t and u exists, that is,

U = TD + H        (4.1.5)
where D is the (p×p) diagonal matrix and H denotes the matrix of residuals.
The asymmetric assumption of the predictor-predicted variable(s) relation is transformed into a deflation scheme in which the score vectors {t_i}, i = 1, …, p, of the predictor space, say X, are good predictors of Y. The score vectors are then used to deflate Y; that is, a component of the regression of Y on t is removed from Y at each iteration of PLS:

X = X - tp^T,  Y = Y - tt^T Y/(t^T t) = Y - tc^T
4.1.2.3 PLS-SB
As outlined at the end of the previous paragraph, the computation of all eigenvectors of equation (4.1.3) at once would define another form of PLS. This computation involves a sequence of implicit rank-one deflations of the overall cross-product matrix. This form of PLS is denoted PLS-SB, in accordance with the work in which it was used. In contrast to PLS1 and PLS2, the extracted score vectors {t_i}, i = 1, …, p, are in general not mutually orthogonal.
4.1.3 PLS Regression
As mentioned in the previous section, PLS1 and PLS2 can be used to solve linear regression problems. Combining assumption (4.1.5) of a linear relation between the score vectors t and u with the decomposition of the Y matrix, equation (4.1.1) can be written as

Y = TDQ^T + (HQ^T + F)
This defines the equation

Y = TC^T + F*        (4.1.6)

where C^T = DQ^T now denotes the (p×M) matrix of regression coefficients and F* = HQ^T + F is the residual matrix.
Equation (4.1.6) is simply the decomposition of Y using ordinary least squares regression with orthogonal predictors T.
We now consider orthonormalised score vectors t, that is, T^T T = I, and the matrix C = Y^T T of the weight vectors c not scaled to unit length. It is useful to redefine equation (4.1.6) in terms of the original predictors X. To do this, we use the relationship

T = XW(P^T W)^{-1}
where P is the matrix of loading vectors defined in equation (4.1.1). Plugging this relation into equation (4.1.6), we obtain

Y = XW(P^T W)^{-1} C^T + F* = XB + F*
For a better understanding of these matrix equations, they are also given in graphical representation in Fig. 4.2.
where B represents the matrix of regression coefficients

B = W(P^T W)^{-1} C^T = X^T U(T^T X X^T U)^{-1} T^T Y
For the last equality, the relations among T, U, W and P are used [12,10,17]. Note that different scalings of the individual score vectors t and u do not influence the B matrix. For training data, the estimate of PLS regression is

Ŷ = TT^T Y = XB
and for testing data we have

Ŷ_t = T_t T^T Y = X_t B

where X_t and T_t = X_t X^T U(T^T X X^T U)^{-1} represent the matrices of testing data and score vectors, respectively.
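As a sketch of these formulas in NumPy (the helper name is ours; X, Y are the centered training blocks and T, U the matrices of extracted score vectors):

import numpy as np

def pls_regression_coefficients(X, Y, T, U):
    # B = X^T U (T^T X X^T U)^-1 T^T Y, the regression matrix above
    XtU = np.dot(X.T, U)
    M = np.dot(np.dot(T.T, X), XtU)          # T^T X X^T U
    return np.dot(np.dot(XtU, np.linalg.inv(M)), np.dot(T.T, Y))

# Training fit: Y_hat = X B.  For test data X_t, centered with the
# training-set means: Y_hat_t = X_t B.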
4.1.4 Statistics
From the matrices of residuals E_h and F_h, sums of squares can be calculated as follows: the total sum of squares over a matrix, the sums of squares over rows, and the sums of squares over columns. These sums of squares can be used to construct variance-like estimators. The statistical properties of these estimators have not yet undergone a rigorous mathematical treatment, but some properties can be understood intuitively.
Sums of squares over the columns indicate the importance of a variable for a certain component. Sums of squares over the rows indicate how well the objects fit the model. This can be used as an outlier detection criterion. Illustrations are given in Fig. 4.3(a) for variable statistics and in Fig. 4.3(b) for sample statistics.
An advantage of PLS is that these statistics can be calculated for every component. This is an ideal means of following the model-building process. The evolution of these statistics can be followed (as shown in Fig. 4.3(a) and (b)) as more and more components are calculated, so that an idea of how the different objects and variables fit can be obtained. In combination with a criterion for model dimensionality, the statistics can be used to estimate which objects and variables contribute mainly to the model and which contribute mainly to the residual.
4.2 NIPALS and SIMPLS Algorithms
4.2.1 NIPALS
In this section, the transpose of a matrix is represented by the superscript prime (').
4.2.1.1 Theory
The PLS algorithm as described in this section will be called the "standard" PLS algorithm. It has been presented in detail elsewhere [3-6]. For some alternative implementations of PLS see e.g. references [7-9]. The first step in standard PLS is to center the data matrices X and Y, giving X0 and Y0, respectively. Then a set of A orthogonal X-block factor scores T = [t1, t2, …, tA] and companion Y-block factor scores U = [u1, u2, …, uA] are calculated factor by factor. The first PLS factors t1 and u1 are weighted sums of the centered variables: t1 = X0w1 and u1 = Y0q1, respectively. Usually the weights are determined via the NIPALS algorithm. This is the iterative sequence

w1 ∝ X0'u1,  t1 = X0w1,  q1 ∝ Y0't1,  u1 = Y0q1

repeated until convergence, where ∝ indicates that the result is normalized to unit length.
Once the first X-block factor t1 is obtained, one proceeds with deflating the data matrices. This yields new data sets X1 and Y1, which are the matrices of residuals obtained after regressing all variables on t1:

X1 = X0 - t1(t1't1)^{-1}t1'X0,  Y1 = Y0 - t1(t1't1)^{-1}t1'Y0
4.2.1.2 NIPALS-PLS Factors in Terms of Original Variables
Each of the weight vectors wa, a = 2, 3, …, A, used for defining the associated factor scores, applies to a different matrix of residuals X_{a-1}, and not to the original centered data X0. This obscures the interpretation of the factors, mainly because one loses sight of what is in the depleted matrices Xa as one goes to higher dimensions, a ≥ 1. Some X variables are used in the first factors, others only much later. The relation between factors and variables is better displayed by the loadings pa (a = 1, 2, …, A). Indeed, the weight vectors, collected in the p×A matrix W, have found less use in interpreting PLS regression models than the loading vectors.
It is therefore advantageous to re-express the NIPALS-PLS factors ta in terms of the original centered data X0, say

ta = X0ra        (4.2.13)
or, collecting the alternative weight vectors in a p×A matrix R = [r1, r2, …, rA],

T = X0R        (4.2.14)
The factor scores T computed via NIPALS-PLS, i.e. via depleted X matrices, can be expressed exactly as linear combinations of the centered X variables, since all deflated matrices Xa and factor scores ta, a = 1, 2, …, A, lie in the column space of X0. Thus, R can be computed from the regression of T on X0:

R = (X0'X0)^- X0'T = X0^+ T        (4.2.15)
where P = [p1, p2, …, pA] is the (p×A) matrix of factor loadings, the superscript - indicates any generalized inverse, and + indicates the unique Moore-Penrose pseudo-inverse. We also have the relation

P'R = IA        (4.2.16)
since r_b'pa = r_b'X0'ta/(ta'ta) = t_b'ta/(ta'ta) = δ_ab. Here IA is the (A×A) identity matrix and δ_ab is Kronecker's delta. Thus R is a generalized inverse of P'. Another expression for R is

R = W(P'W)^{-1}        (4.2.17)
which follows from the observation that R and W share the same column space and that P'R should be equal to the identity matrix.
The explicit computation of the (pseudo-)inverse matrices in equations (4.2.15) and (4.2.17) detracts somewhat from the PLS-NIPALS algorithm, which is otherwise very straightforward. Höskuldsson gives the following recurrent relation:

ra = wa - r_{a-1}(p_{a-1}'wa)        (4.2.18)
starting with r1 = w1. However, this relation depends on the tridiagonal structure of P'P and is only correct for univariate Y = y (m = 1, PLS1). Equations (4.2.19) and (4.2.20) form a set of updating formulas that is generally applicable:

ra = Ga wa        (4.2.19)
G_{a+1} = Ga(Ip - wa pa')        (4.2.20)
starting with G1 = Ip. Note that the vectors ra are not normalized, in contrast to the weight vectors wa. Thus in equation (4.2.13), neither ta nor ra are normalized.
When the R weights are available, a closed-form multiple regression-type prediction model can be obtained more readily:

Ŷ0 = X0 R diag(b) Q' = X0 B_PLS
Here, B_PLS = R diag(b) Q' = W(P'W)^{-1} diag(b) Q' is the p×m set of biased multivariate regression coefficients obtained via PLS regression.
4.2.2 SIMPLS
4.2.2.1 Theory
The modification we propose leads to the direct computation of the weights R. In this way we avoid the construction of the deflated data matrices X1, X2, …, XA and Y1, Y2, …, YA, and bypass the calculation of the weights W. The explicit computation of matrix inverses as in equation (4.2.15) or (4.2.17) is also circumvented. The newly defined R is similar, but not identical, to the "standard" R introduced in equation (4.2.14). In fact, our new R contains normalized weight vectors, just as W in standard PLS.
Thus, the task we face is to compute weight vectors ra and qa (a = 1, 2, …, A) which can be applied directly to the centered data:

ta = X0ra,  ua = Y0qa
The weights should be determined so as to maximize the covariance of the score vectors ta and ua under some constraints. (The term covariance will be used somewhat loosely and interchangeably with the terms cross-product or inner product; they merely differ by a scalar factor n-1.) Specifically, four conditions control the solution:
(1) maximization of covariance: ua'ta = qa'Y0'X0ra = max!
(2) normalization of weights ra: ra'ra = 1
(3) normalization of weights qa: qa'qa = 1
(4) orthogonality of t scores: t_b'ta = 0 for b < a
4.2.2.2 SIMPLS Algorithm
It is expedient to compute S_{a+1} from its predecessor Sa, where Sa denotes the deflated cross-product matrix (S1 = X0'Y0). To achieve this, the projection onto the column space of Pa will be carried out as a sequence of orthogonal projections. For this we need an orthonormal basis of Pa, say Va = [v1, v2, …, va]. Va may be obtained from a Gram-Schmidt orthonormalization of Pa, i.e.,

va ∝ pa - V_{a-1}(V_{a-1}'pa)        (4.2.32)
starting with V1 = v1 ∝ p1. An additional simplification is possible when the response is univariate (m = 1, PLS1). In this case, one may employ the orthogonality properties of P, viz., p_b'pa = 0 for b ≤ a-2. These properties carry over to the orthonormalized loadings V, i.e., v_b'pa = 0 for b ≤ a-2. Thus, orthogonality of pa with respect to V_{a-2} is automatically taken care of, and equation (4.2.32) simplifies to

va ∝ pa - v_{a-1}(v_{a-1}'pa)
The projection onto the subspace spanned by the first a loading vectors, Pa(Pa'Pa)^{-1}Pa', can now be replaced by VaVa', and the projection onto the orthogonal complement of Pa by Ip - VaVa' = ∏_{b=1}^{a}(Ip - v_b v_b'). Thus, utilizing the orthonormality of V, the product matrices Sa (a = 1, 2, …) are steadily depleted by projecting out the perpendicular directions va:

S_{a+1} = Sa - va(va'Sa)
4.2.2.3 Fitting, Prediction and Residual Analysis
For the development of the theory and algorithm of SIMPLS it was convenient to choose normalized weight vectors ra. This choice, however, is in no way essential. We will now switch to a normalization of the scores ta instead, since this considerably simplifies some of the ensuing formulas. The code given in the Appendix already uses the latter normalization scheme. Thus we redefine ra = ra/|X0ra| and ta = ta/|ta|, giving unit-length score vectors ta and orthonormal T: T'T = IA.
Predicted values of the calibration samples are now obtained as

Ŷ0 = TT'Y0 = X0R(T'Y0)

giving B_PLS = R(T'Y0).
For new objects we employ the straightforward prediction formula

ŷ*' = x0*'B_PLS

where x0* denotes the (column-wise centered) vector of predictor values of the new object.
The factor scores t*_a = x0*'ra and the leverage h* = Σ_a (t*_a)^2 may be computed for diagnostic purposes, e.g., to assess whether or not the new object lies within the region covered by the training objects.
4.2.2.4 Detailed SIMPLS Algorithm
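The detailed algorithm is not reproduced here; the following NumPy sketch assembles the steps of this section under the unit-score normalization of 4.2.2.3 (function and variable names are ours, and this is a compact sketch rather than a reference implementation):

import numpy as np

def simpls(X0, Y0, A):
    # X0 (n x p) and Y0 (n x m) are centered data matrices
    n, p = X0.shape
    m = Y0.shape[1]
    R = np.zeros((p, A)); T = np.zeros((n, A))
    Q = np.zeros((m, A)); V = np.zeros((p, A))
    S = np.dot(X0.T, Y0)                       # cross-product matrix S1 = X0'Y0
    for a in range(A):
        # r_a: dominant left singular vector of the deflated S
        U_s, s_s, Vt_s = np.linalg.svd(S, full_matrices=False)
        r = U_s[:, [0]]
        t = np.dot(X0, r)
        norm_t = np.linalg.norm(t)
        t = t / norm_t; r = r / norm_t         # unit-length score t_a
        p_load = np.dot(X0.T, t)               # X loadings p_a
        q = np.dot(Y0.T, t)                    # Y loadings q_a
        v = p_load.copy()
        if a > 0:                              # Gram-Schmidt vs. previous v_b
            v = v - np.dot(V[:, :a], np.dot(V[:, :a].T, p_load))
        v = v / np.linalg.norm(v)
        S = S - np.dot(v, np.dot(v.T, S))      # S_{a+1} = S_a - v_a(v_a'S_a)
        R[:, a] = r[:, 0]; T[:, a] = t[:, 0]
        Q[:, a] = q[:, 0]; V[:, a] = v[:, 0]
    B_pls = np.dot(R, Q.T)                     # B_PLS = R(T'Y0) = RQ'
    return B_pls, T, R, Q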
4.3 Programming Method of Standard Partial Least Squares
4.3.1 Cross-validation
Learning the parameters of a prediction function and testing it on the same data is a methodological mistake: a model that would just repeat the labels of the samples that it has just seen would have a perfect score but would fail to predict anything useful on yet-unseen data. This situation is called overfitting. To avoid it, it is common practice when performing a (supervised) machine learning experiment to hold out part of the available data as a test set X_test, Y_test. Note that the word "experiment" is not intended to denote academic use only, because even in commercial settings machine learning usually starts out experimentally.
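In scikit-learn this holdout can be done with the train_test_split helper; a minimal sketch (the data values and the 40% test fraction are arbitrary choices):

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 variables
y = np.arange(10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=0)
print(X_train.shape)               # (6, 2): 60% of the samples for training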
When evaluating different settings ("hyperparameters") for estimators, there is still a risk of overfitting on the test set because the parameters can be tweaked until the estimator performs optimally. This way, knowledge about the test set can "leak" into the model and evaluation metrics no longer report on generalization performance. To solve this problem, yet another part of the dataset can be held out as a so-called "validation set": training proceeds on the training set, after which evaluation is done on the validation set, and when the experiment seems to be successful, final evaluation can be done on the test set.
However, by partitioning the available data into three sets, we drastically reduce the number of samples which can be used for learning the model, and the results can depend on a particular random choice for the pair of (train, validation) sets.
A solution to this problem is a procedure called cross-validation (CV for short). A test set should still be held out for final evaluation, but the validation set is no longer needed when doing CV.
4.3.1.1 Cross-validation Iterators for i.i.d. Data
Assuming the data are independent and identically distributed (i.i.d.), the following cross-validators can be used.
Note: While i.i.d. data is a common assumption in machine learning theory, it rarely holds in practice. If one knows that the samples have been generated using a time-dependent process, it is safer to use a time-series aware cross-validation scheme. Similarly, if we know that the generative process has a group structure (samples collected from different subjects, experiments, measurement devices), it is safer to use group-wise cross-validation.
1. K-Fold
In the basic approach, called K-Fold CV, the training set is split into k smaller sets (other approaches are described below, but generally follow the same principles). The following procedure is followed for each of the k "folds":
(1) a model is trained using k-1 of the folds as training data;
(2) the resulting model is validated on the remaining part of the data.
Example of 2-fold cross-validation on a dataset with 4 samples:
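A sketch of such an example with scikit-learn's KFold iterator (data values are arbitrary; only the indices matter):

import numpy as np
from sklearn.model_selection import KFold

X = np.array([[0., 0.], [1., 1.], [-1., -1.], [2., 2.]])
kf = KFold(n_splits=2)
for train_index, test_index in kf.split(X):
    print("train: %s  test: %s" % (train_index, test_index))
# train: [2 3]  test: [0 1]
# train: [0 1]  test: [2 3]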
Here is a visualization of the cross-validation behavior in Fig. 4.4. Note that K-Fold is not affected by classes or groups.
2. Repeated K-Fold
RepeatedK-Fold repeats K-Fold n times. It can be used when one requires to run K-Fold n times, producing different splits in each repetition.
Example of 2-fold K-Fold repeated 2 times:
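A sketch using scikit-learn's RepeatedKFold (the random_state value is an arbitrary seed):

import numpy as np
from sklearn.model_selection import RepeatedKFold

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
rkf = RepeatedKFold(n_splits=2, n_repeats=2, random_state=12883823)
for train_index, test_index in rkf.split(X):
    print("train: %s  test: %s" % (train_index, test_index))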
3. Leave One Out (LOO)
LeaveOneOut (or LOO) is a simple cross-validation. Each learning set is created by taking all the samples except one, the test set being the sample left out. Thus, for n samples, we have n different training sets and n different test sets. This cross-validation procedure does not waste much data, as only one sample is removed from the training set:
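A sketch with scikit-learn's LeaveOneOut:

from sklearn.model_selection import LeaveOneOut

X = [1, 2, 3, 4]
loo = LeaveOneOut()
for train_index, test_index in loo.split(X):
    print("train: %s  test: %s" % (train_index, test_index))
# train: [1 2 3]  test: [0]
# train: [0 2 3]  test: [1]  ... and so on for each left-out sample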
4. Leave P Out (LPO)
LeavePOut is very similar to LeaveOneOut, as it creates all the possible training/test sets by removing p samples from the complete set. For n samples, this produces C(n, p) train-test pairs. Unlike LeaveOneOut and K-Fold, the test sets will overlap for p > 1.
Example of Leave-2-Out on a dataset with 4 samples:
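A sketch with scikit-learn's LeavePOut:

import numpy as np
from sklearn.model_selection import LeavePOut

X = np.ones(4)                      # 4 samples
lpo = LeavePOut(p=2)                # leave 2 samples out: C(4, 2) = 6 pairs
for train_index, test_index in lpo.split(X):
    print("train: %s  test: %s" % (train_index, test_index))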
5. Random Permutations Cross-validation a.k.a. Shuffle & Split
The ShuffleSplit iterator will generate a user-defined number of independent train/test dataset splits. Samples are first shuffled and then split into a pair of train and test sets.
It is possible to control the randomness for reproducibility of the results by explicitly seeding the random_state pseudo-random number generator.
Here is a usage example:
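A sketch with scikit-learn's ShuffleSplit (five random 75/25 splits of ten samples):

import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(10)
ss = ShuffleSplit(n_splits=5, test_size=0.25, random_state=0)
for train_index, test_index in ss.split(X):
    print("train: %s  test: %s" % (train_index, test_index))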
Here is a visualization of the cross-validation behavior in Fig. 4.5. Note that ShuffleSplit is not affected by classes or groups.
4.3.1.2 Cross-validation Iterators with Stratification Based on Class Labels
Some classification problems can exhibit a large imbalance in the distribution of the target classes: for instance, there could be several times more negative samples than positive samples. In such cases it is recommended to use stratified sampling as implemented in StratifiedKFold and StratifiedShuffleSplit to ensure that relative class frequencies are approximately preserved in each train and validation fold.
1. Stratified K-Fold
StratifiedKFold is a variation of K-Fold which returns stratified folds: each fold contains approximately the same percentage of samples of each target class as the complete set.
Example of stratified 3-fold cross-validation on a dataset with 10 samples from two slightly unbalanced classes:
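A sketch with scikit-learn's StratifiedKFold (4 samples of class 0 and 6 of class 1):

import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.ones(10)
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
skf = StratifiedKFold(n_splits=3)
for train_index, test_index in skf.split(X, y):
    print("train: %s  test: %s" % (train_index, test_index))
# each fold keeps roughly the 2:3 ratio between the classes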
Here is a visualization of the cross-validation behavior in Fig. 4.6.
2. Stratified Shuffle Split
StratifiedShuffleSplit is a variation of ShuffleSplit which returns stratified splits, i.e., it creates splits by preserving the same percentage for each target class as in the complete set.
Here is a visualization of the cross-validation behavior in Fig. 4.7 (Stratified Shuffle Split).
4.3.1.3 Cross-validation Iterators for Grouped Data
The i.i.d. assumption is broken if the underlying generative process yields groups of dependent samples. Such a grouping of data is domain specific. An example would be medical data collected from multiple patients, with multiple samples taken from each patient. Such data is likely to be dependent on the individual group. In our example, the patient id for each sample will be its group identifier.
1. Group K-Fold
GroupKFold is a variation of K-Fold which ensures that the same group is not represented in both testing and training sets. For example, if the data is obtained from different subjects with several samples per subject, and if the model is flexible enough to learn from highly person-specific features, it could fail to generalize to new subjects. GroupKFold makes it possible to detect this kind of overfitting situation.
Imagine you have three subjects, each with an associated number from 1 to 3:
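A sketch with scikit-learn's GroupKFold (sample values are arbitrary; the groups array marks which subject each sample belongs to):

import numpy as np
from sklearn.model_selection import GroupKFold

X = [0.1, 0.2, 2.2, 2.4, 2.3, 4.55, 5.8, 8.8, 9, 10]
y = ["a", "b", "b", "b", "c", "c", "c", "d", "d", "d"]
groups = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
gkf = GroupKFold(n_splits=3)
for train_index, test_index in gkf.split(X, y, groups=groups):
    print("train: %s  test: %s" % (train_index, test_index))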
Each subject is in a different testing fold, and the same subject is never in both testing and training. Notice that the folds do not have exactly the same size due to the imbalance in the data.
Here is a visualization of the cross-validation behavior in Fig. 4.8 (Group K-Fold).
2. Leave One Group Out
LeaveOneGroupOut is a cross-validation scheme which holds out the samples according to a third-party provided array of integer groups. This group information can be used to encode arbitrary domain-specific pre-defined cross-validation folds.
Each training set is thus constituted by all the samples except the ones related to a specific group.
For example, in the case of multiple experiments, LeaveOneGroupOut can be used to create a cross-validation based on the different experiments: we create a training set using the samples of all the experiments except one:
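A sketch with scikit-learn's LeaveOneGroupOut, where the groups array encodes the experiment each sample came from:

from sklearn.model_selection import LeaveOneGroupOut

X = [1, 5, 10, 50, 60, 70, 80]
y = [0, 1, 1, 2, 2, 2, 2]
groups = [1, 1, 2, 2, 3, 3, 3]      # three experiments
logo = LeaveOneGroupOut()
for train_index, test_index in logo.split(X, y, groups=groups):
    print("train: %s  test: %s" % (train_index, test_index))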
3. Leave P Groups Out
LeavePGroupsOut is similar to LeaveOneGroupOut, but removes samples related to P groups for each training/test set. For example:
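A sketch with scikit-learn's LeavePGroupsOut:

import numpy as np
from sklearn.model_selection import LeavePGroupsOut

X = np.arange(6)
y = [1, 1, 1, 2, 2, 2]
groups = [1, 1, 2, 2, 3, 3]
lpgo = LeavePGroupsOut(n_groups=2)  # hold out every combination of 2 groups
for train_index, test_index in lpgo.split(X, y, groups=groups):
    print("train: %s  test: %s" % (train_index, test_index))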
4. Group Shuffle Split
The GroupShuffleSplit iterator behaves as a combination of ShuffleSplit and LeavePGroupsOut, and generates a sequence of randomized partitions in which a subset of groups is held out for each split.
Here is a usage example:
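A sketch with scikit-learn's GroupShuffleSplit (half of the four groups are held out in each of four random splits):

from sklearn.model_selection import GroupShuffleSplit

X = [0.1, 0.2, 2.2, 2.4, 2.3, 4.55, 5.8, 0.001]
y = ["a", "b", "b", "b", "c", "c", "c", "a"]
groups = [1, 1, 2, 2, 3, 3, 4, 4]
gss = GroupShuffleSplit(n_splits=4, test_size=0.5, random_state=0)
for train_index, test_index in gss.split(X, y, groups=groups):
    print("train: %s  test: %s" % (train_index, test_index))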
Here is a visualization of the cross-validation behavior in Fig. 4.9 (Group Shuffle Split).
4.3.2 Procedure of NIPALS
4.3.2.1 Inner Loop of the Iterative NIPALS Algorithm
This provides an alternative to svd(X'Y); it returns the first left and right singular vectors of X'Y. See PLS for the meaning of the parameters. It is similar to the power method for determining the eigenvectors and eigenvalues of X'Y.
4.3.2.2 Center X and Y
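The text gives no detail beyond the title of this step; a minimal sketch (our names) that centers both blocks and keeps the column means for later prediction on new data:

import numpy as np

def center_xy(X, Y):
    # Subtract column means; the means are needed to center new test data.
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    return X - x_mean, Y - y_mean, x_mean, y_mean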
4.3.2.3 NIPALS
This class implements the generic PLS algorithm; the constructor's parameters allow one to obtain specific implementations. This implementation uses Wold's two-block PLS algorithm, based on two nested loops:
(i) the outer loop iterates over components;
(ii) the inner loop estimates the weight vectors. This can be done with two algorithms: (a) the inner loop of the original NIPALS algorithm, or (b) an SVD of the residual cross-covariance matrices. A sketch combining these pieces follows.
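A sketch tying the pieces together (reusing the center_xy and nipals_inner_loop sketches above; regression-mode deflation of Y is assumed):

import numpy as np

def nipals_pls(X, Y, n_components):
    # Outer loop over components; inner loop via nipals_inner_loop (option a).
    Xk, Yk, x_mean, y_mean = center_xy(X, Y)
    W_list, P_list, Q_list = [], [], []
    for _ in range(n_components):
        w, c = nipals_inner_loop(Xk, Yk)      # weight vectors of this component
        t = np.dot(Xk, w)                     # X scores
        tt = np.dot(t.T, t)
        p = np.dot(Xk.T, t) / tt              # X loadings
        q = np.dot(Yk.T, t) / tt              # Y regression loadings
        Xk = Xk - np.dot(t, p.T)              # rank-one deflation of X
        Yk = Yk - np.dot(t, q.T)              # deflation of Y (regression mode)
        W_list.append(w); P_list.append(p); Q_list.append(q)
    W = np.hstack(W_list); P = np.hstack(P_list); Q = np.hstack(Q_list)
    # Coefficients mapping centered X to centered Y: B = W (P'W)^-1 Q'
    B = np.dot(np.dot(W, np.linalg.inv(np.dot(P.T, W))), Q.T)
    return B, x_mean, y_mean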
4.4 Example Application
4.4.1 Demo of PLS
The software environment is Python 2.7 on a Microsoft Windows 7 operating system. Cross-validation and the train-test split are performed using the sklearn package. Dataset loading is done using the scipy package, and the other programs can be implemented by individuals.
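An end-to-end sketch of such a demo; the file name 'corn.mat' and the variable keys 'spectra' and 'moisture' are hypothetical placeholders for whatever layout the local copy of the dataset uses:

import numpy as np
import scipy.io as sio
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import train_test_split, cross_val_predict

data = sio.loadmat('corn.mat')                 # hypothetical file name
X = data['spectra']                            # hypothetical variable keys
y = data['moisture'].ravel()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
pls = PLSRegression(n_components=10)           # component count set by 10-fold CV
pls.fit(X_train, y_train)
y_cv = cross_val_predict(pls, X_train, y_train, cv=10)
rmsecv = np.sqrt(np.mean((y_train - y_cv.ravel()) ** 2))
rmsep = np.sqrt(np.mean((y_test - pls.predict(X_test).ravel()) ** 2))
print("RMSECV: %.4f  RMSEP: %.4f" % (rmsecv, rmsep))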
4.4.2 Corn Dataset
In this section the corn dataset was used for experiments. The number of latent variables of PLS is allowed to take values in the range [1, 15], and it is determined by 10-fold cross-validation. No pre-processing methods were used other than mean-centering. Table 4.1 shows the training error, cross-validation error, prediction error, and principal component number of the PLS model for moisture, oil, protein, and starch content, directly using the corn dataset.
In this paper, the principal component number of the PLS algorithm is selected by the 10-fold cross-validation method. The RMSECV of the PLS models is given in Fig. 4.10 to Fig. 4.12, respectively.
Fig. 4.10 The selection process of the optimal latent variable number for the PLS model on the m5spec instrument
Fig. 4.11 The selection process of the optimal latent variable number for the PLS model on the mp5spec instrument
Fig. 4.12 The selection process of the optimal latent variable number for the PLS model on the mp6spec instrument