一種簡化的Turbo碼并行譯碼方法_第1頁
一種簡化的Turbo碼并行譯碼方法_第2頁
一種簡化的Turbo碼并行譯碼方法_第3頁
一種簡化的Turbo碼并行譯碼方法_第4頁
一種簡化的Turbo碼并行譯碼方法_第5頁
已閱讀5頁,還剩2頁未讀 繼續免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內容提供方,若內容存在侵權,請進行舉報或認領

文檔簡介

-

PAGE

1

-

ASimplifiedParallelDecodingSchemeforTurboCodes

WangLe,HuangYu,YangHongwen

WirelessCommunicationCenter,BeijingUniversityofPostsandTelecommunications,Beijing,China(100876)

Abstract

ParalleldecodingofTurbocodesisvitaltotheapplicationswithveryhighdatarates.Intheoverlappingparalleldecodingscheme,thedecodingthroughputcannotincreaselinearlywiththenumberofsegments.Aschemewherethethroughputcanincreaselinearlywasproposed,wheretheboundaryinformationofeachiterationisstoredtobetheinitialconditioninnextiteration.Inthispaper,wehavesimplifiedtheschemebyonlystoringtheindexofthemostprobablestateoftheboundary.Withthecostofsomesmallperformanceloss,theschemecanreducelargelytheextramemoryrequiredtostoretheboundaryinformation.

Keywords: Turbocodes;MAPalgorithm;paralleldecoding

Introduction

Withtheever-increasingdemandofthedatarate,turbocodeisfacingstringentchallenges.ITUsuggeststhatthedatarateof4Gsystembeatleast100Mbps,inmobileconditions.Asthelocaloperatingclockfrequencyinhardwaresystemistypicallywithin100MHznowadays,itisreported[1]thatthethroughputofonly6.17Mbpscanbeachievedforturbocodeswithconventionalserialdecoding[2].Obviously,thedecodingalgorithmofturbocodehastobeimprovedsignificantlytosurviveinthefuturecommunicationsystems.

Thebottleneckofthethroughputliesinitsdecodingscheme.Theconvolutionalcodesistypicallyemployedastheconstituentcodes,leadingtothedecodingalgorithmemployssomerecursiveschemesbasedontrellis,suchassymbol-by-symbolMaximumAPosterioriprobability(MAP)algorithm[3].Itrequireslotsofrecursivecomputationsinbothcomponentdecoders.Moreover,thedecodingprocedureisiterative.Hence,thedecodingdelayisquitelarge.

Toreducethedecodingdelayandimprovethedecodingthroughput,theparalleldecodingmethodwasproposed[4]bydividingthewholeframeintomultiplesub-blocksanddecodingthesegmentswithabankofidenticalsub-blockMAPprocessors.Withtheparalleldecodingscheme,thethroughputcanbeimprovedbyafactorofaboutK,thenumberofsub-blockssegmented.Underthesameconditionsasmentionedpreviously,thethroughputof100MbpscanbeachievedwithK=16.

However,asaresultofthesegmentation,eachsub-blocksuffersthelostofboundaryinformation(initialconditions),whichisvitalfortheperformance.Togettheboundaryinformation,amethodwithsegmentsoverlappingisproposed[4],bycomputingsomemoreredundanttrellisstages.However,suchoverlappingsufferssomeextracomputationaldelay,andleadstolowerdecodingspeed.TheactualspeedupfactorissmallerthanK,especiallywhenKislargeandtheframelengthissmall.Anotherwaytogettheboundarydistributioninformationwasproposedin[5].Itsuggeststousetheforwardandbackwardvariablescomputedinthepreviousiterationtoprovideanapproximationtotheboundary.Simulationresultsshowthatitsperformanceisveryclosetotheserialdecodingschemein[2].Thecostofthismethodissomeadditionalmemoryneededtostoretheintermediateboundaryinformation.

Toreducethesizeoftheadditionalmemoryin[5],wesuggestthatinsteadofthewholeboundarydistributioninformation,onlytheindexofthemostprobablestatebestored.Withthissimplifiedstoragescheme,wecansavealotofmemoryatthecostofsomeacceptableperformanceloss.

Therestofthepaperisorganizedasfollows.First,section2givesabriefdescriptionofthestandardserialdecodingtoshowitsweakpointindecodingspeed.Section3showsthetwoparalleldecodingmethodsproposedin[4-5].Ourproposedschemeisgiveninsection4andsection5isthesimulationresults.Finally,section6concludesthepaper.

SerialdecodingschemeofTurbocodes

TurbocodesintroducedbyBerrouetal[2]haveanear-Shannoncapability,thusbeingveryattractivetovariousapplications.However,thelargedecodingdelayprohibitsitsapplications,especiallyforthosetimeconstraintorhighdatarateservices.

ThedecoderofTurbocodesemploysaseriallyconcatenatedstructure,whichconsistsoftwocomponentdecoders,asshowninFig.1.Thedecodingproceedsinaniterativemanner.Ineachiteration,thefirstcomponentdecoder,correspondingtothefirstcomponentencoder,deliversitscalculatedextrinsicinformation(Le1)tothe2ndcomponentdecoderafterinterleavingtoserveastheapriori

information(La2).Thenthesecondcomponentdecoderfeedbacksitsextrinsicinformation(Le2)tothefirstoneafterde-interleaving,workingasaprioriinformation(La1).xandxinFig.1are

receivedinformationbitsanditsinterleavedversion,respectively.The

y1and

y2arereceived

redundantbitsproducedbythecomponentRecursiveSystematicConvolutional(RSC)encoders,respectively.

La1x

y1

Le1

La2

x

y2

Le2

LLRs

Figure1.DecoderofTurboCodes

ThefunctionofthecomponentdecoderistocalculatetheLLRs(LogarithmofLikelihoodRatio)fortheinformationbits.ThecalculationofLLRsgenerallyemploystheMAPalgorithm[3]wheretheLLRsfortheithinformationbitofalengthNframecanbeexpressedas

i(s)i(s)i(s,1)

LLRiLog

1

i(s)i(s)i(s,0)

0

(1)

wherethesummationinnumeratoranddenominatorisoverthesetofbranchesintrellisdiagram,inwhichtheinformationbitis1or0,respectively.NotethatwearenotgoingtoexplainthedetailsoftheMAPalgorithmoritsvariants;rather,wesimplyshowtherecursivenatureinthealgorithmsthat

prohibitthedecodingspeed.Therearethreekindsofvariablesin(1).

i(s)

and

i(s)

denotethe

forwardandbackwardvariablesintheithtrellisstage,

is,d

isthebranchmetriccorresponding

thebranchwhichstartsfromstatesandtheinformationbitisd.

Thei

in(1)canbedirectlycalculatedwithchannelinputsandtheaprioriinputswithinoneclock

cycle,providedthatalltheinputsnecessaryarereadytouse.Itistherequiresrecursivecalculationasshownin(2)and(3)forbinaryRSC:

i(s)

and

i(s)

which

i(s)i1(s1)i(s1,1)i1(s0)i(s0,0),

i1,2...N1

(2)

1

1

0

0

i(s)i1(s)i1(s,1)i1(s)i1(s,0),

iN1,N2...1

(3)

wheresd

isthepreviousstatecomingtostateswiththeinputinformationbitsasd.

isthe

s

d

arrivingstateviathecurrentstateswithinputbitasd.Theinitialconditionfortherecursivecalculationis

0

0

(s)1

(s)1

s0

else

s0

(4)

(5)

0

N else

Notethattheinitialcondition(5)onlyappliesfortheRSCwhichisterminated[6].IfRSCisopen[2],theinitialconditionshouldbe

1

N(s)2m, s

(6)

where2misthenumberofthetrellisstates.ThewholeprocessisshowninFig.2.

LLRs

Figure2.theoperationflowchartforMAPalgorithm

Apparently,tocalculatei,weshouldcalculate

i1

first.Andtocalculate

N1,wehavetowait

forallpastibeenobtainedandthesituationissimilartoi.Thus,theprocessingforone

componentdecoderinoneiterationcannotbefinishedwithinNclockcycles(someextraclock

cyclesmayberequiredtoexecutesomeotheroperations).Iftheclockfrequencyis

fc,thenthedelay

forprocessingoftheMAPismorethan

Tmap

N

f

.Supposethatthemaximumiterationis

c

Imax,then

thedelayfordecodingonecodewordofturbocodeismorethanT

2T I

2NI

,leadsto

dec map

max

max

c

ainformationthroughputlessthan

RNTdec

fc

2Imax

f

.Forexample,when

fc100MHz

and

Imax8,thethroughputcannotbelargerthan6.25Mbps.Notethatwithsomeearly-stopiterationstrategy[7],theactualnumberofiterationstosuccessfullydecodeonecodewordmaybemuchless

than

Imax,butthehardwarehavetobedesignedwith

Imax.

Paralleldecodingmethods

Toreducethedecodingdelayandimproveitsthroughput,wecandividethewholecodewordintomultiplesub-blocksanddecodethemwithmultipleMAPprocessorsinparallel.However,thesegmentsinthemiddlehavenoinitialcondition[4-5].Withrandomorarbitraryinitializationas(6)willleadtoaperformancelossthatisunacceptable.Sodirectlyworkingateachsegmentfacesaproblemofhowto

initializetheforwardandbackwardvariablesattheboundaryofeachsegment.Asshownin(2),

ks

dependsonallpast

js,jk.Notethatthefarawaythejleavesk,thelesseffect

will

j(s)

haveon

k(s).Thus,tohaveagoodaccuracyforthe

k(s)

atthebeginningofeach

sub-block,weshouldstarttheprocessingof(2)earlieratjkL

withLlargeenough[4].This

isshowninFig.3.Foracertainsub-blockoflengthMlocatingfrompositionktokM1,we

startthecalculationfromkL

with

kLs

beingarbitrarilyinitializedas(6).Theposition

kL

tok1

belongstotheprevioussub-block,hencetheparallelprocessersarepartially

overlapped.Thecalculationof

k(s)

isexactlythesame.Tohaveagoodaccuracyforthe

kM1(s)

attheright-handbeginningpointofthesegment,weshouldstartfromjkM1L.

overlappingarea

kM(s)

kL(s)

k1(s)

k(s)

kM1(s)

Figure3.theoverlappingmethodin[4]

Therefore,toproducealltheforwardvariables

i(s)

inonecomponentdecoderwithinoneiteration,

K

thewholeprocessingofthecodewordwithKsegmentsandLoverlappingrequiresNL

clock

cyclesinsteadofN cycles,andthedecodingspeedwillbesloweddownbyafactorof N .

K NLK

LargertheLandKis,slowerthedecodingspeedwillbe.Forinstance,assumingthattheblocklengthis2298[8],ifwesegmentitinto50sub-blocks,andtheoverlappinglengthis30,thenitrequires76clockcyclestoproducealltheforwardvariables,insteadof46cycles.Hence,theMAPdecodingspeedcanonlybeimprovedby30times,farlessthanK50,whichisoriginallysupposedtobe.

Tosolvesuchproblem,amethodwithstoragefeaturewasproposedin[5].Insteadofoverlappingtechnique[4],itutilizestheforwardandbackwardvariablescomputedinthepreviousiterationtogettheboundaryinformationforeachsub-block.Forexample,forthep+1thiterationandthesame

sub-blocklocatingfromkto

kM1,thecalculationof(2)willstartatkwithinitialcondition

k1

p1s

beapproximatedwith

ps,thefinalresultsinlastiteration.ThisisshowninFig.4.

(s)

(p)

k1

k1

Thebackwardvariableisprocessedinthesameway.

(p)(s)

kM

(p)(s)

k1

(p1)(s)

k

(p1)(s)

kM1

Figure4.themethodin[5]

Comparedwiththeoverlappingmethod,suchmethodrequiresnoredundantcomputations.Hence,thedecodingspeedcanincreaselinearlywiththenumberofsegments.However,itrequiresanextra

memoryofsize2m(K1)v

tostorethefinalresultsof

ps

inlastiteration,wherevisthe

k1

k1

numberofbitsusedtoquantizetheps.

Proposedmethod

k1

Forsomeapplicationssuchassmallmobileterminals,itispreferredthatthememoryrequiredshouldbeassmallaspossible.Tokeepthefeatureofthemethodsin[5]butsavethememory,inthispaperweproposedtostoretheindexofthemostprobablestateinsteadofalltherealvaluesofintermediate

boundary(e.g.

(p)(s)inFig.4),computedinthepreviousiteration.Withthestatenumberof2m,

weonlyneedtostorembits,ratherthan2mvbits.Thismaybealargesavingifmandvislarge.

Atp+1thiteration,forthesamesub-blocklocatingfromkto

kM1,thecalculationof(2)will

startatkandtheinitialconditionissetas(inlogarithmdomain)

logp1s 0

ss*

(7)

k1

p1

else

whereisapredeterminedparameterwhichonlydependsonthenumberofiterationsandcanbe

k1

optimizedviasimulation,s*argmaxlogpsistheindexofthemostprobablestateinthe

s

previousiteration.Withthismethod,theonlythingweneedtostoredfornextiterationisrequiresmbits.

Simulationresults

s*,which

Inthissectionwewillshowtheperformanceoftheproposedmethod.Thesimulationadoptstheturbocodedefinedincdma2000[9],theinterleaverlengthischosenasN=2298,themaximumiteration

numberis

Imax8.Thenumberofsub-blockis

K20

implyingthatthehardwareMAPprocesser

consistsof20parallelprocessorsandeachprocesses

2298/20115

bits.Bothlogmapand

max-logmapdecodingalgorithmsareconsideredandFER(frameerrorrate)andBER(biterrorrate)areobserved.

Figure5.FERperformanceofcoderate=1/3

Fig.5includesfourFERperformancecurvesforcoderate1/3,usingmax-logmaporlogmapdecodingalgorithmrespectively.AtFER=0.01,formax-logmapalgorithm,theperformancelossoftheproposedmethodisabout0.07dB,comparedwiththemethodin[5].Andforlogmap,theperformancelossatFER=0.01isabout0.1dB.

Figure6.FERperformanceofcoderate=3/4

InFig.6,wecanseethatforcoderate3/4,theperformancelossislargerthanthatofcoderate1/3.AtFFR=0.01,theperformancelossisabout0.3dBforbothlogmapandmax-logmapdecodingalgorithm.WehavealsothesimulatedBERperformancewhicharenotshownhereduetolimitedspace.TheconclusionforBERperformanceisroughlythesame.

Conclusion

Inthispaper,wesimplifiedthemethodin[5],byonlystoringtheindexofthemostprobablestate,insteadofallvaluesofintermediateboundary.Theproposedschemecanreducethesizeofmemorylargely,withsomeacceptableFERorBERperformanceloss.

References

“Throughput,latencyandcomplexityestimationfortheturbodecoder,”3GPPTSGRANWG1#46,R1-062155,

Tallinn,Estonia,Aug

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網頁內容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經權益所有人同意不得將文件中的內容挪作商業或盈利用途。
  • 5. 人人文庫網僅提供信息存儲空間,僅對用戶上傳內容的表現方式做保護處理,對用戶上傳分享的文檔內容本身不做任何修改或編輯,并不能對任何下載內容負責。
  • 6. 下載文件中如有侵權或不適當內容,請與我們聯系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論