Chapter 05: Scalar Processors (eng3 slides)


2023/7/23 Yu Lasheng (余臘生). All rights reserved; violators will be prosecuted.

5-1 Multiple Instruction Issue
- We have attempted to limit stalls from hazards to lower the average CPI to the ideal CPI of 1.
- Can we decrease CPI to under 1? How? Issue and execute more than one instruction at a time.
- Multiple-issue processors come in two kinds:
  - Superscalars use static and/or dynamic scheduling mechanisms and multiple functional units to issue more than one instruction at a time.
  - VLIW (very long instruction word) processors use instructions that are themselves multiple instructions, scheduled by a compiler. All instructions in the long word are executed in parallel; this requires software (compiler) support.

5-2 Superscalar
- Hardware issues from 1 to 8 instructions per clock cycle.
  - These instructions must be independent and satisfy other constraints.
  - Avoid structural hazards: use different functional units, and allow at most 1 memory reference among the instructions issued together.
- Scheduling of instructions can be done statically by a compiler or dynamically by hardware.
- While a superscalar can issue any combination of instructions, for simplicity we will concentrate on a 2-instruction superscalar for MIPS, where one instruction is an integer operation and the other, if available, is a floating-point operation.
- This simplification reduces the complexity of the hardware, but also reduces the usefulness of the superscalar.

5-3 Basic structure of a superscalar processor
- If the number of instructions a processor can run at the same time is defined as its instruction-level parallelism (ILP), then a k-stage pipeline has an ILP of k.
- If a superscalar processor contains n such pipelines, its ILP is nk.
[Figure 2-26: Organization of a typical superscalar processor (integer register file, floating-point register file, memory)]

5-4 Single issue and multiple issue
- The process by which a processor obtains instructions from the instruction storage unit (or instruction dispatch unit) is called "issue".
- A processor that can take out only one instruction for execution in a single clock cycle is called a single-issue processor.
- A processor that can obtain several instructions at once within one clock cycle is called a multiple-issue processor.
[Figure 2-28: Comparison of single-issue and multiple-issue operation, (a) single issue, (b) multiple issue. Each instruction passes through the stages IF, ID, EX, WR.]

5-5 Superscalar pipelined processors: issue policies of a superscalar pipeline
As already pointed out, three factors limit instruction-level parallelism:
1. structural dependences, that is, resource conflicts;
2. control dependences;
3. data dependences: WR (write-read, i.e. read-after-write, RAW), RW (read-write, i.e. write-after-read, WAR), and WW (write-write, i.e. write-after-write, WAW).
In a superscalar pipeline these dependences make the problem more complicated. The scheduling of a superscalar pipeline, meaning its instruction issue and completion policy, is therefore very important for exploiting instruction-level parallelism fully and for raising the performance of a superscalar processor.

An instruction issue policy covers two things: the order in which instructions are fetched, and the order in which the fetched instructions are executed.

5-6 Superscalar pipelined processors
- Instruction issue is the process of starting an instruction into the execute stage. The instruction issue policy is the protocol or set of rules used for issuing instructions.
- When instructions are issued in program order, this is called in-order issue.
- To improve pipeline performance, instructions that have dependences can be issued later and subsequent independent instructions issued earlier, so that instructions are not issued in the original program order. This is called out-of-order issue.
- Similarly, instruction completion is either in-order or out-of-order. In general, out-of-order issue always leads to out-of-order completion.

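5-7 Superscalar pipelined processors
A superscalar pipeline has three scheduling policies:
1. in-order issue with in-order completion;
2. in-order issue with out-of-order completion;
3. out-of-order issue with out-of-order completion.
Whichever policy is used, the final result of the program must be correct.

5-8 Superscalar pipelined processors
- Assume a superscalar pipeline with a degree of parallelism of 2, structured as shown in Figure 7(a). It has four stages: fetch (F), decode (D), execute (E), and write-back (W).
- The F, D, and W stages each complete in 1 clock cycle.
- The E stage has several functional units: the LOAD/STORE unit completes a D-cache access in 1 clock cycle, the adder completes an addition in 2 clock cycles, and the multiplier completes a multiplication in 3 clock cycles. The adder and the multiplier are both pipelined.
- The F and D stages require instructions to enter in pairs.
- The E stage has internal data forwarding, so a result can be used as soon as it is produced.

5-9 Superscalar pipelined processors
The program used consists of the following sequence of 6 instructions:

    I1: LOAD R1, M(A)    ; R1 ← M(A)
    I2: ADD  R2, R1      ; R2 ← (R2) + (R1)
    I3: ADD  R3, R4      ; R3 ← (R3) + (R4)
    I4: MUL  R4, R5      ; R4 ← (R4) × (R5)
    I5: LOAD R6, M(B)    ; R6 ← M(B)
    I6: MUL  R6, R7      ; R6 ← (R6) × (R7)

In this sequence, I1 and I2 have a WR (RAW) dependence, I3 and I4 have an RW (WAR) dependence, and I5 and I6 have both a WW (WAW) dependence and a WR (RAW) dependence.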
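The dependences listed for the six-instruction program can be checked mechanically. The following is a minimal sketch, not part of the original slides: the tuple encoding of each instruction and the function names are assumptions made for illustration, only register operands are compared (the memory operands M(A) and M(B) are ignored), and the pairwise check does not account for intervening redefinitions of a register.

```python
from itertools import combinations

# (label, destination register, source registers), transcribed from the listing.
# This encoding is an assumption of the sketch, not a format used by the slides.
PROGRAM = [
    ("I1", "R1", ()),            # LOAD R1, M(A)
    ("I2", "R2", ("R2", "R1")),  # ADD  R2, R1
    ("I3", "R3", ("R3", "R4")),  # ADD  R3, R4
    ("I4", "R4", ("R4", "R5")),  # MUL  R4, R5
    ("I5", "R6", ()),            # LOAD R6, M(B)
    ("I6", "R6", ("R6", "R7")),  # MUL  R6, R7
]


def dependences(earlier, later):
    """Return the set of register dependences of `later` on `earlier`."""
    _, e_dest, e_srcs = earlier
    _, l_dest, l_srcs = later
    found = set()
    if e_dest in l_srcs:
        found.add("WR")  # later reads what earlier writes (RAW)
    if l_dest in e_srcs:
        found.add("RW")  # later writes what earlier reads (WAR)
    if l_dest == e_dest:
        found.add("WW")  # both write the same register (WAW)
    return found


def all_dependences(program):
    """Map each dependent (earlier, later) label pair to its dependence kinds."""
    result = {}
    for earlier, later in combinations(program, 2):
        kinds = dependences(earlier, later)
        if kinds:
            result[(earlier[0], later[0])] = kinds
    return result


if __name__ == "__main__":
    for pair, kinds in all_dependences(PROGRAM).items():
        print(pair, sorted(kinds))
```

Run on the listing, the sketch reports exactly the three dependent pairs named in the text: (I1, I2) with WR, (I3, I4) with RW, and (I5, I6) with WR and WW.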

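5-10 Superscalar pipelined processors
1. In-order issue
Figure 7(b) shows how the decode, execute, and write-back stages advance under in-order issue with in-order completion, and Figure 7(c) gives the pipeline space-time diagram.

5-11 Superscalar pipelined processors
Instruction I5 is independent of I3 and I4. If its write-back is not postponed but performed in clock 7, the semantics of the program are still correct. Implemented this way, I5 completes before I4; this is in-order issue with out-of-order completion, and its pipeline space-time diagram is shown in Figure 8. Although the total completion time is still 10 clock cycles, the held-back I5 entry of Figure 7(b) disappears and the utilization of the LOAD/STORE unit improves.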

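5-12 Superscalar pipelined processors
2. Out-of-order issue
- With in-order issue, the decode stage only checks arriving instructions for resource conflicts and data dependences. If there is none, the instruction is issued in order; otherwise it stays in the decode stage until the conflict or dependence disappears, as I2 does in Figure 7(b).
- If the processor has look-ahead capability, it can find later instructions that are independent of the instructions already in the pipeline. These should be decoded and executed early so that the multiple pipelines of the superscalar are fully used. This is the purpose of out-of-order issue.

5-13 Superscalar pipelined processors
2. Out-of-order issue
- To implement out-of-order issue, a close link must be set up between the decode stage and the execute stage of the pipeline. A common method is the instruction window, which is essentially a buffer.
- After the processor decodes an instruction, it places it in the instruction window. As long as the buffer is not full, the processor keeps fetching and decoding subsequent instructions.
- Instructions are issued from the instruction window to the execute stage. An instruction can be issued, regardless of the order in which it was fetched or decoded, as soon as two conditions hold:
  1. the functional unit the instruction needs is available;
  2. no dependence blocks the execution of the instruction.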
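The two issue conditions for an instruction window can be illustrated with a small selection routine. This is a simplified sketch under stated assumptions, not the mechanism of any particular processor: the `Instr` record, the unit names, the `width` parameter, and the policy of scanning the window oldest-first are all hypothetical, and the six instructions of the earlier example are assumed to be in the window already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    unit: str     # class of functional unit the instruction needs
    dest: str     # destination register
    srcs: tuple   # source registers


def select_issuable(window, free_units, busy_regs=(), width=2):
    """Pick up to `width` instructions from the window, scanning oldest first.

    Condition 1: a functional unit of the required class is free.
    Condition 2: no WR/WW/RW dependence on an older instruction that is
    still executing or still waiting in the window.
    """
    free = dict(free_units)     # copy so the caller's dict is not modified
    blocked = set(busy_regs)    # registers whose new value is not ready yet
    reads_pending = set()       # registers still to be read by waiting instrs
    issued = []
    for ins in window:
        unit_ok = free.get(ins.unit, 0) > 0
        wr = any(s in blocked for s in ins.srcs)
        ww = ins.dest in blocked
        rw = ins.dest in reads_pending
        if len(issued) < width and unit_ok and not (wr or ww or rw):
            issued.append(ins)
            free[ins.unit] -= 1
            blocked.add(ins.dest)
        else:
            # The instruction waits: younger instructions must not overwrite
            # its sources or use its not-yet-produced result.
            reads_pending.update(ins.srcs)
            blocked.add(ins.dest)
    return issued


WINDOW = [
    Instr("I1", "load_store", "R1", ()),
    Instr("I2", "adder", "R2", ("R2", "R1")),
    Instr("I3", "adder", "R3", ("R3", "R4")),
    Instr("I4", "multiplier", "R4", ("R4", "R5")),
    Instr("I5", "load_store", "R6", ()),
    Instr("I6", "multiplier", "R6", ("R6", "R7")),
]

if __name__ == "__main__":
    units = {"load_store": 1, "adder": 1, "multiplier": 1}
    print([i.name for i in select_issuable(WINDOW, units)])
```

With one unit of each class free and an issue width of 2, the sketch selects I1 and I3: I2 is skipped because it needs R1 from I1, and I3 is independent and may overtake it. The cycle numbers in the slides' figures depend on details (paired fetch and decode, unit latencies) that this sketch does not model.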

5-14 Superscalar pipelined processors
2. Out-of-order issue
- The superscalar pipeline model that uses an instruction window is shown in Figure 9(a). Note that the instruction window is only a buffering mechanism between the decode stage and the execute stage; it is not a separate pipeline stage.
- Under out-of-order issue, the progress of the 6 instructions of the earlier program through the pipeline and the pipeline space-time diagram are shown in Figures 9(b) and 9(c).

5-15 Superscalar Problems
- We must now expand the potential problems that arise with a superscalar pipeline over an ordinary pipeline:
  - RAW hazards could exist between the two instructions issued at the same time.
  - There are new potential WAW and WAR hazards.
  - We need twice as many register reads and writes as before; our register file must be expanded to accommodate this.
  - Loads and stores are integer operations even if they deal with floating-point registers: we might be reading floating-point registers for an FP operation and also reading/writing floating-point registers for an FP load or store.
  - Maintaining precise exceptions is difficult because an integer operation may have already completed.
- Hardware must detect these problems (and quickly).

5-16 Cost of a Superscalar
- We already had the multiple functional units, so there is no added cost in terms of having an integer and an FP instruction issue and execute in parallel.
- There are added costs, though, for:
  - Hazard detection: the complexity is increased because instructions must now be compared not only to instructions further down the pipeline but also to the instruction at the same stage, plus there is a potential for twice as many instructions being active at one time.
  - Maintaining precise exceptions.
  - Two sets of buses:
    - integer operations: from integer registers to the integer ALU and data cache
    - FP operations: from FP registers to the FP functional unit and data cache
  - The ability to access the floating-point register file by up to 3 instructions during the same cycle (an FP load or store in the ID or WB stage, an FP instruction in ID, and an FP instruction in WB).

5-17 Hardware-Based Speculation
- In issuing multiple instructions per cycle, branch prediction may not be accurate enough to maintain a reasonable issue rate. A high-issue processor may need to execute a branch every clock cycle.
- To exploit further performance, we now look at hardware to promote speculative instruction issue.
  - Hardware will predict the next instruction and issue it before determining the branch result.
  - If the prediction is wrong, the instruction must be killed off before it can effect a change to the machine's state: it cannot update registers or memory.
- We add a new buffer called the reorder buffer.
  - This buffer stores the results of completed instructions that were speculated, until the speculation is proven true or false.
  - If true, we can allow the instruction's results to be written to registers/memory.
  - If false, we must remove it and all instructions that followed it, since they were speculated incorrectly.
- We add a new state to instruction execution, called commit, to our Tomasulo-based superscalar architecture.
  - Should the result be stored in the destination register?
  - This becomes the final step for all instructions.

5-18 The New Architecture
Will combine:
- the Tomasulo-based approach of reservation stations for dynamic scheduling
- a multi-issue superscalar
- a separately controlled integrated fetch unit, which will speculate on control dependences
- a reorder buffer to temporarily store results before they are moved to registers

5-19 Steps for Hardware
We must enhance our control hardware from Tomasulo's approach to include:
- An instruction cannot issue if the reorder buffer is full.
- Upon issue, update the register status to include the reorder buffer entry number, and enter the reorder buffer entry number into the destination field of the reservation station; use this value to rename registers if needed.
- Execution remains the same, although loads and stores are now handled by a separate memory control unit.
- Write result remains the same, except that values are not written to registers here but are forwarded via the CDB.
- In each cycle, commit the instruction at the front of the reorder buffer if it has reached the write-result stage and the speculation for the instruction was correct.
- Otherwise, if the speculation for the instruction was wrong, flush the instruction and all others in the reorder buffer until you reach the first instruction fetched after the branch condition was determined.

5-20 Example
- Here we take a brief look at another example of speculation. The code is given below.
- Assume there are separate integer units for effective address calculation, ALU operations, and branch condition evaluation.
- Notice that there are no FP operations here, so all instructions should execute in 1 cycle.
- We will look at the cycles at which each instruction issues, executes, and writes to the CDB without speculation, and issues, executes, writes, and commits with speculation.

    Loop: LD     R2,0(R1)
          DADDIU R2,R2,#1
          SD     R2,0(R1)
          DADDIU R1,R1,#4
          BNE    R2,R3,Loop

5-21 Without Speculation

    Iter  Instruction        Issue  Execute  MemAcc  CDB  Comments
    1     LD     R2,0(R1)    1      2        3       4    First issue
    1     DADDIU R2,R2,#1    1      5        -       6    Wait for LD
    1     SD     R2,0(R1)    2      3        7       -    Wait for add
    1     DADDIU R1,R1,#4    2      3        -       4    Execute directly
    1     BNE    R2,R3,Loop  3      7        -       -    Wait for add
    2     LD     R2,0(R1)    4      8        9       10   Wait for BNE
    2     DADDIU R2,R2,#1    4      11       -       12   Wait for LD
    2     SD     R2,0(R1)    5      9        13      -    Wait for add
    2     DADDIU R1,R1,#4    5      8        -       9    Wait for 1st BNE
    2     BNE    R2,R3,Loop  6      13       -       -    Wait for add
    3     LD     R2,0(R1)    7      14       15      16   Wait for 2nd BNE
    3     DADDIU R2,R2,#1    7      17       -       18   Wait for LD
    3     SD     R2,0(R1)    8      15       19      -    Wait for add
    3     DADDIU R1,R1,#4    8      14       -       15   Wait for 2nd BNE
    3     BNE    R2,R3,Loop  9      19       -       -    Wait for add

5-22 With Speculation

    Iter  Instruction        Issue  Exec  MemAcc  CDB  Commit  Comments
    1     LD     R2,0(R1)    1      2     3       4    5       First issue
    1     DADDIU R2,R2,#1    1      5     -       6    7       Wait for LD
    1     SD     R2,0(R1)    2      3     -       -    7       Wait for add
    1     DADDIU R1,R1,#4    2      3     -       4    8       Commit in order
    1     BNE    R2,R3,Loop  3      7     -       -    8       Wait for add
    2     LD     R2,0(R1)    4      5     6       7    9       No delay
    2     DADDIU R2,R2,#1    4      8     -       9    10      Wait for LD
    2     SD     R2,0(R1)    5      6     -       -    10      Wait for add
    2     DADDIU R1,R1,#4    5      6     -       7    11      Commit in order
    2     BNE    R2,R3,Loop  6      10    -       -    11      Wait for add
    3     LD     R2,0(R1)    7      8     9       10   12      No delay
    3     DADDIU R2,R2,#1    7      11    -       12   13      Wait for LD
    3     SD     R2,0(R1)    8      9     -       -    13      Wait for add
    3     DADDIU R1,R1,#4    8      9     -       10   14      Commit in order
    3     BNE    R2,R3,Loop  9      13    -       -    14      Wait for add

5-23 Design Issues
- Reorder buffer vs. more registers
  - We could forego the reorder buffer by providing additional temporary storage; in essence, the two are the same solution, just a slightly different implementation.
  - Both require a good deal more memory than we needed with an ordinary pipeline, but both improve performance greatly.
- How much should we speculate?
  - Other factors cause our multiple-issue superscalar to slow down (cache issues or exceptions, for instance), so a large amount of speculation is defeated by other hardware failings. We might try to speculate over a couple of branches, but not more.
- Speculating over multiple branches
  - Imagine our loop has a selection statement; now we speculate over two branches. Speculation over more than one branch greatly complicates matters and may not be worthwhile.

5-24 Limitations/Difficulties
- The inherent limitation of multiple issue is the limited amount of ILP in a program:
  - How many instructions are independent of each other?
  - How much distance is available between loading an operand and using it? Between using it and saving it?
- This is coupled with the multi-cycle latency of certain types of operations, which causes inconsistencies in how much can be issued simultaneously.
- Difficulties in building the underlying hardware:
  - Need multiple functional units (cost grows linearly with the number of units).
  - Need an increase (possibly very large) in memory and register-file bandwidth, which might take up significant space on the chip and may require larger system bus sizes, which turns into more pins.
  - The complexity of multiple fetches means a more complex memory system, possibly with independent banks for parallel accesses.

5-25 Limitations on Issue Size
- Ideally, we would like to issue as many independent instructions simultaneously as possible, but this is not practical because we would have to:
  - look arbitrarily far ahead to find an instruction to issue
  - rename all registers when needed to avoid WAR/WAW hazards
  - determine all register and memory dependences
  - predict all branches
  - provide enough functional units to ensure all ready instructions can be issued
- What is a possible maximum window size?
  - Determining register dependences over n instructions requires n^2 - n comparisons:
    - 2000 instructions: about 4,000,000 comparisons
    - 50 instructions: 2450 comparisons
  - Window sizes have ranged between 4 and 32, with some recent machines having sizes of 2-8.
  - A machine with a window size of 32 achieves about 1/5 of the ideal speedup for most benchmarks.

5-26 Other Effects
- With infinite registers, register renaming can eliminate all WAW and WAR hazards.
  - With Tomasulo's approach, the reservation stations offer virtual registers.
  - Most machines today have only a few virtual registers and perhaps 32 integer and 32 FP registers available.
  - Figure 3.41 shows the resulting issues per cycle for different numbers of registers.
  - Surprisingly, the number of registers does have a dramatic impact, and more than 32 registers are desirable.
- Aside from register renaming, we have name dependences on memory references. Three models of analysis are:
  - Global (perfect analysis of all global variables) and stack perfect (perfect analysis of all stack references): these offer some improvement, particularly in 2 benchmarks.
  - Inspection (examine accesses for interference at compile time)
  - None (assume all references conflict)
  - The last two have similar results, between 3 and 6 instructions per cycle.

5-27 Example Processors
- Let's compare three hypothetical processors and determine their MIPS rating for the gcc benchmark.
  - Processor 1: simple MIPS 2-issue superscalar pipeline with a clock rate of 1 GHz, a CPI of 1.0, and a cache system with 0.01 misses per instruction
  - Processor 2: deeply pipelined MIPS with a clock rate of 1.2 GHz, a CPI of 1.2, and a smaller cache yielding 0.015 misses per instruction
  - Processor 3: speculative superscalar with a 64-entry window that achieves 50% of its ideal issue rate, with a clock rate of 800 MHz and a small cache yielding 0.02 misses per instruction (although 10% of the miss penalty is not visible due to dynamic scheduling)
- Assume the memory access time (miss penalty) is 100 ns.

5-28 Solution
First, determine the CPI (including the impact of cache misses).
- Processor 1:
  - 1 GHz clock = 1 ns per clock cycle; memory access of 100 ns, so miss penalty = 100 / 1 = 100 cycles
  - cache penalty = 0.01 * 100 = 1.0 cycles per instruction
  - overall CPI = 1.0 + 1.0 = 2.0
- Processor 2:
  - 1.2 GHz clock = 0.83 ns per clock cycle
  - miss penalty = 100 / 0.83 = 120 cycles
  - cache penalty = 0.015 * 120 = 1.8 cycles per instruction
  - overall CPI = 1.2 + 1.8 = 3.0
- Processor 3:
  - 800 MHz clock = 1.25 ns per clock cycle
  - the miss penalty takes effect only 90% of the time, so miss penalty = 0.90 * 100 / 1.25 = 72 cycles
  - cache penalty = 0.02 * 72 = 1.44 cycles per instruction
  - overall CPI to be computed next

5-29 Solution Continued
- The CPI of processor 3 requires a bit more effort. Since we were not given the CPI, we have to compute it by considering the number of instruction issues per cycle.
- With a 64-entry window, the maximum number of instruction issues per cycle is 9. We are told that this processor averages 50% of its ideal rate, so this machine issues 4.5 instructions per cycle, giving it a processor CPI = 1 / 4.5 = 0.22.
- overall CPI = 0.22 + 1.44 = 1.66
- Now we can determine the MIPS rating for each:
  - Processor 1: 1 GHz / 2.0 = 500 MIPS
  - Processor 2: 1.2 GHz / 3.0 = 400 MIPS
  - Processor 3: 800 MHz / 1.66 = 482 MIPS
- The 2-issue processor (processor 1) is a good compromise between clock speed and issue rate, and yields the best performance.
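The MIPS ratings of the three hypothetical processors compared above can be re-derived with a few lines of arithmetic. The sketch below only restates the slide's own calculation; the function name `mips_rating` and its parameters are assumptions introduced for illustration.

```python
MISS_PENALTY_NS = 100.0  # memory access time (miss penalty) given in the example


def mips_rating(clock_ghz, base_cpi, misses_per_instr, visible_fraction=1.0):
    """Return (overall CPI, MIPS rating) for one processor.

    visible_fraction is the share of the miss penalty that is not hidden
    (0.9 for processor 3, whose dynamic scheduling hides 10% of it).
    """
    cycle_ns = 1.0 / clock_ghz
    penalty_cycles = visible_fraction * MISS_PENALTY_NS / cycle_ns
    overall_cpi = base_cpi + misses_per_instr * penalty_cycles
    mips = clock_ghz * 1000.0 / overall_cpi  # clock rate in MHz divided by CPI
    return overall_cpi, mips


p1 = mips_rating(1.0, 1.0, 0.01)
p2 = mips_rating(1.2, 1.2, 0.015)
# Processor 3 issues 50% of 9 = 4.5 instructions per cycle, so base CPI = 1/4.5.
p3 = mips_rating(0.8, 1 / 4.5, 0.02, visible_fraction=0.9)

if __name__ == "__main__":
    for name, (cpi, mips) in zip(("Processor 1", "Processor 2", "Processor 3"),
                                 (p1, p2, p3)):
        print(f"{name}: CPI = {cpi:.2f}, MIPS = {mips:.0f}")
```

The unrounded result for processor 3 is about 481 MIPS; the slide's 482 comes from rounding the CPI to 1.66 before dividing. Either way the ranking is the same, with processor 1 the fastest.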

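5-30 Superscalar pipelined processors: typical processor structure
Motorola's MC88110 microprocessor and Intel's Pentium microprocessor are both typical superscalar pipeline designs. The former is a RISC machine; the latter has characteristics of both CISC and RISC. Only the superscalar pipeline of the Pentium is described below.

5-31 Superscalar pipelined processors
- The Pentium can execute two instructions per clock cycle. Some of its instructions are implemented entirely in hardwired logic and complete in one clock cycle (a RISC characteristic); others are implemented with microinstructions and may need 2-3 clock cycles to execute (a CISC characteristic).
- Compared with the superscalar pipeline of a RISC processor, the Pentium's superscalar pipeline is therefore both simpler and more complex. It is simple in that the superscalar technique it uses is straightforward; it is complex in that considerable effort is needed to let instructions of variable length, with different addressing modes and different implementations, flow through an instruction pipeline with a degree of parallelism of 2.

5-32 Superscalar pipelined processors
1. Structure of the Pentium instruction pipeline
The Pentium processor contains a floating-point unit (FPU). Floating-point operations are pipelined, and a floating-point instruction completes in 8 stages. The following mainly describes the integer instruction pipeline, whose structure is shown in Figure 11.

5-33 Superscalar pipelined processors
- As Figure 11 shows, the Pentium has two 32-bit ALUs that perform all integer arithmetic and logic operations, and can therefore support parallel execution in the two instruction pipelines U and V.
- The separate on-chip instruction cache (I-cache) and data cache (D-cache), 8 KB each, strongly support the pipelines.
- Two prefetch buffers, 32 bytes each, fetch instructions from the I-cache or main memory and hold them.
- Besides decoding instructions, the instruction decoder also performs the instruction pairing check. When a branch instruction is encountered, its address is sent after decoding to the branch target buffer (BTB) for lookup.
- The control ROM stores the microinstructions that control the sequence of operations during instruction execution.
- The above 3 components are shared by the U and V pipelines.

5-34 Superscalar pipelined processors
- Two address generators produce (compute) the addresses of memory operands. The logical address in every operating mode must finally be translated into a physical address to access the D-cache, and the translation lookaside buffer (TLB) speeds up this translation.
- The D-cache is dual-ported and can access two 32-bit data items (or one 64-bit floating-point number) in one clock cycle.
- The general-purpose register file has eight 32-bit integer registers, used for address calculation and for holding the source and destination operands of the ALUs.
- Both 32-bit ALUs have a latency of one clock cycle. Only arithmetic and logic instructions that are simple instructions and involve no register-to-memory or memory-to-register operation complete in one clock cycle.
- Most simple instructions are hardwired, and their execute stage takes only 1 clock cycle. A few arithmetic and logic instructions that involve register-to-memory or memory-to-register operations need 2-3 clock cycles to complete. Because the Pentium has sequencing hardware, these few exceptions can also be treated as simple instructions.

5-35 Superscalar pipelined processors
2. Pipeline scheduling policy
Through the U and V pipelines, the Pentium can execute two integer instructions per clock cycle. Both pipelines consist of 5 stages, and the first two stages (PF, D1) are shared by U and V, as shown in Figure 12(a). The stages are as follows.
- Prefetch (PF) stage: fetches instructions, which have variable length, from the I-cache and stores them in a prefetch buffer.
- Decode 1 (D1) stage: decodes the instruction to determine its opcode, addressing mode, and related information. This stage performs the instruction pairing check and branch prediction. Two consecutive instructions I1 and I2 are decoded one after the other, and the stage then decides whether to issue the pair in parallel. Issuing a pair requires the following 4 conditions:
  1. both instructions are simple instructions;
  2. there is no WR (RAW) or WW (WAW) dependence between the two instructions, that is, the destination register of I1 is neither a source register nor the destination register of I2 (RW dependences are avoided by the issue policy);
  3. neither instruction contains both an immediate and a displacement;
  4. only I1 may carry an instruction prefix.
  If these conditions are not met, only I1 is issued to the next stage of the U pipeline.

5-36 Superscalar pipelined processors
- Decode 2 (D2) stage: computes and generates the address of the memory operand. If the TLB hits, this takes only 1 clock cycle; otherwise it takes more than 1. Not all instructions have memory operands, but they must still pass through this stage.
- Execute (EX) stage: performs the specified operation in the ALU, the barrel shifter, or another functional unit, and accesses the D-cache when required.
- Write-back (WB) stage: writes the result of the operation into the destination register and the flags register.
- The U and V pipelines are not equivalent and cannot be used interchangeably. The U pipeline can execute all integer and floating-point instructions, whereas the V pipeline can execute only simple integer instructions and a few floating-point instructions such as floating-point exchange.
- The U and V pipelines are scheduled with in-order issue and in-order completion. A pair of instructions that passes the check is issued simultaneously to the D2 stages of the U and V pipelines, and the pair must also leave D2 for EX at the same time. If one instruction is held in D2, the other must also stall in D2, as I1 and I2 do in Figure 12(b) (clock 4).
- Once the pair has entered EX, it is best if both finish together; otherwise the instruction in the U pipeline finishes first. For I3 and I4 in Figure 12(b), I3 takes longer to execute, so I4 in the V pipeline must stall and wait until I3 finishes (clock 7). For I5 and I6 in Figure 12(b), I5 in the U pipeline takes less time to execute, so it can finish first and enter the write-back stage (clock 9).

5-37 Superscalar pipelined processors
The Pentium's superscalar pipeline can execute two simple integer instructions per clock cycle, but in general only one floating-point instruction. The reason is that the floating-point pipeline has 8 stages, the first 5 of which are shared with the 5 stages of the U and V pipelines, and that some floating-point operands are 64 bits wide. With a few exceptions (such as the floating-point exchange instruction), a floating-point instruction therefore cannot execute at the same time as an integer instruction.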
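The four pairing conditions checked in the D1 stage can be written as a single predicate. This is an illustrative sketch only: the `DecodedInstr` record, its field names, and the register names in the usage example are hypothetical, and the real Pentium pairing logic has further rules that the slides do not list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecodedInstr:
    simple: bool                   # hardwired "simple" instruction?
    dest: Optional[str]            # destination register, if any
    srcs: tuple = ()               # source registers
    has_immediate: bool = False
    has_displacement: bool = False
    has_prefix: bool = False


def can_pair(i1, i2):
    """True if I1 (U pipeline) and I2 (V pipeline) may be issued together."""
    # 1. Both instructions are simple instructions.
    both_simple = i1.simple and i2.simple
    # 2. No WR or WW dependence: I1's destination is neither a source
    #    nor the destination of I2.
    no_wr_ww = i1.dest is None or (i1.dest not in i2.srcs and i1.dest != i2.dest)
    # 3. Neither instruction contains both an immediate and a displacement.
    no_imm_and_disp = not any(i.has_immediate and i.has_displacement
                              for i in (i1, i2))
    # 4. Only I1 may carry an instruction prefix.
    prefix_ok = not i2.has_prefix
    return both_simple and no_wr_ww and no_imm_and_disp and prefix_ok


if __name__ == "__main__":
    i1 = DecodedInstr(True, "eax", ("ebx",))
    print(can_pair(i1, DecodedInstr(True, "ecx", ("edx",))))  # independent pair
    print(can_pair(i1, DecodedInstr(True, "ecx", ("eax",))))  # I2 reads I1's result
```

When the predicate is false, the slides' rule applies: only I1 is sent on to the next stage of the U pipeline, and I2 is considered again in the next cycle.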

5-38 Pentium II: RISC features
- All RISC features are implemented on the execution of microinstructions instead of machine instructions.
- Microinstruction-level pipeline with dynamically scheduled micro-operations:
  - Fetch machine instruction (3 stages)
  - Decode machine instruction into microinstructions (2 stages)
  - Issue microinstructions (2 stages; register renaming and reorder buffer allocation are performed here)
  - Execute microinstructions (1 stage; floating-point units are pipelined; execution takes between 1 and 32 cycles)
  - Write back (3 stages)
  - Commit (3 stages)
- The superscalar can issue up to 3 micro-operations per clock cycle.
- Reservation stations (20 of them) and multiple functional units (5 of them)
- Reorder buffer (40 entries) and speculation are used.

5-39 More on the Pipeline
Functional units have the following numbers of stages:

    Integer ALU        1
    Integer load       3
    Integer multiply   4
    FP add             3
    FP multiply        5   (partially pipelined; multiplies can start every other cycle)
    FP divide          32  (not pipelined)

The fetch unit can fetch up to 16 bytes per cycle, which is enough to determine how much more needs to be fetched from memory (recall that instructions vary in length from 1 to 17 bytes), so the fetch might take 2-3 cycles in all.

5-40 RISC implementation of CISC instructions
[Figure: front end of the pipeline, stages IFU1, IFU2, IFU3, ID1, ID2, RAT. Blocks: next-IP logic, dynamic branch prediction, 16 KB instruction cache, instruction stream buffer, instruction length decoder, decoder alignment stage, decoder 0 (complex), decoders 1 and 2 (simple), microcode instruction sequencer, static branch prediction, decoded instruction queue, register allocator, output to the reorder buffer (ROB).]

5-41 Functional Unit Architecture
- Instruction fetched from the instruction cache
- Instruction unit decodes it into microcode
- Microcode issued to one of the functional units (up to 3 issues per cycle)
- 5 functional units:
  - 1 set of integer units
  - 1 set of FP units
  - 1 branch unit
  - 2 load/store units
- Functional units are directly connected to the data cache for quick access.
- The second-level cache is used as backup to both the instruction and data caches.

5-42 Reservation Stations
- The use of reservation stations allows dynamic and multiple issue, with a reorder buffer uniting all of this together.
- Notice that 2 stores, 1 load, 1 simple integer or MMX operation, and 1 complex integer/FP/MMX operation can be issued at a time.

5-43 Handling Speculation
- Instruction fetch and decode places microinstructions in the instruction pool.
- The dispatch and execution unit issues microinstructions.
  - Functional units are inside the execution unit.
  - The dispatch unit uses speculation when issuing microinstructions.
- As microinstructions finish, they do not write results to registers (or cache) but instead wait for the retire unit.
- The retire unit writes all results back to data registers and/or cache.

5-44 Source of Stalls
- This architecture is very complex and relies on being able to fetch and decode instructions quickly.
- The process breaks down when:
  - fewer than 3 instructions can be fetched in 1 cycle
  - fewer than 3 instructions can be issued because instructions have different numbers of micro-operations
  - reservation stations or reorder buffer slots run out
  - there are data dependences
  - a data cache access results in a miss
  - branches are mispredicted
- In the last 3 cases, this could cause the reorder buffer to stall, resulting in multiple microinstructions not being able to commit for several cycles.
- Overall, the Pentium Pro has between 0.2 and 2.8 stalls per instruction on SPEC95 benchmarks (on average more than 1 stall per instruction) and an average CPI of around 2.5.

5-45 Fallacies and Pitfalls
- F: Processors with lower CPIs will always be faster.
- F: Processors with faster clock rates will always be faster.
- P: Emphasizing an improvement in CPI by increasing issue rate while sacrificing clock rate can lead to lower performance.
- P: Improving only one aspect of a multiple-issue processor and expecting overall performance improvement.
- P: Sometimes bigger and dumber is better. This specifically refers to using simpler branch prediction schemes rather than more complex ones.

5-46 Performance of superscalar pipelined processors

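For ease of comparison, the instruction-level parallelism of an ordinary single-pipeline scalar processor is written (1, 1), and that of a superscalar processor is written (m, 1).

In the ideal case, the execution time of N instructions on an ordinary single-pipeline scalar processor is

$T(1,1) = (k + N - 1)\,\Delta t$

where k is the number of pipeline stages and $\Delta t$ is the length of one clock cycle.

If the same N instructions are executed on a superscalar processor that issues m instructions per clock cycle, the execution time is

$T(m,1) = \left(k + \frac{N - m}{m}\right)\Delta t$

The first term is the time for the first batch of m instructions to pass through the m instruction pipelines together, and the second term is the time needed to execute the remaining N - m instructions. During this time, m instructions pass through the m instruction pipelines in every clock cycle.

The speedup of the superscalar processor relative to the ordinary single-pipeline scalar processor is

$S(m,1) = \frac{T(1,1)}{T(m,1)} = \frac{m\,(k + N - 1)}{N + m\,(k - 1)}$

As $N \to \infty$, in the ideal case with no resource conflicts and no data or control dependences, the maximum speedup of the superscalar processor is

$S(m,1)_{\max} = m$

Compared with a sequential (non-pipelined) execution structure, the speedup is km.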
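The limit S(m,1)max = m can be checked numerically. The sketch below assumes the ideal-case execution times T(1,1) = (k + N - 1)Δt and T(m,1) = (k + (N - m)/m)Δt, the latter built from the two terms described in the text; the function names are illustrative.

```python
def t_scalar(N, k, dt=1.0):
    """Ideal time for N instructions on a k-stage single pipeline."""
    return (k + N - 1) * dt


def t_superscalar(N, k, m, dt=1.0):
    """Ideal time on a superscalar issuing m instructions per cycle.

    k cycles for the first batch of m instructions, then (N - m) / m cycles
    for the remaining N - m instructions.
    """
    return (k + (N - m) / m) * dt


def speedup_superscalar(N, k, m):
    return t_scalar(N, k) / t_superscalar(N, k, m)


if __name__ == "__main__":
    k, m = 4, 2
    for N in (12, 100, 10_000, 10**9):
        print(N, round(speedup_superscalar(N, k, m), 4))
```

For k = 4 and m = 2 the speedup is 15/9 (about 1.67) at N = 12 and approaches 2 as N grows, which matches the stated maximum of m.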

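5-47 Superpipelined processors
- Instruction execution timing
- Typical processor structure
- Performance of superpipelined processors

Two definitions:
- A processor that can issue several instructions within one cycle in a time-shared manner is called a superpipelined processor.
- A pipelined processor whose instruction pipeline has 8 or more functional stages is called a superpipelined processor.

Different ways of raising processor performance:
- A superscalar processor gains performance at the cost of additional hardware resources.
- A superpipelined processor raises performance by fully overlapping the work of its hardware components.

Two different kinds of parallelism:
- A superscalar processor uses spatial parallelism.
- A superpipelined processor uses temporal parallelism.

Instruction execution timing
- One instruction is issued every 1/n clock cycle, and the pipeline cycle is 1/n clock cycle.
- In a superscalar processor, some functional stages of the pipeline can be subdivided further. For example, the ID stage can be divided into three pipeline stages: decode, read first operand, and read second operand.
- Other functional stages cannot be subdivided; the WR stage, for example, is generally not divided further.
- This gives another definition of superpipelining: a processor with 8 or more pipeline stages is called a superpipelined processor.

[Figure: A superpipeline that issues 3 instructions per clock cycle in a time-shared manner. Instructions I1 to I9, each passing through IF, ID, EX, WR, over clock cycles 1 to 6.]

Typical processor structure
The MIPS R4000 processor
- Each clock cycle contains two pipeline stages; this is a very standard superpipelined processor structure.
- The instruction pipeline has 8 stages.
- There are two caches, an instruction cache and a data cache, of 8 KB each. The caches can be accessed twice per clock cycle, so within one clock cycle two instructions can be read from the instruction cache and two data items can be read from or written to the data cache.
- The main arithmetic units are an integer unit and a floating-point unit.

Pipeline operations of the MIPS R4000 processor
- IF: fetch first instruction
- IS: fetch second instruction
- RF: read register file, decode instruction
- EX: execute instruction
- DF: fetch first data item
- DS: fetch second data item
- TC: data tag check
- WB: write back result

[Figure: R4000 pipeline operations: instruction cache (IF, IS); instruction decode and register-file read (RF); ALU (EX); data cache (DF, DS); tag check (TC); register file (WB).]

[Figure: Normal instruction pipeline timing of the MIPS R4000. Successive instructions each pass through IF, IS, RF, EX, DF, DS, TC, WB; axis labels: pipeline cycle, master clock cycle, current CPU cycle.]

- If either of the two instructions following a LOAD instruction needs the loaded data in its EX stage, the instruction pipeline must stall for one clock cycle.
- In-order issue is used.

[Figure: R4000 pipeline timing with a load stall. Instructions I1 to I6; labels: load instruction, instruction using the loaded data, run, stall, run.]

Performance of superpipelined processors
For a superpipelined processor with instruction-level parallelism (1, n), the time to execute N instructions is

$T(1,n) = \left(k + \frac{N - 1}{n}\right)\Delta t$

The speedup of the superpipelined processor relative to the ordinary single-pipeline scalar processor is

$S(1,n) = \frac{T(1,1)}{T(1,n)} = \frac{(k + N - 1)\,\Delta t}{\left(k + \frac{N - 1}{n}\right)\Delta t}$

that is,

$S(1,n) = \frac{n\,(k + N - 1)}{n k + N - 1}$

The maximum speedup of the superpipelined processor is

$S(1,n)_{\max} = n$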
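The limit S(1,n)max = n can be checked in the same way as for the superscalar case. The sketch assumes the usual ideal model, in which the first instruction takes k clock cycles and each further instruction completes 1/n clock cycle later, so that T(1,n) = (k + (N - 1)/n)Δt, and it takes T(1,1) = (k + N - 1)Δt for the scalar pipeline. The function names are illustrative.

```python
def t_scalar(N, k, dt=1.0):
    """Ideal time for N instructions on a k-stage single pipeline."""
    return (k + N - 1) * dt


def t_superpipelined(N, k, n, dt=1.0):
    """Ideal time when one instruction is issued every 1/n clock cycle.

    The first instruction needs k cycles; each of the remaining N - 1
    instructions finishes 1/n cycle after its predecessor.
    """
    return (k + (N - 1) / n) * dt


def speedup_superpipelined(N, k, n):
    return t_scalar(N, k) / t_superpipelined(N, k, n)


if __name__ == "__main__":
    k, n = 4, 3
    for N in (1, 9, 1000, 10**9):
        print(N, round(speedup_superpipelined(N, k, n), 4))
```

For k = 4 and n = 3, a single instruction gains nothing (speedup 1), nine instructions give 1.8, and the speedup approaches 3 as N grows.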

5-59 Superscalar superpipelined processors
- Instruction execution timing
- Typical processor structure
- Performance

Combining the superscalar and superpipelining techniques gives the superscalar superpipelined processor.

Instruction execution timing
A superscalar superpipelined processor issues instructions n times within one clock cycle in a time-shared manner, m instructions at a time, for a total of m × n instructions per clock cycle.

[Figure: Issuing 3 times per clock cycle, 3 instructions each time. Instructions I1 to I12, each passing through IF, ID, EX, WR, over clock cycles 1 to 5.]

Typical processor structure
- DEC's Alpha processor uses a superscalar superpipelined structure. It consists mainly of four functional units and two caches: the integer unit EBOX, the floating-point unit FBOX, the address unit ABOX, and the central control unit IBOX.
- The central control unit IBOX can read two instructions from the instruction cache at the same time, decode both at once, check the two instructions for resource conflicts, and analyze their data and control dependences.
- If resources and dependences allow, IBOX issues the two instructions simultaneously to two of the three execution units EBOX, ABOX, and FBOX.
- The instruction pipeline uses in-order issue with out-of-order completion.
- The instruction cache contains a branch history table that provides dynamic prediction of conditional branches.
- EBOX also contains several dedicated data paths that can send results directly to the execution units.

The Alpha 21064 processor has three instruction pipelines:
- The integer operation pipeline and the memory access pipeline each have 7 stages: 4 stages for fetching and analyzing the instruction, 2 stages for the operation, and 1 stage for writing the result.
- The floating-point pipeline has 10 stages, of which the latency of the floating-point execution unit FBOX is 6 stages.
- All of the units EBOX, IBOX, ABOX, and FBOX have dedicated data paths.
- The three instruction pipelines of the Alpha 21064 have 8 stages on average, and two instructions are issued per clock cycle. The Alpha 21064 is therefore a superscalar superpipelined processor.

取值

SWAP

交換雙發(fā)射指令、轉(zhuǎn)移預(yù)測(cè)I0

指令譯碼

訪問(wèn)通用寄存器堆，發(fā)射校驗(yàn)A1

計(jì)算周期1，IBOX計(jì)算新的PC值A(chǔ)2

計(jì)算周期2，查指令快表WR

寫(xiě)整數(shù)寄存器堆，指令Cache命中檢測(cè)7個(gè)流水段的整數(shù)操作流水線SWAP1IFI0I1A0A1WR234560余臘生版權(quán)所有，違者必究IF

取值

SWAP

交換雙發(fā)
