Persistent Memory Quick Programming Handbook


Persistent memory redefines the traditional storage architecture with an innovative memory technology. It combines cost-effective large capacity with [...] and provides a large persistent memory tier at a reasonable price. With [...] in memory-intensive [...] compatible, it helps extract the potential value of data more effectively. Developers can use the industry-standard [...] model to build simpler and more powerful applications and make sure that data-center investments can meet future needs (https://www.intel.cn/content/www/cn/zh/products/memory-storage/optane-dc-persistent-memory.html).

[...] can be used in several ways, and some of them are transparent to the application. For example, all persistent memory [...] easy to understand, so it is outside the scope of this handbook. Persistent memory can also be configured in Memory Mode, where the system uses it exactly like system memory and the application needs no changes, so that is outside our scope as well. We [...] persistent-memory-style access, in which the application manages byte-addressable data structures that reside in persistent memory.

Intel has written a book on persistent memory programming, shown in Figure 1; a Chinese edition will also be published in 2021. This quick programming handbook serves as a supplement that lets developers quickly learn the programming methods and key concepts of persistent memory. It does not cover every aspect of [...] and its use; it only illustrates some important concepts through examples. We [...] volatile, using persistent memory only to extend memory capacity, but mainly present the persistent use cases, that is, persistent [...] ensure data structures [...] some examples with other storage, mainly to give readers an intuitive comparison and feel.

Figure 1: Programming Persistent Memory (English edition)

[...] consider:
  • The performance of persistent memory (throughput, latency, and bandwidth) is far higher than NAND, but may be lower than DRAM.
  • Persistent memory is byte-addressable (like memory). An application can update only the data it needs, without any read-modify-write (RMW) overhead.
  • [...] (direct access) [...] once [...] is complete, data on persistent memory can be accessed directly from user space. Data access does not pass through any kernel [...], page cache, or interrupt.
  • Data on persistent memory is immediately usable, that is:
      o Data is available as soon as the system powers on.
      o Applications do not need to spend time warming up caches.
      o They can access data immediately after memory mapping.
  • Data on [...] memory modules is local to the system. The application is responsible for replicating data between systems.

Application developers usually think in terms of memory-resident data structures and storage-resident data structures. For data-center applications, developers must carefully keep data structures in storage consistent, [...] no exception. This problem is usually solved with logging techniques (such as write-ahead logging), which first write the change [...] usually rely on a combination of databases, libraries, and modern file systems to provide consistency. Even so, [...] developers design a strategy that ensures [...] at run time and when recovering from application and system crashes.

[...] management of and access to persistent memory devices. The development of this kit [...] operating-system support for persistent memory [...]

Figure 2: PMDK [...]

[...] related libraries, you can contact us.

Test environment: Intel(R) Xeon(R) Platinum [...] CPU [...].40GHz, [...] App Direct Mode [...]

libpmem, libpmemobj, libpmemblk [...]

libpmem (http://pmem.io/pmdk/libpmem/)

[...]
2. [...] transactional, that is, no dirty page may appear when the system crashes or power is lost suddenly.
3. All data in persistent memory can be recovered.
[...] In this example we only consider the single-threaded case. [...]ffset.

[...]
1. pmem_base = (char *)pmem_map_file(filename, PMEM_SIZE, PMEM_FILE_CREATE, 0666, &mapped_len, &is_pmem);
2. pmem_memset_persist(pmem_base, 0x00, PMEM_SIZE);
3. pmem_memset_persist(pmem_meta, 0x0, 8);   (atomic operation)
4. pmem_memcpy_persist(PAGE_FROM_META(page_new), content, 4096);   (non-atomic operation)
5. pmem_memcpy_persist(page_new, &atomic_value, 8);   (atomic operation)
6. pmem_persist(page_old, 8);   (atomic operation)

[...] programming, I personally regard the following two points as the foundation of persistent memory programming; all higher-level libraries and applications [...]
  step 1: Data is written into the CPU cache (non-temporal writes, NTW, are not discussed here; NTW bypasses the cache).
  step 2: Data [...]

[Example 1: heading lost in extraction]

    [#include directives lost in extraction]
    #define PAGE_SIZE [...]
    #define BLOCK_PAGE_NUM [...]
    #define REQ_BLOCK [...]
    #define REQ_PAGE 0x[...]ff
    #define DISK_BLOCK_NUM [...]24*1024        // blocks
    #define MAGIC_NUMBER 0xabcdabcd

    /* [...] rebuild the block DRAM and free list structures after the system restarts. */

    // 8 bytes, atomic: [...] the case where [one field] has been updated while another field has not
    typedef struct page_meta {
        uint[...]_t req_id[...];   // req_id, which is in the range 1 ~ 1024*1024*1024
        [...]
    } page_meta_t;

    // [...] number
    typedef struct pmem_layout {
        uint64_t magic_number;
        [...]
    } pmem_layout_t;

    char *pmem_base;
    uint[...]_t page_address;      // start address of the real pages

    #define PAGE_SIZE [...]
    #define PMEM_META_SIZE sizeof(page_meta_t)
    #define MAGIC_NUM_SIZE sizeof(pmem_layout_t)
    #define PMEM_SIZE [...]24UL[...]           // size of the persistent memory we use: 100 GB
    #define PMEM_PAGE_NUMBERS ((PMEM_SIZE-MAGIC_NUM_SIZE)/(PAGE_SIZE+PMEM_META_SIZE)-1)   // how many memory pages
    #define PAGE_ID_FROM_META(meta) (((uint64_t)meta-MAGIC_NUM_SIZE-[...]
    #define PAGE_FROM_META(meta) (char*)(page_address+PAGE_SIZE*(((uint64_t)meta-MAGIC_NUM_SIZE-(uint64_t)pmem_base)/PMEM_META_SIZE))   // meta -> real page
    #define PAGE_META_FROM_ID(id) (page_meta_t *)[...]

    /* [...] structure [...] page_id [...] page_id [...] BLOCK_PAGE_NUM [...] cache_pages_cnt [...] DISK_BLOCK_NUM */
    typedef struct block_page {
        uint64_t page_id1:32;
        uint64_t page_id2:32;
    } block_page_t;

    typedef struct block_data {
        uint[...]_t cached_pages_cnt;   // if the count is over one threshold, might need to write back to the SSD.
        [...]
    } block_data_t;

    typedef struct disk {
        [...]
    } disk_t;
    disk_t *disk;

    // a free list that always picks the free page from the free list [...] free_num
    typedef struct free_pages {
        uint[...]_t free_num;
        page_meta_t **free_list;
    } free_pages_t;
    free_pages_t *free_pages;

    [...]
    {
        [...] page_cache_num [...]
        int i;
        pmem_base = [...] PMEM_FILE_CREATE, [...] &mapped_len, &is_pmem[...]
        [...] NULL [...]
            return -1;
        }
        printf("pmem_map_file mapped_len=%zu, pmem_base=%p, is_pmem=%d\n", mapped_len, pmem_base, is_pmem);
        pmem_layout_t *pmem_data = (pmem_layout_t *)pmem_base;
        page_meta_t *pmem_meta = (page_meta_t *)((uint[...]_t)pmem_base + sizeof(pmem_layout_t));
        disk = (disk_t *)calloc(sizeof(char), sizeof(disk_t));
        if (disk == NULL) return -1;
        free_pages = (free_pages_t *)malloc(sizeof(free_pages_t));
        free_pages->free_list = (page_meta_t **)malloc(PMEM_PAGE_NUMBERS * sizeof(page_meta_t *));
        if (free_pages == NULL || free_pages->free_list == NULL) {
            [...]
            return -1;
        }
        if (pmem_data->magic_number != MAGIC_NUMBER) {
            // first time init, and write the whole block structure to 0
            pmem_memset_persist(pmem_base, 0x[...], PMEM_SIZE);
            page_address = (uint64_t)(((uint64_t)pmem_meta [...] sizeof(page_meta_t) [...] PMEM_PAGE_NUMBERS [...] PAGE_SIZE [...] 0xfffffffffffff000);
            pmem_data->page_offset = page_address - (uint64_t)pmem_base;
            pmem_persist(&(pmem_data->page_offset), sizeof(pmem_data->page_offset));
            pmem_data->magic_number = MAGIC_NUMBER;
            pmem_persist(pmem_data, sizeof(pmem_data->magic_number));
            for (i = 0; i < PMEM_PAGE_NUMBERS; i++) {
                free_pages->free_list[i] = pmem_meta;
                pmem_meta[...];
            }
            free_pages->free_num = PMEM_PAGE_NUMBERS;
        } else {
            uint64_t block_id, page_id;
            uint64_t req_id;
            // magic number check passed, which means we might have cached pages.
            for (i = 0; i < PMEM_PAGE_NUMBERS; i++) {
                if (pmem_meta->valid == 0) {
                    free_pages->free_list[j] = pmem_meta;
                    [...]
                } else {
                    // fill the disk structure.
                    req_id = pmem_meta->req_id;
                    block_id = req_id >> 10;
                    page_id = req_id & 0x3ff;
                    [...]M_META(pmem_meta)[...]
                    } else {
                    [...]M_META(pmem_meta)[...]
                    }
                    [...]
                }
                pmem_meta[...];
            }
            [...]num [...] j[...]
            page_address = pmem_data->page_offset + (uint64_t)pmem_base;
        }
        [...]
        return 0;
    }

    [...]
    {
        int i;
        [...]
        for (i = 0; i < DISK_BLOCK_NUM; i++) {
            [...]
        }
        return cnt;
    }

    int write_req(uint64_t req_id, unsigned char *content) {
        uint64_t block_id = req_id >> REQ_BLOCK;
        uint64_t req_page_id = req_id & REQ_PAGE;
        uint[...]_t atomic_value[...];
        uint64_t page_id1, page_id2;
        uint[...]_t free_num = free_pages->free_num;
        if (block_id > DISK_BLOCK_NUM) {
            [...]
            return -1;
        }
        page_meta_t **free_list = free_pages->free_list;
        if (free_num == 0) {
            [...]
        }
        [...]
        if (page_id1 == 0 && page_id2 == 0) {
            // there is no page in the location, add one page. Get one free page meta in the [...]
            page_meta_t *page_new = free_list[free_num-1];
            [...] content [...] valid [...] sn [...] free_num [...] page_new [...]
        } [...] {
            // update: the data must first be written to the new location
            page_meta_t *page_old;
            if (page_id1 != 0) {
                page_old = PAGE_META_FROM_ID(page_id1);
            } else {
                page_old = PAGE_META_FROM_ID(page_id2);
            }
            if (page_old->sn != 0) {
                [...] page_old->sn [...];
            }
            page_meta_t *page_new = free_list[free_num-1];
            [...] content [...] page_new [...]
            free_list[free_num-1] = page_old;
        } else if (page_id1 != 0 && page_id2 != 0) {
            page_meta_t *page1_meta, *page2_meta;
            page[...]_meta = PAGE_META_FROM_ID(page_id1);
            page[...]_meta = PAGE_META_FROM_ID(page_id2);
            [...] {
                [...] free_num [...]
                free_list[free_num-1] = page2_meta;
            } else {
                [...] free_num [...]
                free_list[free_num-1] = page1_meta;
            }
        }
        return 0;
    }

    void *read_req(uint64_t req_id) {
        // req_id and content; this API can be called only after init has succeeded.
        uint64_t block_id = req_id >> REQ_BLOCK;
        uint64_t req_page_id = req_id & REQ_PAGE;
        uint[...]_t atomic_value[...];
        uint64_t page_id1, page_id2;
        if (block_id > DISK_BLOCK_NUM) {
            [...]
            return NULL;
        }
        [...]
            page_meta_t *page_meta;
            if (page_id1 != 0) {
                page_meta = PAGE_META_FROM_ID(page_id1);
            } else {
                page_meta = PAGE_META_FROM_ID(page_id2);
            }
            return PAGE_FROM_META(page_meta);
        [...] if (page_id[...] page_id[...]) {   // two pages
            page_meta_t *page1_meta = PAGE_META_FROM_ID(page_id1);
            page_meta_t *page2_meta = PAGE_META_FROM_ID(page_id2);
            if (page1_meta->sn < page2_meta->sn) {
                return PAGE_FROM_META(page[...]_meta);
            } else {
                return PAGE_FROM_META(page[...]_meta);
            }
        } else {
        }
        return NULL;
    }

    [...]
        // req_id and content; this API can be called only after init has succeeded.
        uint64_t block_id = req_id >> REQ_BLOCK;
        uint64_t req_page_id = req_id & REQ_PAGE;
        uint[...]_t atomic_value[...];
        uint64_t page_id1, page_id2;
        if (block_id > DISK_BLOCK_NUM) {
            [...]
            return;
        }
        uint[...]_t free_num = free_pages->free_num;
        page_meta_t **free_list = free_pages->free_list;
        if (free_num == 0) {
            [...]
        }
        [...]   // only one page is there
            page_meta_t *page_meta;
            if (page_id1 != 0) {
                page_meta = PAGE_META_FROM_ID(page_id1);
            } else {
                page_meta = PAGE_META_FROM_ID(page_id2);
            }
            [...] free_num [...]
            free_list[free_num-1] = page_meta;
        [...] if (page_id[...] page_id[...]) {   // two pages
            page_meta_t *page1_meta = PAGE_META_FROM_ID(page_id1);
            page_meta_t *page2_meta = PAGE_META_FROM_ID(page_id2);
            if (page1_meta->sn < page2_meta->sn) {
                [...]
            } else {
                [...]
            }
            free_list[free_num-1] = page1_meta;
            [...]
            free_list[free_num-1] = page2_meta;
            [...] free_num [...]
        } else {
        }
    }

    #define WRITE_COUNT [...]
    #define OVERWRITE_COUNT [...]

    [...]
    {
        [...]
        unsigned char *page_content = (unsigned char *)malloc(PAGE_SIZE);
        [...] i;
        auto start = std::chrono::steady_clock::now();
        auto stop = std::chrono::steady_clock::now();
        std::chrono::duration<double> diff = stop - start;
        unsigned char *read_content;
        memset(page_content, 0xab, PAGE_SIZE);

        start = std::chrono::steady_clock::now();
        [...]init[...]("/mnt/pmem[...]/cbs_file"[...]);
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        [...]

        start = std::chrono::steady_clock::now();
        for (i = 0; i < WRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        [...]

        memset(page_content, 0xcd, PAGE_SIZE);
        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite write_req update take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            read_content = (unsigned char *)read_req(i);
            memcpy(page_content, read_content, PAGE_SIZE);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite read_req take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = OVERWRITE_COUNT; i < WRITE_COUNT; i++) {
            read_content = (unsigned char *)read_req(i);
            memcpy(page_content, read_content, PAGE_SIZE);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite~write_count read_req take time " << diff.count()/(WRITE_COUNT-OVERWRITE_COUNT) << std::endl;
        [...]
        start = std::chrono::steady_clock::now();
        // for (i = 0; i < WRITE_COUNT; i++) { [...]
        stop = std::chrono::steady_clock::now();
        std::cout << "delete write_count take time " << diff.count()/WRITE_COUNT << std::endl;
        return 0;
    }

[...] “g++ cbs_req.cpp -o cbs_req -lpmem -O2”, then use “taskset -c 2 ./cbs_req” to run this code [...]

    ~ taskset -c 2 ./cbs_req_new
    pmem_map_file mapped_len=107374182400, is_pmem=1
    init done, page cache num [...], free page number 26163298
    [...] init time [...]
    [...] count [...] write_req time 4.26528e-06
    overwrite write_req update take time 2.15705e-06
    overwrite read_req take time 1.06379e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.07138e-06
    the page should fill with patern 0xab: 0xab

    ~ vim cbs_req_new.cpp
    ~ taskset -c 2 ./cbs_req_new
    pmem_map_file mapped_len=107374182400, is_pmem=1
    init done, page cache num [...], free page number 26063298
    [...] init time [...] page count [...]
    write_req time 2.19311e-06
    overwrite write_req update take time 2.71467e-06
    overwrite read_req take time 1.04566e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.06975e-06
    the page should fill with patern 0xab: 0xab

[...] completely raw access to persistent memory, without needing a library to provide an allocator or transactions, then libpmem can be [...] libpmem [...] custom memory management and recovery logic to achieve high performance.

[...]

    typedef struct page_meta {
        uint[...]_t req_id[...];   // req_id, which is in the range 1 ~ 1024*1024*1024
        [...]
        uint64_t padding[7];       // for performance, avoid false sharing “R-M-F”
    } page_meta_t;

    // [...] number
    typedef struct pmem_layout {
        uint64_t magic_number;
        [...]
        uint[...]_t padding[6];
    } pmem_layout_t;

    ~ taskset -c 2 ./cbs_req_new
    pmem_map_file mapped_len=107374182400, is_pmem=1
    [...] init time [...]
    [...] count [...] write_req time 4.46733e-06
    overwrite write_req update take time 1.71743e-06
    overwrite read_req take time 1.04520e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.06970e-06
    the page should fill with patern 0xab: 0xab

    ~ taskset -c 2 ./cbs_req_new
    pmem_map_file mapped_len=107374182400, is_pmem=1
    [...] init time [...] page count [...]
    write_req time 1.75316e-06
    overwrite write_req update take time 1.82072e-06
    overwrite read_req take time 1.04660e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.05974e-06
    the page should fill with patern 0xab: 0xab

[...]

    pmem_map_file mapped_len=107374182400, is_pmem=1
    init done, page cache num [...], free page number 26163298
    [...] init time [...]
    [...] count [...] write_req time 1.60401e-06
    overwrite write_req update take time 1.80627e-06
    overwrite read_req take time 1.07942e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 8.9152e-07
    the page should fill with patern 0xab: 0xab

    pmem_map_file mapped_len=107374182400, is_pmem=1
    init done, page cache num [...], free page number 26063298
    [...] init time [...] page count [...]
    write_req time 1.86702e-06
    overwrite write_req update take time 2.00082e-06
    overwrite read_req take time 8.18381e-07
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 8.41381e-07
    the page should fill with patern 0xab: 0xab

[...] written into the page table. A page fault triggers a zeroing operation, which is expensive, but once the write has completed, [...] the virtual address [...]

    if (pmem_data->magic_number != MAGIC_NUMBER) {
        // first time init, and write the whole block structure to 0
        pmem_memset_persist(pmem_base, 0x[...], PMEM_SIZE);
        [...]
        pmem_data->magic_number = MAGIC_NUMBER;
        pmem_persist(pmem_data, sizeof(pmem_layout_t));
        for (i = 0; i < PMEM_PAGE_NUMBERS; i++) {
            free_pages->free_list[i] = pmem_meta;
            pmem_memset_persist(PAGE_FROM_META(pmem_meta), 0x0, 8);   // pre-fault
            pmem_meta[...];
        }
        free_pages->free_num = PMEM_PAGE_NUMBERS;
    }

    ~ taskset -c 2 ./cbs_req_new
    pmem_map_file mapped_len=107374182400, is_pmem=1
    [...] init time [...]
    [...] count [...] write_req time 1.53175e-06
    overwrite write_req update take time 1.69467e-06
    overwrite read_req take time 1.04470e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.06070e-06
    the page should fill with patern 0xab: 0xab

    ~ taskset -c 2 ./cbs_req_new
    pmem_map_file mapped_len=107374182400, is_pmem=1
    [...] init time [...] page count [...]
    write_req time 1.75408e-06
    overwrite write_req update take time 1.82161e-06
    overwrite read_req take time 1.04456e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.06950e-06
    the page should fill with patern 0xab: 0xab

[...] call the corresponding functions, install [...]/pmem/valgrind, and then test your program with “valgrind --[...]

    #define PMEM_SIZE [...]1024UL[...]   // size of the persistent memory we use

    ~ valgrind --tool=pmemcheck ./cbs_req_new
    ==[...]== pmemcheck-1.0, a simple persistent store checker
    ==[...]== Copyright (c) 2014-2020, Intel Corporation
    ==[...]== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
    ==[...]== Command: ./cbs_req_new
    ==[...]==
    pmem_map_file mapped_len=10737418240, is_pmem=1
    init done, page cache num [...], free page number [...]616328
    [...] init time [...]
    [...] count [...] write_req time [...].000361661
    overwrite write_req update take time [...]59447
    overwrite read_req take time 5.06452e-08
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 2.46624e-08
    the page should fill with patern 0xab: 0xab
    ==248561==
    ==248561== Number of stores not made persistent: 0
    ==248561== ERROR SUMMARY: 0 errors

    ~ valgrind --tool=pmemcheck ./cbs_req_new
    ==[...]== pmemcheck-1.0, a simple persistent store checker
    ==[...]== Copyright (c) 2014-2020, Intel Corporation
    ==[...]== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
    ==[...]== Command: ./cbs_req_new
    ==[...]==
    pmem_map_file mapped_len=10737418240, is_pmem=1
    init done, page cache num [...], free page number [...]16328
    [...] init time [...] page count [...]
    write_req time [...].000358157
    overwrite write_req update take time [...]5847
    overwrite read_req take time 5.01243e-08
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 2.46348e-08
    the page should fill with patern 0xab: 0xab
    ==248566==
    ==248566== Number of stores not made persistent: 0
    ==248566== ERROR SUMMARY: 0 errors

Call For Action
1. http://pmem.io/pmdk/libpmem/
2. [...]/pmem/pmdk/tree/master/src/examples/libpmem
3. [...]/twitter/pelikan/blob/master/src/datapool/datapool_pmem.c

[...]

Tips
[...] solves many common algorithm and data [...] problems encountered in persistent memory programming. If you choose to use [...] must maintain data atomicity and power-fail consistency, and must also be able to recover the data.

libpmemobj

libpmemobj is a [...] as shown in Figure 3. [...] the primitive interfaces (Primitives APIs) of libpmemobj; like libpmem, these interfaces do not support transactional operations, so the application has to take care of data atomicity, consistency, and recovery. libpmemobj uses unified logs to implement the transactional properties of data. The atomic allocation and transaction interfaces for persistent memory (Atomic APIs, Transactional APIs, Action APIs) are all built on these unified logs. To keep data recoverable, persistent memory must be position independent, so an application cannot store data by direct address. libpmemobj uses offset pointers, which represent a data object by its offset relative to the base address of the memory pool. Persistent memory is [...] in the form of a file [...]

Figure 3: The interface framework of libpmemobj

[...]
[...] transactional, that is, no dirty page may appear.
[...] In this example we only consider the single-threaded case.

The core libpmemobj interfaces used in [...] mainly include:
1. pop = pmemobj_create(filename, [...]LAYOUT_NAME[...], PMEM_SIZE, 0666);
2. pop = pmemobj_open(filename, POBJ_LAYOUT_NAME(cbs_cache));
3. PMEMoid root = pmemobj_root(pop, sizeof(struct root));
4. rootp = (struct root *)pmemobj_direct(root);
5. OID_IS_NULL(block_oid)
6. block_oid = pmemobj_tx_zalloc(sizeof(struct block), 0);
7. PMEMoid page_oid = pmemobj_tx_alloc(sizeof(struct page), 0);
8. [...]IZE[...]
9. pmemobj_tx_free(page_oid);
10. TX_BEGIN(pop) { [...] } [...]END

Example 2: A persistent page cache implemented with libpmemobj

    [#include directives lost in extraction]
    #define PAGE_SIZE [...]
    #define BLOCK_PAGE_NUM [...]
    #define REQ_BLOCK [...]
    #define REQ_PAGE 0x[...]ff
    #define DISK_BLOCK_NUM [...]24*1024        // blocks
    #define PMEM_SIZE [...]*1024UL[...]        // size of the persistent memory we use

    /* [...] rebuild the block DRAM structure after the system restarts. [...] in pmem [...] */
    [...]LAYOUT_NAME [...] cbs_cache
    [...]BEGIN(cbs_cache)[...]
    [...]END(cbs_cache)[...]

    // block contents have BLOCK_PAGE_NUM pages.
    struct page {
        unsigned char rpage[PAGE_SIZE];
    };

    struct block {
        /* array-based queue container */
        [...]M[...]
        uint64_t cached_page_cnt;
    };

    // root object max size is [...]G, defined in the header file; the root size is limited,
    // so for the PoC we just set the BLOCK_NUM to 1000
    struct root {
        [...]M[...]
    };

    [...]p = NULL;
    [...]p = NULL;

    [...] cbs_init([...] filename[...]) {
        [...]
        if (pop != NULL) {
            printf("pmemobj_create successfully\n");
        } else {
            [...]
            if (pop == NULL) {
                return 1;
            }
        }
        [...]
        [...] NULL [...] printf("root[...]");
            return 1;
        }
        return 0;
    }

    [...]
    {
        [...] NULL, please check your persistent memory pool\n[...]
        int i;
        int cnt;
        PMEMoid block_oid;
        [...] blockp = NULL;
        for (i = 0; i < DISK_BLOCK_NUM; i++) {
            [...]
            if (!OID_IS_NULL(block_oid)) {
                blockp = (struct block *)pmemobj_direct(block_oid);
                cnt += blockp->cached_page_cnt;
            }
        }
        return cnt;
    }

    int write_req(uint64_t req_id, unsigned char *content) {
        uint64_t block_id = req_id >> REQ_BLOCK;
        uint64_t req_page_id = req_id & REQ_PAGE;
        assert(req_page_id [...] BLOCK_PAGE_NUM);
        // req_id and req number in one block can't be over 1024; req_id should not be duplicated.
        [...]
        TX_BEGIN(pop) {
            [...] block [...]
                struct block *blockp = (struct block *)pmemobj_direct(block_oid);
                [...]
                struct page *pagep = (struct page *)pmemobj_direct(page_oid);
                memcpy(pagep->rpage, content, PAGE_SIZE);
                [...]
                pmemobj_tx_add_range_direct(&blockp->cached_page_cnt, sizeof(uint64_t));
                blockp->cached_page_cnt++;
            [...]
                struct block *blockp = (struct block *)pmemobj_direct(block_oid);
                [...]
                    struct page *pagep = (struct page *)pmemobj_direct(page_oid);
                    memcpy(pagep->rpage, content, PAGE_SIZE);
                    [...] to tx add snapshot [...]   // 8 bytes, no need to snapshot.
                    blockp->cached_page_cnt++;
                [...]
                    pmemobj_tx_add_range_direct(pagep->rpage, PAGE_SIZE);
                    struct page *pagep = (struct page *)pmemobj_direct(page_oid);
                    memcpy(pagep->rpage, content, PAGE_SIZE);
                }
            }
        [...]
        return 0;
    }

    void *read_req(uint64_t req_id) {
        // req_id and content; this API can be called only after init has succeeded.
        uint64_t block_id = req_id >> REQ_BLOCK;
        uint64_t req_page_id = req_id & REQ_PAGE;
        [...]
        if (OID_IS_NULL(block_oid)) {
            return NULL;
        } else {
            struct block *blockp = (struct block *)pmemobj_direct(block_oid);
            [...]
            if (OID_IS_NULL(page_oid)) {
                return NULL;
            } else {
                struct page *pagep = (struct page *)pmemobj_direct(page_oid);
                [...] pagep->rpage [...]
            }
        }
    }

    [...]
        // req_id and content; this API can be called only after init has succeeded.
        uint64_t block_id = req_id >> REQ_BLOCK;
        uint64_t req_page_id = req_id & REQ_PAGE;
        [...]
        if (OID_IS_NULL(block_oid)) {
            return;
        } else {
            struct block *blockp = (struct block *)pmemobj_direct(block_oid);
            [...]
            if (OID_IS_NULL(page_oid)) {
                return;
            } else {
                TX_BEGIN(pop) {
                    [...]
                }
                [...]
            }
        }
    }

    #define WRITE_COUNT [...]
    #define OVERWRITE_COUNT [...]

    [...]
    {
        [...] cnt [...] i;
        auto start = std::chrono::steady_clock::now();
        auto stop = std::chrono::steady_clock::now();
        std::chrono::duration<double> diff = stop - start;
        unsigned char *read_content;
        memset(page_content, 0xab, 4096);

        start = std::chrono::steady_clock::now();
        [...]init[...]("/mnt/pmem[...]/cbs_cache"[...]);
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        [...]

        start = std::chrono::steady_clock::now();
        for (i = 0; i < WRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        [...]

        memset(page_content, 0xcd, 4096);
        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite write_req update take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            read_content = (unsigned char *)read_req(i);
            memcpy(page_content, read_content, PAGE_SIZE);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite read_req take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = OVERWRITE_COUNT; i < WRITE_COUNT; i++) {
            [...]
            memcpy(page_content, read_content, PAGE_SIZE);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite~write_count read_req take time " << diff.count()/(WRITE_COUNT-OVERWRITE_COUNT) << std::endl;
        [...]
        start = std::chrono::steady_clock::now();
        // for (i = 0; i < WRITE_COUNT; i++) { [...]
        stop = std::chrono::steady_clock::now();
        std::cout << "delete write_count take time " << diff.count()/WRITE_COUNT << std::endl;
        return 0;
    }

[...] “g++ cbs_req_obj.cpp -o cbs_req_obj -lpmemobj -O2”, then use “taskset -c 2 ./cbs_req_obj [...]

    ~ taskset -c 2 ./cbs_req_new_pmemobj
    pmemobj_create successfully
    [...] init time [...]
    [...] count [...] write_req time 7.33781e-06
    overwrite write_req update take time 5.61578e-06
    overwrite read_req take time 1.0457e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.11605e-06
    the page should fill with patern 0xab: 0xab

    ~ taskset -c 2 ./cbs_req_new_pmemobj
    [...] init time [...] page count [...]
    write_req time 5.60939e-06
    overwrite write_req update take time 5.58853e-06
    overwrite read_req take time 1.05356e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.10197e-06
    the page should fill with patern 0xab: 0xab

[...]
  o [...] bytes of overhead; if the data objects are small, the space overhead can be very large. You can use “pmempool info -O [...]
  o [...] is critical; we can use the pmemcheck tool: “valgrind --tool=pmemcheck ./your_program”

Call For Action
1. https://pmem.io/pmdk/libpmemobj/
2. [...]/pmem/pmdk/tree/master/src/examples/libpmemobj

Tips
[...]

[libpmemblk]

1. pbp = pmemblk_create(filename, ELEMENT_SIZE, POOL_SIZE, 0666);
2. pbp = pmemblk_open(filename, ELEMENT_SIZE);
3. nelements = pmemblk_nblock(pbp);
4. pmemblk_write(pbp, content, req_id)
5. pmemblk_read(pbp, buf, req_id)

[...] so this library is a very good fit for the requirement discussed throughout this handbook: using persistent memory as a page cache. Its implementation is shown in Example 3:

    [#include directives lost in extraction]
    /* size of the pmemblk pool: 100 GB */
    #define POOL_SIZE ((uint[...]_t)([...] 30) * 100UL))
    /* size of each element in the pmem pool */
    #define ELEMENT_SIZE [...]

    [...]
    size_t nelements;

    [...]
    {
        /* create the pmemblk pool, or open it if it already exists */
        [...] NULL [...]
        if (pbp == NULL) {
            [...]
            return -1;
        }
        /* how many elements fit into the file */
        nelements = pmemblk_nblock(pbp);
        [...]
        return 0;
    }

    int write_req(uint64_t req_id, unsigned char *content) {
        [...] req_id [...] nelements [...]
        [...] {
            perror("pmemblk_write");
            return -1;
        }
        return 0;
    }

    void *read_req(uint64_t req_id, unsigned char *buf) {
        [...] req_id [...] nelements [...]
        /* read the block at index [...] (reads as zeros initially) */
        if (pmemblk_read(pbp, buf, req_id) < 0) {
            perror("pmemblk_read");
            return NULL;
        }
        return buf;
    }

    #define WRITE_COUNT [...]
    #define OVERWRITE_COUNT [...]

    [...]
    {
        [...]
        unsigned char *read_content;
        [...] cnt [...] i;
        auto start = std::chrono::steady_clock::now();
        auto stop = std::chrono::steady_clock::now();
        std::chrono::duration<double> diff = stop - start;
        memset(page_content, 0xab, 4096);

        start = std::chrono::steady_clock::now();
        [...]init[...]("/mnt/pmem[...]/cbs_pmblk"[...]);
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "cached page count " << get_cached_count() << std::endl;

        start = std::chrono::steady_clock::now();
        for (i = 0; i < WRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        [...]

        memset(page_content, 0xcd, 4096);
        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite write_req update take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            read_content = (unsigned char *)read_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite read_req take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = OVERWRITE_COUNT; i < WRITE_COUNT; i++) {
            read_content = (unsigned char *)read_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite~write_count read_req take time " << diff.count()/(WRITE_COUNT-OVERWRITE_COUNT) << std::endl;
        [...]
        return 0;
    }

[...] “g++ cbs_req_pmemblk.cpp -o cbs_req_pmemblk -lpmemblk -O2”, then run “taskset -c [...]

    ~ taskset -c 2 ./cbs_req_pmemblk
    file holds [...] elements
    cbs init time [...]73344
    write_req time 4.19133e-06
    overwrite write_req update take time 2.99101e-06
    overwrite read_req take time 1.2958e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.348e-06
    the page should fill with patern 0xab: 0xab

    ~ taskset -c 2 ./cbs_req_pmemblk
    file holds [...] elements
    cbs init time [...]95083
    write_req time 2.99173e-06
    overwrite write_req update take time 2.92603e-06
    overwrite read_req take time 1.27759e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 1.34219e-06
    the page should fill with patern 0xab: 0xab

[pmemkv]

[...] platforms are usually virtualized, and applications are highly abstracted so that they avoid making explicit [...] about the details of the underlying hardware. [...] devices are attached locally to one specific server only, so how can [...] be simplified in a cloud-native environment [...]

[...] these data structures directly affect the performance of a KV engine under different workloads. When point queries dominate, a hash-based index is the better fit, whereas range searches are better served by data [...] such as a B+ tree or a skip list.

As shown in Figure 4 (165 lines), pmemkv implements a local KV engine based on persistent memory. At this point, how to manage the value [...]
3. Concurrent or single-threaded.

[...]

    Engine      Description
    blackhole   Accepts everything, returns nothing
    vsmap       Volatile sorted hash map
    vcmap       Volatile concurrent hash map
    cmap        Concurrent hash map
    tree3       Persistent B[...] tree
    stree       Sorted persistent [...]
    [...]map    Persistent sorted map based on concurrent skiplist (blocking erase)
    radix       Persistent sorted single-threaded prefix tree

The pmemkv [...] interfaces are as follows:
1. pmemkv_config *cfg = pmemkv_config_new();
2. pmemkv_config_put_string(cfg, "path", path)
3. pmemkv_config_put_uint64(cfg, "force_create", fcreate)
4. pmemkv_config_put_uint64(cfg, "size", size)
5. pmemkv_open("cmap", cfg, &db);
6. pmemkv_put(db, str, strlen(str), (const char *)content, PAGE_SIZE);
7. pmemkv_get(db, str, strlen(str), pmemkv_get_value, &val);

    [#include directives lost in extraction]
    #define PAGE_SIZE [...]
    #define PMEM_SIZE [...]*1024UL[...]
    pmemkv_db *db = NULL;

    [...]
    {
        pmemkv_config *cfg = pmemkv_config_new();
        [...]
        if (pmemkv_config_put_string(cfg, "path", path) != PMEMKV_STATUS_OK) {
            [...]
            return NULL;
        }
        [...]
            return NULL;
        }
        [...]
            return NULL;
        }
        return cfg;
    }

    [...]
    {
        int s;
        pmemkv_config *cfg = NULL;
        [...]
        } else {
            [...]
        }
        // cmap open with the cfg.
        [...]
        return 0;
    }

    int write_req(uint64_t req_id, unsigned char *content) {
        int s;
        [...] req_id [...] string [...] "%ld", req_id[...]
        s = [...] PMEMKV_STATUS_OK [...]
        return 0;
    }

    [...]
    {
        size_t cnt;
        int s;
        [...]
        return cnt;
    }

    [...] value [...] valuebytes [...] unsigned char *val [...] unsigned char *)value [...]
        // callback function, mainly for atomicity: outside of the callback function you might not get
        // the real value content, or it might be changed by another thread
        [...]
        return;
    }

    unsigned char *read_req(uint64_t req_id, unsigned char *val) {
        int s;
        [...] "%ld", req_id[...]
        // val is a heap buffer of 4096 bytes.
        [...]
        return val;
    }

    [...]
        return;
    }

    #define WRITE_COUNT [...]
    #define OVERWRITE_COUNT [...]

    [...]
    {
        [...] cnt [...] i;
        auto start = std::chrono::steady_clock::now();
        auto stop = std::chrono::steady_clock::now();
        std::chrono::duration<double> diff = stop - start;
        unsigned char *read_content;
        memset(page_content, 0xab, 4096);

        start = std::chrono::steady_clock::now();
        [...]init[...]("/mnt/pmem[...]/cbs_pmemkv"[...]);
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        [...]
        std::cout << "kv count " << cnt << std::endl;

        start = std::chrono::steady_clock::now();
        for (i = 0; i < WRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        [...]

        memset(page_content, 0xcd, 4096);
        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            write_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite write_req update take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = 0; i < OVERWRITE_COUNT; i++) {
            [...]ent = read_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite read_req take time " [...]

        start = std::chrono::steady_clock::now();
        for (i = OVERWRITE_COUNT; i < WRITE_COUNT; i++) {
            [...]ent = read_req(i, page_content);
        }
        stop = std::chrono::steady_clock::now();
        diff = stop - start;
        std::cout << "overwrite~write_count read_req take time " << diff.count()/(WRITE_COUNT-OVERWRITE_COUNT) << std::endl;
        [...]
        start = std::chrono::steady_clock::now();
        // for (i = 0; i < WRITE_COUNT; i++) { [...]
        stop = std::chrono::steady_clock::now();
        std::cout << "delete write_count take time " << diff.count()/WRITE_COUNT << std::endl;
        return 0;
    }

[...] “g++ cbs_req_pmemkv.cpp -o cbs_req_pmemkv -lpmemkv -O2”, then use “taskset -c [...] -O /mnt/pmem0/cbs_pmemkv | more” to observe the size of the object space.

    ~ taskset -c 2 ./cbs_req_pmemkv
    [...] init time [...]
    [...] write_req time [...]53356e-05
    [...] write_req update take time 1.2625e-05
    [...] read_req take time [...].17473e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 3.36632e-06
    the page should fill with patern 0xab: 0xab

    ~ taskset -c 2 ./cbs_req_pmemkv
    [...] init time [...]
    write_req time 9.77862e-06
    overwrite write_req update take time 9.67322e-06
    overwrite read_req take time 2.1298e-06
    the page should fill with paten 0xcd: 0xcd
    overwrite~write_count read_req take time 2.09206e-06
    the page should fill with patern 0xab: 0xab

Call For Action
1. [...]/pmem/pmemkv/tree/master/examples
2. https://pmem.io/pmemkv/[...]

Tips
[...]mkv.

[memkind]

[...] was created for unified allocation functions: http://memkind.github.io/memkind/memkind_arch_20150318.pdf.

The file behind memkind_create_pmem() is created with the tmpfile function, so it does not show up in the directory in which it is created, and after the program exits [...]

[...] MEMKIND_HOG_MEMORY” avoids this [...], so that the page fault does not occur again. How to [...] in advance [...] the page-fault cost of the [...] access is still under discussion.

    [...]d *addr, size_t size, bool committed, [...]ed arena_ind[...]
    {
        bool result = true;
        if (memkind_get_hog_memory()) {
            return result;
        }
        [...]
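Looking back at Example 1 (the libpmem page cache), the request-ID split (req_id >> 10, req_id & 0x3ff) and the pool-layout arithmetic behind PMEM_PAGE_NUMBERS and PAGE_FROM_META can be sketched on plain integers, without touching persistent memory. This is only an illustration under stated assumptions: the constant values, the 16-byte header size, and every function name below are hypothetical stand-ins, because the original #define values were partly lost; offsets are computed relative to the pool start, which matches the address-based alignment in the example only if the mapping itself is 4 KiB aligned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants mirroring Example 1 (the original values were partly lost).
constexpr std::uint64_t kPageSize    = 4096;   // one cached page
constexpr std::uint64_t kMetaSize    = 8;      // one 8-byte metadata word per page
constexpr std::uint64_t kHeaderSize  = 16;     // assumed: magic number + page offset
constexpr unsigned      kBlockShift  = 10;     // req_id >> 10 selects the block
constexpr std::uint64_t kPageMask    = 0x3ff;  // req_id & 0x3ff selects the page in the block

struct RequestId {
    std::uint64_t block_id;
    std::uint64_t page_in_block;
};

// Split a request ID into (block, page-in-block), as the recovery loop does.
constexpr RequestId split_request(std::uint64_t req_id) {
    return RequestId{req_id >> kBlockShift, req_id & kPageMask};
}

// Same formula as PMEM_PAGE_NUMBERS: each page costs PAGE_SIZE + META_SIZE bytes,
// and one page is given up as slack for the alignment below.
constexpr std::uint64_t page_count(std::uint64_t pool_size) {
    return (pool_size - kHeaderSize) / (kPageSize + kMetaSize) - 1;
}

// Offset of the first data page: the end of the metadata array, pushed forward by
// one page and rounded down to a 4 KiB boundary (the "& 0xfff...f000" step).
constexpr std::uint64_t first_page_offset(std::uint64_t pool_size) {
    return (kHeaderSize + kMetaSize * page_count(pool_size) + kPageSize) & ~(kPageSize - 1);
}

// Offset of the data page that belongs to metadata slot `slot` (cf. PAGE_FROM_META).
constexpr std::uint64_t page_offset(std::uint64_t pool_size, std::uint64_t slot) {
    return first_page_offset(pool_size) + slot * kPageSize;
}

// Compile-time sanity check for a 1 MiB pool: the last page must end inside the pool.
static_assert(page_offset(1u << 20, page_count(1u << 20) - 1) + kPageSize <= (1u << 20),
              "last page must fit in the pool");
```

Keeping the metadata words in one dense array and the pages in a second, page-aligned array is what lets a single 8-byte store publish or retire a 4 KiB page.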

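The overwrite path of Example 1 writes the new page to a free location first and only then updates the 8-byte metadata, and read_req compares the sn fields when two pages exist for one request. A minimal DRAM-only simulation of that idea follows. It is a sketch, not the handbook's code: the bit widths of PageMeta, the Slot type, both function names, the crash flag, and the rule that the larger sn is the newer copy (wraparound ignored) are all assumptions made for illustration, and plain stores stand in for pmem_memcpy_persist and pmem_persist.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte metadata word; the original field widths were lost.
struct PageMeta {
    std::uint64_t req_id : 31;
    std::uint64_t sn     : 32;  // sequence number; assumed: larger means newer
    std::uint64_t valid  : 1;
};
static_assert(sizeof(PageMeta) == 8, "metadata must fit one 8-byte atomic store");

// One metadata word plus a tiny stand-in for the 4 KiB page.
struct Slot {
    PageMeta meta;
    unsigned char data[16];
};

// Return the index of the newest valid copy of req_id, or -1 if there is none.
int newest_copy(const Slot* slots, int n, std::uint64_t req_id) {
    int best = -1;
    for (int i = 0; i < n; ++i) {
        if (!slots[i].meta.valid || slots[i].meta.req_id != req_id) continue;
        if (best < 0 || slots[i].meta.sn > slots[best].meta.sn) best = i;
    }
    return best;
}

// Overwrite in three steps. `crash_before_retire` simulates power loss after the
// new copy is published but before the old one is retired.
void overwrite(Slot* slots, int old_idx, int free_idx,
               const unsigned char* content, bool crash_before_retire) {
    // Step 1: write the page body into the free slot (not atomic, not yet visible).
    std::memcpy(slots[free_idx].data, content, sizeof slots[free_idx].data);
    // Step 2: publish it with a single 8-byte metadata store carrying a higher sn.
    PageMeta m = slots[old_idx].meta;
    m.sn = m.sn + 1;
    slots[free_idx].meta = m;
    if (crash_before_retire) return;
    // Step 3: retire the old copy by clearing its metadata word.
    slots[old_idx].meta = PageMeta{};
}
```

If the simulated crash hits between steps 2 and 3, both copies remain valid, and picking the higher sequence number still returns the new page; a crash before step 2 leaves the old page as the only valid copy, so a torn page body is never visible.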