pthread: Computation Partitioning and Task Allocation
pthread programming: task granularity and allocation (scheduling) strategy, and reducing the write-write (W-W) conflicts between processors caused by reduction operations. Threads created with pthread are scheduled onto concrete processors automatically by the operating system; in addition, pthread provides a set of synchronization primitives for coordinating progress between threads.

The main thread-management functions pthread provides are:

pthread_create - create a new thread
pthread_exit - terminate calling thread
pthread_join - wait for thread termination
pthread_timedjoin_np - try to join with a terminated thread, with a timeout
pthread_tryjoin_np - try to join with a terminated thread
pthread_detach - detach a thread
pthread_equal - compare thread IDs
pthread_kill - send a signal to a thread
pthread_kill_other_threads_np - terminate all other threads in process
pthread_self - get the calling thread ID

pthread provides three thread-synchronization mechanisms: locks, semaphores, and condition variables. The lock mechanism implements mutual exclusion between threads; pthread supplies mutexes (mutex), spin locks (spin lock), and read-write locks (read-write lock). A thread enters the mutual-exclusion region through a "lock" operation and leaves it through an "unlock" operation.

pthread_mutex_init - initialize a mutex
pthread_mutex_destroy - destroy a mutex
pthread_mutex_lock - lock a mutex
pthread_mutex_trylock - try to lock a mutex without blocking
pthread_mutex_timedlock - lock a mutex, with a timeout
pthread_mutex_unlock - unlock a mutex

pthread_spin_init - initialize a spin lock object
pthread_spin_destroy - destroy a spin lock object
pthread_spin_lock - lock a spin lock object
pthread_spin_trylock - try to lock a spin lock object without spinning
pthread_spin_unlock - unlock a spin lock object

pthread_rwlock_init - initialize a read-write lock object
pthread_rwlock_destroy - destroy a read-write lock object
pthread_rwlock_rdlock - lock a read-write lock object for reading
pthread_rwlock_tryrdlock - try to lock a read-write lock object for reading
pthread_rwlock_timedrdlock - lock a read-write lock for reading, with a timeout
pthread_rwlock_wrlock - lock a read-write lock object for writing
pthread_rwlock_trywrlock - try to lock a read-write lock object for writing
pthread_rwlock_timedwrlock - lock a read-write lock for writing, with a timeout

sem_init - initialize an unnamed semaphore
sem_destroy - destroy an unnamed semaphore
sem_getvalue - get the value of a semaphore
sem_wait - lock a semaphore
sem_trywait - try to lock a semaphore without blocking
sem_timedwait - lock a semaphore, with a timeout
sem_post - unlock a semaphore

pthread_cond_init - initialize a condition variable
pthread_cond_destroy - destroy a condition variable
pthread_cond_wait - wait on a condition
pthread_cond_timedwait - wait on a condition, with a timeout
pthread_cond_signal - signal a condition
pthread_cond_broadcast - broadcast a condition

When parallel tasks execute a reduction operation, they must enter a mutual-exclusion region to avoid write-write conflicts. In pthread code, GCC's __sync atomic builtins offer a lighter alternative; the placeholder type below stands for int, unsigned int, long, unsigned long, long long or unsigned long long (the __sync prefixes were stripped in the source and are restored here):

type __sync_fetch_and_add(type *ptr, type value);
type __sync_fetch_and_sub(type *ptr, type value);
type __sync_fetch_and_or(type *ptr, type value);
type __sync_fetch_and_and(type *ptr, type value);
type __sync_fetch_and_xor(type *ptr, type value);
type __sync_fetch_and_nand(type *ptr, type value);
type __sync_add_and_fetch(type *ptr, type value);
type __sync_sub_and_fetch(type *ptr, type value);
type __sync_or_and_fetch(type *ptr, type value);
type __sync_and_and_fetch(type *ptr, type value);
type __sync_xor_and_fetch(type *ptr, type value);
type __sync_nand_and_fetch(type *ptr, type value);
bool __sync_bool_compare_and_swap(type *ptr, type oldval, type newval); - write newval if *ptr == oldval
type __sync_val_compare_and_swap(type *ptr, type oldval, type newval); - same, returning the old value
void __sync_synchronize(void); - full memory barrier
type __sync_lock_test_and_set(type *ptr, type value);
void __sync_lock_release(type *ptr); - set *ptr to 0

Consider solving P(N), the set of primes not exceeding N, on a BSP machine. Suppose P(N0) has already been solved, K0 is the largest prime found so far, and M = min(N+1, K0^2); the current superstep then searches the natural numbers in the interval (N0, M) and completes the solution of P(M). Following the BSP structure and its memory-access model, each processor holds a replica r_P(N0) of the solution P(N0), and the locally found primes l_P(M) are cached in the write buffer. In the data-exchange phase of the current superstep, each processor sends the l_P(M) in its local write buffer to every other processor, then merges the prime sets received from the other processors with its local l_P(M) and r_P(N0) to obtain r_P(M).

Each BSP processor therefore executes (reconstructed from the damaged pseudocode in the source):

    r_P(N0) = {2, 3}; N0 = K0 = 3;
    while (K0^2 < N) {
        M = min(N+1, K0^2);
        compute l_P(M) from r_P(N0), partitioning (N0, M) dynamically;
        send l_P(M) to every other processor;
        merge the received prime sets and the local l_P(M) into r_P(N0);
        K0 = max{ k : k in r_P(N0) };  N0 = M;
    }

In the pthread realization, the address space consists of a global space plus per-thread local spaces, and each thread is responsible for the data exchange between its local space and the global space. Implementing the parallel algorithm for P(N) with pthread requires solving three problems: the storage management of P(N0) and l_P(M); the dynamic determination of M and of the task partition; and the synchronization between supersteps. The number of elements in P(N0) and in l_P(M) is not known in advance. In every superstep each thread needs to read P(N0) before it starts searching, and after finishing the search must merge the primes it found, l_P(M), into P(N0), which may require reallocating P(N0) into a larger buffer. While searching the interval (N0, M), the amount of integer-division work per candidate number cannot be predicted, so dynamic task partitioning is appropriate. Only after every thread has merged its l_P(M) into P(N0) can K0 be updated and the next superstep begin; the master thread computes K0 and then synchronizes with the worker threads.

Implementation 1 (mtx): a global array vecPrime holds all primes found so far and is guarded by a mutex mtx; each thread accumulates results locally up to a certain granularity before merging them into the global vecPrime. Implementation 2 (atomic): chunks of the search interval are claimed with atomic operations instead of the mutex; in a further variant, the primes in the global vecPrime are first copied into each thread's local vecPrime, so that during the superstep each thread reads only local data.
#include <stdio.h>
#include <string.h>
#include <time.h>
//#include <math.h>
#include <stdlib.h>
#include <pthread.h>

/* NOTE: this listing is reconstructed from a damaged scan. The names nMaxNum
   (capacity counter), END_NUM (termination sentinel), superstep (handshake
   generation counter), the master function names/signatures, the two #define
   values, and the exact placement of the mutex operations around the lost
   fragments are editorial assumptions. */
#define BLOCK_SIZE     1024     /* growth step of the prime buffers */
#define Max_Thread_Num 64
#define END_NUM        (-1)     /* lbound == END_NUM tells workers to exit */

typedef struct {
    pthread_t id;
    long int *vecPrime;         /* this thread's prime buffer */
    long int  nPrime;           /* primes currently stored */
    long int  nMaxNum;          /* current capacity of vecPrime */
} ThreadStatus;

long int *vecPrime, nPrime, nMaxNum;    /* global prime table */
long int  lbound, ubound, task_size;    /* superstep interval and chunk size */
int       thread_num, totalThread, superstep;
ThreadStatus threads[Max_Thread_Num];
pthread_mutex_t mtx;
pthread_cond_t  cMaster, cWorker;       /* master->workers / workers->master */

/* Serial baseline: trial division over the odd numbers, growing vecPrime in
   BLOCK_SIZE steps. */
void serial(long int arg)
{
    long int i, j, k, *temp;

    nMaxNum = BLOCK_SIZE;
    vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
    vecPrime[0] = 2; vecPrime[1] = 3;
    nPrime = 2;
    lbound = 5;
    while (lbound < arg) {
        /* numbers below p_max^2 need only the primes already found */
        ubound = vecPrime[nPrime - 1] * vecPrime[nPrime - 1];
        if (ubound < 0 || ubound > arg) ubound = arg;   /* overflow or target */
        j = nPrime;
        for (i = lbound; i < ubound; i += 2) {
            for (k = 1; k < j; k++)
                if (i % vecPrime[k] == 0) break;
            if (k < j) continue;                        /* composite */
            if (nPrime == nMaxNum) {                    /* grow the table */
                nMaxNum += BLOCK_SIZE;
                temp = vecPrime;
                vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
                memcpy(vecPrime, temp, (nMaxNum - BLOCK_SIZE) * sizeof(long int));
                free(temp);
            }
            vecPrime[nPrime++] = i;
        }
        lbound = ubound + 2;
    }
}

int cmpLongInt(const void *p1, const void *p2)
{
    long int val1 = *((long int *)p1), val2 = *((long int *)p2);
    if (val1 < val2) return -1;
    if (val1 == val2) return 0;
    return 1;
}

/* Implementation 1 (mtx): workers claim chunks of (lbound, ubound) under the
   mutex, search them unlocked, and keep found primes in a private buffer; the
   master merges all private buffers after the superstep. */
void *mtx_worker(void *arg)
{
    long int i, j, k, *temp;
    long int loc_lbound, loc_ubound;
    int my_step = 0;
    ThreadStatus *pMyStatus = threads;

    (void)arg;
    while (pMyStatus->id != pthread_self()) pMyStatus++;  /* find own record */
    pMyStatus->nMaxNum = BLOCK_SIZE;
    pMyStatus->vecPrime = (long int *)malloc(BLOCK_SIZE * sizeof(long int));

    pthread_mutex_lock(&mtx);
    for (;;) {
        totalThread++;                       /* check in with the master */
        pthread_cond_signal(&cWorker);
        while (superstep == my_step)         /* wait for the next superstep */
            pthread_cond_wait(&cMaster, &mtx);
        my_step = superstep;
        if (lbound == END_NUM) break;
        pMyStatus->nPrime = 0;
        j = nPrime;                          /* trial divisors known so far */
        for (;;) {
            loc_lbound = lbound;             /* claim a chunk (mtx held) */
            lbound += task_size;
            if (loc_lbound >= ubound) break;
            loc_ubound = loc_lbound + task_size;
            if (loc_ubound > ubound) loc_ubound = ubound;
            pthread_mutex_unlock(&mtx);      /* search the chunk unlocked */
            for (i = loc_lbound; i < loc_ubound; i += 2) {
                for (k = 1; k < j; k++)
                    if (i % vecPrime[k] == 0) break;
                if (k < j) continue;
                if (pMyStatus->nPrime == pMyStatus->nMaxNum) {   /* grow */
                    temp = pMyStatus->vecPrime;
                    pMyStatus->nMaxNum += BLOCK_SIZE;
                    pMyStatus->vecPrime =
                        (long int *)malloc(pMyStatus->nMaxNum * sizeof(long int));
                    memcpy(pMyStatus->vecPrime, temp,
                           pMyStatus->nPrime * sizeof(long int));
                    free(temp);
                }
                pMyStatus->vecPrime[pMyStatus->nPrime++] = i;
            }
            pthread_mutex_lock(&mtx);
        }
    }
    pthread_mutex_unlock(&mtx);
    return (void *)0;
}

void mtx_master(long int arg, int nthreads)
{
    long int i, *temp;

    nMaxNum = BLOCK_SIZE;
    vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
    vecPrime[0] = 2; vecPrime[1] = 3;
    nPrime = 2;
    lbound = 5;
    thread_num = nthreads;
    if (thread_num > Max_Thread_Num) thread_num = Max_Thread_Num;
    totalThread = 0;
    superstep = 0;
    pthread_mutex_init(&mtx, NULL);
    pthread_cond_init(&cMaster, NULL);
    pthread_cond_init(&cWorker, NULL);
    for (i = 0; i < thread_num; i++)
        pthread_create(&(threads[i].id), NULL, mtx_worker, NULL);

    pthread_mutex_lock(&mtx);
    while (totalThread != thread_num)        /* wait for all check-ins */
        pthread_cond_wait(&cWorker, &mtx);
    while (lbound < arg) {
        ubound = vecPrime[nPrime - 1] * vecPrime[nPrime - 1];
        if (ubound < 0 || ubound > arg) ubound = arg;
        /* roughly ten chunks per thread, even-sized, with a minimum size */
        task_size = (ubound - lbound) / (10 * thread_num);
        if (task_size < 10)
            task_size = (ubound - lbound + thread_num - 1) / thread_num;
        if (task_size % 2 == 1) task_size++;
        totalThread = 0;
        superstep++;
        pthread_cond_broadcast(&cMaster);    /* open the superstep */
        while (totalThread != thread_num)    /* wait until all are done */
            pthread_cond_wait(&cWorker, &mtx);
        /* merge every thread's primes into the global table */
        for (i = 0; i < thread_num; i++) {
            if (threads[i].nPrime == 0) continue;
            if (nPrime + threads[i].nPrime > nMaxNum) {
                temp = vecPrime;
                while (nPrime + threads[i].nPrime > nMaxNum)
                    nMaxNum += BLOCK_SIZE;
                vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
                memcpy(vecPrime, temp, nPrime * sizeof(long int));
                free(temp);
            }
            memcpy(vecPrime + nPrime, threads[i].vecPrime,
                   threads[i].nPrime * sizeof(long int));
            nPrime += threads[i].nPrime;
        }
        qsort(vecPrime, nPrime, sizeof(long int), cmpLongInt);
        lbound = ubound + 2;
    }
    lbound = END_NUM;                        /* release the workers */
    superstep++;
    pthread_cond_broadcast(&cMaster);
    pthread_mutex_unlock(&mtx);
    for (i = 0; i < thread_num; i++)
        pthread_join(threads[i].id, NULL);
}

/* Implementation 2 (atomic): same structure as mtx_worker, but chunks of the
   interval are claimed with the GCC builtin __sync_fetch_and_add instead of
   the mutex, so mtx is only touched at the superstep boundaries.
   (Reconstructed from the damaged scan; see the note above the globals.) */
void *atomic_worker(void *arg)
{
    long int i, j, k, *temp;
    long int loc_lbound, loc_ubound;
    int my_step = 0;
    ThreadStatus *pMyStatus = threads;

    (void)arg;
    while (pMyStatus->id != pthread_self()) pMyStatus++;
    pMyStatus->nMaxNum = BLOCK_SIZE;
    pMyStatus->vecPrime = (long int *)malloc(BLOCK_SIZE * sizeof(long int));

    pthread_mutex_lock(&mtx);
    for (;;) {
        totalThread++;
        pthread_cond_signal(&cWorker);
        while (superstep == my_step)
            pthread_cond_wait(&cMaster, &mtx);
        my_step = superstep;
        if (lbound == END_NUM) break;
        pMyStatus->nPrime = 0;
        j = nPrime;
        pthread_mutex_unlock(&mtx);          /* the whole search runs unlocked */
        for (;;) {
            /* claim a chunk atomically */
            loc_lbound = __sync_fetch_and_add(&lbound, task_size);
            if (loc_lbound >= ubound) break;
            loc_ubound = loc_lbound + task_size;
            if (loc_ubound > ubound) loc_ubound = ubound;
            for (i = loc_lbound; i < loc_ubound; i += 2) {
                for (k = 1; k < j; k++)
                    if (i % vecPrime[k] == 0) break;
                if (k < j) continue;
                if (pMyStatus->nPrime == pMyStatus->nMaxNum) {   /* grow */
                    temp = pMyStatus->vecPrime;
                    pMyStatus->nMaxNum += BLOCK_SIZE;
                    pMyStatus->vecPrime =
                        (long int *)malloc(pMyStatus->nMaxNum * sizeof(long int));
                    memcpy(pMyStatus->vecPrime, temp,
                           pMyStatus->nPrime * sizeof(long int));
                    free(temp);
                }
                pMyStatus->vecPrime[pMyStatus->nPrime++] = i;
            }
        }
        pthread_mutex_lock(&mtx);
    }
    pthread_mutex_unlock(&mtx);
    return (void *)0;
}

/* The master for implementation 2 is repeated verbatim in the source; it is
   identical to the mutex version's master except for the worker it creates. */
void atomic_master(long int arg, int nthreads)
{
    long int i, *temp;

    nMaxNum = BLOCK_SIZE;
    vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
    vecPrime[0] = 2; vecPrime[1] = 3;
    nPrime = 2;
    lbound = 5;
    thread_num = nthreads;
    if (thread_num > Max_Thread_Num) thread_num = Max_Thread_Num;
    totalThread = 0;
    superstep = 0;
    pthread_mutex_init(&mtx, NULL);
    pthread_cond_init(&cMaster, NULL);
    pthread_cond_init(&cWorker, NULL);
    for (i = 0; i < thread_num; i++)
        pthread_create(&(threads[i].id), NULL, atomic_worker, NULL);

    pthread_mutex_lock(&mtx);
    while (totalThread != thread_num)
        pthread_cond_wait(&cWorker, &mtx);
    while (lbound < arg) {
        ubound = vecPrime[nPrime - 1] * vecPrime[nPrime - 1];
        if (ubound < 0 || ubound > arg) ubound = arg;
        task_size = (ubound - lbound) / (10 * thread_num);
        if (task_size < 10)
            task_size = (ubound - lbound + thread_num - 1) / thread_num;
        if (task_size % 2 == 1) task_size++;
        totalThread = 0;
        superstep++;
        pthread_cond_broadcast(&cMaster);
        while (totalThread != thread_num)
            pthread_cond_wait(&cWorker, &mtx);
        for (i = 0; i < thread_num; i++) {
            if (threads[i].nPrime == 0) continue;
            if (nPrime + threads[i].nPrime > nMaxNum) {
                temp = vecPrime;
                while (nPrime + threads[i].nPrime > nMaxNum)
                    nMaxNum += BLOCK_SIZE;
                vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
                memcpy(vecPrime, temp, nPrime * sizeof(long int));
                free(temp);
            }
            memcpy(vecPrime + nPrime, threads[i].vecPrime,
                   threads[i].nPrime * sizeof(long int));
            nPrime += threads[i].nPrime;
        }
        qsort(vecPrime, nPrime, sizeof(long int), cmpLongInt);
        lbound = ubound + 2;
    }
    lbound = END_NUM;
    superstep++;
    pthread_cond_broadcast(&cMaster);
    pthread_mutex_unlock(&mtx);
    for (i = 0; i < thread_num; i++)
        pthread_join(threads[i].id, NULL);
}

/* Third variant (dup): each thread keeps a private copy of the prime table in
   pMyStatus->vecPrime, so the inner loop reads no shared data; newly found
   primes collect in a stack buffer loc_vecPrime and are flushed into the
   global table. The refresh of the private copy and the locking around the
   flushes were lost in the scan and are reconstructed here. */
void *dup_worker(void *arg)
{
    long int i, j, k, *temp, loc_vecPrime[BLOCK_SIZE];
    long int loc_lbound, loc_ubound, loc_nPrime;
    int my_step = 0;
    ThreadStatus *pMyStatus = threads;

    (void)arg;
    while (pMyStatus->id != pthread_self()) pMyStatus++;

    pthread_mutex_lock(&mtx);
    for (;;) {
        totalThread++;
        pthread_cond_signal(&cWorker);
        while (superstep == my_step)
            pthread_cond_wait(&cMaster, &mtx);
        my_step = superstep;
        if (lbound == END_NUM) break;
        /* refresh the private copy of the global table (reconstructed) */
        free(pMyStatus->vecPrime);
        pMyStatus->vecPrime = (long int *)malloc(nPrime * sizeof(long int));
        memcpy(pMyStatus->vecPrime, vecPrime, nPrime * sizeof(long int));
        j = 2;
        loc_nPrime = 0;
        pthread_mutex_unlock(&mtx);
        for (;;) {
            loc_lbound = __sync_fetch_and_add(&lbound, task_size);
            if (loc_lbound >= ubound) break;
            loc_ubound = loc_lbound + task_size;
            if (loc_ubound > ubound) loc_ubound = ubound;
            for (i = loc_lbound; i < loc_ubound; i += 2) {
                /* extend the divisor range as the candidates grow */
                while (i > pMyStatus->vecPrime[j - 1] * pMyStatus->vecPrime[j - 1])
                    j++;
                for (k = 1; k < j; k++)
                    if (i % pMyStatus->vecPrime[k] == 0) break;
                if (k < j) continue;
                if (loc_nPrime == BLOCK_SIZE) {
                    /* local buffer full: flush it into the global table */
                    pthread_mutex_lock(&mtx);
                    if (nPrime + BLOCK_SIZE > nMaxNum) {
                        temp = vecPrime;
                        nMaxNum += BLOCK_SIZE;
                        vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
                        memcpy(vecPrime, temp, nPrime * sizeof(long int));
                        free(temp);
                    }
                    memcpy(vecPrime + nPrime, loc_vecPrime,
                           BLOCK_SIZE * sizeof(long int));
                    nPrime += BLOCK_SIZE;
                    pthread_mutex_unlock(&mtx);
                    loc_nPrime = 0;
                }
                loc_vecPrime[loc_nPrime++] = i;
            }
        }
        pthread_mutex_lock(&mtx);
        if (loc_nPrime > 0) {                /* flush the remainder */
            if (nPrime + loc_nPrime > nMaxNum) {
                temp = vecPrime;
                nMaxNum += BLOCK_SIZE;
                vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
                memcpy(vecPrime, temp, nPrime * sizeof(long int));
                free(temp);
            }
            memcpy(vecPrime + nPrime, loc_vecPrime,
                   loc_nPrime * sizeof(long int));
            nPrime += loc_nPrime;
        }
    }
    pthread_mutex_unlock(&mtx);
    return (void *)0;
}

void dup_master(long int arg, int nthreads)
{
    long int i;

    nMaxNum = BLOCK_SIZE;
    vecPrime = (long int *)malloc(nMaxNum * sizeof(long int));
    vecPrime[0] = 2; vecPrime[1] = 3;
    nPrime = 2;
    lbound = 5;
    thread_num = nthreads;
    if (thread_num > Max_Thread_Num) thread_num = Max_Thread_Num;
    totalThread = 0;
    superstep = 0;
    pthread_mutex_init(&mtx, NULL);
    pthread_cond_init(&cMaster, NULL);
    pthread_cond_init(&cWorker, NULL);
    for (i = 0; i < thread_num; i++) {
        threads[i].vecPrime = NULL;          /* dup_worker allocates its copy */
        pthread_create(&(threads[i].id), NULL, dup_worker, NULL);
    }
    pthread_mutex_lock(&mtx);
    while (totalThread != thread_num)
        pthread_cond_wait(&cWorker, &mtx);
    while (lbound < arg) {
        ubound = vecPrime[nPrime - 1] * vecPrime[nPrime - 1];
        if (ubound < 0 || ubound > arg) ubound = arg;
        task_size = (ubound - lbound) / (10 * thread_num);
        /* ... the source scan is truncated here; the remainder of dup_master
           (even-sizing of task_size, superstep broadcast/wait, final qsort of
           the directly-flushed global table, thread joins) is lost ... */
        break;
    }
}
