Face Recognition Literature Translation (Chinese-English Bilingual)

4 Two-dimensional Face Recognition

4.1 Feature Localization

Before discussing the methods of comparing two facial images, we take a brief look at some of the preliminary processes of facial feature alignment. This process typically consists of two stages: face detection and eye localization. Depending on the application, if the position of the face within the image is known beforehand (for a cooperative subject in a door access system, for example), then the face detection stage can often be skipped, as the region of interest is already known. Therefore, we discuss eye localization here, with a brief discussion of face detection in the literature review.

The eye localization method is used to align the 2D face images of the various test sets used throughout this section. However, to ensure that all results presented are representative of the face recognition accuracy and not a product of the performance of the eye localization routine, all image alignments are manually checked and any errors corrected prior to testing and evaluation.

We detect the position of the eyes within an image using a simple template-based method. A training set of manually pre-aligned images of faces is taken, and each image is cropped to an area around both eyes. The average image is calculated and used as a template.

Figure 4-1 - The average eyes, used as a template for eye detection.

Both eyes are included in a single template, rather than searching for each eye individually in turn, because the characteristic symmetry of the eyes on either side of the nose provides a useful feature that helps distinguish the eyes from false positives that may be picked up in the background. However, this method is highly susceptible to scale (i.e. subject distance from the camera) and also introduces the assumption that the eyes appear near horizontal in the image. Some preliminary experimentation also reveals that it is advantageous to include the area of skin just beneath the eyes. The reason is that in some cases the eyebrows can closely match the template, particularly if there are shadows in the eye sockets, but the area of skin below the eyes helps to distinguish the eyes from the eyebrows (the area just below the eyebrows contains the eyes, whereas the area below the eyes contains only plain skin).

A window is passed over the test image and, at each position, the absolute difference between the window and the average eye image shown above is taken. The area of the image with the lowest difference is taken as the region of interest containing the eyes. Applying the same procedure with smaller templates of the individual left and right eyes then refines each eye position.

This basic template-based method of eye localization, although providing fairly precise localizations, often fails to locate the eyes altogether. However, we are able to improve performance by including a weighting scheme.

Eye localization is performed on the set of training images, which is then separated into two sets: those in which eye detection was successful, and those in which eye detection failed. Taking the set of successful localizations, we compute the average distance from the eye template (Figure 4-2 top). Note that the image is quite dark, indicating that the detected eyes correlate closely with the eye template, as we would expect. However, bright points do occur near the whites of the eyes, suggesting that this area is often inconsistent, varying greatly from the average eye template.

Figure 4-2 - Distance to the eye template for successful detections (top), indicating variance due to noise, and failed detections (bottom), showing credible variance due to misdetected features.

In the lower image (Figure 4-2 bottom), we have taken the set of failed localizations (images of the forehead, nose, cheeks, background etc. falsely detected by the localization routine) and once again computed the average distance from the eye template. The bright pupils surrounded by darker areas indicate that a failed match is often due to the high correlation of the nose and cheekbone regions overwhelming the poorly correlated pupils. Wanting to emphasize the difference of the pupil regions for these failed matches and minimize the variance of the whites of the eyes for successful matches, we divide the lower image values by the upper image to produce a weights vector, as shown in Figure 4-3. When applied to the difference image before summing a total error, this weighting scheme provides a much improved detection rate.

Figure 4-3 - Eye template weights used to give higher priority to those pixels that best represent the eyes.

4.2 The Direct Correlation Approach

We begin our investigation into face recognition with perhaps the simplest approach, known as the direct correlation method (also referred to as template matching by Brunelli and Poggio), which involves the direct comparison of pixel intensity values taken from facial images. We use the term "Direct Correlation" to encompass all techniques in which face images are compared directly, without any form of image space analysis, weighting schemes or feature extraction, regardless of the distance metric used. Therefore, we do not imply that Pearson's correlation is applied as the similarity function (although such an approach would obviously come under our definition of direct correlation). We typically use the Euclidean distance as our metric in these investigations (it is inversely related to Pearson's correlation and can be considered a scale- and translation-sensitive form of image correlation), as this persists with the contrast made between image space and subspace approaches in later sections.

Firstly, all facial images must be aligned such that the eye centers are located at two specified pixel coordinates, and the image cropped to remove any background information. These images are stored as grayscale bitmaps of 65 by 82 pixels and, prior to recognition, converted into a vector of 5330 elements (each element containing the corresponding pixel intensity value). Each corresponding vector can be thought of as describing a point within a 5330-dimensional image space. This simple principle can easily be extended to much larger images: a 256 by 256 pixel image occupies a single point in 65,536-dimensional image space and, again, similar images occupy close points within that space. Likewise, similar faces are located close together within the image space, while dissimilar faces are spaced far apart. Calculating the Euclidean distance d between two facial image vectors (often referred to as the query image q and gallery image g), we get an indication of similarity.
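As a minimal sketch of the distance calculation described above, the following Python flattens two 65 by 82 pixel images into 5330-element vectors and computes the Euclidean distance d between them. The function names (to_vector, euclidean_distance) and the two synthetic uniform-grey images are assumptions made for illustration; they do not come from the original text.

```python
import math

WIDTH, HEIGHT = 65, 82  # image size stated in the text (65 x 82 = 5330 pixels)


def to_vector(image):
    """Flatten a HEIGHT x WIDTH grid of pixel intensities into one list."""
    return [pixel for row in image for pixel in row]


def euclidean_distance(q, g):
    """Euclidean distance d between query vector q and gallery vector g."""
    if len(q) != len(g):
        raise ValueError("image vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, g)))


# Two synthetic "images" (hypothetical data): uniform grey levels that
# differ by 3 intensity units at every pixel.
query = to_vector([[100] * WIDTH for _ in range(HEIGHT)])
gallery = to_vector([[103] * WIDTH for _ in range(HEIGHT)])

d = euclidean_distance(query, gallery)
print(len(query))  # number of elements in each image vector
print(d)           # lower values mean more similar images
```

Because every pixel contributes equally to the sum, a small misalignment or a global brightness change moves the point in image space, which is why the text calls this a scale- and translation-sensitive measure.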

A threshold is then applied to make the final verification decision.

4.2.1 Verification Tests

The primary concern in any face recognition system is its ability to correctly verify a claimed identity or determine a person's most likely identity from a set of potential matches in a database. In order to assess a given system's ability to perform these tasks, a variety of evaluation methodologies have arisen. Some of these analysis methods simulate a specific mode of operation (e.g. secure site access or surveillance), while others provide a more mathematical description of data distribution in some classification space. In addition, the results generated from each analysis method may be presented in a variety of formats. Throughout the experimentation in this thesis, we primarily use the verification test as our method of analysis and comparison, although we also use Fisher's Linear Discriminant to analyze individual subspace components in section 7 and the identification test for the final evaluations described in section 8.

The verification test measures a system's ability to correctly accept or reject the proposed identity of an individual. At a functional level, this reduces to two images being presented for comparison, for which the system must return either an acceptance (the two images are of the same person) or a rejection (the two images are of different people). The test is designed to simulate the application area of secure site access. In this scenario, a subject will present some form of identification at a point of entry, perhaps as a swipe card, proximity chip or PIN. This identifier is then used to retrieve a stored image from a database of known subjects (often referred to as the target or gallery image), which is compared with a live image captured at the point of entry (the query image). Access is then granted depending on the acceptance/rejection decision.

The results of the test are calculated according to how many times the accept/reject decision is made correctly. In order to execute this test we must first define our test set of face images. Although the number of images in the test set does not affect the results produced (as the error rates are specified as percentages of image comparisons), it is important to ensure that the test set is sufficiently large that statistical anomalies become insignificant (for example, a couple of badly aligned images matching well). Also, the type of images (high variation in lighting, partial occlusions etc.) will significantly alter the results of the test. Therefore, in order to compare multiple face recognition systems, they must be applied to the same test set.

However, it should also be noted that if the results are to be representative of system performance in a real-world situation, then the test data should be captured under precisely the same circumstances as in the application environment. On the other hand, if the purpose of the experimentation is to evaluate and improve a method of face recognition that may be applied to a range of application environments, then the test data should present the range of difficulties that are to be overcome. This may mean including a greater percentage of difficult images than would be expected in the perceived operating conditions, and hence higher error rates in the results produced.

Below we provide the algorithm for executing the verification test. The algorithm is applied to a single test set of face images, using a single function call to the face recognition algorithm: CompareFaces(FaceA, FaceB). This call is used to compare two facial images, returning a distance score indicating how dissimilar the two face images are: the lower the score, the more similar the two face images. Ideally, images of the same face should produce low scores, while images of different faces should produce high scores.

Every image is compared with every other image; no image is compared with itself and no pair is compared more than once (we assume that the relationship is symmetrical). Once two images have been compared, producing a distance score, the ground truth is used to determine if the images are of the same person or different people.
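The pairing rule above (every image against every other image, no self-comparisons, no pair repeated) can be sketched in Python as follows. This is an illustrative sketch only: the function names, the list of person identifiers and the four-image example are hypothetical, and the call to CompareFaces is left out because its implementation is not given in the text.

```python
from itertools import combinations


def count_comparisons(n_images):
    """Number of unordered pairs of distinct images: n * (n - 1) / 2."""
    return n_images * (n_images - 1) // 2


def label_pairs(person_ids):
    """Yield (i, j, same_person) for every unordered pair of distinct images.

    person_ids[k] is the ground-truth identity of image k (hypothetical input).
    """
    for i, j in combinations(range(len(person_ids)), 2):
        yield i, j, person_ids[i] == person_ids[j]


# Hypothetical ground truth: images 0 and 1 show the same person.
person_ids = ["p1", "p1", "p2", "p3"]
pairs = list(label_pairs(person_ids))

print(len(pairs))              # 6 pairs for 4 images
print(pairs[0])                # (0, 1, True)
print(count_comparisons(720))  # 258840
```

For a test set of 720 images the same formula gives 720 x 719 / 2 = 258,840 comparisons, which matches the number of verification operations quoted for the dataset of section 4.2.2.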

In practical tests this information is often encapsulated as part of the image filename (by means of a unique person identifier). Scores are then stored in one of two lists: a list containing scores produced by comparing images of different people and a list containing scores produced by comparing images of the same person. The final acceptance/rejection decision is made by application of a threshold. Any incorrect decision is recorded as either a false acceptance or a false rejection. The false rejection rate (FRR) is calculated as the percentage of scores from the same people that were classified as rejections. The false acceptance rate (FAR) is calculated as the percentage of scores from different people that were classified as acceptances.

These two error rates express the inadequacies of the system when operating at a specific threshold value. Ideally, both these figures should be zero, but in reality reducing either the FAR or FRR (by altering the threshold value) will inevitably result in increasing the other. Therefore, in order to describe the full operating range of a particular system, we vary the threshold value through the entire range of scores produced. The application of each threshold value produces an additional (FAR, FRR) pair, which when plotted on a graph produces the error rate curve shown below.

Figure 4-5 - Example error rate curve produced by the verification test.

The equal error rate (EER) can be seen as the point at which FAR is equal to FRR. This EER value is often used as a single figure representing the general recognition performance of a biometric system and allows for easy visual comparison of multiple methods. However, it is important to note that the EER does not indicate the level of error that would be expected in a real-world application. It is unlikely that any real system would use a threshold value such that the percentage of false acceptances was equal to the percentage of false rejections. Secure site access systems would typically set the threshold such that false acceptances were significantly lower than false rejections: unwilling to tolerate intruders at the cost of inconvenient access denials.

Surveillance systems, on the other hand, would require low false rejection rates to successfully identify people in a less controlled environment. Therefore we should bear in mind that a system with a lower EER might not necessarily be the better performer towards the extremes of its operating capability.

There is a strong connection between the above graph and the receiver operating characteristic (ROC) curves also used in such experiments. Both graphs are simply two visualizations of the same results, in that the ROC format uses the True Acceptance Rate (TAR), where TAR = 1.0 - FRR, in place of the FRR, effectively flipping the graph vertically. Another visualization of the verification test results is to display both the FRR and FAR as functions of the threshold value. This presentation format provides a reference to determine the threshold value necessary to achieve a specific FRR and FAR. The EER can be seen as the point where the two curves intersect.

Figure 4-6 - Example error rate curve as a function of the score threshold.

The fluctuation of these error curves due to noise and other errors is dependent on the number of face image comparisons made to generate the data. A small dataset that only allows for a small number of comparisons will result in a jagged curve, in which large steps correspond to the influence of a single image on a high proportion of the comparisons made. A typical dataset of 720 images (as used in section 4.2.2) provides 258,840 verification operations, hence a drop of 1% EER represents an additional 2588 correct decisions, whereas the quality of a single image could cause the EER to fluctuate by up to ...
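The FAR, FRR and EER calculations of the verification test can be sketched in Python as below. This is a sketch under stated assumptions, not the thesis's implementation: a comparison is accepted when its distance score is less than or equal to the threshold (the text only says that lower scores mean more similar faces), the EER is approximated at the swept threshold where FAR and FRR are closest, and the two score lists are made-up values.

```python
def error_rates(same_scores, diff_scores, threshold):
    """Return (FAR, FRR) as percentages at one threshold.

    Assumption: a comparison is accepted when score <= threshold.
    """
    false_rejects = sum(1 for s in same_scores if s > threshold)
    false_accepts = sum(1 for s in diff_scores if s <= threshold)
    frr = 100.0 * false_rejects / len(same_scores)
    far = 100.0 * false_accepts / len(diff_scores)
    return far, frr


def error_curve(same_scores, diff_scores):
    """Sweep the threshold over every score produced; one (t, FAR, FRR) each."""
    thresholds = sorted(set(same_scores) | set(diff_scores))
    return [(t, *error_rates(same_scores, diff_scores, t)) for t in thresholds]


def approximate_eer(curve):
    """Pick the swept threshold where FAR and FRR are closest."""
    t, far, frr = min(curve, key=lambda point: abs(point[1] - point[2]))
    return t, (far + frr) / 2.0


# Hypothetical distance scores for same-person and different-person pairs.
same_scores = [1.0, 2.0, 3.0, 6.0]
diff_scores = [4.0, 5.0, 7.0, 8.0]

curve = error_curve(same_scores, diff_scores)
threshold, eer = approximate_eer(curve)
print(threshold, eer)  # 4.0 25.0

# The ROC form uses the true acceptance rate, TAR = 100% - FRR.
far, frr = error_rates(same_scores, diff_scores, threshold)
print(100.0 - frr)     # 75.0
```

With only eight scores each decision moves an error rate by 25 percentage points, which illustrates the jagged curves the text attributes to small datasets.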
