L-Diversity: Privacy Beyond K-Anonymity

Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer, Muthuramakrishnan Venkitasubramaniam

CS295 Data privacy and confidentiality - Virag Kothari

Overview

- Introduction
- Attacks on k-Anonymity
- Bayes Optimal Privacy
- l-Diversity Principle
- l-Diversity Instantiations
- Multiple Sensitive Attributes
- Monotonicity Property
- Utility
- Conclusion

Background

- Large amounts of person-specific data have been collected in recent years
- Both by governments and by private entities
- Data and knowledge extracted by data mining techniques represent a key asset to society
- Analyzing trends and patterns
- Formulating public policies
- Laws and regulations require that some collected data must be made public
- For example, Census data

What About Privacy?

- First thought: anonymize the data
- How?
- Remove "personally identifying information" (PII)
- Name, Social Security number, phone number, email, address…
- Anything that identifies the person directly
- Is this enough?

Re-identification by Linking

Voter registration data:

| Name | Zipcode | Age | Sex |
|---|---|---|---|
| Alice | 47677 | 29 | F |
| Bob | 47983 | 65 | M |
| Carol | 47677 | 22 | F |
| Dan | 47532 | 23 | M |
| Ellen | 46789 | 43 | F |

Microdata (Name is the ID; Zipcode, Age, and Sex form the quasi-identifier, QID; Disease is the sensitive attribute, SA):

| Name (ID) | Zipcode (QID) | Age (QID) | Sex (QID) | Disease (SA) |
|---|---|---|---|---|
| Alice | 47677 | 29 | F | Ovarian Cancer |
| Betty | 47602 | 22 | F | Ovarian Cancer |
| Charles | 47678 | 27 | M | Prostate Cancer |
| David | 47905 | 43 | M | Flu |
| Emily | 47909 | 52 | F | Heart Disease |
| Fred | 47906 | 47 | M | Heart Disease |

Classification of Attributes

- Key attributes: name, address, phone number - uniquely identifying! Always removed before release.
- Quasi-identifiers: (5-digit ZIP code, birth date, gender) uniquely identify 87% of the population in the U.S. Can be used for linking an anonymized dataset with other datasets.
- Sensitive attributes: medical records, salaries, etc. These attributes are what the researchers need, so they are always released directly.

| Name (key attribute) | DOB (quasi-identifier) | Gender (quasi-identifier) | Zipcode (quasi-identifier) | Disease (sensitive attribute) |
|---|---|---|---|---|
| Andre | 1/21/76 | Male | 53715 | Heart Disease |
| Beth | 4/13/86 | Female | 53715 | Hepatitis |
| Carol | 2/28/76 | Male | 53703 | Bronchitis |
| Dan | 1/21/76 | Male | 53703 | Broken Arm |
| Ellen | 4/13/86 | Female | 53706 | Flu |
| Eric | 2/28/76 | Female | 53706 | Hang Nail |

K-Anonymity

- The information for each person contained in the released table cannot be distinguished from at least k-1 individuals whose information also appears in the release.
- Example: you try to identify a man in the released table, but the only information you have is his birth date and gender. There are k men in the table with the same birth date and gender.
- Any quasi-identifier present in the released table must appear in at least k records.

Attacks on K-anonymity

- Homogeneity Attacks
- Background Knowledge Attacks

Homogeneity Attacks

Since Alice is Bob's neighbor, she knows that Bob is a 31-year-old American male who lives in the zip code 13053. Therefore, Alice knows that Bob's record number is 9, 10, 11, or 12. She can also see from the data that Bob has cancer.

Tables on this slide: Original Table, 4-anonymous Table.

Background Knowledge Attacks

Alice knows that Umeko is a 21-year-old Japanese female who currently lives in zip code 13068. Based on this information, Alice learns that Umeko's information is contained in record number 1, 2, 3, or 4. With additional information, namely Umeko being Japanese and Alice knowing that Japanese have an extremely low incidence of heart disease, Alice can conclude with near certainty that Umeko has a viral infection.

Tables on this slide: Original Table, 4-anonymous Table.

Weaknesses in k-anonymous tables

Given these two weaknesses, there needs to be a stronger method to ensure privacy. Based on this, the authors begin to build their solution.

Adversary's Background Knowledge

- The adversary has access to T* and knows it was derived from table T. The domain of each attribute is also known.
- The adversary may also have instance-level background knowledge.
- The adversary may also know demographic background data, such as the probability of a condition given an age.

Bayes-Optimal Privacy

Models background knowledge as a probability distribution over the attributes and uses Bayesian inference techniques to reason about privacy. However, Bayes-Optimal Privacy is only used as a starting point for a definition of privacy, so 2 simplifying assumptions are made:

- T is a simple random sample of a larger population.
- There is a single sensitive attribute.

Prior belief is defined as the adversary's belief, before seeing the published table, that an individual with quasi-identifier value q has sensitive value s: α(q,s) = P_f(t[S] = s | t[Q] = q).

Posterior belief is defined as the same belief after the adversary has seen the published table T*: β(q,s,T*) = P_f(t[S] = s | t[Q] = q ∧ ∃t* ∈ T*, t →* t*).

Prior belief and posterior belief are used to gauge the attacker's success.

Calculating the posterior belief

Privacy Principles

- Positive disclosure: publishing the table T* that was derived from T results in a positive disclosure if the adversary can correctly identify the value of a sensitive attribute with high probability.
- Negative disclosure: publishing the table T* that was derived from T results in a negative disclosure if the adversary can correctly eliminate some possible values of the sensitive attribute (with high probability).

Drawbacks to Bayes-Optimal Privacy

- Insufficient knowledge: the publisher is unlikely to know the full distribution of sensitive and non-sensitive attributes over the full population.
- The data publisher does not know the knowledge of a would-be attacker.
- Instance-level knowledge cannot be modeled.
- There are likely to be many adversaries with varying levels of knowledge.

L-Diversity Principle

Theorem 3.1 defines a method of calculating the observed belief of the adversary. In the case of positive disclosures, Alice wants to determine Bob's sensitive attribute with a very high probability. Given Theorem 3.1, this can only happen when the condition of equation 2 holds. That condition can be satisfied by a lack of diversity in the sensitive attribute(s) and/or strong background knowledge.
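For reference, the result cited as Theorem 3.1 and the condition cited as equation 2 can be written as below. This is a reconstruction in the notation of the l-diversity paper rather than a copy of the slide: $n(q^*, s)$ is assumed to denote the number of tuples in the q*-block with sensitive value $s$, and $f(s \mid q)$ and $f(s \mid q^*)$ the conditional probability of $s$ given the exact and the generalized quasi-identifier.

```latex
% Theorem 3.1: observed (posterior) belief of the adversary
\beta_{(q,s,T^*)} =
  \frac{ n(q^*,s)\,\dfrac{f(s \mid q)}{f(s \mid q^*)} }
       { \sum_{s' \in S} n(q^*,s')\,\dfrac{f(s' \mid q)}{f(s' \mid q^*)} }

% Equation 2: condition under which a positive disclosure occurs
\exists s,\ \forall s' \neq s:\quad
  n(q^*,s')\,\frac{f(s' \mid q)}{f(s' \mid q^*)}
  \;\ll\;
  n(q^*,s)\,\frac{f(s \mid q)}{f(s \mid q^*)}
```

Read this way, the posterior belief in $s$ approaches 1 exactly when the weighted count for $s$ dominates the weighted counts of every other value in the block.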
Lack of diversity in the sensitive attribute is described by equation 3.
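A likely form of that condition, reconstructed from the paper's notation under the assumption that $n(q^*, s)$ is the number of tuples in the q*-block with sensitive value $s$:

```latex
% Equation 3: lack of diversity in a q*-block
\forall s' \neq s:\quad n(q^*, s') \ll n(q^*, s)
```

In words, one sensitive value $s$ accounts for almost every tuple in the block.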
Equation 3 indicates that almost all tuples have the same value for the sensitive attribute, and therefore the posterior belief is almost 1. To ensure diversity and to guard against Equation 3, require that a q*-block has at least l ≥ 2 different sensitive values such that the l most frequent values (in the q*-block) have roughly the same frequency. We say that such a q*-block is well-represented by l sensitive values.

An attacker may still be able to use background knowledge when the following is true for some sensitive value s′: f(s′|q)/f(s′|q*) ≈ 0. This equation states that Bob with quasi-identifier t[Q] = q is much less likely to have sensitive value s′ than any other individual in the q*-block.

Suppose we consider an equivalence class for the example of the background knowledge attack shown earlier. Here Alice has background knowledge that Japanese people are less prone to heart disease.

∴ f(s′|q) = 0 (∵ the probability that Umeko has heart disease given her nonsensitive attribute 'Japanese' is 0). Also, f(s′|q*) = 2/4 ∴ f(s′|q)/f(s′|q*) = 0.
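The effect of such a zero ratio on the adversary's belief can be sketched in a few lines of Python. This is an illustration, not the authors' code: `observed_belief` is a hypothetical helper that weights each count n(q*, s) by the ratio f(s|q)/f(s|q*) and normalizes, following the form of Theorem 3.1 in the paper. The block contents (2 heart disease, 2 viral infection) are assumed from f(s′|q*) = 2/4 and from the conclusion of the attack, and the ratio of 1 for viral infection is a placeholder, since any positive value gives the same result here.

```python
from fractions import Fraction


def observed_belief(counts, ratio):
    """Posterior belief for each sensitive value in one q*-block.

    counts[s]: n(q*, s), the number of tuples in the block with value s.
    ratio[s]:  f(s|q) / f(s|q*), the adversary's background knowledge.
    """
    weights = {s: counts[s] * ratio[s] for s in counts}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("background knowledge rules out every value in the block")
    return {s: w / total for s, w in weights.items()}


# Assumed contents of Umeko's block (records 1-4 of the 4-anonymous table).
counts = {"heart disease": 2, "viral infection": 2}
# Ratio 0 for heart disease is Alice's background knowledge; 1 is a placeholder.
ratio = {"heart disease": Fraction(0), "viral infection": Fraction(1)}

belief = observed_belief(counts, ratio)
print(belief["viral infection"])  # Alice's belief that Umeko has a viral infection
```

With only two sensitive values in the block, a single piece of background knowledge is enough to push the belief in the remaining value to 1.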
Revisiting the example

In spite of such background knowledge, if there are l “well represented” sensitive values in a q*-block, then Alice needs l−1 damaging pieces of background knowledge to eliminate l−1 possible sensitive values and infer a positive disclosure!

L-Diversity Principle

Given the previous discussions, we arrive at the l-Diversity principle: a q*-block is l-diverse if it contains at least l “well represented” values for the sensitive attribute S, and a table is l-diverse if every q*-block is l-diverse.

Revisiting the example

Using a 3-diverse table, we are no longer able to tell if Bob (a 31-year-old American from zip code 13053) has cancer. We also cannot tell if Umeko (a 21-year-old Japanese from zip code 13068) has a viral infection or cancer.

Tables on this slide: 4-anonymous table, 3-diverse table.

Distinct l-Diversity

- Each equivalence class has at least l well-represented sensitive values.
- Doesn't prevent probabilistic inference attacks.
- For example, in an equivalence class of 10 records, 8 records have HIV and 2 records have other values.

L-Diversity Instantiations

- Entropy l-Diversity
- Recursive (c,l)-Diversity
- Positive Disclosure-Recursive (c,l)-Diversity
- Negative/Positive Disclosure-Recursive (c1,c2,l)-Diversity

Entropy l-Diversity

Here every q*-block has at least l distinct values for the sensitive attribute. This implies that for a table to be entropy l-diverse, the entropy of the entire table must be at least log(l). Therefore, entropy l-diversity may be too restrictive to be practical.

Recursive (c,l)-Diversity

- Less restrictive than entropy l-diversity.
- Let s1, …, sm be the possible values of sensitive attribute S in a q*-block.
- Assume we sort the counts n(q*,s1), ..., n(q*,sm) in descending order, with the resulting sequence r1, …, rm.
- We say a q*-block is recursive (c,l)-diverse if r1 < c(rl + … + rm) for a specified constant c. For l = 2 this reads r1 < c(r2 + … + rm).

Positive Disclosure-Recursive (c,l)-Diversity

Some cases of positive disclosure may be acceptable, such as when the medical condition is “healthy”. To allow these values, the authors define pd-recursive (c,l)-diversity.
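The instantiations above can be compared on a single q*-block with a short Python sketch. The three function names are hypothetical, the entropy check assumes the paper's per-block condition (entropy of the block's sensitive-value distribution at least log(l)), and the recursive check assumes the form r1 < c(rl + … + rm). The block mirrors the 10-record example, with "flu" and "cancer" standing in for the two unspecified other values.

```python
import math
from collections import Counter


def distinct_l_diverse(values, l):
    """True if the block holds at least l distinct sensitive values."""
    return len(set(values)) >= l


def entropy_l_diverse(values, l):
    """True if the entropy of the block's sensitive values is at least log(l)."""
    n = len(values)
    entropy = -sum((c / n) * math.log(c / n) for c in Counter(values).values())
    # Small tolerance so a perfectly uniform block is not lost to rounding.
    return entropy >= math.log(l) - 1e-12


def recursive_cl_diverse(values, c, l):
    """True if r1 < c * (r_l + ... + r_m), with counts sorted in descending order."""
    r = sorted(Counter(values).values(), reverse=True)
    if len(r) < l:
        return False
    return r[0] < c * sum(r[l - 1:])


# 10 records: 8 with HIV, 2 with other (here: made-up) values.
skewed = ["HIV"] * 8 + ["flu", "cancer"]
print(distinct_l_diverse(skewed, 3))       # three distinct values are present
print(entropy_l_diverse(skewed, 3))        # but the distribution is far from uniform
print(recursive_cl_diverse(skewed, 3, 3))  # and HIV dominates the tail counts
```

On this block the distinct check passes while the entropy and recursive checks fail, which is the probabilistic-inference weakness noted for distinct l-diversity.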
Negative/Positive Disclosure-Recursive (c1,c2,l)-Diversity

Npd-recursive (c1,c2,l)-diversity prevents negative disclosure by requiring every sensitive value for which negative disclosure is not allowed to occur in at least c2 percent of the tuples in every q*-block.

Multiple Sensitive Attributes

Previous discussions only addressed single sensitive attributes. Suppose S and V are two sensitive attributes, and consider the q*-block with the following tuples:
{(q,s1,v1), (q,s1,v2), (q,s2,v3), (q,s3,v3)}.

This q*-block is 3-diverse (actually recursive (2,3)-diverse) with respect to S (ignoring V) and 3-diverse with respect to V (ignoring S). However, if we know that Bob is in this block and his value for S is not s1, then his value for attribute V cannot be v1 or v2, and therefore must be v3. To address this problem, we can add the additional sensitive attributes to the quasi-identifier.

Implementing Privacy Preserving Data Publishing

Domain generalization is used to define a generalization lattice. For discussion, all non-sensitive attributes are combined into a multi-dimensional attribute (Q), where the bottom element of the lattice is the domain of Q and the top of the lattice is the domain where each dimension of Q is generalized to a single value.

Implementing Privacy Preserving Data Publishing (cont.)

The publishing algorithm should find the point on the lattice where the table T* preserves privacy and is as useful as possible. The usefulness (utility) of table T* diminishes as the data becomes more generalized, so the most utility is at the bottom of the lattice.

Monotonicity Property

The monotonicity property gives the lattice search a stopping point: if a table T* preserves privacy, then every generalization of T* also preserves privacy, so the search does not need to look above a lattice node that already satisfies the privacy definition.
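A bottom-up lattice search that exploits monotonicity can be sketched as follows. Everything here is an assumption made for illustration: the two generalization hierarchies (`ZIP_LEVELS`, `AGE_LEVELS`), the sample rows, and the use of distinct l-diversity as the privacy test. It is not the algorithm used by the authors, only a minimal instance of the idea that nodes above a known private node need not be tested.

```python
from collections import defaultdict
from itertools import product

# Hypothetical hierarchies: index = generalization level, 0 = no generalization.
ZIP_LEVELS = [lambda z: z, lambda z: z[:3] + "**", lambda z: "*****"]
AGE_LEVELS = [lambda a: str(a), lambda a: f"{a // 10 * 10}s", lambda a: "*"]
HIERARCHIES = [ZIP_LEVELS, AGE_LEVELS]


def generalize(rows, node):
    """Group sensitive values by generalized quasi-identifier (the q*-blocks)."""
    blocks = defaultdict(list)
    for *qid, sensitive in rows:
        key = tuple(h[level](v) for h, level, v in zip(HIERARCHIES, node, qid))
        blocks[key].append(sensitive)
    return blocks


def is_private(rows, node, l):
    """Distinct l-diversity: every q*-block has at least l distinct values."""
    return all(len(set(vals)) >= l for vals in generalize(rows, node).values())


def minimal_private_nodes(rows, l):
    """Visit lattice nodes bottom-up, skipping anything above a private node."""
    nodes = sorted(product(*(range(len(h)) for h in HIERARCHIES)), key=sum)
    found = []
    for node in nodes:
        above_known = any(all(a >= b for a, b in zip(node, f)) for f in found)
        if above_known:
            continue  # monotonicity: private for sure, but less useful
        if is_private(rows, node, l):
            found.append(node)
    return found


# Illustrative rows: (zipcode, age, sensitive value).
rows = [
    ("13053", 28, "heart disease"), ("13068", 29, "heart disease"),
    ("13068", 21, "viral infection"), ("13053", 23, "viral infection"),
    ("14853", 50, "cancer"), ("14853", 55, "heart disease"),
    ("14850", 47, "viral infection"), ("14850", 49, "viral infection"),
]
print(minimal_private_nodes(rows, 2))  # least-generalized nodes that are 2-diverse
```

Each returned node is a tuple of generalization levels, one per quasi-identifier attribute. Because the nodes are visited in order of total height, the first private nodes found are also the ones that keep the most utility.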