Data Extraction and Model Stealing
姜育剛, 馬興軍, 吳祖煊

Recap: Week 7
- A Brief History of Backdoor Learning
- Backdoor Attacks
- Backdoor Defenses
- Future Research

This Week
- Data Extraction Attack & Defense
- Model Stealing Attack
- Future Research

Data Extraction Attack
Recovering training data by inverting the model.
Terminology
The following terms all refer to the same thing:
- Data Extraction Attack
- Data Stealing Attack
- Training Data Extraction Attack
- Model Memorization Attack
- Model Inversion Attack

Security Threats
A leaked completion such as "My social security number is 078-..." illustrates:
- Personal information leakage
- Sensitive information leakage
- Threats to national security
- Illegal data trading
- ...
Memorization of DNNs
Evidence 1: DNNs learn different levels of representations.

Memorization of DNNs
Evidence 2: DNNs can memorize random labels and random pixels.
- Training settings compared: true labels, random labels, shuffled pixels, random pixels, Gaussian noise.
Zhang, Chiyuan, et al. "Understanding deep learning requires rethinking generalization." ICLR 2017.
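The randomization test behind Evidence 2 can be reproduced in miniature: a network with enough capacity reaches near-perfect training accuracy even when every label is pure noise, so it must be memorizing rather than generalizing. A minimal numpy sketch, assuming a one-hidden-layer tanh network trained by full-batch gradient descent (all sizes and hyperparameters here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# 32 random inputs with completely random binary labels: nothing to generalize.
X = rng.normal(size=(32, 10))
y = rng.integers(0, 2, size=32).astype(float)

# One-hidden-layer network: 10 -> 128 -> 1, tanh hidden, sigmoid output.
W1 = rng.normal(scale=0.5, size=(10, 128))
b1 = np.zeros(128)
W2 = rng.normal(scale=0.5, size=(128, 1))
b2 = np.zeros(1)

def forward(X):
    H = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))
    return H, p.ravel()

lr = 0.5
for _ in range(5000):  # full-batch gradient descent on cross-entropy
    H, p = forward(X)
    d_logit = (p - y)[:, None] / len(X)       # dLoss/dlogit for sigmoid + CE
    gW2 = H.T @ d_logit
    gb2 = d_logit.sum(axis=0)
    d_H = (d_logit @ W2.T) * (1 - H**2)       # backprop through tanh
    gW1 = X.T @ d_H
    gb1 = d_H.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, p = forward(X)
train_acc = ((p > 0.5) == y).mean()
print(f"train accuracy on random labels: {train_acc:.2f}")
```

The training data is pure noise, so any accuracy well above 50% on it is evidence of memorization, mirroring the paper's finding at toy scale.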
Memorization of DNNs
Evidence 3: the success of GANs and diffusion models.

Intended vs. Unintended Memorization
Intended memorization: task-related statistics, and the inputs and labels themselves.
- Example: first-layer filters learned on normal CIFAR-10 vs. on randomly labeled CIFAR-10.
Arpit et al. "A closer look at memorization in deep networks." ICML 2017.

Unintended Memorization
Task-irrelevant content that gets memorized anyway, even if it appears only a few times.
- Example: a machine translation model memorizing "My social security number is xxxx."
- A sequence can be fully memorized after appearing only 4 times.
Carlini et al. "The secret sharer: Evaluating and testing unintended memorization in neural networks." USENIX Security 2019.

Existing Data Stealing Attacks

Black-Box Stealing
Active testing: "a canary in the coal mine."
- Actively inject canaries such as "The random number is ****" or "My social security number is ****" into the training data, then test for them afterward.
- Quantify unintended memorization through a canary's "exposure" in the language model.

Black-Box Stealing of General-Purpose Language Models
- Large amounts of private data can be recovered: names, phone numbers, email addresses, social security numbers, etc.
- Large models memorize such information more readily than small ones.
- Information can be memorized even if it appears in only one document.
Carlini, Nicholas, et al. "Extracting training data from large language models." USENIX Security 2021.
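The exposure metric can be made concrete: Carlini et al. define the exposure of a canary as log2 of the size of the candidate space minus log2 of the canary's rank when all candidates are sorted by model loss (perplexity). A memorized canary ranks near first, giving high exposure. A sketch with stand-in loss values (all numbers below are invented for illustration, not measured from a real model):

```python
import math
import random

random.seed(0)

def exposure(canary_loss, candidate_losses):
    """Exposure of a canary: log2(|R|) - log2(rank of the canary) when all
    candidate sequences in the randomness space R are sorted by model loss."""
    all_losses = sorted(candidate_losses + [canary_loss])
    rank = all_losses.index(canary_loss) + 1  # 1-based rank
    return math.log2(len(all_losses)) - math.log2(rank)

# Stand-in losses: a memorized canary gets a lower loss than random fillers.
candidates = [random.uniform(5.0, 9.0) for _ in range(1023)]
memorized = 4.0     # hypothetical loss of an injected, memorized canary
unmemorized = 7.0   # hypothetical loss of a never-seen candidate

print(exposure(memorized, candidates))    # rank 1 -> exposure = log2(1024) = 10
print(exposure(unmemorized, candidates))  # middling rank -> much lower exposure
```

High exposure means the model assigns the canary an abnormally low loss compared with the rest of its candidate space, i.e. it has been unintentionally memorized.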
Training Data Extraction Attack
Definition of memorization: model knowledge extraction and k-eidetic memorization.

Attack steps:
- Step 1: generate a large amount of text from the model.
- Step 2: filter the generated text and confirm which samples are memorized.

Experimental results:
- 604 "unintentionally" memorized sequences were extracted.
- Some memorized sequences appear in only one training document.
- The larger the model, the stronger the memorization.
Carlini, Nicholas, et al. "Extracting training data from large language models." USENIX Security 2021.
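The two attack steps can be sketched end to end. For step 2, one filter from the paper flags generations whose model perplexity is unusually low relative to their zlib compressibility, which separates truly memorized text from text that is merely repetitive. The per-text log-perplexities below are invented stand-ins for real model queries:

```python
import zlib

# Stand-in log-perplexities a target LM might assign (illustrative numbers;
# a real attack queries the model for these).
mock_log_perplexity = {
    "my social security number is 078-0001": 1.0,   # memorized: abnormally easy
    "the weather is nice today": 6.0,               # ordinary fluent text
    "asdkjh qweoiu zzxcmn qpwoei ruty": 12.0,       # gibberish: hard for the LM
}

def zlib_entropy(text: str) -> float:
    """Bits zlib needs to compress the text: a cheap reference 'model'."""
    return 8.0 * len(zlib.compress(text.encode("utf-8")))

# Step 1: (mock) generations sampled from the model.
generations = list(mock_log_perplexity)

# Step 2: rank by LM perplexity normalized by zlib entropy; memorized text is
# unusually easy for the model *relative to* how compressible it is.
scored = sorted(generations,
                key=lambda t: mock_log_perplexity[t] / zlib_entropy(t))
print(scored[0])  # the memorized string ranks first
```

The candidate string and its fake "social security number" are invented for this sketch; only the ranking logic reflects the paper's filtering step.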
Memorization of Diffusion Models
A joint study by the University of Maryland and New York University found that generative diffusion models memorize their original training data and, under specific text prompts, can leak it (generated images shown side by side with the originals).

Memorization of Diffusion Models
Definition of replication: "We say that a generated image has replicated content if it contains an object (either in the foreground or background) that appears identically in a training image, neglecting minor variations in appearance that could result from data augmentation."
Somepalli, Gowthami, et al. "Diffusion art or digital forgery? Investigating data replication in diffusion models." CVPR 2023.

Creating Synthetic and Real Datasets
- Augmentations: Original, Segmix, Diagonal Outpainting, Patch Outpainting.
- Existing image retrieval datasets: Oxford, Paris, INSTRE, GPR1200.
Training Image Retrieval Models
- Similarity metric: inner product and token-wise inner product of features.

Experiment
- Diffusion model: DDPM; dataset: Celeb-A.
- Shown: the top-2 matches of diffusion models trained on 300, 3000, and 30000 images (the full set is 30000).
Results:
- Green: exact copies.
- Blue: close, but not exact copies.
- Others: similar, but not the same.
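A token-wise inner-product similarity can be sketched as follows. This is an illustrative variant (match each feature token of one image to its best counterpart in the other and average), not necessarily the paper's exact formula; the token counts and dimensions are invented:

```python
import numpy as np

def tokenwise_similarity(A: np.ndarray, B: np.ndarray) -> float:
    """Token-wise inner-product similarity between two images, each given as
    an (n_tokens, dim) array of L2-normalized feature tokens: for every token
    of A take its best inner product with any token of B, then average."""
    sims = A @ B.T                      # (n_tokens_A, n_tokens_B) inner products
    return float(sims.max(axis=1).mean())

rng = np.random.default_rng(0)

def random_tokens(n, d):
    t = rng.normal(size=(n, d))
    return t / np.linalg.norm(t, axis=1, keepdims=True)

img = random_tokens(16, 64)
copy = img.copy()                       # exact replication
other = random_tokens(16, 64)           # unrelated image

print(tokenwise_similarity(img, copy))   # ~1.0 for an exact copy
print(tokenwise_similarity(img, other))  # much lower for unrelated tokens
```

Token-wise matching is less brittle than a single global inner product: it can still fire when a replicated object occupies only part of the generated image.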
Memorization of Diffusion Models
Gen-train vs. train-train similarity score distributions: the less training data, the more copying.

Memorization of Diffusion Models
Case study: ImageNet LDM.
- Many close copies, but no exact matches (similarity score < 0.65).
- Most similar classes: theater curtain, peacock, and bananas.
- Least similar classes: sea lion, bee, and swing.
Memorization of Diffusion Models
Case study: Stable Diffusion.
- Trained on LAION Aesthetics v2 6+: 12M images.
- Randomly select 9000 images as sources and use their captions as prompts.
- Some keywords (those in red) are associated with certain fixed generation patterns.
- Style copying via the text prompt: "<Name of the painting> by <name of the artist>".
Memorization of Large Language Models (LLMs)
Pretraining data detection with MIN-K% PROB.
Shi, Weijia, et al. "Detecting Pretraining Data from Large Language Models." arXiv preprint arXiv:2310.16789 (2023).

Detection is evaluated on WIKIMIA, a dynamic benchmark.
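MIN-K% PROB scores a text by the average log-probability of its k% least likely tokens under the model: text seen during pretraining tends to contain few surprising tokens, so it scores higher. A sketch with invented per-token log-probabilities standing in for real model outputs:

```python
def min_k_prob(token_logprobs, k=0.2):
    """MIN-K% PROB score: average log-probability of the k% least likely
    tokens of the text under the model. Higher (closer to 0) suggests the
    text was in the pretraining data."""
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n

# Stand-in per-token log-probs from a model (illustrative numbers).
seen_text = [-1.2, -0.8, -1.0, -0.5, -1.1, -0.9, -1.3, -0.7, -1.0, -0.6]
unseen_text = [-1.2, -0.8, -6.5, -0.5, -7.1, -0.9, -1.3, -5.8, -1.0, -0.6]

print(min_k_prob(seen_text))    # high (close to 0): likely member
print(min_k_prob(unseen_text))  # low: contains surprising outlier tokens
```

Thresholding this score gives a membership decision; focusing only on the lowest-probability tokens makes the statistic robust to the many "easy" tokens every fluent text contains.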
White-Box Stealing
White-box stealing exploits gradient information and is therefore also called the Gradient Inversion Attack. It targets training schemes that share gradients (two distributed training paradigms):
- Distributed / parallel training
- Federated learning
- Decentralized training

Two attack families: iterative inversion and recursive (layer-by-layer) inversion.
Zhang et al. "A Survey on Gradient Inversion: Attacks, Defenses and Future Directions." IJCAI 2022.

White-Box Stealing: Iterative Inversion
Construct dummy data whose gradients approach the true gradients (which are assumed known). Each step costs one forward pass and two backward passes: one to obtain the dummy data's gradients, and one to update the dummy data. The survey summarizes existing iterative inversion attacks.

White-Box Stealing: Recursive Inversion
Derive the input layer by layer from the true (known) gradients. Key limiting factors:
- Image size (32x32)
- Batch size (mostly 1)
- Model size
The survey also summarizes existing recursive inversion attacks and white-box defenses.
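The iterative-inversion loop can be sketched in the simplest possible setting: a linear regression model with one leaked per-sample gradient. Everything below (the model, dimensions, learning rate, and the assumption that the label is known) is illustrative; DLG-style attacks run the same loop on deep networks and also optimize the dummy label:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared model: linear regression, per-sample loss L = 0.5 * (w.x - y)^2.
w = rng.normal(size=3)
x_true = rng.normal(size=3)
y = 1.0  # dummy label assumed known here; real attacks can optimize it too

# The gradient the victim shares: dL/dw = (w.x - y) * x.
g_true = (w @ x_true - y) * x_true

def gradient_match_loss(x):
    return float(np.sum(((w @ x - y) * x - g_true) ** 2))

# Iterative inversion: gradient-descend dummy data x until the gradient it
# produces matches the leaked one. Random restarts guard against bad minima.
best = np.inf
for _ in range(3):
    x = rng.normal(size=3)
    for _ in range(20000):
        c = w @ x - y
        r = c * x - g_true                          # gradient-matching residual
        x -= 0.01 * (2 * c * r + 2 * (x @ r) * w)   # analytic d||r||^2 / dx
    best = min(best, gradient_match_loss(x))

print(f"best gradient-matching loss: {best:.2e}")
```

Once the matching loss is near zero, the dummy sample reproduces the leaked gradient, which is exactly what "one forward pass, two backward passes" buys in the deep-network case (here the second backward pass is the analytic derivative of the matching loss).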
This Week
- Data Extraction Attack & Defense
- Model Stealing Attack
- Future Research

AI Models Are Expensive to Train
Training large, high-performance AI models consumes enormous resources (BERT reportedly cost Google about $1.6M):
- Data resources
- Compute resources
- Human resources

Motivation for Model Stealing
- Valuable AI models carry huge commercial value.
- The attacker wants to steal the model for their own use...
- ...while preserving its performance and avoiding detection.

Ways to Steal a Model
- Input-output queries
- Model fine-tuning
- Model pruning
- Extraction attacks

References:
- Stealing machine learning models via prediction APIs. USENIX Security, 2016.
- Practical black-box attacks against machine learning. ASIACCS, 2017.
- Knockoff Nets: Stealing functionality of black-box models. CVPR, 2019.
- MAZE: Data-free model stealing attack using zeroth-order gradient estimation. CVPR, 2021.

Equation-Solving Attacks
Attack idea and examples.
Tramèr, Florian, et al. "Stealing machine learning models via prediction APIs." USENIX Security 2016.
The paper reports the number of queries and the time needed to steal certain commercial models with 100% accuracy.

Equation-Solving Attacks: Stealing Parameters
Attack algorithm: for a model with d parameters, query d+1 inputs and construct d+1 equations; e.g., for logistic regression, a returned confidence p_i yields the linear equation w·x_i + b = log(p_i / (1 - p_i)).
Main characteristics:
- Targets traditional machine learning models: SVM, logistic regression, decision trees.
- Solves exactly, but requires the model to return precise confidence scores.
- The stolen model may in turn leak training data (data inversion attacks).
Tramèr, Florian, et al. "Stealing machine learning models via prediction APIs." USENIX Security 2016.

Equation-Solving Attacks: Stealing Hyperparameters
Attack idea: once training has finished, the gradient of the (regularized) training loss should be 0; setting ∇L = 0 yields equations that can be solved for the hyperparameters.
Wang, Binghui, and Neil Zhenqiang Gong. "Stealing hyperparameters in machine learning." S&P 2018.
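For a model that returns exact confidences, parameter stealing really is just solving equations. A sketch for a hypothetical logistic-regression API with d = 4 parameters (the secret model, the queries, and all names are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Secret target model: logistic regression with d weights and a bias.
d = 4
w_secret = rng.normal(size=d)
b_secret = rng.normal()

def query(x):
    """Black-box prediction API returning the exact confidence score."""
    return 1.0 / (1.0 + np.exp(-(w_secret @ x + b_secret)))

# The attack: d+1 queries give d+1 linear equations
#   w @ x_i + b = logit(p_i),  since logit(sigmoid(z)) = z.
X = rng.normal(size=(d + 1, d))
logits = np.array([np.log(p / (1 - p)) for p in (query(x) for x in X)])

A = np.hstack([X, np.ones((d + 1, 1))])   # unknown vector: [w, b]
solution = np.linalg.solve(A, logits)
w_stolen, b_stolen = solution[:d], solution[d]

print(np.allclose(w_stolen, w_secret), np.isclose(b_stolen, b_secret))
```

This is why truncated or rounded confidence scores are an effective (if partial) defense: the linear system can no longer be solved exactly.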
Substitute-Model Attacks
Attack idea: while querying the target model, train a substitute model that mimics its behavior.
Orekondy et al. "Knockoff Nets: Stealing functionality of black-box models." CVPR 2019.

Knockoff Nets: a "knockoff network" attack. Attack flow:
- Sample a large number of query inputs.
- Train the substitute model on the target's outputs.
- Use reinforcement learning to learn how to select query samples efficiently.

Substitute-Model Attacks
High-accuracy vs. high-fidelity extraction:
- Blue: the target's decision boundary.
- Orange: high-accuracy extraction.
- Green: high-fidelity extraction.
Pipeline: query images are sent to the black-box target model, and its outputs (probabilities or class labels) serve as labels to supervise substitute-model training.
Jagielski, Matthew, et al. "High accuracy and high fidelity extraction of neural networks." USENIX Security 2020.
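The query-then-distill pipeline can be sketched end to end with a toy target. The logistic-regression "API", query counts, and learning rate below are all illustrative assumptions; Knockoff Nets applies the same loop with deep networks and learned query selection:

```python
import numpy as np

rng = np.random.default_rng(0)

# Black-box target: a secret model whose API returns class probabilities.
w_t, b_t = rng.normal(size=2), 0.5
def target(x):
    return 1.0 / (1.0 + np.exp(-(x @ w_t + b_t)))

# Step 1: sample query inputs and collect the target's soft labels.
X_query = rng.normal(size=(500, 2))
soft_labels = target(X_query)

# Step 2: train a substitute on the (query, output) pairs.
w_s, b_s = np.zeros(2), 0.0
lr = 0.5
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X_query @ w_s + b_s)))
    err = (p - soft_labels) / len(X_query)   # cross-entropy grad vs. soft labels
    w_s -= lr * (X_query.T @ err)
    b_s -= lr * err.sum()

def substitute(x):
    return 1.0 / (1.0 + np.exp(-(x @ w_s + b_s)))

# Label agreement on fresh inputs measures how much functionality was stolen.
X_test = rng.normal(size=(1000, 2))
agreement = ((target(X_test) > 0.5) == (substitute(X_test) > 0.5)).mean()
print(f"label agreement: {agreement:.3f}")
```

Training on probability outputs (soft labels) rather than hard class labels transfers more information per query, which is one reason APIs that return full confidence vectors are easier to knock off.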
Substitute-Model Attacks
Functionally Equivalent Extraction. Attack steps:
- Find critical points where the ReLU input of some neuron equals 0.
- Probe the boundary on both sides of each critical point to determine the corresponding weights.
- Limitation: can only steal two-layer networks.
Jagielski, Matthew, et al. "High accuracy and high fidelity extraction of neural networks." USENIX Security 2020.

Substitute-Model Attacks
Cryptanalytic Extraction. Idea: the second derivative of ReLU is 0 everywhere except at ReLU = 0, so critical points can be detected with finite differences.
- Stealing 0-deep neural networks.
- Stealing 1-deep neural networks.
Carlini et al. "Cryptanalytic extraction of neural network models." Annual International Cryptology Conference (CRYPTO), 2020.

Substitute-Model Attacks
Yuan, Xiaoyong, et al.