版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)
文檔簡(jiǎn)介
ExecutiveSummary
Recentdevelopmentshaveimprovedtheabilityoflargelanguagemodels(LLMs)andotherAIsystemstogeneratecomputercode.Whilethisispromisingforthefieldof
softwaredevelopment,thesemodelscanalsoposedirectandindirectcybersecurity
risks.Inthispaper,weidentifythreebroadcategoriesofriskassociatedwithAIcodegenerationmodels:1)modelsgeneratinginsecurecode,2)modelsthemselvesbeingvulnerabletoattackandmanipulation,and3)downstreamcybersecurityimpactssuchasfeedbackloopsintrainingfutureAIsystems.
Existingresearchhasshownthat,underexperimentalconditions,AIcodegenerationmodelsfrequentlyoutputinsecurecode.However,theprocessofevaluatingthe
securityofAI-generatedcodeishighlycomplexandcontainsmanyinterdependentvariables.TofurtherexploretheriskofinsecureAI-writtencode,weevaluated
generatedcodefromfiveLLMs.Eachmodelwasgiventhesamesetofprompts,whichweredesignedtotestlikelyscenarioswherebuggyorinsecurecodemightbe
produced.Ourevaluationresultsshowthatalmosthalfofthecodesnippetsproducedbythesefivedifferentmodelscontainbugsthatareoftenimpactfulandcould
potentiallyleadtomaliciousexploitation.Theseresultsarelimitedtothenarrowscopeofourevaluation,butwehopetheycancontributetothelargerbodyofresearch
surroundingtheimpactsofAIcodegenerationmodels.
Givenbothcodegenerationmodels’currentutilityandthelikelihoodthattheircapabilitieswillcontinuetoimprove,itisimportanttomanagetheirpolicyandcybersecurityimplications.Keyfindingsincludethebelow.
●IndustryadoptionofAIcodegenerationmodelsmayposeriskstosoftware
supplychainsecurity.However,theseriskswillnotbeevenlydistributedacrossorganizations.Larger,morewell-resourcedorganizationswillhaveanadvantageoverorganizationsthatfacecostandworkforceconstraints.
●Multiplestakeholdershaverolestoplayinhelpingtomitigatepotentialsecurity
risksrelatedtoAI-generatedcode.TheburdenofensuringthatAI-generated
codeoutputsaresecureshouldnotrestsolelyonindividualusers,butalsoonAIdevelopers,organizationsproducingcodeatscale,andthosewhocanimprove
securityatlarge,suchaspolicymakingbodiesorindustryleaders.Existing
guidancesuchassecuresoftwaredevelopmentpracticesandtheNIST
CybersecurityFrameworkremainsessentialtoensurethatallcode,regardlessofauthorship,isevaluatedforsecuritybeforeitentersproduction.Other
cybersecurityguidance,suchassecure-by-designprinciples,canbeexpandedto
CenterforSecurityandEmergingTechnology|1
includecodegenerationmodelsandotherAIsystemsthatimpactsoftwaresupplychainsecurity.
●Codegenerationmodelsalsoneedtobeevaluatedforsecurity,butitiscurrentlydifficulttodoso.Evaluationbenchmarksforcodegenerationmodelsoftenfocusonthemodels’abilitytoproducefunctionalcodebutdonotassesstheirabilitytogeneratesecurecode,whichmayincentivizeadeprioritizationofsecurityover
functionalityduringmodeltraining.Thereisinadequatetransparencyaround
models’trainingdata—orunderstandingoftheirinternalworkings—toexplorequestionssuchaswhetherbetterperformingmodelsproducemoreinsecure
code.
CenterforSecurityandEmergingTechnology|2
TableofContents
ExecutiveSummary 1
Introduction 4
Background 5
WhatAreCodeGenerationModels? 5
IncreasingIndustryAdoptionofAICodeGenerationTools 7
RisksAssociatedwithAICodeGeneration 9
CodeGenerationModelsProduceInsecureCode 9
Models’VulnerabilitytoAttack 11
DownstreamImpacts 13
ChallengesinAssessingtheSecurityofCodeGenerationModels 15
IsAIGeneratedCodeInsecure? 18
Methodology 18
EvaluationResults 22
UnsuccessfulVerificationRates 22
VariationAcrossModels 24
SeverityofGeneratedBugs 25
Limitations 26
PolicyImplicationsandFurtherResearch 28
Conclusion 32
Authors 33
Acknowledgments 33
AppendixA:Methodology 34
AppendixB:EvaluationResults 34
Endnotes 35
CenterforSecurityandEmergingTechnology|3
Introduction
AdvancementsinartificialintelligencehaveresultedinaleapintheabilityofAI
systemstogeneratefunctionalcomputercode.Whileimprovementsinlargelanguage
modelshavedrivenagreatdealofrecentinterestandinvestmentinAI,codegenerationhasbeenaviableusecaseforAIsystemsforthelastseveralyears.
SpecializedAIcodingmodels,suchascodeinfillingmodelswhichfunctionsimilarlyto“autocompleteforcode,”and“general-purpose”LLM-basedfoundationmodelsare
bothbeingusedtogeneratecodetoday.Anincreasingnumberofapplicationsand
softwaredevelopmenttoolshaveincorporatedthesemodelstobeofferedasproductseasilyaccessiblebyabroadaudience.
Thesemodelsandassociatedtoolsarebeingadoptedrapidlybythesoftware
developercommunityandindividualusers.AccordingtoGitHub’sJune2023survey,92%ofsurveyedU.S.-baseddevelopersreportusingAIcodingtoolsinandoutof
work.1AnotherindustrysurveyfromNovember2023similarlyreportedahighusagerate,with96%ofsurveyeddevelopersusingAIcodingtoolsandmorethanhalfof
respondentsusingthetoolsmostofthetime.2Ifthistrendcontinues,LLM-generatedcodewillbecomeanintegralpartofthesoftwaresupplychain.
ThepolicychallengeregardingAIcodegenerationisthatthistechnological
advancementpresentstangiblebenefitsbutalsopotentialsystemicrisksforthe
cybersecurityecosystem.Ontheonehand,thesemodelscouldsignificantlyincreaseworkforceproductivityandpositivelycontributetocybersecurityifappliedinareas
suchasvulnerabilitydiscoveryandpatching.Ontheotherhand,researchhasshownthatthesemodelsalsogenerateinsecurecode,posingdirectcybersecurityrisksif
incorporatedwithoutproperreview,aswellasindirectrisksasinsecurecodeendsupinopen-sourcerepositoriesthatfeedintosubsequentmodels.
Asdevelopersincreasinglyadoptthesetools,stakeholdersateverylevelofthe
softwaresupplychainshouldconsidertheimplicationsofwidespreadAI-generated
code.AIresearchersanddeveloperscanevaluatemodeloutputswithsecurityinmind,programmersandsoftwarecompaniescanconsiderhowthesetoolsfitintoexisting
security-orientedprocesses,andpolicymakershavetheopportunitytoaddressbroadercybersecurityrisksassociatedwithAI-generatedcodebysettingappropriate
guidelines,providingincentives,andempoweringfurtherresearch.ThisreportprovidesanoverviewofthepotentialcybersecurityrisksassociatedwithAI-generatedcodeanddiscussesremainingresearchchallengesforthecommunityandimplicationsforpolicy.
CenterforSecurityandEmergingTechnology|4
Background
WhatAreCodeGenerationModels?
CodegenerationmodelsareAImodelscapableofgeneratingcomputercodein
responsetocodeornatural-languageprompts.Forexample,ausermightprompta
modelwith“WritemeafunctioninJavathatsortsalistofnumbers”andthemodelwilloutputsomecombinationofcodeandnaturallanguageinresponse.Thiscategoryof
modelsincludesbothlanguagemodelsthathavebeenspecializedforcodegenerationaswellasgeneral-purposelanguagemodels—alsoknownas“foundationmodels”—
thatarecapableofgeneratingothertypesofoutputsandarenotexplicitlydesignedto
outputcode.ExamplesofspecializedmodelsincludeAmazonCodeWhisperer,
DeepSeekCoder,WizardCoder,andCodeLlama,whilegeneral-purposemodelsincludeOpenAI’sGPTseries,Mistral,Gemini,andClaude.
Earlieriterationsofcodegenerationmodels—manyofwhichpredatedthecurrent
generationofLLMsandarestillinwidespreaduse—functionedsimilarlyto
“autocompleteforcode,”inwhichamodelsuggestsacodesnippettocompletealine
asausertypes.These“autocomplete”models,whichperformwhatisknownascode
infilling,aretrainedspecificallyforthistaskandhavebeenwidelyadoptedinsoftwaredevelopmentpipelines.Morerecentimprovementsinlanguagemodelcapabilitieshaveallowedformoreinteractivity,suchasnatural-languagepromptingorauserinputtingacodesnippetandaskingthemodeltocheckitforerrors.Likegeneral-purposelanguagemodels,userscommonlyinteractwithcodegenerationmodelsviaadedicatedinterfacesuchasachatwindoworaplugininanotherpieceofsoftware.Recently,specialized
scaffoldingsoftwarehasfurtherincreasedwhatAImodelsarecapableofincertaincontexts.Forinstance,somemodelsthatcanoutputcodemayalsobecapableof
executingthatcodeanddisplayingtheoutputstotheuser.3
Aslanguagemodelshavegottenlargerandmoreadvancedoverthepastfewyears,
theircodegenerationcapabilitieshaveimprovedinstepwiththeirnaturallanguage-
generationcapabilities.4Codinglanguagesare,afterall,intentionallydesignedto
encodeandconveyinformation,andhavetheirownrulesandsyntacticalexpectationsmuchlikehumanlanguages.Researchersinthefieldofnaturallanguageprocessing
(NLP)havebeeninterestedintranslatingbetweennaturallanguageandcomputercode
formanyyears,butthesimultaneousintroductionoftransformer-basedlanguage
modelarchitecturesandlargedatasetscontainingcodeledtoarapidimprovementincodegenerationcapabilitiesbeginningaround2018–2019.Asnewmodelswere
released,researchersalsobeganexploringwaystomakethemmoreaccessible.Inmid-2021,forexample,OpenAIreleasedthefirstversionofCodex,aspecializedlanguage
CenterforSecurityandEmergingTechnology|5
modelforcodegeneration,alongwiththeHumanEvalbenchmarkforassessingthe
correctnessofAIcodeoutputs.5Github,inpartnershipwithOpenAI,thenlauncheda
previewofaCodex-poweredAIpairprogrammingtoolcalledGithubCopilot.6Althoughitinitiallyfunctionedmoresimilarlyto“autocompleteforcode”thanacurrent-
generationLLMchatbot,GithubCopilot’srelativeaccessibilityandearlysuccesshelped
spurinterestincodegenerationtoolsamongprogrammers,manyofwhomwereinterestedinadoptingAItoolsforbothworkandpersonaluse.
Tobecomeproficientatcodegeneration,modelsneedtobetrainedondatasets
containinglargeamountsofhuman-writtencode.Modernmodelsareprimarilytrainedonpublicly-available,open-sourcecode.7Muchofthiscodewasscrapedfromopen-
sourcewebrepositoriessuchasGithub,whereindividualsandcompaniescanstore
andcollaborateoncodingprojects.Forexample,thefirstversionofthe6-terabyte
datasetknownasTheStackconsistsofsourcecodefilesin358differentprogramminglanguages,andhasbeenusedtopretrainseveralopencodegenerationmodels.8Otherlanguagemodeltrainingdatasetsareknowntocontaincodeinadditiontonatural-
languagetext.The825-gigabytedatasetcalledThePilecontains95gigabytesofGithubdataand32gigabytesscrapedfromStackExchange,afamilyofquestion-answeringforumsthatincludescodesnippetsandothercontentrelatedto
programming.9However,thereisoftenlimitedvisibilityintothedatasetsthat
developersusefortrainingmodels.Wecanspeculatethatthemajorityofcodebeing
usedtotraincodegenerationmodelshasbeenscrapedfromopen-sourcerepositories,butotherdatasetsusedfortrainingmaycontainproprietarycodeorsimplybeexcludedfrommodelcardsorotherformsofdocumentation.
Additionally,somespecializedmodelsarefine-tunedversionsofgeneral-purpose
models.Usually,theyarecreatedbytraininggeneral-purposemodelswithadditional
dataspecifictotheusecase.Thisisparticularlylikelyininstanceswherethemodel
needstotranslatenatural-languageinputsintocode,asgeneral-purposemodelstendtobebetteratfollowingandinterpretinguserinstructions.OpenAI’sCodexisonesuchexample,asitwascreatedbyfine-tuningaversionofthegeneral-purposeGPT-3
modelon159gigabytesofPythoncodescrapedfromGithub.10CodeLlamaandCodeLlamaPython—basedonMeta’sLlama2model—areotherexamplesofsuchmodels.
ResearchinterestinAIcodegenerationhasconsistentlyincreasedinthepastdecade,especiallyexperiencingasurgeinthepastyearfollowingthereleaseofhigh-
performingfoundationmodelssuchasGPT-4andopen-sourcemodelssuchasLlama2.Figure1illustratesthetrendbycountingthenumberofresearchpapersoncode
generationbyyearfrom2012–2023.Thenumberofresearchpapersoncode
CenterforSecurityandEmergingTechnology|6
generationmorethandoubledfrom2022to2023,demonstratingagrowingresearchinterestinitsusage,evaluation,andimplications.
Figure1:NumberofPapersonCodeGenerationbyYear*
Source:CSET’sMergedAcademicCorpus.
IncreasingIndustryAdoptionofAICodeGenerationTools
Codegenerationpresentsoneofthemostcompellingandwidelyadoptedusecasesforlargelanguagemodels.InadditiontoclaimsfromorganizationssuchasMicrosoftthattheirAIcodingtoolGitHubCopilothad1.8millionpaidsubscribersasofspring2024,
upfrommorethanamillioninmid-2023,11softwarecompaniesarealsoadopting
*ThisgraphcountsthenumberofpapersinCSET’sMergedAcademicCorpusthatcontainthe
keywords“codegeneration,”“AI-assistedprogramming,”“AIcodeassistant,”“codegenerating
LLM,”or“codeLLM”andarealsoclassifiedasAI-orcybersecurity-relatedusingCSET’sAIclassifierandcybersecurityclassifier.NotethatatthetimeofwritinginFebruary2024,CSET’sMerged
AcademicCorpusdidnotyetincludeallpapersfrom2023duetoupstreamcollectionlags,which
mayhaveresultedinanundercountingofpapersin2023.ThecorpuscurrentlyincludesdatafromClarivate’sWebofScience,TheLens,arXiv,PaperswithCode,SemanticScholar,andOpenAlex.
MoreinformationregardingourmethodologyforcompilingtheMergedAcademicCorpusaswellasbackgroundonourclassifiersandadetailedcitationofdatasourcesareavailablehere:
https://eto.tech/dataset-docs/mac/
;
/publication/identifying-ai-research/.
CenterforSecurityandEmergingTechnology|7
internalversionsofthesemodelsthathavebeentrainedonproprietarycodeand
customizedforemployeeuse.GoogleandMetahavecreatednon-public,customcodegenerationmodelsintendedtohelptheiremployeesdevelopnewproductsmore
efficiently.12
ProductivityisoftencitedasoneofthekeyreasonsindividualsandorganizationshaveadoptedAIcodegenerationtools.Metricsformeasuringhowmuchdeveloper
productivityimprovesbyleveragingAIcodegenerationtoolsvarybystudy.Asmall
GitHubstudyusedbothself-perceivedproductivityandtaskcompletiontimeas
productivitymetrics,buttheauthorsacknowledgedthatthereislittleconsensusaboutwhatmetricstouseorhowproductivityrelatestodeveloperwell-being.13AMcKinseystudyusingsimilarmetricsclaimedthatsoftwaredevelopersusinggenerativeAItoolscouldcompletecodingtasksuptotwiceasfastasthosewithoutthem,butthatthesebenefitsvarieddependingontaskcomplexityanddeveloperexperience.14Companieshavealsoruninternalproductivitystudieswiththeiremployees.AMetastudyontheirinternalcodegenerationmodelCodeComposeusedmetricssuchascodeacceptancerateandqualitativedeveloperfeedbacktomeasureproductivity,findingthat20%of
usersstatedthatCodeComposehelpedthemwritecodemorequickly,whileaGooglestudyfounda6%reductionincodingiterationtimewhenusinganinternalcode
completionmodelascomparedtoacontrolgroup.15Morerecently,aSeptember2024studyanalyzingdatafromrandomizedcontroltrialsacrossthreedifferentorganizationsfounda26%increaseinthenumberofcompletedtasksamongdevelopersusing
GitHubCopilotasopposedtodeveloperswhowerenotgivenaccesstothetool.16Moststudiesareinagreementthatcodegenerationtoolsimprovedeveloperproductivityin
general,regardlessoftheexactmetricsused.
AIcodegenerationtoolsareundoubtedlyhelpfultosomeprogrammers,especially
thosewhoseworkinvolvesfairlyroutinecodingtasks.(Generally,themorecommonacodingtaskorcodinglanguage,thebetteracodegenerationmodelcanbeexpectedtoperformbecauseitismorelikelytohavebeentrainedonsimilarexamples.)Automatingrotecodingtasksmayfreeupemployees’timeformorecreativeorcognitively
demandingwork.TheamountofsoftwarecodegeneratedbyAIsystemsisexpectedtoincreaseinthenear-tomedium-termfuture,especiallyasthecodingcapabilitiesof
today’smostaccessiblemodelscontinuetoimprove.
Broadlyspeaking,evidencesuggeststhatcodegenerationtoolshavebenefitsatboththeindividualandorganizationallevels,andthesebenefitsarelikelytoincreaseover
timeasmodelcapabilitiesimprove.Therearealsoplentyofincentives,suchaseaseof
accessandpurportedproductivitygains,fororganizationstoadopt—oratleastexperimentwith—AIcodegenerationforsoftwaredevelopment.
CenterforSecurityandEmergingTechnology|8
RisksAssociatedwithAICodeGeneration
Thistechnologicalbreakthrough,however,mustalsobemetwithcaution.Increasing
usageofcodegenerationmodelsinroutinesoftwaredevelopmentprocessesmeans
thatthesemodelswillsoonbeanimportantpartofthesoftwaresupplychain.Ensuringthattheiroutputsaresecure—orthatanyinsecureoutputstheyproduceareidentifiedandcorrectedbeforecodeentersproduction—willalsobeincreasinglyimportantfor
cybersecurity.However,codegenerationmodelsareseldomtrainedwithsecurityasabenchmarkandareinsteadoftentrainedtomeetvariousfunctionalitybenchmarkssuchasHumanEval,asetof164human-writtenprogrammingproblemsintendedto
evaluatemodels’code-writingcapabilityinthePythonprogramminglanguage.17Asthe
functionalityofthesecodegenerationmodelsincreasesandmodelsareadoptedintothestandardroutineoforganizationsanddevelopers,overlookingthepotential
vulnerabilitiesofsuchcodemayposesystemiccybersecurityrisks.
Theremainderofthissectionwillexaminethreepotentialsourcesofriskingreater
detail:1)codegenerationmodels’likelihoodofproducinginsecurecode,2)themodels’vulnerabilitytoattacks,and3)potentialdownstreamcybersecurityimplicationsrelatedtothewidespreaduseofcodegenerationmodels.
CodeGenerationModelsProduceInsecureCode
Anemergingbodyofresearchonthesecurityofcodegenerationmodelsfocusesonhowtheymightproduceinsecurecode.Thesevulnerabilitiesmaybecontainedwithinthecodeitselforinvolvecodethatcallsapotentiallyvulnerableexternalresource.
Human-computerinteractionfurthercomplicatesthisproblem,as1)usersmay
perceiveAI-generatedcodeasmoresecureormoretrustworthythanhuman-
generatedcode,and2)researchersmaybeunabletopinpointexactlyhowtostopmodelsfromgeneratinginsecurecode.Thissectionexploresthesevarioustopicsinmoredetail.
Firstly,variouscodegenerationmodelsoftensuggestinsecurecodeasoutputs.Pearceetal.(2021)showthatapproximately40%ofthe1,689programsgeneratedbyGithubCopilot18werevulnerabletoMITRE’s“2021CommonWeaknessEnumerations(CWE)Top25MostDangerousSoftwareWeaknesses”list.19SiddiqandSantos(2022)foundthatoutof130codesamplesgeneratedusingInCoderandGithubCopilot,68%and
73%ofthecodesamplesrespectivelycontainedvulnerabilitieswhenchecked
manually.20Khouryetal.(2023)usedChatGPTtogenerate21programsinfive
differentprogramminglanguagesandtestedforCWEs,showingthatonlyfiveoutof21wereinitiallysecure.Onlyafterspecificpromptingtocorrectthecodedidan
CenterforSecurityandEmergingTechnology|9
additionalsevencasesgeneratesecurecode.21Fuetal.(2024)showthatoutof452real-worldcasesofcodesnippetsgeneratedbyGithubCopilotfrompubliclyavailableprojects,32.8%ofPythonand24.5%ofJavaScriptsnippetscontained38different
CWEs,eightofwhichbelongtothe2023CWETop25list.22
Incertaincodinglanguages,codegenerationmodelsarealsolikelytoproducecodethatcallsexternallibrariesandpackages.Theseexternalcodesourcescanpresenta
hostofproblems,somesecurity-relevant:Theymaybenonexistentandmerely
hallucinatedbythemodel,outdatedandunpatchedforvulnerabilities,ormaliciousin
nature(suchaswhenattackersattempttotakeadvantageofcommonmisspellingsinURLsorpackagenames).23Forexample,VulcanCybershowedthatChatGPTroutinelyrecommendednonexistentpackageswhenansweringcommoncodingquestions
sourcedfromStackOverflow—over40outof201questionsinNode.jsandover80outof227questionsinPythoncontainedatleastonenonexistentpackageintheanswer.24Furthermore,someofthesehallucinatedlibraryandpackagenamesarepersistent
acrossbothusecasesanddifferentmodels;asafollow-upstudydemonstrated,a
potentialattackercouldeasilycreateapackagewiththesamenameandgetuserstounknowinglydownloadmaliciouscode.25
Despitetheseempiricalresults,thereareearlyindicationsthatusersperceiveAI-
generatedcodetobemoresecurethanhuman-writtencode.This“automationbias”
towardsAI-generatedcodemeansthatusersmayoverlookcarefulcodereviewand
acceptinsecurecodeasitis.Forinstance,ina2023industrysurveyof537technologyandITworkersandmanagers,76%respondedthatAIcodeismoresecurethanhumancode.26Perryetal.(2023)furthershowedinauserstudythatstudentparticipantswithaccesstoanAIassistantwrotesignificantlylesssecurecodethanthosewithout
access,andweremorelikelytobelievethattheywrotesecurecode.27However,thereissomedisagreementonwhetherornotusersofAIcodegenerationtoolsaremorelikelytowriteinsecurecode;otherstudiessuggestthatuserswithaccesstoAIcode
assistantsmaynotbesignificantlymorelikelytoproduceinsecurecodethanusers
withoutAItools.28Thesecontradictoryfindingsraiseaseriesofrelatedquestions,suchas:Howdoesauser’sproficiencywithcodingaffecttheiruseofcodegeneration
models,andtheirlikelihoodofacceptingAI-generatedcodeassecure?Could
automationbiasleadhumanprogrammerstoaccept(potentiallyinsecure)AI-generatedcodeassecuremoreoftenthanhuman-authoredcode?Regardless,thefactthatAI
codingtoolsmayprovideinexperienceduserswithafalsesenseofsecurityhas
cybersecurityimplicationsifAI-generatedcodeismoretrustedandlessscrutinizedforsecurityflaws.
CenterforSecurityandEmergingTechnology|10
Furthermore,thereremainsuncertaintyaroundwhycodegenerationmodelsproduceinsecurecodeinthefirstplace,andwhatcausesvariationinthesecurityofcode
outputsacrossandwithinmodels.Partoftheanswerliesinthatmanyofthesemodelsaretrainedoncodefromopen-sourcerepositoriessuchasGithub.Theserepositories
containhuman-authoredcodewithknownvulnerabilities,largelydonotenforcesecure
codingpractices,andlackdatasanitizationprocessesforremovingcodewitha
significantnumberofknownvulnerabilities.Recentworkhasshownthatsecurity
vulnerabilitiesinthetrainingdatacanleakintooutputsoftransformer-basedmodels,
whichdemonstratesthatvulnerabilitiesintheunderlyingtrainingdatacontributetotheproblemofinsecurecodegeneration.29Addingtothechallenge,thereisoftenlittleto
notransparencyinexactlywhatcodewasincludedintrainingdatasetsandwhetherornotanyattemptsweremadetoimproveitssecurity.
Manyotheraspectsofthequestionofhow—andwhy—codegenerationmodelsproduceinsecurecodearestillunanswered.Forexample,a2023Metastudythat
comparedseveralversionsofLlama2,CodeLlama,andGPT-3.5and4foundthat
modelswithmoreadvancedcodingcapabilitiesweremorelikelytooutputinsecure
code.30Thissuggestsapossibleinverserelationshipbetweenfunctionalityandsecurityincodegenerationmodelsandshouldbeinvestigatedfurther.Inanotherexample,
researchersconductedacomparativestudyoffourmodels–GPT-3.5,GPT-4,Bard,
andGemini–andfoundthatpromptingmodelstoadopta“securitypersona”eliciteddivergentresults.31WhileGPT-3.5,GPT-4,andBardsawareductioninthenumberofvulnerabilitiesfromthenormalpersona,Gemini’scodeoutputcontainedmore
vulnerabilities.32Theseearlystudieshighlightsomeoftheknowledgegapsconcerning
howinsecurecodeoutputsaregeneratedandhowtheychangeinresponsetovariablessuchasmodelsizeandpromptengineering.
Models’VulnerabilitytoAttack
Inadditiontothecodethattheyoutput,codegenerationmodelsaresoftwaretoolsthatneedtobeproperlysecured.AImodelsarevulnerabletohacking,tampering,or
manipulationinwaysthathumansarenot.33Figure2illustratesthecodegenerationmodeldevelopmentworkflow,wheretheportionsinredindicatevariouswaysa
maliciouscyberactormayattackamodel.
CenterforSecurityandEmergingTechnology|11
Figure2:CodeGenerationModelDevelopmentWorkflowandItsCybersecurityImplications
Source:CSET.
GenerativeAIsystemshaveknownvulnerabilitiestoseveraltypesofadversarial
attacks.Theseincludedatapoisoningattacks,inwhichanattackercontaminatesamodel’strainingdatatoelicitadesiredbehavior,andbackdoorattacks,inwhichan
attackerattemptstoproduceaspecificoutputbypromptingthemodelwitha
predeterminedtriggerphrase.Inthecodegenerationcontext,adatapoisoningattack
maylooklikeanattackermanipulatingamodel’strainingdatatoincreaseitslikelihoodofproducingcodethatimportsamaliciouspackageorlibrary.Abackdoorattackonthemodelitself,meanwhile,coulddramaticallychangeamodel’sbehaviorwithasingle
triggerthatmaypersistevenifdeveloperstrytoremoveit.34Thischangedbehaviorcanresultinanoutputthatviolatesrestrictionsplacedonthemodelbyitsdevelopers(suchas“don’tsuggestcodepatternsassociatedwithmalware”)orthatmayreveal
unwantedorsensitiveinformation.Researchershavepointedoutthatbecausecodegenerationmodelsaretrainedonlargeamountsofdatafromafinitenumberof
unsanitizedcoderepositories,attackerscouldeasilyseedthese
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲(chǔ)空間,僅對(duì)用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對(duì)用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請(qǐng)與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。
最新文檔
- 煙草制品個(gè)性化營銷策略-洞察分析
- 農(nóng)村護(hù)林防火發(fā)言稿范文(15篇)
- 營養(yǎng)與免疫力-洞察分析
- 演化策略可持續(xù)發(fā)展-洞察分析
- 創(chuàng)新驅(qū)動(dòng)的設(shè)計(jì)院醫(yī)療技術(shù)的突破口
- 辦公室文化中人與寄生蟲的和諧共生
- 《Ct擴(kuò)散爐結(jié)構(gòu)簡(jiǎn)介》課件
- 《生活中常見的鹽》課件
- 醫(yī)學(xué)領(lǐng)域?qū)嶒?yàn)教學(xué)中的心理干預(yù)實(shí)踐
- 優(yōu)化工業(yè)互聯(lián)網(wǎng)平臺(tái)的用戶體驗(yàn)策略
- 2024-2030年電助力自行車行業(yè)供需平衡分析及未來發(fā)展走勢(shì)預(yù)測(cè)報(bào)告
- 鄉(xiāng)村振興的實(shí)踐探索學(xué)習(xí)通超星期末考試答案章節(jié)答案2024年
- 《 太赫茲超材料設(shè)計(jì)仿真及其傳感特性研究》范文
- 2024中華人民共和國兩用物項(xiàng)出口管制條例全文解讀課件
- 戶外P10單色LED顯示屏方案
- 外研版小學(xué)英語(三起點(diǎn))六年級(jí)上冊(cè)期末測(cè)試題及答案(共3套)
- 醫(yī)療器械質(zhì)量記錄和追溯管理制度
- unit 5(單元測(cè)試)-2024-2025學(xué)年人教PEP版英語三年級(jí)上冊(cè)
- 2024-2030年中國立式輥磨機(jī)行業(yè)市場(chǎng)發(fā)展趨勢(shì)與前景展望戰(zhàn)略分析報(bào)告
- 保密工作履職報(bào)告?zhèn)€人
- 七年級(jí)生物上冊(cè) 2.1.1 練習(xí)使用顯微鏡教案 (新版)新人教版
評(píng)論
0/150
提交評(píng)論