State of AI Report 2024 - October 2024 (English) - Air Street Capital
…venture capital firm investing in AI-first companies. He runs the Research and Applied AI Summit (RAAIS), the RAAIS Foundation (funding open-source AI projects), AI communities in the US and Europe, and Spinout.fyi (improving university spinout creation). He studied biology at Williams College and earned a PhD from Cambridge in cancer research as a Gates Scholar. He regularly writes research, analysis, and commentary on … an associate director at Milltown Partners, where he advised big technology companies, start-ups, and investors on policy and positioning. He graduated from the University of Oxford in 2017 with a degree in History.

Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines. We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.

The State of AI Report is now in its seventh year. Consider this report a compilation of the most interesting things we've seen, with a goal of triggering an informed conversation about the state of AI and its implications for the future. We consider the following key dimensions in our report:
- Research: Technology breakthroughs and their capabilities.
- Industry: Areas of commercial application for AI and its business impact.
- Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
- Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
- Predictions: What we believe will happen in the next 12 months and a 2023 performance review to keep us honest.

Definitions:
Artificial intelligence (AI): a broad discipline with the goal of creating intelligent machines, as opposed to the natural intelligence that is demonstrated by humans and animals.
Artificial general intelligence (AGI): a term used to describe future machines that could match and then exceed the full range of human cognitive ability across all economically valuable tasks.
AI Agent: an AI-powered system that can take actions in an environment. For example, an LLM that has access to a suite of tools and has to decide which one to use in order to accomplish a task that it has been prompted to do.
AI Safety: a field that studies and attempts to mitigate the risks (minor to catastrophic) which future AI could pose to humanity.
Computer vision (CV): the ability of a program to analyse and understand images and video.
Deep learning (DL): an approach to AI inspired by how neurons in the brain recognise complex patterns in data. The "deep" refers to the many layers of neurons in today's models that help to learn rich representations of data to achieve better performance gains.
Diffusion: an algorithm that iteratively denoises an artificially corrupted signal in order to generate new, high-quality outputs. In recent years it has been at the forefront of image generation and protein design.
Generative AI: a family of AI systems that are capable of generating new content (e.g. text, images, audio, or 3D assets) based on 'prompts'.
Graphics Processing Unit (GPU): a semiconductor processing unit that enables a large number of calculations to be computed in parallel. Historically this was required for rendering computer graphics. Since 2012, GPUs have been adapted for training DL models, which also require a large number of parallel calculations.
(Large) Language model (LM, LLM): a model trained on vast amounts of (often) textual data to predict the next word in a self-supervised manner. The term "LLM" is used to designate multi-billion parameter LMs, but this is a moving definition.
Machine learning (ML): a subset of AI that often uses statistical techniques to give machines the ability to "learn" from data without being explicitly given the instructions for how to do so. This process is known as "training" a "model" using a learning "algorithm" that progressively improves model performance on a specific task.
Model: a ML algorithm trained on data and used to make predictions.
Natural language processing (NLP): the ability of a program to understand human language.
Prompt: a user input, often written in natural language, that is used to instruct an LLM to generate something or take action.
Reinforcement learning (RL): an area of ML in which software agents learn goal-oriented behavior (a "policy") by trial and error in an environment that provides rewards or penalties in response to their actions towards achieving that goal.
Self-supervised learning (SSL): a form of unsupervised learning, where manually labeled data is not needed. Raw data is instead modified in an automated way to create artificial labels to learn from. An example of SSL is learning to complete text by masking random words in a sentence and trying to predict the missing ones.
Transformer: a model architecture at the core of most state-of-the-art (SOTA) ML research. It is composed of multiple "attention" layers which learn which parts of the input data are the most important for a given task. Transformers started in NLP (specifically machine translation) and subsequently were expanded into computer vision, audio, and other modalities.
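To make the "attention" mechanism in the Transformer definition above concrete, here is a minimal single-head scaled dot-product attention sketch in NumPy. The shapes, random projections, and the absence of masking and multiple heads are simplifying assumptions for illustration; real transformer layers add multi-head projections, masking, residual connections, and normalization.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Q, K, V: (seq_len, d) arrays of query/key/value vectors.
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # relevance of every position to every query
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # weighted mix of value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                  # 5 tokens with 16-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)       # (5, 16) context-mixed representations
print(out.shape)
```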
In the rest of the slides, icons in the top right corner indicate input and output modalities for the model, e.g. software tool use (text, code generation and execution), 3D, robot state, biological data, and combinations such as text-to-software-tool-use, image-to-3D, and text-to-3D.

Executive summary:
- Frontier lab performance converges, but OpenAI maintains its edge following the launch of o1, as planning and reasoning emerge as a major frontier.
- Foundation models demonstrate their ability to break out of language as multimodal research drives into mathematics, biology, genomics, the physical sciences, and neuroscience.
- US sanctions fail to stop Chinese (V)LLMs rising up community leaderboards.
- NVIDIA remains the most powerful company in the world, enjoying a stint in the $3T club, while regulators probe the concentrations of power within GenAI.
- More established GenAI companies bring in billions of dollars in revenue, while start-ups begin to gain traction in sectors like video and audio generation. Although companies begin to make the journey from model to product, long-term questions around pricing and sustainability remain unresolved.
- Driven by a bull run in public markets, AI companies reach $9T in value, while investment levels grow healthily in private companies.
- While global governance efforts stall, national and regional AI regulation has continued to advance, with controversial legislation passing in the US and EU.
- The reality of compute requirements forces Big Tech companies to reckon with real-world physical constraints on scaling and their own emissions targets. Meanwhile, governments' own attempts to build capacity continue to lag.
- Anticipated AI effects on elections, employment and a range of other sensitive areas are yet to be realized at any scale.
- A vibe-shift from safety to acceleration takes place as companies that previously warned us about the pending extinction of humanity need to ramp up enterprise sales and usage of their consumer apps.
- Governments around the world emulate the UK in building up state capacity around AI safety, launching institutes and studying critical national infrastructure for potential vulnerabilities.
- Every proposed jailbreaking 'fix' has failed, but researchers are increasingly concerned with more sophisticated, long-term attacks.
Reviewing our predictions from last year:
- Prediction: A Hollywood-grade production makes use of generative AI for visual effects. Outcome: Largely badly, but GenAI visual effects have been seen in Netflix and HBO productions.
- Prediction: A generative AI media company is investigated for its misuse during the 2024 US election circuit. Outcome: Not yet, but there's still time.
- Prediction: Self-improving AI agents crush SOTA in a complex environment (e.g. AAA game, tool use, science). Outcome: Not yet, despite promising work on open-endedness, including strong game performance.
- Prediction: Tech IPO markets unthaw and we see at least one major listing for an AI-focused company (e.g. DBRX). Outcome: While the Magnificent Seven have enjoyed strong gains, private companies are hanging on until markets settle. However, AI chip company Cerebras has filed to IPO.
- Prediction: The GenAI scaling craze sees a group spend >$1B to train a single large-scale model. Outcome: Not quite yet - let's give it another year.
- Prediction: The US's FTC or UK's CMA investigate the Microsoft/OpenAI deal on competition grounds. Outcome: Both regulators are investigating this partnership.
- Prediction: We see limited progress on global AI governance beyond high-level voluntary commitments. Outcome: The commitments from the Bletchley and Seoul summits remain voluntary and high-level.
- Prediction: Financial institutions launch GPU debt funds to replace VC equity dollars for compute funding. Outcome: Some VC funds are rumored to be offering GPUs for equity, but we're yet to see anyone go down the debt route.
- Prediction: An AI-generated song breaks into the Billboard Hot 100 Top 10 or the Spotify Top Hits 2024. Outcome: It turns out this had already happened last year with "Heart on My Sleeve", but we've also seen an AI-generated song reach #27 in Germany and spend several days in the Top 50.
- Prediction: As inference workloads and costs grow significantly, a large AI company (e.g. OpenAI) acquires or builds an inference-focused AI chip company. Outcome: Sam Altman is reportedly raising huge sums of money to do this, while each of Google, Amazon, Meta and Microsoft continue to build and improve their own AI silicon.

● On both formal benchmarks and vibes-based analysis, the best-funded frontier labs are able to rack up scores within low single digits of each other on individual capabilities. Models are now consistently highly capable coders and are strong at factual recall and math, but less good at open-ended question-answering and multi-modal problem solving. Many of the variations are sufficiently small that they are now likely to be the product of differences in implementation. For example, GPT-4o outperforms Claude 3.5 Sonnet on MMLU, but apparently underperforms it on MMLU-Pro - a benchmark designed to be more challenging. Considering the relatively subtle technical differences between architectures and likely heavy overlaps in pre-training data, model builders are now increasingly having to compete on new capabilities and product…

● By shifting compute from pre- and post-training to inference, o1 reasons through complex prompts step-by-step in a chain-of-thought (CoT) style, employing RL to sharpen the CoT and the strategies it uses. This unlocks the possibility of solving multi-layered math, science, and coding problems where LLMs have historically struggled, due to the inherent limitations of next-token prediction. OpenAI reports significant improvements on reasoning-heavy benchmarks versus 4o, with the starkest on AIME 2024 (competition math): a whopping score of 83.83 versus 13.4. However, this capability comes at a steep price: 1M input tokens of o1-preview cost $15, while 1M output tokens will set you back $60. This makes it 3-4x more expensive than GPT-4o. OpenAI is clear in its API documentation that it is not a like-for-like 4o replacement and that it is not the best model for tasks that require consistently quick responses, image…

● Meta stuck to the same decoder-only transformer architecture that it's used since Llama 1, with minor adaptations, namely more transformer layers and attention heads. Meta used an incredible 15T tokens to train the family. While this blew through the "Chinchilla-optimal" amount of training compute, they found that both the 8B and 70B models improved log-linearly up to 15T. Llama 3.1 405B was trained over 16,000 H100 GPUs, the first Llama model trained at this scale (a back-of-envelope compute estimate follows below). Meta followed up with Llama 3.2 in September, which incorporated 11B and 90B VLMs (Llama's multimodal debut). The former was competitive with Claude 3 Haiku, the latter with GPT-4o-mini. The company also released 1B and 3B text-only models, designed to operate on-device. Llama-based models have now racked up over 440M downloads on Hugging Face.
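As a rough sense of the scale implied by the figures above, here is a back-of-envelope estimate of Llama 3.1 405B's pre-training compute using the widely cited ~6·N·D FLOPs approximation for dense transformers. The parameter count, token count, and GPU count come from the text; the 6·N·D rule, the per-H100 throughput, and the utilization figure are assumptions, so the result is indicative only.

```python
# Back-of-envelope pre-training compute for Llama 3.1 405B.
N = 405e9                      # parameters (from the text)
D = 15e12                      # training tokens (from the text)
train_flops = 6 * N * D        # ~6*N*D rule of thumb -> ~3.6e25 FLOPs

gpus = 16_000                  # H100s, per the text
peak_flops_per_gpu = 9.9e14    # ~989 TFLOP/s dense BF16 (assumed)
utilization = 0.40             # assumed model FLOPs utilization

seconds = train_flops / (gpus * peak_flops_per_gpu * utilization)
print(f"{train_flops:.2e} FLOPs, roughly {seconds / 86_400:.0f} days on {gpus} GPUs")
```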
● A team from the University of Edinburgh flagged up the number of mistakes in MMLU, including wrong ground truth, unclear questions, and multiple correct answers. While low across most individual topics, there were big spikes in certain fields, such as virology, where 57% of the analyzed instances contained errors. On a manually corrected MMLU subset, models broadly gain in performance, although they worsened on professional law and formal logic. This suggests inaccurate MMLU instances are being learned during pre-training. In more safety-critical territory, OpenAI has warned that SWE-bench, which evaluates models' ability to solve real-world software issues, was underestimating the autonomous software engineering capabilities of models, as it contained tasks that were hard or impossible to solve. The researchers partnered with the creators of the benchmark to create…

● The arena, which allows users to interact with two randomly selected chatbots side-by-side, provides a rough crowdsourced evaluation. … receiving the same scores, with the latter also outperforming Claude Sonnet 3.5. This has led to concerns that the ranking is essentially becoming a way of assessing which writing style users happen to prefer most. Additionally, as smaller models tend to perform less well on tasks involving more tokens, the 8k context limit arguably gives them an unfair advantage. However, the early version of the vision leaderboard is now beginning to gain traction and aligns better with other evals.

● A Google DeepMind/NYU team generated millions of synthetic theorems and proofs using symbolic engines, using them to train a language model from scratch.
● AlphaGeometry alternates between the language model proposing new constructions and symbolic engines performing deductions until a solution is found.
● Impressively, it solved 25 out of 30 problems on a benchmark of Olympiad-level geometry problems, nearing the performance of human International Mathematical Olympiad gold medalists. The next best AI performance scored only 10.
● It also demonstrated generalisation capabilities - for example, finding that a specific detail in a 2004 IMO problem was unnecessary for the proof.

● A Meta/MIT team looking at open-weight pre-trained LLMs concluded that it's possible to do away with up to half a model's layers and suffer only negligible performance drops on question-answering benchmarks. They identified optimal layers for removal based on similarity and then "healed" the model through small amounts of efficient fine-tuning.
● NVIDIA researchers took a more radical approach by pruning layers, neurons, attention heads, and embeddings, and then using knowledge distillation for efficient retraining. The resulting models achieved comparable or superior performance to models like Mistral 7B and Llama-3 8B while using up to 40x fewer training tokens.

● Google have embraced this approach, distilling Gemini 1.5 Flash from Gemini 1.5 Pro, while Gemma 2 9B was distilled from Gemma 2 27B, and Gemma 2B from a larger unreleased model.
● There is also community speculation that Claude 3 Haiku, a highly capable smaller model, is a distilled version of the larger Opus, but Anthropic has never confirmed this.
● These distillation efforts are going multimodal too. Black Forest Labs have released FLUX.1 dev, an open-weight text-to-image model distilled from their Pro model.
● To support these efforts, the community has started to produce open-source distillation tools, like arcee.ai's DistillKit, which supports both logit-based and hidden-states-based distillation (a minimal sketch of the logit-based flavour follows below).
● Llama 3.1 405B is also being used for distillation, after Meta updated its terms so output logits can be used to improve any models, not just Llama models.
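As an illustration of the logit-based distillation mentioned above (one of the two modes tools like DistillKit support), here is a minimal PyTorch distillation loss: the student is trained to match the teacher's softened output distribution alongside the usual cross-entropy on ground-truth labels. The temperature, weighting, and toy tensors are assumptions for illustration, not any lab's or library's actual recipe.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence against the teacher's temperature-softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth tokens.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(4, 32000, requires_grad=True)  # (batch, vocab)
teacher_logits = torch.randn(4, 32000)
labels = torch.randint(0, 32000, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
print(loss.item())
```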
● Microsoft's phi-3.5-mini is a 3.8B LM that competes with larger 7B-8B class models such as Llama 3.1 8B. It performs well on reasoning and question-answering, but its size restricts its factual knowledge. To enable on-device inference, the model was quantized to 4 bits, reducing its memory footprint to approximately 1.8GB.
● Apple introduced MobileCLIP, a family of efficient image-text models optimized for fast inference on smartphones. Using novel multimodal reinforced training, they improve the accuracy of compact models by transferring knowledge from an image captioning model and an ensemble of strong CLIP encoders.
● Hugging Face also got in on the action with SmolLM, a family of small language models, available in 135M, 360M, and 1.7B formats. By using a highly curated synthetic dataset created via an enhanced version of Cosmopedia (see slide 31), the team achieved SOTA performance for the size.

● Microsoft's BitNet uses a "BitLinear" layer to replace standard linear layers, employing 1-bit weights and quantized activations.
● It shows competitive performance compared to full-precision models and demonstrates a scaling law similar to full-precision transformers, with significant memory and energy savings.
● Microsoft followed up with BitNet b1.58, which uses ternary weights to match full-precision LLM performance at 3B size while retaining the efficiency gains.
● Meanwhile, ByteDance's TiTok (Transformer-based 1-Dimensional Tokenizer) quantizes images into compact 1D sequences of discrete tokens for image reconstruction and generation tasks. This allows images to be represented with as few as 32 tokens, instead of hundreds or thousands.

● Inspired by model interpretability research, ReFT (Representation Fine-tuning) doesn't alter the model's weights. Instead, it manipulates the model's internal representations at inference time to steer its behavior. While it comes with a slight inference penalty, ReFT requires 15-65x fewer parameters compared to weight-based fine-tuning methods. It also enables more selective interventions on specific layers and token positions, enabling fine-grained control over the adaptation process. The researchers show its potential in few-shot adaptation where a chat model is given a new persona with just five examples. Combined with the small storage footprint for learned interventions, it could be used for real-time personalization on devices with sufficient compute power.

● Selective state-space models like Mamba, designed last year to handle long sequences more efficiently, can to some extent compete with transformers, but lag on tasks that require copying or in-context learning. That said, Falcon's Mamba 7B shows impressive benchmark performance versus similar-sized transformer models.
● Hybrid models appear to be a more promising direction. Combined with self-attention and MLP layers, AI21's Mamba-Transformer hybrid model outperforms an 8B transformer across knowledge and reasoning benchmarks, while being up to 8x faster at generating tokens in inference.
● In a nostalgia trip, there are early signs of a comeback for recurrent neural networks, which had fallen out of fashion due to training and scaling difficulties.
● Griffin, trained by Google DeepMind, mixes linear recurrences and local attention, holding its own against Llama-2 while being trained on 6x fewer tokens.

● MOHAWK is a new method for distilling knowledge from a large, pre-trained transformer model (teacher) to a smaller, subquadratic model (student) like a state-space model (SSM).
● It aligns i) the sequence transformation matrices of the student and teacher models and ii) the hidden states of each layer, then iii) transfers the remaining weights of the teacher model to the student model to fine-tune it (a minimal sketch of the hidden-state matching step follows below).
● The authors create Phi-Mamba, a new student model combining Mamba-2 and an MLP block, and a variant called Hybrid-Phi-Mamba that retains some attention layers from the teacher model. MOHAWK can train Phi-Mamba and Hybrid-Phi-Mamba to achieve performance close to the teacher model. Phi-Mamba is distilled with only 3B tokens, less than 1% of the data used to train the previously best-performing Mamba models, and 2% of the data used for the Phi-1.5 model itself.
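A minimal sketch of the hidden-state matching idea in stage ii) above: each student layer is pushed towards the corresponding teacher layer's hidden states. The toy linear "layers", the plain MSE objective, and the shared input stream are simplifying assumptions; MOHAWK's actual procedure also aligns sequence-transformation matrices, matches inputs more carefully, and transfers the teacher's remaining weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, n_layers = 64, 4
teacher = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_layers)])
student = nn.ModuleList([nn.Linear(d_model, d_model) for _ in range(n_layers)])
opt = torch.optim.AdamW(student.parameters(), lr=1e-3)

x = torch.randn(8, 32, d_model)              # (batch, seq, d_model) toy inputs
loss, h_t, h_s = 0.0, x, x
for t_layer, s_layer in zip(teacher, student):
    h_t = t_layer(h_t).detach()              # teacher states are fixed targets
    h_s = s_layer(h_s)
    loss = loss + nn.functional.mse_loss(h_s, h_t)  # align layer by layer

opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```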
● As well as being the main source of training data for the Phi family, synthetic data was used by Anthropic when training Claude 3 to help represent scenarios that might have been missing in the training data.
● Hugging Face used Mixtral-8x7B Instruct to generate over 30M files and 25B tokens of synthetic textbooks, blog posts, and stories to recreate the Phi-1.5 training dataset, which they dubbed Cosmopedia.
● To make this process easier, NVIDIA released the Nemotron-4-340B family, a suite of models designed specifically for synthetic data generation, available via a permissive license. Meta's Llama can also be used for synthetic data generation.
● It also appears possible to create synthetic high-quality instruction data by extracting it directly from an aligned LLM, with techniques like Magpie. Models fine-tuned this way sometimes perform comparably to Llama-3-8B-Instruct.

● A Nature paper from Oxford and Cambridge researchers found that model collapse occurs across various AI architectures, including fine-tuned language models, challenging the idea that pre-training or periodic exposure to small amounts of original data can prevent degradation (measured by perplexity score).
● This creates a "first mover advantage", as sustained access to diverse, human-generated data will become increasingly critical for maintaining model quality.
● However, these results are primarily focused on a scenario where real data is replaced with synthetic data over generations. In practice, real and synthetic data usually accumulate.
● Other research suggests that, provided the proportion of synthetic data doesn't get too high, collapse can usually be avoided.

● The FineWeb dataset was created through a multi-step process including base filtering, independent MinHash deduplication per dump, selected filters derived from the C4 dataset, and the team's custom filters.
● Text extraction using the trafilatura library produced higher quality data than the default Common Crawl WET files, even though the resulting dataset was meaningfully smaller.
● They found deduplication drove performance improvements up to a point, before hitting diminishing returns and then worsening performance. The team also used llama-3-70b-instruct to annotate 500k samples from FineWeb, scoring each for its educational quality on a scale from 0 to 5. FineWeb-edu, which filtered out samples scored below 3, outperformed FineWeb and all other open datasets, despite being significantly smaller.

● Following the playbook that's proven effective in regular LLMs, massive performance improvements in embedding models have come from scale (GritLM has ~47B parameters vs the 110M common among prior embedding models). Similarly, the usage of broad web-scale corpora and improved filtering methods have led to large improvements in the smaller models. Meanwhile, ColPali is a vision-language embedding model that exploits the visual structure of documents, not just their text embeddings, to improve retrieval. Retrieval models are one of the few subdomains where open models commonly outperform proprietary models from the biggest labs. On the MTEB Retrieval Leaderboard…

● Anthropic addressed this (chunks losing their surrounding document context) using 'contextual embeddings', where a prompt instructs the model to generate text explaining the context of each chunk in the document (a minimal sketch of this idea follows after the RAG-evaluation bullets below). They found that this approach leads to a 35% reduction in the top-20 retrieval failure rate (5.7% → 3.7%). It can then be scaled using Anthropic's prompt caching. As Fernando Diaz of CMU observed in a recent thread, this is a great example of techniques pioneered in one area of AI research (e.g. early speech retrieval and document expansion work) being applied to another. … Research from Chroma shows that the choice of chunking strategy can affect retrieval performance by up to 9% in recall.

● Researchers are now pioneering novel approaches, like Ragnarök, which introduces a novel web-based arena for human evaluation through pairwise system comparisons. This addresses the challenge of assessing RAG quality beyond traditional automated metrics.
● Meanwhile, ResearchyQuestions provides a large-scale collection of complex, multi-faceted questions that require in-depth research and analysis to answer, drawn from real user queries.
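A minimal sketch of the chunk-contextualization idea behind the contextual embeddings described above: an LLM writes a short note situating each chunk within its source document, and the note is prepended to the chunk before embedding. The `generate` and `embed` callables are hypothetical placeholders supplied by the caller, not a real API.

```python
from typing import Callable, List

def contextualize_chunks(document: str,
                         chunks: List[str],
                         generate: Callable[[str], str],
                         embed: Callable[[str], List[float]]) -> List[List[float]]:
    """Embed each chunk together with a short LLM-written note about its context."""
    vectors = []
    for chunk in chunks:
        prompt = (
            "Here is a document:\n" + document +
            "\n\nHere is a chunk from it:\n" + chunk +
            "\n\nWrite 1-2 sentences situating this chunk within the document."
        )
        context = generate(prompt)                     # chunk-specific context note
        vectors.append(embed(context + "\n" + chunk))  # embed note + chunk together
    return vectors
```

In practice the per-chunk `generate` calls are what prompt caching amortizes, since the full document is resent with every chunk.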
● Google DeepMind has proposed Distributed Low-Communication (DiLoCo), an optimization algorithm that allows training to occur on multiple loosely connected "islands" of devices.
● Each island performs a large number of local update steps before communicating with the others, reducing the need for frequent data exchange (a toy simulation of this pattern follows below). They're able to demonstrate fully synchronous optimization across 8 of these islands while reducing communication 500x.
● GDM also proposed a refined version of DiLoCo, optimized for asynchronous settings.
● Researchers at Prime Intellect released an open-source implementation and replication of DiLoCo, while scaling it up 3x, to demonstrate its effectiveness on 1B parameter models.
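A toy, single-process simulation of the local-update pattern described above: each "island" takes many local optimizer steps, and parameters are exchanged only once per round. The model, data, and the plain parameter averaging at synchronization are illustrative assumptions; DiLoCo itself applies an outer Nesterov-momentum update to the averaged parameter deltas rather than simple averaging.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
global_model = nn.Linear(10, 1)
islands, local_steps, rounds = 4, 50, 3   # many local steps per communication round

for _ in range(rounds):
    replicas = [copy.deepcopy(global_model) for _ in range(islands)]
    for replica in replicas:                          # in reality these run in parallel
        opt = torch.optim.AdamW(replica.parameters(), lr=1e-2)
        for _ in range(local_steps):                  # no communication inside this loop
            x = torch.randn(32, 10)
            y = x.sum(dim=1, keepdim=True)            # toy regression target
            loss = nn.functional.mse_loss(replica(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                             # one exchange per round
        for name, p in global_model.named_parameters():
            p.copy_(torch.stack(
                [dict(r.named_parameters())[name] for r in replicas]).mean(dim=0))

print("final local loss:", loss.item())
```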