…venture capital firm investing in AI-first companies. He runs the Research and Applied AI Summit (RAAIS), the RAAIS Foundation (funding open-source AI projects), AI communities in the US and Europe, and Spinout.fyi (improving university spinout creation). He studied biology at Williams College and earned a PhD from Cambridge in cancer research as a Gates Scholar.

…regularly writes research, analysis, and commentary. […] an associate director at Milltown Partners, where he advised big technology companies, start-ups, and investors on policy and positioning. He graduated from the University of Oxford in 2017 with a degree in History.

Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines. We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.

The State of AI Report is now in its seventh year. Consider this report a compilation of the most interesting things we've seen, with a goal of triggering an informed conversation about the state of AI and its implications for the future. We consider the following key dimensions in our report:

- Research: Technology breakthroughs and their capabilities.
- Industry: Areas of commercial application for AI and its business impact.
- Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
- Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
- Predictions: What we believe will happen in the next 12 months and a 2023 performance review to keep us honest.

Definitions used throughout the report:

- Artificial intelligence (AI): a broad discipline with the goal of creating intelligent machines, as opposed to the natural intelligence that is demonstrated by humans and animals.
- Artificial general intelligence (AGI): a term used to describe future machines that could match and then exceed the full range of human cognitive ability across all economically valuable tasks.
- AI Agent: an AI-powered system that can take actions in an environment. For example, an LLM that has access to a suite of tools and has to decide which one to use in order to accomplish a task that it has been prompted to do.
- AI Safety: a field that studies and attempts to mitigate the risks (minor to catastrophic) which future AI could pose to humanity.
- Computer vision (CV): the ability of a program to analyse and understand images and video.
- Deep learning (DL): an approach to AI inspired by how neurons in the brain recognise complex patterns in data. The "deep" refers to the many layers of neurons in today's models that help to learn rich representations of data to achieve better performance gains.
- Diffusion: an algorithm that iteratively denoises an artificially corrupted signal in order to generate new, high-quality outputs. In recent years it has been at the forefront of image generation and protein design.
- Generative AI: a family of AI systems that are capable of generating new content (e.g. text, images, audio, or 3D assets) based on "prompts".
- Graphics Processing Unit (GPU): a semiconductor processing unit that enables a large number of calculations to be computed in parallel. Historically, this was required for rendering computer graphics. Since 2012, GPUs have been adapted for training DL models, which also require a large number of parallel calculations.
- (Large) Language model (LM, LLM): a model trained on vast amounts of (often) textual data to predict the next word in a self-supervised manner. The term "LLM" is used to designate multi-billion parameter LMs, but this is a moving definition.
- Machine learning (ML): a subset of AI that often uses statistical techniques to give machines the ability to "learn" from data without being explicitly given the instructions for how to do so. This process is known as "training" a "model" using a learning "algorithm" that progressively improves model performance on a specific task.
- Model: a ML algorithm trained on data and used to make predictions.
- Natural language processing (NLP): the ability of a program to understand human language as it is spoken and written.
- Prompt: a user input often written in natural language that is used to instruct an LLM to generate something or take action.
- Reinforcement learning (RL): an area of ML in which software agents learn goal-oriented behavior by trial and error in an environment that provides rewards or penalties in response to their actions (called a "policy") towards achieving that goal.
- Self-supervised learning (SSL): a form of unsupervised learning where manually labeled data is not needed. Raw data is instead modified in an automated way to create artificial labels to learn from. An example of SSL is learning to complete text by masking random words in a sentence and trying to predict the missing ones.
- Transformer: a model architecture at the core of most state-of-the-art (SOTA) ML research. It is composed of multiple "attention" layers which learn which parts of the input data are the most important for a given task. Transformers started in NLP (specifically machine translation) and subsequently were expanded into computer vision, audio, and other modalities.
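The SSL entry is the one definition above that benefits from a concrete example. Below is a minimal sketch, in Python, of the masking objective it describes: raw text is modified automatically to create artificial labels, with no human annotation. The function and all names are illustrative, not from the report.

```python
import random

def make_masked_examples(sentence: str, mask_rate: float = 0.3, seed: int = 0):
    """Replace random words with [MASK]; the hidden originals become labels."""
    rng = random.Random(seed)
    words = sentence.split()
    inputs, labels = [], {}
    for i, word in enumerate(words):
        if rng.random() < mask_rate:
            inputs.append("[MASK]")
            labels[i] = word  # a model is trained to predict this token
        else:
            inputs.append(word)
    return " ".join(inputs), labels

masked, targets = make_masked_examples(
    "Self-supervised learning creates labels directly from raw data"
)
print(masked)   # e.g. "Self-supervised [MASK] creates labels directly from [MASK] data"
print(targets)  # e.g. {1: "learning", 6: "raw"}
```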
In the rest of the slides, icons in the top right corner indicate input and output modalities for the model: software tool use (text, code generation & execution), 3D, robot state, biological modality, text → software tool use, image → 3D, and text → 3D.

- Frontier lab performance converges, but OpenAI maintains its edge following the launch of o1, as planning and reasoning emerge as a major frontier.
- Foundation models demonstrate their ability to break out of language as multimodal research drives into mathematics, biology, genomics, the physical sciences, and neuroscience.
- US sanctions fail to stop Chinese (V)LLMs rising up community leaderboards.
- NVIDIA remains the most powerful company in the world, enjoying a stint in the $3T club, while regulators probe the concentrations of power within GenAI.
- More established GenAI companies bring in billions of dollars in revenue, while start-ups begin to gain traction in sectors like video and audio generation. Although companies begin to make the journey from model to product, long-term questions around pricing and sustainability remain unresolved.
- Driven by a bull run in public markets, AI companies reach $9T in value, while investment levels grow healthily in private companies.
- While global governance efforts stall, national and regional AI regulation has continued to advance, with controversial legislation passing in the US and EU.
- The reality of compute requirements forces Big Tech companies to reckon with real-world physical constraints on scaling and their own emissions targets. Meanwhile, governments' own attempts to build capacity continue to lag.
- Anticipated AI effects on elections, employment and a range of other sensitive areas are yet to be realized at any scale.
- A vibe-shift from safety to acceleration takes place as companies that previously warned us about the pending extinction of humanity need to ramp up enterprise sales and usage of their consumer apps.
- Governments around the world emulate the UK in building up state capacity around AI safety, launching institutes and studying critical national infrastructure for potential vulnerabilities.
- Every proposed jailbreaking "fix" has failed, but researchers are increasingly concerned with more sophisticated, long-term attacks.

Reviewing our predictions from last year's report (stateof.ai 2024):

- A Hollywood-grade production makes use of generative AI for visual effects. Outcome: largely badly, but GenAI visual effects have been seen in Netflix and HBO productions.
- A generative AI media company is investigated for its misuse during the 2024 US election circuit. Outcome: not yet, but there's still time.
- Self-improving AI agents crush SOTA in a complex environment (e.g. AAA game, tool use, science). Outcome: not yet, despite promising work on open-endedness, including strong game performance.
- Tech IPO markets unthaw and we see at least one major listing for an AI-focused company (e.g. DBRX). Outcome: while the Magnificent Seven have enjoyed strong gains, private companies are hanging on until markets settle. However, AI chip company Cerebras has filed to IPO.
- The GenAI scaling craze sees a group spend >$1B to train a single large-scale model. Outcome: not quite yet, let's give it another year.
- The US's FTC or UK's CMA investigate the Microsoft/OpenAI deal on competition grounds. Outcome: both regulators are investigating this partnership.
- We see limited progress on global AI governance beyond high-level voluntary commitments. Outcome: the commitments from the Bletchley and Seoul summits remain voluntary and high-level.
- Financial institutions launch GPU debt funds to replace VC equity dollars for compute funding. Outcome: some VC funds are rumored to be offering GPUs for equity, but we're yet to see anyone go down the debt route.
- An AI-generated song breaks into the Billboard Hot 100 Top 10 or the Spotify Top Hits 2024. Outcome: it turns out this had already happened last year with "Heart on My Sleeve", but we've also seen an AI-generated song reach #27 in Germany and spend several days in the Top 50.
- As inference workloads and costs grow significantly, a large AI company (e.g. OpenAI) acquires or builds an inference-focused AI chip company. Outcome: Sam Altman is reportedly raising huge sums of money to do this, while each of Google, Amazon, Meta and Microsoft continue to build and improve their own AI silicon.

On both formal benchmarks and vibes-based analysis, the best-funded frontier labs are able to rack up scores within low single digits of each other on individual capabilities. Models are now consistently highly capable coders and are strong at factual recall and math, but less good at open-ended question-answering and multi-modal problem solving. Many of the variations are sufficiently small that they are now likely to be the product of differences in implementation. For example, GPT-4o outperforms Claude 3.5 Sonnet on MMLU, but apparently underperforms it on MMLU-Pro, a benchmark designed to be more challenging. Considering the relatively subtle technical differences between architectures and likely heavy overlaps in pre-training data, model builders are now increasingly having to compete on new capabilities and product.

By shifting compute from pre- and post-training to inference, o1 reasons through complex prompts step-by-step in a chain-of-thought (CoT) style, employing RL to sharpen the CoT and the strategies it uses. This unlocks the possibility of solving multi-layered math, science, and coding problems where LLMs have historically struggled, due to the inherent limitations of next-token prediction. OpenAI report significant improvements on reasoning-heavy benchmarks versus 4o, with the starkest on AIME 2024 (competition math): a whopping score of 83.83 versus 13.4. However, this capability comes at a steep price: 1M input tokens of o1-preview cost $15, while 1M output tokens will set you back $60. This makes it 3-4x more expensive than GPT-4o. OpenAI is clear in its API documentation that it is not a like-for-like 4o replacement and that it is not the best model for tasks that require consistently quick responses, image inputs, or function calling.
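OpenAI has not published o1's training or inference recipe, so any code can only gesture at the idea. The sketch below uses a well-known public technique, self-consistency, to show the same trade in miniature: spend more inference compute by sampling several chains of thought, then majority-vote on the final answer. `generate_cot` is a hypothetical stand-in for an LLM call, and its toy implementation exists only so the sketch runs.

```python
from collections import Counter

def self_consistency(prompt: str, generate_cot, n_samples: int = 8) -> str:
    """More samples = more inference compute = (often) higher accuracy."""
    answers = []
    for i in range(n_samples):
        _reasoning, answer = generate_cot(prompt, temperature=0.7, seed=i)
        answers.append(answer)
    # The most common final answer wins; ties go to the first one seen.
    return Counter(answers).most_common(1)[0][0]

def generate_cot(prompt, temperature, seed):
    # Toy stand-in: a real implementation would sample a reasoning chain
    # and a final answer from an actual model.
    return ("step-by-step reasoning...", "42" if seed % 4 else "41")

print(self_consistency("What is 6 * 7?", generate_cot))  # -> "42"
```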
Meta stuck to the same decoder-only transformer architecture that it's used since Llama 1, with minor adaptations, namely more transformer layers and attention heads. Meta used an incredible 15T tokens to train the family. While this blew through the "Chinchilla-optimal" amount of training compute, they found that both the 8B and 70B models improved log-linearly up to 15T. Llama 3.1 405B was trained over 16,000 H100 GPUs, the first Llama model trained at this scale. Meta followed up with Llama 3.2 in September, which incorporated 11B and 90B VLMs (Llama's multimodal debut). The former was competitive with Claude 3 Haiku, the latter with GPT-4o-mini. The company also released 1B and 3B text-only models designed to operate on-device. Llama-based models have now racked up over 440M downloads on Hugging Face.

A team from the University of Edinburgh flagged up the number of mistakes in MMLU, including wrong ground truths, unclear questions, and multiple correct answers. While low across most individual topics, there were big spikes in certain fields, such as virology, where 57% of the analyzed instances contained errors. On a manually corrected MMLU subset, models broadly gained in performance, although they worsened on professional law and formal logic. This suggests that inaccurate MMLU instances are being learned during pre-training. In more safety-critical territory, OpenAI has warned that SWE-bench, which evaluates models' ability to solve real-world software issues, was underestimating the autonomous software engineering capabilities of models, as it contained tasks that were hard or impossible to solve. The researchers partnered with the creators of the benchmark to create SWE-bench Verified, a validated subset.

The arena, which allows users to interact with two randomly selected chatbots side-by-side, provides a rough crowdsourced evaluation, […] receiving the same scores, with the latter also outperforming Claude Sonnet 3.5. This has led to concerns that the ranking is essentially becoming a way of assessing which writing style users happen to prefer most. Additionally, as smaller models tend to perform less well on tasks involving more tokens, the 8k context limit arguably gives them an unfair advantage. However, the early version of the vision leaderboard is now beginning to gain traction and aligns better with other evals.

A Google DeepMind/NYU team generated millions of synthetic theorems and proofs using symbolic engines, using them to train a language model from scratch. AlphaGeometry alternates between the language model proposing new constructions and symbolic engines performing deductions until a solution is found. Impressively, it solved 25 out of 30 problems on a benchmark of Olympiad-level geometry problems, nearing human International Mathematical Olympiad gold medalist performance. The next best AI performance scored only 10. It also demonstrated generalisation capabilities, for example finding that a specific detail in a 2004 IMO problem was unnecessary for the proof.

A Meta/MIT team looking at open-weight pre-trained LLMs concluded that it's possible to do away with up to half a model's layers and suffer only negligible performance drops on question-answering benchmarks. They identified optimal layers for removal based on similarity and then "healed" the model through small amounts of efficient fine-tuning. NVIDIA researchers took a more radical approach by pruning layers, neurons, attention heads, and embeddings, and then using knowledge distillation for efficient retraining. The resulting models achieved comparable or superior performance to models like Mistral 7B and Llama-3 8B while using up to 40x fewer training tokens.

Google have embraced this approach, distilling Gemini 1.5 Flash from Gemini 1.5 Pro, while Gemma 2 9B was distilled from Gemma 2 27B, and Gemma 2B from a larger unreleased model. There is also community speculation that Claude 3 Haiku, a highly capable smaller model, is a distilled version of the larger Opus, but Anthropic has never confirmed this. These distillation efforts are going multimodal too: Black Forest Labs have released FLUX.1 dev, an open-weight text-to-image model distilled from their Pro model. To support these efforts, the community has started to produce open-source distillation tools, like arcee.ai's DistillKit, which supports both logit-based and hidden states-based distillation. Llama 3.1 405B is also being used for distillation, after Meta updated its terms so output logits can be used to improve any models, not just Llama ones.
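As a concrete picture of what "logit-based" distillation means in tools like DistillKit, here is a minimal sketch of the standard knowledge-distillation loss: the student matches the teacher's temperature-softened output distribution while still fitting the hard labels. Shapes and hyperparameters are illustrative, not taken from any model mentioned above.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # Soft targets: KL between temperature-softened teacher and student.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale so gradient magnitudes stay comparable
    # Hard targets: the usual cross-entropy on ground-truth tokens.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy shapes: a batch of 4 positions over a 10-token vocabulary.
student = torch.randn(4, 10, requires_grad=True)
teacher = torch.randn(4, 10)
loss = distillation_loss(student, teacher, labels=torch.tensor([1, 2, 3, 4]))
loss.backward()  # gradients flow only into the student
```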
Microsoft's phi-3.5-mini is a 3.8B LM that competes with larger models like Mistral 7B and Llama 3.1 8B. It performs well on reasoning and question-answering, but its size restricts its factual knowledge. To enable on-device inference, the model was quantized to 4 bits, reducing its memory footprint to approximately 1.8GB.

Apple introduced MobileCLIP, a family of efficient image-text models optimized for fast inference on smartphones. Using novel multimodal reinforced training, they improve the accuracy of compact models by transferring knowledge from an image captioning model and an ensemble of strong CLIP encoders.

Hugging Face also got in on the action with SmolLM, a family of small language models available in 135M, 360M, and 1.7B formats. By using a highly curated synthetic dataset created via an enhanced version of Cosmopedia (see slide 31), the team achieved SOTA performance for the size.

Microsoft's BitNet uses a "BitLinear" layer to replace standard linear layers, employing 1-bit weights and quantized activations. It shows competitive performance compared to full-precision models and demonstrates a scaling law similar to full-precision transformers, with significant memory and energy savings. Microsoft followed up with BitNet b1.58, which uses ternary weights to match full-precision LLM performance at 3B size while retaining the efficiency gains.

Meanwhile, ByteDance's TiTok (Transformer-based 1-Dimensional Tokenizer) quantizes images into compact 1D sequences of discrete tokens for image reconstruction and generation tasks. This allows images to be represented with as few as 32 tokens, instead of hundreds or thousands.
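To make the ternary-weight idea concrete, here is a minimal sketch of absmean-style quantization in the spirit of BitNet b1.58: scale a weight matrix by its mean absolute value, then round every entry to {-1, 0, +1}. The per-tensor scaling and epsilon are illustrative simplifications, not a faithful reproduction of the paper's training kernels.

```python
import torch

def ternarize(w: torch.Tensor, eps: float = 1e-8):
    """Quantize weights to {-1, 0, +1} with an absmean scale."""
    gamma = w.abs().mean()                            # per-tensor scale
    w_q = (w / (gamma + eps)).round().clamp_(-1, 1)   # ternary weights
    return w_q, gamma  # at inference: x @ (w_q * gamma).T approximates x @ w.T

w = torch.randn(4, 8)
w_q, gamma = ternarize(w)
print(w_q.unique())  # typically tensor([-1., 0., 1.])
```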
Inspired by model interpretability research, ReFT (Representation Fine-tuning) doesn't alter the model's weights. Instead, it manipulates the model's internal representations at inference time to steer its behavior. While it comes with a slight inference penalty, ReFT requires 15-65x fewer parameters compared to weight-based fine-tuning methods. It also enables more selective interventions on specific layers and token positions, enabling fine-grained control over the adaptation process. The researchers show its potential in few-shot adaptation, where a chat model is given a new persona with just five examples. Combined with the small storage footprint of learned interventions, it could be used for real-time personalization on devices with sufficient compute power.
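A rough sketch of the idea: every model weight stays frozen, and what is learned is a small edit applied to hidden states at chosen layers and token positions. This is a schematic low-rank additive intervention, not the exact parameterisation from the ReFT paper; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn

class RepIntervention(nn.Module):
    """Learned low-rank edit h -> h + B(A h), applied at selected positions."""
    def __init__(self, hidden: int, rank: int = 4):
        super().__init__()
        self.A = nn.Linear(hidden, rank, bias=False)   # only A and B train
        self.B = nn.Linear(rank, hidden, bias=False)

    def forward(self, h: torch.Tensor, positions) -> torch.Tensor:
        h = h.clone()  # the base model's activations are left untouched
        h[:, positions] = h[:, positions] + self.B(self.A(h[:, positions]))
        return h

# Toy usage: intervene on two token positions of one layer's output.
h = torch.randn(1, 16, 64)              # (batch, seq_len, hidden)
edit = RepIntervention(hidden=64)
h_star = edit(h, positions=[0, 15])
changed = (h_star - h).abs().sum(-1).nonzero()[:, 1].tolist()
print(changed)  # [0, 15]: only the selected positions were steered
```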
Selective state-space models like Mamba, designed last year to handle long sequences more efficiently, can to some extent compete with transformers, but lag on tasks that require copying or in-context learning. That said, Falcon's Mamba 7B shows impressive benchmark performance versus similar-sized transformer models. Hybrid models appear to be a more promising direction: combined with self-attention and MLP layers, AI21's Mamba-Transformer hybrid model outperforms the 8B Transformer across knowledge and reasoning benchmarks, while being up to 8x faster when generating tokens at inference. In a nostalgia trip, there are also early signs of a comeback for recurrent neural networks, which had fallen out of fashion due to training and scaling difficulties. Griffin, trained by Google DeepMind, mixes linear recurrences and local attention, holding its own against Llama-2 while being trained on 6x fewer tokens.

MOHAWK is a new method for distilling knowledge from a large, pre-trained transformer model (teacher) to a smaller, subquadratic model (student) like a state-space model (SSM). It aligns i) the sequence transformation matrices of the student and teacher models and ii) the hidden states of each layer, then iii) transfers the remaining weights of the teacher model to the student model to fine-tune it. The authors create Phi-Mamba, a new student model combining Mamba-2 and an MLP block, and a variant called Hybrid-Phi-Mamba that retains some attention layers from the teacher model. MOHAWK can train Phi-Mamba and Hybrid-Phi-Mamba to achieve performance close to the teacher model. Phi-Mamba is distilled with only 3B tokens, less than 1% of the data used to train the previously best-performing Mamba models and 2% of that used to train the Phi-1.5 model itself.

As well as being the main source of training data for the Phi family, synthetic data was used by Anthropic when training Claude 3 to help represent scenarios that might have been missing from the training data. Hugging Face used Mixtral-8x7B Instruct to generate over 30M files and 25B tokens of synthetic textbooks, blog posts, and stories to recreate the Phi-1.5 training dataset, which they dubbed Cosmopedia. To make this process easier, NVIDIA released the Nemotron-4-340B family, a suite of models designed specifically for synthetic data generation, available via a permissive license. Meta's Llama can also be used for synthetic data generation. It also appears possible to create synthetic high-quality instruction data by extracting it directly from an aligned LLM, with techniques like Magpie. Models fine-tuned this way sometimes perform comparably to Llama-3-8B-Instruct.

A Nature paper from Oxford and Cambridge researchers found that model collapse occurs across various AI architectures, including fine-tuned language models, challenging the idea that pre-training or periodic exposure to small amounts of original data can prevent degradation (measured by perplexity score). This creates a "first mover advantage", as sustained access to diverse, human-generated data will become increasingly critical for maintaining model quality. However, these results are primarily focused on a scenario where real data is replaced with synthetic data over generations. In practice, real and synthetic data usually accumulate, and other research suggests that, provided the proportion of synthetic data doesn't get too high, collapse can usually be avoided.

FineWeb, the dataset, was created through a multi-step process including base filtering, independent MinHash deduplication per dump, selected filters derived from the C4 dataset, and the team's custom filters. Text extraction using the trafilatura library produced higher quality data than the default Common Crawl WET files, even though the resulting dataset was meaningfully smaller. The team found that deduplication drove performance improvements up to a point, before hitting diminishing returns and then worsening performance. They also used llama-3-70b-instruct to annotate 500k samples from FineWeb, scoring each for its educational quality on a scale from 0 to 5. FineWeb-edu, which filtered out samples scored below 3, outperformed FineWeb and all other open datasets, despite being significantly smaller.
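For readers unfamiliar with MinHash deduplication, the sketch below shows the core mechanic: hash a document's word shingles many times, keep the minimum per hash seed, and compare signatures to estimate Jaccard similarity between documents. FineWeb's actual shingle size, hash count, and similarity threshold are not reproduced here; these values are illustrative.

```python
import hashlib

def minhash_signature(text: str, num_hashes: int = 64, ngram: int = 5):
    """Signature slot = minimum hash of the document's word n-grams per seed."""
    words = text.lower().split()
    shingles = {" ".join(words[i:i + ngram]) for i in range(len(words) - ngram + 1)}
    return [
        min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}|{s}".encode(), digest_size=8).digest(), "big"
            )
            for s in shingles
        )
        for seed in range(num_hashes)
    ]

def estimated_jaccard(sig_a, sig_b) -> float:
    """Fraction of matching slots estimates shingle-set Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = minhash_signature("the cat sat on the mat and looked at the dog outside")
b = minhash_signature("the cat sat on the mat and stared at the dog outside")
print(estimated_jaccard(a, b))  # high for near-duplicates; dedup drops one copy
```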
Following the playbook that's proven effective in regular LLMs, massive performance improvements in embedding models have come from scale (GritLM has ~47B parameters vs. the 110M common among prior embedding models). Similarly, the usage of broad web-scale corpora and improved filtering methods has led to large improvements in the smaller models. Meanwhile, ColPali is a vision-language embedding model that exploits the visual structure of documents, not just their text embeddings, to improve retrieval. Retrieval models are one of the few subdomains where open models commonly outperform proprietary models from the biggest labs. On the MTEB Retrieval Leaderboard, […]

Anthropic solved this using "contextual embeddings", where a prompt instructs the model to generate text explaining the context of each chunk in the document. They found that this approach leads to a 35% reduction in the top-20 retrieval failure rate (5.7% → 3.7%), and it can then be scaled using Anthropic's prompt caching. As Fernando Diaz of CMU observed in a recent thread, this is a great example of techniques pioneered in one area of AI research (e.g. early speech retrieval and document expansion work) being applied to another. Another version of "what is […]". Research from Chroma shows that the choice of chunking strategy can affect retrieval performance by up to 9% in recall.

Researchers are now pioneering novel approaches to RAG evaluation, like Ragnarök, which introduces a novel web-based arena for human evaluation through pairwise system comparisons. This addresses the challenge of assessing RAG quality beyond traditional automated metrics. Meanwhile, Researchy Questions provides a large-scale collection of complex, multi-faceted questions that require in-depth research and analysis to answer, drawn from real user queries.

Google DeepMind has proposed Distributed Low-Communication (DiLoCo), an optimization algorithm that allows training to occur on multiple loosely connected "islands" of devices. Each island performs a large number of local update steps before communicating with the others, reducing the need for frequent data exchange. They're able to demonstrate fully synchronous optimization across 8 of these islands while reducing communication 500x. GDM also proposed a refined version of DiLoCo optimized for asynchronous settings, and researchers at Prime Intellect have released an open-source implementation and replication of DiLoCo, scaling it up 3x to demonstrate its effectiveness on 1B parameter models.
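The communication pattern is the whole trick, so a sketch helps. Below, each island runs many local AdamW steps, then the islands average their weight deltas, which are applied as a single outer update. The DiLoCo paper uses Nesterov momentum for that outer step, simplified here to a plain scaled update; all hyperparameters and the toy task are illustrative.

```python
import copy
import torch

def diloco_round(global_model, islands_data, inner_steps=50, outer_lr=0.7):
    """One round: local training on every island, then a single sync."""
    global_params = [p.detach().clone() for p in global_model.parameters()]
    avg_delta = [torch.zeros_like(p) for p in global_params]
    for batches in islands_data:              # in reality: parallel workers
        local = copy.deepcopy(global_model)   # island starts from global weights
        opt = torch.optim.AdamW(local.parameters(), lr=0.05)
        for x, y in batches[:inner_steps]:    # many steps, zero communication
            opt.zero_grad()
            torch.nn.functional.mse_loss(local(x), y).backward()
            opt.step()
        for d, p0, p in zip(avg_delta, global_params, local.parameters()):
            d += (p0 - p.detach()) / len(islands_data)  # average island deltas
    with torch.no_grad():                     # the one communication per round
        for p, d in zip(global_model.parameters(), avg_delta):
            p -= outer_lr * d                 # apply averaged "outer gradient"

# Toy usage: two islands jointly fitting y = 2x with a linear model.
torch.manual_seed(0)
model = torch.nn.Linear(1, 1)
def make_batches(n):
    return [(x, 2 * x) for x in (torch.randn(8, 1) for _ in range(n))]
for _ in range(5):                            # five communication rounds
    diloco_round(model, [make_batches(50), make_batches(50)])
print(model.weight.item())  # moves toward 2.0 as rounds accumulate
```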