




Dynamically Managing Large-Scale Spark Clusters on AWS
Technology innovation, changing the future

- Founded in 2013, with headquarters in Mountain View, California, USA
- Strong financial backing: over $50M from Sequoia Capital, Genesis Capital, and GSR Ventures

Technology
- 7 patents pending
- Elite tech team of PhDs from top universities specializing in machine learning, big data, and security

Clients
- Partnered with global clients with an online or mobile presence in gaming, social, commerce, and finance
- 4B+ protected accounts globally
- 800B+ events processed to date
- 200M+ bad accounts detected to date

Detection approaches
- Rules engines
- Supervised machine learning
- Unsupervised machine learning
- Reputation lists

Reputation lists: how it works
- Search the reputation database
- Match against lists
- Make a decision
[Figure: Sender, Receiver, Reputation Database; "Is Sender IP Listed?" YES / NO; Make Decision]
Examples: email, IP address, devices, credit card numbers, phone numbers

Rules engine: how it works
- Check against rule lists
- Criteria with weights
- Combination rules with logic
Example rule: IF (user email = free email service) AND (comment character count 150 per sec), flag the user account as a spammer and mute commenting.

RULE                                        WEIGHT
IP address is anonymous proxy               +800
Account age 180 days                        -500
Email is private corporate domain name      -350
Mismatch billing country and IP country     +450
Phone number found on 3 accounts            +250

Supervised machine learning: what is it?
An algorithm that learns to perform a task from known examples (training data). An important requirement of supervised learning is having the data to train the model.

Unsupervised machine learning: what is it?
An algorithm that learns to identify linkages and patterns in the data without prior knowledge of what to look for. Unsupervised machine learning does not require labeled training data.

Comparison Between Approaches
[Figure axes: Time, Effectiveness]
- Reputation list and device fingerprint: limited coverage and precision; attackers can use emulators to bypass device fingerprinting
- Rules engine: rules must be maintained and adapted constantly; poor against adaptive attacks
- Supervised machine learning: needs a large amount of labeled data; has difficulty detecting unknown attacks
- Unsupervised machine learning: auto-label generation; detection of unknown attacks; auto-rules generation

DataVisor Unsupervised Machine Learning
UML Engine Process Flow
- STEP 1, Dynamic feature extraction: generating a large set of features to describe each input account
- STEP 2, Unsupervised attack ring detection: performing correlation analysis across all accounts and identifying attack rings
- STEP 3, Result categorization & ranking: assigning a confidence score and categorizing each attack ring

Input data
- Profile info
- Behaviors & activities
- Origins & digital fingerprints
- Contents & metadata
- Relationships among accounts
- Security event logs + labels
- Custom scoring + metadata
Verticals: social, gaming, e-commerce, finance

Dynamic event extraction
Data infrastructure (Hadoop, Spark)
Correlation engine of billions of users
Feature attribute types: temporal, event sequence, velocity + frequency, spatial / geo
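As a rough illustration of the velocity and frequency features mentioned for dynamic feature extraction, the sketch below derives three per-user features from a list of events. It is a minimal sketch under stated assumptions, not DataVisor's feature engineering: the `Event` fields, the feature names, and the "events per minute over the active span" definition of velocity are all hypothetical choices made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical application-level event (fields are assumptions)."""
    user_id: str
    timestamp: float  # seconds since some epoch
    ip: str


def extract_features(events):
    """Return {user_id: feature dict} with simple frequency/velocity features."""
    by_user = defaultdict(list)
    for event in events:
        by_user[event.user_id].append(event)

    features = {}
    for user_id, user_events in by_user.items():
        user_events.sort(key=lambda e: e.timestamp)
        span_seconds = user_events[-1].timestamp - user_events[0].timestamp
        # Velocity: events per minute over the user's active span.
        # A user with a single event (zero span) gets 0.0 by convention.
        if span_seconds > 0:
            events_per_minute = len(user_events) / (span_seconds / 60.0)
        else:
            events_per_minute = 0.0
        features[user_id] = {
            "event_count": len(user_events),                   # frequency
            "distinct_ips": len({e.ip for e in user_events}),  # origin diversity
            "events_per_minute": events_per_minute,            # velocity
        }
    return features


if __name__ == "__main__":
    sample = [
        Event("user0001", 0.0, "198.51.100.1"),
        Event("user0001", 30.0, "198.51.100.2"),
        Event("user0001", 60.0, "198.51.100.1"),
        Event("user0002", 10.0, "203.0.113.7"),
    ]
    for user, feats in sorted(extract_features(sample).items()):
        print(user, feats)
```

In a Spark deployment the same grouping would more likely be expressed as a DataFrame aggregation; plain Python is used here only to keep the example self-contained.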
[Figure labels: Domain, Graph Attributes; Attributes; Raw Input Data]

Our feature engineering was designed to operate across a very high-dimensional feature space and to be comprehensive in extracting fraud features.

- Data processing layer
- Feature extraction
- Cross-event / time-series feature engineering

DataVisor's Unsupervised Machine Learning
- User profile data
- Behavior data
- Derived features (frequency, velocity, correlation)
Each user (user0001, user0002, user0003, ...) is described by one long vector covering that user's comprehensive profile and behavior: Profile, Behavior, Freq., Corr., Velocity, Device, Seq., Geo. The vectors are updated dynamically.

Global Intelligence Network
All application-level events from multiple online service verticals (financial, e-commerce, social, mobile):
- 410 million+ IP addresses
- 3.6 million+ email domains
- 160,000+ device types
- 300,000+ OS versions
- 5.3 million+ user agent strings
- 700,000+ phone prefixes
From 4 billion+ global users and 800 billion+ events, and growing.

Example:
- Xiaomi Mi 5 is a phone that was released in 2016
- 50% of its footprint is in Russia and China
- When it appears in other regions, its fraud rate can be up to 51%

Intel from the Global Intelligence Network (GIN): IP, email domain, phone prefix, device info. These signals are added to each user's vector (Profile, Behavior, Freq., Corr., Velocity, Device, Seq., Geo).

[Figure: user-level vectors for user0001 through user0005 ... user9553, grouped into Cluster001]

Dimension reduction based on:
- Statistical analysis
- Domain knowledge
- Feature correlation

Dynamic clustering based on combinations of:
- Feature dimensions (f)
- Feature weights (w)
- Linkage probability function (F)
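To make the roles of feature dimensions (f), feature weights (w), and a linkage function (F) more concrete, here is a small sketch that links two users when the weighted share of matching feature dimensions reaches a threshold, then groups linked users with union-find. Everything in it is an assumption made for illustration: the source does not define F, so `linkage_score` is a simple stand-in, and the feature names, weights, and threshold are hypothetical.

```python
from itertools import combinations


def linkage_score(a, b, weights):
    """Stand-in for F: weighted share of feature dimensions on which a and b agree."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        w for f, w in weights.items()
        if a.get(f) is not None and a.get(f) == b.get(f)
    )
    return matched / total


def cluster_users(users, weights, threshold):
    """Group users whose pairwise linkage score is >= threshold.

    users: {user_id: {feature_name: value}}
    Returns a sorted list of clusters, each a sorted list of user IDs.
    """
    parent = {user_id: user_id for user_id in users}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in combinations(users, 2):
        if linkage_score(users[u], users[v], weights) >= threshold:
            parent[find(u)] = find(v)

    clusters = {}
    for user_id in users:
        clusters.setdefault(find(user_id), []).append(user_id)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    users = {
        "user0001": {"device": "Redmi 3S", "ip_prefix": "198.51.100", "email_domain": "a.example"},
        "user0002": {"device": "Redmi 3S", "ip_prefix": "198.51.100", "email_domain": "b.example"},
        "user0003": {"device": "Other", "ip_prefix": "203.0.113", "email_domain": "c.example"},
    }
    weights = {"device": 2.0, "ip_prefix": 2.0, "email_domain": 1.0}
    print(cluster_users(users, weights, threshold=0.6))
```

The all-pairs comparison is quadratic in the number of users, so it only serves to show the idea; at the scale described in the slides, candidate pairs would have to be narrowed first, for example by grouping on shared feature values.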
[Figure labels, cluster level: Cluster002, Cluster551]

Based on key clustering features, the DataVisor engine outputs a reason code and the corresponding categories.

Attack categorization (reason code):
- Automated account opening fraud
- Mass account takeover password testing
- Manual transaction fraud

- Classification of fraud campaigns
- Analysis of fraud techniques
- Monitoring of fraud trends

Use cases
- Stop fake account creation: prevent mass registration of fake account armies
- Prevent transaction fraud: reduce e-commerce and financial fraud 30%-50% more than traditional solutions
- Identify account takeovers: detect compromised users before damage to your customers or brand
- Block fake reviews & likes: maintain trust in your platform by reducing fake comments & voting
- Filter spam: prevent spammers from posting illicit or annoying content
- Discover fake app installs: save millions of dollars per year by flagging fake mobile app installs

The early detection view shows how DataVisor catches crime rings before damage is done.

Fake App Installs and Game Play Activity
- 15K+ installs all coming from the same device type: Redmi 3S running Android 5.1.1
- Fake retention activity within 7 days to 2 weeks following install
- Multiple app starts every few seconds or minutes
- Accounts go completely inactive after the faked app_start retention events
[Figure: all installs from Xiaomi Redmi 3S, followed by fake app starts to mimic retention; one major "wave" of fraudulent installs; fraud score 95]

Derive granular user behavior information:
- New user ratio
- Fraudulent user ratio
- First/last seen time
- Proxy / data center IP
- Geolocation

Deep Learning
[Figure: Global Intelligence Network; Financial, Social, E-Comm, Mobile]

Pro:
- Unified engine (end-to-end solution)
- Simple API
- Speed
Con:
- Deep learning integration under development

Pro:
- Production ready (if done right)
- Extensive ML API for various tasks
Con:
- Limited data pre-processing support
- Not an end-to-end solution

[Figure labels: UDF, DataFrame, derived feature, origin feature]

Pre-processing
- Load data into a DataFrame
- Each user-defined function (UDF) is built from a feature function
- Uniform API

Serving
- Every data point is pre-processed and then fed to the DL model for inference
- The same feature function is used to process data at serving time
[Figure labels: feature functions, serving data, modeling, inference]

Pipeline
- Process 200+ GB/day/client
- 5000+ average peak QPS across clients
- Batch process runs multiple times per day
- Dynamically launch and destroy Spark clusters using Spot Fleet
- Results are precomputed and written to each data store
[Figure: Original Data, Pipeline Module 1, Pipeline Module 2, ..., Pipeline Module N, Pipeline Module N+1]

Cost
- Moved to a 3-year convertible instance model
- Real-time cost tracking (Cloudability)
- Spot Fleet

SparkGen: Internal Spark Cluster Management Software
[Figure: Prod Job Scheduler, Spark Resource Manager, Prod Jobs, Dev Jobs, Developers; three clusters with nodes labeled M and S]
- Track pipeline dependencies and run all jobs on Spot instances
- Tip: Spot instances are 7 times cheaper than on-demand instances and 3 times cheaper than reserved instances.

Single static cluster
- One-time launch
- Low utilization
- Idle time

Multiple static clusters
- One-time launch
- Moderate utilization
- Idle time
- Limited concurrency
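One way to picture "track pipeline dependency" is as a topological ordering over pipeline modules, where modules whose dependencies are all complete can run at the same time (for instance, each on its own short-lived cluster). The sketch below is only an illustration using Python's standard `graphlib` module (Python 3.9+); it is not SparkGen's implementation, and the module names and the `plan_waves` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def plan_waves(dependencies):
    """Group pipeline modules into waves that can run concurrently.

    dependencies: {module: set of modules it depends on}
    Returns a list of waves; every module in a wave has all of its
    dependencies satisfied by earlier waves.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        # In a real scheduler each module would be marked done as its job
        # finishes; here a whole wave is completed at once.
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    # Hypothetical pipeline: modules 2 and 3 both read module 1's output,
    # and module 4 needs both of them.
    deps = {
        "module_2": {"module_1"},
        "module_3": {"module_1"},
        "module_4": {"module_2", "module_3"},
    }
    for i, wave in enumerate(plan_waves(deps), start=1):
        print(f"wave {i}: {wave}")
```

Launching clusters per wave and tearing them down afterwards avoids the idle time listed for the static-cluster setups; a Spot-based scheduler would also need to rerun modules whose instances are reclaimed, which this sketch does not attempt to model.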