Professional English for Information Science: Courseware, Chapter 10
Lesson 10 Data Warehouse Overview

Vocabulary, Important Sentences, Questions and Answers, Problems

The term data warehouse was first coined by Bill Inmon in the early 1990s. He referred to it as an integrated collection of information that could help companies and organizations make better decisions.

To be effective, a data warehouse had to be integrated, subject oriented, non-volatile, and time variant. In this article, I will go over all these factors in detail. If you are building a data warehouse, it is important for you to understand why they matter.

Being subject oriented means that the data will provide information about a specific subject rather than information about the functions of a company. Because a data warehouse is subject oriented, it will allow you to analyze information that is connected to a specific subject. Being integrated means that the data collected within the data warehouse can come from different sources but can be combined into one unit that is relevant and logical. Being time variant means that all the information within the data warehouse can be found within a given period of time. [1]

It is important that the information contained within a data warehouse is stable. While data can be added, it should never be deleted. This property is referred to as being non-volatile. When a company uses a data warehouse that is stable, it gains a better understanding of the operations within the company. Although these terms were first coined in the 1990s, they are still highly accurate today. However, it should be noted that some data warehouses are volatile. The reason is that many modern data warehouses deal with terabytes of data. Because they must store terabytes of data, many companies are forced to delete some of their information after a certain period of time. For instance, some companies will systematically delete data that has reached three years of age.

Before a data warehouse can be built, the correct data must be located. Generally, the information that will be added to the warehouse will come from daily information or historical information. The historical information may be stored in a legacy system and can be challenging to extract.
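The three-year deletion rule mentioned above can be sketched in a few lines. This is only an illustration: the `purge_expired` and `years_before` functions, the `loaded_on` field, and the sample rows are hypothetical names chosen for the example, not part of any particular warehouse product.

```python
from datetime import date


def years_before(day, years):
    """Return the same calendar day `years` earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year.
        return day.replace(year=day.year - years, day=28)


def purge_expired(rows, today, max_age_years=3):
    """Keep only the rows that have not yet reached `max_age_years` of age."""
    cutoff = years_before(today, max_age_years)
    return [row for row in rows if row["loaded_on"] > cutoff]


# Hypothetical sample rows: one older than three years, one recent.
rows = [
    {"id": 1, "loaded_on": date(2020, 1, 15)},
    {"id": 2, "loaded_on": date(2023, 6, 1)},
]
kept = purge_expired(rows, today=date(2024, 1, 1))
print([row["id"] for row in kept])
```

A warehouse that applies such a rule is no longer strictly non-volatile, which is the trade-off the paragraph above describes.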

The design of the data warehouse is important as well. Designers must make sure the design is consistent with the queries that will be run against the warehouse. To do this successfully, they need to understand the database schema. It is crucial to make sure the data warehouse is designed correctly, as it is difficult to recreate some forms of data.

Another important aspect of data warehouses is data acquisition. Data acquisition can be defined as transferring data from a source to the warehouse. It is one of the most expensive parts of building a data warehouse. This process will often be conducted with an ETL (Extract, Transform, Load) tool.
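The three ETL steps can be illustrated with a minimal sketch that uses only the Python standard library. The CSV text, the `sales` table, and the function names are assumptions made for the example; a commercial ETL tool does far more than this.

```python
import csv
import io
import sqlite3


def extract(source_text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(source_text)))


def transform(rows):
    """Transform: trim stray whitespace and convert amounts to numbers."""
    return [(row["customer"].strip(), float(row["amount"])) for row in rows]


def load(conn, rows):
    """Load: append the prepared rows to the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


# Hypothetical source data standing in for an operational system.
source = "customer,amount\n Acme ,10.50\nGlobex,4\n"
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(source)))
print(conn.execute("SELECT customer, amount FROM sales ORDER BY rowid").fetchall())
```

Keeping the three steps as separate functions mirrors the way the acronym splits the work, and it lets each step be tested on its own.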

As of this time, there are just over 50 ETL tools being sold. It may cost a company millions of dollars to transfer data from sources to the warehouse. Once the initial data has been transferred to the data warehouse, the process must be repeated consistently. Data acquisition is a continuous process, and the goal of a company is to make sure the warehouse is updated on a regular basis. When the warehouse is updated, it is often hard to determine which information in the source has changed since the previous update. The process of dealing with this issue is called change data capture. This process has become a separate field, and there are a number of products currently being sold to deal with it.
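One simple way to find what has changed since the previous update is to compare two snapshots of the source, keyed by primary key. The sketch below assumes that approach; it is only one of several possible techniques, and the function name and sample records are hypothetical.

```python
def capture_changes(previous, current):
    """Compare two snapshots (dicts keyed by primary key) and report the differences."""
    inserted = {key: row for key, row in current.items() if key not in previous}
    deleted = {key: row for key, row in previous.items() if key not in current}
    updated = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    return inserted, updated, deleted


# Hypothetical snapshots taken at the previous and the current update.
previous = {1: {"name": "Acme", "city": "Oslo"}, 2: {"name": "Globex", "city": "Rome"}}
current = {1: {"name": "Acme", "city": "Bergen"}, 3: {"name": "Initech", "city": "Austin"}}

inserted, updated, deleted = capture_changes(previous, current)
print(inserted, updated, deleted)
```

Only the rows in `inserted`, `updated`, and `deleted` need to be sent to the warehouse, which is the saving that change data capture aims for. Comparing full snapshots becomes expensive for large sources, which is one reason dedicated products exist.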

It is important for data to be cleaned before it is placed in the warehouse. Data cleansing is usually done during the data acquisition phase. Any data that is placed in a warehouse before being cleaned poses a danger to the system and cannot be used. The reason is that uncleaned data may not be correct, and a company may make incorrect decisions based on it. This could lead to a number of problems. For example, all the information within a data warehouse that means the same thing must be stored in the same form. If there is information that reads “MS” and “Microsoft”, even though they mean the same thing, only one of them can be used to identify the element within the data warehouse.

1 Data Warehouse Tools

There are a number of important tools connected to data warehouses, and one of these is data aggregation. A data warehouse can be designed to store information at a certain level of detail. For example, you can store data for each transaction, or you can store it as a summary. These are examples of data aggregation. When data is summarized, queries run much faster. However, some of the information may be lost in the summary, and this information may be important for solving a certain problem.
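The difference between the two levels of detail can be shown with a small sketch. The transaction records and the `summarize_by_day` function are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Transaction-level storage: one row per sale (hypothetical sample data).
transactions = [
    {"day": "2024-01-01", "product": "pen", "amount": 2.0},
    {"day": "2024-01-01", "product": "lamp", "amount": 3.5},
    {"day": "2024-01-02", "product": "pen", "amount": 1.5},
]


def summarize_by_day(rows):
    """Summary-level storage: one total per day, with product detail discarded."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["day"]] += row["amount"]
    return dict(totals)


daily_totals = summarize_by_day(transactions)
print(daily_totals)
```

The summary is smaller and quicker to query, but it can no longer answer a question such as how much was spent on pens, because the product column was dropped when the rows were combined.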

Before you decide which one you will use, it is important to weigh your options carefully. Once you have carried out an operation, you will need to rebuild the warehouse in order to undo it. The best way to handle this situation is to make sure the data warehouse is constructed with a large amount of detail. However, the cost of this can be huge, depending on the storage options you choose.

Once you have filled your data warehouse with important information, you will want to use this data to help you make smart investment decisions. The tools that allow you to do this fall under a topic called business intelligence.

Business intelligence is a very diverse field. It comprises things such as Executive Information Systems and Decision Support Systems. Business intelligence can further be broken down into a field called multi-dimensional analysis tools. These are tools that allow a user to view data from a wide variety of angles. A query tool allows a user to send SQL queries to a warehouse to look for results. Data mining is also a field that falls under business intelligence, and it allows you to look for patterns and relationships within a data warehouse.

Another tool that is connected to data warehouses is data visualization. The tools that are used for data visualization present visual models of data. These could come in the form of intricate 3D images. The goal of data visualization is to allow the user to view trends in a way that is easier to understand than complicated models based on statistics. One tool that is allowing this field to advance is VRML, or Virtual Reality Modeling Language.

In order for data warehouses to function properly, it is also important to place an emphasis on metadata management. Metadata can be described as “information about information”.

Metadata must be managed when data is acquired or analyzed. Metadata will be held in a repository and can give you important information about many of the data warehouse tools. The process of properly managing metadata has become a science in itself. If it is done properly, the company can benefit greatly. It is important because it allows organizations to analyze the changes that occur within database tables. This is a tool that plays an important part in the construction of a data warehouse.

Data warehousing is a somewhat complicated field. There are many vendors attempting to advertise the tools, but the cost and complexity of the products have kept them from being used by a large number of companies. Any company that is thinking of using data warehouses must take the time to review and understand the technology. It can only be useful if you know how to use it. Once you understand and acquire the technology, it is possible to gain a powerful advantage over your competitors. This has made data warehouses attractive to many companies.

One of the biggest advantages of data warehouses is that they allow you to store information that you can use to improve the marketing strategies of your company. Not only can you improve the marketing strategies, but you will also be able to make strategic decisions based on the information you have compiled and organized. With techniques such as data mining and data visualization, you will be able to discover important patterns that you didn’t know existed. The patterns that you discover can allow your company to earn large profits.

2 Data Warehousing Methods

Most organizations agree that data warehouses are a useful tool. They benefit from the ability to store and analyze data, which allows them to make sound business decisions. It is also important for them to make sure the correct information is published, and it should be easy to access for the people who are responsible for making decisions.

Two elements make up the data warehouse environment: presentation and staging. The staging area is also known as the acquisition area. It is where the ETL operations run, and once the data has been prepared, it is sent to the presentation area.

When the data is placed within the presentation area, a number of programs will analyze and review it. While many organizations agree on the overall goal of data warehouses, the approaches to building them may differ. Attempting to use data marts alone is not a good approach, because they are geared towards departments. In addition, using data marts alone will be inefficient, and you will run into a number of long-term problems. Two techniques for building data warehouses have become very popular. These are the Kimball Bus Architecture and the Corporate Information Factory.

With the Kimball technique, the raw data will be transformed and refined within the staging area. It is important to make sure the data is properly handled during this step. During the staging process, the raw data will be pulled from the source systems. While some of the staging processes may be centralized, others will be distributed. The presentation area will have a dimensional structure, and this model will hold the same information as a standard model. However, it will be easier to use, and it will display information that is summarized.

A dimensional model is created for each business process. Departments within the organization do not play a role in this. The data will be populated once it is placed within the dimensional warehouse, and it does not depend on the various departments that make up an organization. When business processes have been developed within the warehouse, the system will become highly efficient.

The next popular data warehouse approach that you will want to become familiar with is the Corporate Information Factory (CIF). Another name for this technique is the EDW (enterprise data warehouse) approach. The data that is extracted from the source will be coordinated.

Within the CIF, a standard data warehouse is used to hold data repositories, and it may also have specific data warehouses which are designed for data mining. The data marts may be designed for specific departments, and they may have summary data in the form of a dimensional structure. The atomic data may be obtained from the standard data warehouse. While there are some similarities between these two techniques, there are some notable differences as well.

One of the primary differences between these two techniques is the normalized data foundation. With the Kimball approach, the data structures that must be obtained before the dimensional presentation will depend on the source data and the transformation. In most cases, the duplicate storage of data in both dimensional and normalized foundations is not required. Many of the people who choose to use a normalized data structure believe that it is faster than the dimensional structure, but they often fail to take ETL into consideration.

Another thing that separates the two data warehouse approaches is the management of atomic data. With the CIF, atomic data will be stored within a normalized data warehouse. In contrast, the Kimball method states that the atomic data should be placed within a dimensional structure. When the data is placed within a dimensional structure, it can be summarized in a wide variety of ways.
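A small sketch can show how atomic data held in a dimensional structure is summarized in more than one way. It assumes a layout with one fact table and two dimension tables; the table names, column names, and sample rows are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
-- One row per atomic transaction, pointing at its dimensions.
CREATE TABLE fact_sales (product_id INTEGER, date_id INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'pen', 'stationery'), (2, 'lamp', 'furniture');
INSERT INTO dim_date VALUES (1, '2024-01-31', '2024-01'), (2, '2024-02-01', '2024-02');
INSERT INTO fact_sales VALUES (1, 1, 2.0), (2, 1, 30.0), (1, 2, 4.0);
""")


def total_by(dimension_table, key, attribute):
    """Sum the atomic amounts grouped by one attribute of one dimension.

    The identifiers are interpolated into the SQL text, so they must come
    from fixed, trusted values as they do here, never from user input.
    """
    query = (
        f"SELECT d.{attribute}, SUM(f.amount) FROM fact_sales f "
        f"JOIN {dimension_table} d ON f.{key} = d.{key} "
        f"GROUP BY d.{attribute} ORDER BY d.{attribute}"
    )
    return conn.execute(query).fetchall()


by_category = total_by("dim_product", "product_id", "category")
by_month = total_by("dim_date", "date_id", "month")
print(by_category)
print(by_month)
```

The same three atomic rows yield both summaries because the detail was kept, and adding another dimension would allow further groupings without reloading the facts.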

It is important to make sure the information you have is detailed so that users will be able to ask relevant questions. While most users will not place an emphasis on the details of one atomic transaction, they may want a summary of a large number of transactions. It is important for them to have the details so that they will be able to answer important questions. The approach that you choose should be the one which best serves the needs of your company.

3 Data Warehouse Design Strategies

To build an effective data warehouse, it is important for you to understand data warehouse design principles. If your data warehouse is not built correctly, you can run into a number of different problems.

The proper methods for building a powerful data warehouse are based on information technology tactics. First off, it is important that you and your organization understand the importance of having a data warehouse. If workers feel that a data warehouse is unnecessary, they may not use it, and this could cause conflicts. Everyone in your organization should understand the importance of using the system.

After you have got your colleagues behind the concept of using a data warehouse, you will next want to focus on data integrity. You will want to avoid designing a data warehouse that loads data that is not consistent. It is also important to avoid creating a database that replicates data. The goal of your organization should be to integrate data and create standards that will be used and followed.

After data integrity, you will next want to look at implementation efficiency. This basically means that you will want to design a system that is simple to use. It doesn’t matter how well designed your data warehouse is if your workers have a hard time using it.

If your workers have a hard time using the data warehouse, it will slow down the speed and productivity of your operation. When it comes to creating a data warehouse, you will want to make it as simple as possible. All of your workers should be able to use it without problems. Implementation efficiency is a principle that naturally leads to the next topic you will want to focus on, which is user friendliness. This concept is an important part of your business, because end users will not use a program that is too difficult to use. It is important for you to keep them in mind. Use a design which is friendly and easy to learn.

Once you have designed a data warehouse that is user friendly, you will next want to look at operational efficiency. Once the data warehouse has been created, it should be able to carry out operations quickly. In addition, it should not have errors or other technical problems. When errors or technical problems do occur, they should be simple to fix. Another thing you will want to look at is the cost involved in supporting the system. You will want to keep these costs as low as possible.

The design principles that have been discussed in this article so far are more related to business than to information technology. However, there are a number of IT design principles that you will want to follow. One of these is scalability. This is a problem that many data warehouse designers run into. The best way to deal with this issue is to create a data warehouse that is scalable from the beginning. Design it in a way which will allow it to support expansions or upgrades. You should be able to adapt it to a number of different business situations. The best data warehouses are those which are scalable.

The data warehouse that you design should fall under the guidelines of information technology standards. Every tool that you use to build your data warehouse should work well with IT standards. You will want to make sure it is designed in a way that makes it easier for your workers to use. While following the guidelines in this article won’t always make you successful, it will greatly tip the odds in your favor. You should be wary of companies that promise you perfect results if you use their design methods. [2] No matter how well designed your data warehouse is, you will always run into problems. However, following the right principles will make the problems easier to recognize and solve.

When it comes to using a data warehouse, it is not a matter of “if” you will run into problems. It is a matter of “how” and “when”. When your data warehouse is well designed, you will be better equipped to solve any problems you encounter.

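Vocabulary

1. warehouse n. a storehouse; a depot.
2. go over: to be well received, to gain acceptance; to examine.
3. orient vt., vi. to familiarize, to adapt; to make face a direction; to determine a position; to face; to determine a direction. n. the East, Asia.
4. variant n. a variant form; a variety; a variation. adj. different; differing; varying; of many kinds.
5. specific adj. definite, precise, detailed; concrete, particular, special; limited to.
6. volatile adj. flying, evaporating readily, changeable, unstable, lively, explosive. n. a winged animal; a volatile substance.
7. schema n. an outline, a plan, a diagram, a pattern.
8. acquisition n. the act of obtaining; something obtained; a person gained; a purchase.
9. aggregation n. a collection, a clustering, an integration, an assembling; an aggregate; a group.
10. strategy n. strategy (as a study), tactics, a stratagem, a plan of campaign; resourcefulness, artifice. strategy and tactics.
11. intricate adj. complicated, entangled, hard to understand.
12. mart n. a market; a trading place.
13. repository n. a storehouse, a storage place; a container, a museum; a very learned person; a trusted person, a confidant.
14. staging n. the holding or carrying out of something; configuration, gradation, a stage, a stage group, transport in stages; a grading method.
15. populate vt. to inhabit, to settle people in; to migrate to; to colonize. a densely (sparsely) populated city.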

Important Sentences

[1] Being subject oriented means that the data will provide information about a specific subject rather than information about the functions of a company. Because a data warehouse is subject oriented, it will allow you to analyze information that is connected to a specific subject. Being integrated means that the data collected within the data warehouse can come from different sources but can be combined into one unit that is relevant and logical. Being time variant means that all the information within the data warehouse can be found within a given period of time.

Gloss: “Subject oriented” means that the data provides information about one specific subject rather than about the running of the company. Because a data warehouse is subject oriented, it allows you to analyze what is related to a specific subject
