




24 Time Series Analysis

Time series data are vectors of numbers, typically regularly spaced in time. Yearly counts of animals, daily prices of shares, monthly means of temperature, and minute-by-minute details of blood pressure are all examples of time series, but they are measured on different time scales. Sometimes the interest is in the time series itself (e.g. whether or not it is cyclic, or how well the data fit a particular theoretical model), and sometimes the time series is incidental to a designed experiment (e.g. repeated measures). We cover each of these cases in turn.

The three key concepts in time series analysis are
- trend,
- serial dependence, and
- stationarity.

Most time series analyses assume that the data are untrended. If they do show a consistent upward or downward trend, then they can be detrended before analysis (e.g. by differencing). Serial dependence arises because the values of adjacent members of a time series may well be correlated. Stationarity is a technical concept, but it can be thought of simply as meaning that the time series has the same properties wherever you start looking at it (e.g. white noise is a sequence of mutually independent random variables, each with mean zero and variance $\sigma^2 > 0$).

24.1 Nicholson's blowflies

The Australian ecologist A.J. Nicholson reared blowfly larvae on pieces of liver in laboratory cultures that his technicians kept running continuously for almost 7 years (361 weeks, to be exact). The time series for numbers of adult flies looks like this:

    blowfly <- read.table("c:\\temp\\blowfly.txt", header=T)
    attach(blowfly)
    names(blowfly)
    [1] "flies"

The R Book, Second Edition. Michael J. Crawley. 2013 John Wiley

[...] the cycles become much less clear-cut, and the population begins a pronounced upward trend.

There are two important ideas to understand in time series analysis: autocorrelation and partial autocorrelation. The first describes how this week's population is related to last week's population. This is the autocorrelation at lag 1. The second describes the relationship between this
week's population and the population at lag t, once we have controlled for the correlations between all of the successive weeks between this week and week t. This should become clear if we draw the scatterplots from which the first four autocorrelation terms are calculated (lag 1 to lag 4).

There is a snag, however. The vector of flies at lag 1 is shorter by one than the original vector, because the first element of the lagged vector is the second element of flies. The coordinates of the first data point to be drawn on the scatterplot are (flies[1], flies[2]) and the coordinates of the last point that can be drawn are (flies[360], flies[361]), because the original vector is 361 elements long:

    length(flies)
    [1] 361

Thus, the lengths of the vectors that can be plotted go down by one for every increase in the lag of one. We can produce the four plots for lags 1 to 4 in a function like this:

    par(mfrow=c(2,2))
    # x axis: the series with its last x values dropped; y axis: the series with its first x values dropped
    sapply(1:4, function(x) plot(flies[-c(361:(361-x+1))], flies[-c(1:x)]))

[Figure: four scatterplots of flies[-c(1:x)] against flies[-c(361:(361-x+1))], for lags x = 1 to 4]

The correlation is very strong at lag 1, but notice how the variance increases with population size: small populations this week are invariably correlated with small populations next week, but large populations this week may be associated with large or small populations next week. The striking pattern here is the way that the correlation fades away as the size of the lag increases. Because the population is cyclic, the correlation goes to zero, then becomes weakly negative and then becomes strongly negative. The strongly negative correlation occurs at lags that are half the cycle length. Looking back at the time series, the cycles look to be about 20 weeks in length. So let us repeat the exercise by producing scatterplots at lags of 7, 8, 9 and 10 weeks:

    sapply(7:10, function(x) plot(flies[-c((361-x+1):361)], flies[-c(1:x)]))
    par(mfrow=c(1,1))

[Figure: four scatterplots of flies[-c(1:x)] against flies[-c((361-x+1):361)], for lags x = 7 to 10]

The negative correlation at lag 10 gradually emerges from the fog of no correlation at lag 7.

More formally, the autocorrelation function $\rho(k)$ at lag $k$ is

$$\rho(k) = \frac{\gamma(k)}{\gamma(0)},$$

where $\gamma(k)$ is the autocovariance function at lag $k$ of a stationary random function $\{Y(t)\}$, given by

$$\gamma(k) = \mathrm{cov}\{Y(t), Y(t-k)\}.$$

The most important properties of the autocorrelation coefficient are as follows:
- They are symmetric backwards and forwards, so $\rho(k) = \rho(-k)$.
- The limits are $-1 \le \rho(k) \le 1$.
- When $Y(t)$ and $Y(t-k)$ are independent, then $\rho(k) = 0$.
- The converse of this is not true, so that $\rho(k) = 0$ does not imply that $Y(t)$ and $Y(t-k)$ are independent (look at the scatterplot for $k = 7$ in the scatterplots above).

A first-order autoregressive process is written as

$$Y_t = \alpha Y_{t-1} + Z_t.$$

This says that this week's population is $\alpha$ times last week's population plus a random term $Z_t$. The randomness is white noise: the values of $Z$ are serially independent, they have a mean of zero, and they have finite variance $\sigma^2$.

In a stationary time series, $-1 < \alpha < 1$ [...] with the $\rho$ replaced by $r$ (correlation coefficients estimated from the data). Suppose we want the partial autocorrelation between time 1 and time 3. To calculate this, we need the three ordinary correlation coefficients $r_{12}$, $r_{13}$ and $r_{23}$. The partial $r_{13.2}$ is then

$$r_{13.2} = \frac{r_{13} - r_{12} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}.$$

For more on partial correlation coefficients, see p. 375.

Let us look at the correlation structure of the blowfly data. The R function for calculating autocorrelations and partial autocorrelations is acf (the 'autocorrelation function'). First, we produce the autocorrelation plot to look for evidence of cyclic behaviour:

    acf(flies, main="", col="red")

[Figure: ACF of flies against Lag]

You will not see more convincing evidence of cycles than this. The blowflies exhibit highly significant, regular cycles with a period of 19 weeks. The blue dashed lines indicate the threshold values for significant correlation. What kind of time lags are involved in the generation of these cycles? We use partial autocorrelation (type="p") to find this out:

    acf(flies, type="p", main="", col="red")

[Figure: Partial ACF of flies against Lag]

The significant density-dependent effects are manifest at lags of 2 and 3 weeks, with other, marginally significant negative effects at lags of 4 and 5 weeks. These lags reflect the duration of the larval and pupal period (1 and 2 periods, respectively). The cycles are clearly caused by overcompensating density dependence, resulting from intraspecific competition between the larvae for food (what Nicholson christened 'scramble competition'). There is a curious positive feedback at a lag of 12 weeks (12-16 weeks, in fact). Perhaps you can think of a possible cause for this?

We should investigate the behaviour of the second half of the time series separately. Let us say it is from week 201 onwards:

    second <- flies[201:361]
    summary(lm(second ~ I(1:length(second))))

    Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
    (Intercept)         2827.531    336.661   8.399 2.37e-14
    I(1:length(second))   21.945      3.605   6.087 8.29e-09

    Residual standard error: 2126 on 159 degrees of freedom
    Multiple R-squared: 0.189, Adjusted R-squared: 0.1839
    F-statistic: 37.05 on 1 and 159 DF, p-value: 8.289e-09

This shows that there is a highly significant upward trend of about 22 extra flies on average each week in the second half of the time series. We can detrend the data by subtracting the fitted values from the linear regression of second on week number:

    detrended <- second - predict(lm(second ~ I(1:length(second))))
    par(mfrow=c(2,2))
    ts.plot(detrended)

There are still cycles there, but they are weaker and less regular. We repeat the correlation analysis on the detrended data:

    acf(detrended, main="")

These look more like damped oscillations than repeated cycles. What about the partials?

    acf(detrended, type="p", main="")
    par(mfrow=c(1,1))

[Figure: detrended series against Time, with its ACF and Partial ACF against Lag]
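The detrending step, and the partial-correlation formula given earlier, can be tried on simulated data where the answer is known. The sketch below is only an illustration: the series sim, its trend of 22 per week, its 19-week cycle and its noise level are all assumed values chosen to resemble the description above, not Nicholson's data. It subtracts the fitted linear trend, confirms that the residual series has no remaining slope, and then checks that the lag-2 partial autocorrelation reported by pacf agrees with the formula for r13.2 when r12 and r23 are both taken to be the lag-1 autocorrelation and r13 the lag-2 autocorrelation.

```r
# Simulated weekly counts: linear trend + 19-week cycle + noise.
# All names and numbers are illustrative assumptions, not the blowfly data.
set.seed(42)
week <- 1:161
sim <- 3000 + 22 * week + 1500 * sin(2 * pi * week / 19) + rnorm(161, sd = 500)

# Fit the linear trend and subtract the fitted values.
trend_fit <- lm(sim ~ week)
detrended_sim <- sim - predict(trend_fit)

# After detrending, the mean and the slope on week are zero (up to rounding error).
resid_mean <- mean(detrended_sim)
resid_slope <- unname(coef(lm(detrended_sim ~ week))["week"])

# Autocorrelations of the detrended series, stored rather than plotted.
r <- as.vector(acf(detrended_sim, lag.max = 25, plot = FALSE)$acf)[-1]  # drop lag 0
trough_lag <- which.min(r)   # lag with the most negative autocorrelation

# Lag-2 partial autocorrelation from the formula, with r12 = r23 = r[1] and r13 = r[2].
partial_by_formula <- (r[2] - r[1]^2) / sqrt((1 - r[1]^2) * (1 - r[1]^2))
partial_by_pacf <- as.vector(pacf(detrended_sim, plot = FALSE)$acf)[2]

c(resid_mean = resid_mean, resid_slope = resid_slope, trough_lag = trough_lag)
c(formula = partial_by_formula, pacf = partial_by_pacf)
```

With a 19-week cycle the most negative autocorrelation is expected near lag 9 or 10, i.e. about half the cycle length, which is the same reasoning used to choose the lags for the scatterplots.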
There are still significant negative partial autocorrelations at lags 3 and 5, but now there is a curious extra negative partial at lag 18. It looks, therefore, as if the main features of the ecology are the same (scramble competition for food between the larvae, leading to negative partials at 3 and 5 weeks after 1 and 2 generation lags), but population size is drifting upwards and the cycles are showing a tendency to dampen out.

24.2 Moving average

The simplest way of seeing pattern in time series data is to plot the moving average. A useful summary statistic is the three-point moving average:

$$y'_i = \frac{y_{i-1} + y_i + y_{i+1}}{3}.$$

The function ma3 will compute the three-point moving average for any input vector x:

    ma3 <- function(x) {
      y <- numeric(length(x)-2)
      # averages are stored in y[2] to y[length(x)-1]; y[1] is left at 0
      for (i in 2:(length(x)-1)) {
        y[i] <- (x[i-1] + x[i] + x[i+1])/3
      }
      y
    }

A time series of mean monthly temperatures will illustrate the use of the moving average:

    temperature <- read.table("c:\\temp\\temp.txt", header=T)
    attach(temperature)
    tm <- ma3(temps)
    plot(temps)
    lines(tm[2:158], col="blue")   # skip the unused first element of tm

[Figure: temps against Index, with the three-point moving average overlaid in blue]

The seasonal pattern of temperature change over the 13 years of the data is clear. Note that a moving average can never capture the maxima or minima of a series (because they are averaged away). Note also that the three-point moving average is undefined for the first and last points in the series.

24.3 Seasonal data

Many time series applications involve data that exhibit seasonal cycles. The commonest applications involve weather data. Here are daily maximum and minimum temperatures from Silwood Park in south-east England over a 19-year period:

    weather <- read.table("c:\\temp\\SilwoodWeather.txt", header=T)
    attach(weather)
    names(weather)
    [1] "upper" "lower" "rain" "month" "yr"
    plot(upper, type="l")

The seasonal pattern of temperature change is clear, but there is no clear trend (e.g. warming; see p. 791). Note that the x axis is labelled by the day number of the time series (Index).

We start by modelling the seasonal component.
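As a preview on simulated data, where the true coefficients are known, the sketch below fits a regression on one sine and one cosine term with time measured in years, so that a complete cycle has length 1. Everything in it is an assumption made for illustration: the ten-year length, the 365-day year, the coefficient values a_true, b_true and g_true, and the noise level. It also shows one way to turn the two fitted coefficients into an amplitude and a date for the annual peak.

```r
# Simulated daily temperatures with a known annual cycle (illustrative values only).
set.seed(1)
n_days <- 3650                     # ten hypothetical years of 365 days
t_yr <- (1:n_days) / 365           # time in years, so one cycle has length 1
a_true <- 15; b_true <- -2.5; g_true <- -7
y <- a_true + b_true * sin(2 * pi * t_yr) + g_true * cos(2 * pi * t_yr) +
  rnorm(n_days, sd = 3)

# The model is linear in its three parameters, so lm() can estimate them.
fit <- lm(y ~ sin(2 * pi * t_yr) + cos(2 * pi * t_yr))
est <- unname(coef(fit))           # intercept, sine coefficient, cosine coefficient
est

# b*sin(u) + g*cos(u) equals A*cos(u - phi) with A = sqrt(b^2 + g^2), phi = atan2(b, g),
# so the fitted curve peaks where 2*pi*t equals phi.
amplitude <- sqrt(est[2]^2 + est[3]^2)
peak_time <- (atan2(est[2], est[3]) / (2 * pi)) %% 1   # fraction of the year
c(amplitude = amplitude, peak_day = peak_time * 365)
```

A negative cosine coefficient places the trough near the start of the year, and a negative sine coefficient then moves the peak a little past mid-year.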
The simplest models for cycles are scaled so that a complete annual cycle is of length 1.0 (rather than 365 days). Our series consists of 6940 days over a 19-year span, so we write:

    length(upper)
    [1] 6940
    index <- 1:6940
    6940/19
    [1] 365.2632
    time <- index/365.2632

The equation for the seasonal cycle is

$$y = \alpha + \beta \sin(2\pi t) + \gamma \cos(2\pi t) + \varepsilon.$$

This is a linear model, so we can estimate its three parameters very simply:

    model <- lm(upper ~ sin(time*2*pi) + cos(time*2*pi))
    summary(model)

    Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
    (Intercept)        14.95647    0.04088  365.86   <2e-16
    sin(time * 2 * pi) -2.53883    0.05781  -43.91   <2e-16
    cos(time * 2 * pi) -7.24017    0.05781 -125.23   <2e-16

    Residual standard error: 3.406 on 6937 degrees of freedom
    Multiple R-squared: 0.7174, Adjusted R-squared: 0.7173
    F-statistic: 8806 on 2 and 6937 DF, p-value: < 2.2e-16

We can investigate the residuals to look for patterns (e.g. trends in the mean, or autocorrelation structure). Remember that the residuals are stored as part of the model object:

    plot(model$resid, pch=".")

[Figure: model$resid against Index]

There looks to be some periodicity in the residuals, but no obvious trends. To look for serial correlation in the residuals, we use the acf function like this:

    windows(7,4)   # opens a graphics window; this device exists on Windows only
    par(mfrow=c(1,2))
    acf(model$resid, main="")
    acf(model$resid, type="p", main="")

[Figure: ACF (left) and Partial ACF (right) of model$resid against Lag]

There is very strong serial correlation in the residuals, and this drops off roughly exponentially with increasing lag (left-hand graph). The partial autocorrelation at lag 1 is very large (0.7317), but the correlations at higher lags are much smaller. This suggests that an AR(1) model (autoregressive model with order 1) might be appropriate. This is the statistical justification behind the old joke about the weather forecaster who was asked what tomorrow's weather would be. 'Like today's', he said.

24.3.1 Pattern in the monthly means

The monthly average upper temperatures show a beautiful seasonal pattern when analysed by acf:

    temp <- ts(as.vector(tapply(upper, list(month, yr), mean)))
    windows(7,7)
    acf(temp, main="")

[Figure: ACF of temp against Lag]

There is a perfect cycle
with period 12 (as expected). What about patterns across years?

    ytemp <- ts(as.vector(tapply(upper, yr, mean)))
    acf(ytemp, main="")

[Figure: ACF of ytemp against Lag]

Nothing! The pattern you may (or may not) see depends upon the scale at which you look for it. As for spatial patterns (Chapter 26), so it is with temporal patterns. There is strong pattern between days within months (tomorrow will be like today). There is very strong pattern from month to month within years (January is cold, July is warm). But there is no pattern at all from year to year (there may be progressive global warming, but it is not apparent within this recent time series (see below), and there is absolutely no evidence for untrended serial correlation).

24.4 Built-in time series functions

The analysis is simpler, and the graphics are better labelled, if we convert the temperature data into a regular time series object using ts. We need to specify the first date (January 1993) as start=c(1993,1), and the number of data points per year as frequency=365:

    high <- ts(upper, start=c(1993,1), frequency=365)

Now use plot to see a plot of the time series, correctly labelled by years:

    plot(high)

[Figure: high against Time, 1993 to 2011]

24.5 Decompositions

It is useful to be able to turn a time series into components. The function stl (with a lower-case letter L, not the numeral one) performs seasonal decomposition of a time series into seasonal, trend and irregular components using loess. First, we make a time series object, specifying the start date and the frequency (as in Section 24.4), then use stl to decompose the series:

    up <- stl(high, "periodic")

The plot function produces the data series, the seasonal component, the trend and the residuals in a single frame:

    plot(up)

[Figure: four panels against time, showing data, seasonal, trend and remainder]

The remainder component is the residuals from the seasonal plus trend fit. The bars at the right-hand side are of equal heights (in user coordinates).

24.6 Testing for a trend in the time series

It is important to
know whether these data provide any evidence for global warming. The trend part of the figure indicates a fluctuating increase, but is it significant? The mean temperature in the last 9 years was 0.71°C higher than in the first 10 years:

    ys <- factor(1 + (yr > 2002))
    tapply(upper, ys, mean)
           1        2
    14.62056 15.32978

We cannot test for a trend with linear regression because of the massive temporal pseudoreplication. Suppose we tried this:

    model1 <- lm(upper ~ index + sin(time*2*pi) + cos(time*2*pi))
    summary(model1)

    Coefficients:
                         Estimate Std. Error  t value Pr(>|t|)
    (Intercept)         1.433e+01  8.136e-02  176.113   <2e-16
    index               1.807e-04  2.031e-05    8.896   <2e-16
    sin(time * 2 * pi) -2.518e+00  5.754e-02  -43.758   <2e-16
    cos(time * 2 * pi) -7.240e+00  5.749e-02 -125.939   <2e-16

    Residual standard error: 3.387 on 6936 degrees of freedom
    Multiple R-squared: 0.7206, Adjusted R-squared: 0.7205
    F-statistic: 5963 on 3 and 6936 DF, p-value: < 2.2e-16

It would suggest (wrongly, as we shall see) that the warming was highly significant (index p value less than $2 \times 10^{-16}$ for a slope of 0.0001807 degrees of warming per day, leading to a predicted increase in mean temperature of 1.254°C over the 6940 days of the time series).

Since there is so much temporal pseudoreplication, we should use a mixed model (lmer, p. 695), and because we intend to compare two models with different fixed effects we use the method of maximum likelihood (REML=FALSE). The explanatory variable for any trend is index, and we fit the model with and without this variable, allowing for different intercepts for the different years as a random effect:

    library(lme4)   # provides lmer
    model2 <- lmer(upper ~ index + sin(time*2*pi) + cos(time*2*pi) + (1|factor(yr)), REML=FALSE)
    model3 <- lmer(upper ~ sin(time*2*pi) + cos(time*2*pi) + (1|factor(yr)), REML=FALSE)
    anova(model2, model3)

           Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
    model3  5 36452 36486 -18221
    model2  6 36458 36499 -18223     0      1          1

Clearly, the trend is non-significant (chi-squared = 0, p = 1). If you are prepared to ignore all the variation (from day to day and from month to month), then you can get rid of the pseudoreplication by averaging, and test for trend in the yearly mean values: these show a significant trend if the first year (1993) is included, but not if it is omitted:

    means <- as.vector(tapply(upper, yr, mean))
    model <- lm(means ~ I(1:19))
    summary(model)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 14.27105    0.32220  44.293   <2e-16
    I(1:19)      0.06858    0.02826   2.427   0.0266

    model <- lm(means[-1] ~ I(1:18))
    summary(model)

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) 14.59826    0.30901  47.243   <2e-16
    I(1:18)      0.04761    0.02855   1.668    0.115

Obviously, you need to be circumspect when interpreting trends in time series.

24.7 Spectral analysis

There is an alternative approach to time series analysis, which is based on the analysis of frequencies rather than fluctuations of numbers. Frequency is the reciprocal of cycle period. Ten-year cycles would have a frequency of 0.1 per year. Here are the famous Canadian lynx data:

    numbers <- read.table("c:\\temp\\lynx.txt", header=T)
    attach(numbers)
    names(numbers)
    [1] "Lynx"
    plot.ts(Lynx)

[Figure: Lynx against Time]

The fundamental tool of spectral analysis is the periodogram. This is based on the squared correlation between the time series and sine/cosine waves of frequency $\omega$, and conveys exactly the same information as the autocovariance function. It may (or may not) make the information easier to interpret. Using the function is straightforward; we employ the spectrum function like this:

    spectrum(Lynx, main="", col="red")

[Figure: spectrum against frequency, bandwidth = 0.00241]

The plot is on a log scale, in units of decibels, and the subtitle on the x axis shows the bandwidth, while the 95% confidence interval in decibels is shown by the vertical blue bar in the top right-hand corner. The figure is interpreted as showing strong cycles with a frequency of about 0.1, where the maximum value of spectrum occurs. That is to say, it indicates cycles with a period of 1/0.1 = 10 years. There is a hint of longer-period cycles (the local peak at frequency 0.033 would produce cycles of length 1/0.033 = 30 years) but no real suggestion of any shorter-term cycles.

24.8 Multiple time series

When we have two or more time series measured over the same period, the question naturally arises as to whether or not the ups and downs of the different series are correlated. It may be that we suspect that change in one of the variables causes changes in the other, e.g.
changes in the number of predators may cause changes in the number of prey, because more predators means m