Chapter 1 Introduction

1.1 Multilevel data

Many kinds of data, including observational data collected in the human and biological sciences, have a hierarchical or clustered structure. For example, animal and human studies of inheritance deal with a natural hierarchy where offspring are grouped within families. Offspring from the same parents tend to be more alike in their physical and mental characteristics than individuals chosen at random from the population at large. For instance, children from the same family may all tend to be small, perhaps because their parents are small or because of a common impoverished environment. Many designed experiments also create data hierarchies, for example clinical trials carried out in several randomly chosen centres or groups of individuals. For now, we are concerned only with the fact of such hierarchies not their provenance. The principal applications we shall deal with are those from the social sciences, but the techniques are of course applicable more generally. In subsequent chapters, as we develop the theory and techniques with examples, we shall see how a proper recognition of these natural hierarchies allows us to seek more satisfactory answers to important questions.

We refer to a hierarchy as consisting of units grouped at different levels. Thus offspring may be the level 1 units in a 2-level structure where the level 2 units are the families: students may be the level 1 units clustered within schools that are the level 2 units.
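As a rough illustration of the level 1 and level 2 terminology, the following Python sketch stores one record per student (level 1 unit) tagged with its school (level 2 unit) and then groups the records by school. The school labels, scores and the helper name `group_by_level2` are hypothetical and chosen only for this example; nothing here is taken from a real data set or from a particular multilevel software package.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: one row per level 1 unit (student), each carrying
# the identifier of the level 2 unit (school) it belongs to.
records = [
    {"school": "A", "student": 1, "score": 50.0},
    {"school": "A", "student": 2, "score": 55.0},
    {"school": "A", "student": 3, "score": 60.0},
    {"school": "B", "student": 4, "score": 40.0},
    {"school": "B", "student": 5, "score": 44.0},
    {"school": "C", "student": 6, "score": 61.0},
    {"school": "C", "student": 7, "score": 65.0},
    {"school": "C", "student": 8, "score": 66.0},
]


def group_by_level2(rows, level2_key):
    """Collect level 1 rows under the level 2 unit they are nested in."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[level2_key]].append(row)
    return dict(groups)


schools = group_by_level2(records, "school")

# Number of level 1 units within each level 2 unit.
school_sizes = {name: len(rows) for name, rows in schools.items()}

# A simple level 2 summary: the mean score of the students in each school.
school_means = {
    name: mean(row["score"] for row in rows) for name, rows in schools.items()
}

for name in sorted(schools):
    print(name, school_sizes[name], school_means[name])
```

The same layout extends to further levels by adding another identifier column, for example an education authority for each school.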

The existence of such data hierarchies is neither accidental nor ignorable. Individual people differ as do individual animals and this necessary differentiation is mirrored in all kinds of social activity where the latter is often a direct result of the former, for example when students with similar motivations or aptitudes are grouped in highly selective schools or colleges. In other cases, the groupings may arise for reasons less strongly associated with the characteristics of individuals, such as the allocation of young children to elementary schools, or the allocation of patients to different clinics. Once groupings are established, even if their establishment is effectively random, they will tend to become differentiated, and this differentiation implies that the group and its members both influence and are influenced by the group membership. To ignore this relationship risks overlooking the importance of group effects, and may also render invalid many of the traditional statistical analysis techniques used for studying data relationships.

We shall be looking at this issue of statistical validity in the next chapter, but one simple example will show its importance. A well known and influential study of primary (elementary) school children carried out in the 1970s (Bennett, 1976) claimed that children exposed to so-called 'formal' styles of teaching reading exhibited more progress than those who were not. The data were analysed using traditional multiple regression techniques which recognised only the individual children as the units of analysis and ignored their groupings within teachers and into classes. The results were statistically significant. Subsequently, Aitkin et al. (1981) demonstrated that when the analysis accounted properly for the grouping of children into classes, the significant differences disappeared and the 'formally' taught children could not be shown to differ from the others. This reanalysis is the first important example of a multilevel analysis of social science data. In essence what was occurring here was that the children within any one classroom, because they were taught together, tended to be similar in their performance. As a result they provide rather less information than would have been the case if the same number of students had been taught separately by different teachers. In other words, the basic unit for purposes of comparison should have been the teacher, not the student. The function of the students can be seen as providing, for each teacher, an estimate of that teacher's effectiveness. Increasing the number of students per teacher would increase the precision of those estimates but not change the number of teachers being compared. Beyond a certain point, simply increasing the numbers of students in this way hardly improves things at all. On the other hand, increasing the number of teachers to be compared, with the same or somewhat smaller number of students per teacher, considerably improves the precision of the comparisons.

Researchers have long recognised this issue. In education, for example, there has been much debate (see Burstein et al., 1980) about the so-called 'unit of analysis' problem, which is the one just outlined. Before multilevel modelling became
well developed as a research tool, the problems of ignoring hierarchical structures were reasonably well understood, but they were difficult to solve because powerful general purpose tools were unavailable. Special purpose software, for example for the analysis of genetic data, has been available longer but this was restricted to 'variance components' models (see chapter 2) and was not suitable for handling general linear models. Sample survey workers have recognised this issue in another form. When population surveys are carried out, the sample design typically mirrors the hierarchical population structure, in terms of geography and household membership. Elaborate procedures have been developed to take such structures into account when carrying out statistical analyses. We return to this in a little more detail in a later section. In the remainder of this chapter we shall look at the major areas explored in this book.

1.2 School effectiveness

Schooling systems present an obvious example of a hierarchical structure, with pupils grouped or nested or clustered within schools, which themselves may be clustered within education authorities or boards. Educational researchers have
been interested in comparing schools and other educational institutions, most often in terms of the achievements of their pupils. Such comparisons have several aims, including the aim of public accountability (Goldstein, 1992) but, in research terms, interest usually is focused upon studying the factors that explain school differences. Consider the common example where test or examination results at the end of a period of schooling are collected for each school in a randomly chosen sample of schools. The researcher wants to know whether a particular kind of subject streaming practice in some schools is associated with improved examination performance. She also has good measures of the pupils' achievements when they started the period of schooling so that she can control for this in the analysis. The traditional approach to the analysis of these data would be to carry out a regression analysis, using performance score as response, to study the relationship with streaming practice, adjusting for the initial achievements. This is very similar to the initial teaching styles analysis described in the previous section, and suffers from the same lack of validity through failing to take account of the school level clustering of students. An analysis that explicitly models the manner in which students are grouped within schools has several advantages. First, it enables data analysts to obtain statistically efficient estimates of regression coefficients. Secondly, by using the clustering information it provides correct standard errors, confidence intervals and significance tests, and these generally will be more 'conservative' than the traditional ones which are obtained simply by ignoring the presence of clustering, just as Bennett's previously statistically significant results became non-significant on reanalysis. Thirdly, by allowing the use of covariates measured at any of the levels of a hierarchy, it enables the researcher to explore the extent to which differences in average examination results between schools can be accounted for by factors such as organisational practice or possibly in terms of other characteristics of the students. It also makes it possible to study the extent to which schools differ for different kinds of students, for example to see whether the variation between schools is greater for initially high scoring students than for initially low scoring students (Goldstein et al., 1993) and whether some factors are better at accounting for or 'explaining' the variation for the former students than for the latter. Finally, there is often considerable interest in the relative ranking of individual schools, using the performances of their students after adjusting for intake achievements. This can be done straightforwardly using a multilevel modelling approach.

To fix the basic notion of a level and a unit, consider figures 1 and 2 based on hypothetical relationships. Figure 1 shows the exam score and intake achievement scores for five students in a school, together with a simple regression line fitted to the data points. The residual variation in the exam scores about this line is the level 1 residual variation, since it relates to level 1 units (students) within a sample level 2 unit (school). In figure 2
the three lines are the simple regression lines for three schools, with the individual student data points removed. These vary in both their slopes and their intercepts (where they would cross the exam axis), and this variation is level 2 variation. It is an example of multiple or complex level 2 variation since both the intercept and slope parameters vary.

[Figure 1] [Figure 2]

The other extreme to an analysis which ignores the hierarchical structure is one which treats each school completely separately by fitting a different regression model within each one. In some circumstances, for example where we have very few schools and moderately large numbers of students in each, this may be efficient. It may also be appropriate if we are interested in making inferences about just those schools. If, however, we regard these schools as a (random) sample from a population of schools and we wish to make inferences about the variation between schools in general, then a full multilevel approach is called for. Likewise, if some of our schools have very few students, fitting a separate model for each of these will not yield reliable estimates: we can obtain more precision by regarding the schools as
a sample from a population and using the information available from the whole sample data when making estimates for any one school. This approach is especially important in the case of repeated measures data where we typically have very few level 1 units per level 2 unit. We introduce the basic procedures for fitting multilevel models to hierarchically structured data in chapter 2 and discuss the design problem of choosing the numbers of units at each level in chapter 11.

1.3 Sample survey methods

We have already mentioned sample survey data which will be discussed in many of the examples of this book. The standard literature on surveys, reflected in survey practice, recognises the importance of taking account of the clustering in complex sample designs. Thus, in a household survey, the first stage sampling unit will often be a well-defined geographical unit. From those which are randomly
chosen, further stages of random selection are carried out until the final households are selected. Because of the geographical clustering exhibited by measures such as political attitudes, special procedures have been developed to produce valid statistical inferences, for example when comparing mean values or fitting regression models (Skinner et al., 1989). While such procedures usually have been regarded as necessary they have not generally merited serious substantive interest. In other words, the population structure, insofar as it is mirrored in the sampling design, is seen as a 'nuisance factor'. By contrast, the multilevel modelling approach views the population structure as of potential interest in itself, so that a sample designed to reflect that structure is not merely a matter of saving costs as in traditional survey design, but can be used to collect and analyse data about the higher level units in the population. The subsequent modelling can then incorporate this information and obviate the need to carry out special adjustment procedures, which are built into the analysis model directly.

Although the direct modelling of clustered data is statistically efficient,
it will generally be important to incorporate weightings in the analysis which reflect the sample design or, for example, patterns of non-response, so that robust population estimates can be obtained and so that there will be some protection against serious model misspecification. A procedure for introducing external unit weights into a multilevel analysis is discussed in Chapter 3.

1.4 Repeated measures data

A different example of hierarchically structured data occurs when the same individuals or units are measured on more than one occasion. A common example occurs in studies of animal and human growth. Here the occasions are clustered within individuals that represent the level 2 units with measurement occasions the level 1 units. Such structures are typically strong hierarchies because there is much more variation between individuals in general than between occasions within individuals. In the case of child height growth, for example, once we have adjusted for the overall trend with age, the variance between successive measurements on the same individual is generally no more than 5% of the variation in height between children. There is a considerable past literature on procedures
for the analysis of such repeated measurement data (see for example Goldstein, 1979), which has more or less successfully confronted the statistical problems. It has done so, however, by requiring that the data conform to a particular, balanced, structure. Broadly speaking these procedures require that the measurement occasions are the same for each individual. This may be possible to arrange, but often in practice individuals will be measured irregularly, some of them a great number of times and some perhaps only once. By considering such data as a general 2-level structure we can apply the standard set of multilevel modelling techniques that allow any pattern of measurements while providing statistically efficient parameter estimation. At the same time modelling a 2-level structure presents a simpler conceptual understanding of such data and leads to a number of interesting extensions that will be explored in chapter 6.

One particularly important extension occurs in the study of growth where the aim is to fit growth curves to measurements over time. In a multilevel framework this involves, in the simplest case, each individual having their own straight line growth trajectory with the intercept and slope coefficients varying between individuals (level 2). When the level 1 measurements, considered as deviations from each individual's fitted growth curve, are not independent but have an autocorrelated or time series structure, neither the traditional procedures nor the basic multilevel ones are adequate. This situation may occur, for example, when measurements are made very close together in time so that a 'positive' deviation from the curve at one time implies also a positive deviation after the short interval before the next measurement.

1.5 Event history models

Modelling time spent in various states or situations is important in a number of areas. In industry the 'time to failure' of components is a key factor in quality control. In medicine the survival time is a fundamental measurement in studying certain diseases. In economics the duration
of employment periods is of great interest. In education, researchers often study the time students spend on different tasks or activities. In studying employment histories, any one individual will generally pass through several periods of employment or unemployment, while at the same time changing his characteristics, for example his level of qualifications. From a modelling point of view we need to model the length of time in each type of employment, relating this to both constant factors such as an individual's social origins or gender and to changing or time dependent factors such as qualifications and age. The multilevel structure is analogous to that for repeated measures data, with periods taking the place of occasions. Furthermore, we would generally have a further, higher level of the hierarchy since individuals, which are the level 2 units, are themselves typically clustered into workplaces, which now constitute level 3 units. In fact, the structure is even more complicated because these workplaces change from period to period and if we wish to include this level in our model we need to consider cross-classifications of the units. We shall have more to say about cross classifications shortly.

There are particular problems arising when studying event duration data that are encountered when some information is 'censored' in the sense that instead of being able to observe the actual duration we only know that it is longer than some particular value, or in some cases less than a particular value. Chapter 9 will discuss ways of dealing with this issue for multilevel event duration models.

1.6 Discrete response data

Until now we have assumed implicitly that our response or dependent variable is continuously distributed, for example an exam score or anthropometric measure such as height. Many kinds of statistical modelling, however, deal with categorised responses, in the simplest case with proportions. Thus, we might be interested in a mortality rate, or a
