




版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)
文檔簡(jiǎn)介
1、chapter 1: introduction to compiling25 of 25 chapter 1introduccin to compiling.the principles and techniques of compiler writing are so pervasive that the ideas found in this book will be used many times in the career of a computer scientist. compiler writing spans programming languages, machine arc
2、hitecture, language theory, algorithms and software engineering.fortunately, a few basic compiler-writing techniques can be used to construct translators for a wide variety of languages and machines. in this chapter we introduce the subject of compiling by describing the components of a compiler the
3、 environment in which compilers do their job, and some software tools that make it easier to build compilers.1.1 compilers simply stated, a compiler is a program that reads a program written in one language -the source language -and translates it into an equivalent program in another language -the t
4、arget language (see fig. 1.1). as an important part of this translation process, the compiler reports to its user the presence of errors in the source program.targetat first glance, the variety of compilers may appear overwhelming. there are thousands of source languages, ranging from traditional pr
5、ogramming languages such as fortran and pascal to specialized languages that have arisen in virtually every area of computer application. target languages are equally as varied; a target language may be another programming language, or the machine language of any computer between a microprocessor an
6、d a supercomputer.compilers arc sometimes classified as single-pass. multi-pass, load-and-go, debugging, or optimizing. depending on how they have been constructed or on what function they arc supposed to perform. despite this apparent complexity. the basic (asks that any compiler must perform are e
7、ssentially the same. by understanding these tasks, we can construct compilers for a wide variety or source languages and target machines using the same basic techniquesour knowledge about how to organize and write compilers has increased vastly since the first compilers started to appear in the earl
8、y i950s it is difficult to give an exact dale for the first compiler because initially a great deal of experimentation and implementation was done independently by several groups. much of the early work. on compiling dealt with the translation or arithmetic formulas into machine code. throughout the
9、 1950s, compilers were considered notoriously difficult programs to write. tile first fortran compiler, for example. took 18 staff-years to implement (backus et al. 1957). we have since discovered systematic techniques for handling many of the important tasks that occur during compilation. good impl
10、ementation languages. programming environments, and software tools have also been developed. with these advances. a substantial compiler call be implemented even as a student project in a one-semester compiler-design course. tb! analysis synthesis model of compilation there lire two parts to compila
11、tion: analysis and synthesis. the analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program. the synthesis part constructs the desired target program from the intermediate representation. of the two parts. synthesis requires t
12、he most specialized techniques. we shall consider analysis informally ill section 1.2 and outline the way target code is synthesized in a standard compiler in section 1.3. during analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a t
13、ree. often, a special kind of tree called a syntax tree is used. in which each node represents an operation and the children of a node represent the arguments of the operation. for example. a syntax tree for an assignment statement is shown in fig. 1.2. .-/, position + -: , initial. . /, ra.te 60 .
14、1.2. syntax tree for position:. initial . rate. 60. many software tools that manipulate source programs first perform some kind of analysis. some examples of such tools include: l, structure editors. a structure editor takes as input a sequence of commands to build a source program. the structure ed
15、itor not only performs the text-creation and modification functions of an ordinary text editor. but it also analyzes the program text, putting an appropriate hierarchical structure on the source program. thus, the structure editor can perform additional tasks that are useful in the preparation of pr
16、ograms. for example, it can check. that the input is correctly formed, can supply keywords automatically (e.g. when the user types while. the editor supplies the matching do and reminds the user that a conditional must come between them). and can jump from a begin or left parenthesis to its matching
17、 end or right parenthesis. further. the output of such an editor is often similar to the output of the analysis phase of a compiler. 1 pretty printers. a pretty printer analyzes a program and prints it in. such a way that the structure of the program becomes clearly visible. for example. comments ma
18、y appear in a special font. and statements may appear with an amount of indentation proportional to the depth of their nesting in the hierarchical organization of the statements. 2 static checkers. a static checker reads a program, analyzes it, and attempts to discover potential bugs without running
19、 the program. the analysis portion is often similar to that found in optimizing compilers of the type discussed in chapter 10. for example a static checker may detect that parts of the source program can never be executed, or that a certain variable might be used before being defined. in addition. i
20、t can catch logical errors such as trying to use a real variable as a pointer. employing the type-checking techniques discussed in chapter 6. 3 interpreters, instead of producing a target program as a translation. an interpreter performs the operations implied by the source program. for an assignmen
21、t statement. for example. an interpreter might build a tree like fig. 1.2, and then carry out the operations at the nodes as it walks the tree. at the root it would discover it had all assignment to perform. so it would call a routine to evaluate the expression on the right and then store the result
22、ing value in the location associated with the identifier position. at the right child of the root, the routine would discover it had to compute the sum or two expressions. it would call itself recursively to compute the value of the expression rate * 60. it would then add that value to the value of
23、the variable initial. interpreters are frequently used to execute command languages. since each operator executed in a command language is usually an invocation of a complex routine such as an editor or compiler. similarly. some very high-level languages, like apl, are normally interpreted because t
24、here are many things about the data, such. as the size and shape of arrays. that cannot be deduced at compile time. traditionally, we think of a compiler as a program that translates a source language like fortran into the assembly or machine language of some computer. however. there arc seemingly u
25、nrelated places where compiler technology is regularly used. the analysis portion in each of the following examples is similar to that of a conventional compiler , t. text formatters. a text formatter lakes input that is a stream of characters, most of which is text to be typeset, but some of which
26、includes commands to indicate paragraphs. figures, or mathematical structures like subscripts and superscripts. we mention some of the analysis done by text formatters in the next section. 1 silicon compilers, a silicon compiler has a source language that is similar or identical to a conventional pr
27、ogramming language. however. the variables of the language represent. not locations in memory. but. logical signals (0 or i) or groups of signals in a switching circuit. the output is a circuit design in an appropriate language. see johnson 119831. ullman 1984, or trickey 1985 for a discussion of si
28、licon compilation. 2 query interpreters. a query interpreter translates a predicate containing relational and boolean operators into commands to search a database for records satisfying that predicate. (see ullman 1982 or date 1986.) the context of a compiler in addition to a compiler, several other
29、 programs may be required to create an executable target program. a source program may be divided into modules stored in separate files. the task of collecting the source program is sometimes entrusted to a distinct program, called a preprocessor. the preprocessor may also expand shorthands, called
30、macros, into source language statements. figure 1.3 shows a typical compilation: the target program created by the compiler may require further processing before it can be run. the compiler in fig. 1.3 creates assembly code that is translated by an assembler into machine code and then linked togethe
31、r with some library routines into he code that actually runs on the machine. we shall consider the components of a compiler in the next two sections; the remaining programs in fig. 1.3 are discussed in section 1.4. 1.1 analysis of the source program in this section, we introduce analysis and illustr
32、ate its use in some text formatting languages. the subject is treated in more detail in chapters 2-4 and 6. in compiling, analysis consists of three phases: i. linear analysts, in which the stream of characters making up the source program is read from left-to-right and grouped into tokens that are
33、sequences of characters having a collective meaning. skeletal source program source program target assembly program relocatable machine code library. relocatable object tiles absolute machine code fig. 1.3. a language-processing system. 1 hierarchical analysis, in which characters or tokens are grou
34、ped hierarchically into nested collections with collective meaning. 2 semantic analysis, in which certain checks are performed to ensure that the components of a program fit together meaningfully. lexical analysis in a compiler, linear analysis is called lexical analysis or scanning. for example, in
35、 lexical analysis the characters in the assignment statement position ;= initial + rae 4 60 woujd be grouped into the following tokens: 1 the identifier position. 2 the assignment symbol :-. 3 the identifier initial. 4 the plus sign. 5 the identifier rate. 6. the multiplication sign. 7. the number 6
36、0. the blanks separating the characters of these tokens would normally be eliminated during lexical analysis. syntax analysis hierarchical analysis is called parsing or syntax analysis. ii involves grouping the tokens of the source program into grammatical phrases that are used by the compiler to sy
37、nthesize output. usually, the grammatical phrases of the source program are represented by a parse tree such as the one shown in fig. 1.4. assignment statement /identifier i position expression i identifier i rate fig. 1.4. parse tree for position: initial + rate * 60. in the expression initial + ra
38、te . 60. the phrase rate * 60 is a logical unit because the usual conventions of arithmetic expressions tell us that multiplication is performed before addition. because the expression initial + rate is followed by a *, it is not grouped into a single phrase by itself in fig. 1.4. the hierarchical s
39、tructure of a program is usually expressed by recursive rules. for example, we might have the following rules as part of the definition of expressions: i. any identifier is an expression. 2. any number is an expression. 3. if expression i and expression 2 are expressions, then so are expression 1 +
40、expression 2 expression i* expression ; ( expression i) rules (i) and (2) are (no recursive) basis rules, while (3) defines expressions in terms of operators applied to other expressions. thus, by rule (1), initial and rate are expressions. by rule (2). 60 is an expression. while by rule (3), we can
41、 first infer that rate * 60 is an expression and finally hat initial +rate 60 is an expression. similarly. many languages define statements recursively by rules such as: i. if identifier i is an identifier, and expression 2 is an expression, then identifier, : = expr(.ision2 is a statement. 2. if ex
42、pression i is an expression and statement 2 is a statement. then while expression i do statement 2 if ( expression i ) then statement 1 are statements. the division between lexical and syntactic analysis is somewhat arbitrary. we usually choose a division that simplifies the overall task of analysis
43、. one factor 0 determining the division ;s whether a source language construct is inherently recursive or not. lexical constructs do not require recursion, while syntactic constructs often do. context-free grammars are a formalization of recursive rules that can be used to guide syntactic analysis.
44、they are introduced in chapter 2 and studied extensively in chapter 4. for example, recursion is not required to recognize identifiers, which are typically strings of letters and digits beginning with a letter. we would normally recognize identifiers by a simple scan of the input stream. waiting unt
45、il a character that was neither a letter nor a digit was found. and then grouping an the letters and digits found up to that point into an identifier token. the characters so grouped are recorded in a table, called a symbol table. and removed from the input so that processing of the next token can b
46、egin. on the other hand. this kind of linear scan is not powerful enough to analyze expressions or statements. for example we cannot properly match parentheses in expressions. or begin and end in statements, without putting some kind of hierarchical or nesting structure on the input. ./ -/ position
47、. position /, initial/ initi.l * /, /, rate 60 rate oreal i (a) (b) 60 fig. 1.5. semantic analysis inserts a conversion from integer 10 real, the parse tree in fig. 1.4 describes the syntactic structure of the input. a more common internal representation of this syntactic structure is given by the s
48、yntax tree in fig. l5(a). a syntax tree is a compressed representation of the parse tree in which the operators appear as the interior nodes. and the operands of an operator are the children of t he node for that operator. the construction of trees such as the one in fig. 1.5(a) is discussed in sect
49、ion 5.2. we shall take up in chapter 2, and in more detail in chapter 5, the subject of syntax-directed translation, in which the compiler uses the hierarchical structure or! the input to help generate the output. seld8dtic analysis the semantic analysis phase checks the source program for semantic
50、errors and gathers type information for the subsequent code-generation phase. it uses the hierarchical structure determined by the syntax-analysis phase to identify the operators and operands of expressions and statements. an important component of semantic analysis is type checking. here (he compil
51、er checks that each operator has operands that are permitted by the source language specification. for example, many programming language definitions require a compiler to report an error every time a real number is used to index an array. however. the language specification may permit some operand
52、coercions, for example, when a binary arithmetic operator is applied to an integer and real. in this case; the compiler may need to convert the integer to 3 real. type checking and semantic analysis are discussed in chapter 6. example 1.1. inside a machine, the bit pattern representing an integer is
53、 generally different from the bit pattern for a real, even if the integer and the real number happen to have the same value. suppose; for example. that all identifiers in fig. 1.5 have been declared to be reals and that 60 by itself is assumed to be an integer. type checking of fig. 1.5(a) reveals t
54、hat is applied to a real, rate, and an integer, 60. the general approach is to convert the integer into a real this has been achieved in fig. l.5(b) by creating an extra node for the operator inttoreal that explicitly converts an integer into a real. alternatively. since the operand of inttoreal is
55、a constant, the compiler may instead replace the integer constant by an equivalent real constant. 0 analysis in text formatters it is useful to regard the input to a text formatter as specifying a hierarchy of boxes that are rectangular regions to be filled by some bit pattern, representing light an
56、d dark. pixels to be printed by the output device. for example, the tex system (knuth (1984an views its input this way. each character that is not part of a command represents a box containing the bit pattern for that character in the appropriate font and size. consecutive characters not separated b
57、y white space (blanks or newline characters) are grouped into words, consisting of a sequence of horizontally arranged boxes. shown schematically in fig. 1.6. the grouping of characters into words (or commands) is the linear or lexical aspect of analysis in a text formatter. boxes in tex may be buil
58、t from smaller boxes by arbitrary horizontal and vertical combinations. for example. hbox fig. 1.6. grouping of characters and words into boxes, groups the list of boxes by juxtaposing them horizontally. while the vbox operator similarly groups a list of boxes by vertical juxtaposition. thus. if we say in tex hboxvbox! 1 vbox 2 we get the arrangement of boxes shown in fig. 1.7. determining the hierarchical arrangement of boxes implied by the input is part of syntax analysis i
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會(huì)有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫(kù)網(wǎng)僅提供信息存儲(chǔ)空間,僅對(duì)用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對(duì)用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請(qǐng)與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。
最新文檔
- 學(xué)習(xí)2025年雷鋒精神六十二周年主題活動(dòng)實(shí)施方案 (4份)-54
- 2024年油煙凈化設(shè)備項(xiàng)目資金申請(qǐng)報(bào)告代可行性研究報(bào)告
- 2025年河北化工醫(yī)藥職業(yè)技術(shù)學(xué)院?jiǎn)握新殬I(yè)技能測(cè)試題庫(kù)附答案
- 政治-云南省三校2025屆高三2月高考備考聯(lián)考卷(六)試題和答案
- 2025年農(nóng)村宅基地買賣合同協(xié)議書(農(nóng)村土地流轉(zhuǎn)法律保障)
- 2025年度地下車位租賃與車位租賃平臺(tái)服務(wù)合同
- 2025年度室內(nèi)裝修安全監(jiān)理服務(wù)協(xié)議
- 2025年度商鋪?zhàn)赓U稅收優(yōu)惠政策協(xié)議
- 2025年度新能源技術(shù)研發(fā)用工協(xié)議安全責(zé)任承諾書
- 2025年度制造業(yè)企業(yè)生產(chǎn)線人員招聘與培訓(xùn)合同
- 甲狀腺旁腺分泌的激素及功能
- 中央財(cái)政成品油價(jià)格調(diào)整對(duì)漁業(yè)補(bǔ)助資金項(xiàng)目實(shí)施方案
- PFMEA模板完整版文檔
- 論生產(chǎn)安全對(duì)于家庭的重要性
- 風(fēng)力發(fā)電變槳系統(tǒng)外文翻譯
- 教學(xué)能力比賽決賽 《英語》教案
- ECMO IABP完整版可編輯
- 離婚糾紛證據(jù)清單
- 【高考作文指導(dǎo)】用思辨來寫現(xiàn)象類作文(共39張PPT)
- GB/T 4513-2000不定形耐火材料分類
- GB 19147-2013f車用柴油(Ⅳ)
評(píng)論
0/150
提交評(píng)論