Chinese Adverb Processing in Japanese to Chinese Machine Translation

Zhang Ying
Ikeda Laboratory, Dept. of Information Science, Faculty of Engineering, Gifu University
Yanagido 1-1, Gifu, Japan 501-1194
Email: zhang@ikd.info.gifu-u.ac.jp
Tel: 0081 90 41835170  Fax: 0081 58 213 3331

Abstract

Japanese-Chinese machine translation requires a large number of structural transformations at both the grammatical and the conceptual level. To make its control structure clearer and more understandable, this paper proposes a model, the Engine JAW, which is based on a pattern dictionary. The translation process is modeled as a data flow computation. When translating between different language families, the correspondence between language elements is often not straightforward. This paper analyzes Japanese expressions which correspond to Chinese adverbs. We classify the expressions into two types, which are translated differently, and propose an adverb processing method for Japanese-to-Chinese machine translation based on this classification.

1 Introduction

1.1 On Machine Translation

The type of MT considered here is the text-to-text variety. MT can be divided into two types: Unassisted MT and Assisted MT. Unassisted MT takes pieces of text and translates them into output for immediate use with no human involvement. The result is unpolished text that gives only a gist of the source, hence the term "gisting". The ultimate aim of this type of MT is sometimes known as Fully Automatic High Quality Translation (FAHQT): perfect translation created solely by a computer. Assisted MT uses a human translator to clean up after, and sometimes before, translation in order to get better quality results. Usually the process is improved by
limiting the vocabulary, through use of a dictionary, and the types of sentences (grammar) allowed. The use of a controlled language has been fairly successful. Some systems have also been set up to learn from corrections. Assisted MT can be divided into Human Aided Machine Translation (HAMT), a machine that uses human help, and Machine Aided Human Translation (MAHT), a human that uses machine help. Computer Aided Translation (CAT) is a more recent form of MAHT.

Another area worth mentioning here is Natural Language Processing (NLP). NLP parses sentences and determines their underlying meaning in order for databases to answer SQL queries entered in the form of a question. For further information on the structure of MT systems, see the recent special report on the future of translation featured in Wired magazine.

The structure of MT systems can vary, but all use some sort of transfer component. This component is specialised for a pair of languages so that it can produce a target sentence. The transfer component has a correspondence lexicon, which is a comprehensive list of source language patterns and phrases mapped to the target language. Some MT systems use systematic transfer systems, which apply software parsers to analyse the source language sentences. With this type of transfer system, a new correspondence lexicon must be created for every pair of languages between which translation is required.

An alternative to the transfer component is an Interlingua, a type of intermediate language. A translation is made from the source language into the Interlingua and then into the target language. The benefit of using an Interlingua is that only one part is required for each language, so further languages can be added easily (see Figure 1). Unfortunately, the majority of work to date relies on comparative information about the specific pair of languages.

Figure 1: A comparison of a Standard Transfer Component versus an Interlingua

1.2 About the Engine JAW

The Engine JAW, a PC (Pattern Conversion) type machine translation engine, uses source language patterns to build its pattern dictionary. Using this dictionary, it expresses the input as an IT (Input Tree) pattern. Translation in JAW is divided into three stages: the translation of the propositional content, the translation of function words that follow Japanese YOUGEN (declinable words), and the translation of function words that follow Japanese TAIGEN (nominals). TAIGEN refers to words that do not inflect in any way, mainly nouns, and YOUGEN to words that do inflect.

JAW Chinese is a machine translation system from Japanese into Chinese. JAW itself is a translation engine from Japanese to other languages; the name stands for "Japanese to Asian and World languages". The following is the outline of JAW Chinese.

Figure 2: Outline of JAW Chinese. (Components shown: Japanese input; IBUKI syntax analysis; input tree of Japanese (IT); pattern matching against the Japanese patterns in the transfer dictionary; transfer tree (TT) of translation rules (TR); execution of conversion and generation functions; transfer rules for function words and information of function words; expression tree of Chinese (ET); linearization; Chinese output.)

After a Japanese sentence has been analyzed by IBUKI, a system developed in our laboratory for segmenting Japanese into bunsetsus, JAW puts the bunsetsus into a tree, the input tree (IT). JAW then searches the Japanese patterns in the transfer dictionary for the IT and builds a tree of transfer rules, the transfer tree (TT). The system is implemented in C++, and a transfer rule is in fact a C++ program stored as a DLL. Executing the transfer rules in the tree produces a network of C++ objects for the Chinese expression tree (ET). A
linearization function is defined for each object as a C++ class method. Executing the linearization function on the ET puts the members of the ET in a line to produce the Chinese output sentence.

2 The Problems in Adverb Processing in Japanese to Chinese Machine Translation

Adverbs have various complex grammatical functions in sentences. In natural language processing, the study of adverbs has not developed very far to date compared with that of nouns and verbs, because adverbs were thought not to construct the main parts of sentence meaning and because their grammatical functions are so varied. However, adverbs occur frequently and make important contributions to sentence meaning. Thus, the accurate processing of Chinese adverbs is required for high quality machine translation.

From a linguistic point of view, linguists have examined adverb grammatical functions and meanings in detail (Quirk et al., 1985). Conjuncts and disjuncts are usually called sentence adverbs. Other studies by linguists include those which handle the meanings of specific adverbs such as "even", "still" and "already". There are also studies of adverb position in Chinese in general and of the positions of specific adverbs. In Chinese, adverbs can be used only in front of verbs or adjectives. But it is difficult to apply the results of these studies to natural language processing directly, because they are written as knowledge for humans, which computers cannot readily use.

From the natural language processing viewpoint, few studies (Glasbey, 1993) have considered adverbs. Conlon and Evans (1992) aimed to decrease ambiguity in adverb meanings and to select words during generation by applying information about adverb semantics and syntax from linguistic studies to an adverb lexicon. These are studies of the multiplicity of adverb meanings. A method for determining where adverbs should be placed in Chinese sentences in
Japanese-to-Chinese machine translation has been proposed. The main problems in adverb processing in Japanese-to-Chinese machine translation are the following two:

1. the multiplicity of adverb meanings;
2. word ordering of Chinese adverbs in Chinese generation.

The method is based on adverb grammatical functions (subjuncts, adjuncts, disjuncts and conjuncts) and meanings (process, space, time, etc.), preferred positions in sentences (initial, medial, end, pre, post), and priorities between adverbs with the same preferred position.

There are few studies of differences in expression between Japanese and Chinese for adverbial meaning. It has been shown that only about 55% of Japanese adverbs were translated into Chinese adverbs in translation from Japanese to Chinese by human translators. On the other hand, only about 17% of the Chinese adverbs that appeared in the human translation were translated from Japanese adverbs in the original. This shows clearly the difference between Japanese and Chinese representations of adverbial meaning and the difficulty of adverb processing in machine translation.

Adverb processing in Japanese-to-Chinese machine translation is very complicated. In this paper we therefore focus on the problem of differences in expression between Japanese and Chinese adverbial meaning, from the viewpoint of Chinese adverbs. When translating between different language families, correspondence between language elements is not straightforward, and this tendency is especially prominent in the translation of adverbial expressions. So we first examine in detail examples in which Japanese expressions correspond to Chinese adverbs. We classify the examples into two types, which are translated differently. We then propose an adverb processing method for Japanese-to-Chinese machine translation that uses the correspondence types and the functional differences of adverb expressions. The contents of the Japanese dictionaries, which are used to determine the Japanese composition of words, the Japanese-to-Chinese word transfer dictionaries,
which are used to transfer by word units, and the Japanese-to-Chinese structure transfer dictionaries, which are used to transfer by predicate unit, are also presented.

3 Correspondence between Japanese Expressions and Chinese Adverbs

We examined Japanese expressions which correspond to Chinese adverbs in Japanese-to-Chinese translations made by professional human translators. 1,000 sentences from newspaper articles in the industrial and economic domains, together with their translations, were investigated from the viewpoint of correspondence between Japanese expressions and Chinese adverbs. Examining adverb frequency in Chinese showed that adverbs appeared 585 times in the 1,000 sentences; that is, on average roughly one adverb appeared in every two sentences.

We classified the examples into two types depending on how they can be translated. In Type 1, Japanese expressions can be translated directly into Chinese adverbs and can be transferred by word unit. Type 2 is more complicated and very difficult to translate: the sentence structures were changed because of the different ways of expression or thinking in Japanese and Chinese. As a result, some expressions (nouns, verbs, adjectives, etc.) were translated into Chinese adverbs indirectly. Type 2 must be transferred by units larger than the word.

Type 1

Type 1 was the most common type. It consists of Japanese expressions which have an adverbial function and correspond to Chinese adverbs. Most of the Chinese adverbs were translations of Type 1 Japanese expressions. Chinese adverbs which were translated from Japanese adverbs were only a small part of all Chinese adverbs. In Japanese, an adverbial typically takes the form of an adverb, a noun followed by some particle, or the continuative form of a predicate.

Examples of Japanese forms corresponding to Chinese adverbs:

  Adverbs: sarani 更 (even); shōrai 將來 (in the future)
  Adjectival nouns + ni: ōhaba ni 大幅 (greatly)
  Nouns + ni or de: chūshin ni 中心 (especially); kin'yūmen de 金融面 (financially)
  Verbs (continuative form): hikitsuduki 引き続き (continuously)
  Adjectives (continuative form): subayaku 素早く (quickly)
  Nouns: chokuei 直接地 (directly)
  Verbs (attributive form): isogu 飛快地 (rapidly)
  Adjectival nouns + na: tokushu na 特別地 (specially)
  Adjectives (attributive form): chikai 幾乎 (almost)
  Uninflected adjectives: ōkina 大大地 (greatly)

The most typical case of Type 1 is the correspondence between Japanese adverbs and Chinese adverbs. For example, the Japanese adverb sarani was translated into the Chinese adverb geng (更). Adjectival nouns followed by the particle ni, continuative forms of verbs and adjectives, and some nouns followed by the particle ni or de, which have adverbial functions as a whole, were translated into Chinese adverbs. Such Japanese expressions are translated as case elements in Chinese. The Japanese adverbial particle mo was translated as ye (也), a modal adverb (subjunct) in Chinese. Japanese idiomatic expressions for modals were translated into disjuncts.

Type 2

It is essential to treat this type correctly in Japanese-to-Chinese machine translation, because most direct transfer machine translation systems cannot deal with it. The most typical pattern of Type 2 consisted of nouns, attributive forms of verbs, adjectival nouns followed by the particle na, attributive forms of adjectives, or uninflected adjectives, all functioning as modifiers, which were translated into Chinese adverbs. When the sentence structures were changed and the modificands were changed to predicates (verbs or adjectives), the modifiers were accordingly changed to adverbs. An example is shown below.

  Jpn:   Sanyō denki wa seisan o kogaisha ni zenmen ikan suru
         三洋電機は生産を子会社に全面移管する
  Gloss: Sanyo Electric-TOP production-OBJ subsidiary company-LOC whole surface transfer do
  Eng:   Sanyo Electric completely transfers control of production to its subsidiary company.
  Chin:  三洋電機對其子公司進行了完全生產移管

In this example, the light verb suru together with its object, the verbal noun ikan, was translated into the Chinese
action verb "transfer". Accompanying this process, the attributive noun zenmen, which was the modifier of the Japanese action noun ikan, was translated into the Chinese adverb "completely" (完全), which modifies the action verb "transfer".

Type 2 correspondences, however, are complicated: they involve the predicate unit, and the Japanese transfer unit is sometimes different from the Chinese transfer unit, for example Japanese nouns to Chinese verbs, Japanese adjectives to Chinese adverbs, Japanese adverbs to Chinese adjectives, Japanese verbs or adjectives to Chinese nouns, Japanese clauses to phrases, etc. Type 2 cannot be transferred by the conventional direct word-to-word transfer method. We propose an adverb transfer method which uses direct parse tree transfer for Type 2. Direct parse tree transfer provides a flexible framework for translation where source language units are different from target language units. An example of a Japanese sentence whose structure must be changed when it is translated into Chinese is shown below.

  Jpn:   watashi wa shōsai na kentō o okonau
         私は詳細な検討を行う
  Gloss: I-TOP detail examination-OBJ do
  CHN1:  我做了詳細的研究 (I do a detailed examination)
  CHN2:  我詳細地做了研究 (I examine in detail)

When translating a Japanese light verb such as okonau (do) or suru (do) which has an action noun as its object, it is sometimes preferable to use the action noun as a verb. In this example, kentō o okonau (do examination) is translated as "examine". Accordingly, the Japanese expression shōsai na (detail), which modifies kentō, must be translated into the Chinese adverb "in detail" (詳細地).

In J-to-C MT, the most frequent problem is the multiplicity of adverb meanings. Following the definition and classification of the Japanese, a Japanese pattern and rule were made for the keyword verb fueru, as in Figure 5. I tried to find the reason, and shortened the sentence as follows:

  近年事故はますます増えている
  kinnen jiko wa masumasu fuete iru

From the translation that came out, I found that the problem lies in the compound word koutujiko. So I added a new
pattern for the compound word, like Figure1 2, and finally got the translation result shown in Figure 7.

Our experiment was performed on sentences with at least one Japanese adverb, taken from the 1,000 example Japanese sentences, each with its meaning as translated by a human translator. We manually examined whether the Chinese adverbs in the translation would be generated correctly using the proposed method.

  Examined objects
    Japanese adverb entries:                  86 words
    Sentences:                                122 sentences
  Accuracy rate:                              92.3%
  Adverbs generated in incorrect positions:   21.05%
    Absolutely incorrect position:            5.53%
    Strange position:                         15.52%

This experiment confirmed that the proposed word ordering method can handle a large number of adverbs correctly.

  Rule   Type       Class             Member
  Adv    Addition   cw CProposition   m adverb
  Noun   Base       CNoun
  Verb   Base       CProposition
  Adj    Addition   cw CNoun          m adjectival

In our experiment, the Japanese-to-Chinese machine translation system JAW Chinese translated Japanese sentences chosen to test various Chinese adverb functions. The goal was to confirm that this adverb ordering method could handle various types of Chinese adverbs.

6 Word Ordering Method for Chinese Adverbs

Adverbs usually have many meanings, especially adverbs which are used frequently in daily life. Normally the difference in meaning is indicated by the position in the sentence. The position of an adverb depends not only on the adverb's meaning but also on the relationship between the adverb and other sentence elements. In Chinese, the basic order is subject, adverb, verb (or adjective), verb complement, object. Here the adverb is used in front of verbs or adjectives to show degree, extent, time, negation, etc., e.g.:

  Degree: 很 (very), 非常 (very), 極其 (extremely), 格外 (extraordinarily)
  Extent: 都 (all), 僅僅 (only)
  Time: 已經 (already), 曾經 (ever), 剛剛 (just), 正在 (at the moment), 立刻 (immediately), 常常 (often)
  Negation: 不 (not), 沒 (no), 別 (not)
  Positive: 必定 (surely), 必 (sure)
  Repetition or continuity: 又 (again), 還 (again), 再 (again)
  Mood: 卻 (however), 倒, 竟, 偏 (even)

In Chinese the adverbs
can be used only in front of the verbs or adjectives while in English they may a
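
The basic order described in Section 6 (subject, adverb, verb or adjective, verb complement, object) can be illustrated with a minimal Python sketch. It is not the JAW implementation, which the paper describes as C++ objects with per-class linearization methods. The names Clause, ADVERB_CLASS, CLASS_PRIORITY and linearize are hypothetical, the lexicon is a small subset of the examples listed above, and the relative priority among adverb classes is an assumption made only so that the example has a defined behaviour when several adverbs compete for the pre-verbal slot.

```python
from dataclasses import dataclass, field
from typing import List

# Small subset of the adverb classes listed in Section 6.
ADVERB_CLASS = {
    "很": "degree", "非常": "degree",
    "都": "extent", "僅僅": "extent",
    "已經": "time", "常常": "time",
    "不": "negation", "沒": "negation",
    "又": "repetition", "再": "repetition",
}

# ASSUMPTION: illustrative ordering among classes; not taken from the paper.
CLASS_PRIORITY = ["time", "extent", "repetition", "negation", "degree"]


@dataclass
class Clause:
    subject: str
    predicate: str                      # verb or adjective
    adverbs: List[str] = field(default_factory=list)
    complement: str = ""                # simplification: aspect marker 了 goes here
    obj: str = ""


def linearize(clause: Clause) -> str:
    """Emit subject, adverb(s), predicate, complement, object, in that order."""
    def rank(adverb: str) -> int:
        cls = ADVERB_CLASS.get(adverb)
        # Adverbs missing from the lexicon sort last, closest to the predicate.
        return CLASS_PRIORITY.index(cls) if cls in CLASS_PRIORITY else len(CLASS_PRIORITY)

    ordered = sorted(clause.adverbs, key=rank)  # stable: ties keep input order
    return "".join([clause.subject, *ordered, clause.predicate,
                    clause.complement, clause.obj])


if __name__ == "__main__":
    # Reproduces the CHN2 rendering of the "detailed examination" example.
    print(linearize(Clause("我", "做", ["詳細地"], "了", "研究")))
```

With a single adverb the sketch simply places it before the predicate, which is all the basic order requires; the priority list only matters when two or more adverbs are present, and a real system would have to derive it from the grammatical functions and preferred positions discussed in Section 2.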
