Evaluation of a grammar of French determiners

  • Éric Laporte, Université Paris-Est

Abstract


Existing syntactic grammars of natural languages are complex objects, even when their coverage is far from complete. Assessing the quality of parts of such grammars helps validate their construction. We evaluated the quality of a grammar of French determiners that takes the form of a recursive transition network. Applying this local grammar yields deeper syntactic information than chunking provides or than is available in treebanks. We performed the evaluation by comparing the grammar's output with a corpus independently annotated with information on determiners. We obtained 86% precision and 92% recall on text not tagged for parts of speech.
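As an illustration of how precision and recall of this kind are commonly computed, the following minimal Python sketch compares determiner spans produced by a grammar with spans from a reference annotation. The span representation, the exact-match criterion, the function name `precision_recall` and the toy data are all assumptions made for this example; they are not taken from the paper and do not reproduce its figures.

```python
from typing import Set, Tuple

# Hypothetical representation: a determiner occurrence is a
# (start_token, end_token) pair. The paper's own annotation format
# is not described in this abstract.
Span = Tuple[int, int]


def precision_recall(predicted: Set[Span], reference: Set[Span]) -> Tuple[float, float]:
    """Return (precision, recall), counting only exact span matches as correct."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Invented toy data, for illustration only.
reference = {(0, 1), (4, 5), (9, 11), (14, 15)}
predicted = {(0, 1), (4, 5), (9, 10), (14, 15), (20, 21)}

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under exact matching, a span with a wrong boundary, such as `(9, 10)` against `(9, 11)`, counts as both a false positive and a missed reference item. A more lenient evaluation could award partial credit for overlapping spans.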

Published
30/06/2007
LAPORTE, Éric. Evaluation of a grammar of French determiners. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 5., 2007, Rio de Janeiro/RJ. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2007. p. 1625-1634.