Natural Language Processing and the Study of Discourse Complexity
- Authors: Solnyshkina M.I. [1], McNamara D.S. [2], Zamaletdinov R.R. [1]
- Affiliations:
- [1] Kazan (Volga Region) Federal University
- [2] Arizona State University
- Issue: Vol. 26, No. 2 (2022): Computational Linguistics and Discourse Complexology
- Pages: 317-341
- Section: Articles
- URL: https://bakhtiniada.ru/2687-0088/article/view/314951
- DOI: https://doi.org/10.22363/2687-0088-30171
- ID: 314951
Abstract
This study reviews the formation and development of discourse complexology, an integrative research field that brings together linguists, cognitive scientists and programmers working on problems of discourse complexity. The article has three main parts, which present in turn the views on the category of complexity, the history of discourse complexology, and modern methods of assessing text complexity. Distinguishing between the complexity of language, of text and of discourse, we hold that the assessment of text complexity is absolute, whereas discourse complexity is relative and depends on the linguistic personality of the recipient. Text complexity theory, whose foundations were laid in the 19th century, focuses on finding and validating predictors of complexity and criteria of the difficulty of text comprehension. We briefly characterize the five previous stages in the development of discourse complexology (the formative stage, the classical stage, the period of closed tests, the constructive-cognitive stage, and the period of natural language processing) and describe in detail the current state of the field. We present the theoretical basis of the automatic analyzer Coh-Metrix: a five-level cognitive model of comprehension that made it possible to achieve high accuracy in complexity assessment and to include among the predictors of text complexity not only lexical and syntactic parameters but also parameters of the text level, the situation model and rhetorical structures. Using several tools as examples (LEXILE, ReaderBench and others), we show their areas of application, which include education, the social sphere and business, among others. The immediate prospect for discourse complexology lies in parametrizing the complexity of texts of different genres and building a typology of it, in order to make cross-linguistic and intralinguistic comparison more accurate and to automate the selection of texts for various linguo-pragmatic conditions.
About the authors
Marina Ivanovna Solnyshkina
Kazan (Volga Region) Federal University
Email: mesoln@yandex.ru
ORCID iD: 0000-0003-1885-3039
Doctor of Philology, Professor at the Department of Theory and Practice of Teaching Foreign Languages, Head of the Text Analytics Research Laboratory, Institute of Philology and Intercultural Communication
Russia, 420008, Kazan, Kremlevskaya St., 18
Danielle S. McNamara
Arizona State University
Email: Danielle.McNamara@asu.edu
Doctor of Science, Professor at the Department of Psychology
Payne Hall, Tempe Campus, Room 108, 1104, USA
Radif Rifkatovich Zamaletdinov
Kazan (Volga Region) Federal University
Corresponding author.
Email: director.ifmk@gmail.com
ORCID iD: 0000-0002-2692-1698
Doctor of Philology, Professor, Director of the Institute of Philology and Intercultural Communication
Russia, 420008, Kazan, Kremlevskaya St., 18
