USING R IN THE RESEARCH BY FUTURE PHILOLOGISTS
PDF (Ukrainian)

Keywords

R statistical software environment
corpus of academic speech
hedges
the Kolmogorov-Smirnov test
the Mann-Whitney U Test

How to Cite

[1]
V. V. Zhukovska, O. O. Mosiiuk, and V. V. Komarenko, “USING R IN THE RESEARCH BY FUTURE PHILOLOGISTS”, ITLT, vol. 66, no. 4, pp. 272–285, Sep. 2018, doi: 10.33407/itlt.v66i4.2196.

Abstract

Corpus linguistics is an emerging field of applied linguistics that deals with the construction, processing, and exploitation of text corpora. High-quality analysis of the vast amounts of empirical language data that computerized corpora provide is impossible without computer technologies and appropriate statistical methods. Teaching future philologists to apply statistical software effectively is therefore an important stage in their research training. The article discusses how future philologists can use the R statistical software environment, a package for statistical data analysis that is among the leading ones in Western linguistics but is not well known in Ukraine. The paper outlines the advantages and disadvantages of R in comparison with similar software packages (SPSS and Statistica) and provides Internet links to R self-study tutorials. The flexibility and efficacy of R for linguistic research are demonstrated through a statistical analysis of the use of hedges in a corpus of academic speech. So that novice philologists can understand how a statistical linguistic experiment is conducted in R, each stage of the study is described in detail. The use of hedges in the speech of students and lecturers was verified statistically with the Kolmogorov-Smirnov test and the Mann-Whitney U test. The article presents algorithms for calculating these tests with built-in commands and with functions from specialized libraries created by the R user community to extend the software's functionality. Each R script for statistical calculations is accompanied by a detailed description and an interpretation of the results obtained.
Further work on the topic will involve activities aimed at raising future philologists' awareness of R and improving their skills in using it, which is important for their professional development as researchers.
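
The article's own R scripts are not reproduced on this page. As a rough illustration of what a Mann-Whitney U comparison of two speaker groups computes, the sketch below works through the rank-sum calculation by hand in Python using only the standard library. The hedge counts, the group names, and the helper names `midranks` and `mann_whitney_u` are invented for illustration and are not taken from the study. The p-value uses a normal approximation with a tie correction and no continuity correction, so it may differ slightly from what R's `wilcox.test(x, y)` reports. In R, a Lilliefors-corrected Kolmogorov-Smirnov normality check would typically come first, for example with `nortest::lillie.test`.

```python
import math
from collections import Counter


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) using a normal approximation.

    Assumes the pooled sample is not constant (otherwise the variance is 0).
    """
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    n = n1 + n2
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    z = (u_x - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, u_y, p


# Hypothetical hedge counts per speaker (invented data, not from the article).
students = [12, 15, 9, 20, 14, 11]
lecturers = [22, 18, 25, 17, 30, 21]
u_students, u_lecturers, p_value = mann_whitney_u(students, lecturers)
print(u_students, u_lecturers, round(p_value, 4))
```

A small p-value would suggest that hedge frequencies in one group tend to be larger than in the other. With samples this small, an exact test is usually preferable to the normal approximation shown here.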

Authors who publish in this journal agree to the following terms:

  1. Authors hold copyright immediately after publication of their works and retain publishing rights without any restrictions.
  2. The copyright commencement date corresponds to the publication date of the issue in which the article is included.

Content Licensing

  1. Authors grant the journal the right of first publication of the work under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), which allows others to freely read, download, copy, and print submissions, search content and link to published articles, and disseminate their full text and use it for any legitimate non-commercial purposes (i.e. educational or scientific), with mandatory reference to the article's authors and to the initial publication in this journal.
  2. Original published articles cannot be used by users (except the authors) for commercial purposes or distributed by third-party intermediary organizations for a fee.

Deposit Policy

  1. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) during the editorial process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see this journal’s registered deposit policy at Sherpa/Romeo directory).
  2. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
  3. Post-print (post-refereeing manuscript version) and publisher's PDF-version self-archiving is allowed.
  4. Archiving the pre-print (pre-refereeing manuscript version) is not allowed.
