The Potential of the TikTok Platform as a “Natural” Linguistic Corpus
DOI: https://doi.org/10.30827/cre.v21.32444
Keywords: natural corpus, linguistic search, nanodiachrony, social network, specialized corpus, TikTok
Abstract
The article presents a linguist’s view of TikTok as a “natural” linguistic corpus. TikTok has already served as a corpus in a number of linguistic studies, but it has not previously been evaluated as a linguistic search tool.
We draw attention to the social network’s non-specialized tools, which facilitate the search for linguistic contexts and the documentation of the examples found, and to the advantages of the corpus content: (1) semantic search; (2) dating, (3) sorting, and (4) date-based filtering of search results; (5) ample opportunities for documenting linguistic phenomena; (6) representative material on the oral and oral-written speech of today’s youth; (7) self-sufficiency; (8) multilingualism; (9) representation of metalinguistic reflexives; (10) the possibility of saving search results and links to them; (11) the possibility of analyzing collocations and word connections; (12) the possibility of indirectly assessing the popularity of a linguistic phenomenon; (13) a wide representation of creolized texts; (14) completeness of contexts of use; (15) automated assistance in selecting information useful to the researcher. We also emphasize that TikTok can be used as a “natural” linguistic corpus in the educational process.
We argue that the natural corpus of TikTok is invaluable for nanodiachronic studies of the newest words, phrasemes, and proverbs of the youth sociolect, for their synchronic semantic study, and for the study of modern oral and oral-written speech. The specifics of TikTok search results do little to support a rigorous quantitative assessment of large linguistic data, but they make it convenient for the researcher to compile a sufficient sample of the linguistic phenomenon of interest.








