Philology Matters
Philology Matters · Volume 53, Issue 2 · 2025

A Machine Learning–Based Cultural Classification of Korean Idioms

DOI: 10.36078/987655532

Copyright © 2026 by the author(s). This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).

Abstract

This research introduces a structured cultural classification system for Korean idioms with the goal of enhancing their instructional value in foreign language classrooms. Recognizing idioms as deeply embedded in cultural thought and everyday communication, the study combines insights from linguistics, cultural studies, and computational methods. A corpus of 7.3 million tokens, compiled from textbooks, idiom dictionaries, and language databases, was analyzed to identify 782 idiomatic expressions, each annotated with one or more of seven cultural themes: Confucian values, agrarian life, food culture, body metaphors, family ties, communal spirit, and historical symbolism. Expert-led manual annotation achieved high inter-annotator agreement (κ = 0.82), and a KoBERT-based classifier trained for automated tagging reached a macro F1-score of 0.87. In addition to evaluating the model, the study reflects on practical teaching strategies, including how to present low-frequency but culturally rich idioms using visuals and storytelling. By aligning computational analysis with cultural pedagogy, this study supports a more meaningful and scalable approach to Korean language education and encourages the use of idioms as a gateway to intercultural understanding.

Keywords:
Korean idioms; cultural codes; machine learning; BERT; TextRank; Korean language education
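
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how a macro F1-score over multi-label theme annotations and a κ agreement statistic can be computed. It is not the authors' evaluation code. The function names, the zero-score convention for themes with no instances, and the choice of Cohen's κ for two annotators are assumptions made for illustration; the abstract does not state which κ variant was used.

```python
# Illustrative sketch only; names and conventions here are assumptions,
# not taken from the article.
THEMES = [
    "Confucian values", "agrarian life", "food culture", "body metaphors",
    "family ties", "communal spirit", "historical symbolism",
]

def macro_f1(gold, pred, labels=THEMES):
    """Unweighted mean of per-theme F1; gold and pred are lists of label sets."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # Assumed convention: a theme with no gold and no predicted
        # instances contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' binary (0/1) decisions on one theme."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Chance agreement: both say 1, or both say 0.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0
```

Because macro averaging weights every theme equally, a rare theme that the classifier handles poorly lowers the score as much as a frequent one, which is relevant when some cultural themes are represented by few idioms.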
