linyu329/Lin_Yu_Hsuan_project

Exploring the Differences of Spatial Terms in Chinese and English Language Networks.

不同文化中的語言網路在空間詞中差異探索

By Lin Yu-Hsuan, BrainHack School 2023

Content
  1. Project Description
  2. Getting Started
  3. Results
  4. Conclusions
  5. Future Work
  6. References

Project Description

Cultural differences shape how we express ourselves in language, but does that mean our perception is also affected? Researchers have conducted behavioral studies that group participants by their native language and their first foreign language, and the results correlate with the pragmatic habits of the different cultures.

Unfortunately, current research on spatial differences in cognition mostly focuses on differences in language presentation. Whether these differences are reflected in more fundamental brain representations remains an unexplored territory. Therefore, this article aims to present the differences in the use of spatial terms in Chinese and English texts through text analysis, and further investigate whether these linguistic differences are reflected in brain representations.

Background

Benjamin Lee Whorf, an American linguist, proposed the "Linguistic Relativity Hypothesis" in 1935, attempting to explain whether the rules of language affect how we think about and perceive things in daily life. For example, different languages often use spatial terms to aid their understanding of the abstract concept of time. In Chinese, words associated with the front are frequently used to refer to the future, such as "前途" and "前程" (both refer to future prospects). Conversely, words associated with the back are used to refer to the past, such as "背景" (background). English exhibits similar patterns, such as "look forward to" and "background." This article therefore takes this as a starting point to explore how spatial terms in language shape the perspectives through which humans perceive things.

Introduction

Previous research has shown that when presented with infrequent stimuli, native speakers of different languages exhibit different prototypes for sounds, which can be observed in electrical brain responses through the mismatch negativity (MMN) (Näätänen et al., 1997). This suggests that different native languages have distinct prototypes for recognizing speech sounds, and that the brain's language processing system may represent the same linguistic phenomenon differently depending on language usage.

Another study on the spatial differences between Chinese and English found that Chinese often uses vertical concepts to represent temporal sequence (Boroditsky, 2008). For example, expressions like "下一周" (next week, literally "lower week") and "上個學期" (previous semester, literally "upper semester") embody this characteristic, which is also reflected in the top-to-bottom writing style prevalent in Chinese. Researchers have also conducted behavioral studies that group participants by their native language and their first foreign language and examine how they represent their understanding of time. By asking questions like "If today is here, where is tomorrow?", the researchers found that native Chinese speakers are more likely than native English speakers to indicate relative positions using vertical arrangements.

Unfortunately, current research on spatial differences in cognition mostly focuses on differences in language presentation. We cannot rule out the possibility that language has not developed alternative ways of expression. Whether these differences are reflected in more fundamental brain representations remains an unexplored territory. Therefore, this article aims to present the differences in the use of spatial terms in Chinese and English texts through text analysis, and further investigate whether these linguistic differences are reflected in brain representations.

Main Objectives

  1.   ✅ Investigate the collocation of various spatial terms using the bilingual text of "The Little Prince."
  2.   ✅ Examine the differences in the frequency of spatial terms to explore how Chinese and English texts represent space differently.
  3.   ✅ Examine the differences in the collocation patterns of spatial terms between Chinese and English.
  4.   ⬜ Match the results with the fMRI brain database.

Data

In this research, two main methodologies are involved. Firstly, a case study is conducted using the texts of "The Little Prince" in both Chinese and English versions, aiming to understand whether there are different ways of expressing spatial concepts in these two languages. Secondly, an open brain database available on the internet is utilized to study brain representations, with the goal of investigating whether these spatial expression patterns impact our brain representations.

Based on the "spatial sentence structure" in Chinese, I will use the spatial terms identified in previous research as the basis for this study. These terms include "shang 上 'up', xia 下 'down', qian 前 'in front of', hou 後 'behind', li 裡 'in', nei 內 'in', zhong 中 'in', and wai 外 'outside'" (Chen, 2020).
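
As a minimal sketch of how these terms might be counted, the snippet below tallies each particle in a Chinese string character by character. The sample sentence and the function name `count_spatial_terms` are illustrative assumptions, not taken from the project scripts. Character-level counting deliberately skips word segmentation, so it would also count these characters when they occur inside longer words.

```python
from collections import Counter

# Spatial particles listed above (Chen, 2020), mapped to their English glosses.
SPATIAL_TERMS = {
    "上": "up",
    "下": "down",
    "前": "in front of",
    "後": "behind",
    "裡": "in",
    "內": "in",
    "中": "in",
    "外": "outside",
}


def count_spatial_terms(text):
    """Count how often each spatial particle occurs in `text` (character level)."""
    counts = Counter(ch for ch in text if ch in SPATIAL_TERMS)
    # Report every term, including those that never occur.
    return {term: counts.get(term, 0) for term in SPATIAL_TERMS}


# Hypothetical sample sentence, not taken from the corpus.
sample = "小王子在星球上看日落，花在玻璃罩裡，羊在箱子裡。"
freq = count_spatial_terms(sample)
print(freq)
```

Returning a count for every term, including zeros, keeps the Chinese and English frequency tables aligned when they are compared side by side.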

Furthermore, a publicly available fMRI database is accompanied by an article that uses "The Little Prince" as cross-linguistic data. The study collected fMRI brain scans from 49 native English speakers, 35 native Chinese speakers, and 28 native French speakers while they listened to the story of "The Little Prince" (Stehwien et al., 2020).

Based on the content taught in this course, I will attempt to depict the co-occurrence of various spatial terms in different languages and illustrate the corresponding brain activation patterns when these spatial terms are encountered. The brain imaging data come from an open dataset on OpenNeuro. Details of the brain data are reported at https://openneuro.org/datasets/ds003643

Project Deliverables

✅ Text collocation scripts & pics
✅ Get fMRI database
✅ Presentation slides
✅ Project report

Getting Started

Prerequisites

You can run the scripts in Google Colab; no additional modules need to be installed.

Project Tree

Each script contains a more detailed description of its method.

├── readme.md
├── analyzing_fMRI_data 
|   ├── fMRI_CH               ==>   Use the merged data acquired from get_CH_fMRI then perform some analysis
|   └── fMRI_EN               ==>   Use the merged data acquired from get_EN_fMRI then perform some analysis
├── Collocation
|   ├── the_little_prince_ch  ==>   The Little Prince Chinese corpus
|   ├── the_little_prince_en  ==>   The Little Prince English corpus
|   ├── Final_CH              ==>   Find collocation words, then plot and analyze
|   └── Final_EN              ==>   Find collocation words, then plot and analyze
├── get_fMRI_data
|   ├── get_CH_fMRI           ==>   Download fMRI data from openneuro dataset and calculate and average data then save as a zip file (Chinese participants)
|   └── get_EN_fMRI           ==>   Download fMRI data from openneuro dataset and calculate and average data then save as a zip file (English participants)
|
└── Project_final_brainhack.pptx

Results

1. Collocation of the context

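[Figures: collocation plots for Up, Down, In Front of, Behind]

[Figures: collocation plots for Up, Down, In Front of, Behind]

[Figures: collocation plots in_01, in_02, in_03, Outside]

Note that 「裡」, 「內」, and 「中」 all mean "in" in English.

[Figures: collocation plots in_03, Outside]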
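
The collocation figures in this section are built from co-occurrence counts. The sketch below shows one common way to obtain such counts, using a fixed-size window around each occurrence of a target word. The window size, the whitespace tokenization, the sample sentence, and the function name `collocates` are assumptions for illustration; the project scripts may use a different measure.

```python
from collections import Counter


def collocates(tokens, target, window=2):
    """Count tokens appearing within `window` positions of each `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        # Neighbours on both sides, excluding the target position itself.
        counts.update(tokens[lo:i] + tokens[i + 1:hi])
    return counts


# Hypothetical sentence, split on whitespace for simplicity.
tokens = "the little prince looked up at the stars and climbed up the hill".split()
up_collocates = collocates(tokens, "up", window=2)
print(up_collocates.most_common(3))
```

For Chinese text the same function applies once the text has been segmented into a list of tokens, which is where manual cleaning of the segmentation matters.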

2. Frequency of the spatial terms

[Figure: EN_freq]

[Figure: CH_freq]


3. Attempt with fMRI data
Below is the first ROI of the averaged fMRI data from 9 English and Chinese participants after applying an atlas masker. More fMRI analysis attempts can be found in analyzing_fMRI_data\fMRI_EN and analyzing_fMRI_data\fMRI_CH.

[Figure: EN_brain]

[Figure: CH_brain]
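
To make the "atlas masker" step concrete, the sketch below averages voxel time series within each atlas label, which is the basic operation a label-based masker performs. It uses plain Python lists and invented numbers; the function name, the variable names, and the toy data are assumptions, and the project scripts presumably operate on the downloaded brain images rather than on lists.

```python
def roi_time_series(voxel_series, atlas_labels):
    """Average voxel time series within each atlas label.

    voxel_series: list of per-voxel signals, each a list with one value per volume.
    atlas_labels: list of integer ROI labels, one per voxel (0 = background).
    Returns {label: mean time series across the voxels carrying that label}.
    """
    grouped = {}
    for series, label in zip(voxel_series, atlas_labels):
        if label == 0:  # skip voxels outside every ROI
            continue
        grouped.setdefault(label, []).append(series)
    return {
        label: [sum(vals) / len(vals) for vals in zip(*members)]
        for label, members in grouped.items()
    }


# Toy data: 4 voxels x 3 volumes (values are invented for illustration).
voxels = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [10.0, 10.0, 10.0],
    [0.0, 0.0, 0.0],
]
labels = [1, 1, 2, 0]
roi_means = roi_time_series(voxels, labels)
print(roi_means[1])
```

The "first ROI" shown in the figures would correspond to one such averaged time series, here `roi_means[1]`.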

Conclusions

  1. Tools I learned from the project

    • Python scripting
    • Use Python to analyze spatial terms in text and corpus.
    • Extract specific data from open brain database.
    • Understand how categories differ between languages

  2. The relationship between various spatial terms

    • Spatial terms in the EN version refer to some abstract concepts
    • Different languages prefer different spatial terms for the same object

  3. Data analysis of Chinese corpus

    • Be careful with Chinese segmentation; clean the data manually if needed

  4. The connection between literature, programming, and even neuroscience.

Future Work

  • Concern
    • The temporal resolution of fMRI (t_r=2) is not suitable for studying short-lived differences in the brain (a spatial term may last only about 0.2 sec)
  • Next Steps
    • Find the corresponding spatial terms in the database
    • Compare the differences in brain responses to these spatial terms between Chinese and English language users
  • Discussion
    • To match the brain database, the accuracy of the collocation plots decreases
    • How to infer results from corpus data to brain data?
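
The timing concern can be made concrete with a little arithmetic. Assuming t_r=2 means one fMRI volume every 2 seconds, the sketch below maps word onsets to volume indices and shows how many words share a single volume. The onset times are invented and the function name `volume_index` is hypothetical.

```python
T_R = 2.0            # seconds per fMRI volume (t_r=2, as noted in the Concern item)
WORD_DURATION = 0.2  # approximate duration of one spatial term, in seconds


def volume_index(onset_sec, t_r=T_R):
    """Return the index of the volume being acquired at `onset_sec`."""
    return int(onset_sec // t_r)


# Invented word onsets in seconds, for illustration only.
onsets = {"the": 0.1, "prince": 0.4, "looked": 0.9, "up": 1.3, "at": 1.6, "stars": 2.2}

words_per_volume = {}
for word, onset in onsets.items():
    words_per_volume.setdefault(volume_index(onset), []).append(word)

# Fraction of a single volume that one spatial term occupies.
fraction = WORD_DURATION / T_R

print(words_per_volume)
print(f"one term covers {fraction:.0%} of a volume")
```

With these numbers, five words fall into the first volume, so the signal for "up" cannot be separated from its neighbours without further modelling.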

References

  1. Whorf, Benjamin Lee (1956) [1936?]. “An American Indian model of the universe”. In Carroll, J. B. (ed.). Language, Thought, and Reality: Selected Writings of Benjamin Lee Whorf. Cambridge, Massachusetts: Technology Press of Massachusetts Institute of Technology. pp. 57–64.
  2. Boroditsky L., Gaby, A. (2010). “Remembrances of times East: Absolute Spatial Representations of time in an Australian Aboriginal Community.” Sci.
  3. Boroditsky, L. (2008). Do English and Mandarin Speakers think Differently about Time? Proceedings of the 30th annual conference of the cognitive science society. pp. 64-70
  4. Chen, Alvin Cheng-Hsien. "Words, constructions and corpora: Network representations of constructional semantics for Mandarin space particles" Corpus Linguistics and Linguistic Theory, vol. 18, no. 2, 2022, pp. 209-235. https://doi.org/10.1515/cllt-2020-0012
  5. Stehwien, S., Henke, L., Hale, J., Brennan, J., & Meyer, L. (2020, May). The Little Prince in 26 languages: Towards a multilingual neuro-cognitive corpus. In Proceedings of the Second Workshop on Linguistic and Neurocognitive Resources (pp. 43-49).
  6. Näätänen, R., Lehtokoski, A., Lennes, M., Cheour, M., Huotilainen, M., Iivonen, A., ... & Alho, K. (1997). Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385(6615), 432-434.
  7. Thierry, G., Athanasopoulos, P., Wiggett, A., Dering, B., & Kuipers, J. R. (2009). Unconscious effects of language-specific terminology on preattentive color perception. Proceedings of the National Academy of Sciences, 106(11), 4567-4570.
  8. Li, J., Bhattasali, S., Zhang, S., Franzluebbers, B., Luh, W. M., Spreng, R. N., ... & Hale, J. (2021). Le Petit Prince: A multilingual fMRI corpus using ecological stimuli. bioRxiv, 2021-10.