§ 03
Corpus Data

Download the subcorpora.

Each country-based subcorpus of the Malaysia Tourism Review Corpus (MaTRiC) can be accessed below. Sign in with your account to download. The corpus is provided free of charge for research and educational purposes. Please acknowledge the corpus when using it in any research output.

Subcorpora: 2
Words: 9,173,914
Time period: 2012–2022
Source: TripAdvisor

Sign in required

Downloads are free, but you must sign in first.

Create a free account so we can track usage for grant reporting. You can still browse this page and use the search without signing in.

United Kingdom Corpus

This subcorpus contains 7,199,608 words from TripAdvisor reviews written by tourists from the United Kingdom.

TripAdvisor, 2012–2022
Words: 7,199,608

United States Corpus

This subcorpus contains 1,974,306 words from TripAdvisor reviews written by tourists from the United States.

TripAdvisor, 2012–2022
Words: 1,974,306

Acknowledgement requirement

The data were obtained from the Malaysia Tourism Review Corpus (MaTRiC). Available at: [website link]. All rights in the corpus data are reserved.