Data Overview
The Newborn Cohort Data (NCD) is part of the Thailand Childhood Longitudinal Survey (TCLS), the result of a close collaboration between the Research Institute for Policy Evaluation and Design (RIPED) at the University of the Thai Chamber of Commerce and the Equitable Education Fund (EEF). NCD is a comprehensive annual panel dataset from rural areas of Thailand. The survey started in June 2021, targeting newborn children in two regions of Thailand: Mahasarakham province in the northeast (a relatively poor region) and Narathiwat, Pattani, and Yala provinces in the south (with a Muslim majority).
Structure of the Questionnaire
The questionnaire (QN) comprises five main parts: household QN, pregnant-conditions QN, children QN, child-development QN, and school QN.
1. Household QN was adapted from the Annual Townsend Thai Data Survey and the Thailand Socio-Economic Survey (SES). It collected detailed information about each individual in the household, e.g., jobs, education, age, gender, labor income, and occupations. Importantly, this survey asked about each household member’s working time, sleeping time, leisure time, and health problems. The dataset also contains comprehensive household information, e.g., land ownership, assets, expenditures, remittances, income from business, agriculture, livestock, and government assistance. More recently, it has also elicited caregivers’ risk preferences, time preferences, and social preferences, and assessed other essential skills or characteristics (e.g., IQ, digit span memory, personality traits).
2. Pregnant-conditions QN asked the mother for retrospective information about her pregnancy and retrieved information from the mother-child booklet (recorded by health care providers). Mothers were interviewed about depression during pregnancy, and information about the mother, health checkups during pregnancy, and delivery conditions was extracted from the booklet. This part was administered in the baseline survey only.
3. Children QN asked for information about parental time investment (e.g., reading, singing, playing, holding, child-rearing time), parental material investment (e.g., ownership of books, jigsaws, Lego), and nutritional intake (e.g., foods, milk, breastfeeding). This part of the QN was adapted from several sources, including the Cohort Study of Thai Children, the World Health Organization Quality of Life, the National Educational Panel Study, and the Early Childhood Longitudinal Program. As in the pregnant-conditions QN, we extracted information about the child’s vaccinations, health checkups, and delivery conditions (gestation duration, birth weight, body length at birth) from the mother-child booklet (recorded by health care providers).
4. Child-development QN interviewed primary caregivers regarding the child’s Gross Motor, Fine Motor, Receptive Language, Expressive Language, and Personal and Social development (based on the DSPM developed by the Ministry of Public Health of Thailand). Each child has been, and will continue to be, directly assessed at regular intervals using standard instruments and measures, e.g., the DENVER II, the Wechsler Intelligence Scale for Children (WISC), and assessments of executive functions (EF), school readiness, and economic preferences.
5. School QN collected basic information from school directors and teachers. This part of the QN will start once the children are in childcare centers. This QN may include classroom observations if the budget permits.
Sampling Framework
The survey covers 38 Tambons, or subdistricts, in two regions (the northeast with low fertility, and the far south with high fertility): 26 in the northeast (Mahasarakham province) and 12 in the south (Narathiwat, Pattani, and Yala provinces). Note that a Tambon is the smallest official local governmental organization in Thailand. Each Tambon in the survey area consists of 4 to 24 villages.
We targeted children born between May 16, 2020, and May 15, 2021. This period follows the official cutoff date for school entry in Thailand (May 16 of each year). These eligible children are referred to as “newborns”.
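The eligibility rule above can be written as a simple date check. This is a minimal sketch, not the survey team's own code: the function and constant names are hypothetical, and treating both endpoints of the window as inclusive is an assumption.

```python
from datetime import date

# Birth window stated in the text for the first (2021) cohort.
# Treating both endpoints as inclusive is an assumption.
WINDOW_START = date(2020, 5, 16)
WINDOW_END = date(2021, 5, 15)


def is_eligible_newborn(birth_date, start=WINDOW_START, end=WINDOW_END):
    """Return True if birth_date falls inside the inclusive birth window."""
    return start <= birth_date <= end


# A child born on the first day of the window is eligible;
# one born a day earlier belongs to the previous school-entry year.
first_day = is_eligible_newborn(date(2020, 5, 16))
day_before = is_eligible_newborn(date(2020, 5, 15))
```

Passing different `start` and `end` values would let the same check describe a later cohort, if its window is known.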
In 2021, we first requested the administrative data of newborns in each village from local healthcare centers. Based on this data, we excluded villages with fewer than two newborns on record to minimize logistical costs. After the exclusion, there were 162 eligible villages. We then ranked the villages within each region by the number of newborns. We chose 100 villages for the northeast and 50 for the south. Note that the number of villages in the northeast is larger because it has significantly lower fertility.
To confirm the number of eligible children in each village, we asked village health volunteers to collect data on the number of newborns in each of those 150 villages. We then set a target sample size of 250 children for each region. The target sample size for each village was set in proportion to the village’s share of the region’s newborns, scaled to the region’s target. Within each village, we chose the sample by simple random sampling. If a targeted household refused the interview, or the primary caregiver was unavailable after three attempts during the survey period, a replacement was randomly chosen from the same village unless no eligible children remained. If no eligible children were available in the village, we randomly selected from the rest of the villages in the region until we reached the target. The 2021 survey round’s newborn cohort sample consisted of 507 newborns from 484 households.
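The allocation and within-village draw described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure the survey team ran: the function names and village counts are made up, and the largest-remainder rounding is an assumption, since the text does not say how fractional village targets were rounded.

```python
import random


def allocate_targets(newborn_counts, region_target):
    """Split region_target across villages in proportion to newborn counts.

    Rounding uses the largest-remainder method, which is an assumption;
    the source does not say how fractional quotas were rounded.
    """
    total = sum(newborn_counts.values())
    quotas = {v: region_target * n / total for v, n in newborn_counts.items()}
    targets = {v: int(q) for v, q in quotas.items()}
    leftover = region_target - sum(targets.values())
    # Give the remaining slots to the villages with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda v: quotas[v] - targets[v], reverse=True)
    for v in by_remainder[:leftover]:
        targets[v] += 1
    return targets


def draw_sample(child_ids, target, rng):
    """Simple random sample without replacement, capped at the village size."""
    k = min(target, len(child_ids))
    return rng.sample(child_ids, k)


# Made-up counts for three hypothetical villages; not survey data.
counts = {"village_a": 7, "village_b": 5, "village_c": 3}
targets = allocate_targets(counts, region_target=10)

rng = random.Random(0)  # fixed seed so the draw is reproducible
children_a = [f"a{i}" for i in range(counts["village_a"])]
sample_a = draw_sample(children_a, targets["village_a"], rng)
```

Replacement after a refusal could be handled by drawing again from the village's children who were not yet selected, and falling back to other villages in the region once that pool is empty.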
The 2022 survey round added a new cohort of newborns from the same villages, plus 48 additional villages in the northeast. The additional villages were needed to reach the target of around 250 children per region per cohort, because the fertility rate in the northeast has been consistently low. In total, the cohort entering the survey in 2022 comprised 529 children from 503 households. The first cohort was also reinterviewed in the same round: we reinterviewed the caregivers of 430 of the 507 children (an attrition rate of 15%).
For the 2023 survey round, we added a new cohort of newborns from the same 198 villages. The cohort entering in 2023 comprised 382 children from 334 households. We also reinterviewed the caregivers of 843 of the 959 children from the earlier cohorts (an attrition rate of 12%). The table below shows the sample size in each survey round.
Because of limited resources, no new cohort of newborns will be added after 2023. However, to enhance the dataset’s potential for studying spillover effects within households, we added siblings of the sample children who were no older than 9 years of age during the 2024 survey round. The total number of siblings in the 2024 survey round was 626. We also plan to add any child subsequently born into a sample household. However, to minimize the burden on caregivers in answering questions, we limit the total number of sample children in a household to no more than four.
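The attrition rates and yearly totals reported for 2022 and 2023 can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts stated in the text; the function name is hypothetical, and defining attrition as the share of the previous round's children not reinterviewed is an assumption about how the rates were computed.

```python
def attrition_rate(reinterviewed, previous_total):
    """Share of the previous round's children whose caregivers were not reinterviewed."""
    return 1 - reinterviewed / previous_total


# Counts taken from the text.
rate_2022 = attrition_rate(430, 507)  # first cohort followed up in 2022
rate_2023 = attrition_rate(843, 959)  # both earlier cohorts followed up in 2023

# Children in each round = reinterviewed children + newly added cohort.
children_2022 = 430 + 529
children_2023 = 843 + 382

print(f"2022 attrition: {rate_2022:.1%}, children: {children_2022}")
print(f"2023 attrition: {rate_2023:.1%}, children: {children_2023}")
```

Under that definition the rates come to roughly 15.2% and 12.1%, consistent with the rounded 15% and 12% in the text, and the totals of 959 and 1,225 children match the sample size table.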
Sample Size in Each Survey Year
Year | Children | Households
2021 | 507 | 484
2022 | 959 | 920
2023 | 1,225 | 1,141
Map showing Survey Tambons

Funding
The survey has been continuously supported by the Equitable Education Fund (EEF).

