Appendix A. Sample Design
The major features of the sample design are described in this appendix. Sample design features include target sample size, sample allocation, sampling frame and listing, choice of domains, sampling stages, stratification, and the calculation of sample weights.
The primary objective of the sample design for the 2014 Kyrgyzstan MICS was to produce statistically reliable estimates of most indicators at the national, urban/rural and oblast levels. Urban and rural areas in each of the seven regions (Batken, Chui, Djalal-Abad, Issyk-Kul, Naryn, Osh and Talas oblasts) were defined as the sampling strata. A two-stage, stratified cluster sampling approach was used for the selection of the survey sample.
Sample Size and Sample Allocation
The sample size for the 2014 Kyrgyzstan MICS was calculated as 7,200 households. For the calculation of the sample size, the key indicator used was the stunting rate among children under five. The following formula was used to estimate the required sample size for this indicator:
$$n = \frac{4 \, r \, (1 - r) \, \mathrm{deff}}{(RME \cdot r)^{2} \, p \, \bar{n} \, RR}$$
where:
- n is the required sample size, expressed as number of households;
- 4 is a factor to achieve the 95 percent level of confidence;
- r is the predicted or anticipated value of the indicator, expressed in the form of a proportion;
- RR is the anticipated response rate;
- deff is the design effect;
- RME is the relative margin of error to be tolerated at the 95 percent level of confidence;
- p is the proportion of the subpopulation upon which the indicator, r, is based;
- \bar{n} is the average number of persons per household.
For the calculation, r (stunting rate among children under five) was assumed to be 17 percent. The value of deff (design effect) was taken as 1.5, p (proportion of children age 0 to 5 years in the total population) was taken as 14.4 percent, and the average number of persons per household was estimated as 4.5 from the sampling frame. The value of RME was taken as 0.24. Assuming a response rate of 98.5 percent, the resulting sample size was 797 households per region, which yields approximately 7,170 households in total for all 9 regions (7 oblasts plus Bishkek and Osh cities).
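The arithmetic above can be checked with a short sketch. It assumes the standard MICS sample-size expression, n = 4 r (1 - r) deff / [(RME · r)² p n̄ RR]; the function and parameter names are illustrative and do not come from the survey tools.

```python
def required_households(r, deff, rme, p, avg_hh_size, rr, conf_factor=4.0):
    """Required households for one domain, assuming the standard MICS formula.

    conf_factor=4 approximates 1.96**2 for the 95 percent confidence level.
    """
    numerator = conf_factor * r * (1.0 - r) * deff
    denominator = (rme * r) ** 2 * p * avg_hh_size * rr
    return numerator / denominator


# Parameter values quoted in the text.
n_region = required_households(
    r=0.17,           # anticipated stunting rate
    deff=1.5,         # design effect
    rme=0.24,         # relative margin of error
    p=0.144,          # share of the target subpopulation in the population
    avg_hh_size=4.5,  # average persons per household
    rr=0.985,         # anticipated response rate
)
print(round(n_region))  # 797 households per region
```

Multiplying the per-region figure by the nine regions gives a total in line with the roughly 7,170 households quoted above.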
The number of households selected per cluster for the 2014 Kyrgyzstan MICS was set at 18, based on a number of considerations, including the design effect, the budget available, and the time that would be needed per team to complete one cluster. On the basis of the total number of households divided by the number of sample households per cluster, it was determined that 400 sample clusters would need to be selected.
The sample of 400 clusters (PSUs) was initially allocated equally over the nine regions with the final sample size calculated as 7,200 households (400 clusters * 18 sample households per cluster).
Within regions the sample was allocated proportionally over urban and rural areas. The initial allocation was adjusted in two cases. First, the sample was expanded by six PSUs in Djalal-Abad oblast and reduced by the same number in Osh City. The rationale for this is that Djalal-Abad is a large region with some heterogeneity in “way of living”, whereas Osh City is much more homogeneous. Second, in Osh oblast the allocation between urban and rural areas was adjusted by increasing the urban sample by three PSUs (and reducing the rural sample by three). Table SD.1 shows the final allocation of clusters by region and area of residence.
Table SD.1: Final sample allocation
Allocation of number of clusters by region, Kyrgyzstan, 2014

| Region | Total | Urban | Rural |
|---|---|---|---|
| Total | 400 | 166 | 234 |
| Batken | 45 | 13 | 32 |
| Djalal-Abad | 50 | 16 | 34 |
| Issyk-Kul | 45 | 16 | 29 |
| Naryn | 45 | 10 | 35 |
| Osh Oblast | 45 | 8 | 37 |
| Talas | 43 | 9 | 34 |
| Chui | 45 | 12 | 33 |
| Bishkek City | 44 | 44 | - |
| Osh City | 38 | 38 | - |
Sampling Frame and Selection of Clusters
The 2009 census frame was used for the selection of clusters. Census enumeration areas were defined as primary sampling units (PSUs), and were selected from each of the sampling strata by using systematic probability-proportional-to-size sampling procedures, based on the number of households in each enumeration area from the 2009 Population and Housing Census frame.
The sample was selected in two stages. The first stage of sampling was thus completed by selecting the required number of enumeration areas from each of the seven regions, separately for the urban and rural strata. At the second stage, within the selected enumeration areas (clusters), a household listing was carried out and a systematic sample of 18 households was then drawn in each PSU.
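The first-stage selection described above can be illustrated with a minimal sketch of systematic probability-proportional-to-size (PPS) sampling within one stratum. This is a generic textbook version, not the software actually used for the survey; the enumeration-area sizes and the function name are hypothetical.

```python
import random


def systematic_pps(sizes, n_select, rng):
    """Select n_select units by systematic PPS; returns indices into sizes.

    A unit larger than the sampling interval can be hit more than once.
    """
    total = sum(sizes)
    interval = total / n_select
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    cumulative = 0.0
    idx = 0
    for k in range(n_select):
        target = start + k * interval
        # Advance until the target falls inside the current unit's size range.
        while cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical household counts for eight enumeration areas in one stratum.
ea_households = [120, 80, 200, 150, 90, 60, 110, 190]
chosen = systematic_pps(ea_households, n_select=3, rng=random.Random(2014))
print(chosen)
```

Under this scheme, an enumeration area with M_{hi} households in a stratum with M_{h} households in total is selected with probability n_{h} × M_{hi} / M_{h}, provided no single area exceeds the sampling interval.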
Listing Activities
Since the sampling frame (the 2009 census) was not up-to-date, a new listing of households was conducted in all the sample enumeration areas prior to the selection of households. For this purpose, listing teams were formed; they visited all of the selected enumeration areas and listed all households in them. The teams were provided with census enumeration area maps. A separate three-day listing training, including a pilot in both urban and rural areas, was conducted in March 2014 according to recommended MICS procedures. A total of 18 listing teams were utilised for the listing exercise to cover the 400 EAs over March and April 2014.
Selection of Households
Lists of households were prepared by the listing teams in the field for each enumeration area. The households were then sequentially numbered from 1 to n (the total number of households in each enumeration area) at the National Statistics Committee, where the selection of 18 households in each enumeration area was carried out using random systematic selection procedures.
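As an illustration of the second-stage step, the sketch below draws a systematic sample of 18 households from a sequentially numbered listing, using a random start and a fractional interval. The listing size of 137 is hypothetical, and the procedure actually used at the National Statistics Committee may differ in detail.

```python
import random


def systematic_household_sample(n_listed, n_select, rng):
    """Return 1-based household numbers chosen by systematic sampling."""
    if n_listed < n_select:
        raise ValueError("fewer listed households than required sample")
    interval = n_listed / n_select   # fractional sampling interval
    start = rng.random() * interval  # random start in [0, interval)
    # Each selection point is truncated to a position, then shifted to 1..n_listed.
    return [int(start + k * interval) + 1 for k in range(n_select)]


# Hypothetical enumeration area with 137 listed households.
households = systematic_household_sample(137, 18, random.Random(7))
print(households)
```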
Calculation of Sample Weights
The 2014 Kyrgyzstan MICS sample was not self-weighting. Essentially, different sampling fractions were used in each region since the sizes of the regions varied. For this reason, sample weights were calculated and these were used in the subsequent analyses of the survey data.
The major component of the weight is the reciprocal of the sampling fraction employed in selecting the number of sample households in that particular sampling stratum (h) and PSU (i):
$$W_{hi} = \frac{1}{f_{hi}}$$
The term f_{hi}, the sampling fraction for the i-th sample PSU in the h-th stratum, is the product of probabilities of selection at every stage in each sampling stratum:
$$f_{hi} = p_{1hi} \times p_{2hi}$$
where p_{shi} is the probability of selection of the sampling unit at stage s for the i-th sample PSU in the h-th sampling stratum. Based on the sample design these probabilities were calculated as follows:
$$p_{1hi} = \frac{n_{h} \times M_{hi}}{M_{h}},$$
where
n_{h} = number of sample PSUs selected in stratum h
M_{hi} = number of households in the 2009 Census frame for the i-th sample PSU in stratum h
M_{h} = total number of households in the 2009 Census frame for stratum h
and
$$p_{2hi} = \frac{18}{M'_{hi}}$$
where
M'_{hi} = number of households listed in the i-th sample PSU in stratum h
Since the number of households in each enumeration area (PSU) from the 2009 Census frame used for the first stage selection and the updated number of households in the enumeration area from the listing are generally different, individual overall probabilities of selection for households in each sample enumeration area (cluster) were calculated.
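A minimal sketch of the design-weight calculation for one cluster follows. It assumes the usual PPS first-stage probability (n_{h} × M_{hi} / M_{h}) and a second-stage probability of 18 divided by the number of listed households; all input figures are hypothetical.

```python
def design_weight(n_h, M_hi, M_h, M_listed_hi, hh_per_cluster=18):
    """Design weight W_hi = 1 / (p_1hi * p_2hi) for one sample cluster."""
    p1 = n_h * M_hi / M_h              # first stage: PPS selection of the EA
    p2 = hh_per_cluster / M_listed_hi  # second stage: households from the listing
    return 1.0 / (p1 * p2)


# Hypothetical stratum: 32 sample PSUs, 60,000 census households in total.
# The sampled EA had 150 households in the census and 180 at listing.
w = design_weight(n_h=32, M_hi=150, M_h=60000, M_listed_hi=180)
print(w)  # approximately 125

# If the listing count equalled the census count, the weight would depend
# only on the stratum: M_h / (n_h * 18).
w_same = design_weight(n_h=32, M_hi=150, M_h=60000, M_listed_hi=150)
```

The second call shows why cluster-level weights are needed: only when the listed and census counts agree does every cluster in a stratum receive the same weight.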
A final component in the calculation of sample weights takes into account the level of non-response for the household and individual interviews. The adjustment for household non-response in each stratum is equal to:
$$\frac{1}{RR_{h}}$$
where RR_{h} is the response rate for the sample households in stratum h, defined as the proportion of the number of interviewed households in stratum h out of the number of selected households found to be occupied during the fieldwork in stratum h.
Similarly, adjustment for non-response at the individual level (women and under-5 children) for each stratum is equal to:
$$\frac{1}{RR_{h}}$$
where RR_{h} is the response rate for the individual questionnaires in stratum h, defined as the proportion of eligible individuals (women and under-5 children) in the sample households in stratum h who were successfully interviewed.
After the completion of fieldwork, response rates were calculated for each sampling stratum. These were used to adjust the sample weights calculated for each cluster. Response rates in the 2014 Kyrgyzstan MICS are shown in Table HH.1 in this report.
The non-response adjustment factors for the individual women and under-5 questionnaires were applied to the adjusted household weights. Numbers of eligible women and under-5 children were obtained from the roster of household members in the Household Questionnaire for households where interviews were completed.
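The two-step non-response adjustment can be sketched as follows, assuming each adjustment factor is the reciprocal of the corresponding stratum response rate. The counts and the starting design weight are hypothetical.

```python
def response_rate(n_completed, n_eligible):
    """Proportion of eligible units that were successfully interviewed."""
    if n_completed <= 0 or n_eligible <= 0:
        raise ValueError("counts must be positive")
    return n_completed / n_eligible


def adjusted_weights(design_w, hh_completed, hh_occupied,
                     ind_completed, ind_eligible):
    """Return (household weight, individual weight) after non-response adjustment."""
    hh_w = design_w / response_rate(hh_completed, hh_occupied)
    # The individual-level factor is applied on top of the adjusted household weight.
    ind_w = hh_w / response_rate(ind_completed, ind_eligible)
    return hh_w, ind_w


# Hypothetical stratum: 570 of 576 occupied households interviewed,
# 686 of 700 eligible women interviewed, design weight of 125.
hh_w, women_w = adjusted_weights(125.0, 570, 576, 686, 700)
print(round(hh_w, 2), round(women_w, 2))
```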
The design weights for the households were calculated by multiplying the inverse of the probabilities of selection by the non-response adjustment factor for each enumeration area. These weights were then standardized (or normalized), one purpose of which is to make the weighted sum of the interviewed sample units equal to the total sample size at the national level. Normalization is achieved by dividing the full sample weights (adjusted for non-response) by the average of these weights across all households at the national level. This is performed by multiplying the sample weights by a constant factor equal to the unweighted number of households at the national level divided by the weighted total number of households (using the full sample weights adjusted for non-response). A similar standardization procedure was followed in obtaining standardized weights for the individual women, men, and under-5 questionnaires. Adjusted (normalized) weights varied between 0.107229 and 3.434932 in the 400 sample enumeration areas (clusters).
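The normalization step can be illustrated with a short sketch. It takes one full (non-response-adjusted) weight per interviewed household and rescales the weights so that they sum to the number of households; the five weights shown are hypothetical.

```python
def normalize_weights(weights):
    """Rescale weights so their sum equals the unweighted number of units."""
    # Constant factor: unweighted count divided by the weighted total,
    # which is the same as dividing each weight by the mean weight.
    factor = len(weights) / sum(weights)
    return [w * factor for w in weights]


# Hypothetical full weights, one per interviewed household.
raw = [104.2, 125.0, 126.3, 98.7, 310.5]
norm = normalize_weights(raw)
print(sum(norm))  # approximately 5, the number of households
```

Because every weight is multiplied by the same constant, the relative sizes of the weights, and therefore weighted proportions and means, are unchanged by normalization.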
Sample weights were appended to all data sets and analyses were performed after weighting households, women, or under-5s with these sample weights.