3.2 Data disaggregation as an entry point for further understanding
Data disaggregation is an entry point for understanding how gender affects men, women and people with non-binary identities. It differs from data aggregation, in which quantitative data is collected and expressed in summary form. While data aggregation can show important overall trends, it can mask key differences between and within subgroups of individuals.
To begin an intersectional gender analysis, data first needs to be disaggregated by sex or gender (2,47). Sex or gender disaggregation means that the information collected distinguishes between men, women and people with non-binary identities.
To ensure research incorporates an intersectional perspective, data needs to be disaggregated by other social categories in addition to sex or gender, including age group, racial group, income status, etc. Research participants should be selected to vary across these social stratifiers in both quantitative and qualitative studies. This implies a deliberate effort to collect data and perform analyses with gender and other social stratifiers in mind.
Data disaggregation is meant to act as a trigger that encourages deeper reflection, investigation and action. Simply disaggregating data by sex or gender identity is not gender analysis. Gender analysis can occur in both sex- or gender-specific studies (where only men, only women or only people with non-binary identities are included, for example) and sex- or gender-disaggregated studies (where men, women and people with non-binary identities are all included). When sex or gender disaggregation does occur, it usually compares only men and women. Very few data systems include gender identities beyond men and women as a routine variable or demographic (2).
Within intersectionality, research samples can be either inter-categorical or intra-categorical.
• Inter-categorical samples are similar to sex- and gender-disaggregated samples as they include multiple social groups and compare experiences across groups, e.g. men’s and women’s vulnerability to disease exposure.
• Intra-categorical samples are similar to sex- and gender-specific samples in that they focus on one social group only and analyse the experiences of that one group, e.g. adolescent girls’ vulnerability to disease exposure (50).
Disaggregation needs to be maintained throughout the research, rather than being aggregated at higher levels (2). This is important as aggregated data sets can mask differences between different groups (both within and between the sexes), which can lead to “assumptions that all people share the same experiences. This bias can affect the validity and reliability of research in negative ways” (47,51).
For example, if you are exploring vulnerability or exposure to infectious disease, sex disaggregation will allow you to see whether a disease is more prevalent among men or women, while disaggregating data by both sex and age will allow you to see whether certain age groups are more affected among men or among women. It is also often necessary to reanalyse available aggregated data in a disaggregated manner to uncover these differences. This information will allow you to tailor subsequent interventions accordingly, e.g. by focusing on the groups that are most vulnerable.
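The difference between an aggregated figure and progressively disaggregated ones can be sketched in a few lines of Python. This is a minimal illustration only: the records, the field names (`sex`, `age_group`, `positive`) and the helpers `expand` and `prevalence_by` are hypothetical and are not taken from any study cited in this section.

```python
from collections import defaultdict


def expand(sex, age_group, positives, total):
    """Build `total` hypothetical individual records, `positives` of them positive."""
    return [
        {"sex": sex, "age_group": age_group, "positive": i < positives}
        for i in range(total)
    ]


# Invented counts, used purely to illustrate the mechanics.
records = (
    expand("female", "11-20", 2, 10)
    + expand("female", "21-30", 1, 10)
    + expand("male", "11-20", 5, 10)
    + expand("male", "21-30", 2, 10)
)


def prevalence_by(rows, *stratifiers):
    """Return {group key: share of positive records} for the given stratifiers.

    With no stratifiers, one aggregated figure is returned under the key ().
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [positives, total]
    for row in rows:
        key = tuple(row[s] for s in stratifiers)
        counts[key][0] += int(row["positive"])
        counts[key][1] += 1
    return {key: pos / total for key, (pos, total) in counts.items()}


aggregated = prevalence_by(records)                     # one overall figure
by_sex = prevalence_by(records, "sex")                  # disaggregated by sex
by_sex_and_age = prevalence_by(records, "sex", "age_group")  # sex and age
```

In this invented sample the single aggregated figure hides the gap between the sexes, and the breakdown by sex in turn hides the variation between age groups within each sex. Keeping individual-level records, rather than only summary totals, is what makes it possible to add further stratifiers later.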
Data that is disaggregated by sex or gender and other social stratifiers (including both quantitative and qualitative data) can help researchers to examine various factors of the disease process, such as those shown below (52):
• Who gets ill (different ages, sexes, ethnic groups and socio-economic groups)?
• What types of illness do men, women and people with non-binary gender identities get?
• When do they get sick?
• Where do they get sick the most (place of work or specific regions)?
A study exploring the prevalence and risk factors of schistosomiasis among Hausa communities in Kano State, Nigeria found that the prevalence of schistosomiasis was much higher among men (20.6%) than women (13.3%) in the sample (53). Disaggregation by age showed that prevalence was highest among the 11-20 age group (27.4%), followed by the 21-30 age group (14.4%) (see Figure 10).
While these stratifiers were explored separately and one can surmise that prevalence is highest among men aged 11-20, an intersectional analysis would combine these categories to explore prevalence among men and women within different age groups, which would potentially tell a different story. There is a need here to disaggregate data across different social stratifiers, moving beyond single categories to explore the intersection of social stratifiers.
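A small worked sketch shows why separate breakdowns cannot stand in for a combined one. The counts below are invented for illustration and are not the Kano State data; the labels, the `cells` table and the `marginal` helper are all hypothetical.

```python
# Hypothetical (positives, sample size) for each sex x age group cell.
cells = {
    ("men", "11-20"): (20, 100),
    ("men", "21-30"): (30, 100),
    ("women", "11-20"): (35, 100),
    ("women", "21-30"): (5, 100),
}


def marginal(table, position):
    """Prevalence per level of a single stratifier (0 = sex, 1 = age group)."""
    totals = {}
    for key, (positives, n) in table.items():
        level = key[position]
        p, t = totals.get(level, (0, 0))
        totals[level] = (p + positives, t + n)
    return {level: p / t for level, (p, t) in totals.items()}


by_sex = marginal(cells, 0)   # stratifier explored on its own
by_age = marginal(cells, 1)   # stratifier explored on its own
by_sex_and_age = {key: positives / n for key, (positives, n) in cells.items()}

# Group one might surmise is worst affected from the two separate breakdowns.
surmised = (max(by_sex, key=by_sex.get), max(by_age, key=by_age.get))
# Group that is actually worst affected once the stratifiers are combined.
observed = max(by_sex_and_age, key=by_sex_and_age.get)
```

With these invented counts, prevalence is higher among men than women and higher in the 11-20 group than the 21-30 group, yet the highest prevalence in the combined table is among women aged 11-20, not men aged 11-20. Only the cross-tabulated figures reveal this.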
Data exploring the incidence of HIV in sub-Saharan Africa by age and sex in 2013 show that while the majority of new HIV infections occur in adults aged 25-49, the proportion of new infections is much higher among young women and adolescent girls aged 15-24 than among men (54). A gender analysis would explore the reasons for young women’s and adolescent girls’ increased vulnerability and ensure that this age group is not left behind in efforts to reduce HIV infection rates among the population as a whole.