Introduction
Self-esteem is a social and psychological concept that represents a person’s overall sense of value or worth, essentially measuring the extent to which a person appreciates, approves, and likes himself or herself. From the perspective of measurement, the construct is especially hard to identify due to the variety of factors contributing to the development of a certain level of self-esteem among individuals. For instance, Tomás, Oliver, Hontangas, Sancho, & Galiana (2015) mentioned that gender differences were the most prominent variables affecting the various levels of people’s self-esteem.
Self-esteem has been directly associated with the idea of self-concept, which is attributed to the thoughts “derived from individuals’ own beliefs about themselves, and form assumptions about how others view them” (Thomas, Murakami-Brundage, Bertolami, Beck, & Grant, 2018, p. 173). In its essence, self-concept is an important component to consider due to its dialogical nature – it can change and vary depending on the continuous dialogue a person may have with himself or herself. Therefore, self-concept is directly related to self-esteem because it explains the ways in which people make sense of their experiences and respond to them.
In the present study, the subject of self-esteem will be approached through the individual perspective, which means that cultural differences will not be studied. The aim of the research is to develop a self-esteem measurement scale that will consider the existing limitations and provide a comprehensive method that can be used across different populations. The universality of a self-esteem measurement scale is the goal that is expected to be reached at the end of the study due to the need to measure the concept across a broad range of individuals that may have different experiences, self-perceptions, and ideas of how they view their self-esteem.
Review of Literature
The study of self-esteem measurement has not been particularly extensive in the literature, with Rosenberg’s Self-Esteem Scale (RSES) being the most widespread subject of research. Tomás et al. (2015) focused on exploring method effects and gender invariance of the RSES. It was revealed that the scale showed small but significant differences by gender when measuring the self-esteem of male and female samples. In addition, the scale was shown to favor male respondents over female respondents, which pointed to the need for a scale that favors neither gender and offers a neutral perspective on measuring self-esteem.
Salerno, Ingoglia, and Lo Coco (2017) aimed to test nine alternative RSES factor structures in a sample of participants with eating and weight disorders as well as to establish the scale’s measurement invariance across clinical and non-clinical groups. This research is significant for its attention to a specific population of individuals who may be expected to have difficulty reporting at least average levels of self-esteem. It was found that the structure of Rosenberg’s Scale led to method-effect contamination associated with item wording. In addition, the researchers concluded that both clinical and non-clinical groups of respondents interpreted the subject of self-esteem within the RSES in substantially similar ways.
The use of Rosenberg’s Self-Esteem Scale has been studied in terms of its application to a wide population of individuals with varying characteristics. While the study mentioned previously targeted a female sample with weight and eating disorders, Sharratt, Boduszek, Jones, and Gallagher (2014) focused on the population of children of prisoners, which is a unique sample for research. The scholars revealed that the RSES was best represented by a bifactor model of self-esteem measurement. The general factor of self-esteem was developed on the basis of ten scale items as well as “separate method effects for both positively and negatively phrased items” (Sharratt et al., 2014, p. 228). The research revealed that the RSES was appropriate for measuring self-esteem among children of prisoners; however, it was necessary to account for the presence of method effects in order to avoid false interpretations.
When it comes to the differences between individual- and culture-level self-esteem, it is worth mentioning the research conducted by Alessandri, Cenciotti, Łaguna, Różycka-Tran, and Vecchione (2017). The researchers aimed to find differences between the reported RSES factors associated with culture- and individual-level self-esteem (factor isomorphism) due to the lack of studies on this subject. After administering the RSES to samples of individuals across thirty-seven countries, the scholars found cross-cultural stability of the RSES’s bifactor structure, which was present at both the culture and individual levels. Nevertheless, only the factor of general self-esteem showed a significant degree of isomorphism.
Xu and Leung’s (2018) research was significant for the scholars’ attempt to identify “the effects of varying numbers of Likert scale points on RSES’s factor structure” (p. 118). This topic was important because Likert-based scales are among the most widespread tools in psychological studies, and the number of response categories was expected to affect data distribution, response styles, reliability, and other factors. The analysis showed that 4-point scales were not recommended for use in the measurement of self-esteem because of their increased skewness and lower loadings (Xu & Leung, 2018). Eleven-point scales ranging from 0 to 10 proved more appropriate than 4-point measurements because of their higher composite reliability and loadings.
In contrast to the RSES, the Beck Self-Esteem Scale-Short Form (BSES-SF) approaches the measurement of self-esteem from the perspective of comparison: it asks respondents to rate their own self-esteem in comparison with the skills and knowledge of other people. This measurement tool has been shown to be effective when measuring self-esteem in a unique sample of participants. For instance, Thomas et al. (2018) used the BSES-SF with individuals with schizophrenia spectrum disorders. The attention to such conditions is important in the context of self-esteem measurement because of the variability of perceptions among individuals diagnosed with different psychological conditions. The findings of the study showed that a 10-item BSES-SF scale could be used as a valid and reliable measurement of both self-esteem and self-concept in the context of a broad population sample. Importantly, the scale provides evidence for the existence of significant relationships between self-esteem and such issues as delusion severity, depression, and motivation.
Overall, the review of the literature on the subject of self-esteem measurement revealed that Rosenberg’s Self-Esteem Scale was the most widely used tool for studying the concept within a large sample population. The knowledge about the existing measurements will enhance the further creation of a scale for self-esteem evaluation. In the next sections, the justification for the proposed scale will be provided along with the discussion on item construction, the development of twenty sample items, the argument for a specific method of running item analysis, and the support for methods used to establish the validity of the proposed scale (Möller, 2014).
Justification for a Scale
As found from the exploration of multi- and unidimensional scales, the latter represents a better choice when it comes to the measurement of self-esteem. Rosenberg’s Scale is an example of a unidimensional approach to self-esteem measurement that has been shown to work in a wide population sample despite some limitations (Eklund, Bäckström, & Hansson, 2018). This means that a proposed scale will be developed on the basis of RSES with several adjustments to ensure its applicability to a wide population sample.
In addition to implementing a unidimensional approach to the development of a self-esteem scale, it is proposed to use a Likert scale due to its feasibility, the absence of a need for a large number of judges, and the fairly easy process of sampling (Kline, 2015). A Likert scale is the suggested approach because its scores are linearly related to the dimension being measured, which makes it possible to eliminate unwarrantable assumptions. The Likert scale is also justified in the present study by its frequent use in a broad range of contexts (Kline, 2015). Finally, the fact that Rosenberg’s Self-Esteem Scale was developed as a Likert-type scale supports the feasibility of this format for self-esteem measurement.
However, a 0-to-10 Likert scale is proposed for the present study. Because it uses 11 points, it gives potential participants the opportunity to choose an exact midpoint (5) when they hesitate over a response or when this option seems the most reasonable. In addition, 11-point scales have higher composite reliability and loadings, as found in the study by Xu and Leung (2018). In the following sections, sample items for the proposed 11-point Likert scale will be developed to illustrate the procedure that potential participants will be asked to undergo when measuring their levels of self-esteem.
Constructing Items
As mentioned in the previous section, the construct that will be measured in the study will be explored from the perspective of unidimensionality due to the overarching evidence to support its use in social and psychological research. Unidimensionality refers to a test’s or measurement’s capability to measure the construct it was intended to measure without mistakenly measuring others, even those that are associated with the targeted phenomenon. In the context of self-esteem, it is necessary to ensure that the developed scale does not measure self-concept, which is a similar construct and is often used interchangeably with self-esteem.
When it comes to creating an item pool for a measurement scale, it is imperative that it excludes items that can contribute to multidimensionality and includes all possible variations of the construct that it is supposed to measure. Weak, unrelated items should be dropped during scale development, while those directly contributing to the measurement of either high or low levels of self-esteem should be included. In addition, in the process of scale development, the deductive method of construction will be used: drawing conclusions about a specific concept on the basis of general knowledge and research. In terms of the role of a subject matter expert (SME) in scale construction, it would be beneficial to seek the assistance of a scholar who has already conducted research in the measurement of self-esteem or a practicing psychologist whose expertise will contribute to the enhancement of the scale.
Sample Items
Item Analysis Procedure
The process of item analysis involves examining the responses individuals gave to test items in order to assess the quality of the items as well as the test as a whole. This process is also important for improving the items that will be used in future tests. The first item analysis procedure will involve forming high and low groups of respondents (the top 27% and bottom 27% of respondents when ranked by total score). The figure of 27% is chosen for identifying the high and low groups because it offers a balance between two aims: making the groups as large as possible and making them as different as possible.
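The formation of the high and low groups can be illustrated with a minimal Python sketch. It assumes that respondents are ranked by their total scale score and that items are then compared through the difference between the mean responses of the two groups; the function names, the rounding of the group size, and the sample data are hypothetical and serve only as an illustration.

```python
def split_high_low(total_scores, fraction=0.27):
    """Return (low_ids, high_ids): respondent indices in the bottom and
    top `fraction` of the sample when ranked by total score."""
    n = len(total_scores)
    if n < 2:
        raise ValueError("need at least two respondents")
    # Group size: 27% of the sample by default, at least one respondent,
    # and never more than half so the two groups cannot overlap.
    k = min(max(1, round(n * fraction)), n // 2)
    # Rank respondents from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    return order[:k], order[-k:]


def item_mean_difference(responses, item, low_ids, high_ids):
    """Mean response of the high group minus that of the low group for one item."""
    high_mean = sum(responses[i][item] for i in high_ids) / len(high_ids)
    low_mean = sum(responses[i][item] for i in low_ids) / len(low_ids)
    return high_mean - low_mean


# Hypothetical data: 10 respondents answering 3 items on the 0-10 scale.
responses = [
    [2, 3, 1], [9, 8, 10], [5, 5, 6], [7, 6, 8], [1, 2, 0],
    [10, 9, 9], [4, 5, 3], [6, 7, 7], [3, 2, 4], [8, 9, 7],
]
totals = [sum(row) for row in responses]
low_ids, high_ids = split_high_low(totals)
for item in range(3):
    print(item, item_mean_difference(responses, item, low_ids, high_ids))
```

Under these assumptions, an item with a difference close to zero does not separate the two groups and would be a candidate for revision or removal.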
The second procedure within the item analysis process will be the identification of the mean and standard deviation of each item. The mean refers to the average response to an item and is calculated by adding the points earned on that item across all participants and dividing the total by the number of participants. The standard deviation is a measure of score dispersion. A low standard deviation will indicate that data points lie close to the mean, while a high standard deviation will indicate that data points are spread out across a wide range.
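The calculation of the mean and standard deviation per item can be sketched with Python's standard `statistics` module. The sketch assumes that responses are stored as one row per participant and one column per item, and that the sample standard deviation (n - 1 in the denominator) is the intended measure; both are assumptions, and the data are hypothetical.

```python
from statistics import mean, stdev


def item_statistics(responses):
    """Return a list of (mean, standard deviation) pairs, one per item.

    `responses` holds one row per participant and one column per item.
    """
    if len(responses) < 2:
        raise ValueError("the sample standard deviation needs two or more participants")
    # zip(*responses) turns rows (participants) into columns (items).
    return [(mean(column), stdev(column)) for column in zip(*responses)]


# Hypothetical data: 5 participants, 2 items on the 0-10 scale.
responses = [[5, 0], [6, 10], [5, 2], [4, 9], [5, 4]]
for number, (item_mean, item_sd) in enumerate(item_statistics(responses), start=1):
    print(f"item {number}: mean={item_mean:.2f}, sd={item_sd:.2f}")
```

In this hypothetical data set both items have the same mean, yet the responses to the second item are far more dispersed, which is the contrast the standard deviation is meant to capture.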
Scale Validity and Reliability
The most common issues associated with item construction relate to ensuring both validity and reliability. While validity describes an item’s ability to measure the chosen phenomenon, reliability refers to the consistency of items, that is, their capacity to produce consistent results (Kline, 2015). If the items are reliable, they will yield the same results over a prolonged period of time. For the developed measurement scale, it is proposed to begin with test-retest reliability, which involves administering the same Likert-scale test to the same respondents at different times. If the results of the administrations are the same when compared to each other, then the scale will be considered reliable.
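Since two administrations rarely produce identical scores, their agreement is commonly summarized with a correlation coefficient. The sketch below assumes that test-retest reliability is estimated as the Pearson correlation between respondents' total scores at the two time points; this choice of statistic, the function name, and the data are assumptions made for illustration rather than part of the proposed procedure.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the two means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cross / (spread_x * spread_y)


# Hypothetical total scores of the same six respondents at two time points.
time_1 = [72, 55, 90, 40, 63, 81]
time_2 = [70, 58, 88, 45, 60, 83]
print(round(pearson_r(time_1, time_2), 3))
```

A coefficient close to 1 would indicate that respondents keep their relative standing across administrations, which is the sense of consistency described above.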
When it comes to measuring the validity of the proposed scale, convergent validity is the most appropriate in the current study (Sharratt et al., 2014). It implies the measurement of validity by means of evaluating the relationships that exist between an implemented tool and the construct it was intended to identify. Thus, convergent validity will be established by comparing the existing values of generally observed self-esteem (the objective assessment of a psychologist) with the indicators received from the developed 11-point Likert scale in order to determine whether there are similarities between them. In contrast to convergent validity, discriminant validity will be established by demonstrating that indicators pertaining to one construct are dissimilar to those pertaining to others.
In terms of criterion-related validity, the assessment is implemented by comparing test scores with non-test criteria; the key objective is to assess whether a test reflects a specific set of abilities. Therefore, it will show the extent of the relationship between a criterion and a predictor. In relation to the test used for measuring the self-esteem of individuals, several indicators will be identified. Such indicators can include the overall ratings of self-esteem given to participants, the severity of possible psychological issues, the ratings given to them by practicing psychologists, and so on. All of these variables will be correlated with the scores that participants give on the Likert scale of self-esteem measurement in order to draw conclusions about criterion-related validity.
Conclusion
The measurement of self-esteem is a complex subject, and the proposed scale is intended to address the challenges that researchers have encountered previously. Rosenberg’s Self-Esteem Scale has been shown to be the most widely used tool for drawing conclusions about the self-esteem of individuals, and it served as the basis for the developed questionnaire. However, the developed scale includes eleven points, which is more effective compared to five-point scales because it offers greater variance. In addition, an eleven-point Likert scale allows for a greater difference in mean scores, which is beneficial for conducting item analysis. Overall, the developed scale is expected to offer a comprehensive measurement of self-esteem while addressing the limitations of the tests that have been implemented previously. In terms of future studies, a three-stage self-esteem measurement can be implemented: combining the 11-point scale with interviews and peer observations to ensure a mixed-methods approach to the measurement of self-esteem.
References
Alessandri, G., Cenciotti, R., Łaguna, M., Różycka-Tran, J., & Vecchione, M. (2017). Individual-level and culture-level self-esteem: A test of construct isomorphism. Journal of Cross-Cultural Psychology, 48(9), 1328-1341.
Kline, P. (2015). A handbook of test construction. New York, NY: Routledge.
Möller, H. (2014). Observer rating scales. In G. Alexopoulos, S. Kasper, H. Möller & C. Moreno (Eds.), Guide to assessment scales in major depressive disorder (pp. 7-22). Cham, Switzerland: Springer International Publishing.
Salerno, L., Ingoglia, S., & Lo Coco, G. (2017). Competing factor structures of the Rosenberg Self-Esteem Scale (RSES) and its measurement invariance across clinical and non-clinical samples. Personality and Individual Differences, 113, 13-19.
Sharratt, K., Boduszek, D., Jones, A., & Gallagher, B. (2014). Construct validity, dimensionality and factorial invariance of the Rosenberg Self-Esteem Scale: A bifactor modeling approach among children of prisoners. Current Issues in Personality Psychology, 2(4), 228-236.
Thomas, E. C., Murakami-Brundage, J., Bertolami, N., Beck, A. T., & Grant, P. M. (2018). Beck Self-Esteem Scale-Short Form: Development and psychometric evaluation of a scale for the assessment of self-concept in schizophrenia. Psychiatry Research, 263, 173-180.
Tomás, J. M., Oliver, A., Hontangas, P. M., Sancho, P., & Galiana, L. (2015). Method effects and gender invariance of the Rosenberg Self-esteem Scale: A study on adolescents. Acta de Investigación Psicológica, 5(3), 2194-2203.
Xu, M. L., & Leung, S. O. (2018). Effects of varying numbers of Likert scale points on factor structure on the Rosenberg Self-Esteem Scale. Asian Journal of Social Psychology, 21, 119-128.