Personality assessment, the measurement of personal characteristics. Assessment is an end result of gathering information intended to advance psychological theory and research and to increase the probability that wise decisions will be made in applied settings (e.g., in selecting the most promising people from a group of job applicants). The approach taken by the specialist in personality assessment is based on the assumption that much of the observable variability in behaviour from one person to another results from differences in the extent to which individuals possess particular underlying personal characteristics (traits). The assessment specialist seeks to define these traits, to measure them objectively, and to relate them to socially significant aspects of behaviour.
A distinctive feature of the scientific approach to personality measurement is the effort, wherever possible, to describe human characteristics in quantitative terms. How much of a trait manifests itself in an individual? How many traits are present? Quantitative personality measurement is especially useful in comparing groups of people as well as individuals. Do groups of people from different cultural and economic backgrounds differ when considered in the light of their particular personality attributes or traits? How large are the group differences?
Overt behaviour is a reflection of interactions among a wide range of underlying factors, including the bodily state of the individual and the effects of that person’s past personal experiences. Hence, a narrowly focused approach is inadequate to do justice to the complex human behaviour that occurs under the constantly changing set of challenges, pleasures, demands, and stresses of everyday life. The sophisticated measurement of human personality inescapably depends on the use of a variety of concepts to provide trait definitions and entails the application of various methods of observation and evaluation. Personality theorists and researchers seek to define and to understand the diversity of human traits, the many ways people have of thinking and perceiving and learning and emoting. Such nonmaterial human dimensions, types, and attributes are constructs—in this case, inferences drawn from observed behaviour. Widely studied personality constructs include anxiety, hostility, emotionality, motivation, and introversion-extroversion. Anxiety, for example, is a concept, or construct, inferred in people from what they say, their facial expressions, and their body movements.
Personality is interactional in two senses. As indicated above, personal characteristics can be thought of as products of interactions among underlying psychological factors; for example, an individual may experience tension because he or she is both shy and desirous of social success. These products, in turn, interact with the types of situations people confront in their daily lives. A person who is anxious about being evaluated might show debilitated performance in evaluative situations (for example, taking tests), but function well in other situations in which an evaluative emphasis is not present. Personality makeup can be either an asset or a liability depending on the situation. For example, some people approach evaluative situations with fear and foreboding, while others seem to be motivated in a desirable direction by competitive pressures associated with performance.
Personality tests provide measures of such characteristics as feelings and emotional states, preoccupations, motivations, attitudes, and approaches to interpersonal relations. There is a diversity of approaches to personality assessment, and controversy surrounds many aspects of the widely used methods and techniques. These include such assessments as the interview, rating scales, self-reports, personality inventories, projective techniques, and behavioral observation.
In an interview the individual under assessment must be given considerable latitude in “telling his story.” Interviews have both verbal and nonverbal (e.g., gestural) components. The aim of the interview is to gather information, and the adequacy of the data gathered depends in large part on the questions asked by the interviewer. In an employment interview the focus of the interviewer is generally on the job candidate’s work experiences, general and specific attitudes, and occupational goals. In a diagnostic medical or psychiatric interview considerable attention would be paid to the patient’s physical health and to any symptoms of behavioral disorder that may have occurred over the years.
Two broad types of interview may be delineated. In the interview designed for use in research, face-to-face contact between an interviewer and interviewee is directed toward eliciting information that may be relevant to particular practical applications under general study or to those personality theories (or hypotheses) being investigated. Another type, the clinical interview, is focused on assessing the status of a particular individual (e.g., a psychiatric patient); such an interview is action-oriented (i.e., it may indicate appropriate treatment). Both research and clinical interviews frequently may be conducted to obtain an individual’s life history and biographical information (e.g., identifying facts, family relationships), but they differ in the uses to which the information is put.
Although it is not feasible to quantify all of the events occurring in an interview, personality researchers have devised ways of categorizing many aspects of the content of what a person has said. In this approach, called content analysis, the particular categories used depend upon the researchers’ interests and ingenuity, but the method of content analysis is quite general and involves the construction of a system of categories that, it is hoped, can be used reliably by an analyst or scorer. The categories may be straightforward (e.g., the number of words uttered by the interviewee during designated time periods), or they may rest on inferences (e.g., the degree of personal unhappiness the interviewee appears to express). The value of content analysis is that it provides the possibility of using frequencies of uttered responses to describe verbal behaviour and defines behavioral variables for more-or-less precise study in experimental research. Content analysis has been used, for example, to gauge changes in attitude as they occur within a person with the passage of time. Changes in the frequency of hostile references a neurotic makes toward his parents during a sequence of psychotherapeutic interviews, for example, may be detected and assessed, as may the changing self-evaluations of psychiatric hospital inmates in relation to the length of their hospitalization.
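The counting step in content analysis can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the category word lists, the function name code_transcript, and the three invented interview excerpts stand in for a coding scheme that researchers would have to build and check for reliability themselves.

```python
import re
from collections import Counter

# Hypothetical category lexicon, for illustration only; a real coding
# scheme would be constructed and validated by the researchers.
CATEGORIES = {
    "hostility": {"hate", "angry", "resent", "blame"},
    "unhappiness": {"sad", "lonely", "hopeless", "miserable"},
}

def code_transcript(text, categories=CATEGORIES):
    """Return the word count and per-category frequencies for one transcript."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for name, lexicon in categories.items():
            if word in lexicon:
                counts[name] += 1
    return {"n_words": len(words), **{name: counts[name] for name in categories}}

# Invented excerpts from three successive interviews with one person.
sessions = [
    "I hate how they blame me. I am angry and sad.",
    "I still resent them, but I feel less angry now.",
    "We talked calmly. I was a little sad afterwards.",
]

for number, transcript in enumerate(sessions, start=1):
    print(number, code_transcript(transcript))
```

In this toy data the hostility count falls from 3 to 2 to 0 across sessions, which is the kind of change over time that the frequency approach is meant to expose. Word matching of this sort covers only the straightforward categories; inferential categories, such as apparent degree of unhappiness, still call for a human scorer.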
Sources of erroneous conclusions that may be drawn from face-to-face encounters stem from the complexity of the interview situation, the attitudes, fears, and expectations of the interviewee, and the interviewer’s manner and training. Research has been conducted to identify, control, and, if possible, eliminate these sources of interview invalidity and unreliability. Conducting more than one interview with the same interviewee and using more than one interviewer to evaluate the subject’s behaviour can shed light on the reliability of the information derived and may reveal differences in influence among individual interviewers. Standardization of interview format tends to increase the reliability of the information gathered; for example, all interviewers may use the same set of questions. Such standardization, however, may restrict the scope of information elicited, and even a perfectly reliable (consistent) interview technique can lead to incorrect inferences.
The rating scale is one of the oldest and most versatile of assessment techniques. Rating scales present users with an item and ask them to select from a number of choices. The rating scale is similar in some respects to a multiple choice test, but its options represent degrees of a particular characteristic.
Rating scales are used by observers and also by individuals for self-reporting (see below Self-report tests). They permit convenient characterization of other people and their behaviour. Some observations do not lend themselves to quantification as readily as do simple counts of motor behaviour (such as the number of times a worker leaves his lathe to go to the restroom). It is difficult, for example, to quantify how charming an office receptionist is. In such cases, one may fall back on relatively subjective judgments, inferences, and relatively imprecise estimates, as in deciding how disrespectful a child is. The rating scale is one approach to securing such judgments. Rating scales present an observer with scalar dimensions along which those who are observed are to be placed. A teacher, for example, might be asked to rate students on the degree to which the behaviour of each reflects leadership capacity, shyness, or creativity. Peers might rate each other along dimensions such as friendliness, trustworthiness, and social skills. Several standardized, printed rating scales are available for describing the behaviour of psychiatric hospital patients. Relatively objective rating scales have also been devised for use with other groups. Rating scales often take a graphic form:
To what degree is John shy?
not at all     slightly     moderately     very     extremely
A number of requirements should be met to maximize the usefulness of rating scales. One is that they be reliable: the ratings of the same person by different observers should be consistent. Another is that known sources of inaccuracy be reduced. One such source is the so-called halo effect, in which an observer rates someone favourably on a specific characteristic because the observer has a generally favourable reaction to the person being rated. A tendency to say only nice things about others and a proneness to think of all people as average (to use the midrange of scales) are other methodological problems that arise when rating scales are used.
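Two of these requirements lend themselves to simple numerical checks, sketched below in Python. The sketch assumes a five-point scale coded 1 to 5 with 3 as its midpoint. It uses the Pearson correlation as one common index of agreement between two observers, which is a choice made here and not one specified above. The function names and ratings are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two rating lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a rater who gives identical ratings has no variance")
    return cov / sqrt(var_x * var_y)

def midpoint_share(ratings, midpoint=3):
    """Fraction of ratings that fall exactly on the scale midpoint."""
    return sum(r == midpoint for r in ratings) / len(ratings)

# Invented shyness ratings of the same five students by three observers.
rater_a = [1, 2, 4, 5, 3]
rater_b = [2, 2, 5, 4, 3]
rater_c = [3, 3, 3, 2, 3]

print(round(pearson_r(rater_a, rater_b), 2))  # agreement between observers A and B
print(midpoint_share(rater_c))                # how often observer C uses the midrange
```

A high correlation indicates that the two observers order the students similarly; it does not show that either is accurate. A large midpoint share flags an observer who may be treating everyone as average.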
The success that attended the use of convenient intelligence tests in providing reliable, quantitative (numerical) indexes of individual ability has stimulated interest in the possibility of devising similar tests for measuring personality. Procedures now available vary in the degree to which they achieve score reliability and convenience. These desirable attributes can be partly achieved by restricting in designated ways the kinds of responses a subject is free to make. Self-report instruments follow this strategy. For example, a test that restricts the subject to true-false answers is likely to be convenient to give and easy to score. So-called personality inventories (see below) tend to have these characteristics, in that they are relatively restrictive, can be scored objectively, and are convenient to administer. Other techniques (such as inkblot tests) for evaluating personality possess these characteristics to a lesser degree.
Self-report personality tests are used in clinical settings in making diagnoses, in deciding whether treatment is required, and in planning the treatment to be used. A second major use is as an aid in selecting employees, and a third is in psychological research. An example of the last case would be where scores on a measure of test anxiety (that is, the feeling of tenseness and worry that people experience before an exam) might be used to divide people into groups according to how upset they get while taking exams. Researchers have investigated whether the more test-anxious students behave differently from the less anxious ones in an experimental situation.
Among the most common of self-report tests are personality inventories. Their origins lie in the early history of personality measurement, when most tests were constructed on the basis of so-called face validity; that is, they simply appeared to be valid. Items were included simply because, in the fallible judgment of the person who constructed or devised the test, they were indicative of certain personality attributes. In other words, face validity need not be defined by careful, quantitative study; rather, it typically reflects one’s more-or-less imprecise, possibly erroneous, impressions. Personal judgment, even that of an expert, is no guarantee that a particular collection of test items will prove to be reliable and meaningful in actual practice.
A widely used early self-report inventory, the so-called Woodworth Personal Data Sheet, was developed during World War I to detect soldiers who were emotionally unfit for combat. Among its ostensibly face-valid items were these: Does the sight of blood make you sick or dizzy? Are you happy most of the time? Do you sometimes wish you had never been born? Recruits who answered these kinds of questions in a way that could be taken to mean that they suffered psychiatric disturbance were detained for further questioning and evaluation. Clearly, however, symptoms revealed by such answers are exhibited by many people who are relatively free of emotional disorder.
Rather than testing general knowledge or specific skills, personality inventories ask people questions about themselves. These questions may take a variety of forms. When taking such a test, the subject might have to decide whether each of a series of statements is accurate as a self-description or respond to a series of true-false questions about personal beliefs.
Several inventories require that each of a series of statements be placed on a rating scale in terms of the frequency or adequacy with which the statements are judged by the individual to reflect his tendencies and attitudes. Regardless of the way in which the subject responds, most inventories yield several scores, each intended to identify a distinctive aspect of personality.
One of these, the Minnesota Multiphasic Personality Inventory (MMPI), is probably the personality inventory in widest use in the English-speaking world. Also available in other languages, it consists in one version of 550 items (e.g., “I like tall women”) to which subjects are to respond “true,” “false,” or “cannot say.” Work on this inventory began in the 1930s, when its construction was motivated by the need for a practical, economical means of describing and predicting the behaviour of psychiatric patients. In its development, efforts were made to achieve convenience in administration and scoring and to overcome many of the known defects of earlier personality inventories. Varied types of items were included, and emphasis was placed on making these printed statements (presented either on small cards or in a booklet) intelligible even to persons with limited reading ability.
Most earlier inventories lacked subtlety; many people were able to fake or bias their answers since the items presented were easily seen to reflect gross disturbances; indeed, in many of these inventories maladaptive tendencies would be reflected in either all true or all false answers. Perhaps the most significant methodological advance to be found in the MMPI was the attempt on the part of its developers to measure tendencies to respond, rather than actual behaviour, and to rely but little on assumptions of face validity. The true-false item “I hear strange voices all the time” has face validity for most people in that to answer “true” to it seems to provide a strong indication of abnormal hallucinatory experiences. But some psychiatric patients who “hear strange voices” can still appreciate the socially undesirable implications of a “true” answer and may therefore try to conceal their abnormality by answering “false.” A major difficulty in placing great reliance on face validity in test construction is that the subject may be as aware of the significance of certain responses as is the test constructor and thus may be able to mislead the tester. Nevertheless, the person who hears strange voices and yet answers the item “false” clearly is responding to something—the answer still is a reflection of personality, even though it may not be the aspect of personality to which the item seems to refer; thus, careful study of responses beyond their mere face validity often proves to be profitable.
Much study has been given to the ways in which response sets and test-taking attitudes influence behaviour on the MMPI and other personality measures. The response set called acquiescence, for example, refers to one’s tendency to respond with “true” or “yes” answers to questionnaire items regardless of what the item content is. It is conceivable that two people might be quite similar in all respects except for their tendency toward acquiescence. This difference in response set can lead to misleadingly different scores on personality tests. One person might be a “yea-sayer” (someone who tends to answer true to test items); another might be a “nay-sayer”; a third individual might not have a pronounced acquiescence tendency in either direction.
Acquiescence is not the only response set; there are other test-taking attitudes that are capable of influencing personality profiles. One of these, already suggested by the example of the person who hears strange voices, is social desirability. A person who has convulsions might say “false” to the item “I have convulsions” because he believes that others will think less of him if they know he has convulsions. The intrusive, potentially deceiving effects of the subjects’ response sets and test-taking attitudes on scores derived from personality measures can sometimes be circumvented by varying the content and wording of test items. Nevertheless, users of questionnaires have not yet completely solved problems of bias such as those arising from response sets. Indeed, many of these problems first received widespread attention in research on the MMPI, and research on this and similar inventories has significantly advanced understanding of the whole discipline of personality testing.
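The effect of acquiescence on a score, and one way of blunting it by varying item wording, can be illustrated with a small Python sketch. It assumes a hypothetical six-item scale and invented function names. "Balanced keying" here means that the scored answer is "true" for half of the items and "false" for the other half. This is one illustrative remedy, not a description of how any particular inventory is built.

```python
def acquiescence_index(answers):
    """Share of "true" answers among the items actually answered."""
    answered = [a for a in answers if a in ("true", "false")]
    if not answered:
        raise ValueError("no true/false answers to evaluate")
    return sum(a == "true" for a in answered) / len(answered)

def scale_score(answers, key):
    """Count the answers that match the keyed (scored) direction of each item."""
    return sum(a == k for a, k in zip(answers, key))

# Hypothetical six-item scale, keyed in two different ways.
all_true_key = ["true"] * 6
balanced_key = ["true", "false", "true", "false", "true", "false"]

# A pure "yea-sayer" answers "true" regardless of item content.
yea_sayer = ["true"] * 6

print(acquiescence_index(yea_sayer))          # 1.0
print(scale_score(yea_sayer, all_true_key))   # 6, the maximum possible score
print(scale_score(yea_sayer, balanced_key))   # 3, half of the maximum
```

When every item is keyed "true," indiscriminate agreement alone produces the highest possible score. With balanced keying the same behaviour lands in the middle of the range, so it is less easily mistaken for an extreme standing on the trait.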
Attributes of the MMPI
The MMPI as originally published consists of nine clinical scales (or sets of items), each scale having been found in practice to discriminate a particular clinical group, such as people suffering from schizophrenia, depression, or paranoia (see mental disorder). Each of these scales (or others produced later) was developed by determining patterns of response to the inventory that were observed to be distinctive of groups of individuals who had been psychiatrically classified by other means (e.g., by long-term observation). The responses of apparently normal subjects were compared with those of hospital patients with a particular psychiatric diagnosis—for example, with symptoms of schizophrenia. Items to which the greatest percentage of “normals” gave answers that differed from those more typically given by patients came to constitute each clinical scale.
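The item-selection logic described above can be sketched in Python as follows. The data are invented, the function names are hypothetical, and the 0.3 cut-off on the difference in endorsement rates is an arbitrary illustrative threshold and not a value taken from the MMPI's development.

```python
def endorsement_rate(responses, item):
    """Fraction of respondents who answered "true" (True) to one item."""
    return sum(person[item] for person in responses) / len(responses)

def select_items(normals, patients, n_items, min_gap=0.3):
    """Keep items whose endorsement rates differ between the two groups.

    Returns {item index: keyed answer}, where the keyed answer is the one
    given more often by the patient group.
    """
    keyed = {}
    for item in range(n_items):
        gap = endorsement_rate(patients, item) - endorsement_rate(normals, item)
        if abs(gap) >= min_gap:
            keyed[item] = gap > 0
    return keyed

def score(person, keyed):
    """Number of selected items answered in the keyed direction."""
    return sum(person[item] == answer for item, answer in keyed.items())

# Invented true/false answers to four items (True means "true").
normals = [
    [True, False, True, False],
    [True, False, False, False],
    [True, False, True, True],
    [True, True, True, False],
]
patients = [
    [True, True, False, False],
    [True, True, False, True],
    [True, True, True, False],
    [True, False, False, False],
]

scale = select_items(normals, patients, n_items=4)
print(scale)
print(score(patients[0], scale), score(normals[0], scale))
```

Only the second and third items separate the groups in this toy data, so they alone form the scale. Nothing in the procedure looks at what an item says, which is the sense in which such a scale does not rely on face validity.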
In addition to the nine clinical scales and many specially developed scales, there are four so-called control scales on the inventory. One of these is simply the number of items placed by the subject in the “cannot say” category. The L (or lie) scale was devised to measure the tendency of the test taker to attribute socially desirable attributes to himself. In response to “I get angry sometimes” such a test taker should tend to mark false; extreme L scorers in the other direction appear to be too good, too virtuous. Another, the so-called F scale, was included to provide a reflection of the subject’s carelessness and confusion in taking the inventory (e.g., “Everything tastes the same” tends to be answered true by careless or confused people). More subtle than either the L or F scales is what is called the K scale. Its construction was based on the observation that some persons tend to exaggerate their symptoms because of excessive openness and frankness and may obtain high scores on the clinical scales; others may exhibit unusually low scores because of defensiveness. On the K-scale item “I think nearly anyone would tell a lie to keep out of trouble,” the defensive person is apt to answer false, giving the same response to “I certainly feel useless at times.” The K scale was designed to reduce these biasing factors; by weighting clinical-scale scores with K scores, the distorting effect of test-taking defensiveness may be reduced.
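The arithmetic of such a correction can be sketched briefly in Python. The sketch assumes the correction takes the form of adding a fixed fraction of the K score to a raw clinical-scale score. The weight of 0.5, the scores, and the function names are illustrative assumptions, not values or procedures taken from the text above.

```python
def k_corrected(raw_score, k_score, weight):
    """Add a weighted share of the K score to a raw clinical-scale score."""
    if weight < 0:
        raise ValueError("weight must be non-negative")
    return raw_score + weight * k_score

def cannot_say_count(answers):
    """Number of items the subject placed in the "cannot say" category."""
    return sum(a == "cannot say" for a in answers)

# Two hypothetical test takers with the same raw clinical-scale score.
defensive = {"raw": 20, "k": 20}   # high K score: guarded, defensive style
frank = {"raw": 20, "k": 8}        # low K score: open, frank style

WEIGHT = 0.5  # illustrative weight only
print(k_corrected(defensive["raw"], defensive["k"], WEIGHT))  # 30.0
print(k_corrected(frank["raw"], frank["k"], WEIGHT))          # 24.0
print(cannot_say_count(["true", "cannot say", "false", "cannot say"]))  # 2
```

Because the defensive test taker has the higher K score, more is added to that person's clinical score, which offsets some of the under-reporting that defensiveness is presumed to cause.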
In general, it has been found that the greater the number and magnitude of one’s unusually high scores on the MMPI, the more likely it is that one is in need of psychiatric attention. Most professionals who use the device refuse to make assumptions about the factualness of the subject’s answers and about his personal interpretations of the meanings of the items. Their approach does not depend heavily on theoretical predilections and hypotheses. For this reason the inventory has proved particularly popular with those who have strong doubts about the eventual validity that many theoretical formulations will show in connection with personality measurement after they have been tested through painstaking research. The MMPI also appeals to those who demand firm experimental evidence that any personality assessment method can make valid discriminations among individuals.
In recent years there has been growing interest in actuarial personality description—that is, in personality description based on traits shared in common by groups of people. Actuarial description studies yield rules by which persons may be classified according to their personal attributes as revealed by their behaviour (on tests, for example). Computer programs are now available for diagnosing such disorders as hysteria, schizophrenia, and paranoia on the basis of typical group profiles of MMPI responses. Computerized methods for integrating large amounts of personal data are not limited to this inventory and are applicable to other inventories, personality tests (e.g., inkblots), and life-history information. Computerized classification of MMPI profiles, however, has been explored most intensively.
Comparison of the MMPI and CPI
The MMPI has been considered in some detail here because of its wide usage and because it illustrates a number of important problems confronting those who attempt to assess personality characteristics. Many other omnibus personality inventories are also used in applied settings and in research. The California Psychological Inventory (CPI), for example, is keyed for several personality variables that include sociability, self-control, flexibility, and tolerance. Unlike the MMPI, it was developed specifically for use with “normal” groups of people. Whereas the judgments of experts (usually psychiatric workers) were used in categorizing subjects given the MMPI during the early item-writing phase of its development, nominations by peers (such as respondents or friends) of the subjects were relied upon in work with the CPI. Its technical development has been evaluated by test authorities to be of high order, in part because its developers profited from lessons learned in the construction and use of the MMPI. It also provides measures of response sets and has been subjected to considerable research study.
From time to time, most personality inventories are revised for a variety of reasons, including the need to take account of cultural and social changes and the wish to improve the instruments themselves. For example, a revision of the CPI was published in 1987. In the revision, the inventory itself was modified to improve clarity, update content, and delete items that might be objectionable to some respondents. Because the item pool remained largely unchanged, data from the original samples were used in computing norms and in evaluating reliability and validity for new scales and new composite scores. The descriptions of high and low scorers on each scale have been refined and sharpened, and correlations of scale scores with other personality tests have been reported.
Other self-report techniques
Beyond personality inventories, there are other self-report approaches to personality measurement available for research and applied purposes. Mention was made earlier of the use of rating scales. The rating-scale technique permits quantification of an individual’s reactions to himself, to others, and, in fact, to any object or concept in terms of a standard set of semantic (word) polarities such as “hot-cold” or “good-bad.” It is a general method for assessing the meanings of these semantic concepts to individuals.
Another method of self-report called the Q-sort is devised for problems similar to those for which rating scales are used. In a Q-sort a person is given a set of sentences, phrases, or words (usually presented individually on cards) and is asked to use them to describe himself (as he thinks he is or as he would like to be) or someone else. This description is carried out by having the subject sort the items on the cards in terms of their degree of relevance so that they can be distributed along what amounts to a rating scale. Examples of descriptive items that might be included in a Q-sort are “worries a lot,” “works hard,” and “is cheerful.”
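The sorting step can be made concrete with a short Python sketch. It assumes a forced distribution in which each pile holds a fixed number of cards, and it compares a description of the self as it is with a description of the self as the person would like to be. The pile sizes, the two extra items ("is shy" and "is talkative"), and the function names are hypothetical.

```python
def q_sort(ranked_items, pile_sizes):
    """Turn a ranking (most to least characteristic) into pile scores.

    The first pile receives the highest score, the last pile a score of 1.
    """
    if sum(pile_sizes) != len(ranked_items):
        raise ValueError("pile sizes must account for every card exactly once")
    scores = {}
    position = 0
    top = len(pile_sizes)
    for offset, size in enumerate(pile_sizes):
        for item in ranked_items[position:position + size]:
            scores[item] = top - offset
        position += size
    return scores

def discrepancy(sort_a, sort_b):
    """Total absolute difference in pile scores across all items."""
    return sum(abs(sort_a[item] - sort_b[item]) for item in sort_a)

PILES = [1, 3, 1]  # hypothetical forced distribution for five cards

# Cards ordered from most to least characteristic by one invented subject.
as_i_am = ["works hard", "worries a lot", "is cheerful", "is shy", "is talkative"]
as_i_wish = ["is cheerful", "works hard", "is talkative", "is shy", "worries a lot"]

self_sort = q_sort(as_i_am, PILES)
ideal_sort = q_sort(as_i_wish, PILES)
print(self_sort)
print(discrepancy(self_sort, ideal_sort))
```

Because every subject must use the same pile sizes, sorts made by different people, or by one person under different instructions, can be compared item by item.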
Typical paper-and-pencil instruments such as personality inventories involve verbal stimuli (words) intended to call forth designated types of responses from the individual. There are clearly stated ground rules under which he makes his responses. Paper-and-pencil devices are relatively easy and economical to administer and can be scored accurately and reliably by relatively inexperienced clerical workers. They are generally regarded by professional personality evaluators as especially valuable assessment tools in screening large numbers of people, as in military or industrial personnel selection. Assessment specialists do not assume that self-reports are accurate indicators of personality traits. They are accepted, rather, as samples of behaviour for which validity in predicting one’s everyday activities or traits must be established empirically (i.e., by direct observation or experiment). Paper-and-pencil techniques have moved from their early stage of assumed (face) validity to more advanced notions in which improvements in conceptualization and methodology are clearly recognized as basic to the determination of empirical validity.
One group of assessment specialists believes that the more freedom people have in picking their responses, the more meaningful the description and classification that can be obtained. Because personality inventories do not permit much freedom of choice, some researchers and clinicians prefer to use projective techniques, in which a person is shown ambiguous stimuli (such as shapes or pictures) and asked to interpret them in some way. (Such stimuli allow relative freedom in projecting one’s own interests and feelings into them, reacting in any way that seems appropriate.) Projective techniques are believed to be sensitive to unconscious dimensions of personality. Defense mechanisms, latent impulses, and anxieties have all been inferred from data gathered in projective situations.
Personality inventories and projective techniques do have some elements in common; inkblots, for example, are ambiguous, but so also are many of the statements on inventories such as the MMPI. These techniques differ in that the subject is given substantially free rein in responding to projective stimuli rather than merely answering true or false, for example. Another similarity between projective and questionnaire or inventory approaches is that all involve the use of relatively standardized testing situations.
While projective techniques are often lumped together as one general methodology, in actual practice there are several approaches to assessment from a projective point of view. Although projective techniques share the common characteristic that they permit the subject wide latitude in responding, they still may be distinguished broadly as follows: (1) associative techniques, in which the subject is asked to react to words, to inkblots, or to other stimuli with the first associated thoughts that come to mind; (2) construction techniques, in which the subject is asked to create something—for example, make up a story or draw a self-portrait; (3) completion techniques, in which the subject is asked to finish a partially developed stimulus, such as adding the last words to an incomplete sentence; (4) choice or ordering techniques, in which the subject is asked to choose from among or to give some orderly sequence to stimuli—for example, to choose from or arrange a set of pictures or inkblots; (5) expressive techniques, in which the subject is asked to use free expression in some manner, such as in finger painting.
Hidden personality defense mechanisms, latent emotional impulses, and inner anxieties all have been attributed to test takers by making theoretical inferences from data gathered as they responded in projective situations. While projective stimuli are ambiguous, they are usually administered under fairly standardized conditions. Quantitative (numerical) measures can be derived from subjects’ responses to them. These include the number of responses one makes to a series of inkblots and the number of responses to the blots in which the subject perceives what seem to him to be moving animals.
The Rorschach Inkblot Test
The Rorschach inkblots were developed by a Swiss psychiatrist, Hermann Rorschach, in an effort to reduce the time required in psychiatric diagnosis. His test consists of 10 cards, half of which are in colour and half in black and white. The test is administered by showing the subject the 10 blots one at a time; the subject’s task is to describe what he sees in the blots or what they remind him of. The subject is usually told that the inkblots are not a test of the kind he took in school and that there are no right or wrong answers.
Rorschach’s work was stimulated by his interest in the relationship between perception and personality. He held that a person’s perceptual responses to inkblots could serve as clues to basic personality tendencies. Despite Rorschach’s original claims for the validity of his test, subsequent negative research findings have led many users of projective techniques to become dubious about the role assigned the inkblots in delineating relationships between perception and personality. In recent years, emphasis has tended to shift to the analysis of nuances of the subject’s social behaviour during the test and to the content of his verbal responses to the examiner—whether, for example, he seeks to obtain the assistance of the examiner in “solving” the inkblots presented to him, sees “angry lions” or “meek lambs” in the inkblots, or is apologetic or combative about his responses.
Over the years, considerable research has been carried out on Rorschach’s inkblots; important statistical problems in analyzing data gathered with projective techniques have been identified, and researchers have continued in their largely unsuccessful efforts to overcome them. There is a vast experimental literature to suggest that the Rorschach technique lacks empirical validity. Recently, researchers have sought to put the Rorschach on a sounder psychometric (mental testing) basis. New comprehensive scoring systems have been developed, and there have been improvements in standardization and norms. These developments have injected new life into the Rorschach as a psychometric instrument.
A similar method, the Holtzman Inkblot Test, has been developed in an effort to eliminate some of the statistical problems that beset the Rorschach test. It involves the administration of a series of 45 inkblots, the subject being permitted to make only one response per card. The Holtzman has the desirable feature that it provides an alternate series of 45 additional cards for use in retesting the same person.
Research with the Rorschach and Holtzman has proceeded in a number of directions; many studies have compared psychiatric patients and other groups of special interest (delinquents, underachieving students) with ostensibly normal people. Some investigators have sought to derive indexes or predictions of future behaviour from responses to inkblots and have checked, for example, to see if anxiety and hostility (as inferred from content analyses of verbal responses) are related to favourable or unfavourable response to psychotherapy. A sizable area of exploration concerns the effects of special conditions (e.g., experimentally induced anxiety or hostility) on the inkblot perceptions reported by the subject and the content of his speech.
Thematic Apperception Test (TAT)
There are other personality assessment devices, which, like the Rorschach, are based on the idea that an individual will project something of himself into his description of an ambiguous stimulus.
The TAT, for example, presents the subject with pictures of persons engaged in a variety of activities (e.g., someone with a violin). While the pictures leave much to one’s imagination, they are more highly specific, organized visual stimuli than are inkblots. The test consists of 30 black and white pictures and one blank card (to test imagination under very limited stimulation). The cards are presented to the subject one at a time, and he is asked to make up a story that describes each picture and that indicates the events that led to the scene and the events that will grow out of it. He is also asked to describe the thoughts and feelings of the persons in his story.
Although some content-analysis scoring systems have been developed for the TAT, attempts to score it in a standardized quantitative fashion tend to be limited to research and have been fewer than has been the case for the Rorschach. This is especially the state of affairs in applied settings in which the test is often used as a basis for conducting a kind of clinical interview; the pictures are used to elicit a sample of verbal behaviour on the basis of which inferences are drawn by the clinician.
In one popular approach, interpretation of a TAT story usually begins with an effort to determine who is the hero (i.e., to identify the character with whom the subject seems to have identified himself). The content of the stories is often analyzed in terms of a so-called need-press system. Needs are defined as the internal motivations of the hero. Press refers to environmental forces that may facilitate or interfere with the satisfaction of needs (e.g., in the story the hero may be physically attacked, frustrated by poverty, or suffer the effects of rumours being spread about him). In assessing the importance or strength of a particular inferred need or press for the individual who takes the test, special attention is given to signs of its pervasiveness and consistency in different stories. Analysis of the test may depend considerably on the subjective, personal characteristics of the evaluator, who usually seeks to interpret the subject’s behaviour in the testing situation; the characteristics of his utterances; the emotional tone of the stories; the kinds of fantasies he offers; the outcomes of the stories; and the conscious and unconscious needs speculatively inferred from the stories.
The list of projective approaches to personality assessment is long, one of the most venerable being the so-called word-association test. Jung used associations to groups of related words as a basis for inferring personality traits (e.g., the inferiority “complex”). Administering a word-association test is relatively uncomplicated; a list of words is presented one at a time to the subject, who is asked to respond with the first word or idea that comes to mind. Many of the stimulus words may appear to be emotionally neutral (e.g., building, first, tree); of special interest are words that tend to elicit personalized reactions (e.g., mother, hit, love). The amount of time the subject takes before beginning each response and the response itself are used in efforts to analyze a word-association test. The idiosyncratic, or unusual, nature of one’s word-association responses may be gauged by comparing them to standard published tables of the specific associations given by large groups of other people.
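As a rough illustration of how the two measures just described (the delay before a response and its unusualness relative to published tables) might be tabulated, consider the following minimal Python sketch. The norm table, the cutoff values, and all names in it are hypothetical and invented for illustration; they are not drawn from any published word-association norms or scoring system.

```python
from dataclasses import dataclass

# Hypothetical norm table: for each stimulus word, the proportion of a
# reference group that gave each response. All values are invented.
NORMS = {
    "tree": {"leaf": 0.40, "green": 0.30, "wood": 0.20},
    "mother": {"father": 0.50, "love": 0.25, "home": 0.10},
}


@dataclass
class Trial:
    stimulus: str
    response: str
    latency_s: float  # seconds before the subject began to respond


def commonality(trial, norms=NORMS):
    """Proportion of the reference group giving the same response (0.0 if unlisted)."""
    return norms.get(trial.stimulus, {}).get(trial.response.lower(), 0.0)


def flag_trials(trials, max_commonality=0.05, min_latency_s=3.0):
    """Return stimuli whose response was both rare and slow (assumed cutoffs)."""
    return [
        t.stimulus
        for t in trials
        if commonality(t) <= max_commonality and t.latency_s >= min_latency_s
    ]


trials = [
    Trial("tree", "leaf", 1.2),      # common response, quick
    Trial("mother", "storm", 4.5),   # unlisted response, slow
]
print(flag_trials(trials))
```

Flagging a trial in this way only marks it for the evaluator's attention; what an unusual or delayed response means remains a matter of interpretation.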
The sentence-completion technique may be considered a logical extension of word-association methods. In administering a sentence-completion test, the evaluator presents the subject with a series of partial sentences that he is asked to finish in his own words (e.g., “I feel upset when . . . ”; “What burns me up is . . . ”). Users of sentence-completion methods in assessing personality typically analyze them in terms of what they judge to be recurring attitudes, conflicts, and motives reflected in them. Such analyses, like those of the TAT, contain a subjective element.
Objective observation of a subject’s behaviour is a technique that falls in the category of behavioral assessment. A variety of assessments could be considered, for example, in the case of a seven-year-old boy who, according to his teacher, is doing poorly in his schoolwork and, according to his parents, is difficult to manage at home and does not get along with other children. The following types of assessment might be considered: (1) a measure of the boy’s general intelligence, which might help explain his poor schoolwork; (2) an interview with him to provide insights into his view of his problem; (3) personality tests, which might reveal trends that are related to his inadequate social relationships; (4) observations of his activities and response patterns in school; (5) observations of his behaviour in a specially created situation, such as a playroom with many interesting toys and games; (6) an interview with his parents, since the boy’s poor behaviour in school may be symptomatic of problems at home; and (7) direct observation of his behaviour at home.
Making all of these assessments would be a major undertaking. Because of the variety of data that are potentially available, the assessor must decide which types of information are most feasible and desirable under a given set of circumstances. In most cases, the clinician is interested in both subjective and objective information. Subjective information includes what clients think about, the emotions they experience, and their worries and preoccupations. Interviews, personality inventories, and projective techniques provide indications of subjective experience, although considerable clinical judgment is needed to infer what is going on within the client from test responses. Objective information includes the person’s observable behaviour and usually does not require the assessor to draw complex inferences about such topics as attitudes toward parents, unconscious wishes, and deep-seated conflicts. Such objective information is measured by behavioral assessment. It is often used to identify behavioral problems, which are then treated in some appropriate way. Behavioral observations are used to get information that cannot be obtained by other means. Examples of such observations include counts of how often a particular type of response occurs (such as physical attacks on others) and observations by ward attendants of certain behaviours of psychiatric patients. In either case, observational data must meet the same standards of reliability as data obtained by more formal measures.
The value of behavioral assessment depends on the behaviours selected for observation. For example, if the goal of assessment is to detect a tendency toward depression, the responses recorded should be those that are relevant to that tendency, such as degrees of smiling, motor activity, and talking.
A type of behavioral assessment called baseline observations is becoming increasingly popular. These are recordings of response frequencies in particular situations before any treatment or intervention has been made. They can be used in several ways. Observations might be made simply to describe a person’s response repertoire at a given time. For example, the number of aggressive responses made by children of different ages might be recorded. Such observations also provide a baseline for judging the effectiveness of behaviour modification techniques. A similar set of observations, made after behaviour modification procedures have been used, could be compared with the baseline measurement as a way of determining how well the therapy worked.
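The comparison between baseline and post-treatment observations can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the session counts are invented, the function names are hypothetical, and percentage change in the mean response rate is one of several possible summaries, not a standard prescribed by the text.

```python
from statistics import mean


def mean_rate(counts):
    """Average number of target responses per observation session."""
    return mean(counts)


def percent_change(baseline, followup):
    """Percentage change in mean response rate from baseline to follow-up."""
    base = mean_rate(baseline)
    if base == 0:
        raise ValueError("baseline rate is zero; percent change is undefined")
    return 100.0 * (mean_rate(followup) - base) / base


# Hypothetical counts of aggressive responses in five sessions each.
baseline_counts = [8, 10, 9, 11, 12]  # recorded before any intervention
followup_counts = [5, 4, 6, 5, 5]     # recorded after behaviour modification

print(percent_change(baseline_counts, followup_counts))
```

A negative value indicates that the target response became less frequent after the intervention; whether the change reflects the treatment itself would still need to be judged against other possible explanations.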
Behavioral observations can be treated in different ways. One of these is to keep track of the frequency with which people make designated responses during a given period of time (e.g., the number of times a psychiatric patient makes his own bed or the number of times a child asks for help in a novel situation). Another approach involves asking raters to support their judgments of others by citing specific behaviour (critical incidents); a shop foreman, for example, may rate a worker as depressed by citing incidents when the worker burst into tears. Critical incidents not only add validity to ordinary ratings, but they also suggest behavioral details that might be promising predictors of success on the job, response to psychiatric treatment, or level of academic achievement.
Behavioral observations are widely made in interviews and in a variety of workaday settings. Employers, supervisors, and teachers—either formally or informally—make use of behavioral observations in making decisions about people for whom they have responsibility. Unfortunately the subject may know he is being studied or evaluated and, therefore, may behave atypically (e.g., by working harder than usual or by growing tense). The observer may be a source of error by being biased in favour of or against the subject. Disinterested observers clearly are to be preferred (other things being equal) for research and clinical purposes. The greater the care taken to control such contributions to error, the greater the likelihood that observations will prove to be reliable.
The types of thoughts experienced by individuals are reflective of their personalities. Just as it is important to know what people do and how their behaviour affects others, it is also necessary to assess the thoughts that may lie behind the behaviour. Cognitive assessment provides information about thoughts that precede, accompany, and follow maladaptive behaviour. It also provides information about the effects of procedures that are intended to modify both how subjects think about a problem and how they behave.
Cognitive assessment can be carried out in a variety of ways. For example, questionnaires have been developed to sample people’s thoughts after an upsetting event. Beepers (electronic pagers) have been used to signal subjects to record their thoughts at certain times of the day. There are also questionnaires to assess the directions people give themselves while working on a task and their theories about why things happen as they do.
The assessment of thoughts and ideas is a relatively new development. It has received impetus from the growing evidence that thought processes and the content of thoughts are related to emotions and behaviour. Cognitive assessment provides information about adaptive and maladaptive aspects of people’s thoughts and the role their thoughts play in the processes of planning, making decisions, and interpreting reality.
Bodily responses may reveal a person’s feelings and motivations, and clinicians pay particular attention to these nonverbal messages. Bodily functions may also reflect motivations and concerns, and some clinicians also pay attention to these. Sophisticated devices have been developed to measure such physiological changes as pupil dilation, blood pressure, and electrical skin responses under specific conditions. These changes are related to periodic ratings of mood and to other physiological states that provide measures of stability and change within the individual. Technological advances are making it possible to monitor an individual’s physiological state on a continuous basis. Sweat, heartbeat, blood volume, substances in the bloodstream, and blood pressure can all be recorded and correlated with the presence or absence of certain psychological conditions such as stress.
One type of information that is sometimes overlooked because of its very simplicity consists of the subject’s life history and present status. Much of this information may be gathered through direct interviews with a subject or with an informant, through questionnaires, and through searches of records and archives. The information might also be gathered by examining the subject’s personal documents (e.g., letters, autobiographies) and medical, educational, or psychiatric case histories. The information might concern the individual’s social and occupational history, his cultural background, his present economic status, and his past and present physical characteristics. Life-history data can provide clues to the precursors and correlates of present behaviour. This information may help the investigator avoid needlessly speculative or complex hypotheses about the causation of personality traits when simple explanations might be superior. Failure on the part of a personality evaluator to be aware of the fact that someone had spent two years during World War II in a concentration camp could result in misleading inferences and conjectures about the subject’s present behaviour.
No one doubts that the words we write or speak are an expression of our inner thoughts and personalities. But beyond the meaningful content of language, a wealth of unique insights into an author’s mind is hidden in the style of a text, in such elements as how often certain words and word categories are used, regardless of context.
It is how an author expresses his or her thoughts that reveals character, asserts social psychologist James W. Pennebaker of the University of Texas at Austin. When people try to present themselves a certain way, they tend to select what they think are appropriate nouns and verbs, but they are unlikely to control their use of articles and pronouns. These small words create the style of a text, which is less subject to conscious manipulation.
Pennebaker’s statistical analyses have shown that these small words may hint at the healing progress of patients and give us insight into the personalities and changing ideals of public figures, from political candidates to terrorists. “Virtually no one in psychology has realized that low-level words can give clues to large-scale behaviors,” says Pennebaker, who, with colleagues, developed a computer program that analyzes text, called Linguistic Inquiry and Word Count (LIWC, pronounced “Luke”). The software has been used to examine other speech characteristics as well, tallying up nouns and verbs in hundreds of categories to expose buried patterns.
Most recently, Pennebaker and his colleagues used LIWC to analyze the candidates’ speeches and interviews during last fall’s presidential election. The software counts how many times a speaker or author uses words in specific categories, such as emotion or perception, and words that indicate complex cognitive processes. It also tallies up so-called function words such as pronouns, articles, numerals and conjunctions. Within each of these major categories are subsets: Are there more mentions of sad or happy emotions? Does the speaker prefer “I” and “me” to “us” and “we”? LIWC answers these quantitative questions; psychologists must then figure out what the numbers mean. Before LIWC was developed in the mid-1990s, years of psychological research in which people counted words by hand established robust connections between word usage and psychological states or character traits.
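The basic idea of counting words by category can be illustrated with a short Python sketch. It is emphatically not LIWC itself: the four tiny word lists, the category names, and the functions below are hypothetical stand-ins, whereas the actual program relies on much larger dictionaries that are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary, invented for illustration only.
CATEGORIES = {
    "first_person_singular": {"i", "me", "my", "mine"},
    "first_person_plural": {"we", "us", "our", "ours"},
    "articles": {"a", "an", "the"},
    "exclusive": {"but", "except", "without"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(text, categories=CATEGORIES):
    """Return each category's share of all tokens, as a percentage."""
    tokens = tokenize(text)
    if not tokens:
        return {name: 0.0 for name in categories}
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return {name: 100.0 * counts[name] / len(tokens) for name in categories}


sample = "I think we can win, but I need the help of my friends."
for name, rate in category_rates(sample).items():
    print(f"{name}: {rate:.1f}%")
```

Expressing each category as a percentage of all words, rather than as a raw count, makes texts of different lengths comparable. As the article notes, producing such numbers is the easy part; deciding what they mean is left to the psychologist.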
The political candidates, for example, showed clear differences in their speaking styles. John McCain tended to speak directly and personally to his constituency, using a vocabulary that was both emotionally loaded and impulsive. Barack Obama, in contrast, made frequent use of causal relationships, which indicated more complex thought processes. He also tended to be more vague than his Republican rival. Pennebaker’s team has posted a far more in-depth breakdown, including analyses of the vice presidential candidates, at www.wordwatchers.wordpress.com.
Skeptics of LIWC’s usefulness point out that many of these characteristics of McCain’s and Obama’s speeches could be gleaned without the use of a computer program. When the subjects of analysis are not accessible, however, LIWC may provide a unique insight. Such was the case with Pennebaker’s study of al Qaeda communications. In 2007 he and several co-workers, under contract with the FBI, analyzed 58 texts by Osama bin Laden and Ayman al-Zawahiri, bin Laden’s second in command.
The comparison showed how much pronouns are able to disclose. For example, between 2004 and 2006 the frequency with which al-Zawahiri used the word “I” tripled, whereas it remained constant in bin Laden’s writings. “Normally, higher rates of ‘I’ words correspond with feelings of insecurity, threat and defensiveness. Closer inspection of his ‘I’ use in context tends to confirm this,” Pennebaker says.
Other studies have shown that words that are used to express balance or nuance (“except,” “but,” and so on) are associated with higher cognitive complexity, better grades and even the truthfulness with which facts are reported. For bin Laden, analysis showed that the thought processes in his texts had reached a higher level over the years, whereas those of his lieutenant had stagnated.
This power of statistical analysis to quantify a person’s changing language use over time is a key advantage of programs such as LIWC. In 2003 Pennebaker and statistician R. Sherlock Campbell, now at Yale University, used a statistical tool called latent semantic analysis (LSA) to study the diary entries of trauma patients from three earlier studies, looking for text characteristics that had changed in patients who were convalescing and met rarely with their physician. Again, the researchers showed that content was unimportant. The factor that was most clearly associated with recovery was the use of pronouns. Patients whose writings changed perspective from day to day were less likely to seek medical treatment during the follow-up period.
It may be that patients who describe their situation both from their own viewpoint and from the perspective of others recover more quickly from traumatic experiences—a variation on the already well-established idea that writing about negative experiences is therapeutic. Or perhaps the LSA simply detected the patients’ recovery as reflected by their writing but not brought about by it—in that case, programs such as LIWC could aid doctors in diagnosing illness and gauging treatment progression. Researchers are currently investigating many other patient groups, including those with cancer, mental illness and suicidal tendencies, using LIWC to uncover clues about their emotional well-being and their mental state.
Although the statistical study of language is relatively young, it is clear that analyzing patterns of word use and writing style can lead to insights that would otherwise remain hidden. Because these tools offer predictions based on probability, however, such insights will never be definitive. “In the final analysis, our situation is much like that of economists,” Pennebaker says. “It’s too early to come up with a standardized analysis. But at the end of the day, we all are making educated guesses, the same way economists can understand, explain and predict economic ups and downs.”
He Said, She Said
The way we write and speak can reveal volumes about our identity and character. Here is a sampling of the many variables that can be detected in our use of style-related words such as pronouns and articles:
- Gender: In general, women tend to use more pronouns and references to other people. Men are more likely to use articles, prepositions and big words.
- Age: As people get older, they typically refer to themselves less, use more positive-emotion words and fewer negative-emotion words, and use more future-tense verbs and fewer past-tense verbs.
- Honesty: When telling the truth, people are more likely to use first-person singular pronouns such as “I.” They also use exclusive words such as “except” and “but.” These words may indicate that a person is making a distinction between what they did do and what they did not do—liars often do not deal well with such complex constructions.
- Depression and suicide risk: Public figures and published poets use more first-person singular pronouns when they are depressed or suicidal, possibly indicating excessive self-absorption and social isolation.
- Reaction to trauma: In the days and weeks after a cultural upheaval, people use “I” less and “we” more, suggesting a social bonding effect.
Note: This article was originally printed with the title, "You Are What You Say."