Research Design and Research Strategies  

Jeffrey C. Johnson

We need a powerful mode of argumentation, a mode that ensures we can represent our representations in credible ways. In such worlds, a systematic argument enjoys a star-spangled legitimacy. We need a way to argue what we know based on the process by which we came to know it. That's what I seek, not as the only possible representation that our field can offer, but as an essential lever to try and move the world. Michael A. Agar (1996:13)  

Introduction  

In a complex world of competing arguments, who is to be believed or trusted? Are data themselves, independently of how they were conceived and collected, proper evidence for making a case? Although some may be swayed by the elegance of a well-written essay, for many it's crucial to know something about the author, his or her motivations, experiences, skills, methods of investigation, and so on before passing judgment on the conclusions. In Agar's statement above, we get the impression that a credible argument should be systematic and based on a process that informs us about how researchers came to know what they know.

It is the articulation of this "process by which we came to know it" that reflects the elements of research design. For Stinchcombe (1987:23), the observations produced by how a study was designed are fundamental to the proper assessment of empirical evidence: "We always want to reject evidence if it can be explained by the design of the research or by a large number of small, unorganized causes." Some things, like perceptual errors, that hinder our observation may be beyond our control. Some things, like site selection, sampling, measurement, and recording, are at least partly within our control. The value of empirical evidence can only be properly evaluated by understanding the details of how the research was conducted.

According to Pelto and Pelto (1978:291): "Research design involves combining the essentials of investigation into an effective problem-solving sequence. Thus the plan of research is a statement that concentrates on the components that must be present in order for the objectives of the study to be realized." This statement illustrates at least two important elements of research design.

First, research design involves an a priori plan or strategy for all phases of the research (such as data collection and analysis) including, for some researchers, the production of the final product (like an ethnography). By definition, a plan cannot deal with the unanticipated or unknown realities of research, such as tragedies or acts of nature that disrupt fieldwork. A good understanding of the research problem and the research site allows us to plan for some contingencies, but there is no research design crystal ball. In fact, chance factors often lead to great discoveries or unexpected findings. Still, while luck plays a role in research, planning for such luck is not within the realm of research design (Kirk and Miller 1986).

Second, an idealized plan gives guidelines for linking theory to the methods of data collection and analysis that yield either valid or "defensible" results. I use "defensible" in addition to "valid," which I normally use, to make readers aware that I am broadening the traditional application of research design to include the variety of research strategies found in anthropology today. Interpretive, hermeneutic, and postmodern approaches make little explicit reference to ethnographic design issues, but well-written examples from ethnography may provide "moral evidence" to deal with current social problems, moving people (including politicians) in ways that numerical facts can't (Seidman 1994:134). Nevertheless, a well-articulated project design helps "to promote the effective conduct of research," whether one starts from a positivist or humanist perspective (Ellen 1984:158).

On a practical level, good research design is essential in the competition for research grants and contracts. There is much variation in what funding agencies and foundations expect regarding research design. One agency may require a detailed description of the proposed project, paying attention to the research design logic of science (for example, validity, reliability, hypotheses, etc.; see also Plattner [in press]); others may require a description of the research problem and site but require less detail about the methods of data collection and analysis. All funding agencies expect a well-organized outline of the proposed project, one that meets the design expectations of peer reviewers and agency personnel.

A distinction needs to be made between what's sometimes called the laundry-list component of research and research design. The laundry-list component is important. It involves details about getting into and out of the field situation, travel arrangements, getting proper government permissions, making contacts at the field site, arranging for living accommodations, and so on. Design, on the other hand, involves the methodological and analytical details that contribute to the credibility, validity, believability, or plausibility of any study. In this chapter I concentrate on elements of design related to the production of valid results or a believable ethnographic account.

The Need for Design

Evidence for the power of research design is all around us. The invention of the simple control/treatment design of clinical trials allowed researchers in this century to evaluate competing therapies and to select the ones that worked best. One result is that infectious childhood diseases that killed thousands of young people a century ago are today only a memory in industrialized countries. The lessons learned from controlled experimentation are applied today to the policy arena where groups are in conflict over resources or because of social inequalities (Johnson and Pollnac 1989; Porter 1995). Members of such competing groups (such as large-scale commercial producers, commodity producers, environmental groups, and real estate developers) believe strongly in their positions. They have evidence, often anecdotal, that their positions are credible. Without some unbiased means for assessing the evidence, the truth is only a matter of who has the most political clout.

The outcry for a ban on nets in tuna fishing is a famous recent example. Environmental organizations launched campaigns to ban nets in tuna fishing because dolphins are often caught incidentally in that fishery. Media campaigns in the U.S. showing pictures of dolphins being caught in nets (generally not in U.S. waters) contributed to Florida's totally banning fishing nets, even though no marine mammals were threatened by the use of nets in Florida waters. Thus, policy emerges from interactions between groups of differing political, ideological, social, and economic backgrounds.

There has been similar concern over the incidental catch of harbor porpoises by net fishers in New England (Schneider 1996). This case led to a systematic test of a technology that might ameliorate the problem. Wildlife conservationists petitioned the U.S. federal government in 1991 to declare harbor porpoises a threatened species. In response, the fishing industry proposed the voluntary use of "pingers" (underwater acoustic devices) to keep porpoises from their nets. The effectiveness of the device, however, was in question, and there was no firm evidence in the literature about it. Fishers petitioned the federal government to fund a study of pinger effectiveness. The study used the classic control/treatment design in which catch rates for a set of nets with pingers were compared to catch rates for a set of nets without pingers.

In the first experiment, the control net caught 10 porpoises while the treatment net caught none. Some conservationist groups claimed the study was biased in that the treatment nets were placed in areas known not to have large numbers of porpoises. So another study was conducted placing experimental treatment and control nets in the same proximity. This time, the treatment net caught only 1 porpoise while the control net caught 32. Some environmental groups were still concerned that evidence with more statistical power was needed. Lobbying efforts by fishers yielded more funds for a larger, more comprehensive study involving more than 10,000 fishing nets. Both control and treatment nets were outfitted with pingers, but only the pingers on treatment nets would activate once placed in the water. Thus, fishers were blind as to which nets were control and which were treatment, a classic double-blind experimental design. Again the evidence was impressive: The treatment nets caught 2 porpoises (1 was thought to be deaf), while the control nets caught 25.
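To make the idea of weighing such counts concrete, the comparison can be sketched as a simple one-sided sign test. This is only an illustration and not the analysis used in the studies themselves: it assumes (the text does not say so) that equal numbers of comparable treatment and control nets were fished, so that if pingers had no effect each porpoise caught would be equally likely to turn up in either kind of net. The function name and the dictionary of studies are hypothetical labels introduced here.

```python
from math import comb


def one_sided_sign_test(treatment_catch: int, control_catch: int) -> float:
    """Chance of seeing this few (or fewer) porpoises in treatment nets
    if pingers made no difference.

    Assumption (not stated in the text): equal numbers of comparable
    treatment and control nets, so each caught porpoise lands in either
    kind of net with probability 1/2 under the null hypothesis.
    """
    total = treatment_catch + control_catch
    # Number of ways to assign at most `treatment_catch` of the caught
    # porpoises to treatment nets, out of 2**total equally likely splits.
    favorable = sum(comb(total, k) for k in range(treatment_catch + 1))
    return favorable / 2 ** total


# Counts as reported in the text: (treatment catch, control catch).
studies = {
    "first experiment": (0, 10),
    "second experiment": (1, 32),
    "large blinded study": (2, 25),
}

for name, (treated, control) in studies.items():
    p_value = one_sided_sign_test(treated, control)
    print(f"{name}: p = {p_value:.2e}")
```

Under these assumptions all three results would be very unlikely to arise by chance. A calculation of this kind does nothing, however, to answer the objection raised against the first experiment, that the treatment nets were placed where porpoises were scarce; that threat is dealt with by the design (nets in the same proximity, blinded fishers), not by the arithmetic.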

The issue is still under debate, but this series of studies illustrates how the elements of research design help muster evidence in light of competing beliefs and philosophies. In each successive study, investigators tried to control for as many extraneous variables as possible so that the hypothesized effect could be assessed (that is, the effectiveness of pingers compared to not using pingers). The logic of the research design contributed to the production of credible results.

Although the power of experimental design is evident, concern for its application in anthropology, particularly cultural anthropology, has been limited. Some early exceptions include Brim and Spain's (1974) book on hypothesis-testing designs, Pelto and Pelto's (1978) book on research methodology in cultural anthropology, and Naroll and Cohen's (1973) A Handbook of Method in Cultural Anthropology, which has several chapters that address issues in research design (LeVine 1973; Sechrest 1973; Spindler and Goldschmidt 1973). Bernard (1994) has elaborated in more detail on issues of design, but his treatment is necessarily limited, given his task of describing the range of methods available to anthropologists.

If research design gets relatively little attention from anthropologists, other social scientists have written volumes about it. What should we make of this apparent dearth of specific treatments of research design in cultural anthropology? I don't think we should make too much of it because the important elements of research design (reliability, informant accuracy, validity, objectivity, and operationalization of theoretical concepts) have been present in the writings of cultural anthropologists even before Boas.

Boas, Malinowski, and Research Design in the Scientific Tradition

Boas and most of his students advocated a natural science logic in the collection of ethnographic materials and a true concern for the collection of reliable data that could lead to the production of valid theory. Yet, despite his concern for scientific method, Boas was more explicit about his methods of data analysis than about his methods of fieldwork and data collection (Ellen 1984; Boas 1920). Malinowski was also concerned with the aims of science and with methodological rigor. His earliest contributions, however, were more a demonstration of the value of ethnographic writing (his "unusual literary sense" [Lowie 1937:231]) than of the methodological details of proper ethnographic fieldwork (Ellen 1984).

A good example of this tension between the stated early concerns for the methods of science and the actual use of such methods in ethnography comes from correspondence between Boas and his student Margaret Mead during her first fieldwork in Samoa. As Orans (1996) describes it, Mead wrote to Boas with her concerns about possible violations of scientific principles in the data she had collected to that point. She wrote of her doubts about the comparability of cases and about her ability, or even the need, to do a quantitative comparison of the similarity of attitudes among the adolescent girls in her study. She had concerns (and I believe she thought her mentor, Boas, would feel similarly) as to whether a valid comparison of this type could be made given the selection process for her sample of girls.

The constraints of field research may lead one to stray from the idealized prescriptions of a research design, but Mead was attempting to exert her authority without necessarily following the research procedures advocated by Boas and others. Orans says: "What she wants is permission to present data simply as 'illustrative material' for the representativeness of which one will simply have to take her word" (p. 127). What is most surprising is Boas's response to Mead. He writes:

I am very decidedly of the opinion that a statistical treatment of such intricate behavior as the one that you are studying, will not have very much meaning and that the characterization of a selected number of cases must necessarily be the material with which you operate. Statistical work will require the tearing out of its natural setting, some particular aspects of behavior which, without that setting may have no meaning whatever. A complete elimination of the subjective use of the investigator is of course quite impossible in a matter of this kind but undoubtedly you will try to overcome this so far as that is all possible. (from Orans 1996:128)

This response is important for at least two reasons. First, it demonstrates the differences between the stated scientific objectives of ethnographic work as advocated by Boas and the actual practice of ethnographic research. There appears to be a perception that a systematic treatment of the data will have to be abandoned to preserve context and meaning. Ironically, this concern for context and meaning over methodological rigor, particularly for those in search of theoretical foundations (that is, the Boasian idea of data leading to the construction of theory), would ultimately hinder the comparability of data from different ethnographic sources (see Moran [1995] for a recent discussion of this issue and see Ember and Ember, this volume).

Second, Boas's concern for contextual meaning over the statistical analysis of data was prophetic. Rightly or wrongly, the preeminence of contextualization has been a consistent issue in ethnographic research and has often clouded issues in research design. Two things have impeded the development of well-delineated research strategies in anthropology: the idea that quantification detracts from context and meaning in the ethnographic endeavor (evident even in the time of Boas), and a failure to understand that systematic methods, whether quantitative or qualitative, help minimize the subjectivity of the investigator.

It's tempting to explain this as the consequence of the intensely personal nature of fieldwork and the complexity of a holistic approach. However, this debate has its parallel in sociology, where schools such as ethnomethodology and symbolic interactionism developed in response to the largely quantitative macro-level focus of the discipline. These micro-level approaches are attempts to get at a better understanding of meaning in everyday life (Cook 1994).

Boas's final sentence in his response to Mead illustrates that even at this early stage the issue of the subjectivity of ethnographic research was of concern. There was a faith, however, that awareness of the potential biases associated with the subjectivity of the investigator could be dealt with in some reasonable way. A further irony is that the one thing that might have lessened potential subjectivity biases, the use of standardized methods, was rejected outright because meaning might be compromised. Mead's position on these various elements of research design provided fuel for the continuing discussions about the validity of her original findings (Brim and Spain 1974; Freeman 1983; Orans 1996).

Thus, while early British and U.S. anthropologists advocated the scientific method in ethnographic research, there is little evidence that they considered appropriate design issues when they actually did the research. As Urry (1984) sees it:

In Britain the claims that anthropology not only studied a distinctive body of data but also that it possessed a sophisticated methodology to collect these data, was an important factor in the establishment of anthropology as a discipline. This was less necessary in America where, by the late nineteenth century, anthropology was already established in universities, museums and government agencies. But in spite of claims to scientific methodology, particularly in the British tradition, there are surprisingly few details about actual methods anthropologists used in the field, beyond a few first principles and illustrative anecdotes. There was a wide belief among British anthropologists that fieldwork could not be taught to new recruits, but could only be experienced by individuals in the field. In the American tradition texts provided what was regarded as an objective body of data, whereas the British tradition was more a matter of subjective experience. It is a strange paradox in the development of field methods that the scientific study of other cultures has been built upon such a foundation. (p. 61)

There is much anecdotal evidence for a belief, across the British and U.S. traditions, in a trial-by-fire method of training for ethnographers. This belief supports the current lack of formal training in methods and research design in anthropology. Agar (1980) and Bernard (1994) relate stories about Kroeber's recommendations regarding the teaching and conduct of ethnographic research. In the stories, one concerning Wagley's teaching of a field methods course and one concerning a graduate student at Berkeley asking for advice before going to the field, Kroeber's response was a terse one-liner that reflected the attitude of the times. Even in the late 1960s, when concern for methodological rigor was probably at its peak in anthropology, many treatments of research methods and design in the literature played down the need for more systematic methods and design detail, particularly with respect to hypothesis-testing approaches (LeVine 1973). A good example of this is a book by Thomas Rhys Williams (1967) published in the Spindlers' series on field methods. Williams writes:

I believe that only someone wholly involved and fully immersed in fieldwork can really communicate the essence of cultural anthropology to students or general readers. And since I have indicated here that research in culture involves a great deal of unique personal experience for the anthropologist, I have taken the position that it is probably unlikely there can be a rigorous, systematic, and formal presentation of methods in the study of culture like those of the natural sciences and that there are overriding concerns among many sociologists, psychologists, and economists. I find this stance comfortable, for it is my conviction that so long as prime theoretical concerns in the study of culture are an attempt to record and understand the native's view of his culture and the objective and historical realities of culture, then methods for field study will have to reflect the end purpose of making a whole account of a part of the human experience. (pp. 64-65)

LeVine (1973) and others (Johnson 1990) make the point that the nature of fieldwork, in terms of its requisite huge investments in time and geographical focus, has often limited the attractiveness of more formal research designs because of its commitment to studying specific problems in a specific way. The realities of fieldwork often dictate the need to change the problem focus or, finding that the proposed hypotheses are inappropriate to the cultural setting under study, the need to somehow salvage the research.

Laboratory and survey researchers have some flexibility to change the problem focus and study populations in light of emerging problems, but field workers are limited in their ability to do so. Thus, the idea of researchers "putting all their eggs in one basket" may have limited the a priori formulation of problems in fieldwork (LeVine 1973:184). Further, the huge investment in time and resources limited another important goal of science, that of replication, since an ethnographer couldn't realistically be expected to replicate someone else's work. The "my natives" or "my village" mentality of some, and the fact that careers were made by discovering new theories or describing exotic, less well-known cultures, have certainly inhibited replication efforts (Johnson 1990).

Contemporary Design Issues in Cultural Anthropology

There is an ongoing debate in cultural anthropology concerning science and its role in contemporary research. A discussion of the basic arguments as related to epistemology, objectivity, reality, authority, and the like is beyond the scope of this chapter (see Schweizer in this volume). Suffice it to say that traditionally, research design and its logic have been associated with science and an underlying belief in objectivity and explanation. The historical tension between interpretive and scientific approaches in anthropology has given way to an outright rejection by some anthropologists of science and its logic of design. To say that the research design logic of science has been replaced by something that is recognizable as the research design logic of, say, postmodernism would, I think, be misleading. It is not that interpretive approaches lack some form of research plan; but the term "design" itself smacks of the very formalism that is being rejected. A more appropriate term that would encompass the diversity currently found in cultural anthropology might be "research strategy."

Figure 1 is a taxonomic characterization of the different types of research strategies found in contemporary cultural anthropology. The figure distinguishes between strategies within the realm of interpretive studies and those using systematic strategies that have more of the elements of science. This is a highly simplified representation. Many examples of research in anthropology fall between the two extremes of the continuum. Under the systematic distinction are the two primary categories of exploratory and explanatory approaches, each entailing a specific design strategy. The light line connecting the two categories indicates their complementarity and interrelatedness in that a design may include both within an overall research design framework. These approaches are by no means mutually exclusive in approaching a research problem (see section on Research Design in Systematic Research, below).

In their most extreme form, systematic strategies tend to involve the search for explanations of phenomena and the pursuit of theoretical foundations. In searching for such foundations, there is a need for objectivity, replication, and control over possible sources of error leading to a valid assessment of a given theory. Epistemologically, systematic work is objectivist. Its practitioners are ultimately interested in research findings that approximate an external truth. As a result, the assessment of any theory involves research designs more heavily concerned with the means (the research process, rather than simply the way the study was written or argued), since the validity of study results depends on the scientific soundness of the research design. For any given research problem, it is the purpose of research design to ward off as many threats to validity as possible. This leads to designs that involve concern for a higher degree of methodological and analytical detail, whether quantitative or qualitative. In this line of thinking, the researcher is a field-worker-as-writer.

 Figure 1. Types of anthropological research strategies and their features.

Exploratory: Exploratory approaches are used to develop hypotheses and more generally to make probes for circumscription, description, and interpretation of less well-understood topics. This is similar to the grounded theory ideas of Glaser and Strauss (1967), where exploratory descriptive research leads to the development of more meaningful theory and measures. Exploratory research can be the primary focus of a given design or just one of many components.

Explanatory: Explanatory approaches generally involve testing elements of theory that may already have been proposed in the literature or that have been informed by exploratory research. Research designs in this mode are determined a priori and their primary purpose is to eliminate threats to validity, where validity is concerned with whether things are what they appear to be or are the best approximation to the truth (Cook and Campbell 1979). In this enterprise, explanation can involve a general search for causality or prediction.

Interpretive strategies, on the other hand, differ from systematic approaches in that they question a researcher's ability to maintain objectivity, particularly in the ethnographic context where the ethnographer is often the instrument of measurement. A variety of names are used in the lexicon of social scientists that can be associated to varying degrees with an interpretive strategy. Phenomenology, hermeneutics, symbolic anthropology, interpretive anthropology, interpretive interactionism, deconstructionism, postmodernism, and constructivism, to name a few, question, in one way or another, some or all of the ontology, epistemology, and methodology of systematic approaches. Although some of the older interpretive strategies that emerged from the scientific tradition in the social sciences, such as early interpretive anthropology, still adhered to some logical empiricist methodology and maintained a degree of belief in ethnographic authority, more recent approaches, such as postmodernism and constructivism, are more radical in their sweeping rejection of scientific method and design logic (see Schwandt 1994). In contrasting Geertz and early interpretive anthropology with some of the later postmodern turns of such ethnographic writers as James Clifford, Rabinow (1986) observes:

At first glance James Clifford's work, like that of others in this volume, seems to follow naturally in the wake of Geertz's interpretive turn. There is, however, a major difference. Geertz (like the other anthropologists) is still directing his efforts to reinvent an anthropological science with the help of textual mediations. The core activity is still social description of the other, however modified by new conceptions of discourse, author, or text. The other for Clifford is the anthropological representation of the other. This means that Clifford is simultaneously more firmly in control of his project and more parasitical. He can invent his questions with few constraints; he must constantly feed off others' texts. (p. 242)

There is a fundamental belief that the intersubjective, everyday meanings and how they are produced, maintained, and changed in any given context often defy objective study and explanation. Practitioners of almost all interpretive paradigms are searching in one way or another for some understanding (verstehen) rather than for some explanation of social phenomena. However, some interpretive work is more similar in nature to the exploratory or descriptive strategies found under the systematic side of Figure 1 than to some of the more radical forays into, for example, postmodernism. Thus, the rather simple characterization of research strategies found in Figure 1 attempts to recognize the variation inherent in the range of work found in contemporary anthropology by placing "interpretive anthropology" adjacent to "exploratory/descriptive" (see, for example, the work of Zabusky 1995). Discussions about this debate can be found in Seidman (1994), on the one hand, and Faia (1993), on the other, and, more specifically for anthropology, by Kuznar (1997).

An important implication here is that scholars who follow this line of inquiry are searching for local rationales rather than nomothetic theory or universal foundations and may be more interested in conveying a moral tale of some type rather than a value-free account (Seidman 1994). Further, the purpose of research strategies under these interpretive paradigms is more focused on the production of a believable or plausible account or story rather than a single depiction of the truth, since it is thought that there are a multitude of plausible accounts rather than just a single true story. Epistemologically, interpretive paradigms are subjective, with findings that are value mediated or even created. Thus, there is less focus on the means of research, such as methods of data collection and analysis as found in the systematic strategies, and more on the ends of research: the ethnographic or literary product. In contrast to the field-worker-as-writer, we find the writer-as-field-worker (Denzin and Lincoln 1994).

For scholars like Geertz, analysis of ethnography has less to do with the methods of observation and description than the inscriptions and writings concerning the meaning of human action. In many ways, this blurs the distinction between what is anthropological and what is literary. More extreme forays into experimental ethnography have blurred this distinction even further, and there is more of a focus on writing strategies that include such approaches as montages, evocative representations, polyvocal texts, and even ethnographic fictions (Denzin and Lincoln 1994). While systematic analytical paradigms are primarily concerned with threats to validity, recent interpretive paradigms are focused more on threats to believability, as in "Do you believe my story?" (Tyler 1991:85), or, in critical theory, threats to trustworthiness (Kincheloe and McLaren 1994). If we talk of an interpretive method, particularly with regard to postmodernism, it more than likely involves both the researcher's immersion into the cultural context of the actor(s) and some means, usually literary, for conveying the understanding gained from such an immersion.

As stated, many interpretive studies are closer in character to exploratory and descriptive research in the systematic mode than to some of the more extreme postmodern studies. A good example of this is Zabusky's (1995) ethnographic study of cooperation in European space science that she admits "took the form of mutual exploration rather than unidirectional examination" (p. 46). She contrasts her study with research on cooperation by "experimental" psychologists, emphasizing the cultural and social orientation of her work and the importance of considering context (social, cultural, political, etc.) in her analysis. Following in the "thick description" tradition of Geertz, Zabusky clearly believes in some kind of ethnographic authority. In a short methodology section, she discusses the challenge of conducting participant observation research in this rather complex, geographically dispersed, cross-cultural setting. She also discusses the rationales for selecting the site and the group she studied, problems of working in a linguistically and technically diverse social milieu, the use of semistructured and unstructured interviews, and the effect of her role as ethnographer on informant relations and data quality. Although Zabusky doesn't talk specifically about design or about concerns for potential threats to validity, there is implicit concern for such issues throughout the ethnography.

In contrast to Zabusky, there is a body of interpretive work in anthropology that is more extreme in its rejection of systematic design issues. Ramos (1995), for example, has recently published an ethnography based on a rewrite of her 1972 dissertation, with additional ethnographic insights. She rejects the "anthropological austerity" of her original work in favor of an "intersubjective understanding" that captures the "flavor" of her ethnographic encounter with the Yanomami. To her, the original work was "old-fashioned and theoretically unsophisticated" and had to be replaced by a more reflexive work. This contrast between the old and the new reflects the increased variation in epistemological emphasis in the field that has developed over the last 30 years. As Ramos sees it, "I found myself making forays into the self-conscious meanderings of reflexive anthropology in order to shift the axis of analysis from the skeleton-like dissertation to the flesh and blood of ethnography" (p. 6).

 Along with this shift came the freedom not to be concerned with issues of bias and validity or with the need for working systematically, thus allowing for a less restrictive ethnographic narrative. Although Ramos discusses informant interviewing and various sources of data, her introduction is largely devoted to discussions of her reliance on her own memory in writing the ethnography and the shift in the narrative between synchrony and diachrony. Thus, there is little discussion of research design and methods of data collection as might be found in work in the systematic tradition . Instead, Ramos emphasizes the emergent and reflexive nature of data and the literary strategies used in producing the ethnographic product. Other examples in this vein include Panourgia's (1995) use of we and they in her "Athenian Anthropography" and Behar's (1993) use of montage in her collaboration with a single woman in the telling of that woman's life story. Behar discusses the multiplexity of roles, in that she was variously involved as "priest, interviewer, collector, transcriber, translator, analyst, academic, connoisseur, editor, and peddler" (p. 12).

 The idea of a montage as an organizing principle was also central to Taussig's (1987) historical and ethnographic account of shamanism, colonialism, and terror in South America. This work is important in at least two ways. First, it is representative of the genre that rejects explanation in favor of conveying a moral tale. Its purpose is not a traditional attempt at explanation where facts are considered real, but political interpretation and representation of facts, independent of their "realness. " Second, Taussig uses the "principle of montage" as a means, at least in his view, for better relating the lessons of history. As he states:  

As against the magic of academic rituals of explanation which, with their alchemical promise of yielding system from chaos, do nothing to ruffle the placid surface of this natural order, I choose to work with a different conflation of modernism and the primitivism it conjures into life-namely the carrying over into history of the principle of montage, as I learned that principle not only from terror, but from Putumayo shamanism with its adroit, albeit unconscious, use of the magic of history and its healing power. (p. xiv)

These examples offer only a brief glimpse of the range of possible strategies in use by interpretivists in anthropology . For some, interpretive work is an exploratory enterprise with an implicit concern for methodological issues. For others, interpretive work is concerned more with the strategies and methods of ethnographic presentation and with the reflexive character of the ethnographic enterprise. Thus, traditional methods sections are replaced by discussions on how to read the work or on the particular methods used in writing the ethnography itself (see, for example, Panourgia's discussion on the use of the parerga).

 In the following pages, I focus primarily on research designs in systematic research . For further discussion of research strategies in the interpretive mode, see Fernandez and Herzfeld (this volume).

 Research Design in Systematic Research: The Challenge of Making a Case

In some social science disciplines, like psychology, the design of research is driven by features of the analysis. Analysis-of-variance models and multigroup comparisons (factorial designs) may dictate the whos, whats, and wheres of a given project. In sociology, multiple regression models, structural equation models, and path analytic models (all related analytical techniques) have influenced the design of survey research . Ethnography , referred to as the anthropological method by William Foote Whyte (1984), has influenced the nature of design in anthropology , but in profoundly different ways.

 Whereas the analytical techniques most often used in psychology, sociology, and economics have often led to rather standard designs, in anthropology the eclectic nature of ethnography leaves the design of research more open ended. There are generally no ethnographic "analytical techniques" driving the design, although ethnography has been variously associated with a number of qualitative methods. The good news is that ethnographic research is amenable to a wide range of research designs, including the use of multiple designs within a single ethnographic context. This allows for flexibility, multiple tests of a theory, increased chances for various types of validity, triangulation, and the potential for high levels of innovation and creativity. The bad news is that the open-ended character of ethnography contributes to a less well-focused discussion of research design issues in ethnographic approaches. Part of the confusion stems from a lack of consensus on what ethnography really is (Johnson 1990). To some, it is both a process and a product (Van Maanen 1988). Although this process might be equated to a method, it's better to think of ethnography as a strategy in which a variety of methods can be used in the quest for knowledge (Pelto and Pelto 1978). Thus, ethnography should involve multiple methods, both qualitative and quantitative, and may involve applying more than one research design. This is particularly true today, given the large number of computer analytical packages available for analyzing text (see Bernard and Ryan, this volume). Currently, the qualitative analysis of text and discourse is no longer restricted to either interpretive or exploratory approaches, but can also be used in hypothesis testing and explanatory research.



 Figure 2 illustrates the relationship between exploratory and explanatory approaches within the ethnographic context. This contrast between explanatory and descriptive or exploratory approaches is commonly made in nonexperimental disciplines in both the natural and social sciences. Community ecologists, for example, similarly distinguish between exploratory or descriptive studies that seek to describe and determine patterns in ecological data and those studies that specifically seek to predict or test hypotheses. As with research in community ecology, ethnographic research can be purely exploratory or descriptive (involving a research process focused on producing better theory) or purely explanatory, although this is usually not the case. Rather, the most common model has exploratory research informing and complementing explanatory research. As we will see in the examples to come, exploratory research is often an essential component of the explanatory research process. Exploratory research may contribute to the production of reliable and valid measures, provide information essential for constructing comparison groups, facilitate construction of structured questions or questionnaires, or provide information necessary for producing a sound probability or nonprobability sample.

 The figure shows that the overall research process is more than just a matter of study design. There is no substitute for a good theory, and there is a critical need to link theory, design, data collection, analysis, and interpretation in a coherent fashion. Design, however, is the foundation of good research. No amount of sophisticated statistics, computer-intensive text analysis, or elegant writing can salvage a poorly designed study. Hurlbert (1984) emphasizes this in a classic paper on the design of field experiments in ecology. "Statistical analysis and interpretation," he says, "are the least critical aspects of experimentation, in that if purely statistical or interpretive errors are made, the data can be reanalyzed. On the other hand, the only complete remedy for design or execution errors is repetition of the experiment" (p. 189). Redoing an experiment because of fundamental design errors is one matter; redoing a year-long ethnographic field study because of such errors is quite another.

 Figure 2 shows that the research process involves a simultaneous concern for the development of empirical statements from theory (for example, hypotheses), the operationalization of theoretical concepts (for example, meaningful and reliable measures), design (for example, groups to be studied), data collection (for example, qualitative versus quantitative), and data analysis (for example, multiple regression and text analysis). Theoretical knowledge is derived either from earlier studies or from exploratory work. The levels at which theoretical concepts are measured (for example, nominal or ordinal), the types of sampling strategies used, and the application of appropriate types of analysis must all be considered as a part of the design. For example, the particular structure of an empirical statement or hypothesis will partially determine the manner in which theoretical concepts are operationalized and eventually analyzed. (Stinchcombe [1987] provides an excellent discussion of how empirical statements are derived from theory.)

Thus, research design is more than just methods of data collection and analysis. It involves constructing a logical plan that links all the elements of research together so as to produce the most valid assessment possible of some theory, given some set of realistic constraints (for example, cost, scope, geographical setting, etc.). The purpose of research design is to ward off as many threats to validity as possible and to help one eliminate competing hypotheses. It requires careful attention to detail and, often, an admission concerning the potential weakness of a given design. Outside the laboratory, a multitude of influences can threaten the validity of any conclusions. In natural settings, particularly fieldwork, there is no perfect design that can control for all possible extraneous effects at once. A recognition of limitations doesn't invalidate a study's results. Rather, it creates an open forum that can contribute much to important theoretical and methodological debates. Without such attention to good design and methodological detail, researchers leave themselves open to one of the worst criticisms of all: that of being "not even wrong" (Orans 1996). In other words, a lack of design and methodological detail makes it next to impossible to fairly and adequately assess the validity of any study's conclusions, such that "rightness" or "wrongness" may not even be debatable.

 True experiments involve random assignment and afford the best chances for controlling for things like: the effects of extraneous factors (that is, unmeasured variables that might affect the dependent variable); the effects of selection (that is, comparison groups differ because of the way they were selected and not due to the treatment); the effects of reactive measurement (that is, the measurement procedure itself caused a change in the dependent variable); or interaction effects involving selection (that is, when selection interacts with other factors to create erroneous findings). These and other sources of error are all potential rival hypotheses, and randomized experiments are best at eliminating the threats of rival explanations. Designs of this type, however, are often impossible in anthropological fieldwork. Nevertheless, the principles of experimentation are instructive and are a guide for understanding potential sources of error, even in a nonlaboratory setting. I borrow terminology from Kleinbaum et al. (1982) in constructing a typology of research designs. Included are experiments, quasi-experiments, observational study designs, and what I refer to as natural experiments.

 Experiments involve the random allocation of subjects to groups and afford the most control over distorting effects from extraneous factors. Random allocation produces equivalent comparison groups, and artificial manipulation of independent variables (also known as explanatory variables or study factors), with all other variables or factors controlled for, allows for the most valid assessment of the causal relationship between the independent and dependent variables or response variables. What separates quasi-experiments from true experiments is the lack of random assignment of group members. Random assignment maximizes the probability that experimental groups are equivalent on key variables prior to the introduction of an intervention. Nonrandom assignment lays an experiment open to validity threats and reduces our ability to make causal inferences. Observational studies involve neither random assignment of members to comparison groups nor the manipulation by the observer of independent variables.

 This distinction between experimental and observational approaches is similar to one in ecological field studies . Hurlbert (1984) distinguishes between two classes of experiments . He terms the first manipulative experiments . These are basically true experiments involving random assignment, multiple comparisons (for example, treatment versus control), and the manipulation of independent variables . He refers to the second as mensurative experiments , which involve simply the measurement of variables in space and time and among a number of comparison groups, without random allocation and the manipulation of experimental factors.

 The primary distinction is between sampling and allocation. In manipulative experiments, analytical units are randomly allocated to comparative groups, whereas in mensurative experiments selection of units is based on some probability or nonprobability sampling scheme. While random assignment aids in controlling for confounding variables by producing homogeneous comparative groups, random sampling of units produces comparison groups that are representative of such groups in the population. Random sampling meets the restrictions of some statistical tests, but it does not afford the same protection as does random assignment of group members against the potential effects of extraneous factors. Mensurative designs, then, are observational and characteristic of the types of comparative designs found in field studies in anthropology.
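The difference between sampling and allocation can be made concrete with a short sketch. The Python below is illustrative only: the function names, the frame of 200 household IDs, and the group sizes are all assumptions introduced here, not taken from any study discussed in this chapter. The first function draws units from a population (the mensurative logic); the second randomly splits units already in hand into two comparison groups (the manipulative logic).

```python
import random

def random_sample(population, n, seed=None):
    """Sampling: draw n distinct units at random FROM a larger population."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def random_allocation(units, seed=None):
    """Allocation: randomly split the units already in hand into two groups."""
    rng = random.Random(seed)
    shuffled = list(units)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical sampling frame: 200 household IDs (assumed for illustration).
households = list(range(1, 201))

# Step 1 (sampling) bears on whom the results can be generalized to.
sampled = random_sample(households, 20, seed=1)

# Step 2 (allocation) bears on whether the groups are comparable before treatment.
treatment, control = random_allocation(sampled, seed=2)
```

A study can do either step without the other. A mensurative design typically stops after the first step and compares groups as they are found, which is why it offers less protection against extraneous factors.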

 Finally, natural experiments are similar to quasi -experiments except that the manipulation of independent variables occurs naturally or is unplanned rather than artificial or directed. Thus, comparison groups may be chosen on the basis of different levels of exposure to some naturally occurring or human-induced phenomena (for example, natural disaster, war, or the building of a dam). Cook and Campbell (1979) make a similar distinction but refer to these kinds of natural experiments as "passive-observational studies . " Anthropologists involved in development and evaluation research are most likely to use this design.

 True experiments are, of course, rare in anthropology (but see Harris et al. [1993] for an example of a true experiment in a field setting). Even in quasi-experiments, it's often difficult to manipulate independent variables directly. However, with careful attention to design and ethnographic context, quasi-experimental and natural experimental designs can be applied to anthropological field settings, particularly in evaluation research and development research. Johnson and Murray (1997), for example, used a quasi-experimental design to evaluate the use of fish aggregation devices (FADS) in small-scale fisheries development projects. Two fixed fishing structures (piers) were pretested for differences in catch rates. Then, FADS, umbrella-like units suspended in the water column, were alternately placed at the piers, and individual fishers were interviewed simultaneously during randomly selected times at both the treatment pier (the pier with the FADS) and the control pier (the pier without the FADS). Johnson and Murray then determined and compared catch rates at the two piers.

 From a statistical standpoint, designs that don't involve random assignment, including quasi-experiments, are considered observational (Cook and Campbell 1979). It is important, though, to contrast quasi-experiments with what Kleinbaum et al. (1982) refer to as observational studies. The most common designs used traditionally by anthropologists have been observational in nature. Designs of this type lack direct control over independent variables and, thus, have more potential problems with various types of internal validity and with the ability to assess time order effects and causality. However, if done properly, such designs can have increased external validity and generalizability.


 Due to their predominance in anthropology , the examples that follow are comparative observational designs. Most research designs in the explanatory mode, like true experimental designs, are comparative (for example, control versus treatment). Table 1 describes examples from observational and quasi -experimental study designs discussed by Kleinbaum et al. (1982) and Cook and Campbell (1979). More details can be found in these and other sources (for example, Robson 1993). In anthropological fieldwork , these designs and others can be used in tandem to test or explore components of a theory (such as combinations of time series and repeated measures designs particularly applicable to long-term fieldwork ). For example, in their study of preschool children, Johnson et al. (1997) used a cross-sequential design, which involved cross-sectional research on a cohort of children carried out over time.

 When one is interested in explanation, the importance of comparative thinking in ethnographic work cannot be overemphasized. Discussing "common sense knowing" in evaluation research, Campbell (1988) gives an important critique of ethnography. His idea that "to know is to compare" is fundamental to explanatory work in anthropology:

The anthropologists have never studied a school system before. They have been hired after (or just as) the experimental program has got under way, and are inevitably studying a mixture of the old and the new under conditions in which it is easy to make the mistake of attributing to the program results which would have been there anyway. It would help in this if the anthropologists were to spend half of their time studying another school that was similar, except for the new experimental program. This has apparently not been considered. It would also help if the anthropologists were to study the school for a year or two prior to the program evaluation. (This would be hard to schedule, but we might regard the current school ethnographies as pre-studies for new innovations still to come. )  

All knowing is comparative, however phenomenally absolute it appears, and an anthropologist is usually in a very poor position for valid comparison , as their own student experience and their secondhand knowledge of schools involve such different perspectives as to be of little comparative use. (p. 372; emphasis added)  

While the purpose of experimental design is to ward off as many threats to validity as possible, there are several types of validity: face, construct, statistical conclusion, internal, external, etc. In one way or another, various study designs, in combination with other considerations such as the operationalization of theoretical constructs and sampling, are better or worse at dealing with each. Here, I stress the importance of thinking through how validity threats have influenced and will influence observations or data (for a more in-depth discussion of how these types of validity can impact study conclusions, see Cook and Campbell 1979). Potential errors and bias creep in at various steps in the research process. It's your job to contain these errors. In research design, forewarned is forearmed.

 



 Tables 2 and 3 give examples of threats to internal and external validity as discussed in Cook and Campbell (1979) for quasi-experimental designs. Internal validity is concerned with the approximation to the truth within the research setting. External validity is concerned with the approximation to the truth as expanded to other settings, that is, with the generalizability of research findings. The threats in Table 2 deal with extraneous factors that may account for the presence or absence of a hypothesized effect (that is, contrast validity with invalidity). In the quasi-experimental case, this means changes between pre- and posttest, but this way of thinking can be expanded to include hypothesized effects dealing with differences, similarities, or associations, whether diachronic or synchronic. Cook and Campbell (1979) detail how each of the quasi-experimental designs in Table 1 is better or worse at dealing with each of the threats to validity that are found in Tables 2 and 3. For example, the pretest/posttest nonequivalent groups design controls for some internal threats to validity, but it's problematic with respect to controlling for changes due to how group members were selected (selection-maturation), changes due to how individuals were tested (instrumentation), changes due to the selection of individuals with extreme pretest measures leading to regression toward the mean (regression), and changes due to local events not a part of the study (history). Each of these threats may hamper a researcher's ability to assess the contribution of a hypothesized effect to any changes observed. Similarly, threats to external validity, such as problems stemming from biased samples or research in atypical or unique settings, can hamper the generalizability of one's findings. Kleinbaum et al. (1982) offer a similar discussion of the strengths and weaknesses of observational designs in terms of controlling for threats to both internal and external validity.
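Of the threats just listed, regression toward the mean is perhaps the least intuitive, and a small simulation may help. The sketch below is an illustration under stated assumptions, not a reanalysis of any study: every unit is given a stable "true" score plus independent measurement noise at pretest and posttest, and no treatment of any kind is applied. The function name, the cutoff, and the noise levels are all hypothetical choices.

```python
import random
from statistics import mean

def simulate_regression_to_mean(n=5000, cutoff=1.0, seed=0):
    """Return (pretest mean, posttest mean) for units selected on extreme pretest scores.

    Assumed model: observed score = stable true score + independent noise.
    No intervention occurs between pretest and posttest.
    """
    rng = random.Random(seed)
    pre, post = [], []
    for _ in range(n):
        true_score = rng.gauss(0, 1)                 # stable characteristic
        pre.append(true_score + rng.gauss(0, 1))     # pretest = truth + noise
        post.append(true_score + rng.gauss(0, 1))    # posttest = truth + fresh noise
    # Select only the units whose pretest score was extreme (above the cutoff).
    extreme = [i for i in range(n) if pre[i] > cutoff]
    return mean(pre[i] for i in extreme), mean(post[i] for i in extreme)

pre_mean, post_mean = simulate_regression_to_mean()
print(f"selected group, pretest mean:  {pre_mean:.2f}")
print(f"selected group, posttest mean: {post_mean:.2f}")
```

Under these assumptions the selected group's posttest mean falls back toward the overall mean of zero even though nothing was done to the group. A design without a comparable control group could mistake that drop for a treatment effect.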

 Other sources of potential bias include sampling error (that is, chance), nonresponse, the use of imprecise measures , data recording errors, informant inaccuracies, and interviewer effects (see Pelto and Pelto 1978; Bernard 1994). Careful attention to sampling, whether probabilistic (Babbie 1990) or nonprobabilistic (Johnson 1990), is essential. Measurement, operationalization of theoretical concepts, and type of analysis used are other important factors. How reliable are your measures in terms of precision, sensitivity, resolution, and consistency? Are they valid, particularly with respect to accuracy and specificity, in that they are actually measuring what they are intended to measure ? Attention and concern with all the potential sources of error, whether stemming from how the study was designed, how the data were collected (for example, face-to-face interviews or mail-out surveys), or how the data were analyzed (for example, statistical conclusion validity), will help lead to the production of solid evidence.

 Some Comments on Sampling  

Many probability and nonprobability sampling designs are available for any given research problem. These include systematic sampling, stratified random sampling, cluster sampling, and multistage sampling. The selection of any of these designs or the development of some hybrid design depends on the overall design of the research itself. The nature of the groups or characteristics to be compared, in terms of such things as the size of the comparison groups in the overall population, the frequency of characteristics of interest in the population, the availability of a sampling frame, and the ability to identify members of the population (for example, hidden or clandestine populations), influences the choice of a sample design. But it's not always easy to know who or what you want to sample and to know enough about these sampling units to derive a valid sample.

 The selection of units of analysis, whether settings, events, times, households, or people, is important for understanding a variety of internal and external threats to validity, but it is particularly important for increasing external validity. We mostly think of selection in terms of some type of sample units. To generalize to a target population , the sample has to be representative of the population of interest. This is essential if we are to generalize to a whole population and is generally, though not always, a requirement for classical statistical tests .

 When generalization to a target population is the objective, you should strive to define a sampling universe or frame using a selection procedure with known error limits and one that represents the population of interest. This usually entails a random sample of some kind. There is a vast literature on sampling theory and random sampling procedures, including discussions of sample sizes (see, for example, Bernard [1994] for a summary and Babbie [1990] for detailed discussion of sampling issues).
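As one illustration of drawing a random sample from a defined frame, the sketch below implements a proportionate stratified random sample in Python. It is a minimal example under assumptions made here for illustration: the frame of 100 households, the two strata ("coastal" and "inland"), the 10 percent sampling fraction, and the function name are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Draw the same fraction of units, at random, from every stratum in the frame."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):                      # fixed order for reproducibility
        members = strata[name]
        k = max(1, round(len(members) * fraction))   # at least one unit per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical sampling frame: (household_id, zone) pairs, 60 coastal and 40 inland.
frame = [(i, "coastal" if i <= 60 else "inland") for i in range(1, 101)]

sample = stratified_sample(frame, stratum_of=lambda u: u[1], fraction=0.1, seed=42)
```

Because each stratum contributes in proportion to its size, the sample here contains six coastal and four inland households. Recording the frame, the strata, the fraction, and the seed is one simple way of being explicit about how units were chosen.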

 Cook and Campbell (1979) discuss two sampling models for increasing external validity in quasi -experiments . These models don't necessarily involve random selection and are consequently less powerful than are random samples. In one approach, the model of deliberate sampling for heterogeneity, target classes of units, whether classes or categories of persons, places, times, or events, are deliberately chosen to represent the range of such classes found in the population . Thus, testing for a treatment effect across a wide range of classes in the set of all possible classes (including both extremes and the modal class) in the population allows the researcher to say something about how the effect holds in a variety of settings. While this might not be generalized to the population as a whole, it does inform the researcher if an effect holds across wide ranging classes within the population . The logic behind this model can be extended beyond the quasi -experimental case to observational studies . Kempton et al. (1996) used a static-group comparative design sampling across a range of groups that varied with respect to their values on environmental issues. Kempton et al. interviewed members of Earth First (a radical environmentalist group) and dry cleaning shop owners (who depend on toxic chemicals for their business).

 For some populations , it may be impossible to develop a sampling frame from which to draw a sample. In these cases, there are a variety of solutions, including intercept sampling, snowball sampling, random walks, quota sampling, and purposive sampling. Each of these approaches has potential problems , and most do not allow for generalizations about a population since they involve elements of unknown error even if the method involves some form of random selection criteria (for example, random selection of locations in which to intercept respondents).

 Nonprobability sampling methods have come to be associated with qualitative approaches or with the selection of ethnographic informants, particularly key informants or consultants (Werner and Schoepfle 1987; Johnson 1990; Miles and Huberman 1994). In some cases, a researcher may not be interested in generalizing to a population but may just want to know whether two subgroups obtained from a snowball sample differ with respect to some variable of interest. In that case, much of the bias in the sample is a matter of the logic used in the original selection of sample seeds, and any statistical analysis of the data must be concerned about violations of assumptions for the particular statistical test to be employed (for example, independence of observations or random sample from a population). Such matters are particularly germane for observational designs using various social network approaches (see Johnson [1994] for a review).

 How samples are chosen is an important element of any research design. If you are interested in generalizing to a given population, random sampling of some kind is essential. If generalization is not a primary goal, then sampling requirements may be relaxed. In most cases, if you can use a random sample, do it! No matter what the sampling method, you should be explicit about how you chose the sampling units. This increases the chances of detecting potential bias and also makes replication feasible. Replication is extremely important to external and other types of validity, such as construct validity. Random sampling has been a primary requirement in the proper application of parametric statistics. If you don't use random sampling, give careful consideration to possible violations of assumptions for a given statistical test.

 Recent developments in randomization and computer-intensive methods of statistical analysis involve less restrictive assumptions concerning the data (for example, assumption of a random sample from a population or skewed, sparse, or small sample sizes), opening the way for the development of new test statistics particularly suited for the problem at hand (Noreen 1989; Johnson and Murray 1997). These new approaches seem particularly well suited for the imperfect world of ethnographic research, where the rather restrictive assumptions of parametric analysis are often difficult to meet. But it is critical to remember the connection between theory, design (including sampling), and data analysis from the beginning, because how the data were collected, both in terms of measurement and sampling, is directly related to how they can be analyzed. The next section shows how concern for the elimination of potential errors and bias through design and attention to methodological detail applies to discussions about the findings of Margaret Mead and Derek Freeman in Samoa.
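To give a sense of what a randomization approach looks like in practice, the sketch below implements a simple two-group permutation test on the difference in means. It is a generic illustration, not the procedure used in any of the works cited above, and the catch counts are invented for the example. The function name and the number of shuffles are likewise assumptions.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_shuffles=10000, seed=0):
    """Approximate two-sided p-value for a difference in means via random relabeling."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)                          # randomly reassign group labels
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_shuffles + 1)

# Invented data for illustration only: catches per interview at two sites.
treated = [9, 11, 8, 12, 10, 13]
control = [4, 6, 5, 7, 3, 6]

p_value = permutation_test(treated, control, n_shuffles=2000)
print(f"approximate p-value: {p_value:.4f}")
```

The test asks how often a difference at least as large as the observed one would arise if group labels were assigned arbitrarily. It does not assume normality, but it does assume that observations are exchangeable under the null hypothesis, an assumption that still depends on how the data were collected.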

 Mead Versus Freeman: Research Design as Mediator

Derek Freeman's (1983) criticism of Margaret Mead's work and her findings in Samoa has led to reactions from anthropologists who come from different epistemological traditions. Some have defended Mead (Shankman 1996); others have pointed to the biases and flaws in Freeman's argument (Marcus 1983; Ember 1985). The criticisms and counter-criticisms are difficult to assess, given the time between Mead's and Freeman's studies, the differences in locations of their work, and the differences in their ideological positions (Ember 1985). Freeman contended that some of Mead's informants lied to her and that Mead's commitment to a particular ideological position caused her to evaluate evidence incorrectly. We certainly cannot hold Mead to the design standards available today. Still, it is instructive to review her work through a contemporary design lens, noting how slight modifications in design and method could have thwarted later criticisms.

 Mead used what can be referred to as a static group comparison design with a conjectural treatment group. The comparison group, Samoan adolescent girls, was compared to a conjectural treatment group, American adolescent girls, to test the proposition that exposure to Western civilization increases adolescent trauma. Implicit in this proposition is the overall theoretical notion that culture is the major factor contributing to human behavior. Brim and Spain (1974) recognized several problems in the design that could have affected Mead's ability to draw valid conclusions.

 There were no equivalent measurement procedures for the two groups. In her use of a conjectural treatment group, Mead assumed some things about American adolescents without collecting comparable data. Mead relied mostly on herself as an instrument to measure the variables of interest. There were possible problems with interaction between selection and the effects of extraneous variables. That is, any observed difference between the two groups with respect to the dependent variable, adolescent trauma, might have been due to one or several extraneous (unmeasured) factors and might have had nothing to do with the independent variable, exposure to Western culture. In lieu of the between-culture comparisons, Mead could have made a within-case comparison that would have suffered less from problems with possible sources of error. She could have chosen comparison groups that were as similar as possible in order to rule out the effects of unmeasured variables as much as possible. For example, Mead could have compared girls living in the households of native pastors to those who did not. She could then have tested the proposition that exposure to competing standards of sexual morality leads to higher levels of emotional distress in adolescents.

 More recently, Martin Orans did fieldwork in Samoa . Some of his