Creating Effective Decision Aids for Complex Tasks

Abstract

Engineering design tasks require designers to continually compare, weigh, and choose among many complex alternatives. The quality of these selection decisions directly impacts the quality, cost, and safety of the final product. Because of the high degree of uncertainty in predicting the performance of alternatives while they are still just sketches on the drawing board, and the high cost of poor choices, mathematical decision methods incorporating uncertainty have long held much appeal for product designers, at least from a theoretical standpoint. Yet, such methods have not been widely adopted in practical settings. The goals of this work are to begin understanding why this is so and to identify future questions that may lead to solutions. This paper summarizes the results of several studies by the authors: two laboratory studies in which we asked product designers to use various mathematical models to compare and select design alternatives, and a set of ethnographic studies in which we observed product designers as they worked so that we could better understand their actual practices and needs during decision making. Based on these studies, we concluded that the mathematical models, as formulated, are not well suited to designers’ needs and approaches. We propose a research agenda for developing new approaches that combine decision theoretic and user-centered methods to create tools that can make product designers’ decision making work easier, more systematic, more effective, and more reportable.

Practitioner’s Take Away

This article looks at some of the issues in designing and developing tools for complex problem solving in work domains such as mechanical design, logistical planning, and medical decision making. It is particularly challenging to develop tools (software or otherwise) to assist in these tasks because so much of the work is cognitive. The steps are often internalized, highly nuanced, and dependent on a body of personal experience, rather than well-defined processes. Tools to support decision making must often cater to the needs of a diverse group of users who may range from domain novices to domain experts. Additionally, the tasks themselves and the knowledge associated with them may change rapidly with technological advances, making it impractical to incorporate extensive volumes of complex knowledge in a tool. The lessons learned from the work reported in this paper can be applied to many other complex domains.

In order to design tools to fit users’ needs in complex domains, it is important to understand (a) how they solve problems, (b) where their strengths and weaknesses lie, and (c) what type of challenges, constraints, and conditions exist in their actual work environments. Observational and measurement techniques for understanding work and problem solving, such as ethnographic studies, protocol analysis, and laboratory studies (in addition to usability studies of the evolving tool), may be even more important in complex, expert domains than they are in other tasks.
Domain experts, intermediates, and novices may not all have the same needs, nor do they have the same knowledge, strengths, and weaknesses. While it is desirable to design a tool to assist all of these different types of users, it may not be practical or possible to do so. Tool designers may need to choose to design specifically for users at a specific range of domain expertise levels.
Tools must respect users’ problem solving approaches. For example, in the study reported in this paper, users employed flexible approaches, with many short cuts such as only searching for additional information if it was needed to distinguish between two top design alternatives.
Users were impatient with having to gather, estimate, or make guesses pertaining to information that might not actually be necessary for a decision and viewed such tasks as unnecessary or “busy work.” (For example, a user may feel, “I can choose the best alternative in my head much more quickly without having to specify all this information for the computer, so why should I bother with the computer tool?”). Unless users perceive that the work of using a tool will directly benefit them, they are likely to view these data entry tasks as unnecessary chores and are unlikely to adopt the tool.
Old style “expert systems” were not entirely successful because they attempted to encode complex, highly nuanced, and highly contextual expert knowledge in software. Such systems were expensive to build, as well as brittle and hard to maintain. It can be more cost effective and successful to provide tools that can help problem solvers explore, organize, and visualize problem-relevant data to which they can apply their own knowledge and judgment powers.
Similarly, tools should avoid requiring users to articulate and enter large quantities of their knowledge or judgments on a problem-by-problem basis. Unless users perceive this work as directly benefiting them in their current task, they are likely to perceive such data entry tasks as unnecessary chores.

Introduction

Creating effective decision aids is not simply a matter of finding a method that computes the most correct answer or the interface that best presents the data, but also of finding the most effective way to integrate tools with human problem solving needs. For example, tools based on mathematically correct and sophisticated models may not actually improve problem solving performance if they frame the problem in a way that does not fit human problem solvers’ approaches.

In this work, we have chosen to study decision aids that support the product design process, because the ability to compete in the global economy is highly dependent on the ability to rapidly produce high quality, low cost products. However, products such as cell phones, healthcare systems, or space stations are becoming increasingly complex. This means that product designers face greater decision making challenges than ever before.

We have further chosen to focus on design decisions made during the very early, conceptual stages of design, because decisions made at this stage have the largest impact on the cost, quality, and success of a product (Ishii, 2004). By the time one gets to the final stages of the design process, the major decisions have already been made and further choices have relatively little impact. Unfortunately, the conceptual design stage is an inconvenient time at which to make decisions and commitments because choices based on cost and performance must be made between alternatives that are little more than rough sketches. It is simply not possible to produce accurate assessments of cost and performance during the conceptual design stage.

Yet product designers must make choices anyway in order to make the design task manageable. They must do so because there may be hundreds or even thousands of alternative designs and possible variants for a single complex product. It may require a team of 20, 100, or more designers to fully develop even one alternative to a sufficient level of detail that accurate cost and performance estimates can be produced. Thus, it is simply too expensive to develop them all to a level of detail sufficient to allow one to choose the best with certainty. Some may argue that it is inappropriate to apply mathematical methods during conceptual design because of the uncertainty and lack of detail. However, this is where designers face the largest decision making challenges and where improvements can have the largest impact.

Mathematical decision making approaches that represent the uncertainty of a situation have long held great theoretical appeal for helping product designers make better design decisions for all the reasons above. However, while product designers routinely use many computer tools to help them visualize, analyze, and simulate the performance of products, mathematical decision methods are not used consistently in their daily work. This is not to say that they do not use them. In fact they do. Many companies use a variety of decision making techniques to explain or justify major decisions. However, designers do not tend to use these methods in their day-to-day design decision making to the extent that one might expect.

The overarching goals of this work are to begin to develop a better understanding of why mathematical decision methods have not been embraced by designers in the workplace, and how the mathematical decision methods can be made to better support designers’ needs in the workplace. The more immediate goals of this paper are to quantify the benefits and costs experienced by designers when using a variety of mathematically based decision aids in the laboratory, to better understand how product designers go about decision making in the workplace, and to use this understanding both to explain the laboratory study results and to inform future research directions.

This paper begins with a presentation of our model of the product design process that incorporates not only our own data and observations but also unifies a number of models created by other researchers to describe the design process. To study the issues above we used a combination of quantitative and qualitative methods that include the following studies:

Two controlled laboratory studies in which we asked product designers to use various computer decision aids.
A set of ethnographic studies (Blomberg et al., 2002) in which we observed designers as they worked on real design tasks (e.g., fly-on-the-wall observations of work in a typical setting).

The research questions that we viewed as relevant to ask evolved during the sequence of studies reported as we gained knowledge of product designers, their approaches to decision making, and the challenges of their work environments. Initially, we set about asking the question: “Which decision aid supports product designers better – one that allows them to express their uncertainty about the price and performance of the design alternatives, or one that does not?” Given the inherent uncertainty and incomplete knowledge associated with conceptual designs, we assumed that the former would yield the best results.

However, much to our surprise, our experiments did not show this to be true. The decision aid that allowed product designers to express uncertainty did not yield significantly better or worse results than the other decision aid. Furthermore, while both decision aids produced better results under certain circumstances than no decision aid, it was not clear that their benefits necessarily outweighed their costs (time for data entry, software installation, software training, etc.). This situation was unexpected and was far more nuanced than we had initially envisioned.

At this point, our questions turned towards finding explanations and a deeper understanding: “Why didn’t the ‘uncertainty’ decision aid yield more benefits than the other?” “Why didn’t either decision aid yield a clearer balance of benefits?” “How well does the framework imposed by most mathematical decision approaches fit with product designers’ actual approaches to decision making in the workplace?” and “How can we make tools that better support product designers’ actual needs in the workplace?” These questions form a future research agenda for development of human-centered decision aids to meet the workplace needs of designers working in any complex and uncertain domain. We feel that the results of such a research agenda will apply to electro-mechanical product design and, more generally, to any type of complex design task.

Methods

We used two methods in this work: ethnographic and laboratory studies. To a lesser extent, we also drew on protocol studies. We surveyed existing models of design with special emphasis on models derived from protocol studies of designers solving actual problems, because the studies provided insights into actual behavior. A protocol study is one in which subjects are asked to think aloud as they solve problems. Everything they say is recorded for later analysis (Ericsson & Simon, 1980). In contrast, ethnographic observations are observations of work as it is carried out in a normal setting (Blomberg et al., 2007). The ethnographic observations differ from protocol studies in that the people observed are not asked to solve specific problems or think aloud. Laboratory studies are more highly controlled than either ethnographic or protocol studies. While the situation in laboratory studies may be somewhat artificial, these studies allow measurement and quantification of phenomena in a way that ethnographic studies cannot. Thus, each of these study types (ethnographic, protocol, and laboratory) can provide a different view of the complex phenomena associated with product design processes. Together they provide a mix of qualitative and quantitative data that allows construction of a richer overall picture than any one method alone.

Related Literature

The following section presents the mathematical decision making methods.

Mathematical decision making methods

Complex decision problems require decision-makers to choose from available alternatives characterized by multiple qualitative or quantitative criteria (Saaty, 1980). Multi-criteria decision making (MCDM) techniques (Klein, 1993) are a broad family of mathematical methods that compare alternatives in a set using multiple criteria. For example, a prospective car buyer might compare his or her car choices by criteria such as fuel efficiency, cost, and comfort. The criteria used may vary from buyer to buyer depending on what is most important to that particular person. One common MCDM method is the weighted sum method (Hayes, J. R., 1981) in which each term in the sum represents how well an alternative fulfills a given criterion, and the term’s weight represents that criterion’s importance to the decision maker. Variants of the weighted sum method are popular because they are relatively easy to understand and use. Note that MCDM methods do not automate the decision process, nor do we view that as a desirable goal. Instead, they provide a structured approach through which people arrive at their own decisions by allowing them to specify the criteria they view as important and their judgments of the values associated with each alternative.

One can further divide MCDM (and weighted sum) methods into deterministic and non-deterministic methods. Deterministic decision making methods are those that do not explicitly incorporate a representation of uncertainty. For example, the cost of an option may be represented as a specific number or point value. While the decision maker may understand that this number is not exact, the degree to which it is not exact is not represented. In contrast, non-deterministic decision making methods are those that incorporate some explicit representation of uncertainty or unknowns. For example, the uncertainty in the cost of an option may be represented as a range of possible costs or as a function describing the likelihood of various costs.

Vagueness and ambiguity can be modeled by many techniques including those based on fuzzy set theory (Thurston & Carnahan, 1992). The merit of fuzzy techniques is that imprecision (Bellman & Zadeh, 1970) is recognized as an element of the decision model. The drawback of such techniques is the relatively high computational effort required for modeling the decision situation and processing the input information (Law, 1996). Moreover, while much research has focused on the development of formal decision making methods, relatively few studies have assessed their practical utility and impact in complex tasks. In the laboratory study summarized later, deterministic and non-deterministic (fuzzy) decision making methods were compared against designers’ typical, informal methods.

Product Design Processes

In the following sections, we will summarize some of the existing literature describing salient properties and structure of design processes. The models described were developed primarily in the context of mechanical design processes. However, the general characteristics of most complex design processes, whether mechanical, software, or systems, are essentially similar.

Uncertainty is present in all designs (Aughenbaugh & Paredis, 2006), from hand-held computer devices to space station systems. Even when a design is considered to be complete, there may still be uncertainty concerning issues such as the performance of the design under all the conditions to which it may be exposed in its working life, its manufacturing feasibility, or the final cost.

Uncertainty is most prevalent in the early stages of design, also known as conceptual design, when the alternatives under consideration may be little more than quick sketches or brief outlines. At all stages of the design process, designers must repeatedly choose the most promising alternatives for further development. This winnowing of alternatives is known as the down selection process. Conceptual design, down selection, and their relationship to the overall design process are shown in Figure 1. The model of the design process shown in this figure represents the authors’ integration of their own observations with several other design process models (described below) to create a unified design process model.

There is an overall progression in the design process from conceptual design to detail design (Pahl & Beitz, 2006). There is no distinct dividing line between these stages. The design is gradually transformed from a set of sketchy alternatives created during the conceptual design stage into one or more detailed and polished designs that are finalized in the detail design stage. The transformation occurs through many iterations in which designers explore, develop, and eliminate many alternatives (Simon, 1985). Multiple iterations of the design process are represented by the helix in Figure 1 (Blanchard & Fabrycky, 2006); each loop of the helix (depicted as a pair of curved arrows) represents one iteration in the design process. The labels on each loop, such as requirements gathering, design review, and down selection, represent some of the activities performed in each iteration. However, design activities rarely proceed in a precise and orderly progression.

Ullman, Dietterich, and Stauffer (1998) performed protocol studies of mechanical design processes in which designers were asked to create designs to meet real engineering requirements. They found that, in practice, there is much jumping back and forth between steps. From these studies, they developed the Task/Episode Accumulation (TEA) model in which designers incrementally refined and patched design alternatives in a series of design episodes. Each design episode addressed one of six goals: plan, assimilate, specify, repair, document, and verify. These goals can be addressed in almost any order and can be viewed as an alternate way of dividing and describing the design activities listed previously.

Figure 1. An iterative model of the design process.

The work reported in this article deepens prior work by providing a detailed study of down selection which is the process through which design alternatives are compared and selected for further development.

The Design Tasks

We studied student designers in the context of two different electro-mechanical design domains: design of a robot arm for a quadriplegic man and design of a manned lunar excursion vehicle. The robot arm for the quadriplegic man had to be capable of manipulating a variety of lightweight objects found in his home and office environment such as paper, small books, compact discs, and soft drink cans. It had to have a control interface that a quadriplegic person could manipulate and be powered by the on-board battery system of his electrically powered wheelchair. Additionally, it had to be simple for an assistant to mount and unmount from the arm of the wheelchair. The electronics and motors had to be reasonably weatherproof, light weight, and inexpensive. To meet the cost goals the students made extensive use of scrap aluminum and junk yard parts such as automobile seat motors. Finally, the students had to build and test their best design, shown in Figure 2.

Figure 2. A wheelchair mounted robot arm created by a student design team.

In the second domain, teams of student designers developed designs for manned lunar excursion vehicles, some of which are shown in Figure 3. All groups were given the same design goals by the NASA Johnson Space Center. They were to design a manned lunar excursion vehicle that would provide occupants with protection, life support, mobility, towing capability, communication, and sufficient power for an average excursion. The excursion vehicle also had to fit inside the launch vehicle and deploy successfully at the landing site. These students did not have the same tight budgetary restrictions as the students building the robot arm, nor did they have to build the excursion vehicle.

Figure 3. Four design alternatives for a lunar excursion vehicle developed by students.

Two Decision Aids

In order to study the impact of mathematical decision aids in the design domains described above, the authors implemented two computer decision aids to assist product designers in comparing design alternatives and making down selection decisions. These tools do not make decisions for people. They simply provide a structured interface that allows designers to enter their own criteria and judgments, and a convenient computational method for systematically combining the components used in their decisions.

A deterministic decision aid

The deterministic decision aid allowed product designers to compare design concepts using a deterministic weighted sum method. The product designers first entered a list of criteria that they felt were important to the quality of the design. For example, for a manned lunar rover, designers included criteria that a design must include protection for the astronauts riding in the vehicle, life support systems, and structural integrity so that the vehicle will not crumple as it moves over the rough lunar surface and so on. Many more criteria can be entered; however, the snapshot of the interface shown in Figure 4 shows the interface after only the first three criteria have been entered and the fourth is about to be entered.

For each criterion an importance weight must be entered using a slider bar (as shown on the left side of Figure 4). An importance weight of 0 indicates that the criterion is not very important, and a weight of 10 indicates that it is very important.

Once criteria and importance weights are entered, then the names of each alternative design must be entered. The product designer entered four alternatives and named them concept x, y, z, and i (as shown across the top of the interface). One could enter more alternatives if desired or use more meaningful names.

Figure 4. The data entry interface for the deterministic decision aid. Decision makers enter values using a single slider bar.

Next, a value must be entered to indicate how well each alternative fulfills each criterion. For each alternative, product designers entered a single value for each criterion using a slider bar (as shown in the central area of Figure 4). A value of 0 indicates that the alternative does not fulfill the criterion well and 10 indicates that it fulfills the criterion very well.

Once all values are entered, the Calculate button (bottom right) is used to calculate an overall score for each alternative based on the importance weights and values. These scores are displayed near the top of Figure 5, under each concept name. Higher scores indicate better overall value. These scores can be used to rank the design alternatives from best to worst.
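
The scoring step described above can be sketched in a few lines of Python. This is only an illustration of a plain weighted sum, not the authors' implementation: the function names, the absence of any normalization, and all slider values below are assumptions made for the example.

```python
def weighted_sum_scores(weights, ratings):
    """Return {alternative: score}, where score = sum of weight * rating over criteria."""
    scores = {}
    for alternative, by_criterion in ratings.items():
        scores[alternative] = sum(
            weights[criterion] * by_criterion[criterion] for criterion in weights
        )
    return scores


def rank_alternatives(scores):
    """Order alternative names from highest to lowest overall score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical slider values on the 0-10 scale; these are NOT data from the study.
weights = {"protection": 9, "life support": 10, "structural integrity": 7}
ratings = {
    "concept x": {"protection": 6, "life support": 8, "structural integrity": 5},
    "concept y": {"protection": 8, "life support": 7, "structural integrity": 9},
    "concept z": {"protection": 4, "life support": 9, "structural integrity": 6},
}

scores = weighted_sum_scores(weights, ratings)
ranking = rank_alternatives(scores)
print(scores)
print(ranking)
```

With these made-up inputs, concept x and concept z end up only one point apart, which is the kind of close call where a single slider adjustment could change the ranking.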

A fuzzy decision aid

A second version of the decision aid implemented a fuzzy weighted-sum method, based on that described by Bellman and Zadeh (1970). The interface was very similar to the deterministic decision aid except that for each alternative concept product designers had to specify a range of values indicating how well each alternative fulfilled each criterion using a pair of sliders to enter an upper and a lower value (shown in Figure 5). In the example shown in Figure 5, the product designer has indicated that he or she thinks that concept 1 may be “providing protection” anywhere from very poorly to an average amount. This allows designers to indicate their uncertainty about how well this yet untested alternative may perform.

Figure 5. The data entry interface for the fuzzy decision aid. Decision makers enter value ranges using a pair of slider bars.
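
To make the idea of range inputs concrete, the sketch below propagates lower and upper slider values through a weighted sum using simple interval arithmetic. This is a deliberate simplification and should not be read as the fuzzy method used in the decision aid; the function names and all numbers are hypothetical, and weights are assumed to be non-negative.

```python
def interval_scores(weights, ranges):
    """Return {alternative: (low, high)} by summing weight * bound over criteria.

    Assumes non-negative weights, so the lower bounds combine into the lowest
    possible score and the upper bounds into the highest.
    """
    result = {}
    for alternative, by_criterion in ranges.items():
        low = sum(weights[c] * by_criterion[c][0] for c in weights)
        high = sum(weights[c] * by_criterion[c][1] for c in weights)
        result[alternative] = (low, high)
    return result


def midpoint(interval):
    """Collapse a (low, high) score range to a single summary value."""
    low, high = interval
    return (low + high) / 2


# Hypothetical (lower, upper) slider pairs on the 0-10 scale; not study data.
weights = {"protection": 9, "life support": 10}
ranges = {
    "concept 1": {"protection": (1, 5), "life support": (6, 8)},
    "concept 2": {"protection": (4, 6), "life support": (5, 9)},
}

score_ranges = interval_scores(weights, ranges)
summaries = {name: midpoint(r) for name, r in score_ranges.items()}
print(score_ranges)
print(summaries)
```

Here the two score ranges overlap (69 to 125 versus 86 to 144), so a single summary value hides the fact that either concept could turn out to be better.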

Laboratory Studies of Decision Aids in Product Design Decision Making

This section briefly summarizes two laboratory studies (Akhavi & Hayes, 2007) that used the decision aids described above to investigate the costs and benefits experienced by mechanical designers when using these decision aids versus no aid.

Study 1

In the first study, we asked seven student designers (all at the intermediate level of design expertise) to rank, from best to worst, four different design alternatives for the elbow joint on the robotic arm and three different design alternatives for a mounting plate (which would be used to mount the arm on the wheelchair). All students had been working on the robot arm design task since the start of the semester, so they were familiar with the task and the criteria for an effective solution. While it might have been desirable in some respects to use subjects who had no prior experience with this particular robot arm design task and the alternatives, it was necessary to use subjects who already had familiarity with the task in order to be able to understand the criteria and the properties of the alternatives, which were non-trivial to understand. Each student was asked to individually use the two decision aids described earlier to assist with the ranking: the decision aid based on the fuzzy technique and the other based on the deterministic technique.

We found that all students produced identical rankings for the solutions regardless of the decision aid used. However, the fuzzy decision aid required significantly more time on average than the deterministic one: 12.5 minutes versus 7 minutes, p-value = 0.02. This time difference appeared to result directly from the additional data entry required for the fuzzy method, which required entering two values for each alternative and criterion. The deterministic method only required one value. After discussions with the students we concluded that they all produced identical rankings because for both the elbow joint and the mounting plate, the alternatives were clearly very different from each other in quality with obvious winners and losers. In such situations, designers can make choices readily without the assistance or overhead of a tool. The students did, however, comment that they liked the way the tool allowed them to systematically lay out the criteria and value judgments for all alternatives. They printed out the decision matrix produced by the tool so they could include it in their final project report as a convenient summary justifying their design decisions.

The important lesson learned from this first study was that computer decision aids may not add value for all design decisions, particularly if the top alternative can easily be distinguished from the others. For the next study, we designed a situation where the alternatives were very close in quality so that the top alternative was not easy to identify without careful consideration.

Study 2

The second study explored the use of decision aids in the context of the manned lunar excursion vehicle design task.

Subjects

Twenty-six participants were used in the study. Eighteen were senior undergraduates in a capstone design course, and eight were engineering design professors. The students were considered to be intermediate level designers (not novices) and the professors were considered to be experts in lunar vehicle design.

Tasks

All students in the class were given the same design task: to create a design for a lunar excursion vehicle. A total of 12 designs were created by four teams of students. Each team presented its three designs to the class, and the class decided which was best, average, and worst. Next, all 12 designs were re-sorted into three sets, each containing four designs. Set A contained only the best designs from each team. Set B contained all the average designs from each team, and Set C contained all the worst designs from each team. Thus, each set contained four designs that were similar in quality and would require some thought to rank them from best to worst. Furthermore, because the designs within each set crossed the boundaries of the student teams, none of the students had yet spent time comparing any of the designs within the new sets. Thus, we created fresh comparison tasks for the students participating in the experiment.
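
The re-sorting step can be illustrated with a short Python sketch. The team and design names are placeholders invented for the example; only the grouping logic (best designs to Set A, average to Set B, worst to Set C) comes from the description above.

```python
def regroup_by_quality(team_designs):
    """Regroup each team's best/average/worst designs into Sets A, B, and C."""
    label_for = {"best": "A", "average": "B", "worst": "C"}
    sets = {"A": [], "B": [], "C": []}
    for team in sorted(team_designs):
        for quality, design in team_designs[team].items():
            sets[label_for[quality]].append(design)
    return sets


# Placeholder names for four teams with three designs each (hypothetical).
team_designs = {
    f"team {n}": {
        "best": f"team {n} best",
        "average": f"team {n} average",
        "worst": f"team {n} worst",
    }
    for n in range(1, 5)
}

design_sets = regroup_by_quality(team_designs)
for label, designs in design_sets.items():
    print(label, designs)
```

Each resulting set holds one design from every team, so no participant has previously compared the four designs within a set against one another.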

Method and tool inputs

Each subject in the study was then asked independently to apply a different decision making method to each of the three design sets. The following were the three methods:

A fuzzy weighted sum method (Akhavi, 2006) that was incorporated in a computer decision aid. It provided users with two slider bars that allowed them to set the top and bottom of the range bounding the likely values for each alternative and criterion.
A deterministic method (a standard weighted sum) that was incorporated in a computer decision aid. It provided a single slider bar that allowed users to enter only a single value for each alternative and criterion.
No mathematical method: subjects were presented with a set of drawings and supporting descriptions and asked to use whatever their normal method was, which was typically a manual, “seat-of-the-pants” ranking of alternatives.

The reader should note that the mathematical methods do not apply any internally encoded design expertise. The design expertise and judgments come entirely from the human participants. The tools simply provide a systematic structure and method to facilitate their comparison process.

Tool outputs

Both mathematical methods computed and displayed a single overall value score for each alternative based on a combination of all criteria. However, the fuzzy method actually produced a probability density function for each design alternative describing a distribution of the probable values. Thus, we could have chosen to display the results in many ways (e.g., as a function curve or a range), but for simplicity we chose to display only the average of each probability density function.

Ranking

Next, for each method participants were instructed to rank a set of four design alternatives from best to worst. The order and the pairing of methods with design sets were systematically varied. The participants had received instruction in their design class on how to use the weighted sum method to compare and rank design alternatives using a calculator or spreadsheet to perform the calculations.
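
One possible way to vary the order and the pairing of methods with design sets is sketched below. The paper does not describe the exact counterbalancing scheme used, so this is an assumed scheme for illustration only: it cycles through every ordering of the three methods and, more slowly, every ordering of the three design sets. The names METHODS, DESIGN_SETS, and schedule_for are hypothetical.

```python
from itertools import permutations

METHODS = ["fuzzy", "deterministic", "no aid"]
DESIGN_SETS = ["A", "B", "C"]


def schedule_for(participant_index):
    """Return a list of (method, design set) pairs in presentation order.

    Assumed scheme: the method order changes with every participant, and the
    design-set order changes after every six participants.
    """
    method_orders = list(permutations(METHODS))    # 6 possible orders
    set_orders = list(permutations(DESIGN_SETS))   # 6 possible orders
    methods = method_orders[participant_index % len(method_orders)]
    sets = set_orders[(participant_index // len(method_orders)) % len(set_orders)]
    return list(zip(methods, sets))


# Schedules for 26 participants, matching the sample size reported above.
schedules = [schedule_for(i) for i in range(26)]
for index, schedule in enumerate(schedules[:3]):
    print(index, schedule)
```

Every participant sees each method once and each design set once; only the order and the method-to-set pairing differ between participants.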

Data recorded

The experimenters recorded the following data:

Rankings assigned to each alternative in a set.
Time required to rank each set.
Users’ preferences for one method over others.

Results

The following sections discuss rankings, time, and user preferences.

Rankings

The rankings for a set of alternatives, ordering them from best to worst, represent a decision; the top ranked alternative represents the decision maker’s top choice. However, not all decisions are of equal quality. One question that we wished to assess is whether the decision method had an impact on decision quality.

Unfortunately, decision quality is difficult to assess directly for many reasons. The knowledge and skill of the decision maker impact the likelihood that the alternative identified as “best” will actually prove to be the best once it is actually built and criteria (such as cost, performance, reliability, and marketability) can be tested. We define a high quality decision as one where the decision maker’s rankings accurately reflect the rankings computed from empirically measured data once the design alternatives are actually built and deployed, using the decision maker’s specified criteria.

Although alternative prototypes are sometimes built and tested during a design process, doing so is often prohibitively expensive, particularly for complex devices like lunar exploration vehicles. Thus, in many cases, it is not possible to directly measure decision quality because most of the alternatives are never built.

However, there are other measures that one can use as indicators of decision quality when decision quality cannot be measured directly. While experts lack perfect judgment, they are far better than others at making judgments in their own area of expertise. Experienced conceptual designers for space missions estimate cost within 10% of the actual cost (Mark, 2002), which is quite impressive given the novelty of the designs and the number of unknowns they must manage. Additionally, it has been found in domains ranging from manufacturing plans (Hayes & Parzen, 1997; Hayes & Wright, 1989) to medical diagnosis (Aikins et al., 1983) that while experts may sometimes disagree on which is the top alternative, there are high correlations in their rankings even when those rankings are arrived at independently without consultation. In other words, even if two experts independently rank two different alternatives as their top choices, it is likely that both alternatives will be ranked near the top for most experts. If one assumes that experts are able to judge quality, then one indicator of quality is the correlation of a decision maker’s rankings with those of experts. Figure 6 shows the average rank correlations between the rankings of the expert subjects and between the intermediates and the experts. Thus, the taller the bar is in Figure 6, the higher the level of agreement with experts (or of experts with each other).
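The agreement measure described above can be sketched as follows. The text does not state which rank correlation coefficient was used, so Spearman's rho (in its no-ties form) is an assumption of this sketch, as are the function names and the example rankings of four hypothetical alternatives A through D.

```python
from itertools import combinations
from typing import Dict, List

Ranking = Dict[str, int]  # alternative name -> rank (1 = best), no ties


def spearman_rho(rank_a: Ranking, rank_b: Ranking) -> float:
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


def mean_pairwise_agreement(rankings: List[Ranking]) -> float:
    """Average correlation over all pairs within one group (e.g., experts)."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


def mean_agreement_with_group(ranking: Ranking, group: List[Ranking]) -> float:
    """Average correlation of one subject's ranking with each group member."""
    return sum(spearman_rho(ranking, g) for g in group) / len(group)


# Hypothetical rankings; each was produced independently.
experts = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 2, "B": 1, "C": 3, "D": 4},
    {"A": 1, "B": 3, "C": 2, "D": 4},
]
intermediate = {"A": 4, "B": 3, "C": 2, "D": 1}

expert_agreement = mean_pairwise_agreement(experts)
intermediate_agreement = mean_agreement_with_group(intermediate, experts)
```

Here the first two experts disagree on the top choice yet still correlate at 0.8, which illustrates how experts can differ on the "best" alternative while agreeing closely overall.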

Figure 6. The average correlation of subjects’ rankings to expert rankings, using three different decision methods for intermediate-level and expert designers.

These results show that experts were indeed more consistent in their rankings than were the intermediates. The decision aid used also made a difference in average correlation with expert rankings:

For expert product designers, ranking correlations between expert designers increased significantly when design experts used either of the decision aids (fuzzy or deterministic) than when they ranked alternatives by hand (alpha = 0.05 and p-value = 0.002).
For expert product designers, there was no significant difference between the fuzzy and deterministic decision aids. This was contrary to our initial expectations that the fuzzy method would be superior, particularly for conceptual design because it allows product designers to express the uncertainty inherently associated with an incomplete design.
For intermediate-level designers, there was no significant difference between any of the methods.

Each subject’s rankings were produced independently. They were not allowed to discuss the relative merits of the alternatives in a set with other subjects prior to the experiment. Thus, the correlation between subjects is not the result of group discussions. We feel that these results indicate that experts are better at making decisions than intermediates, most likely because their greater experience allowed them to assess the alternatives more accurately. Both decision aids helped the experts to produce better decisions, possibly by encouraging them to think more systematically and carefully about the criteria, the alternatives, and their relative merits. The decision aids did not appear to make a significant difference in the consistency of the rankings for the intermediate-level designers, possibly because they lacked the experience to make good assessments of the likely cost, performance, etc. of the alternatives.

What was surprising was that the fuzzy method did not have a significantly different impact on the consistency of rankings from the deterministic method. In fact, it appears slightly worse than the deterministic method for both groups (although not significantly). There are many possible explanations for this result. Perhaps the subjects were more used to thinking of criteria in terms of single values than ranges (e.g., they are more comfortable thinking in terms like “the cost is $10” as opposed to “the cost is probably between $8 and $14”); perhaps they were not good at estimating uncertain values (Tversky, 2003) or the display used in the experiment was not supporting this reasoning about uncertain values in a way that fit their internal concepts.

Time

Figure 7 shows the time required by both intermediates and experts to rank the alternatives, and identify a “best” alternative.

For intermediates, the fuzzy method required the most time at about 13 minutes on average, the deterministic method required about 9 minutes, and ranking alternatives by hand required only 4 minutes. The differences between all methods were significant (alpha = 0.05, p-value = 0.0001).
The results for experts were similar. Experts required more than 16 minutes to rank four alternatives using the fuzzy method, 14 minutes using the deterministic method, and less than 5 minutes when ranking the alternatives by hand. All differences were significant (alpha = 0.05 and p-value = 0.0001) except for that between the fuzzy and deterministic methods.

Figure 7. Both intermediate-level and expert designers used more time when ranking alternatives using the computer decision aids.

It was evident from observations of the subjects during the experiment that the decision aid required more time than ranking alternatives by hand largely because of the data entry associated with using the decision aids. When ranking alternatives by hand, subjects simply sorted or numbered a stack of drawings that required relatively little time. However, they had to enter numbers for each alternative when using the decision aid. The fuzzy decision aid required more data entry (two numbers for each value) than the deterministic aid (one number for each value) which explains why the fuzzy aid required more time. However, spending more time with each alternative may also have encouraged subjects to think more carefully about their relative merits.

It may appear counter-intuitive that the experts should spend more time than the intermediates to reach decisions, especially given results such as those reported by J. R. Anderson (1980) in which he describes experts performing tasks faster than non-experts. However, many of the task domains described by J. R. Anderson, such as cigar rolling and flash arithmetic, while not simple, are less complex than design tasks. For more complex tasks, such as manufacturing planning (Hayes & Wright, 1990), military planning (Marshak, 1999), and equine nutrition (R. Anderson, 2003), experts have been observed to take more time to complete problem solving tasks than non-experts. The explanation offered in military and manufacturing domains has been that non-experts are simply not completing as many problem solving steps or considering as many issues as experts. The non-experts lacked the experience to know they should be doing these steps or considering these options. Solution quality usually suffered as a result.

User preferences

A survey given to each subject showed that 63% of the experts preferred the fuzzy decision aid, 38% preferred the deterministic aid, and none preferred to use no aid. In contrast, 50% of the intermediates favored the deterministic decision aid, 39% the fuzzy aid, and 11% no aid.

It is interesting that the intermediate-level product designers preferred a decision aid over no aid because it is not clear that the aid provided significant benefits, and it required more time than doing the task by hand. More surprising still is that the experts preferred the fuzzy decision aid over the deterministic aid; both provided similar benefits in increased consistency in rankings (which may indicate increased decision quality) but because the fuzzy aid required more time one might expect them to prefer the deterministic aid.

While users do not always prefer the method that improves performance the most, it is important to understand what users’ preferences are as indicators of what methods they may be willing to adopt and use, given the right conditions.

However, it is not uncommon for subjects in a laboratory experiment to express a preference for a technique that does not actually improve their performance (Morse et al., 1998). In this case, the subjects’ preferences may have reflected an intellectual appreciation of the mathematical methods encoded in the decision aids. The intermediates may have preferred the deterministic method for its simplicity. This may also reflect students’ lower level of comfort with statistical concepts of uncertainty. The expert designers, however, were more versed in and had a better working appreciation for the uncertainty associated with the cost and performance of design alternatives.

Ultimately, despite the preferences expressed during the laboratory study, subjects did not necessarily use the decision aids in subsequent product design work. An intellectual preference expressed during a laboratory experiment, in which subjects are removed from typical pressures and deadlines of the workplace, is not the same as a practical preference in the context of a working environment where perceived benefits must outweigh perceived costs. However, we believe it is important to understand users’ preferences because it may impact the ease with which they are willing to accept and use a particular decision aid, given that an appropriate balance of costs and benefits can be achieved.

Ethnographic Observations of Design Decision Making

The results above provided a very different picture of the usefulness of mathematically-based decision aids than we had predicted. We had expected that the fuzzy method would provide more benefits to both experts and intermediate-level designers than the deterministic method because it would allow the uncertainty of the conceptual design process to be reflected in the decision process. However, we did not find this to be the case. Only expert designers benefited from the decision aids, and then only for difficult decision situations. They benefited about equally from both the fuzzy and deterministic aids, but the fuzzy aid required almost twice the time of the deterministic aid. Furthermore, the added time required for either decision aid made it unclear whether the benefits of using such tools would justify the costs. However, we were not yet convinced that mathematical decision aids and fuzzy methods could not provide a satisfactory balance between costs and benefits given the right application, tool design, and sensitivity to users’ needs.

The next step was to try to better understand the results above. Were the failures of the tools a matter of improving the information presentation? Reducing the data entry time? Increasing the users’ training in the underlying mathematical concepts? Or, were the problems deeper: was there a mismatch between the way in which designers make decisions and the assumptions and approaches assumed by these methods?

To gain insights into these issues, we turned to the ethnographic observations which we had collected prior to the laboratory experiments. We had studied the four teams that created designs for lunar excursion vehicles and the team that created a wheelchair mounted robot arm. The four lunar excursion vehicle teams were observed during their regularly scheduled design meetings over the course of a semester. The robot arm team was similarly observed over the course of a different semester.

The ethnographic observations revealed that designers continually generated and evaluated alternative designs throughout the design process, some of which were entirely new concepts, while others were minor variants on existing alternatives. To keep the number of alternatives under current consideration manageable, designers continually engaged in down selection to prune out the less promising ones. They appeared to follow at least two different approaches to down selection that we call rapid elimination and considered comparison. Additionally, they were often observed to engage in information seeking if they did not feel comfortable with the amount of information available on each alternative.

Rapid elimination

In this variant of the down select process, design alternatives were only briefly considered before being discarded on the basis of rapid, informal assessments. Alternatives were often discarded based on a single criterion. For example, a designer might say, “this option is way too expensive for our budget” or “that option is far more complex (mechanically) than our other options. I don’t see the need to consider it further unless we are desperate.” Options eliminated by this method were usually those that were clearly dominated by others (e.g., worse in all major criteria). Rapid elimination is by nature imprecise and may sometimes lead to inappropriate elimination of alternatives (some of which were later revisited). However, it is also a very pragmatic approach and probably necessary given the enormous volume of down select decisions that must be made during a typical design process. If designers considered all decisions in depth, design progress would rapidly come to a stand-still.
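The two informal pruning rules described above (discarding on a single criterion, and discarding options that are clearly dominated) can be sketched roughly as follows. This is an illustration under stated assumptions, not a model of what the observed designers computed. Scores are hypothetical and oriented so that higher is better, the cutoff value is invented, and dominance is taken in the usual Pareto sense (at least as good on every criterion and strictly better on at least one).

```python
from typing import Dict, List

Scores = Dict[str, float]  # criterion name -> score, higher is better


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(a[c] >= b[c] for c in a) and any(a[c] > b[c] for c in a)


def rapid_eliminate(alternatives: Dict[str, Scores], criterion: str, minimum: float) -> List[str]:
    """Return the names that survive a single-criterion cutoff and a dominance check."""
    # Step 1: discard anything that fails the single criterion outright.
    passed = {name: s for name, s in alternatives.items() if s[criterion] >= minimum}
    # Step 2: discard anything dominated by another remaining alternative.
    return [
        name
        for name, s in passed.items()
        if not any(dominates(other, s) for other_name, other in passed.items() if other_name != name)
    ]


# Hypothetical alternatives; "affordability" is high when the option is cheap.
alternatives = {
    "A": {"affordability": 8, "simplicity": 6, "performance": 7},
    "B": {"affordability": 2, "simplicity": 9, "performance": 9},  # way too expensive
    "C": {"affordability": 7, "simplicity": 5, "performance": 6},  # worse than A everywhere
    "D": {"affordability": 5, "simplicity": 8, "performance": 8},
}

survivors = rapid_eliminate(alternatives, criterion="affordability", minimum=4)
```

Note that B is removed on cost alone even though it scores best on the other two criteria, which mirrors the imprecision of rapid elimination noted above.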

Considered comparison

In this variant of down selection, designers were observed to compare several options during group discussions, often comparing all alternatives by one criterion, then another, possibly revisiting one, and so on. They frequently added additional criteria discovered through discussion or revised initial estimates of criteria importance. Thus, unlike formal decision methods in which the criteria and their importance are pre-determined, designers mixed discovery and determination of criteria with the decision making process. It was only occasionally that designers performed this type of in-depth comparison. It occurred most commonly as a deadline was approaching at which time the team must select and justify a single “best” alternative which they would develop and prototype. In many, but not all, cases the alternatives compared were not obviously inferior or superior to each other, unlike those pruned through rapid elimination.

Their process resembled a blend of the decision making processes described as the lexicographic method and a weighted sum method (Hayes, J. R., 1981). In the lexicographic method, alternatives are first compared by the most important criterion; if they are equal by that criterion then they are compared by the next most important criterion and so on. This can be contrasted to a weighted sum method in that all major criteria are considered at one time for all alternatives. The considered comparison process differed from both in that designers initially focused on a small handful of criteria that they deemed most important. Those criteria were not typically combined by a mathematical function but by an overall feeling which the designers developed for each alternative through discussion of their properties. (We will later refer to this as a seat-of-the-pants judgment.) If consideration of that small handful of criteria did not produce a clear winner they often considered additional criteria. This process might best be described as a very flexible and very approximate weighted sum method in which criteria, importance weights, and values are continually added, subtracted, or modified.
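For readers unfamiliar with the two formal methods being contrasted, the sketch below ranks the same three hypothetical alternatives both ways. The scores, weights, and function names are assumptions made for illustration; higher scores are better on every criterion.

```python
from typing import Dict, List

Scores = Dict[str, float]


def lexicographic_rank(alternatives: Dict[str, Scores], criteria_by_importance: List[str]) -> List[str]:
    """Order by the most important criterion; later criteria only break exact ties."""
    return sorted(
        alternatives,
        key=lambda name: tuple(alternatives[name][c] for c in criteria_by_importance),
        reverse=True,
    )


def weighted_sum_rank(alternatives: Dict[str, Scores], weights: Dict[str, float]) -> List[str]:
    """Order by a single value that combines all criteria at once."""
    def value(name: str) -> float:
        return sum(w * alternatives[name][c] for c, w in weights.items())

    return sorted(alternatives, key=value, reverse=True)


# Hypothetical alternatives and importance weights.
alternatives = {
    "A": {"cost": 9, "performance": 3, "reliability": 4},
    "B": {"cost": 8, "performance": 9, "reliability": 9},
    "C": {"cost": 9, "performance": 2, "reliability": 9},
}

lexicographic = lexicographic_rank(alternatives, ["cost", "performance", "reliability"])
weighted = weighted_sum_rank(alternatives, {"cost": 3, "performance": 2, "reliability": 1})
```

With these numbers the lexicographic ordering is A, C, B, because B loses on cost by one point and is never compared on anything else, while the weighted sum ordering is B, C, A, because B's strength on the other criteria outweighs that one point.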

Information seeking

Choosing an alternative in the down select process is tightly tied with information seeking. At many points in the design process, designers lacked sufficient information to make informed comparisons between alternatives, particularly during conceptual design. Designers seek information through many methods. Sometimes they create the information themselves by developing more detailed drawings of targeted areas of a design or by building and testing prototypes. For example, information about the likely performance of the Mars Rover on Mars was obtained by building and testing prototypes in the harsh conditions of the Atacama desert. Sometimes information is produced through analytical methods such as calculations and simulations. And sometimes it is collected from external sources, for example by searching the library and web or by calling multiple vendors to gather a range of price quotes.

Some information seeking activities require significant effort, knowledge, and cost. Designers must make judgments about when the cost of information seeking is likely to pay off in the final product. An issue observed in senior undergraduate design teams was that they did not always know when to seek more information or when to stop. Deadlines were very important in forcing them to think critically about what information was most important and to focus their information seeking efforts. Furthermore, they were far more likely to seek information in areas where they felt knowledgeable and comfortable and to avoid seeking it in areas unfamiliar to them. For example, they were very comfortable elaborating physical, three-dimensional details and conducting mathematical analyses of specific aspects such as stress and torque, but they were far less comfortable developing cost and manufacturability estimates. They did not know where to look for cost information or whom to call or consult. They would do so when pushed by the instructors, but they often grossly underestimated costs and appeared completely unaware of the degree of uncertainty in their estimates. This avoidance is probably not laziness, but simply a hesitancy to engage in an effort of unknown magnitude for an unknown benefit.

Bradley and Agogino (1994) also describe information seeking as an important part of the process through which design alternatives are selected. They studied this process in the context of automated selection of design component choices from a catalog. They describe a mathematical formulation which can be used to decide when it is worthwhile to expend the cost and effort required to gather additional information. However, the method requires the designer to put in effort to collect the input data for the method, which they may not be willing to do if they believe they can fare almost as well without using any special analysis for information seeking decisions.

Summary of observations

Designers had more than one approach to decision making: they used rapid elimination for quick pruning and considered comparison for more difficult deliberations. These represent two ends of a spectrum of rapid to deliberate focus. Decision aids may not be appropriate and may be viewed as burdensome in rapid elimination tasks, although they provide value in tasks requiring considered comparison. This observation supports the inference drawn from the laboratory studies that decision aids are not necessarily appropriate for all decisions.
The tasks of information seeking, comparison of alternatives, and down selection are tightly intertwined. This may suggest that to support designers’ actual work practices, a decision aid may need to support all of these tasks seamlessly. Neither of the mathematical models used in this study, nor the decision aids incorporating them, supported information seeking. This may limit the utility and impact of the decision aids.
The process of arriving at a decision is a flexible exploration process. The process of preparing for a decision is really one of exploration to develop a deeper understanding of the design goals, the alternatives, and the unexplored possibilities. As Ullman et al. (1988) also observed, designers exhibit great flexibility in this exploration process; they continually jump between adding or refining alternatives, gathering additional information, making comparisons, adding new criteria, adjusting estimates, etc. Ideally, a decision aid should support this flexible exploration process. Most mathematical decision methods assume that this exploration has already been done, and that design goals, relevant criteria, and alternatives have been specified and are now fixed. Such assumptions are likely too rigid for the ill-defined nature of complex design tasks, particularly conceptual design.
Precise information or statistical distributions describing likely design performance are often not available in practice. Furthermore, there may be high costs associated with information seeking. Thus, the assumption made by many mathematical models that statistical distributions estimating design performance can be obtained may not be reasonable. Decision aids must not assume it is.
Designers can rapidly apply much knowledge and experience in their heads. However, articulating this information and entering it into a tool may be perceived to be a burden. Additionally, the designers are very astute about which information they need most. They do not typically explore all criteria for all alternatives. They spend more time on criteria that will distinguish top options, skipping many others. This is a time-saving strategy that most mathematical methods do not support.

Discussion

The following sections discuss some questions that occurred as a result of the study.

What may discourage use of mathematical decision aids?

First, it was simply too time consuming to use mathematical methods for all decisions. As mentioned earlier, if these methods were applied to every small design variation, the design process would become exceedingly slow without improvement in most decisions, especially when there is one obvious winner. Users of any tool (software or otherwise) are very sensitive to the costs and benefits that they personally derive from the tools, and they may not be willing to use them if they perceive the benefits to be smaller on average than the extra work required (Grudin, 1988).

Second, designers did not have a clear metric or rule of thumb that allowed them to identify situations in which the mathematical tools would be likely to yield benefits. When designers are faced with a situation in which they are not sure whether time consuming mathematical methods will provide benefits, it is only natural to choose not to put in the additional work required to use them.

Third, the mathematical methods did not allow designers the flexibility to which they were accustomed when comparing design alternatives. We observed that designers would often incrementally consider design criteria, starting with those they considered most important, and conditionally exploring less important criteria as “tie breakers” if a winner did not emerge (this is the flexible considered comparison strategy which we described earlier). Designers can save much time by only considering criteria when they need to and only considering them for specific alternatives. In contrast, the mathematical methods assume that a fixed set of criteria will be compared and users must specify all of them, regardless.
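One way a tool might accommodate this incremental strategy is sketched below. It is a hypothetical illustration, not a method from the study: criteria are considered in order of importance, a running weighted total is kept, and the process stops as soon as the leader is ahead of the runner-up by an assumed margin. Scores are requested through a callback (here named `lookup`, a made-up helper) so that estimates for later criteria are never asked for unless they are needed as tie breakers.

```python
from typing import Callable, Dict, List, Optional, Tuple


def incremental_comparison(
    names: List[str],
    weighted_criteria: List[Tuple[str, float]],  # most important first
    get_score: Callable[[str, str], float],
    margin: float,
) -> Tuple[Optional[str], List[str]]:
    """Add criteria one at a time until a clear winner emerges.

    Returns (winner, criteria_used); winner is None if no alternative
    ever leads by at least `margin`.
    """
    totals = {name: 0.0 for name in names}
    used: List[str] = []
    for criterion, weight in weighted_criteria:
        used.append(criterion)
        for name in names:
            totals[name] += weight * get_score(name, criterion)
        ranked = sorted(totals, key=totals.get, reverse=True)
        if len(ranked) < 2 or totals[ranked[0]] - totals[ranked[1]] >= margin:
            return ranked[0], used
    return None, used


# Hypothetical estimates; in practice each lookup might mean a phone call or an analysis.
ESTIMATES: Dict[str, Dict[str, float]] = {
    "A": {"cost": 7, "performance": 8, "reliability": 5, "mass": 6},
    "B": {"cost": 7, "performance": 6, "reliability": 9, "mass": 6},
    "C": {"cost": 5, "performance": 5, "reliability": 5, "mass": 5},
}
requested: List[Tuple[str, str]] = []


def lookup(name: str, criterion: str) -> float:
    """Record which estimates were actually needed, then return the estimate."""
    requested.append((name, criterion))
    return ESTIMATES[name][criterion]


winner, criteria_used = incremental_comparison(
    ["A", "B", "C"],
    [("cost", 3.0), ("performance", 2.0), ("reliability", 1.0), ("mass", 1.0)],
    lookup,
    margin=3.0,
)
```

In this example A and B tie on cost, performance breaks the tie, and reliability and mass are never estimated, so only six of the twelve possible values had to be supplied.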

Finally, and probably most importantly, most mathematical models assume a rather limited view of decision making (Klein, 1993). Naturalistic decision making is an approach in which decision making is studied in the context of actual tasks and the environments in which they are typically carried out (often work environments) (Klein, 1993; Suchman, 1987). Some of the premises underlying research in naturalistic decision making (Orasanu & Connolly, 1993) are that traditional, mathematically-oriented decision making research focuses on only one part of decision making, the decision event. In a decision event, a single decision maker compares a fixed set of alternatives using a fixed and well defined set of goals. Additionally, if precise information on the performance of each alternative is not available, statistical estimates can be obtained.

However, in natural design decision making situations (Dym, 1994; Thomas & Carroll, 1984; Ullman et al., 1988) few of these assumptions hold. Design is a good example of such a task; alternatives are constantly being added, modified, or refined, as are design goals. Much information is simply unknown or hard to obtain. Additionally, there are practical time considerations; the sheer number of decisions means that most must be made very rapidly. Finally, many important design activities are not captured by a decision event, for example, the information seeking behaviors that precede the selection of an alternative.

Why didn’t designers benefit more from the non-deterministic method?

One might expect that the non-deterministic (fuzzy) method would produce better results than the deterministic method because the former is based on more information. However, this was not observed to be the case in the experiment and set-up described; the average “goodness” values produced for each alternative by the fuzzy and the deterministic methods were very similar to each other (Akhavi, 2006). Thus non-deterministic methods may not provide any direct benefit if applied only to the task of ranking alternatives in a classically framed decision event. Even if the uncertainty of each alternative’s value is displayed, it may not produce a significantly better ranking.

Thus, by framing design decisions as “decision events,” one may be asking the wrong question, “Which alternative is best?” or more accurately, not enough questions. In practice, designers intertwine the questions of “Which alternative is best?” with “Do I have enough information to decide which is best?” The uncertainty in the values of the alternatives can be directly used to assist designers in answering this second question, as illustrated in the example in the next section. Thus, by applying non-deterministic MCDM methods to a wider range of tasks (alternative selection and information seeking decisions) they may better support a designer’s practical needs.
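As a minimal sketch of how uncertainty could bear on the second question, suppose each alternative's overall value is known only as a (low, high) range. One simple rule, assumed here for illustration and not taken from the study, is that there is enough information to choose when the leader's lowest plausible value still exceeds every rival's highest plausible value. The function names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (low, high) overall value of an alternative


def overlapping_rivals(intervals: Dict[str, Interval]) -> Tuple[str, List[str]]:
    """Return the leader (highest midpoint) and rivals whose ranges reach into the leader's."""
    leader = max(intervals, key=lambda name: sum(intervals[name]) / 2.0)
    leader_low = intervals[leader][0]
    rivals = [
        name
        for name, (low, high) in intervals.items()
        if name != leader and high > leader_low
    ]
    return leader, rivals


def enough_information(intervals: Dict[str, Interval]) -> bool:
    """True when no rival's range overlaps the leader's range."""
    _, rivals = overlapping_rivals(intervals)
    return not rivals


# Early in conceptual design: wide ranges, A and B cannot be separated.
early = {"A": (5.0, 9.0), "B": (4.0, 7.5), "C": (2.0, 4.5)}
# After targeted information seeking on A and B: ranges have narrowed.
later = {"A": (6.5, 8.0), "B": (5.0, 6.0), "C": (2.0, 4.5)}

early_leader, early_rivals = overlapping_rivals(early)
```

In the early case the check points at B as the only alternative worth investigating further, and C can be left alone, which is the kind of targeted information seeking the designers were observed to favor.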

Recommendations

The studies described above provide the beginning of an understanding of why product designers do not tend to use formal mathematical methods in their daily work, and what their actual needs are. However, many additional issues need to be explored in order to fully understand the situation, and how to best create human-centered design decision aids.

Future research questions

The following questions are specific examples that should be explored:

Are there ways in which mathematical decision methods can be made more flexible?
Can mathematical methods be adapted to other decision making activities such as information seeking?
What types of visualizations might facilitate decisions pertaining to information seeking, alternative comparison, and selection?
Can the assumptions of mathematical methods be relaxed to fit realistic design situations?

Product designers may be more willing to use decision aids if those decision aids can be better designed to fit the way they actually work. For example, flexible interfaces that allow product designers to switch rapidly between activities such as comparison of alternatives, information seeking, and adding or subtracting criteria may better fit their observed practice of jumping back and forth between these activities. Additionally, information solicited from product designers by the decision aid must consider designers’ willingness to find and enter the data and whether they can realistically obtain it in a timely and cost effective manner. For example, detailed statistical distributions describing cost and performance may not be readily obtainable for novel products, so it may not make sense to use a method that depends on highly accurate distributions. Finally, displays must be designed to present information in ways that facilitate understanding, integration, and navigation of product design information. For example, designers may want to “drill-down” into each of the alternatives to identify the major contributors to the uncertainty in that alternative’s overall value.
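A drill-down of this kind could be computed in many ways. The sketch below assumes a weighted sum over (low, high) criterion estimates with non-negative weights, in which case each criterion's share of the overall range width is simply its normalized weight times the width of its own range. The function name, weights, and ranges are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def uncertainty_contributions(
    weights: Dict[str, float], ranges: Dict[str, Interval]
) -> List[Tuple[str, float]]:
    """Width each criterion adds to the overall value range, largest first."""
    total = sum(weights.values())
    widths = {
        criterion: weights[criterion] * (ranges[criterion][1] - ranges[criterion][0]) / total
        for criterion in weights
    }
    return sorted(widths.items(), key=lambda item: item[1], reverse=True)


# Hypothetical estimates for one alternative.
weights = {"cost": 3.0, "performance": 2.0, "reliability": 1.0}
ranges = {"cost": (4.0, 8.0), "performance": (7.0, 9.0), "reliability": (6.0, 8.0)}

contributions = uncertainty_contributions(weights, ranges)
```

For these numbers cost accounts for two of the three points of overall range width, which suggests that refining the cost estimate would narrow the overall value more than refining either of the other two.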

Closing Thoughts

The contribution of this work is in identifying (a) the actual needs and constraints of designers when making design decisions in work contexts, (b) the ways in which two decision theoretic methods, when formulated as decision events, fail to meet those needs, and (c) a set of future research questions and directions to explore so that user-centered decision support tools can be developed that reflect the way in which designers work, and consider both the costs and benefits of such tools for users. We feel that the major challenges in developing such tools do not necessarily lie in development of new decision theoretic methods, but in gaining a better understanding of how designers work and applying human-centered design principles to existing methods so that they support practical human needs as they exist in the workplace.

Additionally, we would like to emphasize that human-centered design does not mean simply design of understandable displays, although displays certainly play an important role. Equally important, if not more important, is the choice of tasks to which methods are applied and the interactions supported by the tool. All should support the way in which designers understand information, and the processes by which they solve problems. Ideally, displays should be designed so they can be understood with relatively little training by presenting concepts in familiar ways or by using familiar metaphors.

Conclusion

Laboratory studies, protocol studies, and ethnographic observations suggest there is a mismatch between the classical decision theoretic paradigm that focuses on a highly structured decision event, and the way in which designers actually approach such problems. By paying more attention to human-computer interaction issues associated with MCDM approaches, it may be possible to create mathematically-based tools that designers will actually want to use because the tools respect the constraints and challenges of real design tasks and work environments. This work takes the first steps in that direction by providing a greater understanding of how designers approach decision making tasks, what their needs are, and in what ways traditionally applied MCDM approaches meet and do not meet those needs. Tools that designers are willing to use with frequency will have much greater impact on engineering design than those that mostly sit on the shelf gathering dust.

References

Aikins, J. S., Kunz, J. C., Shortliffe, E. H., & Fallat, R. J. (1983, June). PUFF: An expert system for interpretation of pulmonary function data, Computers and Biomedical Research, 16(3), 199-208.

Akhavi, F. (2006). Comparative Study of Fuzzy versus Deterministic Decision Making Techniques in Engineering Design. PhD Thesis, Department of Mechanical Engineering, University of Minnesota.

Akhavi, F. & Hayes, C. C. (2007, October). Decision Making in Engineering Design Tasks: Do Designers Benefit from Representations of Uncertainty? Human Factors and Ergonomics Society (HFES) 51st annual meeting. Baltimore, MD.

Anderson, J. R. (1993). The development of Expertise, Readings in Knowledge Acquisition and Learning: Automating the Construction and Improvement of Expert Systems (pp. 61-77).

Anderson, R. (2003). HorsePRO, Practical Horse Ration Optimizer, MS thesis, Minneapolis, MN, USA: University of Minnesota.

Aughenbaugh, J. M. & Paredis, C. J. (2006, July). The Value of Using Imprecise Probabilities in Engineering Design, Journal of Mechanical Design, 128(4), 969-979.

Bellman, R. E. & Zadeh, L. A. (1970). Decision Making in a Fuzzy Environment, Management Science, 17, B141-B164.

Blanchard, B. S. & Fabrycky, W. J. (eds.) (2006). Systems Engineering and Analysis, fourth edition. International Series in Systems Engineering, Upper Saddle River, New Jersey, USA: Pearson Prentice Hall.

Blomberg, J., Burrel, M., & Guest, G. (2007). An Ethnographic Approach to Design. In J. A. Jacko and A. Sears (Eds.) The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, second edition (pp. 964-986). CRC Press.

Bradley, S. R. & Agogino, A. M. (1994, December). An Intelligent Real Time Design Methodology for Catalog Selection, Transactions of the ASME, Journal of Mechanical Design, 116, 980-988.

Dym, C. L. (1994). Engineering Design: A Synthesis of Views. New York, NY: Cambridge University Press.

Ericsson, K. A. & Simon, H. A. (1980). Verbal Reports as Data, Psychological Data. Cambridge, MA USA: MIT Press.

Grudin, J. (1988). Why CSCW Applications Fail: Problems in the Design of Organizational Interfaces. Proceedings of the 1988 ACM Conference on Computer-supported Cooperative Work (pp. 85-93) Portland, Oregon, USA.

Hayes, C. C., Fiebig-Brodie, C., Winkler, R., & Schlabach, J. (2001). FOX-GA: A Course of Action Generator. M.S. Vassiliou & T.S. Huang (Eds.) (Rockwell Scientific Company) Computer-Science Handbook for Displays – Summary of Findings from the Army Research Lab’s Advanced Displays & Interactive Displays Federated Laboratory (pp. 187-195).

Hayes C. C. & Parzen, M.I. (1997). QUEM: An Achievement Test for Knowledge-Based Systems, IEEE Transactions on Knowledge and Data Engineering, 9(6):838-847.

Hayes, C. C. & Wright, P. K. (1989). Using a Manufacturing Constraint Network to Identify Cost-Critical Areas of Designs, Artificial Intelligence for Engineering Design and Manufacturing, 9, 73-87.

Hayes, J. R. (1981). The Complete Problem Solver (pp. 145-160). Philadelphia, PA: The Franklin Institute Press.

Ishii, K. (2004, September). Twenty years of DFM curriculum at Stanford: a Tribute to Philip Barkan, Plenary talk at the ASME Design for Manufacturing Conference, Salt Lake City, Utah, USA.

Klein, A. G. (1993). Decision Making in Action. Norwood, New Jersey, USA: Ablex Publishing Corporation.

Law, W. S. (1996). Evaluating imprecision in engineering design. Ph.D. Dissertation, California Institute of Technology, Pasadena, California.

Mark, G. (2002). Extreme collaboration, Communications of the ACM, 45(6):89-93.

Marshak, W. P., Brodie, C., Winkler, R., Stein, R., & Khakshour, A. (1999, February). Evaluating Intelligent Aiding of Course of Action Decisions Using the Fox Genetic Algorithm in 2-d and 3-d Displays. Advanced Displays and Interactive Displays ARL Federated Laboratories 3rd Annual Symposiums (pp. 37-31) College Park, MD.

Morse, E. L, Lewis, M., Korfhage, R. R., & Olsen, K. (1998). Evaluation of Text, Numeric and Graphical Representations for Retrieval Interfaces: User Preference and Task Performance Measures. IEEE International Conference on Systems, Man and Cybernetics: vol. 1 (pp. 1026-1031).

Orasanu, J. & Connolly, T. (1993). The Reinvention of Decision Making. In G. Klein, J. Orasanu, R.; Calderwood and C. E. Zsambok (Eds.) Decision Making in Action: Models and Methods, Norwood, New Jersey, USA: Ablex Publishing Corporation.

Pahl, G. & Beitz, W. (2006). Engineering Design: A Systematic Approach, second edition. London, England: Springer-Verlag.

Saaty, T. L. (1980). The Analytical Hierarchy Process. New York, NY, USA: MacGraw-Hill.

Simon, H. A. (1985). Sciences of the Artificial, second edition. Cambridge, MA USA: MIT Press.

Suchman, L. (1987). Plans and Situated Actions: the Problem of Human-machine Communication. New York: Cambridge University Press.

Thomas, J. C & Carroll, J. M. (1984) The Psychological Study of Design. In N. Cross (Ed.) Developments in Design Methodology (pp. 226-227) Chichester, England: John Wiley and Sons.

Thurston, D. L. & Carnahan, J. V. (1992). Fuzzy Rating and Utility Analysis in Preliminary Design Evaluation of Multiple Attributes, Transactions of the American Society of Mechanical Engineers, Journal of Mechanical Design, 114(44), 646-658.

Tversky, A. (2003). Preference, Belief, and Similarity: Selected Writings. E. Shafir (Ed.), MIT Press.

Ullman, D. G., Dietterich, T. G., & Stauffer, L. A. (1988). A Model of the Mechanical Design Process Based on Empirical Data, AI EDAM 2(1), 33-52, Academic Press.