1. Yes, randomization was done, though the details of the process (how it was actually done) are not described. Initially, both the experimental and the control group consisted of 21 two-person parties; one party was later excluded from each group to avoid contamination of the design.
2. The "treatment" was a personal introduction by the server ("My name is Kim."). The parties in the control group did not receive this treatment.
3. The dependent variable in this experiment was the level of tipping.
The underlying causal hypothesis can be formulated as: "If the server introduces him/herself by name, the guests will tip more generously." OR: "The more personal the greeting by the server, the higher the tip."
4. There was no pretest (i.e., no measurement of the dependent variable before the treatment); there was a posttest only. The posttest simply consisted of recording the amount of the tip. In the terminology of the textbook (see p. 234), this study used a "randomized comparative posttest design".
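The source does not describe how the parties were randomized. As a purely illustrative sketch of what a randomized comparative posttest design involves, the following Python code assigns 42 hypothetical party IDs to two groups of 21 by shuffling and then compares mean posttest scores. The function names, the seed, and the tip percentages are made-up placeholders, not data or procedures from the study.

```python
import random
from statistics import mean

def assign_groups(party_ids, seed=None):
    """Randomly split party IDs into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(party_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def posttest_difference(experimental_scores, control_scores):
    """Difference in mean posttest scores (experimental minus control)."""
    return mean(experimental_scores) - mean(control_scores)

# Hypothetical party IDs 1..42; the seed is arbitrary and only makes the run repeatable.
experimental, control = assign_groups(range(1, 43), seed=0)

# Made-up tip percentages for illustration only (NOT the study's results).
example_experimental_tips = [20.0, 22.5, 18.0]
example_control_tips = [15.0, 16.5, 14.0]
difference = posttest_difference(example_experimental_tips, example_control_tips)
```

Because there is no pretest, the only comparison available in this design is the one between the two groups' posttest means; the sketch makes no claim about statistical significance.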
Discussion of causal validity
1. The absence of a pretest is always reason for concern. Even if the randomization was properly done, it is possible (though not very likely) that the experimental and control groups differed at the start of the experiment, and without a pretest such a difference cannot be detected.
2. The contact between experimenter (server) and subjects was not limited to the actual application of the treatment. Rather, the experimenter interacted with the subjects in the period between the application of the treatment and the posttest. Since the experiment took place in a natural setting ("field experiment") rather than in a lab, the outcome of the experiment may be affected by non-treatment interaction between experimenter and subject.
3. There may be an experimenter effect, in which the experimenter (subconsciously) treats subjects in the experimental group differently from those in the control group.
Conclusion: Yes, there are some concerns about the causal validity of the findings. However, without changing the contents of the hypothesis, it is not easily apparent how the research design could be improved to eliminate such concerns.
Obviously, there is more than just one solution to this problem. So, the following is just one possibility:
1. Take two day sections of the methods course in a given semester and have them taught by the same instructor using the same textbook. The self-selection of students into one of the two day sections can be considered approximately random as long as the difference in how the sections are taught is not made known beforehand.
2. One section is considered the experimental group and will spend a number of class sessions in a computer lab where each student will have to use the software program. The other section is taught in a regular classroom only, and students are told that the contents of the diskette are irrelevant for how the class is taught and that they should just ignore the diskette.
3. The dependent variable is knowledge of methods as it is covered by the instructional software. Multiple choice tests are given at the start of the semester (pretest) and at the end of the period over which the lab sessions are held and the instructional software is used (posttest). Similar but not identical tests are used for pretest and posttest to eliminate an "instrumentation" effect.
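One possible way to analyze such a pretest-posttest design is to compare the average gain (posttest minus pretest) of the two sections. The following Python sketch illustrates the arithmetic only; the function names and all test scores are hypothetical, and the proposed design itself does not prescribe a particular analysis.

```python
from statistics import mean

def mean_gain(pretest, posttest):
    """Average per-student gain; the two lists are paired by student."""
    if len(pretest) != len(posttest):
        raise ValueError("pretest and posttest must cover the same students")
    return mean([post - pre for pre, post in zip(pretest, posttest)])

def gain_difference(exp_pre, exp_post, ctl_pre, ctl_post):
    """Mean gain of the experimental section minus mean gain of the control section."""
    return mean_gain(exp_pre, exp_post) - mean_gain(ctl_pre, ctl_post)

# Made-up multiple choice scores for three students per section (illustration only).
lab_pre, lab_post = [10, 12, 11], [16, 17, 15]              # experimental section
classroom_pre, classroom_post = [11, 10, 12], [13, 12, 14]  # control section

effect = gain_difference(lab_pre, lab_post, classroom_pre, classroom_post)
```

Comparing gains rather than raw posttest scores takes into account any initial difference between the sections, which matters here because students select their own section rather than being strictly randomized.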