Front. Psychol. Frontiers in Psychology Front. Psychol. 1664-1078 Frontiers Media S.A. 10.3389/fpsyg.2018.01833 Psychology Original Research Why Can Only 24% Solve Bayesian Reasoning Problems in Natural Frequencies: Frequency Phobia in Spite of Probability Blindness Weber Patrick * Binder Karin Krauss Stefan Mathematics Education, Faculty of Mathematics, University of Regensburg, Regensburg, Germany

Edited by: Gorka Navarrete, Adolfo Ibáñez University, Chile

Reviewed by: Laura Felicia Martignon, Ludwigsburg University, Germany; Luana Micallef, University of Copenhagen, Denmark

*Correspondence: Patrick Weber patrick.weber@ur.de

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

12 10 2018 2018 9 1833 16 03 2018 07 09 2018 Copyright © 2018 Weber, Binder and Krauss. 2018 Weber, Binder and Krauss

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

For more than 20 years, research has proven the beneficial effect of natural frequencies when it comes to solving Bayesian reasoning tasks (Gigerenzer and Hoffrage, 1995). In a recent meta-analysis, McDowell and Jacobs (2017) showed that presenting a task in natural frequency format increases performance rates to 24% compared to only 4% when the same task is presented in probability format. Nevertheless, on average three quarters of participants in their meta-analysis failed to obtain the correct solution for such a task in frequency format. In this paper, we present an empirical study on what participants typically do wrong when confronted with natural frequencies. We found that many of them did not actually use natural frequencies for their calculations, but translated them back into complicated probabilities instead. This switch from the intuitive presentation format to a less intuitive calculation format will be discussed within the framework of psychological theories (e.g., the Einstellung effect).

Bayesian reasoning natural frequencies probabilities einstellung tree diagram

香京julia种子在线播放

    1. <form id=HxFbUHhlv><nobr id=HxFbUHhlv></nobr></form>
      <address id=HxFbUHhlv><nobr id=HxFbUHhlv><nobr id=HxFbUHhlv></nobr></nobr></address>

      Introduction

      Many professionals, such as medical doctors and judges in court, are expected to make momentous decisions based on statistical information. Often, Bayesian inferences are required, for example when a radiologist has to judge and communicate the statistical meaning of a positive mammography screening. Many empirical studies have documented faulty inferences and even cognitive illusions among professionals of various disciplines (Hoffrage et al., 2000; Operskalski and Barbey, 2016). In the medical context, the consequences are particularly severe because many patients are mistakenly found diseased, which can entirely change their lives (Brewer et al., 2007; Gigerenzer et al., 2007; Salz et al., 2010; Wegwarth and Gigerenzer, 2013). Similarly, insufficient knowledge of statistics in general and incorrect Bayesian reasoning in particular can result in false convictions or acquittals made by juries in court, for example when they have to evaluate evidence based on a fragmentary DNA sample. These faults bear the risk of destroying innocent people's lives, too, as happened, for instance, in the famous case of Sally Clark (Schneps and Colmez, 2013; Barker, 2017).

      Typically, the statistical information that the aforementioned professionals are confronted with is provided in probability format, that is, fractions or percentages describing the probability of a single event, for example the prevalence of breast cancer in the population. Generally, in situations where Bayesian inferences are necessary, three pieces of statistical information are given: the base rate (or a priori probability), sensitivity, and false alarm rate. Consider, for instance, the heroin addiction problem (adapted from Gigerenzer and Hoffrage, 1995):

      Heroin addiction problem (probability format):

      The probability of being addicted to heroin is 0.01% for a person randomly picked from a population (base rate). If a randomly picked person from this population is addicted to heroin, the probability is 100% that he or she will have fresh needle pricks (sensitivity). If a randomly picked person from this population is not addicted to heroin, the probability is 0.19% that he or she will still have fresh needle pricks (false alarm rate). What is the probability that a randomly picked person from this population who has fresh needle pricks is addicted to heroin (posterior probability)?

      With the help of Bayes' theorem, the corresponding posterior probability P(H|N), with H denoting “person is addicted to heroin” and N denoting “person has fresh needle pricks,” can be calculated.

      P(H|N) = P(N|H) · P(H)P(N|H) · P(H) + P(N|¬H) · P(¬H) = 100% · 0.01%100% · 0.01% + 0.19% · 99.99%  5% 

      Given the probabilistic information (the low base rate, high sensitivity, and low false alarm rate), the result of only 5% seems astonishingly low to most people—professionals and laypeople alike. In fact, only very few—on average as few as 4% of the participants included in a comprehensive meta-analysis (McDowell and Jacobs, 2017)—are able to draw the correct inferences necessary to come to the right conclusion in such Bayesian tasks. The vast majority of people have difficulties, which can result in severe misjudgments.

      The reasons for this poor performance in Bayesian reasoning are widely discussed. One explanation is the neglect of the base rate, which can be very low in many Bayesian situations (Tversky and Kahneman, 1974; Bar-Hillel, 1983). This leads to much greater estimates for the posterior probability, which is consistent with most people's intuition. Further reasons for the poor performance include participants neglecting the false alarm rate P(N|H) or confusing the false alarm rate with the posterior probability P(H|N) (Gigerenzer and Hoffrage, 1995) as well as participants overweighing the sensitivity (e.g., McCloy et al., 2007).

      In order to prevent dangerous misjudgments due to faulty Bayesian inferences, the concept of natural frequencies has proven to be a powerful instrument (e.g., Gigerenzer and Hoffrage, 1995; Siegrist and Keller, 2011). Natural frequencies can be obtained by natural sampling (Kleiter, 1994) or, alternatively, by translating probabilities (e.g., “80%”) into expressions consisting of two absolute frequencies (e.g., “80 out of 100”; for a discussion on the equivalence of natural frequencies and probabilities, see section Present Approach). Consider once again the heroin addiction example, this time, however, in natural frequency format:

      10 out of 100,000 people from a given population are addicted to heroin. 10 out of 10 people who are addicted to heroin will have fresh needle pricks. 190 out of 99,990 people who are not addicted to heroin will nevertheless have fresh needle pricks. How many of the people from this population who have fresh needle pricks are addicted to heroin?

      With the help of this format, significantly more people find the correct answer to the problem, which is 10 out of (10 + 190). As a consequence, performance rates in the frequency format typically increase to about 24% (McDowell and Jacobs, 2017). Errors due to base rate neglect as mentioned above occur less often with natural frequencies, since the base rate need not be attended to in the frequency version because it is already included in the information on the sensitivity and false alarm rate. Thus, Bayes' modified theorem containing natural frequencies yields the correct answer of “10 out of 200” in the heroin addiction problem based on a simpler computation: P(H|N)=#(NH)#(N)=1010+190=5%

      More than 20 years of research have confirmed the benefit that comes with the concept of natural frequencies in Bayesian reasoning situations. Laypeople, students, professionals across various domains (e.g., medicine, law, and management), and even children perform significantly better when working on a Bayesian reasoning task that is presented in natural frequencies instead of probabilities (e.g., Wassner, 2004; Zhu and Gigerenzer, 2006; Hoffrage et al., 2015; Binder et al., 2018).

      Additionally, various other factors are known to have an impact on performance in Bayesian reasoning tasks. Visualizations, for example tree diagrams (e.g., Yamagishi, 2003; Binder et al., 2018), unit squares (e.g., Böcherer-Linder and Eichler, 2017; Pfannkuch and Budgett, 2017), icon arrays (e.g., Brase, 2009, 2014) or roulette wheel diagrams (e.g., Yamagishi, 2003; Brase, 2014), have been shown to improve accuracies in Bayesian situations (for an exception, see, e.g., Micallef et al., 2012). An overview and categorization of visualizations that were used to boost performance in Bayesian situations is provided by Khan et al. (2015). Furthermore, individual differences of participants, particularly cognitive abilities such as numeracy, graphicacy, and spatial abilities, certainly have an impact on performance rates (e.g., Chapman and Liu, 2009; Brown et al., 2011; Micallef et al., 2012; Peters, 2012; Ottley et al., 2016). In addition, the specific numerical values for population size, base rate, sensitivity, and false alarm rate can influence accuracies (Schapira et al., 2001). Cognitive biases and judgment errors associated with different numerical information are, for example, size effect and distance effect (Moyer and Landauer, 1967). Finally, details of the representation and framing of the problem text can affect performance in Bayesian reasoning situations (Obrecht et al., 2012). Ottley et al. (2016), for example, were able to show that specific problem formulations (e.g., providing all numerical information in context of the task, that is, not only base rate, sensitivity, and false alarm rate but also the probability or frequency of their respective complement) influence accuracies significantly.

      However, instead of contributing to the abundance of empirical studies replicating and discussing the beneficial effect of natural frequencies or other factors (e.g., Hoffrage et al., 2002; Pighin et al., 2016; McDowell et al., 2018), in this article we will focus on the other side of the coin, that is, on the 76% of participants in these studies (on average in McDowell and Jacobs, 2017) who failed to solve Bayesian reasoning tasks with natural frequencies. Why can still on average only a quarter of participants solve the problem correctly, although the task is presented in the beneficial natural frequency format? Many psychological theories explain, discuss, and specify in detail if and why natural frequencies facilitate Bayesian inferences (e.g., the nested sets-hypothesis or the ecological rationality framework, see Gigerenzer and Hoffrage, 1999; Lewis and Keren, 1999; Mellers and McGraw, 1999; Girotto and Gonzalez, 2001, 2002; Hoffrage et al., 2002; Sloman et al., 2003; Barbey and Sloman, 2007; Pighin et al., 2016; McDowell et al., 2018) and how additional tools, such as visualizations, further increase their beneficial effect (e.g., Yamagishi, 2003; Brase, 2009, 2014; Spiegelhalter et al., 2011; Micallef et al., 2012; Garcia-Retamero and Hoffrage, 2013; Micallef, 2013; Ottley et al., 2016; Böcherer-Linder and Eichler, 2017). However, a satisfying answer to the question why only 24% of participants solve Bayesian reasoning problems in natural frequency format correctly has not yet been found.

      Present approach

      In order to explain why only 24% of participants draw correct Bayesian inferences when confronted with natural frequencies, in the present article we take one step back and switch our focus from performance rates to cognitive processes. In this respect, some important questions have not been addressed in detail so far: When given a Bayesian reasoning problem in frequency format, how do participants who fail to provide the correct answer approach the task? Where exactly do their calculations fail and why?

      In order to gain a first impression of what participants might do when confronted with a task in natural frequency format, we checked the questionnaires from our previous studies on Bayesian reasoning and natural frequencies (e.g., Krauss et al., 1999; Binder et al., 2015). Interestingly, we revealed some instances where participants had not applied the given natural frequencies but had translated them back into probabilities. In order to explore this phenomenon in depth, we had a closer look on what students usually learn about Bayesian reasoning problems in their high school statistics classes.

      Over the past two decades, statistics education has become an important column in German high school curricula. Here, just like in other countries, systematic calculation with probabilities has been in the center of teaching efforts. Alternative formats, such as natural frequencies, have despite the great amount of empirical research underpinning their benefits only played a minor role (cf. the American GAISE recommendations; Franklin et al., 2007). Even though there are some very recent efforts to implement the frequency concept in German curricula, for example in the new Bavarian high school curriculum for grade 10 (ISB, 2016), there still seems to be a tendency that this format is not accepted as equally mathematically valid as probabilities. This is supported by our impression from trainings for mathematics teachers that the concept of natural frequencies is not even familiar to most teachers. Furthermore, many schoolbooks tend to solve statistical tasks (not only Bayesian ones) with probability calculations, even when the task is presented in absolute frequencies (e.g., Freytag et al., 2008; Rach, 2018). Another observation we made based on a review of typical Bavarian school textbooks (Eisentraut et al., 2008; Freytag et al., 2008; Schmid et al., 2008) and workbooks (Sendner and Ruf-Oesterreicher, 2011; Reimann and Bichler, 2015) was that the more advanced students become in their high school career, the fewer statistical tasks are solved with natural frequencies by the respective textbooks. In conclusion, high school (and, consequently, university) students are a lot more familiar with probabilities than with natural frequencies due to their general (and sometimes even tertiary) statistical education. This implies that working with probabilities is a well-established strategy when it comes to solving statistical problems.

      While in many situations people profit from such an established strategy, in some cases, however, a previously fixed mindset can block simpler ways to approaching a problem (Haager et al., 2014). This phenomenon lies at the center of prominent psychological theories on cognitive rigidity. Consider, for example, the so-called Einstellung or mental set effect (Luchins, 1942). When solving a problem, people often rigidly apply a previously learnt solution strategy while neglecting possibly important information that would allow an easier solution. Such an Einstellung or mental set can be developed through repeated training, enabling the person to quickly solve problems of the same structure (Schultz and Searleman, 2002; Ellis and Reingold, 2014; Haager et al., 2014). However, the downside of these mental sets is that they can make a person “blind” to simpler solutions or—in the worst case—unable to find a solution at all.

      The most famous example for the Einstellung effect is Luchin's water jar experiment (1942; for more recent studies on the Einstellung effect in chess players and with anagram problems see, e.g., Bilalić et al., 2008; Ellis and Reingold, 2014). Participants in Luchin's study had to work out on paper how to obtain a certain volume of water using three empty jars of different sizes for measuring. The first five problems could all be solved by applying a relatively complicated strategy that was shown to the participants in an example problem. For the following five problems, a much simpler solution method was possible. However, the majority of participants kept using the complicated strategy they had previously learnt. Moreover, many of them could not solve the eighth problem at all, for which only the simple solution strategy was appropriate (Luchins, 1942).

      Recent research has shown that even experts can be subject to the Einstellung effect (e.g., Bilalić et al., 2008). Thus, mental sets developed over a long period of time can also lead to the blocking of simple solutions (for a detailed discussion of different aspects of cognitive rigidity see Schultz and Searleman, 2002). The probability strategy, which German students deal with during their whole high school career, would be an example for such a mental set that is developed over time. So taken together, these psychological theories and the strong familiarity of students with probabilities hint toward a possible answer to the question what participants might wish to do when they are confronted with a task in frequency format: They might try to represent the situation in the much more familiar probability format in order to be able to use established probabilities for their calculations.

      Such an Einstellung toward calculating with probabilities instead of natural frequencies would take away all benefits that come with the frequency concept. Calculating with probabilities in a Bayesian context—even though the task is provided in frequency format—has the consequence that the intuitive natural frequency algorithm [formula (2)] is no longer available, the more complicated probability algorithm [formula (1)] has to be applied, and people are no longer able find the correct solution. Thus, the Einstellung effect might explain why on average three quarters of participants fail with natural frequencies. In the same line, we assume that it is very unlikely that people translate probabilities into natural frequencies when given a task in probability format—despite over 20 years of research on the beneficial effects of natural frequencies.

      Here, the question might arise whether the two formats can actually be considered equivalent. In this respect, both mathematical and psychological aspects need to be addressed. First, we will shed light on the respective mathematical frameworks both formats operate in and to what extent these frameworks can be considered equivalent. Second, we will analyze the equivalence of probabilities and natural frequencies from a psychological viewpoint.

      Even though the two formats seem to follow different rules, from a mathematical perspective they can be defined analogously. Weber (2016) showed that natural frequencies can be embedded in a theoretical framework that is isomorphic to a probability space, that is, the structure at the basis of probability theory can be constructed in a similar way for natural frequencies. Thus, all fundamental mathematical properties of probabilities, for example closure, commutativity, and associativity of their addition, can theoretically also be assigned to natural frequencies (for details, see Weber, 2016). Therefore, the two concepts can be considered equivalent, implying that natural frequencies are an information format just as mathematically valid as probabilities.

      However, regardless of this theoretical equivalence of the two formats, a certain psychological uneasiness about the equivalence of natural frequencies and probabilities still seems to exist. It can be speculated that students who do not know about the mathematical framework of the frequency format might switch from natural frequencies to probabilities not only because they think that a probability algorithm is the only or the easiest way to solve the problem but also due to this subtle feeling of uneasiness, which stems from the assumption that natural frequencies are not a mathematically valid tool for solving Bayesian reasoning tasks. The latter implies that participants—even if they realize that a solution can be derived very easily by using natural frequencies—might think that a mathematically justified argumentation requires reasoning in terms of probabilities. All three assumptions (probabilities are the only, the easiest or the only allowed way) might trigger participants to rely on their Einstellung instead of actively using natural frequencies.

      To be clear, we theoretically consider natural frequencies as a superordinate concept for both “expected” and “empirically sampled” frequencies. Expected frequencies constitute frequencies expected in the long run (cf. Hertwig et al., 2004; Spiegelhalter and Gage, 2015; case 2 in Woike et al., 2017) and are often used for problem formulations in natural frequency format. In contrast, empirically sampled frequencies are derived from a natural sampling process (cf. Kleiter, 1994; Fiedler et al., 2000; cases 1 and 3 in Woike et al., 2017; for a discussion of the two sub-concepts of natural frequencies, see also Hertwig et al., 2004; Spiegelhalter and Gage, 2015).

      Of course, in the context of possibly switching between the two formats, besides the information format of the task, also the format in which the question is asked has to be taken into consideration (for a discussion on other details of textual problem representation, see, e.g., Ottley et al., 2016). It has to be noted that several studies (e.g., Cosmides and Tooby, 1996; Evans et al., 2000; Girotto and Gonzalez, 2001; Sirota et al., 2015) suggest that a question format that does not match the information format of the task reduces the natural frequency facilitation effect (Ayal and Beyth-Marom, 2014; Johnson and Tubau, 2015). However, only few studies directly test such incongruent problem and question formats (McDowell and Jacobs, 2017).

      We also do not want to examine incongruent formats (or other factors mentioned above) systematically (e.g., in order to boost performance), but rather aim to implement a question format as neutral as possible that allows for both answer formats simultaneously. Our interest is to observe and analyze a substantial amount of participants for all four possible cases, namely those who stay with the given format (probability or natural frequency) and those who switch to the other format for their calculations, in order to learn from the respective cognitive processes about possible mechanisms underlying the choice of calculation format.

      Since in our questionnaires from previous studies (Krauss et al., 1999; Binder et al., 2015), it was not always possible to judge which calculation format a participant applied, we will now explicitly ask participants to write down their solution algorithm in order to capture cognitive policies. Thus, in the present study we enter new research fields by investigating potential preferences in calculation format—when a problem introduction and question format as neutral as possible are given—that become visible by the way participants try to solve a given Bayesian task.

      Our research questions are:

      Research question 1: Do participants show a general preference of the probability format over natural frequencies that becomes manifest in a strong tendency to

      keep working with probabilities if a task is given in probability format, although a sample population is provided

      even translate a task given in frequency format into probabilities, if the question allows for answers in both formats?

      Research question 2:

      Regardless of the format in which the task is presented, do participants who work on this task actively using natural frequencies make more correct Bayesian inferences than participants who make their computations with probabilities?

      If questions allow for answers in both formats, which factor predicts correct Bayesian inferences better—the format that the task is presented in (presentation format) or the format that participants actively use for their calculations (calculation format)?

      Regarding research question 1, we hypothesized that participants do show a strong preference of probabilities over natural frequencies in both presentation formats. We further assumed that this preference has indeed a detrimental effect on performance in Bayesian reasoning tasks. With regard to research question 2, we therefore hypothesized that actively working with natural frequencies is a stronger predictor for correct inferences than the presentation format of a task.

      Experimental study

      To examine these research questions, we conducted an empirical study with a first sample (N = 114) in 2016 (see section Participants). In the light of the current debate on the replication crisis (e.g., Open Science Collaboration, 2015), we decided to check the robustness of the results obtained with another sample (N = 69) with the same materials and design in 2017/2018. Three participants from the second sample were excluded from the analysis because they indicated that they had already participated in the first sample. Since we detected the same effects for both samples independently, we report the results for the combined sample of N = 180 (see section Results).

      Method

      Participants in our study had to work on two Bayesian reasoning tasks with different scenarios (heroin addiction problem and car accident problem, adapted from Gigerenzer and Hoffrage, 1995) and different numerical data (for design see Table 1 and for problem wordings see Table 2). These two contexts were chosen since they are not as common as, for example, the famous mammography problem, and thus, the chance of a participant already knowing the task beforehand was small. Moreover, both problems refer to daily-life situations, so the participants were expected to have no difficulties understanding the scenarios. One of the two Bayesian problems was presented in probability format and the other one in natural frequency format. We systematically permuted the order of context as well as information format.

      Design of the implemented problem versions.

      Context
      Heroin addiction problem Car accident problem
      Presentation format Probabilities • Introduction: sample provided • Presentation format of the task: probabilities • Question format: probabilities • Visualization presented or to be constructed • Introduction: sample provided • Presentation format of the task: probabilities • Question format: probabilities • Visualization presented or to be constructed
      Natural frequencies • Introduction: sample provided • Presentation format of the task: natural frequencies • Question format: proportions • Visualization presented or to be constructed • Introduction: sample provided • Presentation format of the task: natural frequencies • Question format: proportions • Visualization presented or to be constructed

      Problem formulations.

      Heroin addiction problem Car accident problem
      Probability version Natural frequency version Probability version Natural frequency version
      Introduction Imagine that you randomly meet a person with fresh needle pricks in the street. You are interested in whether this person is addicted to heroin. On the internet, you find the following information for a sample of 100,000 people: Imagine you see a drunken person getting behind the wheel of his or her car after a party. You are interested in the risk of a car accident caused by this person. On the internet, you find the following information for a sample of 10,000 drivers:
      Statistical information The probability that one of these people is addicted to heroin is 0.01%. If one of these people is addicted to heroin, the probability is 100% that he or she will have fresh needle pricks. If one of these people is not addicted to heroin, the probability is 0.19% that he or she will nevertheless have fresh needle pricks. 10 out of 100,000 people are addicted to heroin. 10 out of 10 people who are addicted to heroin will have fresh needle pricks. 190 out of 99,990 people who are not addicted to heroin will nevertheless have fresh needle pricks. The probability that one of these drivers will cause an accident is 1%. If one of these drivers causes an accident, the probability is 55% that he or she is drunk. If one of these drivers does not cause an accident, the probability is 5% that he or she is nevertheless drunk. 100 out of 10,000 drivers cause an accident. 55 out of 100 drivers who cause an accident are drunk. 500 out of 9,900 drivers who do not cause an accident are nevertheless drunk.
      Question What is the probability that one of these people is addicted to heroin, if he or she has fresh needle pricks? Of the people who have fresh needle pricks, what is the proportion of them addicted to heroin? What is the probability that one of these drivers causes an accident, if he or she is drunk? Of the drivers who are drunk, what is the proportion of them causing an accident?
      Visual aid • First task: construct a tree diagram • Second task: consider a presented tree diagram • First task: construct a tree diagram • Second task: consider a presented tree diagram • First task: construct a tree diagram • Second task: consider a presented tree diagram • First task: construct a tree diagram • Second task: consider a presented tree diagram
      Prompt “Please write down your calculations!”

      In typical natural frequency versions, the question reads “How many of the … have/are …?,” often followed by a line “Answer: ____ out of ____.” Note that we are interested in cognitive processes triggered purely by the presentation format and not by a provided question or answer format. Thus, in all natural frequency versions, we wanted to implement a question format that allows both for probability and for natural frequency answers. In order to be as neutral as possible, we decided to use questions for proportions (see Tables 1, 2), which are a common question format in schoolbooks, too. The question “What is the proportion of people…” can be answered by, for example, “5%” or by “10 out of 200” and thus is settled in between probabilities and natural frequencies.

      In the probability versions, formulating a neutral question is rather difficult because a proportion usually refers to a concrete sample. Thus, instead of making the question format as neutral as possible, we decided to provide the participants already in the introduction with a sample population that the probabilities could be referred to (e.g., “On the internet, you find the following information for a sample of 100,000 people”). Thereby, we again allowed for both calculation formats. While in natural frequency versions the option for probability answers lies in the neutral question format, a possible natural frequency answer in probability versions was opened up by providing a concrete sample in the beginning of the task. It is important to note that we did not primarily want to compare performances by presentation format (which would just be a replication of many other studies) but by calculation format, so a total parallelization of the task versions was neither necessary nor the optimal design for our research questions.

      Because Bayesian reasoning tasks in German schoolbooks are usually presented with tree diagrams (Binder et al., 2015), after the question, we either asked for the construction of a tree diagram (in the first task) or presented a tree diagram (in the second task). The aim here was to present stimuli that are as ecologically valid as possible [with respect to (German) teaching contexts both in school and in university] and that provide the option to switch between the two formats. Both at school and at university level, 2 × 2-tables and tree diagrams are most commonly used for teaching Bayesian reasoning, whereas alternative visualizations (unit squares, icon arrays, etc.) are usually omitted. Since both 2 × 2-tables and tree diagrams allow for switching between the two formats (unlike, e.g., icon arrays) and since tree diagrams but not 2 × 2-tables can be directly equipped with conditional probabilities, only tree diagrams remained as visualizations suitable for our study. By using the latter, our hope was to exploratively shed light on whether a tree diagram might influence participants' choice of calculation format, for example by making the given presentation format more salient (for tree diagrams equipped with probabilities or natural frequencies in the heroin addiction problem see Figure 1). In sum, rather than systematically varying specific factors (or boosting performance), we wanted (1) to know how participants reason with the materials usually presented in German schools and universities, and (2) to observe a substantial number of people switching or staying with the presentation format in order to analyze their respective reasoning processes. For the same reasons, we implemented standard problem wordings.

      Tree diagrams visualizing the heroin addiction problem equipped with probabilities and natural frequencies.

      Since participants were explicitly asked to write down all calculations they made in order to solve the task, we were able to judge precisely and systematically which format they used for their calculations (see Supplementary Table 2; also see section Coding).

      The paper and pencil questionnaire contained a short information paper on the study and some general questions, for example on participants' age or study program, as well as the two tasks. Before participants were allowed to start with the second task, they had to hand in their solution for the first task. Participants were allowed to use a pocket calculator that was provided along with the questionnaire. There was no time limit; on average, participants took approximately 5 min to complete the demographic items and 25 min for both tasks.

      Coding

      The normatively correct solutions of the problems were 5% (or 10 out of 200) for the heroin addiction problem and 9.9% (or 55 out of 555) for the car accident problem (the results differ marginally if the task was presented in natural frequencies as opposed to probabilities, e.g., exactly 10% in the car accident probability version vs. 9.9% in the car accident frequency version). In order to guarantee maximum objectivity for classifying the answers as “correct Bayesian inference” or “incorrect Bayesian inference” and also for deciding whether either a probability algorithm or a frequency algorithm had been applied, we used strict coding guidelines (see Supplementary Table 1), which were applied by all coders. Since we were especially interested in whether participants used the correct algorithm for solving the task, mere calculation or rounding errors were neglected, resulting in answers that were classified as “correct Bayesian inference” even though the mathematical result was not entirely correct. In the same line, answers that appeared mathematically correct at first glance were classified as “incorrect Bayesian inference” if the result was just incidentally correct, but a wrong algorithm was applied (this rarely happened).

      Furthermore, we focused on the cognitive processes underlying each response when determining the “calculation format” of an answer. This cognitive process was measured by analyzing the exact calculations each participant wrote down to come to a solution. When a participant used probabilities (or natural frequencies) only, we classified the solution as “calculated with probabilities” (or natural frequencies, respectively). When both formats were clearly visible in the calculations, we classified the answer according to whether the participant used probabilities or natural frequencies for the crucial step in the calculation process, that is, the computation of the denominator in Bayes' formula, as can be seen in equations (1) and (2). Thus, the decisive factor in such unclear cases was the addition of two absolute numbers (in favor of a frequency algorithm) or the multiplication of probabilities (in favor of a probability algorithm, respectively). If, for example, in the heroin addiction problem a participant used both formats for his or her calculations, but added two absolute numbers (e.g., 10 + 190) to obtain the denominator in (2), the answer was classified as “calculated with natural frequencies”. If, on the other hand, a participant used both formats, but multiplied two probabilities (e.g., 0.01 × 100%) like in (1) to obtain the respective probabilities for the numerator or the denominator, we classified the answer as “calculated with probabilities” (no participant added frequencies and multiplied probabilities).

      Two raters coded 21% of all inferences independently according to the coding guidelines (see Supplementary Tables 1, 2). Since in 100% of all cases the correctness was rated in congruence (Cohen's κ = 1; Cohen, 1960), and the calculation format was classified identically in 97% of all cases (Cohen's κ = 0.95), the remaining inferences were rated by one coder.

      Participants

      We recruited N = 114 students from the University of Regensburg (Bavaria) in summer 2016, and N = 69 in winter 2017/2018 (three of which were excluded from the analysis since they had already participated in the study in 2016). Most of these students were enrolled in a teaching math program (N = 147), while some of them studied economic information technology, so a certain level of mathematics competency among the participants can be assumed (see also section Discussion). They were at different stages of their studies (most of them in their first two years) and their age ranged from 18 to 38, with an average of 22 years. Out of the total of N = 180 participants, 121 were female. Since each participant worked on two tasks, we obtained a total of 360 Bayesian inferences including participants' detailed solution algorithms.

      The study was carried out in accordance with the University Research Ethics Standards. Participants were informed that the study was voluntary and anonymous, and no incentives were paid. Participants were asked to give their written informed consent to participate in the study in advance. Thereupon, two students refrained from participating.

      Results

      In the following, we report the results for the combined sample of N = 180 participants, but all detected effects also hold for both the original (N = 114) and the replication sample (N = 66) independently. As far as our first research question is concerned, the results indeed show a strong preference of participants for calculating with probabilities in both contexts. This is illustrated by Figure 2, where, for example, P→ F denotes participants who were provided with a task in probability format but calculated with natural frequencies. On the one hand, when presented with a task in natural frequency format (second and fourth bars of Figure 2), almost half of participants (49%) nevertheless chose to apply probabilities for their calculations, although the neutral question explicitly allowed for answers in both formats. On the other hand, when they faced a probability version of a task (first and third bars of Figure 2), only 18% across both contexts chose to translate the problem into natural frequencies—despite the explicitly given sample population in the introduction. Taken together, according to our design natural frequencies represented the preferred calculation format in only about one third (34%) of all 360 Bayesian tasks although 50% of all tasks were presented in natural frequency format.

      Calculation format by presentation format and context.

      While Figure 2 does not yet display performances, Figure 3 shows performance rates in the resulting four combinations of presentation format and calculation format (P→P, P→F, F→F, F→P) for both problem contexts. It becomes clear that when natural frequencies were actively used for the calculations, performance rates were significantly higher than when probabilities were applied. Remarkably, in our design this holds true almost regardless of the presentation format: For both problems, the patterns look very similar for the two presentation formats. The performance in both problems obviously mainly depends on the calculation format, but only to a small amount on the presentation format. In the heroin addiction problem, the difference between both calculation formats is especially pronounced. The highest performance was detected when both variables presentation format and calculation format were natural frequencies (61% correct responses), descriptively followed by probability tasks that were worked on with frequencies (53% correct responses). In the two other cases (when participants calculated with probabilities), performance rates were considerably lower (13% if the presentation format was probabilities and 9% if the presentation format was natural frequencies).

      Percentages of correct inferences dependent on the presentation and calculation format in both problems.

      In general, the beneficial effect of presenting natural frequencies was replicated by our study. While 20% of the Bayesian tasks in probability format were solved correctly across both contexts, the performance rate for the tasks presented in frequency format was 36% (see Table 3). Compared to McDowell and Jacobs (2017), both of these numbers seem rather high. An explanation might lie within our sample: more than 80% of participants were enrolled in a mathematics education program and might therefore have comparably high numeracy, enabling them to perform above average in math tasks (for an analysis of participants' individual differences and switching behavior depending on their cognitive abilities, see below). Note that we also found context effects (36% correct responses in the heroin context vs. 20% correct inferences in the car accident context).

      Percentage of correct Bayesian inferences by context and presentation format (independent of calculation format).

      Presentation format Context Average
      Heroin addiction problem Car accident problem
      Probabilities 22% (n = 92 inferences) 19% (n = 89 inferences) 20% (n = 181 inferences)
      Natural frequencies 51% (n = 87 inferences) 22% (n = 92 inferences) 36% (n = 179 inferences)

      In order to separate the effects of presentation format and calculation format, we ran a generalized linear mixed model (GLMM) with a logistic link function. Here, we specified probabilities (both as presentation format and as calculation format) as reference category and included the possible explanatory factors “presentation format,” “calculation format” (via dummy coding), and the interaction term of presentation format and calculation format to predict the probability of a correct Bayesian inference in our design.

      According to the results of the generalized linear mixed model, the unstandardized regression coefficient for solving a task that was both presented and calculated in probability format was significant (b0 = −7.03, SE = 1.32, z = −5.32, p < 0.001), showing large inter-individual differences (for a discussion of these results, see below). The (unstandardized) regression coefficient for the presentation format was non-significant (b1 = −3.04, SE = 2.00, z = −1.52, p = 0.13), whereas the calculation format showed a significant regression coefficient (b2 = 9.85, SE = 3.85, z = 2.56, p = 0.01). Finally, the interaction of presentation format and calculation format yielded another significant regression coefficient (b3 = 4.85, SE = 2.22, z = 2.19, p = 0.03), indicating that calculating with natural frequencies increases performance even more when the task is also formulated in natural frequency format (i.e., when the absolute numbers for the frequency algorithm can be directly taken from the problem wording).

      The strong differences of individual competencies lead to extreme (unstandardized) regression coefficients in the model. However, a generalized linear model (neglecting inter-individual differences) estimated regression coefficients that—converted into probabilities via the logistic link function—exactly replicated the performance rates found in our data. This is because the GLMM accounts for these large differences in performances by estimating large inter-individual differences between the participants, as the intercepts (denoting the performances when presentation and calculation format was probabilities) were allowed to vary freely between participants. The substantial influence of the inter-individual differences also becomes apparent when inspecting the model fit: Whereas 6.5% of the variance is explained by the fixed GLMM regression coefficients (marginal R2 = 0.065), the inter-individual differences and the fixed regression coefficients together explain 68.5% of the variance (conditional R2 = 0.685). However notably, despite the large inter-individual differences, the influence of the fixed effects on the results was clear and strong.

      Although we did not explicitly collect data about participants' cognitive abilities (e.g., numeracy, spatial and graphical literacy), these inter-individual differences suggested a closer analysis of our data with this respect. Indeed, we found significant differences in performance especially between two subgroups of our sample: The N = 42 mathematics education students aspiring to teach at the academic school track of the German school system (Gymnasial students) outperformed the other N = 138 participants significantly (50% correct inferences vs. 21%; t(358) = 5.294, p < 0.001). We assume that this difference is due to the higher numerical, spatial, and graphical abilities of the first group, since they generally outperform the other mathematics education students in mathematics exams or mathematical knowledge tests (e.g., Krauss et al., 2008; see also Lindl and Krauss, 2017, Table 5, p. 396). Moreover, the Gymnasial students receive a considerably more thorough education in mathematics through their study program than the rest of our participants. However interestingly, these differences in cognitive abilities did not have any influence on calculation format preferences. Both subgroups tended in a similar way to prefer using probabilities over natural frequencies for their calculations (32% of Gymnasial students' solutions were based on a frequency algorithm, whereas 35% of the other participants calculated with natural frequencies; t(358) = −0.506; p = 0.613). As a consequence, although an overall shift of performances might be expected depending on participants' cognitive abilities and education, we assume a certain generalizability of our results across varying abilities and education levels regarding the switching rates (cf. section Discussion).

      By examining exploratively participants' reactions on a presented tree diagram, we revealed several instances where the participants had added probabilities to the branches of a tree diagram originally presented with natural frequencies in the nodes. Conversely, only few of the participants equipped a tree diagram that was originally presented in probability format with natural frequencies. When the participants had to construct actively a tree diagram visualizing the textual problem, we detected some instances where already before the diagram was drawn, participants had switched in their calculation format (in both directions: from natural frequencies to probabilities and vice versa). Therefore, some participants translated the presentation format into their calculation format right at the beginning of their problem solution process. However, since we did not systematically test versions without a visualization clue, these findings have to be considered only explorative hints concerning possible cognitive mechanisms that might lead participants to stay with a certain format or to switch from one to the other. These mechanisms will have to be addressed more closely in future research.

      Discussion

      In an empirical study with N = 180 students from the University of Regensburg, we found that the majority of participants do not actively use natural frequencies in Bayesian reasoning tasks. Even if the task is presented in the intuitive natural frequency format (with a neutral question asking for proportions), about half of the participants still prefer calculating with probabilities instead. Therefore, and since the “standardized” probability format is the “sine qua non” in probability theory, the results of our study reveal the Einstellung effect in Bayesian reasoning situations (Luchins, 1942; Luchins and Luchins, 1959; McCloy et al., 2007). We speculate that such an Einstellung might be enhanced by the still widespread idea that natural frequencies are not “mathematically correct” enough to actually work with in high school and university contexts. As a consequence, participants who might actually notice a possible solution of the Bayesian reasoning task based on a frequency algorithm might still rely on probabilities due to a certain kind of “phobia” to use natural frequencies for their calculations (for a discussion on the impact of affect on overcoming fixed mindsets, see Haager et al., 2014)—despite the ever-growing body of research pointing to the beneficial effects of the frequency concept (e.g., Gigerenzer and Hoffrage, 1995; Barbey and Sloman, 2007; Micallef et al., 2012; Obrecht et al., 2012; Ottley et al., 2016; McDowell and Jacobs, 2017).

      Although with our study, we cannot ultimately decide whether the Einstellung effect or this kind of “phobia” lies at the heart of participants' switching back to probabilities, we want to emphasize that both formats are mathematically equivalent in the sense that they can be defined analogously with the same properties and structure. Whatever the case may be, since recent efforts to implement natural frequencies in high school and university curricula appear not to be enough to make people actively take advantage of their benefits, we vouch for an even stronger implementation of the natural frequency concept in secondary education (especially in the higher grades), tertiary education, and in teacher training.

      The Einstellung toward preferring probabilities has a negative impact on performance rates: participants working with probabilities perform significantly worse than those who apply natural frequencies for their calculations. Moreover, at least in our design, the calculation format is an even stronger predictor for performance than the presentation format that previous research has mainly concentrated on (e.g., Barbey and Sloman, 2007; Siegrist and Keller, 2011). This suggests that participants who translate natural frequencies into probabilities follow a path that is disadvantageous in two respects: First, they choose the unintuitive probability over the natural frequency format, and second, they are prone to make further mistakes due to translation errors (that we did not explicitly consider in our study). Interestingly, a few participants (18%) did translate probabilities into natural frequencies. This suggests that at least a small minority is to some extent familiar with the natural frequency concept. These participants profit indeed from calculating with natural frequencies since their performance rates increased substantially compared to performances of participants who stay with probabilities (13 vs. 53% across both implemented contexts). This tendency is a first sign that natural frequencies might become an established solution strategy for Bayesian reasoning tasks.

      It has to be noted that our sample consisted of university students entirely. Since their mindsets and cognitive abilities (especially numeracy as well as graphical and spatial literacy) probably differ from the general population (Micallef et al., 2012), a different sample might, of course, yield different performance rates. However, we assume that even though the total population might generally perform worse than our sample, those using natural frequencies for their calculations will still outperform those who resort to probabilities. In the same way, we would expect an overall shift of performance rates depending on item difficulty or wording (for factors determining the difficulty of Bayesian reasoning tasks as well as for different problem wordings, see, e.g., Ottley et al., 2016), but we assume relative consistency with respect to format preferences across different Bayesian reasoning tasks. Future research might investigate in detail whether our results indicating an Einstellung effect in Bayesian reasoning situations hold also true when individual differences and item difficulty are systematically controlled.

      The context effects in our study in favor of the heroin addiction problem could be explained by having a closer look at the question formulation in the car accident problem. Here, the two relative clauses in the frequency version (see Table 2) demand higher verbal processing abilities and thus make the question harder to understand compared to the frequency question in the heroin addiction problem (only one relative clause, see Table 2). Consequently, the heroin addiction problem presented with natural frequencies yields significantly higher performance rates than the respective version of the car accident problem (51% correct inferences vs. 22%; see Table 3). Moreover, coding in our study was fairly complex (see Supplementary Tables 1, 2), even though we obtained interrater reliability scores of κ = 1 for the correctness of a Bayesian inference and of κ = 0.95 (Cohen, 1960) for determining the calculation format. In addition, we focused only on the correct algorithm applied for classifying an answer as “correct” (see Supplementary Table 1). Thus, we did not concentrate on calculation errors, including those that resulted from translating an information format into the other one. Therefore, we did not systematically detect translation errors dependent on the respective presentation format, in particular. This, however, is a conservative approach, since we assume that more people make mistakes when translating frequencies into probabilities than vice versa.

      Furthermore, in an explorative analysis, we detected several instances where the participants had equipped a presented frequency tree diagram with probabilities, suggesting that such a visualization does not prevent the participants from switching from the natural frequency to the probability format for their calculations. We speculate that even the opposite is the case: Since students are familiar with probability tree diagrams but not so much with frequency tree diagrams from their high school careers, the sight of a tree diagram (even though it is equipped with natural frequencies) might trigger their memories of the familiar probability trees and might thus provoke them to fill the diagram with probabilities. Moreover, many participants equipped the tree diagram they had been asked to draw with their chosen calculation format—even if the latter differed from the presentation format. This suggests that the participants tend to decide on their calculation format right at the beginning of their solution process. We thus speculate that the exact moment of the format switch lies immediately after (or even at the same time as) reading the task. Therefore, further research might investigate systematically when exactly people decide on the format they want to use for their calculations and if people possibly alter their decision during the solution process. In addition, it would be interesting to determine whether presenting a visualization such as a tree diagram or actively constructing one enhances or diminishes the Einstellung effect in Bayesian reasoning tasks (e.g., by systematically comparing versions with and without visualization)—and, more generally, whether visualizations affect the calculation format at all.

      The question remains open to what extent natural frequencies should be implemented in statistics education, since they can only be used in specific situations (e.g., in Bayesian reasoning problems or tasks where cumulative risk judgment is necessary; see McCloy et al., 2007). We suggest that natural frequencies be taught already at a young age to establish the concept over a longer period of time. When—at a later stage—the focus is shifted more and more to probabilities, a permanent interplay between the two formats seems reasonable. By using natural frequencies to illustrate, for example, the multiplication rule or Bayes' theorem, students can understand the two coexisting formats as equally legitimate representations for the underlying concept of uncertainty. Here, natural frequencies can be used to eliminate typical errors, to make difficult problems more understandable, and to prevent cognitive illusions. When probabilities are presented simultaneously, the connection between the two formats might become more apparent and a deeper understanding of the concept of uncertainty might be achieved. In this respect, future work, for example systematic training studies (cf. Sedlmeier and Gigerenzer, 2001), needs to determine the most successful ways to incorporate natural frequencies in statistics education at secondary and tertiary level in order to overcome the Einstellung effect.

      Future research on this topic might also investigate in more detail how much current teachers already know about the frequency concept in order to decide if natural frequencies indeed need a stronger focus in teacher training as we suggest. This could, for example, be realized by systematic teacher interviews. Moreover, future research might address empirically the cognitive mechanisms underlying the Einstellung effect as detected by our study, that is, whether participants assume that a probability algorithm is (a) the only way, (b) the easiest way, or (c) due to a feeling of uneasiness with the frequency concept the only mathematically allowed way to approach the Bayesian problem. Here, qualitative methods such as student interviews might be a valuable tool to clarify situation-specific causes of the Einstellung effect. Finally, it would be interesting to determine effective methods (e.g., visualizations or hints in the problem wording) to prevent people from falling back into probabilities in Bayesian reasoning tasks.

      Data availability statement

      The dataset generated can be found on https://epub.uni-regensburg.de/37693/.

      Ethics statement

      This study was carried out in accordance with the recommendations of University Research Ethics Standards, University of Regensburg. The protocol was approved by the University of Regensburg. All subjects gave written informed consent in accordance with the Declaration of Helsinki.

      Author contributions

      All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

      Conflict of interest statement

      The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

      We want to thank all participants of our study for contributing to our research project. We further thank Sven Hilbert for his statistical advice. This work was supported by the German Research Foundation (DFG) within the funding program Open Access Publishing.

      Supplementary material

      The Supplementary Material for this article can be found online at: /articles/10.3389/fpsyg.2018.01833/full#supplementary-material

      References Ayal S. Beyth-Marom R. (2014). The effects of mental steps and compatibility on Bayesian reasoning. Judgm. Decis. Mak. 9, 226242. Barbey A. K. Sloman S. A. (2007). Base-rate respect: from ecological rationality to dual processes. Behav. Brain Sci. 30, 241297. 10.1017/S0140525X0700165317963533 Bar-Hillel M. (1983). The base rate fallacy controversy. Adv. Psychol.16, 3961. 10.1016/S0166-4115(08)62193-7 Barker M. J. (2017). Connecting applied and theoretical Bayesian epistemology: data relevance, pragmatics, and the legal case of Sally Clark. J. Appl. Philos. 34, 242262. 10.1111/japp.12181 Bilalić M. McLeod P. Gobet F. (2008). Why good thoughts block better ones: the mechanism of the pernicious Einstellung (set) effect. Cognition 108, 652661. 10.1016/j.cognition.2008.05.00518565505 Binder K. Krauss S. Bruckmaier G. (2015). Effects of visualizing statistical information: an empirical study on tree diagrams and 2 × 2 tables. Front. Psychol. 6:1186. 10.3389/fpsyg.2015.0118626379569 Binder K. Krauss S. Bruckmaier G. Marienhagen (2018). Visualizing the Bayesian 2-test case: the effect of tree diagrams on medical decision making. PLoS ONE 13:e0195029. 10.1371/journal.pone.019502929584770 Böcherer-Linder K. Eichler A. (2017). The impact of visualizing nested sets. An empirical study on tree diagrams and unit squares. Front. Psychol. 7:241. 10.3389/fpsyg.2016.0202628123371 Brase G. (2009). Pictorial representations in statistical reasoning. Appl. Cogn. Psychol. 23, 369381. 10.1002/acp.1460 Brase G. (2014). The power of representation and interpretation: doubling statistical reasoning performance with icons and frequentist interpretations of ambiguous numbers. J. Cogn. Psychol. 26, 8197. 10.1080/20445911.2013.861840 Brewer N. T. Salz T. Lillie S. E. (2007). Systematic review: the long-term effects of false-positive mammograms. Ann. Intern. Med. 146, 502510. 10.7326/0003-4819-146-7-200704030-0000617404352 Brown S. M. Culver J. O. Osann K. E. MacDonald D. J. Sand S. Thornton A. A. . (2011). Health litereacy, numeracy, and interpretation of graphical breast cancer risk estimates. Patient Educ. Couns. 83, 9298. 10.1016/j.pec.2010.04.02720554149 Chapman G. B. Liu J. (2009). Numeracy, frequency, and Bayesian reasoning. Judgm. Decis. Mak. 4:34. Cohen J. (1960). A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20, 3746. 10.1177/001316446002000104 Cosmides L. Tooby J. (1996). Are humans good intuitive statisticians after all? Rethinking some conclusions from the literature on judgment under uncertainty. Cognition 58, 173. 10.1016/0010-0277(95)00664-8 Eisentraut F. Ernst S. Keck K. Leeb P. Schätz U. Steuer H. Schätz, R. . (2008). Delta 10 – Mathematik für Gymnasien [Delta 10 – Mathematics for the Academic School Track]. Bamberg: CC Buchners Verlag. Ellis J. J. Reingold E. M. (2014). The Einstellung effect in anagram problem solving: evidence from eye movements. Front. Psychol. 5:679. 10.3389/fpsyg.2014.0067925071650 Evans J. S. B. T. Handley S. J. Perham N. Over D. E. Thompson V. A. (2000). Frequency versus probability formats in statistical word problems. Cognition 77, 197213. 10.1016/S0010-0277(00)00098-611018509 Fiedler K. Brinkmann B. Betsch T. Wild B. (2000). A sampling approach to biases in conditional probability judgments: beyond base rate neglect and statistical format. J. Exp. Psychol. General 129, 399418. 10.1037//0096-3445.129.3.39911006907 Franklin C. Horton N. Kader G. Moreno J. Murphy M. Snider V. . (2007). Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report – A pre-K-12 Curriculum Framework. Alexandria, VA: American Statistical Association. Available Online at: www.amstat.org/education/gaise Freytag C. Herz A. Kammermeyer F. Kurz K. Peteranderl M. Schmähling R. . (2008). Fokus Mathematik 10 Gymnasium Bayern [Focus on Mathematics 10 for the Bavarian Academic School Track]. Berlin: Cornelsen Verlag. Garcia-Retamero R. Hoffrage U. (2013). Visual representation of statistical information improves diagnostic inferences in doctors and their patients. Soc. Sci. Med. 83, 2733. 10.1016/j.socscimed.2013.01.03423465201 Gigerenzer G. Gaissmaier W. Kurz-Milcke E. Schwartz L. M. Woloshin S. (2007). Helping doctors and patients make sense of health statistics. Psychol. Sci. Public Interest 8, 5396. 10.1111/j.1539-6053.2008.00033.x26161749 Gigerenzer G. Hoffrage U. (1995). How to improve Bayesian reasoning without instruction: frequency formats. Psychol. Rev. 102, 684704. 10.1037/0033-295X.102.4.684 Gigerenzer G. Hoffrage U. (1999). Overcoming difficulties in Bayesian reasoning: a reply to Lewis and Keren (1999) and Mellers and McGraw (1999). Psychol. Rev. 106, 425430. 10.1037/0033-295X.106.2.425 Girotto V. Gonzalez M. (2001). Solving probabilistic and statistical problems: a matter of information structure and question form. Cognition 78, 247276. 10.1016/S00100277(00)00133-511124351 Girotto V. Gonzalez M. (2002). Chances and frequencies in probabilistic reasoning: rejoinder to Hoffrage, Gigerenzer, Krauss, and Martignon. Cognition 84, 353359. 10.1016/S0010-0277(02)00051-312044740 Haager J. S. Kuhbandner C. Pekrun R. (2014). Overcoming fixed mindsets: the role of affect. Cogn. Emot. 28, 756767. 10.1080/02699931.2013.85164524215265 Hertwig R. Barron G. Weber E. U. Erev I. (2004). Decisions from experience and the effect of rare events in risky choice. Psychol. Sci. 15, 534539. 10.1111/j.0956-7976.2004.00715.x15270998 Hoffrage U. Gigerenzer G. Krauss S. Martignon L. (2002). Representation facilitates reasoning: what natural frequencies are and what they are not. Cognition 84, 343352. 10.1016/S0010-0277(02)00050-112044739 Hoffrage U. Krauss S. Martignon L. Gigerenzer G. (2015). Natural frequencies improve Bayesian reasoning in simple and complex inference tasks. Front. Psychol. 6:1473. 10.3389/fpsyg.2015.0147326528197 Hoffrage U. Lindsey S. Hertwig R. Gigerenzer G. (2000). Communicating statistical information. Science 290, 22612262. 10.1126/science.290.5500.226111188724 ISB, (2016). Staatsinstitut für Schulqualität und Bildungsforschung LehrplanPLUS Gymnasium Mathematik 10 [Curriculum for year 10 of the Bavarian academic school track]. Available online at http://www.lehrplanplus.bayern.de/fachlehrplan/gymnasium/10/mathematik. (Accessed 18 July, 2018). [ISB] (Ed.). Johnson E. D. Tubau E. (2015). Comprehension and computation in Bayesian problem solving. Front. Psychol. 6:938. 10.3389/fpsyg.2015.0093826283976 Khan A. Breslav S. Glueck M. Hornbæk K. (2015). Benefits of visualization in the mammography problem. Int. J. Hum. Comput. Stud. 83, 94113. 10.1016/j.ijhcs.2015.07.001 Kleiter G. D. (1994). Natural sampling. Rationality without base rates, in Contributions to Mathematical Psychology, Psychometrics, and Methodology, eds Fisher G. H. Laming D. (New York, NY: Springer), 375388. Krauss S. Baumert J. Blum W. (2008). Secondary mathematics Teachers' pedagogical content knowledge and content knowledge: validation of the COACTIV constructs. Int. J. Math. Educ. 40, 873892. 10.1007/s11858-008-0141-9 Krauss S. Martignon L. Hoffrage U. (1999). Simplifying Bayesian Inference: the General Case, in Model-based Reasoning in Scientific Discovery, ed Magnani N. E. A. (New York, NY: Kluwer Academic/Plenum Publishers), 165179. Lewis C. Keren G. (1999). On the difficulties underlying Bayesian reasoning: a comment on Gigerenzer and Hoffrage. Psychol. Rev. 106, 411416. 10.1037/0033-295X.106.2.411 Lindl A. Krauss S. (2017). Transdisziplinäre Perspektiven auf domänenspezifische Lehrerkompetenzen. Eine Metaanalyse zentraler Resultate der Forschungsprojektes FALKO [Transdisciplinary perspectives on domain specific teacher competences. A meta-analysis of central results of the FALKO research project], in FALKO: Fachspezifische Lehrerkompetenzen. Konzeption von Professionswissenstests in den Fächern Deutsch, Englisch, Latein, Physik, Musik, Evangelische Religion und Pädagogik [FALKO: Subject specific teacher competences. Conception of professional knowledge test in the subjects German, English, Latin, Physics, Musical Education, Evangelical Religious Education, and Pedagogy], eds Krauss S. Lindl A. Schilcher A. Fricke M. Göhring A. Hofmann B. Kirchhoff P. Mulder R. H. (Münster: Waxmann), 381438. Luchins A. S. (1942). Mechanization in problem solving: the effect of einstellung. Psychol. Monogr. 54, 195. 10.1037/h0093502 Luchins A. S. Luchins E. H. (1959). Rigidity of Behavior: A Variational Approach to the Effect of Einstellung. Eugene, OR: University of Oregon Books. McCloy R. Beaman C. P. Morgan B. Speed R. (2007). Training conditional and cumulative risk judgements: the role of frequencies, problem-structure and einstellung. Appl. Cogn. Psychol. 21, 325344. 10.1002/acp.1273 McDowell M. Galesic M. Gigerenzer G. (2018). Natural frequencies do foster public understanding of medical tests: comment on Pighin, Gonzalez, Savadori and Girotto (2016). Medical Decis. Making. 38, 390399. 10.1177/0272989X1875450829448883 McDowell M. Jacobs P. (2017). Meta-Analysis of the Effect of Natural Frequencies on Bayesian Reasoning. Psychol. Bull. 143, 12731312. 10.1037/bul000012629048176 Mellers B. A. McGraw A. P. (1999). How to improve Bayesian reasoning: comment on Gigerenzer and Hoffrage (1995). Psychol. Rev. 106, 417424. 10.1037/0033-295X.106.2.417 Micallef L. (2013). Visualizing Set Relations and Cardinalities Using Venn and Euler Diagrams. University of Kent. Dissertation. Micallef L. Dragicevic P. Fekete J. (2012). Assessing the effect of visualizations on Bayesian reasoning through crowdsourcing. Visualization and Computer Graphics. IEEE Trans. Visual. Comput. Graph. 18, 25362545. 10.1109/TVCG.2012.199 Moyer R. S. Landauer T. K. (1967). Time required for judgements of numerical inequality. Nature 215, 15191520. 10.1038/2151519a06052760 Obrecht N. A. Anderson B. Schulkin J. Chapman G. B. (2012). Retrospective frequency formats promote consistent experience-based Bayesian judgments. Appl. Cogn. Psychol. 26, 436440. 10.1002/acp.2816 Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science 349, 18. 10.1126/science.aac4716 Operskalski J. T. Barbey A. K. (2016). Risk literacy in medical decision-making. Science 352, 413414. 10.1126/science.aaf796627102467 Ottley A. Peck E. M. Harrison L. T. Afergan D. Ziemkiewicz C. Taylor H. A. . (2016). Improving Bayesian reasoning: the effects of phrasing, visualization, and spatial ability. IEEE Trans. Vis. Comput. Graph. 22, 529538. 10.1109/TVCG.2015.246775826390491 Peters E. (2012). Beyond comprehension: the role of numeracy in judgments and decisions. Curr. Dir. Psychol. Sci. 21, 3135. 10.1177/0963721411429960 Pfannkuch M. Budgett S. (2017). Reasoning from an eikosogram: an exploratory study. Int. J. Res. Undergraduate Math. Educ. 3, 283310. 10.1007/s40753-016-0043-0 Pighin S. Gonzalez M. Savadori L. Girotto V. (2016). Natural frequencies do not foster public understanding of medical test results. Medical Decision Making 36, 686691. 10.1177/0272989X16640785 Rach S. (2018). Visualisierungen bedingter Wahrscheinlichkeiten – Präferenzen von Schülerinnen und Schülern [Visualizations of conditional probabilities – preferences of students]. Mathemat. Didact. 41, 118. Reimann S. Bichler E. (2015). Abitur 2016: Original-Prüfungsaufgaben mit Lösungen – Gymnasium Bayern Mathematik [Final secondary-school examinations 2016: Original mathematics exam tasks with solutions – Bavarian academic school track]. Hallbergmoos: Stark Verlag. Salz T. Richman A. R. Brewer N. T. (2010). Meta-analyses of the effect of false-positive mammograms on generic and specific psychosocial outcomes. Psycho-Oncol. 19, 10261034. 10.1002/pon.167620882572 Schapira M. M. Nattinger A. B. McHorney C. A. (2001). Frequency or probability? A qualitative study of risk communication formats used in health care. Med. Decis. Making 21, 459467. 10.1177/0272989X010210060411760103 Schmid A. Weidig I. Götz H. Herbst M. Kestler C. Kosuch H. . (2008). Lambacher Schweizer 10 – Mathematik für Gymnasien Bayern [Lambacher Schweizer 10 – Mathematics for the Bavarian academic school track]. Stuttgart: Ernst Klett Verlag. Schneps L. Colmez C. (2013). Math on trial: How Numbers Get Used and Abused in the Courtroom. New York, NY: Basic Books. Schultz P. W. Searleman A. (2002). Rigidity of thought and behavior: 100 years of research. Genet. Soc. Gen. Psychol. Monogr. 128, 165207. 12194421 Sedlmeier P. Gigerenzer G. (2001). Teaching Bayesian reasoning in less than two hours. J. Exp. Psychol. 130, 380400. 10.1037//0096-3445.130.3.38011561916 Sendner S. Ruf-Oesterreicher K. (2011). Lambacher Schweizer 10 – Mathematik für Gymnasien Bayern: Lösungen und Materialien [Lambacher Schweizer 10 – Mathematics for the Bavarian academic school track: Solutions and materials]. Stuttgart: Ernst Klett Verlag. Siegrist M. Keller C. (2011). Natural frequencies and Bayesian reasoning: the impact of formal education and problem context. J. Risk Res. 14, 10391055. 10.1080/13669877.2011.571786 Sirota M. Kostovičová L. Vallée-Tourangeau F. (2015). Now you Bayes, now you don't: effects of set-problem and frequency-format mental representations on statistical reasoning. Psychon. Bull. Rev. 22, 14651473. 10.3758/s13423-015-0810-y25711182 Sloman S. A. Over D. Slovak L. Stibel J. M. (2003). Frequency illusions and other fallacies. Organ. Behav. Hum. Decis. Process. 91, 296309. 10.1016/S0749-5978(03)00021-9 Spiegelhalter D. Gage J. (2015). What can education learn from real-world communication of risk and uncertainty? Math. Enthus. 12, 410. Spiegelhalter D. Pearson M. Short I. (2011). Visualizing uncertainty about the future. Science 333, 13931400. 10.1126/science.119118121903802 Tversky A. Kahneman D. (1974). Judgment under uncertainty: heuristics and biases. Science 185, 11241131. 10.1126/science.185.4157.112417835457 Wassner C. (2004). Förderung Bayesianischen Denkens – Kognitionspsychologische Grundlagen und Didaktische Analysen [Promoting Bayesian Reasoning – Principles of Cognitive Psychology, and Didactical Analyses]. Hildesheim: Franzbecker. Weber P. (2016). Natürliche Häufigkeiten – Chancen und Grenzen aus fachwissenschaftlicher und Fachdidaktischer Sicht [Natural Frequencies – Benefits and Limits From a Mathematical and an Educational Perspective]. Master's thesis, University of Regensburg. Wegwarth O. Gigerenzer G. (2013). Overdiagnosis and overtreatment: evaluation of what physicians tell their patients about screening harms. JAMA Intern. Med. 173, 20862088. 10.1001/jamainternmed.2013.1036324145597 Woike J. K. Hoffrage U. Martignon L. (2017). Integrating and testing natural frequencies, naïve Bayes, and fast-and-frugal trees. Decision 4, 234260. 10.1037/dec0000086 Yamagishi K. (2003). Facilitating normative judgments of conditional probability: frequency or nested sets? Exp. Psychol. 50, 97106. 10.1027//1618-3169.50.2.97 Zhu L. Gigerenzer G. (2006). Children can solve Bayesian problems: the role of representation in mental computation. Cognition 98, 287308. 10.1016/j.cognition.2004.12.00316399266
      ‘Oh, my dear Thomas, you haven’t heard the terrible news then?’ she said. ‘I thought you would be sure to have seen it placarded somewhere. Alice went straight to her room, and I haven’t seen her since, though I repeatedly knocked at the door, which she has locked on the inside, and I’m sure it’s most unnatural of her not to let her own mother comfort her. It all happened in a moment: I have always said those great motor-cars shouldn’t be allowed to career about the streets, especially when they are all paved with cobbles as they are at Easton Haven, which are{331} so slippery when it’s wet. He slipped, and it went over him in a moment.’ My thanks were few and awkward, for there still hung to the missive a basting thread, and it was as warm as a nestling bird. I bent low--everybody was emotional in those days--kissed the fragrant thing, thrust it into my bosom, and blushed worse than Camille. "What, the Corner House victim? Is that really a fact?" "My dear child, I don't look upon it in that light at all. The child gave our picturesque friend a certain distinction--'My husband is dead, and this is my only child,' and all that sort of thing. It pays in society." leave them on the steps of a foundling asylum in order to insure [See larger version] Interoffice guff says you're planning definite moves on your own, J. O., and against some opposition. Is the Colonel so poor or so grasping—or what? Albert could not speak, for he felt as if his brains and teeth were rattling about inside his head. The rest of[Pg 188] the family hunched together by the door, the boys gaping idiotically, the girls in tears. "Now you're married." The host was called in, and unlocked a drawer in which they were deposited. The galleyman, with visible reluctance, arrayed himself in the garments, and he was observed to shudder more than once during the investiture of the dead man's apparel. HoME香京julia种子在线播放 ENTER NUMBET 0016www.lylinlong.com.cn
      www.jmqdky.com.cn
      www.lnjyzyq.org.cn
      www.rbtaobao.com.cn
      qqgjhz.com.cn
      www.suyin.net.cn
      npchain.com.cn
      www.qnjswp.com.cn
      www.szcxj5288.com.cn
      www.ly100.com.cn
      处女被大鸡巴操 强奸乱伦小说图片 俄罗斯美女爱爱图 调教强奸学生 亚洲女的穴 夜来香图片大全 美女性强奸电影 手机版色中阁 男性人体艺术素描图 16p成人 欧美性爱360 电影区 亚洲电影 欧美电影 经典三级 偷拍自拍 动漫电影 乱伦电影 变态另类 全部电 类似狠狠鲁的网站 黑吊操白逼图片 韩国黄片种子下载 操逼逼逼逼逼 人妻 小说 p 偷拍10幼女自慰 极品淫水很多 黄色做i爱 日本女人人体电影快播看 大福国小 我爱肏屄美女 mmcrwcom 欧美多人性交图片 肥臀乱伦老头舔阴帝 d09a4343000019c5 西欧人体艺术b xxoo激情短片 未成年人的 插泰国人夭图片 第770弾み1 24p 日本美女性 交动态 eee色播 yantasythunder 操无毛少女屄 亚洲图片你懂的女人 鸡巴插姨娘 特级黄 色大片播 左耳影音先锋 冢本友希全集 日本人体艺术绿色 我爱被舔逼 内射 幼 美阴图 喷水妹子高潮迭起 和后妈 操逼 美女吞鸡巴 鸭个自慰 中国女裸名单 操逼肥臀出水换妻 色站裸体义术 中国行上的漏毛美女叫什么 亚洲妹性交图 欧美美女人裸体人艺照 成人色妹妹直播 WWW_JXCT_COM r日本女人性淫乱 大胆人艺体艺图片 女同接吻av 碰碰哥免费自拍打炮 艳舞写真duppid1 88电影街拍视频 日本自拍做爱qvod 实拍美女性爱组图 少女高清av 浙江真实乱伦迅雷 台湾luanlunxiaoshuo 洛克王国宠物排行榜 皇瑟电影yy频道大全 红孩儿连连看 阴毛摄影 大胆美女写真人体艺术摄影 和风骚三个媳妇在家做爱 性爱办公室高清 18p2p木耳 大波撸影音 大鸡巴插嫩穴小说 一剧不超两个黑人 阿姨诱惑我快播 幼香阁千叶县小学生 少女妇女被狗强奸 曰人体妹妹 十二岁性感幼女 超级乱伦qvod 97爱蜜桃ccc336 日本淫妇阴液 av海量资源999 凤凰影视成仁 辰溪四中艳照门照片 先锋模特裸体展示影片 成人片免费看 自拍百度云 肥白老妇女 女爱人体图片 妈妈一女穴 星野美夏 日本少女dachidu 妹子私处人体图片 yinmindahuitang 舔无毛逼影片快播 田莹疑的裸体照片 三级电影影音先锋02222 妻子被外国老头操 观月雏乃泥鳅 韩国成人偷拍自拍图片 强奸5一9岁幼女小说 汤姆影院av图片 妹妹人艺体图 美女大驱 和女友做爱图片自拍p 绫川まどか在线先锋 那么嫩的逼很少见了 小女孩做爱 处女好逼连连看图图 性感美女在家做爱 近距离抽插骚逼逼 黑屌肏金毛屄 日韩av美少女 看喝尿尿小姐日逼色色色网图片 欧美肛交新视频 美女吃逼逼 av30线上免费 伊人在线三级经典 新视觉影院t6090影院 最新淫色电影网址 天龙影院远古手机版 搞老太影院 插进美女的大屁股里 私人影院加盟费用 www258dd 求一部电影里面有一个二猛哥 深肛交 日本萌妹子人体艺术写真图片 插入屄眼 美女的木奶 中文字幕黄色网址影视先锋 九号女神裸 和骚人妻偷情 和潘晓婷做爱 国模大尺度蜜桃 欧美大逼50p 西西人体成人 李宗瑞继母做爱原图物处理 nianhuawang 男鸡巴的视屏 � 97免费色伦电影 好色网成人 大姨子先锋 淫荡巨乳美女教师妈妈 性nuexiaoshuo WWW36YYYCOM 长春继续给力进屋就操小女儿套干破内射对白淫荡 农夫激情社区 日韩无码bt 欧美美女手掰嫩穴图片 日本援交偷拍自拍 入侵者日本在线播放 亚洲白虎偷拍自拍 常州高见泽日屄 寂寞少妇自卫视频 人体露逼图片 多毛外国老太 变态乱轮手机在线 淫荡妈妈和儿子操逼 伦理片大奶少女 看片神器最新登入地址sqvheqi345com账号群 麻美学姐无头 圣诞老人射小妞和强奸小妞动话片 亚洲AV女老师 先锋影音欧美成人资源 33344iucoom zV天堂电影网 宾馆美女打炮视频 色五月丁香五月magnet 嫂子淫乱小说 张歆艺的老公 吃奶男人视频在线播放 欧美色图男女乱伦 avtt2014ccvom 性插色欲香影院 青青草撸死你青青草 99热久久第一时间 激情套图卡通动漫 幼女裸聊做爱口交 日本女人被强奸乱伦 草榴社区快播 2kkk正在播放兽骑 啊不要人家小穴都湿了 www猎奇影视 A片www245vvcomwwwchnrwhmhzcn 搜索宜春院av wwwsee78co 逼奶鸡巴插 好吊日AV在线视频19gancom 熟女伦乱图片小说 日本免费av无码片在线开苞 鲁大妈撸到爆 裸聊官网 德国熟女xxx 新不夜城论坛首页手机 女虐男网址 男女做爱视频华为网盘 激情午夜天亚洲色图 内裤哥mangent 吉沢明歩制服丝袜WWWHHH710COM 屌逼在线试看 人体艺体阿娇艳照 推荐一个可以免费看片的网站如果被QQ拦截请复制链接在其它浏览器打开xxxyyy5comintr2a2cb551573a2b2e 欧美360精品粉红鲍鱼 教师调教第一页 聚美屋精品图 中韩淫乱群交 俄罗斯撸撸片 把鸡巴插进小姨子的阴道 干干AV成人网 aolasoohpnbcn www84ytom 高清大量潮喷www27dyycom 宝贝开心成人 freefronvideos人母 嫩穴成人网gggg29com 逼着舅妈给我口交肛交彩漫画 欧美色色aV88wwwgangguanscom 老太太操逼自拍视频 777亚洲手机在线播放 有没有夫妻3p小说 色列漫画淫女 午间色站导航 欧美成人处女色大图 童颜巨乳亚洲综合 桃色性欲草 色眯眯射逼 无码中文字幕塞外青楼这是一个 狂日美女老师人妻 爱碰网官网 亚洲图片雅蠛蝶 快播35怎么搜片 2000XXXX电影 新谷露性家庭影院 深深候dvd播放 幼齿用英语怎么说 不雅伦理无需播放器 国外淫荡图片 国外网站幼幼嫩网址 成年人就去色色视频快播 我鲁日日鲁老老老我爱 caoshaonvbi 人体艺术avav 性感性色导航 韩国黄色哥来嫖网站 成人网站美逼 淫荡熟妇自拍 欧美色惰图片 北京空姐透明照 狼堡免费av视频 www776eom 亚洲无码av欧美天堂网男人天堂 欧美激情爆操 a片kk266co 色尼姑成人极速在线视频 国语家庭系列 蒋雯雯 越南伦理 色CC伦理影院手机版 99jbbcom 大鸡巴舅妈 国产偷拍自拍淫荡对话视频 少妇春梦射精 开心激动网 自拍偷牌成人 色桃隐 撸狗网性交视频 淫荡的三位老师 伦理电影wwwqiuxia6commqiuxia6com 怡春院分站 丝袜超短裙露脸迅雷下载 色制服电影院 97超碰好吊色男人 yy6080理论在线宅男日韩福利大全 大嫂丝袜 500人群交手机在线 5sav 偷拍熟女吧 口述我和妹妹的欲望 50p电脑版 wwwavtttcon 3p3com 伦理无码片在线看 欧美成人电影图片岛国性爱伦理电影 先锋影音AV成人欧美 我爱好色 淫电影网 WWW19MMCOM 玛丽罗斯3d同人动画h在线看 动漫女孩裸体 超级丝袜美腿乱伦 1919gogo欣赏 大色逼淫色 www就是撸 激情文学网好骚 A级黄片免费 xedd5com 国内的b是黑的 快播美国成年人片黄 av高跟丝袜视频 上原保奈美巨乳女教师在线观看 校园春色都市激情fefegancom 偷窥自拍XXOO 搜索看马操美女 人本女优视频 日日吧淫淫 人妻巨乳影院 美国女子性爱学校 大肥屁股重口味 啪啪啪啊啊啊不要 操碰 japanfreevideoshome国产 亚州淫荡老熟女人体 伦奸毛片免费在线看 天天影视se 樱桃做爱视频 亚卅av在线视频 x奸小说下载 亚洲色图图片在线 217av天堂网 东方在线撸撸-百度 幼幼丝袜集 灰姑娘的姐姐 青青草在线视频观看对华 86papa路con 亚洲1AV 综合图片2区亚洲 美国美女大逼电影 010插插av成人网站 www色comwww821kxwcom 播乐子成人网免费视频在线观看 大炮撸在线影院 ,www4KkKcom 野花鲁最近30部 wwwCC213wapwww2233ww2download 三客优最新地址 母亲让儿子爽的无码视频 全国黄色片子 欧美色图美国十次 超碰在线直播 性感妖娆操 亚洲肉感熟女色图 a片A毛片管看视频 8vaa褋芯屑 333kk 川岛和津实视频 在线母子乱伦对白 妹妹肥逼五月 亚洲美女自拍 老婆在我面前小说 韩国空姐堪比情趣内衣 干小姐综合 淫妻色五月 添骚穴 WM62COM 23456影视播放器 成人午夜剧场 尼姑福利网 AV区亚洲AV欧美AV512qucomwwwc5508com 经典欧美骚妇 震动棒露出 日韩丝袜美臀巨乳在线 av无限吧看 就去干少妇 色艺无间正面是哪集 校园春色我和老师做爱 漫画夜色 天海丽白色吊带 黄色淫荡性虐小说 午夜高清播放器 文20岁女性荫道口图片 热国产热无码热有码 2015小明发布看看算你色 百度云播影视 美女肏屄屄乱轮小说 家族舔阴AV影片 邪恶在线av有码 父女之交 关于处女破处的三级片 极品护士91在线 欧美虐待女人视频的网站 享受老太太的丝袜 aaazhibuo 8dfvodcom成人 真实自拍足交 群交男女猛插逼 妓女爱爱动态 lin35com是什么网站 abp159 亚洲色图偷拍自拍乱伦熟女抠逼自慰 朝国三级篇 淫三国幻想 免费的av小电影网站 日本阿v视频免费按摩师 av750c0m 黄色片操一下 巨乳少女车震在线观看 操逼 免费 囗述情感一乱伦岳母和女婿 WWW_FAMITSU_COM 偷拍中国少妇在公车被操视频 花也真衣论理电影 大鸡鸡插p洞 新片欧美十八岁美少 进击的巨人神thunderftp 西方美女15p 深圳哪里易找到老女人玩视频 在线成人有声小说 365rrr 女尿图片 我和淫荡的小姨做爱 � 做爱技术体照 淫妇性爱 大学生私拍b 第四射狠狠射小说 色中色成人av社区 和小姨子乱伦肛交 wwwppp62com 俄罗斯巨乳人体艺术 骚逼阿娇 汤芳人体图片大胆 大胆人体艺术bb私处 性感大胸骚货 哪个网站幼女的片多 日本美女本子把 色 五月天 婷婷 快播 美女 美穴艺术 色百合电影导航 大鸡巴用力 孙悟空操美少女战士 狠狠撸美女手掰穴图片 古代女子与兽类交 沙耶香套图 激情成人网区 暴风影音av播放 动漫女孩怎么插第3个 mmmpp44 黑木麻衣无码ed2k 淫荡学姐少妇 乱伦操少女屄 高中性爱故事 骚妹妹爱爱图网 韩国模特剪长发 大鸡巴把我逼日了 中国张柏芝做爱片中国张柏芝做爱片中国张柏芝做爱片中国张柏芝做爱片中国张柏芝做爱片 大胆女人下体艺术图片 789sss 影音先锋在线国内情侣野外性事自拍普通话对白 群撸图库 闪现君打阿乐 ady 小说 插入表妹嫩穴小说 推荐成人资源 网络播放器 成人台 149大胆人体艺术 大屌图片 骚美女成人av 春暖花开春色性吧 女亭婷五月 我上了同桌的姐姐 恋夜秀场主播自慰视频 yzppp 屄茎 操屄女图 美女鲍鱼大特写 淫乱的日本人妻山口玲子 偷拍射精图 性感美女人体艺木图片 种马小说完本 免费电影院 骑士福利导航导航网站 骚老婆足交 国产性爱一级电影 欧美免费成人花花性都 欧美大肥妞性爱视频 家庭乱伦网站快播 偷拍自拍国产毛片 金发美女也用大吊来开包 缔D杏那 yentiyishu人体艺术ytys WWWUUKKMCOM 女人露奶 � 苍井空露逼 老荡妇高跟丝袜足交 偷偷和女友的朋友做爱迅雷 做爱七十二尺 朱丹人体合成 麻腾由纪妃 帅哥撸播种子图 鸡巴插逼动态图片 羙国十次啦中文 WWW137AVCOM 神斗片欧美版华语 有气质女人人休艺术 由美老师放屁电影 欧美女人肉肏图片 白虎种子快播 国产自拍90后女孩 美女在床上疯狂嫩b 饭岛爱最后之作 幼幼强奸摸奶 色97成人动漫 两性性爱打鸡巴插逼 新视觉影院4080青苹果影院 嗯好爽插死我了 阴口艺术照 李宗瑞电影qvod38 爆操舅母 亚洲色图七七影院 被大鸡巴操菊花 怡红院肿么了 成人极品影院删除 欧美性爱大图色图强奸乱 欧美女子与狗随便性交 苍井空的bt种子无码 熟女乱伦长篇小说 大色虫 兽交幼女影音先锋播放 44aad be0ca93900121f9b 先锋天耗ばさ无码 欧毛毛女三级黄色片图 干女人黑木耳照 日本美女少妇嫩逼人体艺术 sesechangchang 色屄屄网 久久撸app下载 色图色噜 美女鸡巴大奶 好吊日在线视频在线观看 透明丝袜脚偷拍自拍 中山怡红院菜单 wcwwwcom下载 骑嫂子 亚洲大色妣 成人故事365ahnet 丝袜家庭教mp4 幼交肛交 妹妹撸撸大妈 日本毛爽 caoprom超碰在email 关于中国古代偷窥的黄片 第一会所老熟女下载 wwwhuangsecome 狼人干综合新地址HD播放 变态儿子强奸乱伦图 强奸电影名字 2wwwer37com 日本毛片基地一亚洲AVmzddcxcn 暗黑圣经仙桃影院 37tpcocn 持月真由xfplay 好吊日在线视频三级网 我爱背入李丽珍 电影师傅床戏在线观看 96插妹妹sexsex88com 豪放家庭在线播放 桃花宝典极夜著豆瓜网 安卓系统播放神器 美美网丝袜诱惑 人人干全免费视频xulawyercn av无插件一本道 全国色五月 操逼电影小说网 good在线wwwyuyuelvcom www18avmmd 撸波波影视无插件 伊人幼女成人电影 会看射的图片 小明插看看 全裸美女扒开粉嫩b 国人自拍性交网站 萝莉白丝足交本子 七草ちとせ巨乳视频 摇摇晃晃的成人电影 兰桂坊成社人区小说www68kqcom 舔阴论坛 久撸客一撸客色国内外成人激情在线 明星门 欧美大胆嫩肉穴爽大片 www牛逼插 性吧星云 少妇性奴的屁眼 人体艺术大胆mscbaidu1imgcn 最新久久色色成人版 l女同在线 小泽玛利亚高潮图片搜索 女性裸b图 肛交bt种子 最热门有声小说 人间添春色 春色猜谜字 樱井莉亚钢管舞视频 小泽玛利亚直美6p 能用的h网 还能看的h网 bl动漫h网 开心五月激 东京热401 男色女色第四色酒色网 怎么下载黄色小说 黄色小说小栽 和谐图城 乐乐影院 色哥导航 特色导航 依依社区 爱窝窝在线 色狼谷成人 91porn 包要你射电影 色色3A丝袜 丝袜妹妹淫网 爱色导航(荐) 好男人激情影院 坏哥哥 第七色 色久久 人格分裂 急先锋 撸撸射中文网 第一会所综合社区 91影院老师机 东方成人激情 怼莪影院吹潮 老鸭窝伊人无码不卡无码一本道 av女柳晶电影 91天生爱风流作品 深爱激情小说私房婷婷网 擼奶av 567pao 里番3d一家人野外 上原在线电影 水岛津实透明丝袜 1314酒色 网旧网俺也去 0855影院 在线无码私人影院 搜索 国产自拍 神马dy888午夜伦理达达兔 农民工黄晓婷 日韩裸体黑丝御姐 屈臣氏的燕窝面膜怎么样つぼみ晶エリーの早漏チ○ポ强化合宿 老熟女人性视频 影音先锋 三上悠亚ol 妹妹影院福利片 hhhhhhhhsxo 午夜天堂热的国产 强奸剧场 全裸香蕉视频无码 亚欧伦理视频 秋霞为什么给封了 日本在线视频空天使 日韩成人aⅴ在线 日本日屌日屄导航视频 在线福利视频 日本推油无码av magnet 在线免费视频 樱井梨吮东 日本一本道在线无码DVD 日本性感诱惑美女做爱阴道流水视频 日本一级av 汤姆avtom在线视频 台湾佬中文娱乐线20 阿v播播下载 橙色影院 奴隶少女护士cg视频 汤姆在线影院无码 偷拍宾馆 业面紧急生级访问 色和尚有线 厕所偷拍一族 av女l 公交色狼优酷视频 裸体视频AV 人与兽肉肉网 董美香ol 花井美纱链接 magnet 西瓜影音 亚洲 自拍 日韩女优欧美激情偷拍自拍 亚洲成年人免费视频 荷兰免费成人电影 深喉呕吐XXⅩX 操石榴在线视频 天天色成人免费视频 314hu四虎 涩久免费视频在线观看 成人电影迅雷下载 能看见整个奶子的香蕉影院 水菜丽百度影音 gwaz079百度云 噜死你们资源站 主播走光视频合集迅雷下载 thumbzilla jappen 精品Av 古川伊织star598在线 假面女皇vip在线视频播放 国产自拍迷情校园 啪啪啪公寓漫画 日本阿AV 黄色手机电影 欧美在线Av影院 华裔电击女神91在线 亚洲欧美专区 1日本1000部免费视频 开放90后 波多野结衣 东方 影院av 页面升级紧急访问每天正常更新 4438Xchengeren 老炮色 a k福利电影 色欲影视色天天视频 高老庄aV 259LUXU-683 magnet 手机在线电影 国产区 欧美激情人人操网 国产 偷拍 直播 日韩 国内外激情在线视频网给 站长统计一本道人妻 光棍影院被封 紫竹铃取汁 ftp 狂插空姐嫩 xfplay 丈夫面前 穿靴子伪街 XXOO视频在线免费 大香蕉道久在线播放 电棒漏电嗨过头 充气娃能看下毛和洞吗 夫妻牲交 福利云点墦 yukun瑟妃 疯狂交换女友 国产自拍26页 腐女资源 百度云 日本DVD高清无码视频 偷拍,自拍AV伦理电影 A片小视频福利站。 大奶肥婆自拍偷拍图片 交配伊甸园 超碰在线视频自拍偷拍国产 小热巴91大神 rctd 045 类似于A片 超美大奶大学生美女直播被男友操 男友问 你的衣服怎么脱掉的 亚洲女与黑人群交视频一 在线黄涩 木内美保步兵番号 鸡巴插入欧美美女的b舒服 激情在线国产自拍日韩欧美 国语福利小视频在线观看 作爱小视颍 潮喷合集丝袜无码mp4 做爱的无码高清视频 牛牛精品 伊aⅤ在线观看 savk12 哥哥搞在线播放 在线电一本道影 一级谍片 250pp亚洲情艺中心,88 欧美一本道九色在线一 wwwseavbacom色av吧 cos美女在线 欧美17,18ⅹⅹⅹ视频 自拍嫩逼 小电影在线观看网站 筱田优 贼 水电工 5358x视频 日本69式视频有码 b雪福利导航 韩国女主播19tvclub在线 操逼清晰视频 丝袜美女国产视频网址导航 水菜丽颜射房间 台湾妹中文娱乐网 风吟岛视频 口交 伦理 日本熟妇色五十路免费视频 A级片互舔 川村真矢Av在线观看 亚洲日韩av 色和尚国产自拍 sea8 mp4 aV天堂2018手机在线 免费版国产偷拍a在线播放 狠狠 婷婷 丁香 小视频福利在线观看平台 思妍白衣小仙女被邻居强上 萝莉自拍有水 4484新视觉 永久发布页 977成人影视在线观看 小清新影院在线观 小鸟酱后丝后入百度云 旋风魅影四级 香蕉影院小黄片免费看 性爱直播磁力链接 小骚逼第一色影院 性交流的视频 小雪小视频bd 小视频TV禁看视频 迷奸AV在线看 nba直播 任你在干线 汤姆影院在线视频国产 624u在线播放 成人 一级a做爰片就在线看狐狸视频 小香蕉AV视频 www182、com 腿模简小育 学生做爱视频 秘密搜查官 快播 成人福利网午夜 一级黄色夫妻录像片 直接看的gav久久播放器 国产自拍400首页 sm老爹影院 谁知道隔壁老王网址在线 综合网 123西瓜影音 米奇丁香 人人澡人人漠大学生 色久悠 夜色视频你今天寂寞了吗? 菲菲影视城美国 被抄的影院 变态另类 欧美 成人 国产偷拍自拍在线小说 不用下载安装就能看的吃男人鸡巴视频 插屄视频 大贯杏里播放 wwwhhh50 233若菜奈央 伦理片天海翼秘密搜查官 大香蕉在线万色屋视频 那种漫画小说你懂的 祥仔电影合集一区 那里可以看澳门皇冠酒店a片 色自啪 亚洲aV电影天堂 谷露影院ar toupaizaixian sexbj。com 毕业生 zaixian mianfei 朝桐光视频 成人短视频在线直接观看 陈美霖 沈阳音乐学院 导航女 www26yjjcom 1大尺度视频 开平虐女视频 菅野雪松协和影视在线视频 华人play在线视频bbb 鸡吧操屄视频 多啪啪免费视频 悠草影院 金兰策划网 (969) 橘佑金短视频 国内一极刺激自拍片 日本制服番号大全magnet 成人动漫母系 电脑怎么清理内存 黄色福利1000 dy88午夜 偷拍中学生洗澡磁力链接 花椒相机福利美女视频 站长推荐磁力下载 mp4 三洞轮流插视频 玉兔miki热舞视频 夜生活小视频 爆乳人妖小视频 国内网红主播自拍福利迅雷下载 不用app的裸裸体美女操逼视频 变态SM影片在线观看 草溜影院元气吧 - 百度 - 百度 波推全套视频 国产双飞集合ftp 日本在线AV网 笔国毛片 神马影院女主播是我的邻居 影音资源 激情乱伦电影 799pao 亚洲第一色第一影院 av视频大香蕉 老梁故事汇希斯莱杰 水中人体磁力链接 下载 大香蕉黄片免费看 济南谭崔 避开屏蔽的岛a片 草破福利 要看大鸡巴操小骚逼的人的视频 黑丝少妇影音先锋 欧美巨乳熟女磁力链接 美国黄网站色大全 伦蕉在线久播 极品女厕沟 激情五月bd韩国电影 混血美女自摸和男友激情啪啪自拍诱人呻吟福利视频 人人摸人人妻做人人看 44kknn 娸娸原网 伊人欧美 恋夜影院视频列表安卓青青 57k影院 如果电话亭 avi 插爆骚女精品自拍 青青草在线免费视频1769TV 令人惹火的邻家美眉 影音先锋 真人妹子被捅动态图 男人女人做完爱视频15 表姐合租两人共处一室晚上她竟爬上了我的床 性爱教学视频 北条麻妃bd在线播放版 国产老师和师生 magnet wwwcctv1024 女神自慰 ftp 女同性恋做激情视频 欧美大胆露阴视频 欧美无码影视 好女色在线观看 后入肥臀18p 百度影视屏福利 厕所超碰视频 强奸mp magnet 欧美妹aⅴ免费线上看 2016年妞干网视频 5手机在线福利 超在线最视频 800av:cOm magnet 欧美性爱免播放器在线播放 91大款肥汤的性感美乳90后邻家美眉趴着窗台后入啪啪 秋霞日本毛片网站 cheng ren 在线视频 上原亚衣肛门无码解禁影音先锋 美脚家庭教师在线播放 尤酷伦理片 熟女性生活视频在线观看 欧美av在线播放喷潮 194avav 凤凰AV成人 - 百度 kbb9999 AV片AV在线AV无码 爱爱视频高清免费观看 黄色男女操b视频 观看 18AV清纯视频在线播放平台 成人性爱视频久久操 女性真人生殖系统双性人视频 下身插入b射精视频 明星潜规测视频 mp4 免賛a片直播绪 国内 自己 偷拍 在线 国内真实偷拍 手机在线 国产主播户外勾在线 三桥杏奈高清无码迅雷下载 2五福电影院凸凹频频 男主拿鱼打女主,高宝宝 色哥午夜影院 川村まや痴汉 草溜影院费全过程免费 淫小弟影院在线视频 laohantuiche 啪啪啪喷潮XXOO视频 青娱乐成人国产 蓝沢润 一本道 亚洲青涩中文欧美 神马影院线理论 米娅卡莉法的av 在线福利65535 欧美粉色在线 欧美性受群交视频1在线播放 极品喷奶熟妇在线播放 变态另类无码福利影院92 天津小姐被偷拍 磁力下载 台湾三级电髟全部 丝袜美腿偷拍自拍 偷拍女生性行为图 妻子的乱伦 白虎少妇 肏婶骚屄 外国大妈会阴照片 美少女操屄图片 妹妹自慰11p 操老熟女的b 361美女人体 360电影院樱桃 爱色妹妹亚洲色图 性交卖淫姿势高清图片一级 欧美一黑对二白 大色网无毛一线天 射小妹网站 寂寞穴 西西人体模特苍井空 操的大白逼吧 骚穴让我操 拉好友干女朋友3p